diff --git "a/sf_log.txt" "b/sf_log.txt"
--- "a/sf_log.txt"
+++ "b/sf_log.txt"
@@ -1,32 +1,39 @@
-[2023-09-26 15:16:36,666][04574] Saving configuration to ./train_atari/atari_kongfumaster/config.json...
-[2023-09-26 15:16:36,932][04574] Rollout worker 0 uses device cpu
-[2023-09-26 15:16:36,933][04574] Rollout worker 1 uses device cpu
-[2023-09-26 15:16:36,933][04574] Rollout worker 2 uses device cpu
-[2023-09-26 15:16:36,934][04574] Rollout worker 3 uses device cpu
-[2023-09-26 15:16:36,935][04574] Rollout worker 4 uses device cpu
-[2023-09-26 15:16:36,935][04574] Rollout worker 5 uses device cpu
-[2023-09-26 15:16:36,936][04574] Rollout worker 6 uses device cpu
-[2023-09-26 15:16:36,936][04574] Rollout worker 7 uses device cpu
-[2023-09-26 15:16:36,937][04574] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
-[2023-09-26 15:16:36,984][04574] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-09-26 15:16:36,985][04574] InferenceWorker_p0-w0: min num requests: 1
-[2023-09-26 15:16:36,988][04574] Using GPUs [1] for process 1 (actually maps to GPUs [1])
-[2023-09-26 15:16:36,988][04574] InferenceWorker_p1-w0: min num requests: 1
-[2023-09-26 15:16:37,012][04574] Starting all processes...
-[2023-09-26 15:16:37,012][04574] Starting process learner_proc0
-[2023-09-26 15:16:38,601][04574] Starting process learner_proc1
-[2023-09-26 15:16:38,604][05384] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-09-26 15:16:38,605][05384] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
-[2023-09-26 15:16:38,623][05384] Num visible devices: 1
-[2023-09-26 15:16:38,639][05384] Starting seed is not provided
-[2023-09-26 15:16:38,639][05384] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-09-26 15:16:38,640][05384] Initializing actor-critic model on device cuda:0
-[2023-09-26 15:16:38,640][05384] RunningMeanStd input shape: (4, 84, 84)
-[2023-09-26 15:16:38,641][05384] RunningMeanStd input shape: (1,)
-[2023-09-26 15:16:38,652][05384] ConvEncoder: input_channels=4
-[2023-09-26 15:16:38,811][05384] Conv encoder output size: 512
-[2023-09-26 15:16:38,813][05384] Created Actor Critic model with architecture:
-[2023-09-26 15:16:38,813][05384] ActorCriticSharedWeights(
+[2023-10-13 00:11:57,454][45375] Saving configuration to ./train_atari/atari_kongfumaster_APPO/config.json...
+[2023-10-13 00:11:57,771][45375] Rollout worker 0 uses device cpu
+[2023-10-13 00:11:57,776][45375] Rollout worker 1 uses device cpu
+[2023-10-13 00:11:57,776][45375] Rollout worker 2 uses device cpu
+[2023-10-13 00:11:57,777][45375] Rollout worker 3 uses device cpu
+[2023-10-13 00:11:57,777][45375] Rollout worker 4 uses device cpu
+[2023-10-13 00:11:57,778][45375] Rollout worker 5 uses device cpu
+[2023-10-13 00:11:57,779][45375] Rollout worker 6 uses device cpu
+[2023-10-13 00:11:57,779][45375] Rollout worker 7 uses device cpu
+[2023-10-13 00:11:57,780][45375] Rollout worker 8 uses device cpu
+[2023-10-13 00:11:57,780][45375] Rollout worker 9 uses device cpu
+[2023-10-13 00:11:57,781][45375] Rollout worker 10 uses device cpu
+[2023-10-13 00:11:57,781][45375] Rollout worker 11 uses device cpu
+[2023-10-13 00:11:57,781][45375] Rollout worker 12 uses device cpu
+[2023-10-13 00:11:57,782][45375] Rollout worker 13 uses device cpu
+[2023-10-13 00:11:57,782][45375] Rollout worker 14 uses device cpu
+[2023-10-13 00:11:57,783][45375] Rollout worker 15 uses device cpu
+[2023-10-13 00:11:58,074][45375] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-10-13 00:11:58,074][45375] InferenceWorker_p0-w0: min num requests: 2
+[2023-10-13 00:11:58,077][45375] Using GPUs [1] for process 1 (actually maps to GPUs [1])
+[2023-10-13 00:11:58,078][45375] InferenceWorker_p1-w0: min num requests: 2
+[2023-10-13 00:11:58,124][45375] Starting all processes...
+[2023-10-13 00:11:58,125][45375] Starting process learner_proc0
+[2023-10-13 00:11:59,861][45375] Starting process learner_proc1
+[2023-10-13 00:11:59,864][46091] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-10-13 00:11:59,865][46091] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2023-10-13 00:11:59,883][46091] Num visible devices: 1
+[2023-10-13 00:11:59,901][46091] Setting fixed seed 1234
+[2023-10-13 00:11:59,902][46091] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-10-13 00:11:59,902][46091] Initializing actor-critic model on device cuda:0
+[2023-10-13 00:11:59,902][46091] RunningMeanStd input shape: (4, 84, 84)
+[2023-10-13 00:11:59,903][46091] RunningMeanStd input shape: (1,)
+[2023-10-13 00:11:59,914][46091] ConvEncoder: input_channels=4
+[2023-10-13 00:12:00,069][46091] Conv encoder output size: 512
+[2023-10-13 00:12:00,071][46091] Created Actor Critic model with architecture:
+[2023-10-13 00:12:00,072][46091] ActorCriticSharedWeights(
 (obs_normalizer): ObservationNormalizer(
 (running_mean_std): RunningMeanStdDictInPlace(
 (running_mean_std): ModuleDict(
@@ -67,35 +74,41 @@
 (distribution_linear): Linear(in_features=512, out_features=14, bias=True)
 )
 )
-[2023-09-26 15:16:39,401][05384] Using optimizer
-[2023-09-26 15:16:39,402][05384] No checkpoints found
-[2023-09-26 15:16:39,402][05384] Did not load from checkpoint, starting from scratch!
-[2023-09-26 15:16:39,403][05384] Initialized policy 0 weights for model version 0
-[2023-09-26 15:16:39,404][05384] LearnerWorker_p0 finished initialization!
-[2023-09-26 15:16:39,404][05384] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-09-26 15:16:40,241][04574] Starting all processes...
-[2023-09-26 15:16:40,245][05596] Using GPUs [1] for process 1 (actually maps to GPUs [1])
-[2023-09-26 15:16:40,245][05596] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
-[2023-09-26 15:16:40,248][04574] Starting process inference_proc0-0
-[2023-09-26 15:16:40,249][04574] Starting process inference_proc1-0
-[2023-09-26 15:16:40,249][04574] Starting process rollout_proc0
-[2023-09-26 15:16:40,249][04574] Starting process rollout_proc1
-[2023-09-26 15:16:40,264][05596] Num visible devices: 1
-[2023-09-26 15:16:40,249][04574] Starting process rollout_proc2
-[2023-09-26 15:16:40,250][04574] Starting process rollout_proc3
-[2023-09-26 15:16:40,289][05596] Starting seed is not provided
-[2023-09-26 15:16:40,289][05596] Using GPUs [0] for process 1 (actually maps to GPUs [1])
-[2023-09-26 15:16:40,290][05596] Initializing actor-critic model on device cuda:0
-[2023-09-26 15:16:40,290][05596] RunningMeanStd input shape: (4, 84, 84)
-[2023-09-26 15:16:40,254][04574] Starting process rollout_proc4
-[2023-09-26 15:16:40,291][05596] RunningMeanStd input shape: (1,)
-[2023-09-26 15:16:40,254][04574] Starting process rollout_proc5
-[2023-09-26 15:16:40,257][04574] Starting process rollout_proc6
-[2023-09-26 15:16:40,258][04574] Starting process rollout_proc7
-[2023-09-26 15:16:40,303][05596] ConvEncoder: input_channels=4
-[2023-09-26 15:16:40,621][05596] Conv encoder output size: 512
-[2023-09-26 15:16:40,623][05596] Created Actor Critic model with architecture:
-[2023-09-26 15:16:40,623][05596] ActorCriticSharedWeights(
+[2023-10-13 00:12:00,631][46091] Using optimizer
+[2023-10-13 00:12:00,632][46091] No checkpoints found
+[2023-10-13 00:12:00,632][46091] Did not load from checkpoint, starting from scratch!
+[2023-10-13 00:12:00,632][46091] Initialized policy 0 weights for model version 0
+[2023-10-13 00:12:00,633][46091] LearnerWorker_p0 finished initialization!
+[2023-10-13 00:12:00,634][46091] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-13 00:12:01,628][45375] Starting all processes... +[2023-10-13 00:12:01,632][46384] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-13 00:12:01,633][46384] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 +[2023-10-13 00:12:01,635][45375] Starting process inference_proc0-0 +[2023-10-13 00:12:01,635][45375] Starting process inference_proc1-0 +[2023-10-13 00:12:01,636][45375] Starting process rollout_proc0 +[2023-10-13 00:12:01,636][45375] Starting process rollout_proc1 +[2023-10-13 00:12:01,652][46384] Num visible devices: 1 +[2023-10-13 00:12:01,636][45375] Starting process rollout_proc2 +[2023-10-13 00:12:01,637][45375] Starting process rollout_proc3 +[2023-10-13 00:12:01,677][46384] Setting fixed seed 1234 +[2023-10-13 00:12:01,679][46384] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-10-13 00:12:01,679][46384] Initializing actor-critic model on device cuda:0 +[2023-10-13 00:12:01,679][46384] RunningMeanStd input shape: (4, 84, 84) +[2023-10-13 00:12:01,680][46384] RunningMeanStd input shape: (1,) +[2023-10-13 00:12:01,637][45375] Starting process rollout_proc4 +[2023-10-13 00:12:01,637][45375] Starting process rollout_proc5 +[2023-10-13 00:12:01,637][45375] Starting process rollout_proc6 +[2023-10-13 00:12:01,641][45375] Starting process rollout_proc7 +[2023-10-13 00:12:01,646][45375] Starting process rollout_proc8 +[2023-10-13 00:12:01,692][46384] ConvEncoder: input_channels=4 +[2023-10-13 00:12:01,648][45375] Starting process rollout_proc9 +[2023-10-13 00:12:01,649][45375] Starting process rollout_proc10 +[2023-10-13 00:12:01,649][45375] Starting process rollout_proc11 +[2023-10-13 00:12:01,650][45375] Starting process rollout_proc12 +[2023-10-13 00:12:01,650][45375] Starting process rollout_proc13 +[2023-10-13 00:12:02,170][46384] Conv encoder output size: 512 +[2023-10-13 
00:12:02,193][46384] Created Actor Critic model with architecture: +[2023-10-13 00:12:02,193][46384] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -136,2213 +149,26479 @@ (distribution_linear): Linear(in_features=512, out_features=14, bias=True) ) ) -[2023-09-26 15:16:41,230][05596] Using optimizer -[2023-09-26 15:16:41,230][05596] No checkpoints found -[2023-09-26 15:16:41,230][05596] Did not load from checkpoint, starting from scratch! -[2023-09-26 15:16:41,231][05596] Initialized policy 1 weights for model version 0 -[2023-09-26 15:16:41,232][05596] LearnerWorker_p1 finished initialization! -[2023-09-26 15:16:41,232][05596] Using GPUs [0] for process 1 (actually maps to GPUs [1]) -[2023-09-26 15:16:42,174][05939] Worker 5 uses CPU cores [20, 21, 22, 23] -[2023-09-26 15:16:42,174][05901] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-26 15:16:42,175][05901] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 -[2023-09-26 15:16:42,175][05941] Worker 7 uses CPU cores [28, 29, 30, 31] -[2023-09-26 15:16:42,177][05933] Worker 0 uses CPU cores [0, 1, 2, 3] -[2023-09-26 15:16:42,194][05901] Num visible devices: 1 -[2023-09-26 15:16:42,196][05938] Worker 4 uses CPU cores [16, 17, 18, 19] -[2023-09-26 15:16:42,197][05900] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-26 15:16:42,197][05900] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 -[2023-09-26 15:16:42,216][05900] Num visible devices: 1 -[2023-09-26 15:16:42,234][05934] Worker 1 uses CPU cores [4, 5, 6, 7] -[2023-09-26 15:16:42,250][05937] Worker 3 uses CPU cores [12, 13, 14, 15] -[2023-09-26 15:16:42,280][05936] Worker 2 uses CPU cores [8, 9, 10, 11] -[2023-09-26 15:16:42,289][05940] Worker 6 uses CPU cores [24, 25, 26, 27] -[2023-09-26 15:16:42,826][05901] RunningMeanStd input shape: (4, 84, 
84) -[2023-09-26 15:16:42,826][05901] RunningMeanStd input shape: (1,) -[2023-09-26 15:16:42,836][05900] RunningMeanStd input shape: (4, 84, 84) -[2023-09-26 15:16:42,836][05900] RunningMeanStd input shape: (1,) -[2023-09-26 15:16:42,837][05901] ConvEncoder: input_channels=4 -[2023-09-26 15:16:42,847][05900] ConvEncoder: input_channels=4 -[2023-09-26 15:16:42,908][04574] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-09-26 15:16:42,934][05901] Conv encoder output size: 512 -[2023-09-26 15:16:42,940][04574] Inference worker 1-0 is ready! -[2023-09-26 15:16:42,944][05900] Conv encoder output size: 512 -[2023-09-26 15:16:42,949][04574] Inference worker 0-0 is ready! -[2023-09-26 15:16:42,950][04574] All inference workers are ready! Signal rollout workers to start! -[2023-09-26 15:16:43,412][05934] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,413][05941] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,415][05937] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,415][05939] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,459][05938] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,474][05933] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,479][05940] Decorrelating experience for 0 frames... -[2023-09-26 15:16:43,489][05936] Decorrelating experience for 0 frames... -[2023-09-26 15:16:47,908][04574] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:16:47,909][04574] Avg episode reward: [(0, '0.000'), (1, '0.500')] -[2023-09-26 15:16:52,907][04574] Fps is (10 sec: 3276.9, 60 sec: 3276.9, 300 sec: 3276.9). Total num frames: 32768. Throughput: 0: 407.2, 1: 408.4. Samples: 8156. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:16:52,908][04574] Avg episode reward: [(0, '0.850'), (1, '0.444')] -[2023-09-26 15:16:56,972][04574] Heartbeat connected on Batcher_0 -[2023-09-26 15:16:56,974][04574] Heartbeat connected on LearnerWorker_p0 -[2023-09-26 15:16:56,978][04574] Heartbeat connected on Batcher_1 -[2023-09-26 15:16:56,980][04574] Heartbeat connected on LearnerWorker_p1 -[2023-09-26 15:16:56,986][04574] Heartbeat connected on InferenceWorker_p0-w0 -[2023-09-26 15:16:56,990][04574] Heartbeat connected on InferenceWorker_p1-w0 -[2023-09-26 15:16:56,993][04574] Heartbeat connected on RolloutWorker_w0 -[2023-09-26 15:16:56,994][04574] Heartbeat connected on RolloutWorker_w1 -[2023-09-26 15:16:56,999][04574] Heartbeat connected on RolloutWorker_w2 -[2023-09-26 15:16:57,002][04574] Heartbeat connected on RolloutWorker_w3 -[2023-09-26 15:16:57,005][04574] Heartbeat connected on RolloutWorker_w4 -[2023-09-26 15:16:57,007][04574] Heartbeat connected on RolloutWorker_w5 -[2023-09-26 15:16:57,009][04574] Heartbeat connected on RolloutWorker_w6 -[2023-09-26 15:16:57,012][04574] Heartbeat connected on RolloutWorker_w7 -[2023-09-26 15:16:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 4369.1, 300 sec: 4369.1). Total num frames: 65536. Throughput: 0: 413.5, 1: 412.6. Samples: 12391. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:16:57,909][04574] Avg episode reward: [(0, '0.784'), (1, '0.378')] -[2023-09-26 15:17:00,004][05901] Updated weights for policy 1, policy_version 160 (0.0015) -[2023-09-26 15:17:00,004][05900] Updated weights for policy 0, policy_version 160 (0.0017) -[2023-09-26 15:17:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 98304. Throughput: 0: 556.4, 1: 558.8. Samples: 22305. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:02,909][04574] Avg episode reward: [(0, '0.962'), (1, '0.375')] -[2023-09-26 15:17:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 122880. Throughput: 0: 620.7, 1: 620.8. Samples: 31038. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:07,909][04574] Avg episode reward: [(0, '1.015'), (1, '0.493')] -[2023-09-26 15:17:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5188.3). Total num frames: 155648. Throughput: 0: 598.9, 1: 597.1. Samples: 35879. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:12,909][04574] Avg episode reward: [(0, '1.175'), (1, '0.549')] -[2023-09-26 15:17:13,421][05901] Updated weights for policy 1, policy_version 320 (0.0017) -[2023-09-26 15:17:13,421][05900] Updated weights for policy 0, policy_version 320 (0.0019) -[2023-09-26 15:17:17,907][04574] Fps is (10 sec: 6553.8, 60 sec: 5383.3, 300 sec: 5383.3). Total num frames: 188416. Throughput: 0: 639.7, 1: 643.7. Samples: 44917. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:17,908][04574] Avg episode reward: [(0, '1.370'), (1, '0.600')] -[2023-09-26 15:17:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 5529.6, 300 sec: 5529.6). Total num frames: 221184. Throughput: 0: 673.5, 1: 674.0. Samples: 53901. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:22,909][04574] Avg episode reward: [(0, '1.660'), (1, '0.740')] -[2023-09-26 15:17:22,910][05384] Saving new best policy, reward=1.660! -[2023-09-26 15:17:22,910][05596] Saving new best policy, reward=0.740! -[2023-09-26 15:17:26,763][05900] Updated weights for policy 0, policy_version 480 (0.0018) -[2023-09-26 15:17:26,764][05901] Updated weights for policy 1, policy_version 480 (0.0018) -[2023-09-26 15:17:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5461.3). Total num frames: 245760. Throughput: 0: 653.2, 1: 653.9. 
Samples: 58818. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:27,909][04574] Avg episode reward: [(0, '2.030'), (1, '0.940')] -[2023-09-26 15:17:28,025][05384] Saving new best policy, reward=2.030! -[2023-09-26 15:17:28,063][05596] Saving new best policy, reward=0.940! -[2023-09-26 15:17:32,907][04574] Fps is (10 sec: 5734.4, 60 sec: 5570.6, 300 sec: 5570.6). Total num frames: 278528. Throughput: 0: 730.2, 1: 730.0. Samples: 67759. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:32,908][04574] Avg episode reward: [(0, '2.250'), (1, '1.080')] -[2023-09-26 15:17:32,914][05384] Saving new best policy, reward=2.250! -[2023-09-26 15:17:32,915][05596] Saving new best policy, reward=1.080! -[2023-09-26 15:17:37,907][04574] Fps is (10 sec: 6553.7, 60 sec: 5660.0, 300 sec: 5660.0). Total num frames: 311296. Throughput: 0: 767.5, 1: 767.8. Samples: 77243. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:37,908][04574] Avg episode reward: [(0, '2.520'), (1, '1.220')] -[2023-09-26 15:17:37,909][05384] Saving new best policy, reward=2.520! -[2023-09-26 15:17:37,909][05596] Saving new best policy, reward=1.220! -[2023-09-26 15:17:40,087][05900] Updated weights for policy 0, policy_version 640 (0.0016) -[2023-09-26 15:17:40,088][05901] Updated weights for policy 1, policy_version 640 (0.0017) -[2023-09-26 15:17:42,907][04574] Fps is (10 sec: 6553.7, 60 sec: 5734.4, 300 sec: 5734.4). Total num frames: 344064. Throughput: 0: 772.4, 1: 772.7. Samples: 81920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:42,908][04574] Avg episode reward: [(0, '3.000'), (1, '1.480')] -[2023-09-26 15:17:42,909][05384] Saving new best policy, reward=3.000! -[2023-09-26 15:17:42,909][05596] Saving new best policy, reward=1.480! -[2023-09-26 15:17:47,908][04574] Fps is (10 sec: 6143.8, 60 sec: 6075.7, 300 sec: 5734.4). Total num frames: 372736. Throughput: 0: 767.7, 1: 767.3. Samples: 91380. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:17:47,909][04574] Avg episode reward: [(0, '3.380'), (1, '1.780')] -[2023-09-26 15:17:47,916][05384] Saving new best policy, reward=3.380! -[2023-09-26 15:17:47,923][05596] Saving new best policy, reward=1.780! -[2023-09-26 15:17:52,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5734.4). Total num frames: 401408. Throughput: 0: 770.4, 1: 770.3. Samples: 100367. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:52,908][04574] Avg episode reward: [(0, '3.950'), (1, '2.230')] -[2023-09-26 15:17:52,909][05596] Saving new best policy, reward=2.230! -[2023-09-26 15:17:52,909][05384] Saving new best policy, reward=3.950! -[2023-09-26 15:17:53,261][05900] Updated weights for policy 0, policy_version 800 (0.0020) -[2023-09-26 15:17:53,262][05901] Updated weights for policy 1, policy_version 800 (0.0017) -[2023-09-26 15:17:57,908][04574] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 5789.0). Total num frames: 434176. Throughput: 0: 768.8, 1: 770.0. Samples: 105125. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:17:57,909][04574] Avg episode reward: [(0, '4.410'), (1, '2.760')] -[2023-09-26 15:17:57,910][05384] Saving new best policy, reward=4.410! -[2023-09-26 15:17:57,910][05596] Saving new best policy, reward=2.760! -[2023-09-26 15:18:02,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5836.8). Total num frames: 466944. Throughput: 0: 777.1, 1: 773.9. Samples: 114713. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:18:02,908][04574] Avg episode reward: [(0, '4.790'), (1, '3.310')] -[2023-09-26 15:18:02,913][05384] Saving new best policy, reward=4.790! -[2023-09-26 15:18:02,914][05596] Saving new best policy, reward=3.310! 
-[2023-09-26 15:18:06,391][05900] Updated weights for policy 0, policy_version 960 (0.0015) -[2023-09-26 15:18:06,391][05901] Updated weights for policy 1, policy_version 960 (0.0015) -[2023-09-26 15:18:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5879.0). Total num frames: 499712. Throughput: 0: 778.3, 1: 778.0. Samples: 123934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:18:07,909][04574] Avg episode reward: [(0, '5.340'), (1, '3.630')] -[2023-09-26 15:18:07,910][05384] Saving new best policy, reward=5.340! -[2023-09-26 15:18:07,910][05596] Saving new best policy, reward=3.630! -[2023-09-26 15:18:12,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5825.4). Total num frames: 524288. Throughput: 0: 774.3, 1: 774.8. Samples: 128525. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:18:12,909][04574] Avg episode reward: [(0, '6.000'), (1, '4.440')] -[2023-09-26 15:18:12,945][05596] Saving new best policy, reward=4.440! -[2023-09-26 15:18:12,962][05384] Saving new best policy, reward=6.000! -[2023-09-26 15:18:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5863.7). Total num frames: 557056. Throughput: 0: 780.1, 1: 779.8. Samples: 137953. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:18:17,909][04574] Avg episode reward: [(0, '6.950'), (1, '4.970')] -[2023-09-26 15:18:17,914][05384] Saving new best policy, reward=6.950! -[2023-09-26 15:18:17,915][05596] Saving new best policy, reward=4.970! -[2023-09-26 15:18:19,515][05900] Updated weights for policy 0, policy_version 1120 (0.0017) -[2023-09-26 15:18:19,516][05901] Updated weights for policy 1, policy_version 1120 (0.0018) -[2023-09-26 15:18:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5898.2). Total num frames: 589824. Throughput: 0: 780.4, 1: 779.9. Samples: 147456. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:18:22,909][04574] Avg episode reward: [(0, '7.850'), (1, '5.660')] -[2023-09-26 15:18:22,910][05384] Saving new best policy, reward=7.850! -[2023-09-26 15:18:22,910][05596] Saving new best policy, reward=5.660! -[2023-09-26 15:18:27,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5929.5). Total num frames: 622592. Throughput: 0: 775.3, 1: 775.3. Samples: 151695. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:18:27,908][04574] Avg episode reward: [(0, '8.490'), (1, '6.360')] -[2023-09-26 15:18:27,909][05384] Saving new best policy, reward=8.490! -[2023-09-26 15:18:27,909][05596] Saving new best policy, reward=6.360! -[2023-09-26 15:18:32,807][05900] Updated weights for policy 0, policy_version 1280 (0.0019) -[2023-09-26 15:18:32,807][05901] Updated weights for policy 1, policy_version 1280 (0.0015) -[2023-09-26 15:18:32,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5957.8). Total num frames: 655360. Throughput: 0: 778.7, 1: 778.3. Samples: 161443. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:18:32,908][04574] Avg episode reward: [(0, '9.250'), (1, '6.890')] -[2023-09-26 15:18:32,914][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000001280_327680.pth... -[2023-09-26 15:18:32,914][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000001280_327680.pth... -[2023-09-26 15:18:32,947][05596] Saving new best policy, reward=6.890! -[2023-09-26 15:18:32,948][05384] Saving new best policy, reward=9.250! -[2023-09-26 15:18:37,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5912.5). Total num frames: 679936. Throughput: 0: 775.0, 1: 775.0. Samples: 170117. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:18:37,908][04574] Avg episode reward: [(0, '10.570'), (1, '7.590')] -[2023-09-26 15:18:37,909][05384] Saving new best policy, reward=10.570! 
-[2023-09-26 15:18:37,909][05596] Saving new best policy, reward=7.590! -[2023-09-26 15:18:42,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5939.2). Total num frames: 712704. Throughput: 0: 774.9, 1: 775.2. Samples: 174879. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:18:42,908][04574] Avg episode reward: [(0, '11.180'), (1, '8.260')] -[2023-09-26 15:18:42,909][05384] Saving new best policy, reward=11.180! -[2023-09-26 15:18:42,909][05596] Saving new best policy, reward=8.260! -[2023-09-26 15:18:46,142][05900] Updated weights for policy 0, policy_version 1440 (0.0018) -[2023-09-26 15:18:46,142][05901] Updated weights for policy 1, policy_version 1440 (0.0018) -[2023-09-26 15:18:47,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6212.3, 300 sec: 5963.8). Total num frames: 745472. Throughput: 0: 773.3, 1: 773.5. Samples: 184320. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:18:47,909][04574] Avg episode reward: [(0, '12.310'), (1, '9.100')] -[2023-09-26 15:18:47,917][05384] Saving new best policy, reward=12.310! -[2023-09-26 15:18:47,918][05596] Saving new best policy, reward=9.100! -[2023-09-26 15:18:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 5986.5). Total num frames: 778240. Throughput: 0: 772.8, 1: 772.6. Samples: 193479. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:18:52,909][04574] Avg episode reward: [(0, '13.350'), (1, '9.810')] -[2023-09-26 15:18:52,910][05384] Saving new best policy, reward=13.350! -[2023-09-26 15:18:52,910][05596] Saving new best policy, reward=9.810! -[2023-09-26 15:18:57,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5946.8). Total num frames: 802816. Throughput: 0: 776.3, 1: 775.8. Samples: 198369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:18:57,908][04574] Avg episode reward: [(0, '14.200'), (1, '10.710')] -[2023-09-26 15:18:57,976][05596] Saving new best policy, reward=10.710! 
-[2023-09-26 15:18:58,001][05384] Saving new best policy, reward=14.200! -[2023-09-26 15:18:59,303][05900] Updated weights for policy 0, policy_version 1600 (0.0017) -[2023-09-26 15:18:59,305][05901] Updated weights for policy 1, policy_version 1600 (0.0019) -[2023-09-26 15:19:02,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5968.5). Total num frames: 835584. Throughput: 0: 772.1, 1: 773.0. Samples: 207479. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:19:02,908][04574] Avg episode reward: [(0, '14.590'), (1, '11.300')] -[2023-09-26 15:19:02,913][05596] Saving new best policy, reward=11.300! -[2023-09-26 15:19:02,913][05384] Saving new best policy, reward=14.590! -[2023-09-26 15:19:07,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5988.6). Total num frames: 868352. Throughput: 0: 773.7, 1: 773.7. Samples: 217088. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:19:07,909][04574] Avg episode reward: [(0, '15.370'), (1, '12.180')] -[2023-09-26 15:19:07,910][05384] Saving new best policy, reward=15.370! -[2023-09-26 15:19:07,910][05596] Saving new best policy, reward=12.180! -[2023-09-26 15:19:12,659][05901] Updated weights for policy 1, policy_version 1760 (0.0015) -[2023-09-26 15:19:12,659][05900] Updated weights for policy 0, policy_version 1760 (0.0017) -[2023-09-26 15:19:12,907][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6007.5). Total num frames: 901120. Throughput: 0: 773.2, 1: 772.9. Samples: 221271. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:19:12,908][04574] Avg episode reward: [(0, '16.030'), (1, '12.570')] -[2023-09-26 15:19:12,909][05384] Saving new best policy, reward=16.030! -[2023-09-26 15:19:12,909][05596] Saving new best policy, reward=12.570! -[2023-09-26 15:19:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5972.2). Total num frames: 925696. Throughput: 0: 765.4, 1: 764.2. Samples: 230276. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:19:17,909][04574] Avg episode reward: [(0, '16.470'), (1, '13.360')] -[2023-09-26 15:19:17,915][05384] Saving new best policy, reward=16.470! -[2023-09-26 15:19:18,075][05596] Saving new best policy, reward=13.360! -[2023-09-26 15:19:22,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5990.4). Total num frames: 958464. Throughput: 0: 772.1, 1: 772.3. Samples: 239616. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:19:22,908][04574] Avg episode reward: [(0, '17.060'), (1, '14.160')] -[2023-09-26 15:19:22,909][05384] Saving new best policy, reward=17.060! -[2023-09-26 15:19:22,909][05596] Saving new best policy, reward=14.160! -[2023-09-26 15:19:26,022][05900] Updated weights for policy 0, policy_version 1920 (0.0020) -[2023-09-26 15:19:26,023][05901] Updated weights for policy 1, policy_version 1920 (0.0019) -[2023-09-26 15:19:27,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6007.5). Total num frames: 991232. Throughput: 0: 770.2, 1: 769.8. Samples: 244181. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:19:27,908][04574] Avg episode reward: [(0, '17.370'), (1, '14.730')] -[2023-09-26 15:19:27,909][05384] Saving new best policy, reward=17.370! -[2023-09-26 15:19:27,909][05596] Saving new best policy, reward=14.730! -[2023-09-26 15:19:32,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6023.5). Total num frames: 1024000. Throughput: 0: 768.7, 1: 769.7. Samples: 253549. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:19:32,909][04574] Avg episode reward: [(0, '17.740'), (1, '15.160')] -[2023-09-26 15:19:32,916][05384] Saving new best policy, reward=17.740! -[2023-09-26 15:19:32,917][05596] Saving new best policy, reward=15.160! -[2023-09-26 15:19:37,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5991.9). Total num frames: 1048576. Throughput: 0: 766.1, 1: 765.7. Samples: 262411. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:19:37,909][04574] Avg episode reward: [(0, '18.170'), (1, '15.880')] -[2023-09-26 15:19:37,910][05384] Saving new best policy, reward=18.170! -[2023-09-26 15:19:37,910][05596] Saving new best policy, reward=15.880! -[2023-09-26 15:19:39,501][05900] Updated weights for policy 0, policy_version 2080 (0.0016) -[2023-09-26 15:19:39,502][05901] Updated weights for policy 1, policy_version 2080 (0.0017) -[2023-09-26 15:19:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6007.5). Total num frames: 1081344. Throughput: 0: 763.1, 1: 763.2. Samples: 267052. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:19:42,909][04574] Avg episode reward: [(0, '18.820'), (1, '16.090')] -[2023-09-26 15:19:42,910][05384] Saving new best policy, reward=18.820! -[2023-09-26 15:19:42,910][05596] Saving new best policy, reward=16.090! -[2023-09-26 15:19:47,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6022.2). Total num frames: 1114112. Throughput: 0: 766.9, 1: 766.5. Samples: 276480. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:19:47,908][04574] Avg episode reward: [(0, '19.690'), (1, '16.870')] -[2023-09-26 15:19:47,914][05384] Saving new best policy, reward=19.690! -[2023-09-26 15:19:47,914][05596] Saving new best policy, reward=16.870! -[2023-09-26 15:19:52,644][05900] Updated weights for policy 0, policy_version 2240 (0.0019) -[2023-09-26 15:19:52,644][05901] Updated weights for policy 1, policy_version 2240 (0.0017) -[2023-09-26 15:19:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6036.2). Total num frames: 1146880. Throughput: 0: 763.8, 1: 764.2. Samples: 285848. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:19:52,909][04574] Avg episode reward: [(0, '19.830'), (1, '17.500')] -[2023-09-26 15:19:52,910][05384] Saving new best policy, reward=19.830! -[2023-09-26 15:19:52,910][05596] Saving new best policy, reward=17.500! 
-[2023-09-26 15:19:57,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6028.5). Total num frames: 1175552. Throughput: 0: 769.3, 1: 770.0. Samples: 290542. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:19:57,909][04574] Avg episode reward: [(0, '19.650'), (1, '17.460')] -[2023-09-26 15:20:02,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6021.1). Total num frames: 1204224. Throughput: 0: 770.5, 1: 771.5. Samples: 299666. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:20:02,908][04574] Avg episode reward: [(0, '19.660'), (1, '17.960')] -[2023-09-26 15:20:02,915][05596] Saving new best policy, reward=17.960! -[2023-09-26 15:20:05,813][05900] Updated weights for policy 0, policy_version 2400 (0.0016) -[2023-09-26 15:20:05,814][05901] Updated weights for policy 1, policy_version 2400 (0.0018) -[2023-09-26 15:20:07,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6034.1). Total num frames: 1236992. Throughput: 0: 773.7, 1: 773.7. Samples: 309248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:20:07,909][04574] Avg episode reward: [(0, '19.840'), (1, '18.110')] -[2023-09-26 15:20:07,910][05384] Saving new best policy, reward=19.840! -[2023-09-26 15:20:07,910][05596] Saving new best policy, reward=18.110! -[2023-09-26 15:20:12,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6046.5). Total num frames: 1269760. Throughput: 0: 774.8, 1: 774.8. Samples: 313910. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:20:12,908][04574] Avg episode reward: [(0, '20.060'), (1, '17.820')] -[2023-09-26 15:20:12,909][05384] Saving new best policy, reward=20.060! -[2023-09-26 15:20:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6058.3). Total num frames: 1302528. Throughput: 0: 778.7, 1: 777.7. Samples: 323584. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:20:17,908][04574] Avg episode reward: [(0, '20.880'), (1, '17.890')] -[2023-09-26 15:20:17,917][05384] Saving new best policy, reward=20.880! -[2023-09-26 15:20:18,640][05900] Updated weights for policy 0, policy_version 2560 (0.0015) -[2023-09-26 15:20:18,640][05901] Updated weights for policy 1, policy_version 2560 (0.0017) -[2023-09-26 15:20:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6069.5). Total num frames: 1335296. Throughput: 0: 787.4, 1: 787.7. Samples: 333290. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:20:22,909][04574] Avg episode reward: [(0, '21.270'), (1, '18.150')] -[2023-09-26 15:20:22,910][05384] Saving new best policy, reward=21.270! -[2023-09-26 15:20:22,910][05596] Saving new best policy, reward=18.150! -[2023-09-26 15:20:27,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6080.3). Total num frames: 1368064. Throughput: 0: 787.8, 1: 787.0. Samples: 337920. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:20:27,909][04574] Avg episode reward: [(0, '21.630'), (1, '18.180')] -[2023-09-26 15:20:27,910][05384] Saving new best policy, reward=21.630! -[2023-09-26 15:20:27,910][05596] Saving new best policy, reward=18.180! -[2023-09-26 15:20:31,499][05900] Updated weights for policy 0, policy_version 2720 (0.0018) -[2023-09-26 15:20:31,499][05901] Updated weights for policy 1, policy_version 2720 (0.0014) -[2023-09-26 15:20:32,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6090.6). Total num frames: 1400832. Throughput: 0: 789.6, 1: 790.1. Samples: 347569. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:20:32,909][04574] Avg episode reward: [(0, '21.910'), (1, '18.430')] -[2023-09-26 15:20:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000002736_700416.pth... 
-[2023-09-26 15:20:32,921][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000002736_700416.pth... -[2023-09-26 15:20:32,957][05596] Saving new best policy, reward=18.430! -[2023-09-26 15:20:32,957][05384] Saving new best policy, reward=21.910! -[2023-09-26 15:20:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6065.6). Total num frames: 1425408. Throughput: 0: 785.5, 1: 785.0. Samples: 356522. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:20:37,909][04574] Avg episode reward: [(0, '22.300'), (1, '18.940')] -[2023-09-26 15:20:37,910][05384] Saving new best policy, reward=22.300! -[2023-09-26 15:20:38,092][05596] Saving new best policy, reward=18.940! -[2023-09-26 15:20:42,907][04574] Fps is (10 sec: 5734.7, 60 sec: 6280.6, 300 sec: 6075.7). Total num frames: 1458176. Throughput: 0: 786.5, 1: 785.6. Samples: 361288. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:20:42,908][04574] Avg episode reward: [(0, '22.790'), (1, '19.230')] -[2023-09-26 15:20:42,909][05596] Saving new best policy, reward=19.230! -[2023-09-26 15:20:42,909][05384] Saving new best policy, reward=22.790! -[2023-09-26 15:20:44,734][05901] Updated weights for policy 1, policy_version 2880 (0.0019) -[2023-09-26 15:20:44,734][05900] Updated weights for policy 0, policy_version 2880 (0.0019) -[2023-09-26 15:20:47,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6085.5). Total num frames: 1490944. Throughput: 0: 789.1, 1: 789.1. Samples: 370688. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:20:47,908][04574] Avg episode reward: [(0, '23.600'), (1, '18.920')] -[2023-09-26 15:20:47,917][05384] Saving new best policy, reward=23.600! -[2023-09-26 15:20:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6094.8). Total num frames: 1523712. Throughput: 0: 786.0, 1: 785.7. Samples: 379978. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:20:52,909][04574] Avg episode reward: [(0, '24.100'), (1, '18.830')] -[2023-09-26 15:20:52,910][05384] Saving new best policy, reward=24.100! -[2023-09-26 15:20:57,886][05901] Updated weights for policy 1, policy_version 3040 (0.0015) -[2023-09-26 15:20:57,887][05900] Updated weights for policy 0, policy_version 3040 (0.0016) -[2023-09-26 15:20:57,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6348.8, 300 sec: 6103.8). Total num frames: 1556480. Throughput: 0: 788.9, 1: 789.6. Samples: 384941. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:20:57,909][04574] Avg episode reward: [(0, '23.690'), (1, '17.840')] -[2023-09-26 15:21:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6081.0). Total num frames: 1581056. Throughput: 0: 778.9, 1: 779.2. Samples: 393699. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:21:02,909][04574] Avg episode reward: [(0, '22.840'), (1, '18.160')] -[2023-09-26 15:21:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6089.9). Total num frames: 1613824. Throughput: 0: 775.4, 1: 776.6. Samples: 403130. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:07,909][04574] Avg episode reward: [(0, '22.900'), (1, '18.330')] -[2023-09-26 15:21:11,295][05901] Updated weights for policy 1, policy_version 3200 (0.0018) -[2023-09-26 15:21:11,296][05900] Updated weights for policy 0, policy_version 3200 (0.0018) -[2023-09-26 15:21:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6098.5). Total num frames: 1646592. Throughput: 0: 773.7, 1: 773.7. Samples: 407552. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:21:12,909][04574] Avg episode reward: [(0, '23.360'), (1, '18.310')] -[2023-09-26 15:21:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6106.8). Total num frames: 1679360. Throughput: 0: 771.0, 1: 770.5. Samples: 416936. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:21:17,909][04574] Avg episode reward: [(0, '23.020'), (1, '18.820')] -[2023-09-26 15:21:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6085.5). Total num frames: 1703936. Throughput: 0: 776.6, 1: 776.4. Samples: 426409. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:21:22,909][04574] Avg episode reward: [(0, '22.810'), (1, '19.290')] -[2023-09-26 15:21:23,064][05596] Saving new best policy, reward=19.290! -[2023-09-26 15:21:24,330][05900] Updated weights for policy 0, policy_version 3360 (0.0016) -[2023-09-26 15:21:24,330][05901] Updated weights for policy 1, policy_version 3360 (0.0018) -[2023-09-26 15:21:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6093.7). Total num frames: 1736704. Throughput: 0: 777.8, 1: 778.2. Samples: 431304. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:21:27,909][04574] Avg episode reward: [(0, '22.770'), (1, '19.370')] -[2023-09-26 15:21:27,910][05596] Saving new best policy, reward=19.370! -[2023-09-26 15:21:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6101.6). Total num frames: 1769472. Throughput: 0: 776.1, 1: 775.9. Samples: 440531. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:32,909][04574] Avg episode reward: [(0, '22.320'), (1, '19.090')] -[2023-09-26 15:21:37,474][05901] Updated weights for policy 1, policy_version 3520 (0.0017) -[2023-09-26 15:21:37,474][05900] Updated weights for policy 0, policy_version 3520 (0.0016) -[2023-09-26 15:21:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6109.3). Total num frames: 1802240. Throughput: 0: 778.7, 1: 778.2. Samples: 450038. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:37,909][04574] Avg episode reward: [(0, '22.030'), (1, '19.150')] -[2023-09-26 15:21:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 1835008. 
Throughput: 0: 774.9, 1: 774.3. Samples: 454656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:42,909][04574] Avg episode reward: [(0, '21.470'), (1, '19.290')] -[2023-09-26 15:21:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1859584. Throughput: 0: 781.1, 1: 781.1. Samples: 463998. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:47,909][04574] Avg episode reward: [(0, '20.680'), (1, '19.290')] -[2023-09-26 15:21:50,797][05900] Updated weights for policy 0, policy_version 3680 (0.0018) -[2023-09-26 15:21:50,797][05901] Updated weights for policy 1, policy_version 3680 (0.0017) -[2023-09-26 15:21:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1892352. Throughput: 0: 777.9, 1: 776.8. Samples: 473088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:52,909][04574] Avg episode reward: [(0, '20.040'), (1, '19.430')] -[2023-09-26 15:21:52,910][05596] Saving new best policy, reward=19.430! -[2023-09-26 15:21:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1925120. Throughput: 0: 775.4, 1: 775.3. Samples: 477331. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:21:57,909][04574] Avg episode reward: [(0, '20.270'), (1, '20.110')] -[2023-09-26 15:21:57,910][05596] Saving new best policy, reward=20.110! -[2023-09-26 15:22:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 1957888. Throughput: 0: 776.5, 1: 775.7. Samples: 486786. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:22:02,909][04574] Avg episode reward: [(0, '20.020'), (1, '20.300')] -[2023-09-26 15:22:02,919][05596] Saving new best policy, reward=20.300! 
-[2023-09-26 15:22:04,073][05900] Updated weights for policy 0, policy_version 3840 (0.0017) -[2023-09-26 15:22:04,074][05901] Updated weights for policy 1, policy_version 3840 (0.0018) -[2023-09-26 15:22:07,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 1986560. Throughput: 0: 775.5, 1: 775.7. Samples: 496212. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:22:07,908][04574] Avg episode reward: [(0, '19.940'), (1, '20.540')] -[2023-09-26 15:22:07,910][05596] Saving new best policy, reward=20.540! -[2023-09-26 15:22:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2015232. Throughput: 0: 774.5, 1: 775.8. Samples: 501069. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:22:12,909][04574] Avg episode reward: [(0, '20.220'), (1, '20.350')] -[2023-09-26 15:22:16,930][05900] Updated weights for policy 0, policy_version 4000 (0.0017) -[2023-09-26 15:22:16,930][05901] Updated weights for policy 1, policy_version 4000 (0.0018) -[2023-09-26 15:22:17,907][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2048000. Throughput: 0: 778.0, 1: 778.3. Samples: 510564. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:22:17,908][04574] Avg episode reward: [(0, '20.950'), (1, '21.480')] -[2023-09-26 15:22:17,919][05596] Saving new best policy, reward=21.480! -[2023-09-26 15:22:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2080768. Throughput: 0: 779.1, 1: 779.9. Samples: 520192. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:22:22,908][04574] Avg episode reward: [(0, '21.450'), (1, '21.840')] -[2023-09-26 15:22:22,909][05596] Saving new best policy, reward=21.840! -[2023-09-26 15:22:27,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2113536. Throughput: 0: 775.0, 1: 775.0. Samples: 524405. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:22:27,909][04574] Avg episode reward: [(0, '20.930'), (1, '22.500')] -[2023-09-26 15:22:27,910][05596] Saving new best policy, reward=22.500! -[2023-09-26 15:22:30,061][05901] Updated weights for policy 1, policy_version 4160 (0.0017) -[2023-09-26 15:22:30,061][05900] Updated weights for policy 0, policy_version 4160 (0.0016) -[2023-09-26 15:22:32,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 2146304. Throughput: 0: 779.6, 1: 780.2. Samples: 534187. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:22:32,908][04574] Avg episode reward: [(0, '20.750'), (1, '22.070')] -[2023-09-26 15:22:32,917][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000004192_1073152.pth... -[2023-09-26 15:22:32,918][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000004192_1073152.pth... -[2023-09-26 15:22:32,952][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000001280_327680.pth -[2023-09-26 15:22:32,953][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000001280_327680.pth -[2023-09-26 15:22:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2170880. Throughput: 0: 776.2, 1: 775.7. Samples: 542923. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:22:37,909][04574] Avg episode reward: [(0, '20.480'), (1, '21.720')] -[2023-09-26 15:22:42,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6206.5). Total num frames: 2203648. Throughput: 0: 781.2, 1: 781.4. Samples: 547646. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:22:42,908][04574] Avg episode reward: [(0, '21.010'), (1, '22.320')] -[2023-09-26 15:22:43,393][05900] Updated weights for policy 0, policy_version 4320 (0.0017) -[2023-09-26 15:22:43,393][05901] Updated weights for policy 1, policy_version 4320 (0.0018) -[2023-09-26 15:22:47,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 2236416. Throughput: 0: 780.3, 1: 781.2. Samples: 557056. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:22:47,908][04574] Avg episode reward: [(0, '21.250'), (1, '23.130')] -[2023-09-26 15:22:47,918][05596] Saving new best policy, reward=23.130! -[2023-09-26 15:22:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2269184. Throughput: 0: 781.7, 1: 780.2. Samples: 566498. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:22:52,909][04574] Avg episode reward: [(0, '21.930'), (1, '23.990')] -[2023-09-26 15:22:52,910][05596] Saving new best policy, reward=23.990! -[2023-09-26 15:22:56,687][05901] Updated weights for policy 1, policy_version 4480 (0.0015) -[2023-09-26 15:22:56,689][05900] Updated weights for policy 0, policy_version 4480 (0.0017) -[2023-09-26 15:22:57,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 2297856. Throughput: 0: 780.5, 1: 777.7. Samples: 571188. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:22:57,909][04574] Avg episode reward: [(0, '22.270'), (1, '24.400')] -[2023-09-26 15:22:57,910][05596] Saving new best policy, reward=24.400! -[2023-09-26 15:23:02,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2326528. Throughput: 0: 771.8, 1: 771.2. Samples: 579995. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:23:02,908][04574] Avg episode reward: [(0, '23.070'), (1, '25.240')] -[2023-09-26 15:23:02,915][05596] Saving new best policy, reward=25.240! 
-[2023-09-26 15:23:07,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 2359296. Throughput: 0: 771.9, 1: 772.1. Samples: 589670. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:07,909][04574] Avg episode reward: [(0, '23.460'), (1, '25.950')] -[2023-09-26 15:23:07,910][05596] Saving new best policy, reward=25.950! -[2023-09-26 15:23:10,017][05901] Updated weights for policy 1, policy_version 4640 (0.0019) -[2023-09-26 15:23:10,017][05900] Updated weights for policy 0, policy_version 4640 (0.0019) -[2023-09-26 15:23:12,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 2392064. Throughput: 0: 772.4, 1: 772.4. Samples: 593921. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:12,908][04574] Avg episode reward: [(0, '23.780'), (1, '26.350')] -[2023-09-26 15:23:12,909][05596] Saving new best policy, reward=26.350! -[2023-09-26 15:23:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2424832. Throughput: 0: 770.4, 1: 769.1. Samples: 603463. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:17,909][04574] Avg episode reward: [(0, '24.200'), (1, '27.140')] -[2023-09-26 15:23:17,920][05384] Saving new best policy, reward=24.200! -[2023-09-26 15:23:17,920][05596] Saving new best policy, reward=27.140! -[2023-09-26 15:23:22,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2449408. Throughput: 0: 772.0, 1: 772.5. Samples: 612424. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:22,909][04574] Avg episode reward: [(0, '24.090'), (1, '27.500')] -[2023-09-26 15:23:22,910][05596] Saving new best policy, reward=27.500! 
-[2023-09-26 15:23:23,215][05900] Updated weights for policy 0, policy_version 4800 (0.0019) -[2023-09-26 15:23:23,217][05901] Updated weights for policy 1, policy_version 4800 (0.0020) -[2023-09-26 15:23:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2482176. Throughput: 0: 775.3, 1: 775.0. Samples: 617411. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:23:27,909][04574] Avg episode reward: [(0, '24.100'), (1, '26.990')] -[2023-09-26 15:23:32,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2514944. Throughput: 0: 773.8, 1: 773.8. Samples: 626697. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:32,909][04574] Avg episode reward: [(0, '24.220'), (1, '27.030')] -[2023-09-26 15:23:32,921][05384] Saving new best policy, reward=24.220! -[2023-09-26 15:23:36,319][05900] Updated weights for policy 0, policy_version 4960 (0.0017) -[2023-09-26 15:23:36,320][05901] Updated weights for policy 1, policy_version 4960 (0.0017) -[2023-09-26 15:23:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2547712. Throughput: 0: 771.6, 1: 773.1. Samples: 636011. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:37,909][04574] Avg episode reward: [(0, '24.140'), (1, '27.180')] -[2023-09-26 15:23:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2572288. Throughput: 0: 772.5, 1: 773.3. Samples: 640750. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:23:42,909][04574] Avg episode reward: [(0, '23.690'), (1, '27.560')] -[2023-09-26 15:23:43,033][05596] Saving new best policy, reward=27.560! -[2023-09-26 15:23:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2605056. Throughput: 0: 775.4, 1: 777.3. Samples: 649867. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:23:47,909][04574] Avg episode reward: [(0, '23.850'), (1, '27.310')] -[2023-09-26 15:23:49,509][05901] Updated weights for policy 1, policy_version 5120 (0.0016) -[2023-09-26 15:23:49,510][05900] Updated weights for policy 0, policy_version 5120 (0.0016) -[2023-09-26 15:23:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2637824. Throughput: 0: 775.5, 1: 775.3. Samples: 659456. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:52,909][04574] Avg episode reward: [(0, '24.790'), (1, '27.740')] -[2023-09-26 15:23:52,909][05384] Saving new best policy, reward=24.790! -[2023-09-26 15:23:52,910][05596] Saving new best policy, reward=27.740! -[2023-09-26 15:23:57,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 2670592. Throughput: 0: 778.9, 1: 778.8. Samples: 664015. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:23:57,908][04574] Avg episode reward: [(0, '25.340'), (1, '27.190')] -[2023-09-26 15:23:57,909][05384] Saving new best policy, reward=25.340! -[2023-09-26 15:24:02,710][05900] Updated weights for policy 0, policy_version 5280 (0.0017) -[2023-09-26 15:24:02,710][05901] Updated weights for policy 1, policy_version 5280 (0.0017) -[2023-09-26 15:24:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2703360. Throughput: 0: 777.8, 1: 779.5. Samples: 673539. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:24:02,909][04574] Avg episode reward: [(0, '26.170'), (1, '28.020')] -[2023-09-26 15:24:02,917][05384] Saving new best policy, reward=26.170! -[2023-09-26 15:24:02,917][05596] Saving new best policy, reward=28.020! -[2023-09-26 15:24:07,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2727936. Throughput: 0: 777.2, 1: 776.6. Samples: 682342. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:07,908][04574] Avg episode reward: [(0, '26.720'), (1, '27.540')] -[2023-09-26 15:24:07,909][05384] Saving new best policy, reward=26.720! -[2023-09-26 15:24:12,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2760704. Throughput: 0: 772.3, 1: 772.2. Samples: 686910. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:12,908][04574] Avg episode reward: [(0, '26.670'), (1, '27.880')] -[2023-09-26 15:24:15,976][05900] Updated weights for policy 0, policy_version 5440 (0.0015) -[2023-09-26 15:24:15,976][05901] Updated weights for policy 1, policy_version 5440 (0.0019) -[2023-09-26 15:24:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2793472. Throughput: 0: 773.7, 1: 773.7. Samples: 696328. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:17,909][04574] Avg episode reward: [(0, '27.730'), (1, '28.220')] -[2023-09-26 15:24:17,918][05384] Saving new best policy, reward=27.730! -[2023-09-26 15:24:17,918][05596] Saving new best policy, reward=28.220! -[2023-09-26 15:24:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 2826240. Throughput: 0: 775.2, 1: 775.4. Samples: 705788. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:24:22,908][04574] Avg episode reward: [(0, '28.300'), (1, '28.220')] -[2023-09-26 15:24:22,908][05384] Saving new best policy, reward=28.300! -[2023-09-26 15:24:27,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 2854912. Throughput: 0: 776.3, 1: 777.1. Samples: 710656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:27,909][04574] Avg episode reward: [(0, '28.480'), (1, '27.770')] -[2023-09-26 15:24:27,910][05384] Saving new best policy, reward=28.480! 
-[2023-09-26 15:24:29,355][05901] Updated weights for policy 1, policy_version 5600 (0.0016) -[2023-09-26 15:24:29,355][05900] Updated weights for policy 0, policy_version 5600 (0.0015) -[2023-09-26 15:24:32,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2883584. Throughput: 0: 771.6, 1: 770.3. Samples: 719250. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:32,908][04574] Avg episode reward: [(0, '27.560'), (1, '27.690')] -[2023-09-26 15:24:32,916][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000005632_1441792.pth... -[2023-09-26 15:24:32,917][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000005632_1441792.pth... -[2023-09-26 15:24:32,951][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000002736_700416.pth -[2023-09-26 15:24:32,952][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000002736_700416.pth -[2023-09-26 15:24:37,907][04574] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2916352. Throughput: 0: 773.5, 1: 773.3. Samples: 729062. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:37,908][04574] Avg episode reward: [(0, '28.450'), (1, '27.870')] -[2023-09-26 15:24:42,408][05901] Updated weights for policy 1, policy_version 5760 (0.0017) -[2023-09-26 15:24:42,409][05900] Updated weights for policy 0, policy_version 5760 (0.0018) -[2023-09-26 15:24:42,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2949120. Throughput: 0: 772.3, 1: 772.8. Samples: 733544. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:42,909][04574] Avg episode reward: [(0, '28.750'), (1, '27.770')] -[2023-09-26 15:24:42,910][05384] Saving new best policy, reward=28.750! -[2023-09-26 15:24:47,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2981888. Throughput: 0: 772.0, 1: 771.1. 
Samples: 742977. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:24:47,909][04574] Avg episode reward: [(0, '29.240'), (1, '28.020')] -[2023-09-26 15:24:47,920][05384] Saving new best policy, reward=29.240! -[2023-09-26 15:24:52,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6206.5). Total num frames: 3006464. Throughput: 0: 774.2, 1: 775.4. Samples: 752077. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:24:52,909][04574] Avg episode reward: [(0, '29.120'), (1, '27.670')] -[2023-09-26 15:24:55,703][05901] Updated weights for policy 1, policy_version 5920 (0.0015) -[2023-09-26 15:24:55,703][05900] Updated weights for policy 0, policy_version 5920 (0.0017) -[2023-09-26 15:24:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 3039232. Throughput: 0: 776.8, 1: 777.0. Samples: 756835. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:24:57,909][04574] Avg episode reward: [(0, '28.950'), (1, '29.060')] -[2023-09-26 15:24:57,910][05596] Saving new best policy, reward=29.060! -[2023-09-26 15:25:02,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 3072000. Throughput: 0: 773.6, 1: 773.6. Samples: 765952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:02,908][04574] Avg episode reward: [(0, '29.960'), (1, '29.650')] -[2023-09-26 15:25:02,918][05596] Saving new best policy, reward=29.650! -[2023-09-26 15:25:02,918][05384] Saving new best policy, reward=29.960! -[2023-09-26 15:25:07,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3104768. Throughput: 0: 773.1, 1: 773.0. Samples: 775360. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:25:07,908][04574] Avg episode reward: [(0, '30.500'), (1, '29.540')] -[2023-09-26 15:25:07,909][05384] Saving new best policy, reward=30.500! 
-[2023-09-26 15:25:09,016][05901] Updated weights for policy 1, policy_version 6080 (0.0018) -[2023-09-26 15:25:09,016][05900] Updated weights for policy 0, policy_version 6080 (0.0016) -[2023-09-26 15:25:12,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3129344. Throughput: 0: 770.3, 1: 771.2. Samples: 780024. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:25:12,909][04574] Avg episode reward: [(0, '30.480'), (1, '29.780')] -[2023-09-26 15:25:12,943][05596] Saving new best policy, reward=29.780! -[2023-09-26 15:25:17,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3162112. Throughput: 0: 777.8, 1: 777.7. Samples: 789245. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:17,909][04574] Avg episode reward: [(0, '30.580'), (1, '30.340')] -[2023-09-26 15:25:17,919][05384] Saving new best policy, reward=30.580! -[2023-09-26 15:25:17,919][05596] Saving new best policy, reward=30.340! -[2023-09-26 15:25:22,083][05901] Updated weights for policy 1, policy_version 6240 (0.0019) -[2023-09-26 15:25:22,083][05900] Updated weights for policy 0, policy_version 6240 (0.0015) -[2023-09-26 15:25:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3194880. Throughput: 0: 773.9, 1: 774.1. Samples: 798720. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:22,909][04574] Avg episode reward: [(0, '30.360'), (1, '30.860')] -[2023-09-26 15:25:22,910][05596] Saving new best policy, reward=30.860! -[2023-09-26 15:25:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6192.6). Total num frames: 3227648. Throughput: 0: 774.9, 1: 774.5. Samples: 803270. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:27,909][04574] Avg episode reward: [(0, '31.040'), (1, '30.230')] -[2023-09-26 15:25:27,910][05384] Saving new best policy, reward=31.040! 
-[2023-09-26 15:25:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3260416. Throughput: 0: 778.9, 1: 778.4. Samples: 813056. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:25:32,909][04574] Avg episode reward: [(0, '30.310'), (1, '30.640')] -[2023-09-26 15:25:35,219][05900] Updated weights for policy 0, policy_version 6400 (0.0016) -[2023-09-26 15:25:35,220][05901] Updated weights for policy 1, policy_version 6400 (0.0016) -[2023-09-26 15:25:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3284992. Throughput: 0: 776.7, 1: 775.4. Samples: 821919. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:37,909][04574] Avg episode reward: [(0, '31.270'), (1, '30.460')] -[2023-09-26 15:25:37,979][05384] Saving new best policy, reward=31.270! -[2023-09-26 15:25:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3317760. Throughput: 0: 774.6, 1: 774.8. Samples: 826558. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:42,909][04574] Avg episode reward: [(0, '30.260'), (1, '30.740')] -[2023-09-26 15:25:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3350528. Throughput: 0: 773.8, 1: 773.8. Samples: 835594. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:25:47,909][04574] Avg episode reward: [(0, '30.530'), (1, '30.990')] -[2023-09-26 15:25:47,919][05596] Saving new best policy, reward=30.990! -[2023-09-26 15:25:48,686][05900] Updated weights for policy 0, policy_version 6560 (0.0019) -[2023-09-26 15:25:48,686][05901] Updated weights for policy 1, policy_version 6560 (0.0017) -[2023-09-26 15:25:52,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 3383296. Throughput: 0: 774.0, 1: 774.4. Samples: 845036. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:52,908][04574] Avg episode reward: [(0, '29.530'), (1, '30.780')] -[2023-09-26 15:25:57,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 3416064. Throughput: 0: 777.1, 1: 776.2. Samples: 849920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:25:57,908][04574] Avg episode reward: [(0, '28.890'), (1, '30.890')] -[2023-09-26 15:26:01,642][05901] Updated weights for policy 1, policy_version 6720 (0.0016) -[2023-09-26 15:26:01,642][05900] Updated weights for policy 0, policy_version 6720 (0.0015) -[2023-09-26 15:26:02,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 3444736. Throughput: 0: 779.0, 1: 779.0. Samples: 859352. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:02,909][04574] Avg episode reward: [(0, '29.580'), (1, '30.970')] -[2023-09-26 15:26:07,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3473408. Throughput: 0: 773.7, 1: 773.7. Samples: 868352. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:07,909][04574] Avg episode reward: [(0, '29.610'), (1, '31.470')] -[2023-09-26 15:26:07,910][05596] Saving new best policy, reward=31.470! -[2023-09-26 15:26:12,908][04574] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3506176. Throughput: 0: 774.3, 1: 773.5. Samples: 872922. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:26:12,908][04574] Avg episode reward: [(0, '28.410'), (1, '31.090')] -[2023-09-26 15:26:15,141][05900] Updated weights for policy 0, policy_version 6880 (0.0016) -[2023-09-26 15:26:15,141][05901] Updated weights for policy 1, policy_version 6880 (0.0018) -[2023-09-26 15:26:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3538944. Throughput: 0: 769.0, 1: 768.9. Samples: 882260. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:26:17,908][04574] Avg episode reward: [(0, '28.310'), (1, '31.090')] -[2023-09-26 15:26:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 3571712. Throughput: 0: 774.3, 1: 775.2. Samples: 891649. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:22,908][04574] Avg episode reward: [(0, '29.180'), (1, '30.850')] -[2023-09-26 15:26:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3596288. Throughput: 0: 778.6, 1: 776.8. Samples: 896549. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:27,909][04574] Avg episode reward: [(0, '29.370'), (1, '30.780')] -[2023-09-26 15:26:28,068][05900] Updated weights for policy 0, policy_version 7040 (0.0014) -[2023-09-26 15:26:28,069][05901] Updated weights for policy 1, policy_version 7040 (0.0016) -[2023-09-26 15:26:32,908][04574] Fps is (10 sec: 5734.1, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3629056. Throughput: 0: 776.0, 1: 775.8. Samples: 905421. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:26:32,909][04574] Avg episode reward: [(0, '30.970'), (1, '30.140')] -[2023-09-26 15:26:32,919][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000007088_1814528.pth... -[2023-09-26 15:26:32,920][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000007088_1814528.pth... -[2023-09-26 15:26:32,949][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000004192_1073152.pth -[2023-09-26 15:26:32,954][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000004192_1073152.pth -[2023-09-26 15:26:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3661824. Throughput: 0: 777.7, 1: 776.4. Samples: 914967. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:26:37,909][04574] Avg episode reward: [(0, '31.170'), (1, '30.030')] -[2023-09-26 15:26:41,347][05900] Updated weights for policy 0, policy_version 7200 (0.0018) -[2023-09-26 15:26:41,347][05901] Updated weights for policy 1, policy_version 7200 (0.0018) -[2023-09-26 15:26:42,908][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 3694592. Throughput: 0: 773.7, 1: 773.7. Samples: 919554. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:42,909][04574] Avg episode reward: [(0, '30.640'), (1, '30.450')] -[2023-09-26 15:26:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3719168. Throughput: 0: 772.4, 1: 773.0. Samples: 928898. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:47,909][04574] Avg episode reward: [(0, '30.990'), (1, '31.650')] -[2023-09-26 15:26:47,979][05596] Saving new best policy, reward=31.650! -[2023-09-26 15:26:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3751936. Throughput: 0: 773.7, 1: 773.7. Samples: 937984. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:52,909][04574] Avg episode reward: [(0, '31.360'), (1, '31.240')] -[2023-09-26 15:26:52,910][05384] Saving new best policy, reward=31.360! -[2023-09-26 15:26:54,683][05901] Updated weights for policy 1, policy_version 7360 (0.0016) -[2023-09-26 15:26:54,683][05900] Updated weights for policy 0, policy_version 7360 (0.0017) -[2023-09-26 15:26:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3784704. Throughput: 0: 773.8, 1: 773.6. Samples: 942551. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:26:57,909][04574] Avg episode reward: [(0, '31.040'), (1, '31.460')] -[2023-09-26 15:27:02,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 3817472. 
Throughput: 0: 778.4, 1: 778.5. Samples: 952319. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:02,908][04574] Avg episode reward: [(0, '31.120'), (1, '31.110')] -[2023-09-26 15:27:07,812][05900] Updated weights for policy 0, policy_version 7520 (0.0020) -[2023-09-26 15:27:07,813][05901] Updated weights for policy 1, policy_version 7520 (0.0018) -[2023-09-26 15:27:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3850240. Throughput: 0: 773.6, 1: 773.4. Samples: 961263. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:07,909][04574] Avg episode reward: [(0, '31.400'), (1, '31.920')] -[2023-09-26 15:27:07,910][05384] Saving new best policy, reward=31.400! -[2023-09-26 15:27:07,910][05596] Saving new best policy, reward=31.920! -[2023-09-26 15:27:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3874816. Throughput: 0: 771.9, 1: 773.7. Samples: 966100. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:27:12,908][04574] Avg episode reward: [(0, '31.770'), (1, '32.100')] -[2023-09-26 15:27:12,909][05384] Saving new best policy, reward=31.770! -[2023-09-26 15:27:12,910][05596] Saving new best policy, reward=32.100! -[2023-09-26 15:27:17,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 3907584. Throughput: 0: 775.3, 1: 775.3. Samples: 975198. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:27:17,909][04574] Avg episode reward: [(0, '32.040'), (1, '32.460')] -[2023-09-26 15:27:17,920][05384] Saving new best policy, reward=32.040! -[2023-09-26 15:27:17,920][05596] Saving new best policy, reward=32.460! -[2023-09-26 15:27:21,044][05900] Updated weights for policy 0, policy_version 7680 (0.0018) -[2023-09-26 15:27:21,045][05901] Updated weights for policy 1, policy_version 7680 (0.0016) -[2023-09-26 15:27:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). 
Total num frames: 3940352. Throughput: 0: 774.9, 1: 776.1. Samples: 984763. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:22,909][04574] Avg episode reward: [(0, '32.530'), (1, '32.760')] -[2023-09-26 15:27:22,910][05384] Saving new best policy, reward=32.530! -[2023-09-26 15:27:22,910][05596] Saving new best policy, reward=32.760! -[2023-09-26 15:27:27,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3973120. Throughput: 0: 773.7, 1: 773.7. Samples: 989184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:27,909][04574] Avg episode reward: [(0, '33.250'), (1, '32.260')] -[2023-09-26 15:27:27,910][05384] Saving new best policy, reward=33.250! -[2023-09-26 15:27:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4005888. Throughput: 0: 774.7, 1: 773.9. Samples: 998585. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:32,909][04574] Avg episode reward: [(0, '33.550'), (1, '31.580')] -[2023-09-26 15:27:32,921][05384] Saving new best policy, reward=33.550! -[2023-09-26 15:27:34,202][05901] Updated weights for policy 1, policy_version 7840 (0.0015) -[2023-09-26 15:27:34,203][05900] Updated weights for policy 0, policy_version 7840 (0.0018) -[2023-09-26 15:27:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4030464. Throughput: 0: 775.4, 1: 775.3. Samples: 1007763. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:37,908][04574] Avg episode reward: [(0, '34.360'), (1, '31.090')] -[2023-09-26 15:27:37,909][05384] Saving new best policy, reward=34.360! -[2023-09-26 15:27:42,907][04574] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4063232. Throughput: 0: 775.3, 1: 776.2. Samples: 1012368. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:42,908][04574] Avg episode reward: [(0, '34.660'), (1, '30.980')] -[2023-09-26 15:27:42,909][05384] Saving new best policy, reward=34.660! -[2023-09-26 15:27:47,375][05901] Updated weights for policy 1, policy_version 8000 (0.0019) -[2023-09-26 15:27:47,375][05900] Updated weights for policy 0, policy_version 8000 (0.0019) -[2023-09-26 15:27:47,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 4096000. Throughput: 0: 773.7, 1: 773.7. Samples: 1021952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:47,908][04574] Avg episode reward: [(0, '34.960'), (1, '31.370')] -[2023-09-26 15:27:47,917][05384] Saving new best policy, reward=34.960! -[2023-09-26 15:27:52,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6206.5). Total num frames: 4128768. Throughput: 0: 777.0, 1: 778.1. Samples: 1031246. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:52,908][04574] Avg episode reward: [(0, '35.470'), (1, '31.360')] -[2023-09-26 15:27:52,909][05384] Saving new best policy, reward=35.470! -[2023-09-26 15:27:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4153344. Throughput: 0: 774.1, 1: 774.7. Samples: 1035796. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:27:57,908][04574] Avg episode reward: [(0, '35.930'), (1, '31.630')] -[2023-09-26 15:27:57,909][05384] Saving new best policy, reward=35.930! -[2023-09-26 15:28:00,780][05900] Updated weights for policy 0, policy_version 8160 (0.0017) -[2023-09-26 15:28:00,780][05901] Updated weights for policy 1, policy_version 8160 (0.0016) -[2023-09-26 15:28:02,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4186112. Throughput: 0: 772.9, 1: 773.2. Samples: 1044773. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:28:02,909][04574] Avg episode reward: [(0, '36.450'), (1, '32.070')] -[2023-09-26 15:28:02,919][05384] Saving new best policy, reward=36.450! -[2023-09-26 15:28:07,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4218880. Throughput: 0: 771.2, 1: 773.6. Samples: 1054279. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:28:07,909][04574] Avg episode reward: [(0, '36.850'), (1, '32.500')] -[2023-09-26 15:28:07,910][05384] Saving new best policy, reward=36.850! -[2023-09-26 15:28:12,908][04574] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 4243456. Throughput: 0: 772.4, 1: 773.7. Samples: 1058760. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:28:12,909][04574] Avg episode reward: [(0, '37.100'), (1, '32.730')] -[2023-09-26 15:28:12,979][05384] Saving new best policy, reward=37.100! -[2023-09-26 15:28:14,332][05901] Updated weights for policy 1, policy_version 8320 (0.0018) -[2023-09-26 15:28:14,332][05900] Updated weights for policy 0, policy_version 8320 (0.0018) -[2023-09-26 15:28:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4276224. Throughput: 0: 767.5, 1: 767.5. Samples: 1067658. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:28:17,908][04574] Avg episode reward: [(0, '37.350'), (1, '33.120')] -[2023-09-26 15:28:17,917][05596] Saving new best policy, reward=33.120! -[2023-09-26 15:28:18,110][05384] Saving new best policy, reward=37.350! -[2023-09-26 15:28:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4308992. Throughput: 0: 772.0, 1: 772.1. Samples: 1077248. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:28:22,909][04574] Avg episode reward: [(0, '36.840'), (1, '31.920')] -[2023-09-26 15:28:27,446][05900] Updated weights for policy 0, policy_version 8480 (0.0016) -[2023-09-26 15:28:27,446][05901] Updated weights for policy 1, policy_version 8480 (0.0017) -[2023-09-26 15:28:27,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4341760. Throughput: 0: 770.9, 1: 771.1. Samples: 1081761. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:28:27,909][04574] Avg episode reward: [(0, '36.740'), (1, '31.580')] -[2023-09-26 15:28:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4374528. Throughput: 0: 768.1, 1: 767.6. Samples: 1091062. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:28:32,909][04574] Avg episode reward: [(0, '36.980'), (1, '30.940')] -[2023-09-26 15:28:32,920][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000008544_2187264.pth... -[2023-09-26 15:28:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000008544_2187264.pth... -[2023-09-26 15:28:32,955][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000005632_1441792.pth -[2023-09-26 15:28:32,962][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000005632_1441792.pth -[2023-09-26 15:28:37,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 4407296. Throughput: 0: 769.6, 1: 768.3. Samples: 1100451. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:28:37,908][04574] Avg episode reward: [(0, '37.150'), (1, '31.630')] -[2023-09-26 15:28:40,522][05901] Updated weights for policy 1, policy_version 8640 (0.0017) -[2023-09-26 15:28:40,523][05900] Updated weights for policy 0, policy_version 8640 (0.0019) -[2023-09-26 15:28:42,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). 
Total num frames: 4431872. Throughput: 0: 772.0, 1: 771.9. Samples: 1105271. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:28:42,908][04574] Avg episode reward: [(0, '37.170'), (1, '32.330')] -[2023-09-26 15:28:47,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4464640. Throughput: 0: 775.1, 1: 774.9. Samples: 1114521. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:28:47,909][04574] Avg episode reward: [(0, '38.200'), (1, '31.200')] -[2023-09-26 15:28:47,921][05384] Saving new best policy, reward=38.200! -[2023-09-26 15:28:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4497408. Throughput: 0: 773.6, 1: 773.6. Samples: 1123903. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:28:52,909][04574] Avg episode reward: [(0, '37.780'), (1, '30.330')] -[2023-09-26 15:28:53,953][05901] Updated weights for policy 1, policy_version 8800 (0.0016) -[2023-09-26 15:28:53,953][05900] Updated weights for policy 0, policy_version 8800 (0.0016) -[2023-09-26 15:28:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 4530176. Throughput: 0: 773.5, 1: 771.2. Samples: 1128272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:28:57,909][04574] Avg episode reward: [(0, '38.860'), (1, '31.210')] -[2023-09-26 15:28:57,910][05384] Saving new best policy, reward=38.860! -[2023-09-26 15:29:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4554752. Throughput: 0: 775.7, 1: 775.5. Samples: 1137462. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:02,909][04574] Avg episode reward: [(0, '38.400'), (1, '30.760')] -[2023-09-26 15:29:06,974][05901] Updated weights for policy 1, policy_version 8960 (0.0017) -[2023-09-26 15:29:06,974][05900] Updated weights for policy 0, policy_version 8960 (0.0018) -[2023-09-26 15:29:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4587520. Throughput: 0: 773.8, 1: 773.8. Samples: 1146888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:07,909][04574] Avg episode reward: [(0, '38.010'), (1, '31.880')] -[2023-09-26 15:29:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 4620288. Throughput: 0: 774.4, 1: 774.6. Samples: 1151467. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:29:12,909][04574] Avg episode reward: [(0, '37.220'), (1, '32.550')] -[2023-09-26 15:29:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 4653056. Throughput: 0: 775.4, 1: 776.5. Samples: 1160894. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:29:17,908][04574] Avg episode reward: [(0, '36.470'), (1, '32.580')] -[2023-09-26 15:29:20,409][05900] Updated weights for policy 0, policy_version 9120 (0.0018) -[2023-09-26 15:29:20,410][05901] Updated weights for policy 1, policy_version 9120 (0.0017) -[2023-09-26 15:29:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6178.7). Total num frames: 4677632. Throughput: 0: 771.0, 1: 770.5. Samples: 1169818. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:22,909][04574] Avg episode reward: [(0, '36.580'), (1, '32.510')] -[2023-09-26 15:29:27,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4710400. Throughput: 0: 770.4, 1: 771.2. Samples: 1174642. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:27,908][04574] Avg episode reward: [(0, '37.080'), (1, '32.070')] -[2023-09-26 15:29:32,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4743168. Throughput: 0: 769.2, 1: 769.2. Samples: 1183749. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:32,909][04574] Avg episode reward: [(0, '36.180'), (1, '32.470')] -[2023-09-26 15:29:33,591][05900] Updated weights for policy 0, policy_version 9280 (0.0017) -[2023-09-26 15:29:33,592][05901] Updated weights for policy 1, policy_version 9280 (0.0018) -[2023-09-26 15:29:37,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4775936. Throughput: 0: 772.5, 1: 772.5. Samples: 1193429. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:37,908][04574] Avg episode reward: [(0, '35.920'), (1, '31.290')] -[2023-09-26 15:29:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 4800512. Throughput: 0: 771.1, 1: 773.1. Samples: 1197760. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:42,909][04574] Avg episode reward: [(0, '34.800'), (1, '31.420')] -[2023-09-26 15:29:47,094][05900] Updated weights for policy 0, policy_version 9440 (0.0014) -[2023-09-26 15:29:47,095][05901] Updated weights for policy 1, policy_version 9440 (0.0017) -[2023-09-26 15:29:47,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4833280. Throughput: 0: 768.7, 1: 769.6. Samples: 1206688. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:29:47,909][04574] Avg episode reward: [(0, '34.460'), (1, '30.570')] -[2023-09-26 15:29:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4866048. Throughput: 0: 772.1, 1: 772.6. Samples: 1216399. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:52,909][04574] Avg episode reward: [(0, '34.180'), (1, '29.820')] -[2023-09-26 15:29:57,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4898816. Throughput: 0: 771.1, 1: 770.9. Samples: 1220855. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:29:57,908][04574] Avg episode reward: [(0, '32.610'), (1, '31.360')] -[2023-09-26 15:30:00,009][05900] Updated weights for policy 0, policy_version 9600 (0.0018) -[2023-09-26 15:30:00,009][05901] Updated weights for policy 1, policy_version 9600 (0.0017) -[2023-09-26 15:30:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 4931584. Throughput: 0: 774.6, 1: 773.4. Samples: 1230557. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:02,909][04574] Avg episode reward: [(0, '31.630'), (1, '31.350')] -[2023-09-26 15:30:07,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4956160. Throughput: 0: 771.1, 1: 771.8. Samples: 1239248. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:07,908][04574] Avg episode reward: [(0, '32.550'), (1, '31.780')] -[2023-09-26 15:30:12,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 4988928. Throughput: 0: 771.2, 1: 768.7. Samples: 1243934. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:30:12,908][04574] Avg episode reward: [(0, '31.670'), (1, '31.770')] -[2023-09-26 15:30:13,540][05900] Updated weights for policy 0, policy_version 9760 (0.0016) -[2023-09-26 15:30:13,540][05901] Updated weights for policy 1, policy_version 9760 (0.0013) -[2023-09-26 15:30:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5021696. Throughput: 0: 773.6, 1: 773.2. Samples: 1253356. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:17,909][04574] Avg episode reward: [(0, '30.420'), (1, '31.180')] -[2023-09-26 15:30:22,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5054464. Throughput: 0: 768.4, 1: 766.8. Samples: 1262511. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:22,909][04574] Avg episode reward: [(0, '29.550'), (1, '31.170')] -[2023-09-26 15:30:26,618][05901] Updated weights for policy 1, policy_version 9920 (0.0015) -[2023-09-26 15:30:26,619][05900] Updated weights for policy 0, policy_version 9920 (0.0016) -[2023-09-26 15:30:27,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5087232. Throughput: 0: 775.8, 1: 773.8. Samples: 1267494. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:27,908][04574] Avg episode reward: [(0, '29.610'), (1, '32.720')] -[2023-09-26 15:30:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5111808. Throughput: 0: 778.3, 1: 777.8. Samples: 1276715. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:32,909][04574] Avg episode reward: [(0, '28.850'), (1, '32.690')] -[2023-09-26 15:30:33,042][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000010000_2560000.pth... -[2023-09-26 15:30:33,068][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000007088_1814528.pth -[2023-09-26 15:30:33,078][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000010000_2560000.pth... -[2023-09-26 15:30:33,106][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000007088_1814528.pth -[2023-09-26 15:30:37,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5144576. Throughput: 0: 775.8, 1: 775.1. Samples: 1286189. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:30:37,908][04574] Avg episode reward: [(0, '27.510'), (1, '32.030')] -[2023-09-26 15:30:39,545][05900] Updated weights for policy 0, policy_version 10080 (0.0014) -[2023-09-26 15:30:39,545][05901] Updated weights for policy 1, policy_version 10080 (0.0016) -[2023-09-26 15:30:42,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 5177344. Throughput: 0: 779.9, 1: 779.5. Samples: 1291025. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:30:42,908][04574] Avg episode reward: [(0, '28.110'), (1, '31.260')] -[2023-09-26 15:30:47,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5210112. Throughput: 0: 774.0, 1: 775.3. Samples: 1300278. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:30:47,909][04574] Avg episode reward: [(0, '28.150'), (1, '30.650')] -[2023-09-26 15:30:52,672][05900] Updated weights for policy 0, policy_version 10240 (0.0016) -[2023-09-26 15:30:52,672][05901] Updated weights for policy 1, policy_version 10240 (0.0017) -[2023-09-26 15:30:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5242880. Throughput: 0: 783.5, 1: 783.1. Samples: 1309744. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:52,909][04574] Avg episode reward: [(0, '27.590'), (1, '30.830')] -[2023-09-26 15:30:57,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6206.5). Total num frames: 5275648. Throughput: 0: 786.2, 1: 787.1. Samples: 1314732. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:30:57,909][04574] Avg episode reward: [(0, '26.730'), (1, '32.810')] -[2023-09-26 15:31:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5300224. Throughput: 0: 785.9, 1: 786.2. Samples: 1324101. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:02,909][04574] Avg episode reward: [(0, '27.450'), (1, '33.570')] -[2023-09-26 15:31:02,919][05596] Saving new best policy, reward=33.570! -[2023-09-26 15:31:05,505][05900] Updated weights for policy 0, policy_version 10400 (0.0015) -[2023-09-26 15:31:05,505][05901] Updated weights for policy 1, policy_version 10400 (0.0017) -[2023-09-26 15:31:07,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5332992. Throughput: 0: 789.1, 1: 787.6. Samples: 1333464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:07,908][04574] Avg episode reward: [(0, '26.910'), (1, '32.840')] -[2023-09-26 15:31:12,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5365760. Throughput: 0: 786.1, 1: 787.7. Samples: 1338315. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:31:12,909][04574] Avg episode reward: [(0, '27.040'), (1, '33.170')] -[2023-09-26 15:31:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5398528. Throughput: 0: 787.5, 1: 787.3. Samples: 1347584. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:31:17,909][04574] Avg episode reward: [(0, '26.870'), (1, '33.730')] -[2023-09-26 15:31:17,921][05596] Saving new best policy, reward=33.730! -[2023-09-26 15:31:18,630][05901] Updated weights for policy 1, policy_version 10560 (0.0018) -[2023-09-26 15:31:18,630][05900] Updated weights for policy 0, policy_version 10560 (0.0019) -[2023-09-26 15:31:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5431296. Throughput: 0: 786.4, 1: 786.4. Samples: 1356965. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:22,909][04574] Avg episode reward: [(0, '26.590'), (1, '33.120')] -[2023-09-26 15:31:27,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5464064. 
Throughput: 0: 785.9, 1: 787.1. Samples: 1361809. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:27,908][04574] Avg episode reward: [(0, '27.610'), (1, '33.110')] -[2023-09-26 15:31:31,816][05900] Updated weights for policy 0, policy_version 10720 (0.0017) -[2023-09-26 15:31:31,816][05901] Updated weights for policy 1, policy_version 10720 (0.0017) -[2023-09-26 15:31:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5488640. Throughput: 0: 785.8, 1: 785.2. Samples: 1370969. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:32,909][04574] Avg episode reward: [(0, '27.270'), (1, '32.860')] -[2023-09-26 15:31:37,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5521408. Throughput: 0: 784.3, 1: 784.8. Samples: 1380352. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:37,908][04574] Avg episode reward: [(0, '27.340'), (1, '32.580')] -[2023-09-26 15:31:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5554176. Throughput: 0: 781.0, 1: 780.9. Samples: 1385021. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:31:42,909][04574] Avg episode reward: [(0, '27.460'), (1, '32.300')] -[2023-09-26 15:31:44,871][05900] Updated weights for policy 0, policy_version 10880 (0.0016) -[2023-09-26 15:31:44,872][05901] Updated weights for policy 1, policy_version 10880 (0.0016) -[2023-09-26 15:31:47,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 5586944. Throughput: 0: 784.3, 1: 783.7. Samples: 1394658. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:47,908][04574] Avg episode reward: [(0, '28.920'), (1, '33.020')] -[2023-09-26 15:31:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5619712. Throughput: 0: 777.8, 1: 778.0. Samples: 1403475. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:31:52,909][04574] Avg episode reward: [(0, '28.800'), (1, '33.600')] -[2023-09-26 15:31:57,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5644288. Throughput: 0: 779.9, 1: 779.8. Samples: 1408499. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:31:57,908][04574] Avg episode reward: [(0, '30.070'), (1, '34.530')] -[2023-09-26 15:31:57,909][05596] Saving new best policy, reward=34.530! -[2023-09-26 15:31:58,204][05900] Updated weights for policy 0, policy_version 11040 (0.0018) -[2023-09-26 15:31:58,204][05901] Updated weights for policy 1, policy_version 11040 (0.0017) -[2023-09-26 15:32:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 5677056. Throughput: 0: 777.3, 1: 777.3. Samples: 1417543. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 15:32:02,909][04574] Avg episode reward: [(0, '30.730'), (1, '34.420')] -[2023-09-26 15:32:07,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5709824. Throughput: 0: 781.2, 1: 781.6. Samples: 1427293. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:32:07,909][04574] Avg episode reward: [(0, '31.410'), (1, '33.180')] -[2023-09-26 15:32:11,152][05900] Updated weights for policy 0, policy_version 11200 (0.0016) -[2023-09-26 15:32:11,152][05901] Updated weights for policy 1, policy_version 11200 (0.0018) -[2023-09-26 15:32:12,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5742592. Throughput: 0: 777.8, 1: 776.5. Samples: 1431755. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:32:12,909][04574] Avg episode reward: [(0, '31.520'), (1, '33.420')] -[2023-09-26 15:32:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5775360. Throughput: 0: 782.7, 1: 782.3. Samples: 1441396. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:32:17,909][04574] Avg episode reward: [(0, '32.100'), (1, '33.510')] -[2023-09-26 15:32:22,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5799936. Throughput: 0: 779.8, 1: 779.4. Samples: 1450518. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:32:22,909][04574] Avg episode reward: [(0, '32.040'), (1, '32.930')] -[2023-09-26 15:32:24,255][05901] Updated weights for policy 1, policy_version 11360 (0.0017) -[2023-09-26 15:32:24,255][05900] Updated weights for policy 0, policy_version 11360 (0.0016) -[2023-09-26 15:32:27,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 5832704. Throughput: 0: 781.8, 1: 781.9. Samples: 1455388. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:32:27,909][04574] Avg episode reward: [(0, '32.070'), (1, '33.570')] -[2023-09-26 15:32:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5865472. Throughput: 0: 778.8, 1: 779.5. Samples: 1464781. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:32:32,909][04574] Avg episode reward: [(0, '32.350'), (1, '32.380')] -[2023-09-26 15:32:32,921][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000011456_2932736.pth... -[2023-09-26 15:32:32,921][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000011456_2932736.pth... 
-[2023-09-26 15:32:32,957][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000008544_2187264.pth -[2023-09-26 15:32:32,957][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000008544_2187264.pth -[2023-09-26 15:32:37,221][05901] Updated weights for policy 1, policy_version 11520 (0.0017) -[2023-09-26 15:32:37,221][05900] Updated weights for policy 0, policy_version 11520 (0.0016) -[2023-09-26 15:32:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5898240. Throughput: 0: 788.8, 1: 789.1. Samples: 1474481. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:32:37,909][04574] Avg episode reward: [(0, '32.190'), (1, '32.780')] -[2023-09-26 15:32:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5931008. Throughput: 0: 782.4, 1: 781.5. Samples: 1478877. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:32:42,909][04574] Avg episode reward: [(0, '31.700'), (1, '32.440')] -[2023-09-26 15:32:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 5963776. Throughput: 0: 789.8, 1: 790.3. Samples: 1488648. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:32:47,909][04574] Avg episode reward: [(0, '32.610'), (1, '32.820')] -[2023-09-26 15:32:50,318][05901] Updated weights for policy 1, policy_version 11680 (0.0017) -[2023-09-26 15:32:50,318][05900] Updated weights for policy 0, policy_version 11680 (0.0017) -[2023-09-26 15:32:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 5988352. Throughput: 0: 781.4, 1: 781.5. Samples: 1497620. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:32:52,909][04574] Avg episode reward: [(0, '32.100'), (1, '32.090')] -[2023-09-26 15:32:57,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 6021120. Throughput: 0: 782.4, 1: 782.2. Samples: 1502162. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:32:57,908][04574] Avg episode reward: [(0, '34.220'), (1, '32.550')] -[2023-09-26 15:33:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 6053888. Throughput: 0: 778.0, 1: 778.2. Samples: 1511424. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:33:02,909][04574] Avg episode reward: [(0, '34.510'), (1, '32.870')] -[2023-09-26 15:33:03,735][05901] Updated weights for policy 1, policy_version 11840 (0.0018) -[2023-09-26 15:33:03,735][05900] Updated weights for policy 0, policy_version 11840 (0.0019) -[2023-09-26 15:33:07,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6086656. Throughput: 0: 778.6, 1: 778.7. Samples: 1520597. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:33:07,909][04574] Avg episode reward: [(0, '35.260'), (1, '32.870')] -[2023-09-26 15:33:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6119424. Throughput: 0: 780.1, 1: 780.5. Samples: 1525615. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:33:12,909][04574] Avg episode reward: [(0, '35.600'), (1, '33.540')] -[2023-09-26 15:33:16,836][05901] Updated weights for policy 1, policy_version 12000 (0.0019) -[2023-09-26 15:33:16,836][05900] Updated weights for policy 0, policy_version 12000 (0.0017) -[2023-09-26 15:33:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6144000. Throughput: 0: 776.9, 1: 777.0. Samples: 1534706. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:33:17,909][04574] Avg episode reward: [(0, '36.440'), (1, '32.440')] -[2023-09-26 15:33:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 6176768. Throughput: 0: 774.8, 1: 774.4. Samples: 1544192. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:33:22,909][04574] Avg episode reward: [(0, '36.710'), (1, '31.440')] -[2023-09-26 15:33:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 6209536. Throughput: 0: 775.9, 1: 776.4. Samples: 1548728. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:33:27,909][04574] Avg episode reward: [(0, '36.090'), (1, '30.540')] -[2023-09-26 15:33:30,056][05901] Updated weights for policy 1, policy_version 12160 (0.0016) -[2023-09-26 15:33:30,056][05900] Updated weights for policy 0, policy_version 12160 (0.0017) -[2023-09-26 15:33:32,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 6242304. Throughput: 0: 772.8, 1: 773.3. Samples: 1558223. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:33:32,908][04574] Avg episode reward: [(0, '35.720'), (1, '28.860')] -[2023-09-26 15:33:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6275072. Throughput: 0: 776.6, 1: 776.6. Samples: 1567516. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:33:37,909][04574] Avg episode reward: [(0, '35.610'), (1, '29.710')] -[2023-09-26 15:33:42,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6299648. Throughput: 0: 780.7, 1: 781.8. Samples: 1572478. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:33:42,909][04574] Avg episode reward: [(0, '35.730'), (1, '30.060')] -[2023-09-26 15:33:43,171][05901] Updated weights for policy 1, policy_version 12320 (0.0019) -[2023-09-26 15:33:43,171][05900] Updated weights for policy 0, policy_version 12320 (0.0017) -[2023-09-26 15:33:47,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6332416. Throughput: 0: 776.4, 1: 776.2. Samples: 1581289. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:33:47,908][04574] Avg episode reward: [(0, '35.670'), (1, '29.550')] -[2023-09-26 15:33:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 6365184. Throughput: 0: 779.6, 1: 781.2. Samples: 1590833. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:33:52,909][04574] Avg episode reward: [(0, '35.260'), (1, '30.850')] -[2023-09-26 15:33:56,430][05900] Updated weights for policy 0, policy_version 12480 (0.0019) -[2023-09-26 15:33:56,430][05901] Updated weights for policy 1, policy_version 12480 (0.0018) -[2023-09-26 15:33:57,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6397952. Throughput: 0: 775.6, 1: 775.0. Samples: 1595392. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:33:57,909][04574] Avg episode reward: [(0, '35.010'), (1, '31.520')] -[2023-09-26 15:34:02,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6422528. Throughput: 0: 773.3, 1: 774.8. Samples: 1604370. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:34:02,909][04574] Avg episode reward: [(0, '34.520'), (1, '31.470')] -[2023-09-26 15:34:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6455296. Throughput: 0: 773.7, 1: 773.7. Samples: 1613824. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:34:07,909][04574] Avg episode reward: [(0, '34.890'), (1, '31.690')] -[2023-09-26 15:34:09,664][05900] Updated weights for policy 0, policy_version 12640 (0.0016) -[2023-09-26 15:34:09,664][05901] Updated weights for policy 1, policy_version 12640 (0.0018) -[2023-09-26 15:34:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6488064. Throughput: 0: 774.9, 1: 773.8. Samples: 1618419. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:34:12,908][04574] Avg episode reward: [(0, '34.890'), (1, '31.640')] -[2023-09-26 15:34:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6520832. Throughput: 0: 777.5, 1: 776.6. Samples: 1628160. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:34:17,909][04574] Avg episode reward: [(0, '35.140'), (1, '30.010')] -[2023-09-26 15:34:22,714][05900] Updated weights for policy 0, policy_version 12800 (0.0018) -[2023-09-26 15:34:22,714][05901] Updated weights for policy 1, policy_version 12800 (0.0017) -[2023-09-26 15:34:22,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 6553600. Throughput: 0: 776.1, 1: 775.3. Samples: 1637326. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:34:22,908][04574] Avg episode reward: [(0, '35.940'), (1, '29.990')] -[2023-09-26 15:34:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6578176. Throughput: 0: 774.8, 1: 775.7. Samples: 1642252. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:34:27,909][04574] Avg episode reward: [(0, '36.780'), (1, '30.490')] -[2023-09-26 15:34:32,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6610944. Throughput: 0: 776.7, 1: 777.0. Samples: 1651205. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:34:32,909][04574] Avg episode reward: [(0, '36.870'), (1, '29.160')] -[2023-09-26 15:34:32,921][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000012912_3305472.pth... -[2023-09-26 15:34:32,921][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000012912_3305472.pth... 
-[2023-09-26 15:34:32,961][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000010000_2560000.pth -[2023-09-26 15:34:32,963][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000010000_2560000.pth -[2023-09-26 15:34:35,933][05901] Updated weights for policy 1, policy_version 12960 (0.0017) -[2023-09-26 15:34:35,934][05900] Updated weights for policy 0, policy_version 12960 (0.0017) -[2023-09-26 15:34:37,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6643712. Throughput: 0: 779.2, 1: 778.0. Samples: 1660911. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:34:37,908][04574] Avg episode reward: [(0, '37.790'), (1, '29.600')] -[2023-09-26 15:34:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6676480. Throughput: 0: 776.4, 1: 776.3. Samples: 1665262. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:34:42,909][04574] Avg episode reward: [(0, '34.280'), (1, '30.270')] -[2023-09-26 15:34:47,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6709248. Throughput: 0: 786.2, 1: 785.1. Samples: 1675079. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:34:47,909][04574] Avg episode reward: [(0, '34.240'), (1, '29.710')] -[2023-09-26 15:34:48,986][05901] Updated weights for policy 1, policy_version 13120 (0.0016) -[2023-09-26 15:34:48,986][05900] Updated weights for policy 0, policy_version 13120 (0.0017) -[2023-09-26 15:34:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6742016. Throughput: 0: 780.9, 1: 780.8. Samples: 1684102. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:34:52,909][04574] Avg episode reward: [(0, '34.270'), (1, '30.290')] -[2023-09-26 15:34:57,908][04574] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6766592. Throughput: 0: 781.4, 1: 783.8. Samples: 1688853. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:34:57,908][04574] Avg episode reward: [(0, '32.740'), (1, '30.980')] -[2023-09-26 15:35:02,215][05901] Updated weights for policy 1, policy_version 13280 (0.0017) -[2023-09-26 15:35:02,215][05900] Updated weights for policy 0, policy_version 13280 (0.0018) -[2023-09-26 15:35:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6799360. Throughput: 0: 775.6, 1: 775.8. Samples: 1697976. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:02,908][04574] Avg episode reward: [(0, '32.620'), (1, '31.030')] -[2023-09-26 15:35:07,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6832128. Throughput: 0: 782.7, 1: 783.7. Samples: 1707817. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:07,909][04574] Avg episode reward: [(0, '33.300'), (1, '30.930')] -[2023-09-26 15:35:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6864896. Throughput: 0: 777.0, 1: 775.8. Samples: 1712129. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:12,909][04574] Avg episode reward: [(0, '32.810'), (1, '31.260')] -[2023-09-26 15:35:15,226][05901] Updated weights for policy 1, policy_version 13440 (0.0015) -[2023-09-26 15:35:15,226][05900] Updated weights for policy 0, policy_version 13440 (0.0017) -[2023-09-26 15:35:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6897664. Throughput: 0: 783.6, 1: 783.3. Samples: 1721718. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:17,909][04574] Avg episode reward: [(0, '32.980'), (1, '30.180')] -[2023-09-26 15:35:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6922240. Throughput: 0: 778.4, 1: 778.3. Samples: 1730962. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:22,909][04574] Avg episode reward: [(0, '33.020'), (1, '30.500')] -[2023-09-26 15:35:27,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6955008. Throughput: 0: 782.9, 1: 783.2. Samples: 1735739. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:27,909][04574] Avg episode reward: [(0, '33.060'), (1, '30.070')] -[2023-09-26 15:35:28,355][05901] Updated weights for policy 1, policy_version 13600 (0.0017) -[2023-09-26 15:35:28,355][05900] Updated weights for policy 0, policy_version 13600 (0.0018) -[2023-09-26 15:35:32,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 6987776. Throughput: 0: 775.9, 1: 775.6. Samples: 1744896. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:32,909][04574] Avg episode reward: [(0, '33.080'), (1, '30.330')] -[2023-09-26 15:35:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7012352. Throughput: 0: 773.7, 1: 773.0. Samples: 1753704. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:37,908][04574] Avg episode reward: [(0, '33.210'), (1, '31.160')] -[2023-09-26 15:35:41,840][05901] Updated weights for policy 1, policy_version 13760 (0.0017) -[2023-09-26 15:35:41,840][05900] Updated weights for policy 0, policy_version 13760 (0.0015) -[2023-09-26 15:35:42,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7045120. Throughput: 0: 774.0, 1: 772.4. Samples: 1758443. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:35:42,908][04574] Avg episode reward: [(0, '33.640'), (1, '31.570')] -[2023-09-26 15:35:47,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7077888. Throughput: 0: 775.9, 1: 775.3. Samples: 1767779. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:35:47,909][04574] Avg episode reward: [(0, '33.200'), (1, '31.060')] -[2023-09-26 15:35:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7110656. Throughput: 0: 774.3, 1: 774.3. Samples: 1777505. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:35:52,909][04574] Avg episode reward: [(0, '33.340'), (1, '32.050')] -[2023-09-26 15:35:54,912][05900] Updated weights for policy 0, policy_version 13920 (0.0016) -[2023-09-26 15:35:54,912][05901] Updated weights for policy 1, policy_version 13920 (0.0018) -[2023-09-26 15:35:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7143424. Throughput: 0: 774.4, 1: 774.2. Samples: 1781814. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:35:57,909][04574] Avg episode reward: [(0, '32.940'), (1, '32.880')] -[2023-09-26 15:36:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7176192. Throughput: 0: 774.2, 1: 774.6. Samples: 1791414. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:36:02,909][04574] Avg episode reward: [(0, '31.580'), (1, '32.210')] -[2023-09-26 15:36:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7200768. Throughput: 0: 769.9, 1: 769.8. Samples: 1800248. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:07,909][04574] Avg episode reward: [(0, '30.920'), (1, '31.820')] -[2023-09-26 15:36:08,220][05901] Updated weights for policy 1, policy_version 14080 (0.0017) -[2023-09-26 15:36:08,220][05900] Updated weights for policy 0, policy_version 14080 (0.0018) -[2023-09-26 15:36:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7233536. Throughput: 0: 771.1, 1: 771.6. Samples: 1805162. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:12,909][04574] Avg episode reward: [(0, '31.490'), (1, '31.050')] -[2023-09-26 15:36:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7266304. Throughput: 0: 773.7, 1: 773.7. Samples: 1814529. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:36:17,909][04574] Avg episode reward: [(0, '31.650'), (1, '30.660')] -[2023-09-26 15:36:21,217][05900] Updated weights for policy 0, policy_version 14240 (0.0017) -[2023-09-26 15:36:21,217][05901] Updated weights for policy 1, policy_version 14240 (0.0016) -[2023-09-26 15:36:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7299072. Throughput: 0: 782.4, 1: 782.8. Samples: 1824138. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:36:22,908][04574] Avg episode reward: [(0, '32.320'), (1, '31.040')] -[2023-09-26 15:36:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7331840. Throughput: 0: 780.6, 1: 781.7. Samples: 1828747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:27,909][04574] Avg episode reward: [(0, '32.680'), (1, '31.520')] -[2023-09-26 15:36:32,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7356416. Throughput: 0: 779.8, 1: 780.0. Samples: 1837967. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:32,909][04574] Avg episode reward: [(0, '32.280'), (1, '31.590')] -[2023-09-26 15:36:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000014368_3678208.pth... -[2023-09-26 15:36:32,954][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000011456_2932736.pth -[2023-09-26 15:36:33,015][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000014384_3682304.pth... 
-[2023-09-26 15:36:33,041][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000011456_2932736.pth -[2023-09-26 15:36:34,464][05900] Updated weights for policy 0, policy_version 14400 (0.0017) -[2023-09-26 15:36:34,464][05901] Updated weights for policy 1, policy_version 14400 (0.0017) -[2023-09-26 15:36:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7389184. Throughput: 0: 775.7, 1: 775.2. Samples: 1847296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:37,909][04574] Avg episode reward: [(0, '32.230'), (1, '31.150')] -[2023-09-26 15:36:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7421952. Throughput: 0: 776.4, 1: 776.3. Samples: 1851687. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:42,909][04574] Avg episode reward: [(0, '32.270'), (1, '31.950')] -[2023-09-26 15:36:47,518][05900] Updated weights for policy 0, policy_version 14560 (0.0019) -[2023-09-26 15:36:47,518][05901] Updated weights for policy 1, policy_version 14560 (0.0018) -[2023-09-26 15:36:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7454720. Throughput: 0: 778.5, 1: 778.0. Samples: 1861460. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:47,909][04574] Avg episode reward: [(0, '32.660'), (1, '32.170')] -[2023-09-26 15:36:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7487488. Throughput: 0: 784.6, 1: 785.1. Samples: 1870886. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:52,909][04574] Avg episode reward: [(0, '32.580'), (1, '31.930')] -[2023-09-26 15:36:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7520256. Throughput: 0: 785.3, 1: 784.8. Samples: 1875817. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:36:57,909][04574] Avg episode reward: [(0, '32.580'), (1, '31.710')] -[2023-09-26 15:37:00,428][05901] Updated weights for policy 1, policy_version 14720 (0.0017) -[2023-09-26 15:37:00,429][05900] Updated weights for policy 0, policy_version 14720 (0.0015) -[2023-09-26 15:37:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7544832. Throughput: 0: 784.0, 1: 784.8. Samples: 1885128. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:37:02,909][04574] Avg episode reward: [(0, '32.310'), (1, '31.380')] -[2023-09-26 15:37:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7577600. Throughput: 0: 780.5, 1: 780.9. Samples: 1894401. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:37:07,909][04574] Avg episode reward: [(0, '32.770'), (1, '31.640')] -[2023-09-26 15:37:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7610368. Throughput: 0: 781.7, 1: 780.3. Samples: 1899034. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:12,908][04574] Avg episode reward: [(0, '33.270'), (1, '30.880')] -[2023-09-26 15:37:13,612][05900] Updated weights for policy 0, policy_version 14880 (0.0017) -[2023-09-26 15:37:13,612][05901] Updated weights for policy 1, policy_version 14880 (0.0018) -[2023-09-26 15:37:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7643136. Throughput: 0: 785.4, 1: 785.6. Samples: 1908659. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:17,909][04574] Avg episode reward: [(0, '32.270'), (1, '31.310')] -[2023-09-26 15:37:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7675904. Throughput: 0: 782.3, 1: 781.9. Samples: 1917686. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:22,909][04574] Avg episode reward: [(0, '32.620'), (1, '29.960')] -[2023-09-26 15:37:26,761][05900] Updated weights for policy 0, policy_version 15040 (0.0016) -[2023-09-26 15:37:26,762][05901] Updated weights for policy 1, policy_version 15040 (0.0018) -[2023-09-26 15:37:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7700480. Throughput: 0: 785.3, 1: 785.4. Samples: 1922369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:27,909][04574] Avg episode reward: [(0, '33.070'), (1, '29.200')] -[2023-09-26 15:37:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7733248. Throughput: 0: 779.4, 1: 779.6. Samples: 1931616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:32,909][04574] Avg episode reward: [(0, '32.900'), (1, '29.260')] -[2023-09-26 15:37:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7766016. Throughput: 0: 784.5, 1: 784.5. Samples: 1941493. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:37,909][04574] Avg episode reward: [(0, '32.620'), (1, '29.570')] -[2023-09-26 15:37:39,803][05900] Updated weights for policy 0, policy_version 15200 (0.0017) -[2023-09-26 15:37:39,803][05901] Updated weights for policy 1, policy_version 15200 (0.0016) -[2023-09-26 15:37:42,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7798784. Throughput: 0: 778.0, 1: 778.1. Samples: 1945843. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:37:42,908][04574] Avg episode reward: [(0, '32.080'), (1, '30.040')] -[2023-09-26 15:37:47,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 7831552. Throughput: 0: 782.2, 1: 784.0. Samples: 1955611. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:37:47,908][04574] Avg episode reward: [(0, '32.560'), (1, '30.960')] -[2023-09-26 15:37:52,732][05900] Updated weights for policy 0, policy_version 15360 (0.0017) -[2023-09-26 15:37:52,732][05901] Updated weights for policy 1, policy_version 15360 (0.0016) -[2023-09-26 15:37:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7864320. Throughput: 0: 784.2, 1: 784.4. Samples: 1964988. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:37:52,909][04574] Avg episode reward: [(0, '32.780'), (1, '31.200')] -[2023-09-26 15:37:57,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7888896. Throughput: 0: 786.4, 1: 787.2. Samples: 1969847. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:37:57,909][04574] Avg episode reward: [(0, '31.730'), (1, '30.550')] -[2023-09-26 15:38:02,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7921664. Throughput: 0: 782.1, 1: 782.1. Samples: 1979049. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:38:02,908][04574] Avg episode reward: [(0, '31.380'), (1, '30.530')] -[2023-09-26 15:38:05,879][05900] Updated weights for policy 0, policy_version 15520 (0.0016) -[2023-09-26 15:38:05,879][05901] Updated weights for policy 1, policy_version 15520 (0.0017) -[2023-09-26 15:38:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7954432. Throughput: 0: 787.8, 1: 788.2. Samples: 1988608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:38:07,909][04574] Avg episode reward: [(0, '31.220'), (1, '30.630')] -[2023-09-26 15:38:12,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7987200. Throughput: 0: 785.7, 1: 786.4. Samples: 1993114. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:38:12,908][04574] Avg episode reward: [(0, '31.670'), (1, '30.790')] -[2023-09-26 15:38:17,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8019968. Throughput: 0: 788.0, 1: 788.1. Samples: 2002538. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:38:17,908][04574] Avg episode reward: [(0, '30.770'), (1, '30.950')] -[2023-09-26 15:38:18,956][05901] Updated weights for policy 1, policy_version 15680 (0.0016) -[2023-09-26 15:38:18,956][05900] Updated weights for policy 0, policy_version 15680 (0.0017) -[2023-09-26 15:38:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8052736. Throughput: 0: 781.7, 1: 781.5. Samples: 2011834. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:38:22,909][04574] Avg episode reward: [(0, '30.770'), (1, '30.970')] -[2023-09-26 15:38:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8077312. Throughput: 0: 787.3, 1: 786.8. Samples: 2016677. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:38:27,909][04574] Avg episode reward: [(0, '30.650'), (1, '31.360')] -[2023-09-26 15:38:31,898][05900] Updated weights for policy 0, policy_version 15840 (0.0017) -[2023-09-26 15:38:31,898][05901] Updated weights for policy 1, policy_version 15840 (0.0015) -[2023-09-26 15:38:32,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8110080. Throughput: 0: 784.7, 1: 781.9. Samples: 2026106. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:38:32,909][04574] Avg episode reward: [(0, '31.370'), (1, '31.810')] -[2023-09-26 15:38:32,920][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000015840_4055040.pth... -[2023-09-26 15:38:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000015840_4055040.pth... 
-[2023-09-26 15:38:32,954][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000012912_3305472.pth -[2023-09-26 15:38:32,955][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000012912_3305472.pth -[2023-09-26 15:38:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8142848. Throughput: 0: 785.9, 1: 785.8. Samples: 2035712. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:38:37,909][04574] Avg episode reward: [(0, '30.550'), (1, '32.190')] -[2023-09-26 15:38:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8175616. Throughput: 0: 780.6, 1: 780.2. Samples: 2040081. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:38:42,909][04574] Avg episode reward: [(0, '30.100'), (1, '32.890')] -[2023-09-26 15:38:45,061][05900] Updated weights for policy 0, policy_version 16000 (0.0019) -[2023-09-26 15:38:45,061][05901] Updated weights for policy 1, policy_version 16000 (0.0019) -[2023-09-26 15:38:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8208384. Throughput: 0: 783.1, 1: 785.8. Samples: 2049647. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:38:47,909][04574] Avg episode reward: [(0, '30.060'), (1, '32.700')] -[2023-09-26 15:38:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8232960. Throughput: 0: 780.1, 1: 779.9. Samples: 2058810. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:38:52,908][04574] Avg episode reward: [(0, '31.240'), (1, '32.960')] -[2023-09-26 15:38:57,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8265728. Throughput: 0: 785.6, 1: 784.5. Samples: 2063770. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:38:57,908][04574] Avg episode reward: [(0, '31.680'), (1, '32.330')] -[2023-09-26 15:38:58,140][05901] Updated weights for policy 1, policy_version 16160 (0.0018) -[2023-09-26 15:38:58,140][05900] Updated weights for policy 0, policy_version 16160 (0.0015) -[2023-09-26 15:39:02,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8298496. Throughput: 0: 783.6, 1: 783.4. Samples: 2073056. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:39:02,908][04574] Avg episode reward: [(0, '31.940'), (1, '33.300')] -[2023-09-26 15:39:07,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8331264. Throughput: 0: 786.7, 1: 786.6. Samples: 2082634. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:39:07,908][04574] Avg episode reward: [(0, '30.740'), (1, '33.050')] -[2023-09-26 15:39:11,088][05900] Updated weights for policy 0, policy_version 16320 (0.0017) -[2023-09-26 15:39:11,088][05901] Updated weights for policy 1, policy_version 16320 (0.0017) -[2023-09-26 15:39:12,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8364032. Throughput: 0: 783.2, 1: 783.3. Samples: 2087171. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:39:12,908][04574] Avg episode reward: [(0, '30.570'), (1, '32.320')] -[2023-09-26 15:39:17,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8396800. Throughput: 0: 786.6, 1: 785.1. Samples: 2096830. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:39:17,908][04574] Avg episode reward: [(0, '30.970'), (1, '31.840')] -[2023-09-26 15:39:22,907][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 8425472. Throughput: 0: 779.2, 1: 780.0. Samples: 2105872. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:39:22,908][04574] Avg episode reward: [(0, '30.190'), (1, '31.550')] -[2023-09-26 15:39:24,385][05901] Updated weights for policy 1, policy_version 16480 (0.0016) -[2023-09-26 15:39:24,385][05900] Updated weights for policy 0, policy_version 16480 (0.0016) -[2023-09-26 15:39:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8454144. Throughput: 0: 782.9, 1: 782.7. Samples: 2110533. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:39:27,909][04574] Avg episode reward: [(0, '28.950'), (1, '31.510')] -[2023-09-26 15:39:32,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8486912. Throughput: 0: 779.6, 1: 776.8. Samples: 2119685. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:39:32,908][04574] Avg episode reward: [(0, '29.960'), (1, '31.490')] -[2023-09-26 15:39:37,648][05900] Updated weights for policy 0, policy_version 16640 (0.0015) -[2023-09-26 15:39:37,648][05901] Updated weights for policy 1, policy_version 16640 (0.0016) -[2023-09-26 15:39:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8519680. Throughput: 0: 780.2, 1: 779.9. Samples: 2129013. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:39:37,909][04574] Avg episode reward: [(0, '30.370'), (1, '31.280')] -[2023-09-26 15:39:42,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8552448. Throughput: 0: 778.4, 1: 778.2. Samples: 2133815. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:39:42,909][04574] Avg episode reward: [(0, '30.340'), (1, '31.340')] -[2023-09-26 15:39:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8577024. Throughput: 0: 779.5, 1: 779.2. Samples: 2143200. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:39:47,909][04574] Avg episode reward: [(0, '29.870'), (1, '31.160')] -[2023-09-26 15:39:50,546][05900] Updated weights for policy 0, policy_version 16800 (0.0017) -[2023-09-26 15:39:50,546][05901] Updated weights for policy 1, policy_version 16800 (0.0017) -[2023-09-26 15:39:52,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8609792. Throughput: 0: 778.2, 1: 778.3. Samples: 2152675. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:39:52,908][04574] Avg episode reward: [(0, '30.560'), (1, '31.560')] -[2023-09-26 15:39:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8642560. Throughput: 0: 781.8, 1: 782.0. Samples: 2157540. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:39:57,909][04574] Avg episode reward: [(0, '29.760'), (1, '31.860')] -[2023-09-26 15:40:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8675328. Throughput: 0: 776.4, 1: 778.1. Samples: 2166784. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:02,909][04574] Avg episode reward: [(0, '30.100'), (1, '31.990')] -[2023-09-26 15:40:03,647][05900] Updated weights for policy 0, policy_version 16960 (0.0017) -[2023-09-26 15:40:03,647][05901] Updated weights for policy 1, policy_version 16960 (0.0017) -[2023-09-26 15:40:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8708096. Throughput: 0: 782.3, 1: 782.0. Samples: 2176269. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:07,909][04574] Avg episode reward: [(0, '29.850'), (1, '31.900')] -[2023-09-26 15:40:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8740864. Throughput: 0: 783.7, 1: 784.2. Samples: 2181088. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:12,908][04574] Avg episode reward: [(0, '30.010'), (1, '31.900')] -[2023-09-26 15:40:16,897][05900] Updated weights for policy 0, policy_version 17120 (0.0018) -[2023-09-26 15:40:16,897][05901] Updated weights for policy 1, policy_version 17120 (0.0015) -[2023-09-26 15:40:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 8765440. Throughput: 0: 782.3, 1: 784.2. Samples: 2190176. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:17,909][04574] Avg episode reward: [(0, '30.440'), (1, '31.570')] -[2023-09-26 15:40:22,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6212.2, 300 sec: 6248.1). Total num frames: 8798208. Throughput: 0: 783.5, 1: 784.0. Samples: 2199552. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:22,909][04574] Avg episode reward: [(0, '30.520'), (1, '32.280')] -[2023-09-26 15:40:27,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8830976. Throughput: 0: 779.9, 1: 780.4. Samples: 2204026. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:40:27,908][04574] Avg episode reward: [(0, '29.890'), (1, '32.390')] -[2023-09-26 15:40:30,218][05901] Updated weights for policy 1, policy_version 17280 (0.0017) -[2023-09-26 15:40:30,218][05900] Updated weights for policy 0, policy_version 17280 (0.0014) -[2023-09-26 15:40:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 8863744. Throughput: 0: 777.3, 1: 778.5. Samples: 2213212. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:40:32,909][04574] Avg episode reward: [(0, '29.680'), (1, '32.760')] -[2023-09-26 15:40:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000017312_4431872.pth... -[2023-09-26 15:40:32,921][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000017312_4431872.pth... 
-[2023-09-26 15:40:32,957][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000014384_3682304.pth -[2023-09-26 15:40:32,959][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000014368_3678208.pth -[2023-09-26 15:40:37,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 8888320. Throughput: 0: 776.1, 1: 775.8. Samples: 2222513. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 15:40:37,908][04574] Avg episode reward: [(0, '29.410'), (1, '32.920')] -[2023-09-26 15:40:42,907][04574] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 8921088. Throughput: 0: 774.7, 1: 773.4. Samples: 2227204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:42,908][04574] Avg episode reward: [(0, '29.990'), (1, '33.320')] -[2023-09-26 15:40:43,356][05900] Updated weights for policy 0, policy_version 17440 (0.0018) -[2023-09-26 15:40:43,356][05901] Updated weights for policy 1, policy_version 17440 (0.0017) -[2023-09-26 15:40:47,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8953856. Throughput: 0: 774.1, 1: 774.0. Samples: 2236451. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:47,909][04574] Avg episode reward: [(0, '29.940'), (1, '33.670')] -[2023-09-26 15:40:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8986624. Throughput: 0: 778.4, 1: 778.0. Samples: 2246306. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:52,909][04574] Avg episode reward: [(0, '30.730'), (1, '34.890')] -[2023-09-26 15:40:52,910][05596] Saving new best policy, reward=34.890! -[2023-09-26 15:40:56,405][05900] Updated weights for policy 0, policy_version 17600 (0.0017) -[2023-09-26 15:40:56,405][05901] Updated weights for policy 1, policy_version 17600 (0.0018) -[2023-09-26 15:40:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). 
Total num frames: 9019392. Throughput: 0: 774.1, 1: 774.0. Samples: 2250752. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:40:57,909][04574] Avg episode reward: [(0, '31.610'), (1, '34.860')] -[2023-09-26 15:41:02,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 9048064. Throughput: 0: 779.2, 1: 776.9. Samples: 2260200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:02,909][04574] Avg episode reward: [(0, '32.500'), (1, '36.360')] -[2023-09-26 15:41:03,008][05596] Saving new best policy, reward=36.360! -[2023-09-26 15:41:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9076736. Throughput: 0: 774.6, 1: 774.5. Samples: 2269260. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:07,909][04574] Avg episode reward: [(0, '33.210'), (1, '36.360')] -[2023-09-26 15:41:09,513][05901] Updated weights for policy 1, policy_version 17760 (0.0016) -[2023-09-26 15:41:09,513][05900] Updated weights for policy 0, policy_version 17760 (0.0017) -[2023-09-26 15:41:12,907][04574] Fps is (10 sec: 6144.2, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9109504. Throughput: 0: 778.5, 1: 777.7. Samples: 2274054. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:12,908][04574] Avg episode reward: [(0, '32.570'), (1, '36.860')] -[2023-09-26 15:41:12,909][05596] Saving new best policy, reward=36.860! -[2023-09-26 15:41:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9142272. Throughput: 0: 781.6, 1: 780.8. Samples: 2283520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:17,909][04574] Avg episode reward: [(0, '32.350'), (1, '37.690')] -[2023-09-26 15:41:17,920][05596] Saving new best policy, reward=37.690! 
-[2023-09-26 15:41:22,612][05901] Updated weights for policy 1, policy_version 17920 (0.0016) -[2023-09-26 15:41:22,612][05900] Updated weights for policy 0, policy_version 17920 (0.0018) -[2023-09-26 15:41:22,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9175040. Throughput: 0: 781.7, 1: 781.7. Samples: 2292867. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:41:22,909][04574] Avg episode reward: [(0, '31.690'), (1, '38.740')] -[2023-09-26 15:41:22,910][05596] Saving new best policy, reward=38.740! -[2023-09-26 15:41:27,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9207808. Throughput: 0: 783.0, 1: 784.0. Samples: 2297719. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:41:27,908][04574] Avg episode reward: [(0, '31.980'), (1, '39.430')] -[2023-09-26 15:41:27,909][05596] Saving new best policy, reward=39.430! -[2023-09-26 15:41:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9232384. Throughput: 0: 784.2, 1: 784.6. Samples: 2307045. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:32,909][04574] Avg episode reward: [(0, '32.290'), (1, '39.800')] -[2023-09-26 15:41:33,090][05596] Saving new best policy, reward=39.800! -[2023-09-26 15:41:35,871][05900] Updated weights for policy 0, policy_version 18080 (0.0018) -[2023-09-26 15:41:35,871][05901] Updated weights for policy 1, policy_version 18080 (0.0018) -[2023-09-26 15:41:37,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9265152. Throughput: 0: 777.6, 1: 777.5. Samples: 2316288. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:37,909][04574] Avg episode reward: [(0, '32.560'), (1, '40.240')] -[2023-09-26 15:41:37,910][05596] Saving new best policy, reward=40.240! -[2023-09-26 15:41:42,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9297920. 
Throughput: 0: 776.5, 1: 776.5. Samples: 2320638. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:42,908][04574] Avg episode reward: [(0, '33.210'), (1, '40.390')] -[2023-09-26 15:41:42,909][05596] Saving new best policy, reward=40.390! -[2023-09-26 15:41:47,907][04574] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 9326592. Throughput: 0: 775.2, 1: 779.4. Samples: 2330159. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:47,908][04574] Avg episode reward: [(0, '33.150'), (1, '39.840')] -[2023-09-26 15:41:49,224][05900] Updated weights for policy 0, policy_version 18240 (0.0018) -[2023-09-26 15:41:49,224][05901] Updated weights for policy 1, policy_version 18240 (0.0017) -[2023-09-26 15:41:52,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9355264. Throughput: 0: 772.8, 1: 772.9. Samples: 2338816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:52,909][04574] Avg episode reward: [(0, '32.340'), (1, '39.020')] -[2023-09-26 15:41:57,907][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9388032. Throughput: 0: 767.1, 1: 768.1. Samples: 2343138. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:41:57,908][04574] Avg episode reward: [(0, '31.840'), (1, '38.670')] -[2023-09-26 15:42:02,526][05900] Updated weights for policy 0, policy_version 18400 (0.0019) -[2023-09-26 15:42:02,526][05901] Updated weights for policy 1, policy_version 18400 (0.0019) -[2023-09-26 15:42:02,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 9420800. Throughput: 0: 772.2, 1: 772.1. Samples: 2353015. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:42:02,908][04574] Avg episode reward: [(0, '32.160'), (1, '39.140')] -[2023-09-26 15:42:07,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9445376. Throughput: 0: 766.5, 1: 766.3. 
Samples: 2361845. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:42:07,909][04574] Avg episode reward: [(0, '31.320'), (1, '39.230')] -[2023-09-26 15:42:12,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9478144. Throughput: 0: 767.9, 1: 768.5. Samples: 2366857. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:42:12,908][04574] Avg episode reward: [(0, '31.770'), (1, '39.350')] -[2023-09-26 15:42:15,978][05901] Updated weights for policy 1, policy_version 18560 (0.0017) -[2023-09-26 15:42:15,979][05900] Updated weights for policy 0, policy_version 18560 (0.0017) -[2023-09-26 15:42:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9510912. Throughput: 0: 762.9, 1: 762.5. Samples: 2375685. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:42:17,909][04574] Avg episode reward: [(0, '32.070'), (1, '39.690')] -[2023-09-26 15:42:22,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 6234.3). Total num frames: 9539584. Throughput: 0: 759.1, 1: 759.9. Samples: 2384643. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:42:22,909][04574] Avg episode reward: [(0, '32.080'), (1, '38.640')] -[2023-09-26 15:42:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6220.4). Total num frames: 9568256. Throughput: 0: 759.1, 1: 760.1. Samples: 2389001. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:42:27,909][04574] Avg episode reward: [(0, '31.190'), (1, '39.710')] -[2023-09-26 15:42:29,683][05900] Updated weights for policy 0, policy_version 18720 (0.0017) -[2023-09-26 15:42:29,683][05901] Updated weights for policy 1, policy_version 18720 (0.0017) -[2023-09-26 15:42:32,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9601024. Throughput: 0: 758.0, 1: 754.2. Samples: 2398208. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:42:32,909][04574] Avg episode reward: [(0, '30.930'), (1, '39.730')] -[2023-09-26 15:42:32,918][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000018752_4800512.pth... -[2023-09-26 15:42:32,918][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000018752_4800512.pth... -[2023-09-26 15:42:32,954][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000015840_4055040.pth -[2023-09-26 15:42:32,957][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000015840_4055040.pth -[2023-09-26 15:42:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9633792. Throughput: 0: 766.8, 1: 766.8. Samples: 2407829. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 15:42:37,909][04574] Avg episode reward: [(0, '31.030'), (1, '40.100')] -[2023-09-26 15:42:42,658][05900] Updated weights for policy 0, policy_version 18880 (0.0016) -[2023-09-26 15:42:42,659][05901] Updated weights for policy 1, policy_version 18880 (0.0016) -[2023-09-26 15:42:42,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9666560. Throughput: 0: 771.1, 1: 771.3. Samples: 2412544. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:42:42,909][04574] Avg episode reward: [(0, '30.940'), (1, '40.100')] -[2023-09-26 15:42:47,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6206.5). Total num frames: 9695232. Throughput: 0: 766.3, 1: 765.9. Samples: 2421964. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:42:47,909][04574] Avg episode reward: [(0, '30.800'), (1, '40.040')] -[2023-09-26 15:42:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9723904. Throughput: 0: 771.4, 1: 771.4. Samples: 2431270. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:42:52,909][04574] Avg episode reward: [(0, '31.440'), (1, '39.720')] -[2023-09-26 15:42:55,762][05900] Updated weights for policy 0, policy_version 19040 (0.0017) -[2023-09-26 15:42:55,762][05901] Updated weights for policy 1, policy_version 19040 (0.0015) -[2023-09-26 15:42:57,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9756672. Throughput: 0: 770.3, 1: 770.2. Samples: 2436181. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:42:57,909][04574] Avg episode reward: [(0, '32.080'), (1, '39.010')] -[2023-09-26 15:43:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 9789440. Throughput: 0: 773.7, 1: 773.8. Samples: 2445321. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:43:02,909][04574] Avg episode reward: [(0, '31.180'), (1, '39.940')] -[2023-09-26 15:43:07,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 9822208. Throughput: 0: 782.6, 1: 783.0. Samples: 2455094. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:43:07,908][04574] Avg episode reward: [(0, '32.030'), (1, '39.870')] -[2023-09-26 15:43:08,788][05901] Updated weights for policy 1, policy_version 19200 (0.0017) -[2023-09-26 15:43:08,788][05900] Updated weights for policy 0, policy_version 19200 (0.0017) -[2023-09-26 15:43:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 9854976. Throughput: 0: 785.5, 1: 784.5. Samples: 2459648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:43:12,909][04574] Avg episode reward: [(0, '32.570'), (1, '39.850')] -[2023-09-26 15:43:17,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 9879552. Throughput: 0: 782.9, 1: 783.6. Samples: 2468703. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:43:17,909][04574] Avg episode reward: [(0, '31.170'), (1, '38.860')] -[2023-09-26 15:43:22,145][05900] Updated weights for policy 0, policy_version 19360 (0.0020) -[2023-09-26 15:43:22,145][05901] Updated weights for policy 1, policy_version 19360 (0.0018) -[2023-09-26 15:43:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 9912320. Throughput: 0: 780.6, 1: 780.6. Samples: 2478080. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:43:22,909][04574] Avg episode reward: [(0, '31.060'), (1, '38.210')] -[2023-09-26 15:43:27,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 9945088. Throughput: 0: 777.0, 1: 777.1. Samples: 2482477. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:43:27,908][04574] Avg episode reward: [(0, '32.550'), (1, '37.990')] -[2023-09-26 15:43:32,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 9977856. Throughput: 0: 780.9, 1: 781.5. Samples: 2492273. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:43:32,908][04574] Avg episode reward: [(0, '32.860'), (1, '38.310')] -[2023-09-26 15:43:35,164][05900] Updated weights for policy 0, policy_version 19520 (0.0019) -[2023-09-26 15:43:35,165][05901] Updated weights for policy 1, policy_version 19520 (0.0019) -[2023-09-26 15:43:37,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10010624. Throughput: 0: 779.0, 1: 780.4. Samples: 2501440. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:43:37,908][04574] Avg episode reward: [(0, '32.850'), (1, '39.260')] -[2023-09-26 15:43:42,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10035200. Throughput: 0: 778.0, 1: 778.0. Samples: 2506201. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:43:42,908][04574] Avg episode reward: [(0, '33.520'), (1, '39.430')] -[2023-09-26 15:43:47,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 10067968. Throughput: 0: 776.9, 1: 776.9. Samples: 2515242. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:43:47,908][04574] Avg episode reward: [(0, '33.930'), (1, '39.130')] -[2023-09-26 15:43:48,371][05900] Updated weights for policy 0, policy_version 19680 (0.0017) -[2023-09-26 15:43:48,371][05901] Updated weights for policy 1, policy_version 19680 (0.0019) -[2023-09-26 15:43:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10100736. Throughput: 0: 778.4, 1: 777.8. Samples: 2525123. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:43:52,909][04574] Avg episode reward: [(0, '33.710'), (1, '39.390')] -[2023-09-26 15:43:57,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10133504. Throughput: 0: 773.8, 1: 773.8. Samples: 2529288. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:43:57,909][04574] Avg episode reward: [(0, '33.610'), (1, '39.280')] -[2023-09-26 15:44:01,558][05900] Updated weights for policy 0, policy_version 19840 (0.0015) -[2023-09-26 15:44:01,559][05901] Updated weights for policy 1, policy_version 19840 (0.0016) -[2023-09-26 15:44:02,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 10162176. Throughput: 0: 779.5, 1: 779.9. Samples: 2538873. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:44:02,909][04574] Avg episode reward: [(0, '33.470'), (1, '39.370')] -[2023-09-26 15:44:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10190848. Throughput: 0: 777.3, 1: 777.2. Samples: 2548033. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:44:07,909][04574] Avg episode reward: [(0, '34.500'), (1, '39.890')] -[2023-09-26 15:44:12,907][04574] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10223616. Throughput: 0: 783.7, 1: 783.6. Samples: 2553002. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:44:12,908][04574] Avg episode reward: [(0, '35.220'), (1, '39.050')] -[2023-09-26 15:44:14,569][05900] Updated weights for policy 0, policy_version 20000 (0.0018) -[2023-09-26 15:44:14,570][05901] Updated weights for policy 1, policy_version 20000 (0.0020) -[2023-09-26 15:44:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6206.5). Total num frames: 10256384. Throughput: 0: 776.2, 1: 775.8. Samples: 2562112. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:44:17,908][04574] Avg episode reward: [(0, '35.810'), (1, '38.550')] -[2023-09-26 15:44:22,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 10289152. Throughput: 0: 781.4, 1: 779.3. Samples: 2571672. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:22,908][04574] Avg episode reward: [(0, '36.260'), (1, '37.580')] -[2023-09-26 15:44:27,799][05900] Updated weights for policy 0, policy_version 20160 (0.0018) -[2023-09-26 15:44:27,799][05901] Updated weights for policy 1, policy_version 20160 (0.0017) -[2023-09-26 15:44:27,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10321920. Throughput: 0: 779.0, 1: 779.7. Samples: 2576344. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:27,909][04574] Avg episode reward: [(0, '36.180'), (1, '36.550')] -[2023-09-26 15:44:32,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10346496. Throughput: 0: 778.2, 1: 778.6. Samples: 2585301. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:32,909][04574] Avg episode reward: [(0, '36.900'), (1, '36.360')] -[2023-09-26 15:44:32,919][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000020208_5173248.pth... -[2023-09-26 15:44:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000020208_5173248.pth... -[2023-09-26 15:44:32,957][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000017312_4431872.pth -[2023-09-26 15:44:32,957][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000017312_4431872.pth -[2023-09-26 15:44:37,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10379264. Throughput: 0: 774.7, 1: 774.0. Samples: 2594816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:37,908][04574] Avg episode reward: [(0, '37.360'), (1, '36.460')] -[2023-09-26 15:44:41,045][05900] Updated weights for policy 0, policy_version 20320 (0.0018) -[2023-09-26 15:44:41,045][05901] Updated weights for policy 1, policy_version 20320 (0.0016) -[2023-09-26 15:44:42,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10412032. Throughput: 0: 778.4, 1: 778.4. Samples: 2599346. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:42,908][04574] Avg episode reward: [(0, '36.860'), (1, '36.860')] -[2023-09-26 15:44:47,908][04574] Fps is (10 sec: 6553.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10444800. Throughput: 0: 780.6, 1: 779.6. Samples: 2609083. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:47,909][04574] Avg episode reward: [(0, '37.240'), (1, '37.630')] -[2023-09-26 15:44:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10477568. Throughput: 0: 781.1, 1: 781.1. Samples: 2618332. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:52,909][04574] Avg episode reward: [(0, '37.870'), (1, '37.030')] -[2023-09-26 15:44:54,106][05900] Updated weights for policy 0, policy_version 20480 (0.0018) -[2023-09-26 15:44:54,106][05901] Updated weights for policy 1, policy_version 20480 (0.0019) -[2023-09-26 15:44:57,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10502144. Throughput: 0: 779.0, 1: 778.3. Samples: 2623081. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:44:57,909][04574] Avg episode reward: [(0, '37.320'), (1, '36.220')] -[2023-09-26 15:45:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6192.6). Total num frames: 10534912. Throughput: 0: 782.3, 1: 782.6. Samples: 2632532. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:45:02,909][04574] Avg episode reward: [(0, '37.390'), (1, '35.970')] -[2023-09-26 15:45:07,004][05901] Updated weights for policy 1, policy_version 20640 (0.0017) -[2023-09-26 15:45:07,005][05900] Updated weights for policy 0, policy_version 20640 (0.0019) -[2023-09-26 15:45:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10567680. Throughput: 0: 780.0, 1: 781.1. Samples: 2641922. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:45:07,909][04574] Avg episode reward: [(0, '35.620'), (1, '35.960')] -[2023-09-26 15:45:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10600448. Throughput: 0: 779.6, 1: 778.8. Samples: 2646471. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:45:12,909][04574] Avg episode reward: [(0, '34.740'), (1, '35.920')] -[2023-09-26 15:45:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10633216. Throughput: 0: 788.6, 1: 788.0. Samples: 2656245. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:45:17,909][04574] Avg episode reward: [(0, '34.240'), (1, '36.530')] -[2023-09-26 15:45:20,207][05900] Updated weights for policy 0, policy_version 20800 (0.0018) -[2023-09-26 15:45:20,208][05901] Updated weights for policy 1, policy_version 20800 (0.0017) -[2023-09-26 15:45:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10665984. Throughput: 0: 783.4, 1: 783.5. Samples: 2665328. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:45:22,909][04574] Avg episode reward: [(0, '35.090'), (1, '35.820')] -[2023-09-26 15:45:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10690560. Throughput: 0: 785.4, 1: 785.8. Samples: 2670049. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:45:27,909][04574] Avg episode reward: [(0, '34.440'), (1, '34.310')] -[2023-09-26 15:45:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10723328. Throughput: 0: 778.6, 1: 778.2. Samples: 2679138. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:45:32,909][04574] Avg episode reward: [(0, '34.520'), (1, '34.670')] -[2023-09-26 15:45:33,510][05900] Updated weights for policy 0, policy_version 20960 (0.0018) -[2023-09-26 15:45:33,510][05901] Updated weights for policy 1, policy_version 20960 (0.0018) -[2023-09-26 15:45:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10756096. Throughput: 0: 778.9, 1: 778.4. Samples: 2688413. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:45:37,909][04574] Avg episode reward: [(0, '35.390'), (1, '34.240')] -[2023-09-26 15:45:42,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10788864. Throughput: 0: 777.8, 1: 778.6. Samples: 2693120. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:45:42,909][04574] Avg episode reward: [(0, '35.290'), (1, '33.740')] -[2023-09-26 15:45:46,928][05901] Updated weights for policy 1, policy_version 21120 (0.0018) -[2023-09-26 15:45:46,928][05900] Updated weights for policy 0, policy_version 21120 (0.0018) -[2023-09-26 15:45:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10813440. Throughput: 0: 771.6, 1: 771.8. Samples: 2701987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:45:47,909][04574] Avg episode reward: [(0, '36.310'), (1, '32.490')] -[2023-09-26 15:45:52,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10846208. Throughput: 0: 773.3, 1: 771.8. Samples: 2711450. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:45:52,908][04574] Avg episode reward: [(0, '36.400'), (1, '32.050')] -[2023-09-26 15:45:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6206.5). Total num frames: 10878976. Throughput: 0: 769.8, 1: 769.6. Samples: 2715743. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:45:57,909][04574] Avg episode reward: [(0, '34.960'), (1, '32.230')] -[2023-09-26 15:46:00,046][05900] Updated weights for policy 0, policy_version 21280 (0.0017) -[2023-09-26 15:46:00,046][05901] Updated weights for policy 1, policy_version 21280 (0.0016) -[2023-09-26 15:46:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10911744. Throughput: 0: 770.3, 1: 771.2. Samples: 2725612. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:02,909][04574] Avg episode reward: [(0, '34.310'), (1, '31.970')] -[2023-09-26 15:46:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 10944512. Throughput: 0: 774.4, 1: 774.4. Samples: 2735022. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:07,909][04574] Avg episode reward: [(0, '33.450'), (1, '32.040')] -[2023-09-26 15:46:12,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10969088. Throughput: 0: 774.8, 1: 775.2. Samples: 2739798. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:12,908][04574] Avg episode reward: [(0, '32.900'), (1, '31.910')] -[2023-09-26 15:46:13,084][05900] Updated weights for policy 0, policy_version 21440 (0.0018) -[2023-09-26 15:46:13,085][05901] Updated weights for policy 1, policy_version 21440 (0.0015) -[2023-09-26 15:46:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 11001856. Throughput: 0: 773.6, 1: 773.4. Samples: 2748752. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:17,909][04574] Avg episode reward: [(0, '32.170'), (1, '30.820')] -[2023-09-26 15:46:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 11034624. Throughput: 0: 780.2, 1: 780.8. Samples: 2758656. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:22,909][04574] Avg episode reward: [(0, '32.280'), (1, '29.710')] -[2023-09-26 15:46:26,004][05901] Updated weights for policy 1, policy_version 21600 (0.0017) -[2023-09-26 15:46:26,004][05900] Updated weights for policy 0, policy_version 21600 (0.0016) -[2023-09-26 15:46:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11067392. Throughput: 0: 778.6, 1: 777.8. Samples: 2763155. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:27,909][04574] Avg episode reward: [(0, '31.620'), (1, '30.160')] -[2023-09-26 15:46:32,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11100160. Throughput: 0: 787.2, 1: 787.2. Samples: 2772835. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:46:32,908][04574] Avg episode reward: [(0, '30.840'), (1, '29.900')] -[2023-09-26 15:46:32,918][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000021680_5550080.pth... -[2023-09-26 15:46:32,919][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000021680_5550080.pth... -[2023-09-26 15:46:32,953][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000018752_4800512.pth -[2023-09-26 15:46:32,961][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000018752_4800512.pth -[2023-09-26 15:46:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11132928. Throughput: 0: 782.3, 1: 784.2. Samples: 2781939. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:46:37,909][04574] Avg episode reward: [(0, '31.860'), (1, '29.870')] -[2023-09-26 15:46:39,191][05900] Updated weights for policy 0, policy_version 21760 (0.0017) -[2023-09-26 15:46:39,191][05901] Updated weights for policy 1, policy_version 21760 (0.0019) -[2023-09-26 15:46:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6206.5). Total num frames: 11157504. Throughput: 0: 789.0, 1: 788.7. Samples: 2786739. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:46:42,909][04574] Avg episode reward: [(0, '32.080'), (1, '30.070')] -[2023-09-26 15:46:47,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11190272. Throughput: 0: 778.2, 1: 777.5. Samples: 2795620. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:46:47,908][04574] Avg episode reward: [(0, '33.030'), (1, '30.650')] -[2023-09-26 15:46:52,385][05900] Updated weights for policy 0, policy_version 21920 (0.0017) -[2023-09-26 15:46:52,386][05901] Updated weights for policy 1, policy_version 21920 (0.0015) -[2023-09-26 15:46:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11223040. Throughput: 0: 781.7, 1: 781.9. Samples: 2805382. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:46:52,909][04574] Avg episode reward: [(0, '33.520'), (1, '30.290')] -[2023-09-26 15:46:57,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11255808. Throughput: 0: 778.8, 1: 778.0. Samples: 2809856. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:46:57,909][04574] Avg episode reward: [(0, '34.300'), (1, '29.820')] -[2023-09-26 15:47:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11280384. Throughput: 0: 777.2, 1: 777.5. Samples: 2818715. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:47:02,909][04574] Avg episode reward: [(0, '34.830'), (1, '30.690')] -[2023-09-26 15:47:05,836][05900] Updated weights for policy 0, policy_version 22080 (0.0017) -[2023-09-26 15:47:05,836][05901] Updated weights for policy 1, policy_version 22080 (0.0017) -[2023-09-26 15:47:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11313152. Throughput: 0: 773.2, 1: 773.7. Samples: 2828266. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:07,909][04574] Avg episode reward: [(0, '35.350'), (1, '30.950')] -[2023-09-26 15:47:12,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11345920. Throughput: 0: 770.6, 1: 771.2. Samples: 2832536. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:12,908][04574] Avg episode reward: [(0, '34.430'), (1, '32.080')] -[2023-09-26 15:47:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6234.2). Total num frames: 11378688. Throughput: 0: 770.1, 1: 770.4. Samples: 2842156. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:17,909][04574] Avg episode reward: [(0, '34.180'), (1, '32.310')] -[2023-09-26 15:47:19,109][05901] Updated weights for policy 1, policy_version 22240 (0.0016) -[2023-09-26 15:47:19,110][05900] Updated weights for policy 0, policy_version 22240 (0.0019) -[2023-09-26 15:47:22,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11403264. Throughput: 0: 767.1, 1: 766.6. Samples: 2850954. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:47:22,908][04574] Avg episode reward: [(0, '34.260'), (1, '33.280')] -[2023-09-26 15:47:27,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11436032. Throughput: 0: 768.6, 1: 768.4. Samples: 2855904. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:47:27,909][04574] Avg episode reward: [(0, '34.910'), (1, '33.640')] -[2023-09-26 15:47:32,069][05900] Updated weights for policy 0, policy_version 22400 (0.0016) -[2023-09-26 15:47:32,070][05901] Updated weights for policy 1, policy_version 22400 (0.0016) -[2023-09-26 15:47:32,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11468800. Throughput: 0: 775.8, 1: 775.6. Samples: 2865436. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 15:47:32,908][04574] Avg episode reward: [(0, '34.770'), (1, '33.040')] -[2023-09-26 15:47:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11501568. Throughput: 0: 775.4, 1: 776.0. Samples: 2875191. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:37,909][04574] Avg episode reward: [(0, '34.210'), (1, '33.620')] -[2023-09-26 15:47:42,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 11534336. Throughput: 0: 773.7, 1: 773.7. Samples: 2879488. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:42,909][04574] Avg episode reward: [(0, '34.210'), (1, '34.440')] -[2023-09-26 15:47:45,279][05901] Updated weights for policy 1, policy_version 22560 (0.0017) -[2023-09-26 15:47:45,279][05900] Updated weights for policy 0, policy_version 22560 (0.0016) -[2023-09-26 15:47:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11567104. Throughput: 0: 780.5, 1: 781.4. Samples: 2889002. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:47,909][04574] Avg episode reward: [(0, '34.610'), (1, '35.620')] -[2023-09-26 15:47:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11591680. Throughput: 0: 777.1, 1: 776.6. Samples: 2898181. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:52,909][04574] Avg episode reward: [(0, '35.100'), (1, '36.120')] -[2023-09-26 15:47:57,907][04574] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11624448. Throughput: 0: 780.5, 1: 782.6. Samples: 2902875. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:47:57,908][04574] Avg episode reward: [(0, '35.130'), (1, '36.770')] -[2023-09-26 15:47:58,447][05900] Updated weights for policy 0, policy_version 22720 (0.0017) -[2023-09-26 15:47:58,447][05901] Updated weights for policy 1, policy_version 22720 (0.0017) -[2023-09-26 15:48:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11657216. Throughput: 0: 779.1, 1: 778.6. Samples: 2912256. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:02,909][04574] Avg episode reward: [(0, '36.000'), (1, '36.720')] -[2023-09-26 15:48:07,907][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11689984. Throughput: 0: 787.4, 1: 787.4. Samples: 2921821. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:07,908][04574] Avg episode reward: [(0, '36.670'), (1, '36.920')] -[2023-09-26 15:48:11,419][05900] Updated weights for policy 0, policy_version 22880 (0.0017) -[2023-09-26 15:48:11,420][05901] Updated weights for policy 1, policy_version 22880 (0.0018) -[2023-09-26 15:48:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11722752. Throughput: 0: 785.1, 1: 785.7. Samples: 2926592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:12,909][04574] Avg episode reward: [(0, '36.760'), (1, '37.930')] -[2023-09-26 15:48:17,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11747328. Throughput: 0: 781.8, 1: 780.3. Samples: 2935728. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:17,909][04574] Avg episode reward: [(0, '36.990'), (1, '38.500')] -[2023-09-26 15:48:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11780096. Throughput: 0: 776.4, 1: 775.6. Samples: 2945032. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:22,909][04574] Avg episode reward: [(0, '37.660'), (1, '38.070')] -[2023-09-26 15:48:24,568][05901] Updated weights for policy 1, policy_version 23040 (0.0017) -[2023-09-26 15:48:24,568][05900] Updated weights for policy 0, policy_version 23040 (0.0019) -[2023-09-26 15:48:27,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11812864. Throughput: 0: 782.1, 1: 781.6. Samples: 2949855. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:48:27,908][04574] Avg episode reward: [(0, '37.820'), (1, '37.980')] -[2023-09-26 15:48:32,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11845632. Throughput: 0: 782.1, 1: 781.4. Samples: 2959357. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:48:32,908][04574] Avg episode reward: [(0, '37.720'), (1, '38.780')] -[2023-09-26 15:48:32,918][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000023136_5922816.pth... -[2023-09-26 15:48:32,918][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000023136_5922816.pth... -[2023-09-26 15:48:32,955][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000020208_5173248.pth -[2023-09-26 15:48:32,956][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000020208_5173248.pth -[2023-09-26 15:48:37,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 11874304. Throughput: 0: 778.1, 1: 779.7. Samples: 2968283. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 15:48:37,909][04574] Avg episode reward: [(0, '38.260'), (1, '38.700')] -[2023-09-26 15:48:37,996][05901] Updated weights for policy 1, policy_version 23200 (0.0016) -[2023-09-26 15:48:37,996][05900] Updated weights for policy 0, policy_version 23200 (0.0017) -[2023-09-26 15:48:42,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11902976. Throughput: 0: 777.8, 1: 775.2. Samples: 2972759. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:42,908][04574] Avg episode reward: [(0, '36.930'), (1, '38.840')] -[2023-09-26 15:48:47,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11935744. Throughput: 0: 774.2, 1: 774.0. Samples: 2981926. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:47,909][04574] Avg episode reward: [(0, '35.810'), (1, '37.530')] -[2023-09-26 15:48:51,158][05901] Updated weights for policy 1, policy_version 23360 (0.0018) -[2023-09-26 15:48:51,158][05900] Updated weights for policy 0, policy_version 23360 (0.0017) -[2023-09-26 15:48:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11968512. Throughput: 0: 775.5, 1: 776.1. Samples: 2991643. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:48:52,909][04574] Avg episode reward: [(0, '35.240'), (1, '38.040')] -[2023-09-26 15:48:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6234.2). Total num frames: 12001280. Throughput: 0: 773.7, 1: 773.7. Samples: 2996224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:48:57,909][04574] Avg episode reward: [(0, '35.180'), (1, '37.360')] -[2023-09-26 15:49:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12034048. Throughput: 0: 777.4, 1: 779.8. Samples: 3005803. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:49:02,909][04574] Avg episode reward: [(0, '35.080'), (1, '37.140')] -[2023-09-26 15:49:04,177][05901] Updated weights for policy 1, policy_version 23520 (0.0017) -[2023-09-26 15:49:04,177][05900] Updated weights for policy 0, policy_version 23520 (0.0018) -[2023-09-26 15:49:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12058624. Throughput: 0: 778.7, 1: 778.8. Samples: 3015118. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:49:07,909][04574] Avg episode reward: [(0, '35.440'), (1, '35.840')] -[2023-09-26 15:49:12,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12091392. Throughput: 0: 778.3, 1: 779.3. Samples: 3019948. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:49:12,908][04574] Avg episode reward: [(0, '34.660'), (1, '36.000')] -[2023-09-26 15:49:17,198][05900] Updated weights for policy 0, policy_version 23680 (0.0016) -[2023-09-26 15:49:17,198][05901] Updated weights for policy 1, policy_version 23680 (0.0017) -[2023-09-26 15:49:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12124160. Throughput: 0: 773.8, 1: 773.9. Samples: 3029002. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:17,909][04574] Avg episode reward: [(0, '35.200'), (1, '36.200')] -[2023-09-26 15:49:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12156928. Throughput: 0: 783.0, 1: 782.5. Samples: 3038729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:22,909][04574] Avg episode reward: [(0, '35.690'), (1, '36.070')] -[2023-09-26 15:49:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12189696. Throughput: 0: 780.7, 1: 782.3. Samples: 3043094. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:27,909][04574] Avg episode reward: [(0, '35.700'), (1, '37.340')] -[2023-09-26 15:49:30,437][05901] Updated weights for policy 1, policy_version 23840 (0.0016) -[2023-09-26 15:49:30,437][05900] Updated weights for policy 0, policy_version 23840 (0.0016) -[2023-09-26 15:49:32,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12214272. Throughput: 0: 783.5, 1: 784.4. Samples: 3052482. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:32,908][04574] Avg episode reward: [(0, '36.280'), (1, '37.440')] -[2023-09-26 15:49:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 12247040. Throughput: 0: 778.7, 1: 778.1. Samples: 3061698. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:37,909][04574] Avg episode reward: [(0, '36.690'), (1, '36.840')] -[2023-09-26 15:49:42,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12279808. Throughput: 0: 775.1, 1: 774.9. Samples: 3065973. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:42,909][04574] Avg episode reward: [(0, '35.330'), (1, '36.530')] -[2023-09-26 15:49:43,867][05900] Updated weights for policy 0, policy_version 24000 (0.0019) -[2023-09-26 15:49:43,867][05901] Updated weights for policy 1, policy_version 24000 (0.0017) -[2023-09-26 15:49:47,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12312576. Throughput: 0: 774.8, 1: 773.6. Samples: 3075482. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:47,908][04574] Avg episode reward: [(0, '35.110'), (1, '35.900')] -[2023-09-26 15:49:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12337152. Throughput: 0: 769.0, 1: 768.7. Samples: 3084317. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:52,909][04574] Avg episode reward: [(0, '35.370'), (1, '36.100')] -[2023-09-26 15:49:57,132][05900] Updated weights for policy 0, policy_version 24160 (0.0017) -[2023-09-26 15:49:57,132][05901] Updated weights for policy 1, policy_version 24160 (0.0016) -[2023-09-26 15:49:57,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12369920. Throughput: 0: 770.0, 1: 769.0. Samples: 3089199. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:49:57,909][04574] Avg episode reward: [(0, '35.280'), (1, '35.140')] -[2023-09-26 15:50:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12402688. Throughput: 0: 773.6, 1: 773.6. Samples: 3098624. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:50:02,909][04574] Avg episode reward: [(0, '35.610'), (1, '33.990')] -[2023-09-26 15:50:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12435456. Throughput: 0: 767.8, 1: 767.4. Samples: 3107812. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:50:07,909][04574] Avg episode reward: [(0, '35.320'), (1, '34.360')] -[2023-09-26 15:50:10,342][05901] Updated weights for policy 1, policy_version 24320 (0.0018) -[2023-09-26 15:50:10,342][05900] Updated weights for policy 0, policy_version 24320 (0.0017) -[2023-09-26 15:50:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 12460032. Throughput: 0: 773.1, 1: 773.6. Samples: 3112695. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:50:12,909][04574] Avg episode reward: [(0, '33.550'), (1, '34.490')] -[2023-09-26 15:50:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 12492800. Throughput: 0: 769.9, 1: 769.1. Samples: 3121740. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:50:17,909][04574] Avg episode reward: [(0, '32.880'), (1, '34.890')] -[2023-09-26 15:50:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12525568. Throughput: 0: 772.7, 1: 773.2. Samples: 3131263. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 15:50:22,909][04574] Avg episode reward: [(0, '33.300'), (1, '34.710')] -[2023-09-26 15:50:23,710][05900] Updated weights for policy 0, policy_version 24480 (0.0017) -[2023-09-26 15:50:23,710][05901] Updated weights for policy 1, policy_version 24480 (0.0018) -[2023-09-26 15:50:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12558336. Throughput: 0: 772.3, 1: 772.5. Samples: 3135488. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:50:27,909][04574] Avg episode reward: [(0, '34.120'), (1, '35.580')] -[2023-09-26 15:50:32,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12591104. Throughput: 0: 773.0, 1: 772.9. Samples: 3145048. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:50:32,909][04574] Avg episode reward: [(0, '33.270'), (1, '35.240')] -[2023-09-26 15:50:32,920][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000024592_6295552.pth... -[2023-09-26 15:50:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000024592_6295552.pth... -[2023-09-26 15:50:32,955][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000021680_5550080.pth -[2023-09-26 15:50:32,955][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000021680_5550080.pth -[2023-09-26 15:50:36,779][05900] Updated weights for policy 0, policy_version 24640 (0.0017) -[2023-09-26 15:50:36,780][05901] Updated weights for policy 1, policy_version 24640 (0.0017) -[2023-09-26 15:50:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 12615680. Throughput: 0: 776.9, 1: 776.5. Samples: 3154218. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:50:37,909][04574] Avg episode reward: [(0, '33.700'), (1, '36.440')] -[2023-09-26 15:50:42,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12648448. Throughput: 0: 777.3, 1: 777.7. Samples: 3159173. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:50:42,909][04574] Avg episode reward: [(0, '33.980'), (1, '36.440')] -[2023-09-26 15:50:47,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12681216. Throughput: 0: 775.2, 1: 775.3. Samples: 3168398. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:50:47,908][04574] Avg episode reward: [(0, '35.350'), (1, '36.610')] -[2023-09-26 15:50:49,754][05901] Updated weights for policy 1, policy_version 24800 (0.0018) -[2023-09-26 15:50:49,754][05900] Updated weights for policy 0, policy_version 24800 (0.0018) -[2023-09-26 15:50:52,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12713984. Throughput: 0: 780.9, 1: 779.8. Samples: 3178044. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:50:52,908][04574] Avg episode reward: [(0, '35.400'), (1, '36.360')] -[2023-09-26 15:50:57,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12746752. Throughput: 0: 777.5, 1: 776.1. Samples: 3182608. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:50:57,909][04574] Avg episode reward: [(0, '35.020'), (1, '36.990')] -[2023-09-26 15:51:02,740][05901] Updated weights for policy 1, policy_version 24960 (0.0016) -[2023-09-26 15:51:02,741][05900] Updated weights for policy 0, policy_version 24960 (0.0017) -[2023-09-26 15:51:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12779520. Throughput: 0: 784.7, 1: 784.8. Samples: 3192367. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:51:02,908][04574] Avg episode reward: [(0, '34.760'), (1, '37.370')] -[2023-09-26 15:51:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12804096. Throughput: 0: 780.5, 1: 779.9. Samples: 3201482. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:51:07,909][04574] Avg episode reward: [(0, '34.930'), (1, '36.710')] -[2023-09-26 15:51:12,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12836864. Throughput: 0: 786.4, 1: 786.2. Samples: 3206253. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 15:51:12,908][04574] Avg episode reward: [(0, '33.560'), (1, '35.940')] -[2023-09-26 15:51:15,951][05900] Updated weights for policy 0, policy_version 25120 (0.0017) -[2023-09-26 15:51:15,951][05901] Updated weights for policy 1, policy_version 25120 (0.0019) -[2023-09-26 15:51:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12869632. Throughput: 0: 781.0, 1: 781.6. Samples: 3215369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:51:17,909][04574] Avg episode reward: [(0, '32.590'), (1, '35.570')] -[2023-09-26 15:51:22,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12902400. Throughput: 0: 790.1, 1: 790.8. Samples: 3225357. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:51:22,908][04574] Avg episode reward: [(0, '32.380'), (1, '36.140')] -[2023-09-26 15:51:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12935168. Throughput: 0: 784.2, 1: 784.0. Samples: 3229743. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:51:27,909][04574] Avg episode reward: [(0, '32.950'), (1, '35.470')] -[2023-09-26 15:51:28,860][05901] Updated weights for policy 1, policy_version 25280 (0.0016) -[2023-09-26 15:51:28,860][05900] Updated weights for policy 0, policy_version 25280 (0.0017) -[2023-09-26 15:51:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12967936. Throughput: 0: 788.4, 1: 789.8. Samples: 3239417. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:51:32,909][04574] Avg episode reward: [(0, '31.970'), (1, '34.920')] -[2023-09-26 15:51:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6248.1). Total num frames: 13000704. Throughput: 0: 785.4, 1: 785.2. Samples: 3248721. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:51:37,909][04574] Avg episode reward: [(0, '32.200'), (1, '35.060')] -[2023-09-26 15:51:41,728][05901] Updated weights for policy 1, policy_version 25440 (0.0018) -[2023-09-26 15:51:41,728][05900] Updated weights for policy 0, policy_version 25440 (0.0019) -[2023-09-26 15:51:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13025280. Throughput: 0: 789.7, 1: 789.4. Samples: 3253669. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:51:42,909][04574] Avg episode reward: [(0, '32.270'), (1, '35.370')] -[2023-09-26 15:51:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13058048. Throughput: 0: 784.9, 1: 785.1. Samples: 3263015. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:51:47,909][04574] Avg episode reward: [(0, '32.660'), (1, '35.380')] -[2023-09-26 15:51:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13090816. Throughput: 0: 791.2, 1: 791.5. Samples: 3272704. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:51:52,909][04574] Avg episode reward: [(0, '32.380'), (1, '36.360')] -[2023-09-26 15:51:54,686][05900] Updated weights for policy 0, policy_version 25600 (0.0016) -[2023-09-26 15:51:54,686][05901] Updated weights for policy 1, policy_version 25600 (0.0017) -[2023-09-26 15:51:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13123584. Throughput: 0: 789.6, 1: 788.9. Samples: 3277282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:51:57,909][04574] Avg episode reward: [(0, '33.150'), (1, '35.600')] -[2023-09-26 15:52:02,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13156352. Throughput: 0: 794.2, 1: 793.4. Samples: 3286811. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:52:02,909][04574] Avg episode reward: [(0, '33.300'), (1, '35.490')] -[2023-09-26 15:52:07,908][05900] Updated weights for policy 0, policy_version 25760 (0.0017) -[2023-09-26 15:52:07,908][05901] Updated weights for policy 1, policy_version 25760 (0.0018) -[2023-09-26 15:52:07,911][04574] Fps is (10 sec: 6551.1, 60 sec: 6416.7, 300 sec: 6248.1). Total num frames: 13189120. Throughput: 0: 783.1, 1: 782.9. Samples: 3295835. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:52:07,914][04574] Avg episode reward: [(0, '33.040'), (1, '36.140')] -[2023-09-26 15:52:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13213696. Throughput: 0: 787.1, 1: 787.2. Samples: 3300588. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:52:12,908][04574] Avg episode reward: [(0, '33.710'), (1, '35.790')] -[2023-09-26 15:52:17,908][04574] Fps is (10 sec: 5736.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13246464. Throughput: 0: 781.4, 1: 779.7. Samples: 3309666. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:52:17,908][04574] Avg episode reward: [(0, '33.750'), (1, '35.110')] -[2023-09-26 15:52:21,390][05901] Updated weights for policy 1, policy_version 25920 (0.0017) -[2023-09-26 15:52:21,390][05900] Updated weights for policy 0, policy_version 25920 (0.0017) -[2023-09-26 15:52:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13279232. Throughput: 0: 778.8, 1: 779.5. Samples: 3318845. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:52:22,909][04574] Avg episode reward: [(0, '34.680'), (1, '35.500')] -[2023-09-26 15:52:27,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 13307904. Throughput: 0: 779.1, 1: 779.1. Samples: 3323789. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:52:27,908][04574] Avg episode reward: [(0, '33.760'), (1, '36.600')] -[2023-09-26 15:52:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13336576. Throughput: 0: 775.0, 1: 775.0. Samples: 3332768. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:52:32,908][04574] Avg episode reward: [(0, '33.760'), (1, '34.700')] -[2023-09-26 15:52:32,917][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000026048_6668288.pth... -[2023-09-26 15:52:32,918][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000026048_6668288.pth... -[2023-09-26 15:52:32,947][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000023136_5922816.pth -[2023-09-26 15:52:32,954][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000023136_5922816.pth -[2023-09-26 15:52:34,512][05900] Updated weights for policy 0, policy_version 26080 (0.0019) -[2023-09-26 15:52:34,513][05901] Updated weights for policy 1, policy_version 26080 (0.0019) -[2023-09-26 15:52:37,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13369344. Throughput: 0: 773.7, 1: 773.7. Samples: 3342336. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:52:37,909][04574] Avg episode reward: [(0, '34.600'), (1, '35.580')] -[2023-09-26 15:52:42,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13402112. Throughput: 0: 770.3, 1: 770.5. Samples: 3346618. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:52:42,908][04574] Avg episode reward: [(0, '34.220'), (1, '35.070')] -[2023-09-26 15:52:47,678][05901] Updated weights for policy 1, policy_version 26240 (0.0016) -[2023-09-26 15:52:47,678][05900] Updated weights for policy 0, policy_version 26240 (0.0017) -[2023-09-26 15:52:47,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). 
Total num frames: 13434880. Throughput: 0: 770.8, 1: 771.6. Samples: 3356218. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:52:47,908][04574] Avg episode reward: [(0, '34.170'), (1, '34.970')] -[2023-09-26 15:52:52,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 13463552. Throughput: 0: 773.6, 1: 774.1. Samples: 3365478. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:52:52,909][04574] Avg episode reward: [(0, '34.020'), (1, '35.440')] -[2023-09-26 15:52:57,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13492224. Throughput: 0: 773.8, 1: 774.3. Samples: 3370249. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 15:52:57,909][04574] Avg episode reward: [(0, '35.480'), (1, '35.530')] -[2023-09-26 15:53:00,720][05901] Updated weights for policy 1, policy_version 26400 (0.0017) -[2023-09-26 15:53:00,721][05900] Updated weights for policy 0, policy_version 26400 (0.0019) -[2023-09-26 15:53:02,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13524992. Throughput: 0: 776.8, 1: 777.5. Samples: 3379612. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:53:02,909][04574] Avg episode reward: [(0, '35.320'), (1, '35.990')] -[2023-09-26 15:53:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.4, 300 sec: 6220.4). Total num frames: 13557760. Throughput: 0: 779.1, 1: 779.9. Samples: 3389003. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:53:07,909][04574] Avg episode reward: [(0, '36.120'), (1, '35.400')] -[2023-09-26 15:53:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13590528. Throughput: 0: 774.9, 1: 775.2. Samples: 3393544. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:53:12,908][04574] Avg episode reward: [(0, '37.210'), (1, '35.220')] -[2023-09-26 15:53:13,753][05900] Updated weights for policy 0, policy_version 26560 (0.0016) -[2023-09-26 15:53:13,754][05901] Updated weights for policy 1, policy_version 26560 (0.0015) -[2023-09-26 15:53:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13623296. Throughput: 0: 785.2, 1: 784.9. Samples: 3403421. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:53:17,908][04574] Avg episode reward: [(0, '38.210'), (1, '35.910')] -[2023-09-26 15:53:22,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13647872. Throughput: 0: 777.2, 1: 777.6. Samples: 3412298. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:22,908][04574] Avg episode reward: [(0, '37.990'), (1, '35.790')] -[2023-09-26 15:53:26,995][05900] Updated weights for policy 0, policy_version 26720 (0.0019) -[2023-09-26 15:53:26,995][05901] Updated weights for policy 1, policy_version 26720 (0.0017) -[2023-09-26 15:53:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6212.2, 300 sec: 6220.4). Total num frames: 13680640. Throughput: 0: 783.4, 1: 784.5. Samples: 3417174. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:27,909][04574] Avg episode reward: [(0, '38.440'), (1, '36.050')] -[2023-09-26 15:53:32,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6234.3). Total num frames: 13713408. Throughput: 0: 780.4, 1: 780.2. Samples: 3426448. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:32,908][04574] Avg episode reward: [(0, '38.510'), (1, '35.480')] -[2023-09-26 15:53:37,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13746176. Throughput: 0: 787.3, 1: 787.9. Samples: 3436363. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:37,908][04574] Avg episode reward: [(0, '38.500'), (1, '36.070')] -[2023-09-26 15:53:39,847][05901] Updated weights for policy 1, policy_version 26880 (0.0016) -[2023-09-26 15:53:39,847][05900] Updated weights for policy 0, policy_version 26880 (0.0018) -[2023-09-26 15:53:42,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13778944. Throughput: 0: 784.5, 1: 783.8. Samples: 3440823. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:42,908][04574] Avg episode reward: [(0, '39.890'), (1, '36.030')] -[2023-09-26 15:53:42,909][05384] Saving new best policy, reward=39.890! -[2023-09-26 15:53:47,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13811712. Throughput: 0: 788.1, 1: 788.3. Samples: 3450549. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:47,908][04574] Avg episode reward: [(0, '39.870'), (1, '36.800')] -[2023-09-26 15:53:52,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 13840384. Throughput: 0: 785.5, 1: 784.6. Samples: 3459657. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:52,909][04574] Avg episode reward: [(0, '39.910'), (1, '36.740')] -[2023-09-26 15:53:52,933][05384] Saving new best policy, reward=39.910! -[2023-09-26 15:53:52,936][05901] Updated weights for policy 1, policy_version 27040 (0.0018) -[2023-09-26 15:53:52,937][05900] Updated weights for policy 0, policy_version 27040 (0.0017) -[2023-09-26 15:53:57,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13869056. Throughput: 0: 790.3, 1: 790.2. Samples: 3464668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:53:57,909][04574] Avg episode reward: [(0, '39.910'), (1, '37.490')] -[2023-09-26 15:54:02,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13901824. 
Throughput: 0: 782.2, 1: 783.0. Samples: 3473853. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:54:02,909][04574] Avg episode reward: [(0, '40.680'), (1, '36.700')] -[2023-09-26 15:54:02,920][05384] Saving new best policy, reward=40.680! -[2023-09-26 15:54:06,128][05901] Updated weights for policy 1, policy_version 27200 (0.0015) -[2023-09-26 15:54:06,129][05900] Updated weights for policy 0, policy_version 27200 (0.0017) -[2023-09-26 15:54:07,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13934592. Throughput: 0: 787.8, 1: 787.9. Samples: 3483208. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:54:07,908][04574] Avg episode reward: [(0, '41.380'), (1, '35.920')] -[2023-09-26 15:54:07,909][05384] Saving new best policy, reward=41.380! -[2023-09-26 15:54:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13967360. Throughput: 0: 784.6, 1: 784.1. Samples: 3487766. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:54:12,908][04574] Avg episode reward: [(0, '41.390'), (1, '36.330')] -[2023-09-26 15:54:12,909][05384] Saving new best policy, reward=41.390! -[2023-09-26 15:54:17,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14000128. Throughput: 0: 790.9, 1: 791.2. Samples: 3497643. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:54:17,909][04574] Avg episode reward: [(0, '41.750'), (1, '35.470')] -[2023-09-26 15:54:17,918][05384] Saving new best policy, reward=41.750! -[2023-09-26 15:54:19,019][05900] Updated weights for policy 0, policy_version 27360 (0.0018) -[2023-09-26 15:54:19,019][05901] Updated weights for policy 1, policy_version 27360 (0.0016) -[2023-09-26 15:54:22,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14024704. Throughput: 0: 782.1, 1: 780.9. Samples: 3506696. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:54:22,909][04574] Avg episode reward: [(0, '41.570'), (1, '36.370')] -[2023-09-26 15:54:27,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14057472. Throughput: 0: 784.6, 1: 784.6. Samples: 3511438. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:54:27,908][04574] Avg episode reward: [(0, '42.400'), (1, '36.600')] -[2023-09-26 15:54:27,909][05384] Saving new best policy, reward=42.400! -[2023-09-26 15:54:32,314][05900] Updated weights for policy 0, policy_version 27520 (0.0018) -[2023-09-26 15:54:32,314][05901] Updated weights for policy 1, policy_version 27520 (0.0018) -[2023-09-26 15:54:32,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14090240. Throughput: 0: 777.7, 1: 777.1. Samples: 3520517. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 15:54:32,908][04574] Avg episode reward: [(0, '42.930'), (1, '36.490')] -[2023-09-26 15:54:32,916][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000027520_7045120.pth... -[2023-09-26 15:54:32,916][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000027520_7045120.pth... -[2023-09-26 15:54:32,952][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000024592_6295552.pth -[2023-09-26 15:54:32,955][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000024592_6295552.pth -[2023-09-26 15:54:32,956][05384] Saving new best policy, reward=42.930! -[2023-09-26 15:54:37,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14123008. Throughput: 0: 782.9, 1: 782.6. Samples: 3530104. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:54:37,909][04574] Avg episode reward: [(0, '41.860'), (1, '36.830')] -[2023-09-26 15:54:42,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14155776. 
Throughput: 0: 779.8, 1: 779.8. Samples: 3534848. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:54:42,909][04574] Avg episode reward: [(0, '42.320'), (1, '36.980')] -[2023-09-26 15:54:45,238][05901] Updated weights for policy 1, policy_version 27680 (0.0016) -[2023-09-26 15:54:45,238][05900] Updated weights for policy 0, policy_version 27680 (0.0017) -[2023-09-26 15:54:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14188544. Throughput: 0: 782.9, 1: 783.5. Samples: 3544342. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:54:47,909][04574] Avg episode reward: [(0, '42.440'), (1, '37.220')] -[2023-09-26 15:54:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 14213120. Throughput: 0: 783.7, 1: 783.0. Samples: 3553708. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 15:54:52,909][04574] Avg episode reward: [(0, '43.610'), (1, '37.140')] -[2023-09-26 15:54:53,026][05384] Saving new best policy, reward=43.610! -[2023-09-26 15:54:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14245888. Throughput: 0: 783.6, 1: 784.1. Samples: 3558311. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:54:57,909][04574] Avg episode reward: [(0, '43.420'), (1, '37.910')] -[2023-09-26 15:54:58,340][05900] Updated weights for policy 0, policy_version 27840 (0.0013) -[2023-09-26 15:54:58,341][05901] Updated weights for policy 1, policy_version 27840 (0.0015) -[2023-09-26 15:55:02,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14278656. Throughput: 0: 777.6, 1: 777.4. Samples: 3567616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:02,908][04574] Avg episode reward: [(0, '43.140'), (1, '38.320')] -[2023-09-26 15:55:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14311424. 
Throughput: 0: 782.9, 1: 782.8. Samples: 3577154. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:07,909][04574] Avg episode reward: [(0, '43.210'), (1, '38.600')] -[2023-09-26 15:55:11,519][05900] Updated weights for policy 0, policy_version 28000 (0.0017) -[2023-09-26 15:55:11,519][05901] Updated weights for policy 1, policy_version 28000 (0.0016) -[2023-09-26 15:55:12,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14344192. Throughput: 0: 782.2, 1: 783.0. Samples: 3581873. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:55:12,909][04574] Avg episode reward: [(0, '43.020'), (1, '38.200')] -[2023-09-26 15:55:17,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 14372864. Throughput: 0: 785.8, 1: 786.4. Samples: 3591263. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:55:17,909][04574] Avg episode reward: [(0, '41.950'), (1, '38.190')] -[2023-09-26 15:55:22,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14401536. Throughput: 0: 781.7, 1: 781.7. Samples: 3600458. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:55:22,909][04574] Avg episode reward: [(0, '42.780'), (1, '38.680')] -[2023-09-26 15:55:24,527][05901] Updated weights for policy 1, policy_version 28160 (0.0016) -[2023-09-26 15:55:24,528][05900] Updated weights for policy 0, policy_version 28160 (0.0018) -[2023-09-26 15:55:27,907][04574] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14434304. Throughput: 0: 783.1, 1: 783.1. Samples: 3605327. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 15:55:27,908][04574] Avg episode reward: [(0, '42.880'), (1, '37.880')] -[2023-09-26 15:55:32,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14467072. Throughput: 0: 782.7, 1: 781.3. Samples: 3614720. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:32,909][04574] Avg episode reward: [(0, '43.360'), (1, '37.030')] -[2023-09-26 15:55:37,589][05900] Updated weights for policy 0, policy_version 28320 (0.0014) -[2023-09-26 15:55:37,590][05901] Updated weights for policy 1, policy_version 28320 (0.0014) -[2023-09-26 15:55:37,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14499840. Throughput: 0: 782.4, 1: 783.0. Samples: 3624152. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:37,909][04574] Avg episode reward: [(0, '43.550'), (1, '37.210')] -[2023-09-26 15:55:42,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14532608. Throughput: 0: 785.3, 1: 785.6. Samples: 3629003. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:42,908][04574] Avg episode reward: [(0, '41.790'), (1, '37.440')] -[2023-09-26 15:55:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14557184. Throughput: 0: 784.5, 1: 784.8. Samples: 3638233. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:47,909][04574] Avg episode reward: [(0, '42.030'), (1, '38.050')] -[2023-09-26 15:55:50,632][05901] Updated weights for policy 1, policy_version 28480 (0.0017) -[2023-09-26 15:55:50,632][05900] Updated weights for policy 0, policy_version 28480 (0.0018) -[2023-09-26 15:55:52,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14589952. Throughput: 0: 781.5, 1: 781.8. Samples: 3647502. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:52,908][04574] Avg episode reward: [(0, '41.990'), (1, '37.020')] -[2023-09-26 15:55:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14622720. Throughput: 0: 782.2, 1: 781.7. Samples: 3652251. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:55:57,909][04574] Avg episode reward: [(0, '41.130'), (1, '37.950')] -[2023-09-26 15:56:02,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14655488. Throughput: 0: 784.3, 1: 783.7. Samples: 3661824. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:02,909][04574] Avg episode reward: [(0, '40.970'), (1, '38.020')] -[2023-09-26 15:56:03,727][05900] Updated weights for policy 0, policy_version 28640 (0.0017) -[2023-09-26 15:56:03,727][05901] Updated weights for policy 1, policy_version 28640 (0.0017) -[2023-09-26 15:56:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14688256. Throughput: 0: 783.2, 1: 782.4. Samples: 3670907. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:07,909][04574] Avg episode reward: [(0, '40.490'), (1, '37.610')] -[2023-09-26 15:56:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14712832. Throughput: 0: 781.0, 1: 781.2. Samples: 3675624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:12,909][04574] Avg episode reward: [(0, '40.940'), (1, '37.260')] -[2023-09-26 15:56:16,862][05900] Updated weights for policy 0, policy_version 28800 (0.0019) -[2023-09-26 15:56:16,862][05901] Updated weights for policy 1, policy_version 28800 (0.0015) -[2023-09-26 15:56:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 14745600. Throughput: 0: 781.2, 1: 780.7. Samples: 3685006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:17,908][04574] Avg episode reward: [(0, '40.340'), (1, '37.350')] -[2023-09-26 15:56:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14778368. Throughput: 0: 782.8, 1: 782.5. Samples: 3694592. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:22,909][04574] Avg episode reward: [(0, '38.450'), (1, '37.030')] -[2023-09-26 15:56:27,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14811136. Throughput: 0: 780.0, 1: 779.6. Samples: 3699186. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:27,909][04574] Avg episode reward: [(0, '38.250'), (1, '36.940')] -[2023-09-26 15:56:29,952][05901] Updated weights for policy 1, policy_version 28960 (0.0015) -[2023-09-26 15:56:29,953][05900] Updated weights for policy 0, policy_version 28960 (0.0016) -[2023-09-26 15:56:32,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14843904. Throughput: 0: 783.7, 1: 784.8. Samples: 3708814. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:32,908][04574] Avg episode reward: [(0, '37.610'), (1, '36.990')] -[2023-09-26 15:56:32,916][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000028992_7421952.pth... -[2023-09-26 15:56:32,917][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000028992_7421952.pth... -[2023-09-26 15:56:32,947][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000026048_6668288.pth -[2023-09-26 15:56:32,955][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000026048_6668288.pth -[2023-09-26 15:56:37,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 14876672. Throughput: 0: 781.0, 1: 781.2. Samples: 3717800. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:37,908][04574] Avg episode reward: [(0, '36.680'), (1, '36.410')] -[2023-09-26 15:56:42,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14901248. Throughput: 0: 784.2, 1: 784.1. Samples: 3722824. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:42,909][04574] Avg episode reward: [(0, '36.450'), (1, '36.230')] -[2023-09-26 15:56:42,995][05900] Updated weights for policy 0, policy_version 29120 (0.0016) -[2023-09-26 15:56:42,995][05901] Updated weights for policy 1, policy_version 29120 (0.0018) -[2023-09-26 15:56:47,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14934016. Throughput: 0: 781.9, 1: 781.2. Samples: 3732161. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:56:47,909][04574] Avg episode reward: [(0, '36.040'), (1, '35.680')] -[2023-09-26 15:56:52,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14966784. Throughput: 0: 786.1, 1: 787.0. Samples: 3741696. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:56:52,908][04574] Avg episode reward: [(0, '35.870'), (1, '35.410')] -[2023-09-26 15:56:55,978][05900] Updated weights for policy 0, policy_version 29280 (0.0015) -[2023-09-26 15:56:55,978][05901] Updated weights for policy 1, policy_version 29280 (0.0017) -[2023-09-26 15:56:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14999552. Throughput: 0: 784.6, 1: 784.3. Samples: 3746223. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:56:57,909][04574] Avg episode reward: [(0, '36.140'), (1, '35.720')] -[2023-09-26 15:57:02,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.2). Total num frames: 15032320. Throughput: 0: 786.9, 1: 788.2. Samples: 3755885. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:57:02,909][04574] Avg episode reward: [(0, '35.280'), (1, '34.670')] -[2023-09-26 15:57:07,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15065088. Throughput: 0: 783.3, 1: 783.1. Samples: 3765082. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 15:57:07,908][04574] Avg episode reward: [(0, '34.620'), (1, '34.160')] -[2023-09-26 15:57:09,108][05900] Updated weights for policy 0, policy_version 29440 (0.0018) -[2023-09-26 15:57:09,108][05901] Updated weights for policy 1, policy_version 29440 (0.0017) -[2023-09-26 15:57:12,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15089664. Throughput: 0: 786.3, 1: 784.5. Samples: 3769870. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:57:12,908][04574] Avg episode reward: [(0, '33.790'), (1, '34.050')] -[2023-09-26 15:57:17,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15122432. Throughput: 0: 779.6, 1: 778.1. Samples: 3778912. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:57:17,909][04574] Avg episode reward: [(0, '33.060'), (1, '34.280')] -[2023-09-26 15:57:22,363][05900] Updated weights for policy 0, policy_version 29600 (0.0018) -[2023-09-26 15:57:22,363][05901] Updated weights for policy 1, policy_version 29600 (0.0016) -[2023-09-26 15:57:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 15155200. Throughput: 0: 785.4, 1: 788.6. Samples: 3788629. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:57:22,909][04574] Avg episode reward: [(0, '33.630'), (1, '34.940')] -[2023-09-26 15:57:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15187968. Throughput: 0: 778.7, 1: 778.7. Samples: 3792904. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:57:27,909][04574] Avg episode reward: [(0, '34.610'), (1, '33.880')] -[2023-09-26 15:57:32,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15220736. Throughput: 0: 779.6, 1: 783.3. Samples: 3802495. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:57:32,908][04574] Avg episode reward: [(0, '34.630'), (1, '34.120')] -[2023-09-26 15:57:35,471][05901] Updated weights for policy 1, policy_version 29760 (0.0017) -[2023-09-26 15:57:35,472][05900] Updated weights for policy 0, policy_version 29760 (0.0017) -[2023-09-26 15:57:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15245312. Throughput: 0: 777.4, 1: 777.7. Samples: 3811675. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:57:37,909][04574] Avg episode reward: [(0, '34.890'), (1, '33.860')] -[2023-09-26 15:57:42,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15278080. Throughput: 0: 780.9, 1: 781.3. Samples: 3816520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:57:42,909][04574] Avg episode reward: [(0, '34.750'), (1, '33.610')] -[2023-09-26 15:57:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 15310848. Throughput: 0: 777.4, 1: 776.5. Samples: 3825808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 15:57:47,909][04574] Avg episode reward: [(0, '35.480'), (1, '33.290')] -[2023-09-26 15:57:48,460][05901] Updated weights for policy 1, policy_version 29920 (0.0016) -[2023-09-26 15:57:48,461][05900] Updated weights for policy 0, policy_version 29920 (0.0018) -[2023-09-26 15:57:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15343616. Throughput: 0: 783.5, 1: 783.9. Samples: 3835615. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:57:52,909][04574] Avg episode reward: [(0, '35.320'), (1, '33.310')] -[2023-09-26 15:57:57,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15376384. Throughput: 0: 778.5, 1: 780.0. Samples: 3840000. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:57:57,908][04574] Avg episode reward: [(0, '35.090'), (1, '33.830')] -[2023-09-26 15:58:01,570][05900] Updated weights for policy 0, policy_version 30080 (0.0018) -[2023-09-26 15:58:01,570][05901] Updated weights for policy 1, policy_version 30080 (0.0018) -[2023-09-26 15:58:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15400960. Throughput: 0: 783.8, 1: 784.1. Samples: 3849466. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:58:02,909][04574] Avg episode reward: [(0, '35.240'), (1, '34.310')] -[2023-09-26 15:58:07,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15433728. Throughput: 0: 778.8, 1: 775.2. Samples: 3858560. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:58:07,909][04574] Avg episode reward: [(0, '36.690'), (1, '34.250')] -[2023-09-26 15:58:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15466496. Throughput: 0: 784.2, 1: 784.1. Samples: 3863481. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:58:12,909][04574] Avg episode reward: [(0, '35.820'), (1, '33.700')] -[2023-09-26 15:58:14,683][05901] Updated weights for policy 1, policy_version 30240 (0.0017) -[2023-09-26 15:58:14,684][05900] Updated weights for policy 0, policy_version 30240 (0.0016) -[2023-09-26 15:58:17,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 15499264. Throughput: 0: 782.4, 1: 779.4. Samples: 3872779. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:58:17,908][04574] Avg episode reward: [(0, '35.250'), (1, '34.500')] -[2023-09-26 15:58:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15532032. Throughput: 0: 790.2, 1: 789.7. Samples: 3882772. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:58:22,908][04574] Avg episode reward: [(0, '35.840'), (1, '34.750')] -[2023-09-26 15:58:27,677][05900] Updated weights for policy 0, policy_version 30400 (0.0017) -[2023-09-26 15:58:27,678][05901] Updated weights for policy 1, policy_version 30400 (0.0014) -[2023-09-26 15:58:27,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15564800. Throughput: 0: 784.4, 1: 784.2. Samples: 3887104. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:58:27,909][04574] Avg episode reward: [(0, '35.530'), (1, '34.870')] -[2023-09-26 15:58:32,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15597568. Throughput: 0: 787.9, 1: 788.4. Samples: 3896741. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:58:32,909][04574] Avg episode reward: [(0, '35.730'), (1, '35.360')] -[2023-09-26 15:58:32,921][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000030464_7798784.pth... -[2023-09-26 15:58:32,921][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000030464_7798784.pth... -[2023-09-26 15:58:32,955][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000027520_7045120.pth -[2023-09-26 15:58:32,955][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000027520_7045120.pth -[2023-09-26 15:58:37,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15622144. Throughput: 0: 782.7, 1: 782.0. Samples: 3906030. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:58:37,909][04574] Avg episode reward: [(0, '35.970'), (1, '33.770')] -[2023-09-26 15:58:40,566][05900] Updated weights for policy 0, policy_version 30560 (0.0017) -[2023-09-26 15:58:40,566][05901] Updated weights for policy 1, policy_version 30560 (0.0018) -[2023-09-26 15:58:42,908][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15654912. Throughput: 0: 787.5, 1: 788.0. Samples: 3910897. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:58:42,909][04574] Avg episode reward: [(0, '35.620'), (1, '33.640')] -[2023-09-26 15:58:47,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 15687680. Throughput: 0: 785.4, 1: 785.0. Samples: 3920132. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 15:58:47,909][04574] Avg episode reward: [(0, '35.450'), (1, '34.330')] -[2023-09-26 15:58:52,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15720448. Throughput: 0: 790.3, 1: 790.5. Samples: 3929697. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:58:52,908][04574] Avg episode reward: [(0, '36.590'), (1, '34.840')] -[2023-09-26 15:58:53,732][05901] Updated weights for policy 1, policy_version 30720 (0.0015) -[2023-09-26 15:58:53,733][05900] Updated weights for policy 0, policy_version 30720 (0.0016) -[2023-09-26 15:58:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15753216. Throughput: 0: 785.8, 1: 785.9. Samples: 3934210. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:58:57,909][04574] Avg episode reward: [(0, '36.840'), (1, '34.920')] -[2023-09-26 15:59:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 15785984. Throughput: 0: 788.7, 1: 788.5. Samples: 3943753. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:02,909][04574] Avg episode reward: [(0, '36.060'), (1, '34.310')] -[2023-09-26 15:59:06,769][05900] Updated weights for policy 0, policy_version 30880 (0.0017) -[2023-09-26 15:59:06,770][05901] Updated weights for policy 1, policy_version 30880 (0.0016) -[2023-09-26 15:59:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15810560. Throughput: 0: 779.6, 1: 780.0. Samples: 3952956. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:07,909][04574] Avg episode reward: [(0, '37.100'), (1, '33.880')] -[2023-09-26 15:59:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15843328. Throughput: 0: 787.1, 1: 787.4. Samples: 3957955. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:12,909][04574] Avg episode reward: [(0, '37.410'), (1, '34.560')] -[2023-09-26 15:59:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15876096. Throughput: 0: 780.6, 1: 780.2. Samples: 3966978. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:17,909][04574] Avg episode reward: [(0, '37.440'), (1, '34.630')] -[2023-09-26 15:59:19,883][05900] Updated weights for policy 0, policy_version 31040 (0.0017) -[2023-09-26 15:59:19,883][05901] Updated weights for policy 1, policy_version 31040 (0.0017) -[2023-09-26 15:59:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15908864. Throughput: 0: 783.3, 1: 784.0. Samples: 3976557. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:22,909][04574] Avg episode reward: [(0, '36.550'), (1, '34.880')] -[2023-09-26 15:59:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15941632. Throughput: 0: 782.4, 1: 781.6. Samples: 3981275. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:27,909][04574] Avg episode reward: [(0, '37.190'), (1, '34.590')] -[2023-09-26 15:59:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15966208. Throughput: 0: 780.4, 1: 779.9. Samples: 3990345. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:32,909][04574] Avg episode reward: [(0, '36.980'), (1, '34.260')] -[2023-09-26 15:59:33,142][05900] Updated weights for policy 0, policy_version 31200 (0.0018) -[2023-09-26 15:59:33,142][05901] Updated weights for policy 1, policy_version 31200 (0.0017) -[2023-09-26 15:59:37,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15998976. Throughput: 0: 779.2, 1: 779.1. Samples: 3999822. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:37,909][04574] Avg episode reward: [(0, '36.760'), (1, '34.440')] -[2023-09-26 15:59:42,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16031744. Throughput: 0: 781.8, 1: 781.4. Samples: 4004556. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:42,909][04574] Avg episode reward: [(0, '36.320'), (1, '32.910')] -[2023-09-26 15:59:46,106][05901] Updated weights for policy 1, policy_version 31360 (0.0016) -[2023-09-26 15:59:46,106][05900] Updated weights for policy 0, policy_version 31360 (0.0017) -[2023-09-26 15:59:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16064512. Throughput: 0: 781.3, 1: 781.6. Samples: 4014080. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 15:59:47,909][04574] Avg episode reward: [(0, '36.050'), (1, '33.050')] -[2023-09-26 15:59:52,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16097280. Throughput: 0: 784.5, 1: 785.0. Samples: 4023586. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:59:52,908][04574] Avg episode reward: [(0, '35.970'), (1, '33.910')] -[2023-09-26 15:59:57,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16130048. Throughput: 0: 783.0, 1: 782.8. Samples: 4028416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 15:59:57,909][04574] Avg episode reward: [(0, '35.430'), (1, '33.890')] -[2023-09-26 15:59:59,024][05900] Updated weights for policy 0, policy_version 31520 (0.0017) -[2023-09-26 15:59:59,024][05901] Updated weights for policy 1, policy_version 31520 (0.0016) -[2023-09-26 16:00:02,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 16154624. Throughput: 0: 785.5, 1: 785.6. Samples: 4037678. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:02,909][04574] Avg episode reward: [(0, '35.600'), (1, '33.030')] -[2023-09-26 16:00:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16187392. Throughput: 0: 781.1, 1: 780.9. Samples: 4046848. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:07,909][04574] Avg episode reward: [(0, '36.120'), (1, '33.550')] -[2023-09-26 16:00:12,256][05901] Updated weights for policy 1, policy_version 31680 (0.0017) -[2023-09-26 16:00:12,256][05900] Updated weights for policy 0, policy_version 31680 (0.0018) -[2023-09-26 16:00:12,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6262.0). Total num frames: 16220160. Throughput: 0: 779.5, 1: 779.6. Samples: 4051435. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:12,908][04574] Avg episode reward: [(0, '37.010'), (1, '32.880')] -[2023-09-26 16:00:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16252928. Throughput: 0: 782.3, 1: 784.6. Samples: 4060853. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:00:17,908][04574] Avg episode reward: [(0, '37.230'), (1, '33.900')] -[2023-09-26 16:00:22,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16285696. Throughput: 0: 782.2, 1: 782.5. Samples: 4070234. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:00:22,909][04574] Avg episode reward: [(0, '37.970'), (1, '34.120')] -[2023-09-26 16:00:25,367][05901] Updated weights for policy 1, policy_version 31840 (0.0013) -[2023-09-26 16:00:25,368][05900] Updated weights for policy 0, policy_version 31840 (0.0016) -[2023-09-26 16:00:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 16310272. Throughput: 0: 784.4, 1: 784.2. Samples: 4075142. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:00:27,909][04574] Avg episode reward: [(0, '38.260'), (1, '34.280')] -[2023-09-26 16:00:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16343040. Throughput: 0: 782.5, 1: 782.6. Samples: 4084510. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:00:32,909][04574] Avg episode reward: [(0, '38.170'), (1, '34.240')] -[2023-09-26 16:00:32,921][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000031920_8171520.pth... -[2023-09-26 16:00:32,921][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000031920_8171520.pth... -[2023-09-26 16:00:32,961][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000028992_7421952.pth -[2023-09-26 16:00:32,961][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000028992_7421952.pth -[2023-09-26 16:00:37,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16375808. Throughput: 0: 782.2, 1: 781.5. Samples: 4093952. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:37,908][04574] Avg episode reward: [(0, '38.960'), (1, '33.880')] -[2023-09-26 16:00:38,409][05901] Updated weights for policy 1, policy_version 32000 (0.0017) -[2023-09-26 16:00:38,409][05900] Updated weights for policy 0, policy_version 32000 (0.0018) -[2023-09-26 16:00:42,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16408576. Throughput: 0: 778.6, 1: 778.1. Samples: 4098465. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:42,908][04574] Avg episode reward: [(0, '38.790'), (1, '34.290')] -[2023-09-26 16:00:47,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16441344. Throughput: 0: 784.6, 1: 784.5. Samples: 4108288. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:47,909][04574] Avg episode reward: [(0, '37.010'), (1, '34.590')] -[2023-09-26 16:00:51,298][05901] Updated weights for policy 1, policy_version 32160 (0.0016) -[2023-09-26 16:00:51,298][05900] Updated weights for policy 0, policy_version 32160 (0.0016) -[2023-09-26 16:00:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16474112. Throughput: 0: 787.7, 1: 788.0. Samples: 4117754. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:00:52,909][04574] Avg episode reward: [(0, '37.490'), (1, '34.570')] -[2023-09-26 16:00:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16506880. Throughput: 0: 790.9, 1: 790.7. Samples: 4122607. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:00:57,909][04574] Avg episode reward: [(0, '36.930'), (1, '34.620')] -[2023-09-26 16:01:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 16539648. Throughput: 0: 790.8, 1: 789.3. Samples: 4131957. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:02,909][04574] Avg episode reward: [(0, '36.540'), (1, '34.490')] -[2023-09-26 16:01:04,250][05900] Updated weights for policy 0, policy_version 32320 (0.0016) -[2023-09-26 16:01:04,250][05901] Updated weights for policy 1, policy_version 32320 (0.0016) -[2023-09-26 16:01:07,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16564224. Throughput: 0: 787.0, 1: 786.9. Samples: 4141056. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:07,908][04574] Avg episode reward: [(0, '36.400'), (1, '34.280')] -[2023-09-26 16:01:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16596992. Throughput: 0: 781.6, 1: 782.1. Samples: 4145505. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:12,909][04574] Avg episode reward: [(0, '36.480'), (1, '35.230')] -[2023-09-26 16:01:17,515][05901] Updated weights for policy 1, policy_version 32480 (0.0017) -[2023-09-26 16:01:17,516][05900] Updated weights for policy 0, policy_version 32480 (0.0017) -[2023-09-26 16:01:17,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16629760. Throughput: 0: 786.2, 1: 786.6. Samples: 4155288. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:17,909][04574] Avg episode reward: [(0, '36.240'), (1, '35.220')] -[2023-09-26 16:01:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16662528. Throughput: 0: 780.9, 1: 780.5. Samples: 4164215. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:22,908][04574] Avg episode reward: [(0, '35.680'), (1, '34.960')] -[2023-09-26 16:01:27,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 16687104. Throughput: 0: 783.6, 1: 785.5. Samples: 4169073. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:27,908][04574] Avg episode reward: [(0, '35.680'), (1, '35.490')] -[2023-09-26 16:01:30,805][05901] Updated weights for policy 1, policy_version 32640 (0.0016) -[2023-09-26 16:01:30,806][05900] Updated weights for policy 0, policy_version 32640 (0.0015) -[2023-09-26 16:01:32,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16719872. Throughput: 0: 776.4, 1: 776.6. Samples: 4178174. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:32,909][04574] Avg episode reward: [(0, '35.650'), (1, '35.410')] -[2023-09-26 16:01:37,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16752640. Throughput: 0: 778.6, 1: 778.4. Samples: 4187819. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:37,908][04574] Avg episode reward: [(0, '35.400'), (1, '34.370')] -[2023-09-26 16:01:42,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16785408. Throughput: 0: 773.9, 1: 774.2. Samples: 4192270. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:42,908][04574] Avg episode reward: [(0, '34.940'), (1, '34.690')] -[2023-09-26 16:01:43,863][05900] Updated weights for policy 0, policy_version 32800 (0.0017) -[2023-09-26 16:01:43,863][05901] Updated weights for policy 1, policy_version 32800 (0.0017) -[2023-09-26 16:01:47,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16818176. Throughput: 0: 777.8, 1: 777.5. Samples: 4201947. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:47,909][04574] Avg episode reward: [(0, '34.470'), (1, '34.700')] -[2023-09-26 16:01:52,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16850944. Throughput: 0: 782.9, 1: 783.1. Samples: 4211526. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:52,908][04574] Avg episode reward: [(0, '35.200'), (1, '33.980')] -[2023-09-26 16:01:56,680][05901] Updated weights for policy 1, policy_version 32960 (0.0015) -[2023-09-26 16:01:56,681][05900] Updated weights for policy 0, policy_version 32960 (0.0015) -[2023-09-26 16:01:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 16875520. Throughput: 0: 787.6, 1: 787.8. Samples: 4216395. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:01:57,909][04574] Avg episode reward: [(0, '34.750'), (1, '33.210')] -[2023-09-26 16:02:02,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 16908288. Throughput: 0: 782.3, 1: 782.6. Samples: 4225706. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 16:02:02,908][04574] Avg episode reward: [(0, '34.000'), (1, '33.200')] -[2023-09-26 16:02:07,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16941056. Throughput: 0: 788.3, 1: 789.1. Samples: 4235199. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 16:02:07,908][04574] Avg episode reward: [(0, '33.270'), (1, '34.080')] -[2023-09-26 16:02:09,768][05901] Updated weights for policy 1, policy_version 33120 (0.0019) -[2023-09-26 16:02:09,768][05900] Updated weights for policy 0, policy_version 33120 (0.0018) -[2023-09-26 16:02:12,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16973824. Throughput: 0: 785.4, 1: 784.0. Samples: 4239699. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 16:02:12,909][04574] Avg episode reward: [(0, '32.620'), (1, '34.140')] -[2023-09-26 16:02:17,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 17006592. Throughput: 0: 790.7, 1: 791.1. Samples: 4249357. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 16:02:17,908][04574] Avg episode reward: [(0, '32.020'), (1, '34.790')] -[2023-09-26 16:02:22,769][05900] Updated weights for policy 0, policy_version 33280 (0.0018) -[2023-09-26 16:02:22,770][05901] Updated weights for policy 1, policy_version 33280 (0.0019) -[2023-09-26 16:02:22,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17039360. Throughput: 0: 788.6, 1: 786.9. Samples: 4258715. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 16:02:22,908][04574] Avg episode reward: [(0, '31.390'), (1, '35.390')] -[2023-09-26 16:02:27,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17063936. Throughput: 0: 792.3, 1: 791.1. Samples: 4263523. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:02:27,909][04574] Avg episode reward: [(0, '32.460'), (1, '35.060')] -[2023-09-26 16:02:32,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17096704. Throughput: 0: 781.6, 1: 782.1. Samples: 4272317. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:02:32,908][04574] Avg episode reward: [(0, '33.260'), (1, '34.580')] -[2023-09-26 16:02:32,918][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000033392_8548352.pth... -[2023-09-26 16:02:32,919][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000033392_8548352.pth... 
-[2023-09-26 16:02:32,954][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000030464_7798784.pth -[2023-09-26 16:02:32,956][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000030464_7798784.pth -[2023-09-26 16:02:36,263][05900] Updated weights for policy 0, policy_version 33440 (0.0017) -[2023-09-26 16:02:36,264][05901] Updated weights for policy 1, policy_version 33440 (0.0018) -[2023-09-26 16:02:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17129472. Throughput: 0: 779.5, 1: 779.6. Samples: 4281685. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:02:37,909][04574] Avg episode reward: [(0, '33.750'), (1, '33.350')] -[2023-09-26 16:02:42,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17162240. Throughput: 0: 778.6, 1: 778.5. Samples: 4286464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:02:42,909][04574] Avg episode reward: [(0, '33.960'), (1, '33.090')] -[2023-09-26 16:02:47,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17195008. Throughput: 0: 781.7, 1: 781.8. Samples: 4296064. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:02:47,908][04574] Avg episode reward: [(0, '32.870'), (1, '33.660')] -[2023-09-26 16:02:49,198][05900] Updated weights for policy 0, policy_version 33600 (0.0013) -[2023-09-26 16:02:49,199][05901] Updated weights for policy 1, policy_version 33600 (0.0017) -[2023-09-26 16:02:52,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17219584. Throughput: 0: 777.9, 1: 777.8. Samples: 4305203. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:02:52,909][04574] Avg episode reward: [(0, '32.940'), (1, '32.740')] -[2023-09-26 16:02:57,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17252352. Throughput: 0: 780.8, 1: 780.8. 
Samples: 4309971. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:02:57,909][04574] Avg episode reward: [(0, '33.160'), (1, '32.580')] -[2023-09-26 16:03:02,247][05900] Updated weights for policy 0, policy_version 33760 (0.0015) -[2023-09-26 16:03:02,248][05901] Updated weights for policy 1, policy_version 33760 (0.0017) -[2023-09-26 16:03:02,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17285120. Throughput: 0: 776.7, 1: 776.2. Samples: 4319235. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:03:02,909][04574] Avg episode reward: [(0, '32.750'), (1, '31.960')] -[2023-09-26 16:03:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17317888. Throughput: 0: 774.1, 1: 776.9. Samples: 4328512. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:03:07,909][04574] Avg episode reward: [(0, '32.560'), (1, '31.340')] -[2023-09-26 16:03:12,908][04574] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 17346560. Throughput: 0: 774.1, 1: 775.3. Samples: 4333244. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 16:03:12,909][04574] Avg episode reward: [(0, '33.210'), (1, '30.810')] -[2023-09-26 16:03:15,565][05900] Updated weights for policy 0, policy_version 33920 (0.0017) -[2023-09-26 16:03:15,565][05901] Updated weights for policy 1, policy_version 33920 (0.0016) -[2023-09-26 16:03:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17375232. Throughput: 0: 779.0, 1: 778.4. Samples: 4342400. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 16:03:17,909][04574] Avg episode reward: [(0, '33.070'), (1, '31.180')] -[2023-09-26 16:03:22,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17408000. Throughput: 0: 781.4, 1: 781.2. Samples: 4352000. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 16:03:22,909][04574] Avg episode reward: [(0, '33.360'), (1, '31.660')] -[2023-09-26 16:03:27,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 17440768. Throughput: 0: 777.5, 1: 777.7. Samples: 4356448. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 16:03:27,908][04574] Avg episode reward: [(0, '32.900'), (1, '31.190')] -[2023-09-26 16:03:28,711][05901] Updated weights for policy 1, policy_version 34080 (0.0015) -[2023-09-26 16:03:28,711][05900] Updated weights for policy 0, policy_version 34080 (0.0017) -[2023-09-26 16:03:32,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17473536. Throughput: 0: 777.6, 1: 776.1. Samples: 4365982. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:03:32,908][04574] Avg episode reward: [(0, '33.000'), (1, '31.060')] -[2023-09-26 16:03:37,907][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17498112. Throughput: 0: 771.7, 1: 771.2. Samples: 4374631. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:03:37,908][04574] Avg episode reward: [(0, '32.620'), (1, '31.520')] -[2023-09-26 16:03:42,136][05900] Updated weights for policy 0, policy_version 34240 (0.0017) -[2023-09-26 16:03:42,136][05901] Updated weights for policy 1, policy_version 34240 (0.0018) -[2023-09-26 16:03:42,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17530880. Throughput: 0: 773.3, 1: 773.3. Samples: 4379568. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:03:42,908][04574] Avg episode reward: [(0, '32.220'), (1, '32.020')] -[2023-09-26 16:03:47,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17563648. Throughput: 0: 773.7, 1: 772.7. Samples: 4388823. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:03:47,908][04574] Avg episode reward: [(0, '32.000'), (1, '31.470')] -[2023-09-26 16:03:52,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17596416. Throughput: 0: 771.9, 1: 771.0. Samples: 4397944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:03:52,909][04574] Avg episode reward: [(0, '32.180'), (1, '31.270')] -[2023-09-26 16:03:55,386][05900] Updated weights for policy 0, policy_version 34400 (0.0018) -[2023-09-26 16:03:55,386][05901] Updated weights for policy 1, policy_version 34400 (0.0018) -[2023-09-26 16:03:57,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 17625088. Throughput: 0: 772.8, 1: 772.4. Samples: 4402780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:03:57,909][04574] Avg episode reward: [(0, '31.880'), (1, '30.330')] -[2023-09-26 16:04:02,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17653760. Throughput: 0: 775.0, 1: 775.4. Samples: 4412168. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:02,909][04574] Avg episode reward: [(0, '32.280'), (1, '30.590')] -[2023-09-26 16:04:07,908][04574] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17686528. Throughput: 0: 773.7, 1: 773.7. Samples: 4421632. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:07,909][04574] Avg episode reward: [(0, '31.950'), (1, '30.740')] -[2023-09-26 16:04:08,376][05901] Updated weights for policy 1, policy_version 34560 (0.0017) -[2023-09-26 16:04:08,377][05900] Updated weights for policy 0, policy_version 34560 (0.0016) -[2023-09-26 16:04:12,907][04574] Fps is (10 sec: 6553.8, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 17719296. Throughput: 0: 776.1, 1: 775.8. Samples: 4426285. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:12,908][04574] Avg episode reward: [(0, '31.670'), (1, '31.600')] -[2023-09-26 16:04:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17752064. Throughput: 0: 775.4, 1: 775.2. Samples: 4435759. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 16:04:17,909][04574] Avg episode reward: [(0, '32.260'), (1, '31.540')] -[2023-09-26 16:04:21,684][05901] Updated weights for policy 1, policy_version 34720 (0.0015) -[2023-09-26 16:04:21,684][05900] Updated weights for policy 0, policy_version 34720 (0.0017) -[2023-09-26 16:04:22,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 17776640. Throughput: 0: 777.2, 1: 778.2. Samples: 4444626. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 16:04:22,908][04574] Avg episode reward: [(0, '31.680'), (1, '30.280')] -[2023-09-26 16:04:27,907][04574] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17809408. Throughput: 0: 776.5, 1: 777.1. Samples: 4449477. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 16:04:27,908][04574] Avg episode reward: [(0, '31.430'), (1, '30.150')] -[2023-09-26 16:04:32,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17842176. Throughput: 0: 775.4, 1: 776.0. Samples: 4458636. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 16:04:32,909][04574] Avg episode reward: [(0, '31.010'), (1, '30.090')] -[2023-09-26 16:04:32,920][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000034848_8921088.pth... -[2023-09-26 16:04:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000034848_8921088.pth... 
-[2023-09-26 16:04:32,955][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000031920_8171520.pth -[2023-09-26 16:04:32,960][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000031920_8171520.pth -[2023-09-26 16:04:34,781][05900] Updated weights for policy 0, policy_version 34880 (0.0017) -[2023-09-26 16:04:34,781][05901] Updated weights for policy 1, policy_version 34880 (0.0018) -[2023-09-26 16:04:37,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17874944. Throughput: 0: 782.2, 1: 782.1. Samples: 4468338. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 16:04:37,908][04574] Avg episode reward: [(0, '30.880'), (1, '31.080')] -[2023-09-26 16:04:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17907712. Throughput: 0: 778.2, 1: 778.7. Samples: 4472840. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:42,909][04574] Avg episode reward: [(0, '31.460'), (1, '31.680')] -[2023-09-26 16:04:47,883][05900] Updated weights for policy 0, policy_version 35040 (0.0017) -[2023-09-26 16:04:47,883][05901] Updated weights for policy 1, policy_version 35040 (0.0016) -[2023-09-26 16:04:47,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17940480. Throughput: 0: 779.6, 1: 778.8. Samples: 4482297. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:47,909][04574] Avg episode reward: [(0, '31.240'), (1, '32.760')] -[2023-09-26 16:04:52,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 17965056. Throughput: 0: 776.3, 1: 776.8. Samples: 4491520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:52,908][04574] Avg episode reward: [(0, '32.360'), (1, '33.010')] -[2023-09-26 16:04:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 17997824. Throughput: 0: 774.9, 1: 774.9. 
Samples: 4496024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:04:57,909][04574] Avg episode reward: [(0, '32.090'), (1, '33.110')] -[2023-09-26 16:05:01,162][05900] Updated weights for policy 0, policy_version 35200 (0.0018) -[2023-09-26 16:05:01,163][05901] Updated weights for policy 1, policy_version 35200 (0.0017) -[2023-09-26 16:05:02,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18030592. Throughput: 0: 775.6, 1: 776.5. Samples: 4505600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:05:02,909][04574] Avg episode reward: [(0, '32.230'), (1, '33.600')] -[2023-09-26 16:05:07,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18063360. Throughput: 0: 781.3, 1: 780.1. Samples: 4514891. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:07,909][04574] Avg episode reward: [(0, '32.700'), (1, '34.950')] -[2023-09-26 16:05:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18087936. Throughput: 0: 778.7, 1: 777.0. Samples: 4519486. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:12,909][04574] Avg episode reward: [(0, '33.420'), (1, '34.860')] -[2023-09-26 16:05:14,369][05901] Updated weights for policy 1, policy_version 35360 (0.0017) -[2023-09-26 16:05:14,369][05900] Updated weights for policy 0, policy_version 35360 (0.0018) -[2023-09-26 16:05:17,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18120704. Throughput: 0: 778.5, 1: 777.6. Samples: 4528663. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:17,908][04574] Avg episode reward: [(0, '33.680'), (1, '35.090')] -[2023-09-26 16:05:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18153472. Throughput: 0: 778.2, 1: 777.7. Samples: 4538355. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:22,909][04574] Avg episode reward: [(0, '33.620'), (1, '34.460')] -[2023-09-26 16:05:27,341][05900] Updated weights for policy 0, policy_version 35520 (0.0015) -[2023-09-26 16:05:27,342][05901] Updated weights for policy 1, policy_version 35520 (0.0015) -[2023-09-26 16:05:27,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18186240. Throughput: 0: 776.4, 1: 776.5. Samples: 4542721. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:27,909][04574] Avg episode reward: [(0, '33.500'), (1, '34.340')] -[2023-09-26 16:05:32,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18219008. Throughput: 0: 780.2, 1: 781.0. Samples: 4552550. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:32,908][04574] Avg episode reward: [(0, '34.670'), (1, '34.300')] -[2023-09-26 16:05:37,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18251776. Throughput: 0: 780.8, 1: 780.3. Samples: 4561767. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:37,908][04574] Avg episode reward: [(0, '35.140'), (1, '33.720')] -[2023-09-26 16:05:40,459][05900] Updated weights for policy 0, policy_version 35680 (0.0018) -[2023-09-26 16:05:40,460][05901] Updated weights for policy 1, policy_version 35680 (0.0016) -[2023-09-26 16:05:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18276352. Throughput: 0: 782.5, 1: 784.3. Samples: 4566528. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:42,909][04574] Avg episode reward: [(0, '35.680'), (1, '34.130')] -[2023-09-26 16:05:47,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18309120. Throughput: 0: 775.0, 1: 774.7. Samples: 4575337. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:05:47,909][04574] Avg episode reward: [(0, '36.080'), (1, '33.650')] -[2023-09-26 16:05:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18341888. Throughput: 0: 776.5, 1: 776.2. Samples: 4584762. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:05:52,909][04574] Avg episode reward: [(0, '35.650'), (1, '34.150')] -[2023-09-26 16:05:53,833][05901] Updated weights for policy 1, policy_version 35840 (0.0017) -[2023-09-26 16:05:53,834][05900] Updated weights for policy 0, policy_version 35840 (0.0017) -[2023-09-26 16:05:57,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18374656. Throughput: 0: 778.2, 1: 779.2. Samples: 4589568. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:05:57,908][04574] Avg episode reward: [(0, '35.290'), (1, '34.810')] -[2023-09-26 16:06:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18399232. Throughput: 0: 777.6, 1: 778.6. Samples: 4598696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:02,909][04574] Avg episode reward: [(0, '34.890'), (1, '35.090')] -[2023-09-26 16:06:06,897][05900] Updated weights for policy 0, policy_version 36000 (0.0017) -[2023-09-26 16:06:06,897][05901] Updated weights for policy 1, policy_version 36000 (0.0018) -[2023-09-26 16:06:07,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18432000. Throughput: 0: 774.7, 1: 774.8. Samples: 4608082. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:07,909][04574] Avg episode reward: [(0, '33.940'), (1, '35.570')] -[2023-09-26 16:06:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18464768. Throughput: 0: 782.2, 1: 782.3. Samples: 4613125. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:12,908][04574] Avg episode reward: [(0, '34.310'), (1, '36.040')] -[2023-09-26 16:06:17,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18497536. Throughput: 0: 777.3, 1: 777.2. Samples: 4622499. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 16:06:17,908][04574] Avg episode reward: [(0, '34.120'), (1, '35.600')] -[2023-09-26 16:06:19,703][05900] Updated weights for policy 0, policy_version 36160 (0.0018) -[2023-09-26 16:06:19,703][05901] Updated weights for policy 1, policy_version 36160 (0.0018) -[2023-09-26 16:06:22,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18530304. Throughput: 0: 784.8, 1: 784.2. Samples: 4632371. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 16:06:22,909][04574] Avg episode reward: [(0, '34.130'), (1, '34.990')] -[2023-09-26 16:06:27,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18563072. Throughput: 0: 781.0, 1: 779.2. Samples: 4636736. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 16:06:27,908][04574] Avg episode reward: [(0, '33.850'), (1, '34.180')] -[2023-09-26 16:06:32,666][05901] Updated weights for policy 1, policy_version 36320 (0.0018) -[2023-09-26 16:06:32,666][05900] Updated weights for policy 0, policy_version 36320 (0.0019) -[2023-09-26 16:06:32,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18595840. Throughput: 0: 791.0, 1: 791.5. Samples: 4646550. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 16:06:32,908][04574] Avg episode reward: [(0, '32.640'), (1, '34.220')] -[2023-09-26 16:06:32,915][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000036320_9297920.pth... -[2023-09-26 16:06:32,916][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000036320_9297920.pth... 
-[2023-09-26 16:06:32,948][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000033392_8548352.pth -[2023-09-26 16:06:32,950][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000033392_8548352.pth -[2023-09-26 16:06:37,907][04574] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 18624512. Throughput: 0: 788.1, 1: 788.5. Samples: 4655710. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 16:06:37,908][04574] Avg episode reward: [(0, '33.190'), (1, '33.060')] -[2023-09-26 16:06:42,907][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 18653184. Throughput: 0: 787.4, 1: 786.7. Samples: 4660402. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:42,908][04574] Avg episode reward: [(0, '31.580'), (1, '31.920')] -[2023-09-26 16:06:46,007][05900] Updated weights for policy 0, policy_version 36480 (0.0018) -[2023-09-26 16:06:46,007][05901] Updated weights for policy 1, policy_version 36480 (0.0018) -[2023-09-26 16:06:47,908][04574] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18685952. Throughput: 0: 786.0, 1: 786.2. Samples: 4669441. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:47,908][04574] Avg episode reward: [(0, '31.430'), (1, '31.870')] -[2023-09-26 16:06:52,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18718720. Throughput: 0: 788.5, 1: 788.7. Samples: 4679054. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:52,908][04574] Avg episode reward: [(0, '31.710'), (1, '31.660')] -[2023-09-26 16:06:57,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18751488. Throughput: 0: 785.1, 1: 784.9. Samples: 4683776. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:06:57,909][04574] Avg episode reward: [(0, '31.810'), (1, '32.320')] -[2023-09-26 16:06:58,889][05901] Updated weights for policy 1, policy_version 36640 (0.0018) -[2023-09-26 16:06:58,889][05900] Updated weights for policy 0, policy_version 36640 (0.0017) -[2023-09-26 16:07:02,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6248.1). Total num frames: 18784256. Throughput: 0: 786.3, 1: 786.8. Samples: 4693291. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:07:02,909][04574] Avg episode reward: [(0, '31.710'), (1, '32.640')] -[2023-09-26 16:07:07,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18808832. Throughput: 0: 779.4, 1: 780.1. Samples: 4702550. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:07:07,909][04574] Avg episode reward: [(0, '31.700'), (1, '32.820')] -[2023-09-26 16:07:12,001][05900] Updated weights for policy 0, policy_version 36800 (0.0018) -[2023-09-26 16:07:12,001][05901] Updated weights for policy 1, policy_version 36800 (0.0016) -[2023-09-26 16:07:12,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18841600. Throughput: 0: 783.2, 1: 783.7. Samples: 4707246. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:07:12,909][04574] Avg episode reward: [(0, '30.310'), (1, '33.710')] -[2023-09-26 16:07:17,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18874368. Throughput: 0: 778.9, 1: 778.7. Samples: 4716645. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:07:17,908][04574] Avg episode reward: [(0, '30.530'), (1, '32.840')] -[2023-09-26 16:07:22,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18907136. Throughput: 0: 784.2, 1: 783.6. Samples: 4726260. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:07:22,909][04574] Avg episode reward: [(0, '31.040'), (1, '33.550')] -[2023-09-26 16:07:25,138][05901] Updated weights for policy 1, policy_version 36960 (0.0018) -[2023-09-26 16:07:25,138][05900] Updated weights for policy 0, policy_version 36960 (0.0020) -[2023-09-26 16:07:27,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18939904. Throughput: 0: 782.7, 1: 783.4. Samples: 4730880. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 16:07:27,908][04574] Avg episode reward: [(0, '31.610'), (1, '33.460')] -[2023-09-26 16:07:32,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18972672. Throughput: 0: 789.4, 1: 790.1. Samples: 4740519. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 16:07:32,909][04574] Avg episode reward: [(0, '31.490'), (1, '33.810')] -[2023-09-26 16:07:37,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6212.2, 300 sec: 6220.4). Total num frames: 18997248. Throughput: 0: 780.9, 1: 780.8. Samples: 4749333. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 16:07:37,909][04574] Avg episode reward: [(0, '31.490'), (1, '33.340')] -[2023-09-26 16:07:38,248][05900] Updated weights for policy 0, policy_version 37120 (0.0018) -[2023-09-26 16:07:38,248][05901] Updated weights for policy 1, policy_version 37120 (0.0018) -[2023-09-26 16:07:42,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 19030016. Throughput: 0: 781.3, 1: 781.7. Samples: 4754111. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 16:07:42,909][04574] Avg episode reward: [(0, '31.730'), (1, '33.810')] -[2023-09-26 16:07:47,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19062784. Throughput: 0: 779.9, 1: 779.7. Samples: 4763474. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 16:07:47,909][04574] Avg episode reward: [(0, '32.460'), (1, '34.190')] -[2023-09-26 16:07:51,657][05900] Updated weights for policy 0, policy_version 37280 (0.0014) -[2023-09-26 16:07:51,657][05901] Updated weights for policy 1, policy_version 37280 (0.0014) -[2023-09-26 16:07:52,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19095552. Throughput: 0: 776.3, 1: 776.2. Samples: 4772409. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:07:52,909][04574] Avg episode reward: [(0, '32.640'), (1, '33.390')] -[2023-09-26 16:07:57,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19120128. Throughput: 0: 780.3, 1: 781.2. Samples: 4777512. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:07:57,909][04574] Avg episode reward: [(0, '32.980'), (1, '33.650')] -[2023-09-26 16:08:02,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19152896. Throughput: 0: 775.9, 1: 776.4. Samples: 4786498. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:08:02,909][04574] Avg episode reward: [(0, '33.550'), (1, '34.390')] -[2023-09-26 16:08:04,640][05901] Updated weights for policy 1, policy_version 37440 (0.0017) -[2023-09-26 16:08:04,640][05900] Updated weights for policy 0, policy_version 37440 (0.0018) -[2023-09-26 16:08:07,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6234.3). Total num frames: 19185664. Throughput: 0: 778.4, 1: 776.7. Samples: 4796240. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:08:07,908][04574] Avg episode reward: [(0, '33.070'), (1, '34.740')] -[2023-09-26 16:08:12,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19218432. Throughput: 0: 773.8, 1: 773.8. Samples: 4800521. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 16:08:12,909][04574] Avg episode reward: [(0, '34.310'), (1, '35.770')] -[2023-09-26 16:08:17,908][04574] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19243008. Throughput: 0: 772.3, 1: 771.2. Samples: 4809977. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:08:17,909][04574] Avg episode reward: [(0, '34.550'), (1, '35.740')] -[2023-09-26 16:08:17,943][05900] Updated weights for policy 0, policy_version 37600 (0.0016) -[2023-09-26 16:08:17,944][05901] Updated weights for policy 1, policy_version 37600 (0.0017) -[2023-09-26 16:08:22,907][04574] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19275776. Throughput: 0: 776.2, 1: 776.0. Samples: 4819182. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:08:22,908][04574] Avg episode reward: [(0, '35.540'), (1, '36.260')] -[2023-09-26 16:08:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19308544. Throughput: 0: 779.0, 1: 778.2. Samples: 4824188. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:08:27,909][04574] Avg episode reward: [(0, '35.890'), (1, '35.620')] -[2023-09-26 16:08:30,936][05900] Updated weights for policy 0, policy_version 37760 (0.0016) -[2023-09-26 16:08:30,935][05901] Updated weights for policy 1, policy_version 37760 (0.0019) -[2023-09-26 16:08:32,908][04574] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19341312. Throughput: 0: 776.0, 1: 775.6. Samples: 4833295. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:08:32,909][04574] Avg episode reward: [(0, '35.310'), (1, '35.880')] -[2023-09-26 16:08:32,920][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000037776_9670656.pth... -[2023-09-26 16:08:32,920][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000037776_9670656.pth... 
-[2023-09-26 16:08:32,954][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000034848_8921088.pth -[2023-09-26 16:08:32,959][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000034848_8921088.pth -[2023-09-26 16:08:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19374080. Throughput: 0: 782.7, 1: 782.5. Samples: 4842846. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:08:37,909][04574] Avg episode reward: [(0, '34.550'), (1, '35.700')] -[2023-09-26 16:08:42,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19406848. Throughput: 0: 778.9, 1: 778.3. Samples: 4847583. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:08:42,909][04574] Avg episode reward: [(0, '35.130'), (1, '34.760')] -[2023-09-26 16:08:44,002][05900] Updated weights for policy 0, policy_version 37920 (0.0014) -[2023-09-26 16:08:44,002][05901] Updated weights for policy 1, policy_version 37920 (0.0017) -[2023-09-26 16:08:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19431424. Throughput: 0: 784.4, 1: 783.9. Samples: 4857075. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:08:47,908][04574] Avg episode reward: [(0, '35.500'), (1, '34.740')] -[2023-09-26 16:08:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 19464192. Throughput: 0: 775.1, 1: 777.6. Samples: 4866114. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:08:52,908][04574] Avg episode reward: [(0, '35.190'), (1, '34.930')] -[2023-09-26 16:08:57,125][05901] Updated weights for policy 1, policy_version 38080 (0.0016) -[2023-09-26 16:08:57,126][05900] Updated weights for policy 0, policy_version 38080 (0.0019) -[2023-09-26 16:08:57,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19496960. Throughput: 0: 783.4, 1: 783.5. Samples: 4871035. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:08:57,909][04574] Avg episode reward: [(0, '35.810'), (1, '34.620')] -[2023-09-26 16:09:02,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19529728. Throughput: 0: 782.2, 1: 782.6. Samples: 4880392. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 16:09:02,909][04574] Avg episode reward: [(0, '36.520'), (1, '34.270')] -[2023-09-26 16:09:07,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19562496. Throughput: 0: 785.3, 1: 786.0. Samples: 4889888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:07,909][04574] Avg episode reward: [(0, '36.590'), (1, '34.590')] -[2023-09-26 16:09:10,167][05901] Updated weights for policy 1, policy_version 38240 (0.0016) -[2023-09-26 16:09:10,167][05900] Updated weights for policy 0, policy_version 38240 (0.0015) -[2023-09-26 16:09:12,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19595264. Throughput: 0: 783.5, 1: 783.9. Samples: 4894720. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:12,908][04574] Avg episode reward: [(0, '35.120'), (1, '33.620')] -[2023-09-26 16:09:17,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19619840. Throughput: 0: 785.6, 1: 785.7. Samples: 4904006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:17,909][04574] Avg episode reward: [(0, '34.760'), (1, '33.500')] -[2023-09-26 16:09:22,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19652608. Throughput: 0: 783.6, 1: 783.4. Samples: 4913359. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:22,909][04574] Avg episode reward: [(0, '34.800'), (1, '34.190')] -[2023-09-26 16:09:23,207][05900] Updated weights for policy 0, policy_version 38400 (0.0016) -[2023-09-26 16:09:23,208][05901] Updated weights for policy 1, policy_version 38400 (0.0016) -[2023-09-26 16:09:27,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19685376. Throughput: 0: 783.9, 1: 783.2. Samples: 4918106. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:27,909][04574] Avg episode reward: [(0, '35.040'), (1, '34.590')] -[2023-09-26 16:09:32,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 19718144. Throughput: 0: 782.5, 1: 782.5. Samples: 4927502. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:32,908][04574] Avg episode reward: [(0, '35.970'), (1, '34.950')] -[2023-09-26 16:09:36,254][05900] Updated weights for policy 0, policy_version 38560 (0.0017) -[2023-09-26 16:09:36,254][05901] Updated weights for policy 1, policy_version 38560 (0.0017) -[2023-09-26 16:09:37,908][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19750912. Throughput: 0: 786.6, 1: 787.8. Samples: 4936962. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:37,909][04574] Avg episode reward: [(0, '35.980'), (1, '35.580')] -[2023-09-26 16:09:42,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19783680. Throughput: 0: 785.2, 1: 785.7. Samples: 4941727. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:42,909][04574] Avg episode reward: [(0, '36.760'), (1, '35.450')] -[2023-09-26 16:09:47,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19808256. Throughput: 0: 785.9, 1: 786.1. Samples: 4951132. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:47,909][04574] Avg episode reward: [(0, '36.040'), (1, '35.530')] -[2023-09-26 16:09:49,303][05901] Updated weights for policy 1, policy_version 38720 (0.0018) -[2023-09-26 16:09:49,303][05900] Updated weights for policy 0, policy_version 38720 (0.0018) -[2023-09-26 16:09:52,908][04574] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19841024. Throughput: 0: 782.2, 1: 781.7. Samples: 4960261. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:52,909][04574] Avg episode reward: [(0, '36.420'), (1, '34.730')] -[2023-09-26 16:09:57,908][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19873792. Throughput: 0: 777.8, 1: 777.4. Samples: 4964701. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:09:57,908][04574] Avg episode reward: [(0, '35.410'), (1, '34.950')] -[2023-09-26 16:10:02,526][05900] Updated weights for policy 0, policy_version 38880 (0.0018) -[2023-09-26 16:10:02,526][05901] Updated weights for policy 1, policy_version 38880 (0.0018) -[2023-09-26 16:10:02,907][04574] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 19906560. Throughput: 0: 782.8, 1: 781.7. Samples: 4974412. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:10:02,908][04574] Avg episode reward: [(0, '34.600'), (1, '34.610')] -[2023-09-26 16:10:07,907][04574] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19939328. Throughput: 0: 783.0, 1: 783.7. Samples: 4983861. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:10:07,908][04574] Avg episode reward: [(0, '33.840'), (1, '35.170')] -[2023-09-26 16:10:12,908][04574] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19972096. Throughput: 0: 785.3, 1: 785.3. Samples: 4988783. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:10:12,909][04574] Avg episode reward: [(0, '34.110'), (1, '35.210')] -[2023-09-26 16:10:15,450][05900] Updated weights for policy 0, policy_version 39040 (0.0015) -[2023-09-26 16:10:15,451][05901] Updated weights for policy 1, policy_version 39040 (0.0017) -[2023-09-26 16:10:17,908][04574] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19996672. Throughput: 0: 784.4, 1: 784.0. Samples: 4998083. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 16:10:17,908][04574] Avg episode reward: [(0, '34.300'), (1, '35.340')] -[2023-09-26 16:10:19,388][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000039088_10006528.pth... -[2023-09-26 16:10:19,389][05933] Stopping RolloutWorker_w0... -[2023-09-26 16:10:19,389][05940] Stopping RolloutWorker_w6... -[2023-09-26 16:10:19,389][05939] Stopping RolloutWorker_w5... -[2023-09-26 16:10:19,389][05936] Stopping RolloutWorker_w2... -[2023-09-26 16:10:19,389][05934] Stopping RolloutWorker_w1... -[2023-09-26 16:10:19,389][05938] Stopping RolloutWorker_w4... -[2023-09-26 16:10:19,389][05384] Stopping Batcher_0... -[2023-09-26 16:10:19,389][05937] Stopping RolloutWorker_w3... -[2023-09-26 16:10:19,389][05941] Stopping RolloutWorker_w7... -[2023-09-26 16:10:19,390][05933] Loop rollout_proc0_evt_loop terminating... -[2023-09-26 16:10:19,390][05936] Loop rollout_proc2_evt_loop terminating... -[2023-09-26 16:10:19,390][05934] Loop rollout_proc1_evt_loop terminating... -[2023-09-26 16:10:19,390][05940] Loop rollout_proc6_evt_loop terminating... -[2023-09-26 16:10:19,390][05938] Loop rollout_proc4_evt_loop terminating... -[2023-09-26 16:10:19,389][04574] Component RolloutWorker_w6 stopped! -[2023-09-26 16:10:19,390][05939] Loop rollout_proc5_evt_loop terminating... -[2023-09-26 16:10:19,390][05937] Loop rollout_proc3_evt_loop terminating... -[2023-09-26 16:10:19,390][05384] Loop batcher_evt_loop terminating... 
-[2023-09-26 16:10:19,390][05941] Loop rollout_proc7_evt_loop terminating... -[2023-09-26 16:10:19,390][04574] Component RolloutWorker_w0 stopped! -[2023-09-26 16:10:19,391][04574] Component RolloutWorker_w5 stopped! -[2023-09-26 16:10:19,392][04574] Component RolloutWorker_w2 stopped! -[2023-09-26 16:10:19,393][04574] Component Batcher_0 stopped! -[2023-09-26 16:10:19,393][04574] Component RolloutWorker_w4 stopped! -[2023-09-26 16:10:19,394][04574] Component RolloutWorker_w1 stopped! -[2023-09-26 16:10:19,394][04574] Component RolloutWorker_w3 stopped! -[2023-09-26 16:10:19,395][04574] Component RolloutWorker_w7 stopped! -[2023-09-26 16:10:19,400][04574] Component Batcher_1 stopped! -[2023-09-26 16:10:19,409][05596] Stopping Batcher_1... -[2023-09-26 16:10:19,419][05596] Loop batcher_evt_loop terminating... -[2023-09-26 16:10:19,420][05596] Removing ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000036320_9297920.pth -[2023-09-26 16:10:19,424][05596] Saving ./train_atari/atari_kongfumaster/checkpoint_p1/checkpoint_000039088_10006528.pth... -[2023-09-26 16:10:19,451][05901] Weights refcount: 2 0 -[2023-09-26 16:10:19,452][05901] Stopping InferenceWorker_p1-w0... -[2023-09-26 16:10:19,452][05901] Loop inference_proc1-0_evt_loop terminating... -[2023-09-26 16:10:19,452][04574] Component InferenceWorker_p1-w0 stopped! -[2023-09-26 16:10:19,453][05900] Weights refcount: 2 0 -[2023-09-26 16:10:19,454][05900] Stopping InferenceWorker_p0-w0... -[2023-09-26 16:10:19,454][05900] Loop inference_proc0-0_evt_loop terminating... -[2023-09-26 16:10:19,454][04574] Component InferenceWorker_p0-w0 stopped! -[2023-09-26 16:10:19,462][05596] Stopping LearnerWorker_p1... -[2023-09-26 16:10:19,462][05596] Loop learner_proc1_evt_loop terminating... -[2023-09-26 16:10:19,463][04574] Component LearnerWorker_p1 stopped! -[2023-09-26 16:10:19,512][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000039088_10006528.pth... 
-[2023-09-26 16:10:19,542][05384] Removing ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000036320_9297920.pth -[2023-09-26 16:10:19,546][05384] Saving ./train_atari/atari_kongfumaster/checkpoint_p0/checkpoint_000039088_10006528.pth... -[2023-09-26 16:10:19,582][05384] Stopping LearnerWorker_p0... -[2023-09-26 16:10:19,582][05384] Loop learner_proc0_evt_loop terminating... -[2023-09-26 16:10:19,582][04574] Component LearnerWorker_p0 stopped! -[2023-09-26 16:10:19,583][04574] Waiting for process learner_proc0 to stop... -[2023-09-26 16:10:20,231][04574] Waiting for process learner_proc1 to stop... -[2023-09-26 16:10:20,232][04574] Waiting for process inference_proc0-0 to join... -[2023-09-26 16:10:20,233][04574] Waiting for process inference_proc1-0 to join... -[2023-09-26 16:10:20,234][04574] Waiting for process rollout_proc0 to join... -[2023-09-26 16:10:20,234][04574] Waiting for process rollout_proc1 to join... -[2023-09-26 16:10:20,235][04574] Waiting for process rollout_proc2 to join... -[2023-09-26 16:10:20,236][04574] Waiting for process rollout_proc3 to join... -[2023-09-26 16:10:20,236][04574] Waiting for process rollout_proc4 to join... -[2023-09-26 16:10:20,237][04574] Waiting for process rollout_proc5 to join... -[2023-09-26 16:10:20,238][04574] Waiting for process rollout_proc6 to join... -[2023-09-26 16:10:20,238][04574] Waiting for process rollout_proc7 to join... 
-[2023-09-26 16:10:20,243][04574] Batcher 0 profile tree view: -batching: 21.5311, releasing_batches: 1.8340 -[2023-09-26 16:10:20,243][04574] Batcher 1 profile tree view: -batching: 21.0383, releasing_batches: 1.8348 -[2023-09-26 16:10:20,243][04574] InferenceWorker_p0-w0 profile tree view: -wait_policy: 0.0051 - wait_policy_total: 643.6505 -update_model: 37.7979 - weight_update: 0.0016 -one_step: 0.0012 - handle_policy_step: 2327.5498 - deserialize: 68.0581, stack: 16.6821, obs_to_device_normalize: 562.6493, forward: 1126.5541, send_messages: 96.1678 - prepare_outputs: 307.4518 - to_cpu: 154.8116 -[2023-09-26 16:10:20,244][04574] InferenceWorker_p1-w0 profile tree view: -wait_policy: 0.0052 - wait_policy_total: 695.2805 -update_model: 36.9582 - weight_update: 0.0016 -one_step: 0.0011 - handle_policy_step: 2279.4803 - deserialize: 67.5536, stack: 16.1003, obs_to_device_normalize: 553.3622, forward: 1097.8032, send_messages: 94.8961 - prepare_outputs: 304.7677 - to_cpu: 154.0876 -[2023-09-26 16:10:20,244][04574] Learner 0 profile tree view: -misc: 0.0168, prepare_batch: 32.5086 -train: 457.8950 - epoch_init: 0.1028, minibatch_init: 3.0742, losses_postprocess: 62.9652, kl_divergence: 5.4354, after_optimizer: 21.6736 - calculate_losses: 44.8471 - losses_init: 0.0998, forward_head: 14.3289, bptt_initial: 0.4363, bptt: 0.4467, tail: 10.3173, advantages_returns: 3.0395, losses: 12.6570 - update: 315.7712 - clip: 163.8080 -[2023-09-26 16:10:20,244][04574] Learner 1 profile tree view: -misc: 0.0155, prepare_batch: 32.1190 -train: 454.7474 - epoch_init: 0.1043, minibatch_init: 3.1167, losses_postprocess: 62.1055, kl_divergence: 5.4961, after_optimizer: 21.8255 - calculate_losses: 44.6140 - losses_init: 0.0995, forward_head: 13.5006, bptt_initial: 0.4337, bptt: 0.4775, tail: 10.4193, advantages_returns: 3.1326, losses: 12.9189 - update: 313.3632 - clip: 164.0103 -[2023-09-26 16:10:20,245][04574] RolloutWorker_w0 profile tree view: -wait_for_trajectories: 0.3959, 
enqueue_policy_requests: 43.3481, env_step: 1266.8802, overhead: 29.5598, complete_rollouts: 1.0491 -save_policy_outputs: 54.2213 - split_output_tensors: 18.5066 -[2023-09-26 16:10:20,245][04574] RolloutWorker_w7 profile tree view: -wait_for_trajectories: 0.3957, enqueue_policy_requests: 42.6959, env_step: 1175.2883, overhead: 28.9912, complete_rollouts: 1.0893 -save_policy_outputs: 53.0536 - split_output_tensors: 18.1580 -[2023-09-26 16:10:20,245][04574] Loop Runner_EvtLoop terminating... -[2023-09-26 16:10:20,246][04574] Runner profile tree view: -main_loop: 3223.2339 -[2023-09-26 16:10:20,246][04574] Collected {0: 10006528, 1: 10006528}, FPS: 6209.0 +[2023-10-13 00:12:02,830][46384] Using optimizer +[2023-10-13 00:12:02,831][46384] No checkpoints found +[2023-10-13 00:12:02,831][46384] Did not load from checkpoint, starting from scratch! +[2023-10-13 00:12:02,831][46384] Initialized policy 1 weights for model version 0 +[2023-10-13 00:12:02,833][46384] LearnerWorker_p1 finished initialization! 
+[2023-10-13 00:12:02,834][46384] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-10-13 00:12:03,868][45375] Starting process rollout_proc14 +[2023-10-13 00:12:03,873][46662] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-13 00:12:03,874][46662] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2023-10-13 00:12:03,881][45375] Starting process rollout_proc15 +[2023-10-13 00:12:03,886][46702] Worker 5 uses CPU cores [10, 11] +[2023-10-13 00:12:03,892][46662] Num visible devices: 1 +[2023-10-13 00:12:03,920][46696] Worker 0 uses CPU cores [0, 1] +[2023-10-13 00:12:03,960][46699] Worker 2 uses CPU cores [4, 5] +[2023-10-13 00:12:03,962][46703] Worker 6 uses CPU cores [12, 13] +[2023-10-13 00:12:03,971][46707] Worker 10 uses CPU cores [20, 21] +[2023-10-13 00:12:04,052][46697] Worker 1 uses CPU cores [2, 3] +[2023-10-13 00:12:04,228][46709] Worker 11 uses CPU cores [22, 23] +[2023-10-13 00:12:04,299][46704] Worker 7 uses CPU cores [14, 15] +[2023-10-13 00:12:04,300][46701] Worker 3 uses CPU cores [6, 7] +[2023-10-13 00:12:04,384][46705] Worker 8 uses CPU cores [16, 17] +[2023-10-13 00:12:04,493][46706] Worker 9 uses CPU cores [18, 19] +[2023-10-13 00:12:04,543][46663] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-13 00:12:04,543][46663] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 +[2023-10-13 00:12:04,561][46663] Num visible devices: 1 +[2023-10-13 00:12:04,583][46700] Worker 4 uses CPU cores [8, 9] +[2023-10-13 00:12:04,614][46662] RunningMeanStd input shape: (4, 84, 84) +[2023-10-13 00:12:04,615][46662] RunningMeanStd input shape: (1,) +[2023-10-13 00:12:04,626][46662] ConvEncoder: input_channels=4 +[2023-10-13 00:12:04,676][46710] Worker 13 uses CPU cores [26, 27] +[2023-10-13 00:12:04,677][46708] Worker 12 uses CPU cores [24, 25] +[2023-10-13 00:12:04,732][46662] Conv encoder output size: 512 +[2023-10-13 
00:12:05,171][46663] RunningMeanStd input shape: (4, 84, 84) +[2023-10-13 00:12:05,171][46663] RunningMeanStd input shape: (1,) +[2023-10-13 00:12:05,183][46663] ConvEncoder: input_channels=4 +[2023-10-13 00:12:05,285][46663] Conv encoder output size: 512 +[2023-10-13 00:12:05,801][47477] Worker 15 uses CPU cores [30, 31] +[2023-10-13 00:12:05,932][45375] Inference worker 0-0 is ready! +[2023-10-13 00:12:05,933][45375] Inference worker 1-0 is ready! +[2023-10-13 00:12:05,933][45375] All inference workers are ready! Signal rollout workers to start! +[2023-10-13 00:12:05,934][46704] EnvRunner 7-0 uses policy 1 +[2023-10-13 00:12:05,934][46699] EnvRunner 2-0 uses policy 0 +[2023-10-13 00:12:05,934][46710] EnvRunner 13-0 uses policy 1 +[2023-10-13 00:12:05,934][46702] EnvRunner 5-0 uses policy 1 +[2023-10-13 00:12:05,934][46707] EnvRunner 10-0 uses policy 0 +[2023-10-13 00:12:05,934][46703] EnvRunner 6-0 uses policy 0 +[2023-10-13 00:12:05,934][47476] Worker 14 uses CPU cores [28, 29] +[2023-10-13 00:12:05,934][46708] EnvRunner 12-0 uses policy 0 +[2023-10-13 00:12:05,934][46705] EnvRunner 8-0 uses policy 0 +[2023-10-13 00:12:05,934][46706] EnvRunner 9-0 uses policy 1 +[2023-10-13 00:12:05,934][46709] EnvRunner 11-0 uses policy 1 +[2023-10-13 00:12:05,934][46700] EnvRunner 4-0 uses policy 0 +[2023-10-13 00:12:05,934][46701] EnvRunner 3-0 uses policy 1 +[2023-10-13 00:12:05,935][46697] EnvRunner 1-0 uses policy 1 +[2023-10-13 00:12:05,935][46696] EnvRunner 0-0 uses policy 0 +[2023-10-13 00:12:05,934][45375] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-13 00:12:06,025][47477] EnvRunner 15-0 uses policy 1 +[2023-10-13 00:12:06,093][47476] EnvRunner 14-0 uses policy 0 +[2023-10-13 00:12:08,061][45375] Heartbeat connected on Batcher_0 +[2023-10-13 00:12:08,064][45375] Heartbeat connected on LearnerWorker_p0 +[2023-10-13 00:12:08,067][45375] Heartbeat connected on Batcher_1 +[2023-10-13 00:12:08,070][45375] Heartbeat connected on LearnerWorker_p1 +[2023-10-13 00:12:08,078][45375] Heartbeat connected on InferenceWorker_p0-w0 +[2023-10-13 00:12:08,084][45375] Heartbeat connected on InferenceWorker_p1-w0 +[2023-10-13 00:12:08,086][45375] Heartbeat connected on RolloutWorker_w0 +[2023-10-13 00:12:08,088][45375] Heartbeat connected on RolloutWorker_w2 +[2023-10-13 00:12:08,089][45375] Heartbeat connected on RolloutWorker_w1 +[2023-10-13 00:12:08,091][45375] Heartbeat connected on RolloutWorker_w3 +[2023-10-13 00:12:08,095][45375] Heartbeat connected on RolloutWorker_w5 +[2023-10-13 00:12:08,098][45375] Heartbeat connected on RolloutWorker_w4 +[2023-10-13 00:12:08,099][45375] Heartbeat connected on RolloutWorker_w6 +[2023-10-13 00:12:08,102][45375] Heartbeat connected on RolloutWorker_w7 +[2023-10-13 00:12:08,106][45375] Heartbeat connected on RolloutWorker_w9 +[2023-10-13 00:12:08,107][45375] Heartbeat connected on RolloutWorker_w8 +[2023-10-13 00:12:08,109][45375] Heartbeat connected on RolloutWorker_w10 +[2023-10-13 00:12:08,112][45375] Heartbeat connected on RolloutWorker_w11 +[2023-10-13 00:12:08,114][45375] Heartbeat connected on RolloutWorker_w12 +[2023-10-13 00:12:08,117][45375] Heartbeat connected on RolloutWorker_w13 +[2023-10-13 00:12:08,120][45375] Heartbeat connected on RolloutWorker_w14 +[2023-10-13 00:12:08,127][45375] Heartbeat connected on RolloutWorker_w15 +[2023-10-13 00:12:08,607][45375] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 235.0, 1: 548.6. Samples: 2094. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-13 00:12:13,607][45375] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 810.2, 1: 931.7. Samples: 13364. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-13 00:12:13,607][45375] Avg episode reward: [(0, '0.561'), (1, '0.750')] +[2023-10-13 00:12:16,094][46662] Updated weights for policy 0, policy_version 10 (0.0008) +[2023-10-13 00:12:16,432][46663] Updated weights for policy 1, policy_version 10 (0.0008) +[2023-10-13 00:12:16,458][46662] Updated weights for policy 0, policy_version 20 (0.0008) +[2023-10-13 00:12:16,796][46663] Updated weights for policy 1, policy_version 20 (0.0008) +[2023-10-13 00:12:16,835][46662] Updated weights for policy 0, policy_version 30 (0.0007) +[2023-10-13 00:12:17,169][46663] Updated weights for policy 1, policy_version 30 (0.0007) +[2023-10-13 00:12:18,607][45375] Fps is (10 sec: 6553.6, 60 sec: 5171.6, 300 sec: 5171.6). Total num frames: 65536. Throughput: 0: 1135.2, 1: 1197.7. Samples: 29564. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:12:18,608][45375] Avg episode reward: [(0, '0.565'), (1, '0.750')] +[2023-10-13 00:12:19,719][46662] Updated weights for policy 0, policy_version 40 (0.0007) +[2023-10-13 00:12:19,982][46663] Updated weights for policy 1, policy_version 40 (0.0009) +[2023-10-13 00:12:20,085][46662] Updated weights for policy 0, policy_version 50 (0.0010) +[2023-10-13 00:12:20,350][46663] Updated weights for policy 1, policy_version 50 (0.0008) +[2023-10-13 00:12:20,470][46662] Updated weights for policy 0, policy_version 60 (0.0008) +[2023-10-13 00:12:20,716][46663] Updated weights for policy 1, policy_version 60 (0.0007) +[2023-10-13 00:12:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 7416.8, 300 sec: 7416.8). Total num frames: 131072. Throughput: 0: 1373.7, 1: 1438.0. Samples: 49688. 
Policy #0 lag: (min: 33.0, avg: 33.0, max: 33.0) +[2023-10-13 00:12:23,607][45375] Avg episode reward: [(0, '0.730'), (1, '0.710')] +[2023-10-13 00:12:24,100][46663] Updated weights for policy 1, policy_version 70 (0.0007) +[2023-10-13 00:12:24,295][46662] Updated weights for policy 0, policy_version 70 (0.0008) +[2023-10-13 00:12:24,472][46663] Updated weights for policy 1, policy_version 80 (0.0008) +[2023-10-13 00:12:24,666][46662] Updated weights for policy 0, policy_version 80 (0.0008) +[2023-10-13 00:12:24,831][46663] Updated weights for policy 1, policy_version 90 (0.0007) +[2023-10-13 00:12:25,047][46662] Updated weights for policy 0, policy_version 90 (0.0009) +[2023-10-13 00:12:28,532][46663] Updated weights for policy 1, policy_version 100 (0.0007) +[2023-10-13 00:12:28,607][45375] Fps is (10 sec: 13107.4, 60 sec: 8671.8, 300 sec: 8671.8). Total num frames: 196608. Throughput: 0: 1263.8, 1: 1320.1. Samples: 58582. Policy #0 lag: (min: 31.0, avg: 48.4, max: 63.0) +[2023-10-13 00:12:28,607][45375] Avg episode reward: [(0, '0.770'), (1, '0.750')] +[2023-10-13 00:12:28,762][46662] Updated weights for policy 0, policy_version 100 (0.0010) +[2023-10-13 00:12:28,887][46663] Updated weights for policy 1, policy_version 110 (0.0008) +[2023-10-13 00:12:29,136][46662] Updated weights for policy 0, policy_version 110 (0.0008) +[2023-10-13 00:12:29,254][46663] Updated weights for policy 1, policy_version 120 (0.0010) +[2023-10-13 00:12:29,514][46662] Updated weights for policy 0, policy_version 120 (0.0010) +[2023-10-13 00:12:33,314][46663] Updated weights for policy 1, policy_version 130 (0.0008) +[2023-10-13 00:12:33,509][46662] Updated weights for policy 0, policy_version 130 (0.0007) +[2023-10-13 00:12:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 9473.2, 300 sec: 9473.2). Total num frames: 262144. Throughput: 0: 1402.3, 1: 1451.4. Samples: 78968. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 00:12:33,607][45375] Avg episode reward: [(0, '0.840'), (1, '0.820')] +[2023-10-13 00:12:33,676][46663] Updated weights for policy 1, policy_version 140 (0.0008) +[2023-10-13 00:12:33,877][46662] Updated weights for policy 0, policy_version 140 (0.0007) +[2023-10-13 00:12:34,043][46663] Updated weights for policy 1, policy_version 150 (0.0008) +[2023-10-13 00:12:34,250][46662] Updated weights for policy 0, policy_version 150 (0.0008) +[2023-10-13 00:12:34,404][46663] Updated weights for policy 1, policy_version 160 (0.0009) +[2023-10-13 00:12:34,405][46384] Saving new best policy, reward=0.820! +[2023-10-13 00:12:34,620][46091] Saving new best policy, reward=0.840! +[2023-10-13 00:12:34,623][46662] Updated weights for policy 0, policy_version 160 (0.0007) +[2023-10-13 00:12:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 10029.3, 300 sec: 10029.3). Total num frames: 327680. Throughput: 0: 1501.3, 1: 1540.0. Samples: 99366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:12:38,607][45375] Avg episode reward: [(0, '0.810'), (1, '1.020')] +[2023-10-13 00:12:38,737][46662] Updated weights for policy 0, policy_version 170 (0.0007) +[2023-10-13 00:12:38,778][46663] Updated weights for policy 1, policy_version 170 (0.0007) +[2023-10-13 00:12:39,098][46662] Updated weights for policy 0, policy_version 180 (0.0008) +[2023-10-13 00:12:39,133][46663] Updated weights for policy 1, policy_version 180 (0.0008) +[2023-10-13 00:12:39,467][46662] Updated weights for policy 0, policy_version 190 (0.0008) +[2023-10-13 00:12:39,501][46663] Updated weights for policy 1, policy_version 190 (0.0009) +[2023-10-13 00:12:39,576][46384] Saving new best policy, reward=1.020! +[2023-10-13 00:12:43,460][46662] Updated weights for policy 0, policy_version 200 (0.0007) +[2023-10-13 00:12:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 10437.8, 300 sec: 10437.8). Total num frames: 393216. 
Throughput: 0: 1422.9, 1: 1451.8. Samples: 108296. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:12:43,607][45375] Avg episode reward: [(0, '1.020'), (1, '1.050')] +[2023-10-13 00:12:43,813][46663] Updated weights for policy 1, policy_version 200 (0.0009) +[2023-10-13 00:12:43,827][46662] Updated weights for policy 0, policy_version 210 (0.0008) +[2023-10-13 00:12:44,178][46663] Updated weights for policy 1, policy_version 210 (0.0008) +[2023-10-13 00:12:44,192][46662] Updated weights for policy 0, policy_version 220 (0.0008) +[2023-10-13 00:12:44,332][46091] Saving new best policy, reward=1.020! +[2023-10-13 00:12:44,537][46663] Updated weights for policy 1, policy_version 220 (0.0008) +[2023-10-13 00:12:44,687][46384] Saving new best policy, reward=1.050! +[2023-10-13 00:12:48,270][46662] Updated weights for policy 0, policy_version 230 (0.0007) +[2023-10-13 00:12:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 10750.6, 300 sec: 10750.6). Total num frames: 458752. Throughput: 0: 1500.7, 1: 1519.4. Samples: 128872. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 00:12:48,607][45375] Avg episode reward: [(0, '1.180'), (1, '1.150')] +[2023-10-13 00:12:48,636][46662] Updated weights for policy 0, policy_version 240 (0.0007) +[2023-10-13 00:12:48,721][46663] Updated weights for policy 1, policy_version 230 (0.0007) +[2023-10-13 00:12:49,005][46662] Updated weights for policy 0, policy_version 250 (0.0009) +[2023-10-13 00:12:49,077][46663] Updated weights for policy 1, policy_version 240 (0.0007) +[2023-10-13 00:12:49,222][46091] Saving new best policy, reward=1.180! +[2023-10-13 00:12:49,449][46663] Updated weights for policy 1, policy_version 250 (0.0007) +[2023-10-13 00:12:49,670][46384] Saving new best policy, reward=1.150! 
+[2023-10-13 00:12:53,063][46662] Updated weights for policy 0, policy_version 260 (0.0009) +[2023-10-13 00:12:53,426][46662] Updated weights for policy 0, policy_version 270 (0.0009) +[2023-10-13 00:12:53,547][46663] Updated weights for policy 1, policy_version 260 (0.0009) +[2023-10-13 00:12:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 10997.8, 300 sec: 10997.8). Total num frames: 524288. Throughput: 0: 1638.2, 1: 1634.6. Samples: 149368. Policy #0 lag: (min: 4.0, avg: 11.7, max: 36.0) +[2023-10-13 00:12:53,607][45375] Avg episode reward: [(0, '1.330'), (1, '1.270')] +[2023-10-13 00:12:53,797][46662] Updated weights for policy 0, policy_version 280 (0.0007) +[2023-10-13 00:12:53,904][46663] Updated weights for policy 1, policy_version 270 (0.0007) +[2023-10-13 00:12:54,089][46091] Saving new best policy, reward=1.330! +[2023-10-13 00:12:54,280][46663] Updated weights for policy 1, policy_version 280 (0.0007) +[2023-10-13 00:12:54,575][46384] Saving new best policy, reward=1.270! +[2023-10-13 00:12:57,957][46662] Updated weights for policy 0, policy_version 290 (0.0007) +[2023-10-13 00:12:58,215][46663] Updated weights for policy 1, policy_version 290 (0.0009) +[2023-10-13 00:12:58,370][46662] Updated weights for policy 0, policy_version 300 (0.0008) +[2023-10-13 00:12:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 11198.0, 300 sec: 11198.0). Total num frames: 589824. Throughput: 0: 1615.2, 1: 1611.0. Samples: 158546. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:12:58,607][45375] Avg episode reward: [(0, '1.350'), (1, '1.420')] +[2023-10-13 00:12:58,611][46663] Updated weights for policy 1, policy_version 300 (0.0009) +[2023-10-13 00:12:58,728][46662] Updated weights for policy 0, policy_version 310 (0.0008) +[2023-10-13 00:12:58,979][46663] Updated weights for policy 1, policy_version 310 (0.0008) +[2023-10-13 00:12:59,099][46091] Saving new best policy, reward=1.350! 
+[2023-10-13 00:12:59,100][46662] Updated weights for policy 0, policy_version 320 (0.0009) +[2023-10-13 00:12:59,344][46384] Saving new best policy, reward=1.420! +[2023-10-13 00:12:59,346][46663] Updated weights for policy 1, policy_version 320 (0.0010) +[2023-10-13 00:13:03,104][46662] Updated weights for policy 0, policy_version 330 (0.0008) +[2023-10-13 00:13:03,341][46663] Updated weights for policy 1, policy_version 330 (0.0007) +[2023-10-13 00:13:03,469][46662] Updated weights for policy 0, policy_version 340 (0.0009) +[2023-10-13 00:13:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 11363.5, 300 sec: 11363.5). Total num frames: 655360. Throughput: 0: 1663.2, 1: 1658.0. Samples: 179014. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:13:03,608][45375] Avg episode reward: [(0, '1.620'), (1, '1.750')] +[2023-10-13 00:13:03,714][46663] Updated weights for policy 1, policy_version 340 (0.0007) +[2023-10-13 00:13:03,849][46662] Updated weights for policy 0, policy_version 350 (0.0007) +[2023-10-13 00:13:03,919][46091] Saving new best policy, reward=1.620! +[2023-10-13 00:13:04,074][46663] Updated weights for policy 1, policy_version 350 (0.0007) +[2023-10-13 00:13:04,145][46384] Saving new best policy, reward=1.750! +[2023-10-13 00:13:07,941][46662] Updated weights for policy 0, policy_version 360 (0.0008) +[2023-10-13 00:13:07,993][46663] Updated weights for policy 1, policy_version 360 (0.0009) +[2023-10-13 00:13:08,314][46662] Updated weights for policy 0, policy_version 370 (0.0009) +[2023-10-13 00:13:08,357][46663] Updated weights for policy 1, policy_version 370 (0.0007) +[2023-10-13 00:13:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 12014.9, 300 sec: 11502.6). Total num frames: 720896. Throughput: 0: 1666.0, 1: 1650.0. Samples: 198908. 
Policy #0 lag: (min: 26.0, avg: 28.5, max: 58.0) +[2023-10-13 00:13:08,608][45375] Avg episode reward: [(0, '1.900'), (1, '1.800')] +[2023-10-13 00:13:08,690][46662] Updated weights for policy 0, policy_version 380 (0.0009) +[2023-10-13 00:13:08,723][46663] Updated weights for policy 1, policy_version 380 (0.0007) +[2023-10-13 00:13:08,833][46091] Saving new best policy, reward=1.900! +[2023-10-13 00:13:08,866][46384] Saving new best policy, reward=1.800! +[2023-10-13 00:13:12,793][46662] Updated weights for policy 0, policy_version 390 (0.0007) +[2023-10-13 00:13:13,013][46663] Updated weights for policy 1, policy_version 391 (0.0008) +[2023-10-13 00:13:13,157][46662] Updated weights for policy 0, policy_version 400 (0.0008) +[2023-10-13 00:13:13,378][46663] Updated weights for policy 1, policy_version 401 (0.0010) +[2023-10-13 00:13:13,529][46662] Updated weights for policy 0, policy_version 410 (0.0008) +[2023-10-13 00:13:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 11621.2). Total num frames: 786432. Throughput: 0: 1675.6, 1: 1661.8. Samples: 208764. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-13 00:13:13,607][45375] Avg episode reward: [(0, '2.130'), (1, '2.110')] +[2023-10-13 00:13:13,744][46663] Updated weights for policy 1, policy_version 411 (0.0008) +[2023-10-13 00:13:13,746][46091] Saving new best policy, reward=2.130! +[2023-10-13 00:13:13,926][46384] Saving new best policy, reward=2.110! 
+[2023-10-13 00:13:17,468][46662] Updated weights for policy 0, policy_version 420 (0.0008) +[2023-10-13 00:13:17,836][46662] Updated weights for policy 0, policy_version 430 (0.0008) +[2023-10-13 00:13:17,855][46663] Updated weights for policy 1, policy_version 421 (0.0009) +[2023-10-13 00:13:18,204][46662] Updated weights for policy 0, policy_version 440 (0.0008) +[2023-10-13 00:13:18,221][46663] Updated weights for policy 1, policy_version 431 (0.0009) +[2023-10-13 00:13:18,593][46663] Updated weights for policy 1, policy_version 441 (0.0009) +[2023-10-13 00:13:18,606][45375] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 12174.3). Total num frames: 884736. Throughput: 0: 1682.4, 1: 1664.7. Samples: 229584. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 00:13:18,607][45375] Avg episode reward: [(0, '2.100'), (1, '2.340')] +[2023-10-13 00:13:18,848][46384] Saving new best policy, reward=2.340! +[2023-10-13 00:13:22,419][46662] Updated weights for policy 0, policy_version 450 (0.0009) +[2023-10-13 00:13:22,631][46663] Updated weights for policy 1, policy_version 451 (0.0007) +[2023-10-13 00:13:22,778][46662] Updated weights for policy 0, policy_version 460 (0.0008) +[2023-10-13 00:13:22,993][46663] Updated weights for policy 1, policy_version 461 (0.0009) +[2023-10-13 00:13:23,148][46662] Updated weights for policy 0, policy_version 470 (0.0008) +[2023-10-13 00:13:23,360][46663] Updated weights for policy 1, policy_version 471 (0.0008) +[2023-10-13 00:13:23,517][46662] Updated weights for policy 0, policy_version 480 (0.0008) +[2023-10-13 00:13:23,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 12234.4). Total num frames: 950272. Throughput: 0: 1670.7, 1: 1649.8. Samples: 248788. Policy #0 lag: (min: 1.0, avg: 2.1, max: 24.0) +[2023-10-13 00:13:23,607][45375] Avg episode reward: [(0, '2.040'), (1, '2.550')] +[2023-10-13 00:13:23,687][46384] Saving new best policy, reward=2.550! 
+[2023-10-13 00:13:27,358][46663] Updated weights for policy 1, policy_version 481 (0.0008) +[2023-10-13 00:13:27,626][46662] Updated weights for policy 0, policy_version 490 (0.0009) +[2023-10-13 00:13:27,724][46663] Updated weights for policy 1, policy_version 491 (0.0007) +[2023-10-13 00:13:27,996][46662] Updated weights for policy 0, policy_version 500 (0.0009) +[2023-10-13 00:13:28,093][46663] Updated weights for policy 1, policy_version 501 (0.0008) +[2023-10-13 00:13:28,358][46662] Updated weights for policy 0, policy_version 510 (0.0007) +[2023-10-13 00:13:28,463][46663] Updated weights for policy 1, policy_version 511 (0.0009) +[2023-10-13 00:13:28,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 12683.5). Total num frames: 1048576. Throughput: 0: 1683.1, 1: 1674.9. Samples: 259406. Policy #0 lag: (min: 26.0, avg: 26.0, max: 27.0) +[2023-10-13 00:13:28,607][45375] Avg episode reward: [(0, '2.240'), (1, '2.990')] +[2023-10-13 00:13:28,608][46091] Saving new best policy, reward=2.240! +[2023-10-13 00:13:28,608][46384] Saving new best policy, reward=2.990! +[2023-10-13 00:13:32,273][46662] Updated weights for policy 0, policy_version 520 (0.0009) +[2023-10-13 00:13:32,474][46663] Updated weights for policy 1, policy_version 521 (0.0009) +[2023-10-13 00:13:32,649][46662] Updated weights for policy 0, policy_version 530 (0.0008) +[2023-10-13 00:13:32,841][46663] Updated weights for policy 1, policy_version 531 (0.0007) +[2023-10-13 00:13:33,007][46662] Updated weights for policy 0, policy_version 540 (0.0010) +[2023-10-13 00:13:33,203][46663] Updated weights for policy 1, policy_version 541 (0.0008) +[2023-10-13 00:13:33,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 12707.7). Total num frames: 1114112. Throughput: 0: 1681.2, 1: 1674.4. Samples: 279874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:13:33,607][45375] Avg episode reward: [(0, '2.210'), (1, '2.860')] +[2023-10-13 00:13:37,051][46662] Updated weights for policy 0, policy_version 550 (0.0007) +[2023-10-13 00:13:37,422][46662] Updated weights for policy 0, policy_version 560 (0.0007) +[2023-10-13 00:13:37,427][46663] Updated weights for policy 1, policy_version 551 (0.0008) +[2023-10-13 00:13:37,785][46662] Updated weights for policy 0, policy_version 570 (0.0008) +[2023-10-13 00:13:37,795][46663] Updated weights for policy 1, policy_version 561 (0.0008) +[2023-10-13 00:13:38,165][46663] Updated weights for policy 1, policy_version 571 (0.0009) +[2023-10-13 00:13:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 12729.3). Total num frames: 1179648. Throughput: 0: 1661.8, 1: 1655.6. Samples: 298654. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:13:38,607][45375] Avg episode reward: [(0, '2.220'), (1, '3.150')] +[2023-10-13 00:13:38,615][46384] Saving new best policy, reward=3.150! +[2023-10-13 00:13:41,863][46662] Updated weights for policy 0, policy_version 580 (0.0008) +[2023-10-13 00:13:42,226][46662] Updated weights for policy 0, policy_version 590 (0.0009) +[2023-10-13 00:13:42,360][46663] Updated weights for policy 1, policy_version 581 (0.0010) +[2023-10-13 00:13:42,604][46662] Updated weights for policy 0, policy_version 600 (0.0008) +[2023-10-13 00:13:42,735][46663] Updated weights for policy 1, policy_version 591 (0.0008) +[2023-10-13 00:13:43,107][46663] Updated weights for policy 1, policy_version 601 (0.0008) +[2023-10-13 00:13:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 12748.6). Total num frames: 1245184. Throughput: 0: 1682.8, 1: 1682.1. Samples: 309970. Policy #0 lag: (min: 17.0, avg: 21.1, max: 49.0) +[2023-10-13 00:13:43,607][45375] Avg episode reward: [(0, '2.440'), (1, '3.580')] +[2023-10-13 00:13:43,608][46091] Saving new best policy, reward=2.440! 
+[2023-10-13 00:13:43,608][46384] Saving new best policy, reward=3.580! +[2023-10-13 00:13:46,746][46662] Updated weights for policy 0, policy_version 610 (0.0008) +[2023-10-13 00:13:47,145][46662] Updated weights for policy 0, policy_version 620 (0.0007) +[2023-10-13 00:13:47,382][46663] Updated weights for policy 1, policy_version 611 (0.0009) +[2023-10-13 00:13:47,521][46662] Updated weights for policy 0, policy_version 630 (0.0008) +[2023-10-13 00:13:47,791][46663] Updated weights for policy 1, policy_version 621 (0.0008) +[2023-10-13 00:13:47,884][46662] Updated weights for policy 0, policy_version 640 (0.0008) +[2023-10-13 00:13:48,151][46663] Updated weights for policy 1, policy_version 631 (0.0008) +[2023-10-13 00:13:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 12766.1). Total num frames: 1310720. Throughput: 0: 1682.1, 1: 1677.4. Samples: 330190. Policy #0 lag: (min: 28.0, avg: 34.6, max: 60.0) +[2023-10-13 00:13:48,607][45375] Avg episode reward: [(0, '3.100'), (1, '3.790')] +[2023-10-13 00:13:48,608][46091] Saving new best policy, reward=3.100! +[2023-10-13 00:13:48,608][46384] Saving new best policy, reward=3.790! +[2023-10-13 00:13:52,052][46662] Updated weights for policy 0, policy_version 650 (0.0007) +[2023-10-13 00:13:52,123][46663] Updated weights for policy 1, policy_version 641 (0.0009) +[2023-10-13 00:13:52,427][46662] Updated weights for policy 0, policy_version 660 (0.0008) +[2023-10-13 00:13:52,497][46663] Updated weights for policy 1, policy_version 651 (0.0008) +[2023-10-13 00:13:52,795][46662] Updated weights for policy 0, policy_version 670 (0.0009) +[2023-10-13 00:13:52,868][46663] Updated weights for policy 1, policy_version 661 (0.0009) +[2023-10-13 00:13:53,240][46663] Updated weights for policy 1, policy_version 671 (0.0009) +[2023-10-13 00:13:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 12781.9). Total num frames: 1376256. Throughput: 0: 1661.7, 1: 1660.3. Samples: 348398. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 00:13:53,607][45375] Avg episode reward: [(0, '4.060'), (1, '4.250')] +[2023-10-13 00:13:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000000672_688128.pth... +[2023-10-13 00:13:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000000672_688128.pth... +[2023-10-13 00:13:53,647][46384] Saving new best policy, reward=4.250! +[2023-10-13 00:13:53,653][46091] Saving new best policy, reward=4.060! +[2023-10-13 00:13:56,806][46662] Updated weights for policy 0, policy_version 680 (0.0008) +[2023-10-13 00:13:57,180][46662] Updated weights for policy 0, policy_version 690 (0.0007) +[2023-10-13 00:13:57,440][46663] Updated weights for policy 1, policy_version 681 (0.0008) +[2023-10-13 00:13:57,553][46662] Updated weights for policy 0, policy_version 700 (0.0007) +[2023-10-13 00:13:57,813][46663] Updated weights for policy 1, policy_version 691 (0.0008) +[2023-10-13 00:13:58,178][46663] Updated weights for policy 1, policy_version 701 (0.0010) +[2023-10-13 00:13:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 12796.3). Total num frames: 1441792. Throughput: 0: 1684.4, 1: 1672.1. Samples: 359806. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:13:58,608][45375] Avg episode reward: [(0, '4.740'), (1, '4.690')] +[2023-10-13 00:13:58,609][46091] Saving new best policy, reward=4.740! +[2023-10-13 00:13:58,609][46384] Saving new best policy, reward=4.690! 
+[2023-10-13 00:14:01,670][46662] Updated weights for policy 0, policy_version 710 (0.0008) +[2023-10-13 00:14:02,038][46662] Updated weights for policy 0, policy_version 720 (0.0010) +[2023-10-13 00:14:02,252][46663] Updated weights for policy 1, policy_version 711 (0.0007) +[2023-10-13 00:14:02,408][46662] Updated weights for policy 0, policy_version 730 (0.0009) +[2023-10-13 00:14:02,616][46663] Updated weights for policy 1, policy_version 721 (0.0007) +[2023-10-13 00:14:02,979][46663] Updated weights for policy 1, policy_version 731 (0.0008) +[2023-10-13 00:14:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 12809.5). Total num frames: 1507328. Throughput: 0: 1670.9, 1: 1664.0. Samples: 379654. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 00:14:03,608][45375] Avg episode reward: [(0, '5.230'), (1, '4.710')] +[2023-10-13 00:14:03,609][46091] Saving new best policy, reward=5.230! +[2023-10-13 00:14:03,609][46384] Saving new best policy, reward=4.710! +[2023-10-13 00:14:06,616][46662] Updated weights for policy 0, policy_version 740 (0.0007) +[2023-10-13 00:14:06,992][46662] Updated weights for policy 0, policy_version 750 (0.0008) +[2023-10-13 00:14:07,043][46663] Updated weights for policy 1, policy_version 741 (0.0010) +[2023-10-13 00:14:07,356][46662] Updated weights for policy 0, policy_version 760 (0.0007) +[2023-10-13 00:14:07,424][46663] Updated weights for policy 1, policy_version 751 (0.0008) +[2023-10-13 00:14:07,780][46663] Updated weights for policy 1, policy_version 761 (0.0009) +[2023-10-13 00:14:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 12821.7). Total num frames: 1572864. Throughput: 0: 1664.8, 1: 1664.6. Samples: 398608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:14:08,607][45375] Avg episode reward: [(0, '5.770'), (1, '4.950')] +[2023-10-13 00:14:08,616][46384] Saving new best policy, reward=4.950! +[2023-10-13 00:14:08,616][46091] Saving new best policy, reward=5.770! 
+[2023-10-13 00:14:11,451][46662] Updated weights for policy 0, policy_version 770 (0.0007) +[2023-10-13 00:14:11,808][46662] Updated weights for policy 0, policy_version 780 (0.0007) +[2023-10-13 00:14:11,913][46663] Updated weights for policy 1, policy_version 771 (0.0010) +[2023-10-13 00:14:12,185][46662] Updated weights for policy 0, policy_version 790 (0.0008) +[2023-10-13 00:14:12,275][46663] Updated weights for policy 1, policy_version 781 (0.0008) +[2023-10-13 00:14:12,552][46662] Updated weights for policy 0, policy_version 800 (0.0009) +[2023-10-13 00:14:12,652][46663] Updated weights for policy 1, policy_version 791 (0.0008) +[2023-10-13 00:14:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 12832.8). Total num frames: 1638400. Throughput: 0: 1681.2, 1: 1671.6. Samples: 410282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:14:13,608][45375] Avg episode reward: [(0, '6.020'), (1, '5.280')] +[2023-10-13 00:14:13,610][46091] Saving new best policy, reward=6.020! +[2023-10-13 00:14:13,610][46384] Saving new best policy, reward=5.280! +[2023-10-13 00:14:16,546][46662] Updated weights for policy 0, policy_version 810 (0.0008) +[2023-10-13 00:14:16,632][46663] Updated weights for policy 1, policy_version 801 (0.0008) +[2023-10-13 00:14:16,915][46662] Updated weights for policy 0, policy_version 820 (0.0007) +[2023-10-13 00:14:16,991][46663] Updated weights for policy 1, policy_version 811 (0.0007) +[2023-10-13 00:14:17,289][46662] Updated weights for policy 0, policy_version 830 (0.0009) +[2023-10-13 00:14:17,365][46663] Updated weights for policy 1, policy_version 821 (0.0009) +[2023-10-13 00:14:17,725][46663] Updated weights for policy 1, policy_version 831 (0.0007) +[2023-10-13 00:14:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 12843.2). Total num frames: 1703936. Throughput: 0: 1663.9, 1: 1658.8. Samples: 429394. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 00:14:18,607][45375] Avg episode reward: [(0, '6.420'), (1, '6.030')] +[2023-10-13 00:14:18,608][46384] Saving new best policy, reward=6.030! +[2023-10-13 00:14:18,608][46091] Saving new best policy, reward=6.420! +[2023-10-13 00:14:21,373][46662] Updated weights for policy 0, policy_version 840 (0.0007) +[2023-10-13 00:14:21,751][46662] Updated weights for policy 0, policy_version 850 (0.0008) +[2023-10-13 00:14:21,937][46663] Updated weights for policy 1, policy_version 841 (0.0008) +[2023-10-13 00:14:22,121][46662] Updated weights for policy 0, policy_version 860 (0.0009) +[2023-10-13 00:14:22,303][46663] Updated weights for policy 1, policy_version 851 (0.0008) +[2023-10-13 00:14:22,674][46663] Updated weights for policy 1, policy_version 861 (0.0007) +[2023-10-13 00:14:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 12852.8). Total num frames: 1769472. Throughput: 0: 1669.5, 1: 1665.6. Samples: 448736. Policy #0 lag: (min: 8.0, avg: 36.1, max: 40.0) +[2023-10-13 00:14:23,608][45375] Avg episode reward: [(0, '6.390'), (1, '7.370')] +[2023-10-13 00:14:23,618][46384] Saving new best policy, reward=7.370! +[2023-10-13 00:14:26,193][46662] Updated weights for policy 0, policy_version 870 (0.0008) +[2023-10-13 00:14:26,562][46662] Updated weights for policy 0, policy_version 880 (0.0007) +[2023-10-13 00:14:26,613][46663] Updated weights for policy 1, policy_version 871 (0.0008) +[2023-10-13 00:14:26,927][46662] Updated weights for policy 0, policy_version 890 (0.0008) +[2023-10-13 00:14:26,976][46663] Updated weights for policy 1, policy_version 881 (0.0008) +[2023-10-13 00:14:27,351][46663] Updated weights for policy 1, policy_version 891 (0.0008) +[2023-10-13 00:14:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12861.7). Total num frames: 1835008. Throughput: 0: 1675.6, 1: 1668.9. Samples: 460476. 
Policy #0 lag: (min: 16.0, avg: 44.3, max: 48.0) +[2023-10-13 00:14:28,607][45375] Avg episode reward: [(0, '6.950'), (1, '8.030')] +[2023-10-13 00:14:28,608][46091] Saving new best policy, reward=6.950! +[2023-10-13 00:14:28,608][46384] Saving new best policy, reward=8.030! +[2023-10-13 00:14:31,192][46662] Updated weights for policy 0, policy_version 900 (0.0009) +[2023-10-13 00:14:31,454][46663] Updated weights for policy 1, policy_version 901 (0.0008) +[2023-10-13 00:14:31,581][46662] Updated weights for policy 0, policy_version 910 (0.0007) +[2023-10-13 00:14:31,849][46663] Updated weights for policy 1, policy_version 911 (0.0009) +[2023-10-13 00:14:31,953][46662] Updated weights for policy 0, policy_version 920 (0.0008) +[2023-10-13 00:14:32,221][46663] Updated weights for policy 1, policy_version 921 (0.0010) +[2023-10-13 00:14:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12870.0). Total num frames: 1900544. Throughput: 0: 1655.7, 1: 1653.7. Samples: 479112. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-13 00:14:33,607][45375] Avg episode reward: [(0, '7.540'), (1, '8.240')] +[2023-10-13 00:14:33,608][46384] Saving new best policy, reward=8.240! +[2023-10-13 00:14:33,608][46091] Saving new best policy, reward=7.540! 
+[2023-10-13 00:14:35,803][46662] Updated weights for policy 0, policy_version 930 (0.0009) +[2023-10-13 00:14:36,160][46663] Updated weights for policy 1, policy_version 931 (0.0008) +[2023-10-13 00:14:36,174][46662] Updated weights for policy 0, policy_version 940 (0.0009) +[2023-10-13 00:14:36,524][46663] Updated weights for policy 1, policy_version 941 (0.0007) +[2023-10-13 00:14:36,545][46662] Updated weights for policy 0, policy_version 950 (0.0008) +[2023-10-13 00:14:36,885][46663] Updated weights for policy 1, policy_version 951 (0.0010) +[2023-10-13 00:14:36,917][46662] Updated weights for policy 0, policy_version 960 (0.0007) +[2023-10-13 00:14:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 12877.8). Total num frames: 1966080. Throughput: 0: 1673.4, 1: 1675.8. Samples: 499110. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-13 00:14:38,608][45375] Avg episode reward: [(0, '8.880'), (1, '8.290')] +[2023-10-13 00:14:38,621][46091] Saving new best policy, reward=8.880! +[2023-10-13 00:14:38,621][46384] Saving new best policy, reward=8.290! +[2023-10-13 00:14:40,992][46663] Updated weights for policy 1, policy_version 961 (0.0008) +[2023-10-13 00:14:41,107][46662] Updated weights for policy 0, policy_version 970 (0.0009) +[2023-10-13 00:14:41,353][46663] Updated weights for policy 1, policy_version 971 (0.0009) +[2023-10-13 00:14:41,470][46662] Updated weights for policy 0, policy_version 980 (0.0009) +[2023-10-13 00:14:41,720][46663] Updated weights for policy 1, policy_version 981 (0.0008) +[2023-10-13 00:14:41,853][46662] Updated weights for policy 0, policy_version 990 (0.0008) +[2023-10-13 00:14:42,095][46663] Updated weights for policy 1, policy_version 991 (0.0007) +[2023-10-13 00:14:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12885.0). Total num frames: 2031616. Throughput: 0: 1670.9, 1: 1672.3. Samples: 510250. 
Policy #0 lag: (min: 21.0, avg: 24.4, max: 53.0) +[2023-10-13 00:14:43,608][45375] Avg episode reward: [(0, '9.680'), (1, '7.740')] +[2023-10-13 00:14:43,609][46091] Saving new best policy, reward=9.680! +[2023-10-13 00:14:45,980][46662] Updated weights for policy 0, policy_version 1000 (0.0007) +[2023-10-13 00:14:46,263][46663] Updated weights for policy 1, policy_version 1001 (0.0007) +[2023-10-13 00:14:46,351][46662] Updated weights for policy 0, policy_version 1010 (0.0008) +[2023-10-13 00:14:46,631][46663] Updated weights for policy 1, policy_version 1011 (0.0008) +[2023-10-13 00:14:46,718][46662] Updated weights for policy 0, policy_version 1020 (0.0009) +[2023-10-13 00:14:46,995][46663] Updated weights for policy 1, policy_version 1021 (0.0008) +[2023-10-13 00:14:48,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12891.9). Total num frames: 2097152. Throughput: 0: 1657.9, 1: 1660.7. Samples: 528988. Policy #0 lag: (min: 13.0, avg: 18.5, max: 45.0) +[2023-10-13 00:14:48,607][45375] Avg episode reward: [(0, '10.510'), (1, '7.780')] +[2023-10-13 00:14:48,608][46091] Saving new best policy, reward=10.510! +[2023-10-13 00:14:50,708][46662] Updated weights for policy 0, policy_version 1030 (0.0009) +[2023-10-13 00:14:51,079][46662] Updated weights for policy 0, policy_version 1040 (0.0007) +[2023-10-13 00:14:51,093][46663] Updated weights for policy 1, policy_version 1031 (0.0009) +[2023-10-13 00:14:51,441][46662] Updated weights for policy 0, policy_version 1050 (0.0007) +[2023-10-13 00:14:51,459][46663] Updated weights for policy 1, policy_version 1041 (0.0009) +[2023-10-13 00:14:51,814][46663] Updated weights for policy 1, policy_version 1051 (0.0009) +[2023-10-13 00:14:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12898.3). Total num frames: 2162688. Throughput: 0: 1674.9, 1: 1675.5. Samples: 549376. 
Policy #0 lag: (min: 3.0, avg: 5.4, max: 35.0) +[2023-10-13 00:14:53,608][45375] Avg episode reward: [(0, '11.130'), (1, '8.860')] +[2023-10-13 00:14:53,616][46091] Saving new best policy, reward=11.130! +[2023-10-13 00:14:53,616][46384] Saving new best policy, reward=8.860! +[2023-10-13 00:14:55,624][46662] Updated weights for policy 0, policy_version 1060 (0.0007) +[2023-10-13 00:14:55,886][46663] Updated weights for policy 1, policy_version 1061 (0.0009) +[2023-10-13 00:14:55,996][46662] Updated weights for policy 0, policy_version 1070 (0.0008) +[2023-10-13 00:14:56,256][46663] Updated weights for policy 1, policy_version 1071 (0.0008) +[2023-10-13 00:14:56,369][46662] Updated weights for policy 0, policy_version 1080 (0.0008) +[2023-10-13 00:14:56,628][46663] Updated weights for policy 1, policy_version 1081 (0.0008) +[2023-10-13 00:14:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12904.4). Total num frames: 2228224. Throughput: 0: 1663.8, 1: 1660.6. Samples: 559882. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-13 00:14:58,607][45375] Avg episode reward: [(0, '10.890'), (1, '9.940')] +[2023-10-13 00:14:58,608][46384] Saving new best policy, reward=9.940! +[2023-10-13 00:15:00,380][46662] Updated weights for policy 0, policy_version 1090 (0.0008) +[2023-10-13 00:15:00,737][46663] Updated weights for policy 1, policy_version 1091 (0.0007) +[2023-10-13 00:15:00,758][46662] Updated weights for policy 0, policy_version 1100 (0.0009) +[2023-10-13 00:15:01,109][46663] Updated weights for policy 1, policy_version 1101 (0.0007) +[2023-10-13 00:15:01,130][46662] Updated weights for policy 0, policy_version 1110 (0.0008) +[2023-10-13 00:15:01,485][46663] Updated weights for policy 1, policy_version 1111 (0.0009) +[2023-10-13 00:15:01,494][46662] Updated weights for policy 0, policy_version 1120 (0.0009) +[2023-10-13 00:15:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12910.1). Total num frames: 2293760. 
Throughput: 0: 1660.0, 1: 1666.0. Samples: 579062. Policy #0 lag: (min: 6.0, avg: 12.6, max: 38.0) +[2023-10-13 00:15:03,607][45375] Avg episode reward: [(0, '10.840'), (1, '10.790')] +[2023-10-13 00:15:03,608][46384] Saving new best policy, reward=10.790! +[2023-10-13 00:15:05,459][46663] Updated weights for policy 1, policy_version 1121 (0.0008) +[2023-10-13 00:15:05,810][46662] Updated weights for policy 0, policy_version 1130 (0.0008) +[2023-10-13 00:15:05,819][46663] Updated weights for policy 1, policy_version 1131 (0.0010) +[2023-10-13 00:15:06,179][46662] Updated weights for policy 0, policy_version 1140 (0.0007) +[2023-10-13 00:15:06,189][46663] Updated weights for policy 1, policy_version 1141 (0.0008) +[2023-10-13 00:15:06,552][46662] Updated weights for policy 0, policy_version 1150 (0.0009) +[2023-10-13 00:15:06,555][46663] Updated weights for policy 1, policy_version 1151 (0.0007) +[2023-10-13 00:15:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12915.5). Total num frames: 2359296. Throughput: 0: 1668.3, 1: 1683.8. Samples: 599582. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 00:15:08,607][45375] Avg episode reward: [(0, '11.420'), (1, '12.360')] +[2023-10-13 00:15:08,618][46091] Saving new best policy, reward=11.420! +[2023-10-13 00:15:08,618][46384] Saving new best policy, reward=12.360! 
+[2023-10-13 00:15:10,653][46663] Updated weights for policy 1, policy_version 1161 (0.0011) +[2023-10-13 00:15:10,686][46662] Updated weights for policy 0, policy_version 1160 (0.0010) +[2023-10-13 00:15:11,027][46663] Updated weights for policy 1, policy_version 1171 (0.0008) +[2023-10-13 00:15:11,063][46662] Updated weights for policy 0, policy_version 1170 (0.0008) +[2023-10-13 00:15:11,392][46663] Updated weights for policy 1, policy_version 1181 (0.0011) +[2023-10-13 00:15:11,434][46662] Updated weights for policy 0, policy_version 1180 (0.0007) +[2023-10-13 00:15:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 12920.6). Total num frames: 2424832. Throughput: 0: 1654.2, 1: 1659.3. Samples: 609584. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-13 00:15:13,607][45375] Avg episode reward: [(0, '11.640'), (1, '13.420')] +[2023-10-13 00:15:13,608][46091] Saving new best policy, reward=11.640! +[2023-10-13 00:15:13,608][46384] Saving new best policy, reward=13.420! +[2023-10-13 00:15:15,348][46663] Updated weights for policy 1, policy_version 1191 (0.0009) +[2023-10-13 00:15:15,384][46662] Updated weights for policy 0, policy_version 1190 (0.0009) +[2023-10-13 00:15:15,718][46663] Updated weights for policy 1, policy_version 1201 (0.0010) +[2023-10-13 00:15:15,753][46662] Updated weights for policy 0, policy_version 1200 (0.0007) +[2023-10-13 00:15:16,093][46663] Updated weights for policy 1, policy_version 1211 (0.0007) +[2023-10-13 00:15:16,131][46662] Updated weights for policy 0, policy_version 1210 (0.0008) +[2023-10-13 00:15:18,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12925.4). Total num frames: 2490368. Throughput: 0: 1657.7, 1: 1683.1. Samples: 629452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:15:18,608][45375] Avg episode reward: [(0, '13.200'), (1, '14.260')] +[2023-10-13 00:15:18,609][46091] Saving new best policy, reward=13.200! 
+[2023-10-13 00:15:18,610][46384] Saving new best policy, reward=14.260! +[2023-10-13 00:15:20,283][46663] Updated weights for policy 1, policy_version 1221 (0.0009) +[2023-10-13 00:15:20,333][46662] Updated weights for policy 0, policy_version 1220 (0.0009) +[2023-10-13 00:15:20,682][46663] Updated weights for policy 1, policy_version 1231 (0.0007) +[2023-10-13 00:15:20,729][46662] Updated weights for policy 0, policy_version 1230 (0.0009) +[2023-10-13 00:15:21,048][46663] Updated weights for policy 1, policy_version 1241 (0.0007) +[2023-10-13 00:15:21,092][46662] Updated weights for policy 0, policy_version 1240 (0.0009) +[2023-10-13 00:15:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12930.0). Total num frames: 2555904. Throughput: 0: 1659.6, 1: 1685.5. Samples: 649642. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:15:23,608][45375] Avg episode reward: [(0, '13.820'), (1, '13.740')] +[2023-10-13 00:15:23,616][46091] Saving new best policy, reward=13.820! +[2023-10-13 00:15:25,146][46663] Updated weights for policy 1, policy_version 1251 (0.0007) +[2023-10-13 00:15:25,200][46662] Updated weights for policy 0, policy_version 1250 (0.0009) +[2023-10-13 00:15:25,512][46663] Updated weights for policy 1, policy_version 1261 (0.0009) +[2023-10-13 00:15:25,568][46662] Updated weights for policy 0, policy_version 1260 (0.0007) +[2023-10-13 00:15:25,878][46663] Updated weights for policy 1, policy_version 1271 (0.0009) +[2023-10-13 00:15:25,935][46662] Updated weights for policy 0, policy_version 1270 (0.0009) +[2023-10-13 00:15:26,313][46662] Updated weights for policy 0, policy_version 1280 (0.0007) +[2023-10-13 00:15:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 12934.4). Total num frames: 2621440. Throughput: 0: 1648.6, 1: 1663.8. Samples: 659310. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 00:15:28,608][45375] Avg episode reward: [(0, '14.350'), (1, '14.370')] +[2023-10-13 00:15:28,610][46091] Saving new best policy, reward=14.350! +[2023-10-13 00:15:28,610][46384] Saving new best policy, reward=14.370! +[2023-10-13 00:15:29,946][46663] Updated weights for policy 1, policy_version 1281 (0.0008) +[2023-10-13 00:15:30,307][46663] Updated weights for policy 1, policy_version 1291 (0.0008) +[2023-10-13 00:15:30,353][46662] Updated weights for policy 0, policy_version 1290 (0.0008) +[2023-10-13 00:15:30,675][46663] Updated weights for policy 1, policy_version 1301 (0.0009) +[2023-10-13 00:15:30,727][46662] Updated weights for policy 0, policy_version 1300 (0.0008) +[2023-10-13 00:15:31,041][46663] Updated weights for policy 1, policy_version 1311 (0.0009) +[2023-10-13 00:15:31,096][46662] Updated weights for policy 0, policy_version 1310 (0.0008) +[2023-10-13 00:15:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12938.5). Total num frames: 2686976. Throughput: 0: 1660.8, 1: 1685.0. Samples: 679550. Policy #0 lag: (min: 26.0, avg: 27.3, max: 51.0) +[2023-10-13 00:15:33,607][45375] Avg episode reward: [(0, '15.390'), (1, '14.610')] +[2023-10-13 00:15:33,608][46091] Saving new best policy, reward=15.390! +[2023-10-13 00:15:33,608][46384] Saving new best policy, reward=14.610! 
+[2023-10-13 00:15:35,207][46662] Updated weights for policy 0, policy_version 1320 (0.0007) +[2023-10-13 00:15:35,228][46663] Updated weights for policy 1, policy_version 1321 (0.0008) +[2023-10-13 00:15:35,573][46662] Updated weights for policy 0, policy_version 1330 (0.0007) +[2023-10-13 00:15:35,590][46663] Updated weights for policy 1, policy_version 1331 (0.0007) +[2023-10-13 00:15:35,951][46662] Updated weights for policy 0, policy_version 1340 (0.0007) +[2023-10-13 00:15:35,957][46663] Updated weights for policy 1, policy_version 1341 (0.0008) +[2023-10-13 00:15:38,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12942.5). Total num frames: 2752512. Throughput: 0: 1664.4, 1: 1684.8. Samples: 700092. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 00:15:38,607][45375] Avg episode reward: [(0, '15.310'), (1, '14.870')] +[2023-10-13 00:15:38,617][46384] Saving new best policy, reward=14.870! +[2023-10-13 00:15:40,034][46663] Updated weights for policy 1, policy_version 1351 (0.0007) +[2023-10-13 00:15:40,139][46662] Updated weights for policy 0, policy_version 1350 (0.0007) +[2023-10-13 00:15:40,405][46663] Updated weights for policy 1, policy_version 1361 (0.0008) +[2023-10-13 00:15:40,507][46662] Updated weights for policy 0, policy_version 1360 (0.0007) +[2023-10-13 00:15:40,766][46663] Updated weights for policy 1, policy_version 1371 (0.0009) +[2023-10-13 00:15:40,868][46662] Updated weights for policy 0, policy_version 1370 (0.0009) +[2023-10-13 00:15:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12946.3). Total num frames: 2818048. Throughput: 0: 1650.4, 1: 1671.2. Samples: 709358. Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-13 00:15:43,608][45375] Avg episode reward: [(0, '16.280'), (1, '15.890')] +[2023-10-13 00:15:43,609][46091] Saving new best policy, reward=16.280! +[2023-10-13 00:15:43,610][46384] Saving new best policy, reward=15.890! 
+[2023-10-13 00:15:44,836][46663] Updated weights for policy 1, policy_version 1381 (0.0009) +[2023-10-13 00:15:44,896][46662] Updated weights for policy 0, policy_version 1380 (0.0008) +[2023-10-13 00:15:45,193][46663] Updated weights for policy 1, policy_version 1391 (0.0007) +[2023-10-13 00:15:45,266][46662] Updated weights for policy 0, policy_version 1390 (0.0008) +[2023-10-13 00:15:45,568][46663] Updated weights for policy 1, policy_version 1401 (0.0008) +[2023-10-13 00:15:45,647][46662] Updated weights for policy 0, policy_version 1400 (0.0008) +[2023-10-13 00:15:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12949.9). Total num frames: 2883584. Throughput: 0: 1665.6, 1: 1682.2. Samples: 729714. Policy #0 lag: (min: 10.0, avg: 13.6, max: 42.0) +[2023-10-13 00:15:48,608][45375] Avg episode reward: [(0, '16.370'), (1, '16.580')] +[2023-10-13 00:15:48,609][46091] Saving new best policy, reward=16.370! +[2023-10-13 00:15:48,610][46384] Saving new best policy, reward=16.580! +[2023-10-13 00:15:49,746][46663] Updated weights for policy 1, policy_version 1411 (0.0008) +[2023-10-13 00:15:49,813][46662] Updated weights for policy 0, policy_version 1410 (0.0008) +[2023-10-13 00:15:50,119][46663] Updated weights for policy 1, policy_version 1421 (0.0008) +[2023-10-13 00:15:50,185][46662] Updated weights for policy 0, policy_version 1420 (0.0008) +[2023-10-13 00:15:50,488][46663] Updated weights for policy 1, policy_version 1431 (0.0009) +[2023-10-13 00:15:50,556][46662] Updated weights for policy 0, policy_version 1430 (0.0008) +[2023-10-13 00:15:50,936][46662] Updated weights for policy 0, policy_version 1440 (0.0009) +[2023-10-13 00:15:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12953.4). Total num frames: 2949120. Throughput: 0: 1669.9, 1: 1673.2. Samples: 750026. 
Policy #0 lag: (min: 15.0, avg: 22.8, max: 47.0) +[2023-10-13 00:15:53,607][45375] Avg episode reward: [(0, '17.370'), (1, '17.690')] +[2023-10-13 00:15:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000001440_1474560.pth... +[2023-10-13 00:15:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000001440_1474560.pth... +[2023-10-13 00:15:53,653][46091] Saving new best policy, reward=17.370! +[2023-10-13 00:15:53,655][46384] Saving new best policy, reward=17.690! +[2023-10-13 00:15:54,625][46663] Updated weights for policy 1, policy_version 1441 (0.0009) +[2023-10-13 00:15:54,994][46663] Updated weights for policy 1, policy_version 1451 (0.0007) +[2023-10-13 00:15:55,002][46662] Updated weights for policy 0, policy_version 1450 (0.0010) +[2023-10-13 00:15:55,361][46663] Updated weights for policy 1, policy_version 1461 (0.0008) +[2023-10-13 00:15:55,372][46662] Updated weights for policy 0, policy_version 1460 (0.0009) +[2023-10-13 00:15:55,729][46663] Updated weights for policy 1, policy_version 1471 (0.0007) +[2023-10-13 00:15:55,738][46662] Updated weights for policy 0, policy_version 1470 (0.0009) +[2023-10-13 00:15:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 12956.7). Total num frames: 3014656. Throughput: 0: 1658.3, 1: 1667.2. Samples: 759232. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 00:15:58,608][45375] Avg episode reward: [(0, '18.180'), (1, '19.620')] +[2023-10-13 00:15:58,609][46091] Saving new best policy, reward=18.180! +[2023-10-13 00:15:58,610][46384] Saving new best policy, reward=19.620! 
+[2023-10-13 00:15:59,778][46662] Updated weights for policy 0, policy_version 1480 (0.0008) +[2023-10-13 00:15:59,801][46663] Updated weights for policy 1, policy_version 1481 (0.0008) +[2023-10-13 00:16:00,142][46662] Updated weights for policy 0, policy_version 1490 (0.0009) +[2023-10-13 00:16:00,175][46663] Updated weights for policy 1, policy_version 1491 (0.0007) +[2023-10-13 00:16:00,512][46662] Updated weights for policy 0, policy_version 1500 (0.0008) +[2023-10-13 00:16:00,547][46663] Updated weights for policy 1, policy_version 1501 (0.0008) +[2023-10-13 00:16:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12959.8). Total num frames: 3080192. Throughput: 0: 1670.7, 1: 1662.7. Samples: 779452. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:16:03,608][45375] Avg episode reward: [(0, '19.220'), (1, '20.910')] +[2023-10-13 00:16:03,609][46091] Saving new best policy, reward=19.220! +[2023-10-13 00:16:03,610][46384] Saving new best policy, reward=20.910! +[2023-10-13 00:16:04,673][46662] Updated weights for policy 0, policy_version 1510 (0.0010) +[2023-10-13 00:16:04,727][46663] Updated weights for policy 1, policy_version 1511 (0.0007) +[2023-10-13 00:16:05,050][46662] Updated weights for policy 0, policy_version 1520 (0.0009) +[2023-10-13 00:16:05,109][46663] Updated weights for policy 1, policy_version 1521 (0.0007) +[2023-10-13 00:16:05,422][46662] Updated weights for policy 0, policy_version 1530 (0.0008) +[2023-10-13 00:16:05,487][46663] Updated weights for policy 1, policy_version 1531 (0.0008) +[2023-10-13 00:16:08,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12962.9). Total num frames: 3145728. Throughput: 0: 1675.6, 1: 1663.8. Samples: 799912. Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) +[2023-10-13 00:16:08,607][45375] Avg episode reward: [(0, '20.300'), (1, '21.550')] +[2023-10-13 00:16:08,615][46091] Saving new best policy, reward=20.300! 
+[2023-10-13 00:16:08,615][46384] Saving new best policy, reward=21.550! +[2023-10-13 00:16:09,557][46663] Updated weights for policy 1, policy_version 1541 (0.0008) +[2023-10-13 00:16:09,655][46662] Updated weights for policy 0, policy_version 1540 (0.0009) +[2023-10-13 00:16:09,924][46663] Updated weights for policy 1, policy_version 1551 (0.0008) +[2023-10-13 00:16:10,054][46662] Updated weights for policy 0, policy_version 1550 (0.0008) +[2023-10-13 00:16:10,298][46663] Updated weights for policy 1, policy_version 1561 (0.0007) +[2023-10-13 00:16:10,418][46662] Updated weights for policy 0, policy_version 1560 (0.0009) +[2023-10-13 00:16:13,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12965.8). Total num frames: 3211264. Throughput: 0: 1658.3, 1: 1664.0. Samples: 808812. Policy #0 lag: (min: 17.0, avg: 25.3, max: 49.0) +[2023-10-13 00:16:13,607][45375] Avg episode reward: [(0, '21.020'), (1, '21.170')] +[2023-10-13 00:16:13,608][46091] Saving new best policy, reward=21.020! +[2023-10-13 00:16:14,322][46663] Updated weights for policy 1, policy_version 1571 (0.0009) +[2023-10-13 00:16:14,440][46662] Updated weights for policy 0, policy_version 1570 (0.0008) +[2023-10-13 00:16:14,696][46663] Updated weights for policy 1, policy_version 1581 (0.0008) +[2023-10-13 00:16:14,817][46662] Updated weights for policy 0, policy_version 1580 (0.0007) +[2023-10-13 00:16:15,056][46663] Updated weights for policy 1, policy_version 1591 (0.0008) +[2023-10-13 00:16:15,195][46662] Updated weights for policy 0, policy_version 1590 (0.0007) +[2023-10-13 00:16:15,567][46662] Updated weights for policy 0, policy_version 1600 (0.0009) +[2023-10-13 00:16:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12968.6). Total num frames: 3276800. Throughput: 0: 1670.2, 1: 1664.4. Samples: 829608. 
Policy #0 lag: (min: 30.0, avg: 32.7, max: 62.0) +[2023-10-13 00:16:18,608][45375] Avg episode reward: [(0, '21.230'), (1, '21.380')] +[2023-10-13 00:16:18,609][46091] Saving new best policy, reward=21.230! +[2023-10-13 00:16:19,211][46663] Updated weights for policy 1, policy_version 1601 (0.0009) +[2023-10-13 00:16:19,566][46663] Updated weights for policy 1, policy_version 1611 (0.0009) +[2023-10-13 00:16:19,568][46662] Updated weights for policy 0, policy_version 1610 (0.0008) +[2023-10-13 00:16:19,936][46662] Updated weights for policy 0, policy_version 1620 (0.0008) +[2023-10-13 00:16:19,938][46663] Updated weights for policy 1, policy_version 1621 (0.0009) +[2023-10-13 00:16:20,306][46663] Updated weights for policy 1, policy_version 1631 (0.0009) +[2023-10-13 00:16:20,307][46662] Updated weights for policy 0, policy_version 1630 (0.0008) +[2023-10-13 00:16:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12971.3). Total num frames: 3342336. Throughput: 0: 1669.6, 1: 1671.0. Samples: 850418. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 00:16:23,608][45375] Avg episode reward: [(0, '21.760'), (1, '21.340')] +[2023-10-13 00:16:23,620][46091] Saving new best policy, reward=21.760! +[2023-10-13 00:16:24,196][46663] Updated weights for policy 1, policy_version 1641 (0.0009) +[2023-10-13 00:16:24,547][46662] Updated weights for policy 0, policy_version 1640 (0.0008) +[2023-10-13 00:16:24,556][46663] Updated weights for policy 1, policy_version 1651 (0.0007) +[2023-10-13 00:16:24,916][46662] Updated weights for policy 0, policy_version 1650 (0.0009) +[2023-10-13 00:16:24,931][46663] Updated weights for policy 1, policy_version 1661 (0.0007) +[2023-10-13 00:16:25,296][46662] Updated weights for policy 0, policy_version 1660 (0.0009) +[2023-10-13 00:16:28,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 12973.9). Total num frames: 3407872. Throughput: 0: 1661.8, 1: 1670.9. Samples: 859328. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 00:16:28,607][45375] Avg episode reward: [(0, '21.240'), (1, '21.160')] +[2023-10-13 00:16:29,056][46663] Updated weights for policy 1, policy_version 1671 (0.0009) +[2023-10-13 00:16:29,200][46662] Updated weights for policy 0, policy_version 1670 (0.0009) +[2023-10-13 00:16:29,420][46663] Updated weights for policy 1, policy_version 1681 (0.0009) +[2023-10-13 00:16:29,567][46662] Updated weights for policy 0, policy_version 1680 (0.0008) +[2023-10-13 00:16:29,797][46663] Updated weights for policy 1, policy_version 1691 (0.0008) +[2023-10-13 00:16:29,938][46662] Updated weights for policy 0, policy_version 1690 (0.0007) +[2023-10-13 00:16:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 12976.3). Total num frames: 3473408. Throughput: 0: 1671.2, 1: 1675.6. Samples: 880316. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 00:16:33,608][45375] Avg episode reward: [(0, '22.080'), (1, '21.170')] +[2023-10-13 00:16:33,609][46091] Saving new best policy, reward=22.080! +[2023-10-13 00:16:33,753][46663] Updated weights for policy 1, policy_version 1701 (0.0008) +[2023-10-13 00:16:34,071][46662] Updated weights for policy 0, policy_version 1700 (0.0007) +[2023-10-13 00:16:34,118][46663] Updated weights for policy 1, policy_version 1711 (0.0009) +[2023-10-13 00:16:34,440][46662] Updated weights for policy 0, policy_version 1710 (0.0008) +[2023-10-13 00:16:34,488][46663] Updated weights for policy 1, policy_version 1721 (0.0009) +[2023-10-13 00:16:34,814][46662] Updated weights for policy 0, policy_version 1720 (0.0007) +[2023-10-13 00:16:38,502][46663] Updated weights for policy 1, policy_version 1731 (0.0009) +[2023-10-13 00:16:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12978.8). Total num frames: 3538944. Throughput: 0: 1664.1, 1: 1685.6. Samples: 900758. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-13 00:16:38,607][45375] Avg episode reward: [(0, '21.910'), (1, '20.740')] +[2023-10-13 00:16:38,864][46663] Updated weights for policy 1, policy_version 1741 (0.0009) +[2023-10-13 00:16:39,060][46662] Updated weights for policy 0, policy_version 1730 (0.0008) +[2023-10-13 00:16:39,234][46663] Updated weights for policy 1, policy_version 1751 (0.0008) +[2023-10-13 00:16:39,417][46662] Updated weights for policy 0, policy_version 1740 (0.0009) +[2023-10-13 00:16:39,793][46662] Updated weights for policy 0, policy_version 1750 (0.0009) +[2023-10-13 00:16:40,162][46662] Updated weights for policy 0, policy_version 1760 (0.0008) +[2023-10-13 00:16:43,388][46663] Updated weights for policy 1, policy_version 1761 (0.0008) +[2023-10-13 00:16:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12981.1). Total num frames: 3604480. Throughput: 0: 1661.0, 1: 1684.7. Samples: 909788. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:16:43,608][45375] Avg episode reward: [(0, '20.090'), (1, '20.850')] +[2023-10-13 00:16:43,758][46663] Updated weights for policy 1, policy_version 1771 (0.0007) +[2023-10-13 00:16:44,129][46663] Updated weights for policy 1, policy_version 1781 (0.0007) +[2023-10-13 00:16:44,314][46662] Updated weights for policy 0, policy_version 1770 (0.0009) +[2023-10-13 00:16:44,493][46663] Updated weights for policy 1, policy_version 1791 (0.0007) +[2023-10-13 00:16:44,694][46662] Updated weights for policy 0, policy_version 1780 (0.0009) +[2023-10-13 00:16:45,072][46662] Updated weights for policy 0, policy_version 1790 (0.0009) +[2023-10-13 00:16:48,531][46663] Updated weights for policy 1, policy_version 1801 (0.0009) +[2023-10-13 00:16:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12983.3). Total num frames: 3670016. Throughput: 0: 1666.4, 1: 1690.8. Samples: 930524. 
Policy #0 lag: (min: 8.0, avg: 31.4, max: 40.0) +[2023-10-13 00:16:48,608][45375] Avg episode reward: [(0, '19.860'), (1, '21.010')] +[2023-10-13 00:16:48,912][46663] Updated weights for policy 1, policy_version 1811 (0.0010) +[2023-10-13 00:16:49,149][46662] Updated weights for policy 0, policy_version 1800 (0.0008) +[2023-10-13 00:16:49,268][46663] Updated weights for policy 1, policy_version 1821 (0.0008) +[2023-10-13 00:16:49,520][46662] Updated weights for policy 0, policy_version 1810 (0.0008) +[2023-10-13 00:16:49,901][46662] Updated weights for policy 0, policy_version 1820 (0.0008) +[2023-10-13 00:16:53,493][46663] Updated weights for policy 1, policy_version 1831 (0.0007) +[2023-10-13 00:16:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12985.4). Total num frames: 3735552. Throughput: 0: 1673.6, 1: 1686.0. Samples: 951094. Policy #0 lag: (min: 31.0, avg: 46.9, max: 63.0) +[2023-10-13 00:16:53,607][45375] Avg episode reward: [(0, '19.650'), (1, '21.720')] +[2023-10-13 00:16:53,856][46662] Updated weights for policy 0, policy_version 1830 (0.0008) +[2023-10-13 00:16:53,865][46663] Updated weights for policy 1, policy_version 1841 (0.0007) +[2023-10-13 00:16:54,221][46662] Updated weights for policy 0, policy_version 1840 (0.0008) +[2023-10-13 00:16:54,234][46663] Updated weights for policy 1, policy_version 1851 (0.0009) +[2023-10-13 00:16:54,417][46384] Saving new best policy, reward=21.720! +[2023-10-13 00:16:54,597][46662] Updated weights for policy 0, policy_version 1850 (0.0010) +[2023-10-13 00:16:58,110][46663] Updated weights for policy 1, policy_version 1861 (0.0008) +[2023-10-13 00:16:58,479][46663] Updated weights for policy 1, policy_version 1871 (0.0009) +[2023-10-13 00:16:58,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 12987.5). Total num frames: 3801088. Throughput: 0: 1675.9, 1: 1692.5. Samples: 960392. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-13 00:16:58,607][45375] Avg episode reward: [(0, '19.740'), (1, '20.930')] +[2023-10-13 00:16:58,729][46662] Updated weights for policy 0, policy_version 1860 (0.0010) +[2023-10-13 00:16:58,843][46663] Updated weights for policy 1, policy_version 1881 (0.0008) +[2023-10-13 00:16:59,107][46662] Updated weights for policy 0, policy_version 1870 (0.0009) +[2023-10-13 00:16:59,482][46662] Updated weights for policy 0, policy_version 1880 (0.0008) +[2023-10-13 00:17:02,950][46663] Updated weights for policy 1, policy_version 1891 (0.0009) +[2023-10-13 00:17:03,316][46663] Updated weights for policy 1, policy_version 1901 (0.0010) +[2023-10-13 00:17:03,551][46662] Updated weights for policy 0, policy_version 1890 (0.0008) +[2023-10-13 00:17:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13107.2). Total num frames: 3866624. Throughput: 0: 1672.2, 1: 1695.6. Samples: 981160. Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-13 00:17:03,607][45375] Avg episode reward: [(0, '19.090'), (1, '21.700')] +[2023-10-13 00:17:03,675][46663] Updated weights for policy 1, policy_version 1911 (0.0009) +[2023-10-13 00:17:03,924][46662] Updated weights for policy 0, policy_version 1900 (0.0007) +[2023-10-13 00:17:04,292][46662] Updated weights for policy 0, policy_version 1910 (0.0007) +[2023-10-13 00:17:04,665][46662] Updated weights for policy 0, policy_version 1920 (0.0011) +[2023-10-13 00:17:07,809][46663] Updated weights for policy 1, policy_version 1921 (0.0009) +[2023-10-13 00:17:08,181][46663] Updated weights for policy 1, policy_version 1931 (0.0010) +[2023-10-13 00:17:08,542][46663] Updated weights for policy 1, policy_version 1941 (0.0008) +[2023-10-13 00:17:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 3932160. Throughput: 0: 1672.3, 1: 1678.2. Samples: 1001190. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:08,607][45375] Avg episode reward: [(0, '18.600'), (1, '20.470')] +[2023-10-13 00:17:08,657][46662] Updated weights for policy 0, policy_version 1930 (0.0007) +[2023-10-13 00:17:08,913][46663] Updated weights for policy 1, policy_version 1951 (0.0009) +[2023-10-13 00:17:09,021][46662] Updated weights for policy 0, policy_version 1940 (0.0007) +[2023-10-13 00:17:09,400][46662] Updated weights for policy 0, policy_version 1950 (0.0009) +[2023-10-13 00:17:12,858][46663] Updated weights for policy 1, policy_version 1961 (0.0007) +[2023-10-13 00:17:13,235][46663] Updated weights for policy 1, policy_version 1971 (0.0010) +[2023-10-13 00:17:13,476][46662] Updated weights for policy 0, policy_version 1960 (0.0007) +[2023-10-13 00:17:13,596][46663] Updated weights for policy 1, policy_version 1981 (0.0009) +[2023-10-13 00:17:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 3997696. Throughput: 0: 1676.0, 1: 1695.8. Samples: 1011062. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 00:17:13,607][45375] Avg episode reward: [(0, '18.990'), (1, '21.230')] +[2023-10-13 00:17:13,840][46662] Updated weights for policy 0, policy_version 1970 (0.0009) +[2023-10-13 00:17:14,211][46662] Updated weights for policy 0, policy_version 1980 (0.0010) +[2023-10-13 00:17:17,741][46663] Updated weights for policy 1, policy_version 1991 (0.0008) +[2023-10-13 00:17:18,113][46663] Updated weights for policy 1, policy_version 2001 (0.0009) +[2023-10-13 00:17:18,309][46662] Updated weights for policy 0, policy_version 1990 (0.0009) +[2023-10-13 00:17:18,480][46663] Updated weights for policy 1, policy_version 2011 (0.0009) +[2023-10-13 00:17:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4063232. Throughput: 0: 1673.3, 1: 1692.0. Samples: 1031752. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-13 00:17:18,607][45375] Avg episode reward: [(0, '18.870'), (1, '22.330')] +[2023-10-13 00:17:18,660][46384] Saving new best policy, reward=22.330! +[2023-10-13 00:17:18,679][46662] Updated weights for policy 0, policy_version 2000 (0.0008) +[2023-10-13 00:17:19,051][46662] Updated weights for policy 0, policy_version 2010 (0.0008) +[2023-10-13 00:17:22,385][46663] Updated weights for policy 1, policy_version 2021 (0.0007) +[2023-10-13 00:17:22,751][46663] Updated weights for policy 1, policy_version 2031 (0.0011) +[2023-10-13 00:17:23,104][46662] Updated weights for policy 0, policy_version 2020 (0.0008) +[2023-10-13 00:17:23,126][46663] Updated weights for policy 1, policy_version 2041 (0.0008) +[2023-10-13 00:17:23,479][46662] Updated weights for policy 0, policy_version 2030 (0.0007) +[2023-10-13 00:17:23,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 4161536. Throughput: 0: 1684.7, 1: 1664.2. Samples: 1051456. Policy #0 lag: (min: 1.0, avg: 1.4, max: 15.0) +[2023-10-13 00:17:23,608][45375] Avg episode reward: [(0, '20.120'), (1, '22.110')] +[2023-10-13 00:17:23,852][46662] Updated weights for policy 0, policy_version 2040 (0.0009) +[2023-10-13 00:17:27,132][46663] Updated weights for policy 1, policy_version 2051 (0.0007) +[2023-10-13 00:17:27,503][46663] Updated weights for policy 1, policy_version 2061 (0.0007) +[2023-10-13 00:17:27,872][46663] Updated weights for policy 1, policy_version 2071 (0.0010) +[2023-10-13 00:17:27,927][46662] Updated weights for policy 0, policy_version 2050 (0.0008) +[2023-10-13 00:17:28,300][46662] Updated weights for policy 0, policy_version 2060 (0.0008) +[2023-10-13 00:17:28,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 4227072. Throughput: 0: 1685.0, 1: 1688.9. Samples: 1061616. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:28,608][45375] Avg episode reward: [(0, '19.890'), (1, '21.890')] +[2023-10-13 00:17:28,679][46662] Updated weights for policy 0, policy_version 2070 (0.0009) +[2023-10-13 00:17:29,053][46662] Updated weights for policy 0, policy_version 2080 (0.0011) +[2023-10-13 00:17:32,018][46663] Updated weights for policy 1, policy_version 2081 (0.0008) +[2023-10-13 00:17:32,394][46663] Updated weights for policy 1, policy_version 2091 (0.0008) +[2023-10-13 00:17:32,765][46663] Updated weights for policy 1, policy_version 2101 (0.0007) +[2023-10-13 00:17:33,133][46663] Updated weights for policy 1, policy_version 2111 (0.0009) +[2023-10-13 00:17:33,261][46662] Updated weights for policy 0, policy_version 2090 (0.0007) +[2023-10-13 00:17:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 4292608. Throughput: 0: 1685.2, 1: 1677.2. Samples: 1081828. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:33,607][45375] Avg episode reward: [(0, '20.800'), (1, '22.550')] +[2023-10-13 00:17:33,608][46384] Saving new best policy, reward=22.550! +[2023-10-13 00:17:33,641][46662] Updated weights for policy 0, policy_version 2100 (0.0010) +[2023-10-13 00:17:34,009][46662] Updated weights for policy 0, policy_version 2110 (0.0009) +[2023-10-13 00:17:37,183][46663] Updated weights for policy 1, policy_version 2121 (0.0007) +[2023-10-13 00:17:37,551][46663] Updated weights for policy 1, policy_version 2131 (0.0008) +[2023-10-13 00:17:37,923][46663] Updated weights for policy 1, policy_version 2141 (0.0008) +[2023-10-13 00:17:38,100][46662] Updated weights for policy 0, policy_version 2120 (0.0009) +[2023-10-13 00:17:38,467][46662] Updated weights for policy 0, policy_version 2130 (0.0009) +[2023-10-13 00:17:38,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 4358144. Throughput: 0: 1678.7, 1: 1668.4. Samples: 1101714. 
Policy #0 lag: (min: 26.0, avg: 33.9, max: 58.0) +[2023-10-13 00:17:38,607][45375] Avg episode reward: [(0, '21.790'), (1, '22.320')] +[2023-10-13 00:17:38,836][46662] Updated weights for policy 0, policy_version 2140 (0.0009) +[2023-10-13 00:17:42,071][46663] Updated weights for policy 1, policy_version 2151 (0.0010) +[2023-10-13 00:17:42,447][46663] Updated weights for policy 1, policy_version 2161 (0.0007) +[2023-10-13 00:17:42,823][46663] Updated weights for policy 1, policy_version 2171 (0.0008) +[2023-10-13 00:17:43,011][46662] Updated weights for policy 0, policy_version 2150 (0.0010) +[2023-10-13 00:17:43,388][46662] Updated weights for policy 0, policy_version 2160 (0.0009) +[2023-10-13 00:17:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 4423680. Throughput: 0: 1674.8, 1: 1694.2. Samples: 1111994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:43,607][45375] Avg episode reward: [(0, '22.310'), (1, '22.200')] +[2023-10-13 00:17:43,760][46662] Updated weights for policy 0, policy_version 2170 (0.0008) +[2023-10-13 00:17:43,980][46091] Saving new best policy, reward=22.310! +[2023-10-13 00:17:46,981][46663] Updated weights for policy 1, policy_version 2181 (0.0008) +[2023-10-13 00:17:47,356][46663] Updated weights for policy 1, policy_version 2191 (0.0010) +[2023-10-13 00:17:47,718][46663] Updated weights for policy 1, policy_version 2201 (0.0009) +[2023-10-13 00:17:47,862][46662] Updated weights for policy 0, policy_version 2180 (0.0009) +[2023-10-13 00:17:47,735][46662] Updated weights for policy 0, policy_version 2190 (0.0009) +[2023-10-13 00:17:48,109][46662] Updated weights for policy 0, policy_version 2200 (0.0009) +[2023-10-13 00:17:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 4521984. Throughput: 0: 1677.6, 1: 1673.6. Samples: 1131968. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:48,607][45375] Avg episode reward: [(0, '22.500'), (1, '21.500')] +[2023-10-13 00:17:48,608][46091] Saving new best policy, reward=22.500! +[2023-10-13 00:17:51,454][46663] Updated weights for policy 1, policy_version 2211 (0.0007) +[2023-10-13 00:17:51,823][46663] Updated weights for policy 1, policy_version 2221 (0.0007) +[2023-10-13 00:17:52,035][46662] Updated weights for policy 0, policy_version 2210 (0.0008) +[2023-10-13 00:17:52,202][46663] Updated weights for policy 1, policy_version 2231 (0.0009) +[2023-10-13 00:17:52,394][46662] Updated weights for policy 0, policy_version 2220 (0.0009) +[2023-10-13 00:17:52,767][46662] Updated weights for policy 0, policy_version 2230 (0.0009) +[2023-10-13 00:17:53,147][46662] Updated weights for policy 0, policy_version 2240 (0.0009) +[2023-10-13 00:17:53,607][45375] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 4587520. Throughput: 0: 1677.4, 1: 1693.3. Samples: 1152874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:53,608][45375] Avg episode reward: [(0, '22.700'), (1, '21.650')] +[2023-10-13 00:17:53,621][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth... +[2023-10-13 00:17:53,621][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000002240_2293760.pth... +[2023-10-13 00:17:53,659][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000000672_688128.pth +[2023-10-13 00:17:53,666][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000000672_688128.pth +[2023-10-13 00:17:53,671][46091] Saving new best policy, reward=22.700! 
+[2023-10-13 00:17:56,311][46663] Updated weights for policy 1, policy_version 2241 (0.0007) +[2023-10-13 00:17:56,677][46663] Updated weights for policy 1, policy_version 2251 (0.0007) +[2023-10-13 00:17:57,049][46663] Updated weights for policy 1, policy_version 2261 (0.0007) +[2023-10-13 00:17:57,307][46662] Updated weights for policy 0, policy_version 2250 (0.0011) +[2023-10-13 00:17:57,424][46663] Updated weights for policy 1, policy_version 2271 (0.0008) +[2023-10-13 00:17:57,680][46662] Updated weights for policy 0, policy_version 2260 (0.0008) +[2023-10-13 00:17:58,051][46662] Updated weights for policy 0, policy_version 2270 (0.0007) +[2023-10-13 00:17:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 4653056. Throughput: 0: 1696.7, 1: 1699.4. Samples: 1163884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:17:58,607][45375] Avg episode reward: [(0, '22.790'), (1, '22.310')] +[2023-10-13 00:17:58,608][46091] Saving new best policy, reward=22.790! +[2023-10-13 00:18:01,532][46663] Updated weights for policy 1, policy_version 2281 (0.0009) +[2023-10-13 00:18:01,904][46663] Updated weights for policy 1, policy_version 2291 (0.0007) +[2023-10-13 00:18:02,131][46662] Updated weights for policy 0, policy_version 2280 (0.0009) +[2023-10-13 00:18:02,278][46663] Updated weights for policy 1, policy_version 2301 (0.0009) +[2023-10-13 00:18:02,506][46662] Updated weights for policy 0, policy_version 2290 (0.0009) +[2023-10-13 00:18:02,874][46662] Updated weights for policy 0, policy_version 2300 (0.0011) +[2023-10-13 00:18:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 4718592. Throughput: 0: 1692.9, 1: 1674.1. Samples: 1183266. Policy #0 lag: (min: 9.0, avg: 11.5, max: 41.0) +[2023-10-13 00:18:03,607][45375] Avg episode reward: [(0, '22.670'), (1, '23.320')] +[2023-10-13 00:18:03,608][46384] Saving new best policy, reward=23.320! 
+[2023-10-13 00:18:06,142][46663] Updated weights for policy 1, policy_version 2311 (0.0007) +[2023-10-13 00:18:06,516][46663] Updated weights for policy 1, policy_version 2321 (0.0007) +[2023-10-13 00:18:06,889][46663] Updated weights for policy 1, policy_version 2331 (0.0008) +[2023-10-13 00:18:07,030][46662] Updated weights for policy 0, policy_version 2310 (0.0008) +[2023-10-13 00:18:07,410][46662] Updated weights for policy 0, policy_version 2320 (0.0008) +[2023-10-13 00:18:07,779][46662] Updated weights for policy 0, policy_version 2330 (0.0009) +[2023-10-13 00:18:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 4784128. Throughput: 0: 1664.8, 1: 1703.9. Samples: 1203046. Policy #0 lag: (min: 31.0, avg: 46.4, max: 63.0) +[2023-10-13 00:18:08,608][45375] Avg episode reward: [(0, '22.190'), (1, '24.430')] +[2023-10-13 00:18:08,618][46384] Saving new best policy, reward=24.430! +[2023-10-13 00:18:10,981][46663] Updated weights for policy 1, policy_version 2341 (0.0007) +[2023-10-13 00:18:11,341][46663] Updated weights for policy 1, policy_version 2351 (0.0009) +[2023-10-13 00:18:11,714][46663] Updated weights for policy 1, policy_version 2361 (0.0008) +[2023-10-13 00:18:11,866][46662] Updated weights for policy 0, policy_version 2340 (0.0008) +[2023-10-13 00:18:12,239][46662] Updated weights for policy 0, policy_version 2350 (0.0007) +[2023-10-13 00:18:12,608][46662] Updated weights for policy 0, policy_version 2360 (0.0008) +[2023-10-13 00:18:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 4849664. Throughput: 0: 1686.1, 1: 1690.9. Samples: 1213580. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 00:18:13,607][45375] Avg episode reward: [(0, '21.650'), (1, '24.710')] +[2023-10-13 00:18:13,608][46384] Saving new best policy, reward=24.710! 
+[2023-10-13 00:18:15,690][46663] Updated weights for policy 1, policy_version 2371 (0.0009) +[2023-10-13 00:18:16,057][46663] Updated weights for policy 1, policy_version 2381 (0.0008) +[2023-10-13 00:18:16,435][46663] Updated weights for policy 1, policy_version 2391 (0.0007) +[2023-10-13 00:18:16,775][46662] Updated weights for policy 0, policy_version 2370 (0.0010) +[2023-10-13 00:18:17,134][46662] Updated weights for policy 0, policy_version 2380 (0.0008) +[2023-10-13 00:18:17,514][46662] Updated weights for policy 0, policy_version 2390 (0.0007) +[2023-10-13 00:18:17,884][46662] Updated weights for policy 0, policy_version 2400 (0.0009) +[2023-10-13 00:18:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 4915200. Throughput: 0: 1686.5, 1: 1685.2. Samples: 1233556. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 00:18:18,607][45375] Avg episode reward: [(0, '21.840'), (1, '24.960')] +[2023-10-13 00:18:18,608][46384] Saving new best policy, reward=24.960! +[2023-10-13 00:18:20,531][46663] Updated weights for policy 1, policy_version 2401 (0.0007) +[2023-10-13 00:18:20,901][46663] Updated weights for policy 1, policy_version 2411 (0.0007) +[2023-10-13 00:18:21,258][46663] Updated weights for policy 1, policy_version 2421 (0.0009) +[2023-10-13 00:18:21,630][46663] Updated weights for policy 1, policy_version 2431 (0.0009) +[2023-10-13 00:18:21,900][46662] Updated weights for policy 0, policy_version 2410 (0.0008) +[2023-10-13 00:18:22,276][46662] Updated weights for policy 0, policy_version 2420 (0.0011) +[2023-10-13 00:18:22,647][46662] Updated weights for policy 0, policy_version 2430 (0.0010) +[2023-10-13 00:18:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 4980736. Throughput: 0: 1668.8, 1: 1702.8. Samples: 1253434. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 00:18:23,607][45375] Avg episode reward: [(0, '21.650'), (1, '25.520')] +[2023-10-13 00:18:23,616][46384] Saving new best policy, reward=25.520! +[2023-10-13 00:18:25,490][46663] Updated weights for policy 1, policy_version 2441 (0.0009) +[2023-10-13 00:18:25,848][46663] Updated weights for policy 1, policy_version 2451 (0.0010) +[2023-10-13 00:18:26,218][46663] Updated weights for policy 1, policy_version 2461 (0.0010) +[2023-10-13 00:18:26,513][46662] Updated weights for policy 0, policy_version 2440 (0.0009) +[2023-10-13 00:18:26,896][46662] Updated weights for policy 0, policy_version 2450 (0.0009) +[2023-10-13 00:18:27,260][46662] Updated weights for policy 0, policy_version 2460 (0.0010) +[2023-10-13 00:18:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 5046272. Throughput: 0: 1702.9, 1: 1673.6. Samples: 1263934. Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) +[2023-10-13 00:18:28,607][45375] Avg episode reward: [(0, '21.480'), (1, '26.170')] +[2023-10-13 00:18:28,608][46384] Saving new best policy, reward=26.170! +[2023-10-13 00:18:30,374][46663] Updated weights for policy 1, policy_version 2471 (0.0008) +[2023-10-13 00:18:30,730][46663] Updated weights for policy 1, policy_version 2481 (0.0007) +[2023-10-13 00:18:31,096][46663] Updated weights for policy 1, policy_version 2491 (0.0008) +[2023-10-13 00:18:31,422][46662] Updated weights for policy 0, policy_version 2470 (0.0008) +[2023-10-13 00:18:31,802][46662] Updated weights for policy 0, policy_version 2480 (0.0007) +[2023-10-13 00:18:32,186][46662] Updated weights for policy 0, policy_version 2490 (0.0009) +[2023-10-13 00:18:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 5111808. Throughput: 0: 1684.4, 1: 1690.7. Samples: 1283844. 
Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) +[2023-10-13 00:18:33,607][45375] Avg episode reward: [(0, '21.010'), (1, '26.280')] +[2023-10-13 00:18:33,608][46384] Saving new best policy, reward=26.280! +[2023-10-13 00:18:35,200][46663] Updated weights for policy 1, policy_version 2501 (0.0009) +[2023-10-13 00:18:35,562][46663] Updated weights for policy 1, policy_version 2511 (0.0009) +[2023-10-13 00:18:35,928][46663] Updated weights for policy 1, policy_version 2521 (0.0009) +[2023-10-13 00:18:36,157][46662] Updated weights for policy 0, policy_version 2500 (0.0009) +[2023-10-13 00:18:36,517][46662] Updated weights for policy 0, policy_version 2510 (0.0010) +[2023-10-13 00:18:36,899][46662] Updated weights for policy 0, policy_version 2520 (0.0010) +[2023-10-13 00:18:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 5177344. Throughput: 0: 1672.4, 1: 1679.5. Samples: 1303710. Policy #0 lag: (min: 18.0, avg: 25.6, max: 50.0) +[2023-10-13 00:18:38,607][45375] Avg episode reward: [(0, '21.280'), (1, '25.810')] +[2023-10-13 00:18:40,086][46663] Updated weights for policy 1, policy_version 2531 (0.0008) +[2023-10-13 00:18:40,454][46663] Updated weights for policy 1, policy_version 2541 (0.0007) +[2023-10-13 00:18:40,818][46663] Updated weights for policy 1, policy_version 2551 (0.0007) +[2023-10-13 00:18:41,056][46662] Updated weights for policy 0, policy_version 2530 (0.0007) +[2023-10-13 00:18:41,432][46662] Updated weights for policy 0, policy_version 2540 (0.0009) +[2023-10-13 00:18:41,807][46662] Updated weights for policy 0, policy_version 2550 (0.0009) +[2023-10-13 00:18:42,184][46662] Updated weights for policy 0, policy_version 2560 (0.0011) +[2023-10-13 00:18:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 5242880. Throughput: 0: 1678.9, 1: 1655.6. Samples: 1313938. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:18:43,607][45375] Avg episode reward: [(0, '21.200'), (1, '24.890')] +[2023-10-13 00:18:44,888][46663] Updated weights for policy 1, policy_version 2561 (0.0009) +[2023-10-13 00:18:45,252][46663] Updated weights for policy 1, policy_version 2571 (0.0011) +[2023-10-13 00:18:45,629][46663] Updated weights for policy 1, policy_version 2581 (0.0007) +[2023-10-13 00:18:46,001][46663] Updated weights for policy 1, policy_version 2591 (0.0008) +[2023-10-13 00:18:46,324][46662] Updated weights for policy 0, policy_version 2570 (0.0008) +[2023-10-13 00:18:46,689][46662] Updated weights for policy 0, policy_version 2580 (0.0008) +[2023-10-13 00:18:47,061][46662] Updated weights for policy 0, policy_version 2590 (0.0011) +[2023-10-13 00:18:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5308416. Throughput: 0: 1661.4, 1: 1678.9. Samples: 1333580. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:18:48,607][45375] Avg episode reward: [(0, '21.180'), (1, '25.990')] +[2023-10-13 00:18:50,232][46663] Updated weights for policy 1, policy_version 2601 (0.0007) +[2023-10-13 00:18:50,595][46663] Updated weights for policy 1, policy_version 2611 (0.0007) +[2023-10-13 00:18:50,966][46663] Updated weights for policy 1, policy_version 2621 (0.0007) +[2023-10-13 00:18:51,221][46662] Updated weights for policy 0, policy_version 2600 (0.0010) +[2023-10-13 00:18:51,596][46662] Updated weights for policy 0, policy_version 2610 (0.0009) +[2023-10-13 00:18:51,965][46662] Updated weights for policy 0, policy_version 2620 (0.0008) +[2023-10-13 00:18:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 5373952. Throughput: 0: 1670.2, 1: 1674.9. Samples: 1353574. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:18:53,608][45375] Avg episode reward: [(0, '22.310'), (1, '26.020')] +[2023-10-13 00:18:55,007][46663] Updated weights for policy 1, policy_version 2631 (0.0010) +[2023-10-13 00:18:55,369][46663] Updated weights for policy 1, policy_version 2641 (0.0008) +[2023-10-13 00:18:55,746][46663] Updated weights for policy 1, policy_version 2651 (0.0007) +[2023-10-13 00:18:55,937][46662] Updated weights for policy 0, policy_version 2630 (0.0011) +[2023-10-13 00:18:56,313][46662] Updated weights for policy 0, policy_version 2640 (0.0010) +[2023-10-13 00:18:56,685][46662] Updated weights for policy 0, policy_version 2650 (0.0010) +[2023-10-13 00:18:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5439488. Throughput: 0: 1679.2, 1: 1659.8. Samples: 1363838. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-13 00:18:58,607][45375] Avg episode reward: [(0, '22.350'), (1, '26.300')] +[2023-10-13 00:18:58,608][46384] Saving new best policy, reward=26.300! +[2023-10-13 00:18:59,959][46663] Updated weights for policy 1, policy_version 2661 (0.0009) +[2023-10-13 00:19:00,329][46663] Updated weights for policy 1, policy_version 2671 (0.0008) +[2023-10-13 00:19:00,703][46663] Updated weights for policy 1, policy_version 2681 (0.0007) +[2023-10-13 00:19:00,718][46662] Updated weights for policy 0, policy_version 2660 (0.0010) +[2023-10-13 00:19:01,096][46662] Updated weights for policy 0, policy_version 2670 (0.0007) +[2023-10-13 00:19:01,466][46662] Updated weights for policy 0, policy_version 2680 (0.0007) +[2023-10-13 00:19:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5505024. Throughput: 0: 1651.1, 1: 1675.0. Samples: 1383232. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:19:03,607][45375] Avg episode reward: [(0, '22.060'), (1, '27.260')] +[2023-10-13 00:19:03,608][46384] Saving new best policy, reward=27.260! +[2023-10-13 00:19:04,699][46663] Updated weights for policy 1, policy_version 2691 (0.0007) +[2023-10-13 00:19:05,072][46663] Updated weights for policy 1, policy_version 2701 (0.0009) +[2023-10-13 00:19:05,444][46663] Updated weights for policy 1, policy_version 2711 (0.0009) +[2023-10-13 00:19:05,542][46662] Updated weights for policy 0, policy_version 2690 (0.0007) +[2023-10-13 00:19:05,916][46662] Updated weights for policy 0, policy_version 2700 (0.0007) +[2023-10-13 00:19:06,287][46662] Updated weights for policy 0, policy_version 2710 (0.0009) +[2023-10-13 00:19:06,655][46662] Updated weights for policy 0, policy_version 2720 (0.0007) +[2023-10-13 00:19:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5570560. Throughput: 0: 1672.8, 1: 1672.7. Samples: 1403982. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 00:19:08,607][45375] Avg episode reward: [(0, '21.790'), (1, '26.710')] +[2023-10-13 00:19:09,580][46663] Updated weights for policy 1, policy_version 2721 (0.0008) +[2023-10-13 00:19:09,942][46663] Updated weights for policy 1, policy_version 2731 (0.0007) +[2023-10-13 00:19:10,316][46663] Updated weights for policy 1, policy_version 2741 (0.0007) +[2023-10-13 00:19:10,635][46662] Updated weights for policy 0, policy_version 2730 (0.0009) +[2023-10-13 00:19:10,679][46663] Updated weights for policy 1, policy_version 2751 (0.0007) +[2023-10-13 00:19:11,019][46662] Updated weights for policy 0, policy_version 2740 (0.0008) +[2023-10-13 00:19:11,397][46662] Updated weights for policy 0, policy_version 2750 (0.0009) +[2023-10-13 00:19:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5636096. Throughput: 0: 1659.2, 1: 1670.2. Samples: 1413756. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:19:13,608][45375] Avg episode reward: [(0, '22.610'), (1, '27.430')] +[2023-10-13 00:19:13,609][46384] Saving new best policy, reward=27.430! +[2023-10-13 00:19:14,495][46663] Updated weights for policy 1, policy_version 2761 (0.0009) +[2023-10-13 00:19:14,859][46663] Updated weights for policy 1, policy_version 2771 (0.0008) +[2023-10-13 00:19:15,223][46663] Updated weights for policy 1, policy_version 2781 (0.0008) +[2023-10-13 00:19:15,480][46662] Updated weights for policy 0, policy_version 2760 (0.0008) +[2023-10-13 00:19:15,843][46662] Updated weights for policy 0, policy_version 2770 (0.0010) +[2023-10-13 00:19:16,214][46662] Updated weights for policy 0, policy_version 2780 (0.0009) +[2023-10-13 00:19:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5701632. Throughput: 0: 1660.4, 1: 1674.1. Samples: 1433896. Policy #0 lag: (min: 25.0, avg: 36.3, max: 57.0) +[2023-10-13 00:19:18,607][45375] Avg episode reward: [(0, '23.160'), (1, '27.210')] +[2023-10-13 00:19:18,608][46091] Saving new best policy, reward=23.160! +[2023-10-13 00:19:19,412][46663] Updated weights for policy 1, policy_version 2791 (0.0007) +[2023-10-13 00:19:19,784][46663] Updated weights for policy 1, policy_version 2801 (0.0007) +[2023-10-13 00:19:20,152][46663] Updated weights for policy 1, policy_version 2811 (0.0007) +[2023-10-13 00:19:20,403][46662] Updated weights for policy 0, policy_version 2790 (0.0008) +[2023-10-13 00:19:20,775][46662] Updated weights for policy 0, policy_version 2800 (0.0007) +[2023-10-13 00:19:21,157][46662] Updated weights for policy 0, policy_version 2810 (0.0009) +[2023-10-13 00:19:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 5767168. Throughput: 0: 1673.5, 1: 1682.4. Samples: 1454730. 
Policy #0 lag: (min: 19.0, avg: 21.7, max: 51.0) +[2023-10-13 00:19:23,608][45375] Avg episode reward: [(0, '22.780'), (1, '28.870')] +[2023-10-13 00:19:23,617][46384] Saving new best policy, reward=28.870! +[2023-10-13 00:19:24,026][46663] Updated weights for policy 1, policy_version 2821 (0.0009) +[2023-10-13 00:19:24,399][46663] Updated weights for policy 1, policy_version 2831 (0.0009) +[2023-10-13 00:19:24,776][46663] Updated weights for policy 1, policy_version 2841 (0.0008) +[2023-10-13 00:19:25,060][46662] Updated weights for policy 0, policy_version 2820 (0.0009) +[2023-10-13 00:19:25,436][46662] Updated weights for policy 0, policy_version 2830 (0.0007) +[2023-10-13 00:19:25,817][46662] Updated weights for policy 0, policy_version 2840 (0.0009) +[2023-10-13 00:19:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5832704. Throughput: 0: 1657.3, 1: 1683.0. Samples: 1464252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:19:28,607][45375] Avg episode reward: [(0, '23.600'), (1, '29.320')] +[2023-10-13 00:19:28,608][46384] Saving new best policy, reward=29.320! +[2023-10-13 00:19:28,608][46091] Saving new best policy, reward=23.600! +[2023-10-13 00:19:28,883][46663] Updated weights for policy 1, policy_version 2851 (0.0010) +[2023-10-13 00:19:29,243][46663] Updated weights for policy 1, policy_version 2861 (0.0009) +[2023-10-13 00:19:29,617][46663] Updated weights for policy 1, policy_version 2871 (0.0008) +[2023-10-13 00:19:29,957][46662] Updated weights for policy 0, policy_version 2850 (0.0007) +[2023-10-13 00:19:30,335][46662] Updated weights for policy 0, policy_version 2860 (0.0008) +[2023-10-13 00:19:30,712][46662] Updated weights for policy 0, policy_version 2870 (0.0008) +[2023-10-13 00:19:31,088][46662] Updated weights for policy 0, policy_version 2880 (0.0010) +[2023-10-13 00:19:33,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5898240. 
Throughput: 0: 1667.0, 1: 1683.9. Samples: 1484370. Policy #0 lag: (min: 16.0, avg: 41.5, max: 48.0) +[2023-10-13 00:19:33,607][45375] Avg episode reward: [(0, '24.630'), (1, '30.290')] +[2023-10-13 00:19:33,608][46091] Saving new best policy, reward=24.630! +[2023-10-13 00:19:33,620][46663] Updated weights for policy 1, policy_version 2881 (0.0008) +[2023-10-13 00:19:33,985][46663] Updated weights for policy 1, policy_version 2891 (0.0008) +[2023-10-13 00:19:34,359][46663] Updated weights for policy 1, policy_version 2901 (0.0009) +[2023-10-13 00:19:34,729][46663] Updated weights for policy 1, policy_version 2911 (0.0007) +[2023-10-13 00:19:34,765][46384] Saving new best policy, reward=30.290! +[2023-10-13 00:19:34,968][46662] Updated weights for policy 0, policy_version 2890 (0.0009) +[2023-10-13 00:19:35,336][46662] Updated weights for policy 0, policy_version 2900 (0.0009) +[2023-10-13 00:19:35,711][46662] Updated weights for policy 0, policy_version 2910 (0.0009) +[2023-10-13 00:19:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5963776. Throughput: 0: 1685.9, 1: 1681.4. Samples: 1505102. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:19:38,607][45375] Avg episode reward: [(0, '25.010'), (1, '30.840')] +[2023-10-13 00:19:38,615][46091] Saving new best policy, reward=25.010! +[2023-10-13 00:19:38,918][46663] Updated weights for policy 1, policy_version 2921 (0.0008) +[2023-10-13 00:19:39,286][46663] Updated weights for policy 1, policy_version 2931 (0.0010) +[2023-10-13 00:19:39,662][46663] Updated weights for policy 1, policy_version 2941 (0.0009) +[2023-10-13 00:19:39,767][46662] Updated weights for policy 0, policy_version 2920 (0.0008) +[2023-10-13 00:19:39,771][46384] Saving new best policy, reward=30.840! 
+[2023-10-13 00:19:40,151][46662] Updated weights for policy 0, policy_version 2930 (0.0010) +[2023-10-13 00:19:40,528][46662] Updated weights for policy 0, policy_version 2940 (0.0009) +[2023-10-13 00:19:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6029312. Throughput: 0: 1655.9, 1: 1683.8. Samples: 1514124. Policy #0 lag: (min: 3.0, avg: 8.1, max: 35.0) +[2023-10-13 00:19:43,607][45375] Avg episode reward: [(0, '25.220'), (1, '31.060')] +[2023-10-13 00:19:43,608][46091] Saving new best policy, reward=25.220! +[2023-10-13 00:19:43,797][46663] Updated weights for policy 1, policy_version 2951 (0.0007) +[2023-10-13 00:19:44,171][46663] Updated weights for policy 1, policy_version 2961 (0.0007) +[2023-10-13 00:19:44,507][46662] Updated weights for policy 0, policy_version 2950 (0.0009) +[2023-10-13 00:19:44,537][46663] Updated weights for policy 1, policy_version 2971 (0.0008) +[2023-10-13 00:19:44,721][46384] Saving new best policy, reward=31.060! +[2023-10-13 00:19:44,881][46662] Updated weights for policy 0, policy_version 2960 (0.0008) +[2023-10-13 00:19:45,250][46662] Updated weights for policy 0, policy_version 2970 (0.0009) +[2023-10-13 00:19:48,588][46663] Updated weights for policy 1, policy_version 2981 (0.0008) +[2023-10-13 00:19:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6094848. Throughput: 0: 1686.7, 1: 1689.4. Samples: 1535154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:19:48,607][45375] Avg episode reward: [(0, '26.460'), (1, '30.280')] +[2023-10-13 00:19:48,608][46091] Saving new best policy, reward=26.460! 
+[2023-10-13 00:19:48,959][46663] Updated weights for policy 1, policy_version 2991 (0.0008) +[2023-10-13 00:19:49,291][46662] Updated weights for policy 0, policy_version 2980 (0.0009) +[2023-10-13 00:19:49,332][46663] Updated weights for policy 1, policy_version 3001 (0.0008) +[2023-10-13 00:19:49,661][46662] Updated weights for policy 0, policy_version 2990 (0.0008) +[2023-10-13 00:19:50,031][46662] Updated weights for policy 0, policy_version 3000 (0.0008) +[2023-10-13 00:19:53,510][46663] Updated weights for policy 1, policy_version 3011 (0.0010) +[2023-10-13 00:19:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6160384. Throughput: 0: 1686.1, 1: 1689.4. Samples: 1555880. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:19:53,607][45375] Avg episode reward: [(0, '26.850'), (1, '31.100')] +[2023-10-13 00:19:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000003008_3080192.pth... +[2023-10-13 00:19:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000001440_1474560.pth +[2023-10-13 00:19:53,655][46091] Saving new best policy, reward=26.850! +[2023-10-13 00:19:53,876][46663] Updated weights for policy 1, policy_version 3021 (0.0007) +[2023-10-13 00:19:53,996][46662] Updated weights for policy 0, policy_version 3010 (0.0009) +[2023-10-13 00:19:54,244][46663] Updated weights for policy 1, policy_version 3031 (0.0007) +[2023-10-13 00:19:54,371][46662] Updated weights for policy 0, policy_version 3020 (0.0009) +[2023-10-13 00:19:54,577][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000003040_3112960.pth... +[2023-10-13 00:19:54,616][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000001440_1474560.pth +[2023-10-13 00:19:54,620][46384] Saving new best policy, reward=31.100! 
+[2023-10-13 00:19:54,746][46662] Updated weights for policy 0, policy_version 3030 (0.0009) +[2023-10-13 00:19:55,116][46662] Updated weights for policy 0, policy_version 3040 (0.0008) +[2023-10-13 00:19:58,284][46663] Updated weights for policy 1, policy_version 3041 (0.0008) +[2023-10-13 00:19:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6225920. Throughput: 0: 1668.3, 1: 1691.6. Samples: 1564948. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:19:58,607][45375] Avg episode reward: [(0, '26.510'), (1, '31.150')] +[2023-10-13 00:19:58,653][46663] Updated weights for policy 1, policy_version 3051 (0.0008) +[2023-10-13 00:19:59,031][46663] Updated weights for policy 1, policy_version 3061 (0.0009) +[2023-10-13 00:19:59,399][46663] Updated weights for policy 1, policy_version 3071 (0.0009) +[2023-10-13 00:19:59,404][46662] Updated weights for policy 0, policy_version 3050 (0.0008) +[2023-10-13 00:19:59,425][46384] Saving new best policy, reward=31.150! +[2023-10-13 00:19:59,775][46662] Updated weights for policy 0, policy_version 3060 (0.0010) +[2023-10-13 00:20:00,158][46662] Updated weights for policy 0, policy_version 3070 (0.0009) +[2023-10-13 00:20:03,581][46663] Updated weights for policy 1, policy_version 3081 (0.0008) +[2023-10-13 00:20:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 6291456. Throughput: 0: 1684.6, 1: 1683.9. Samples: 1585478. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:20:03,608][45375] Avg episode reward: [(0, '27.590'), (1, '31.570')] +[2023-10-13 00:20:03,609][46091] Saving new best policy, reward=27.590! 
+[2023-10-13 00:20:03,950][46663] Updated weights for policy 1, policy_version 3091 (0.0007) +[2023-10-13 00:20:04,310][46662] Updated weights for policy 0, policy_version 3080 (0.0010) +[2023-10-13 00:20:04,322][46663] Updated weights for policy 1, policy_version 3101 (0.0007) +[2023-10-13 00:20:04,424][46384] Saving new best policy, reward=31.570! +[2023-10-13 00:20:04,688][46662] Updated weights for policy 0, policy_version 3090 (0.0010) +[2023-10-13 00:20:05,055][46662] Updated weights for policy 0, policy_version 3100 (0.0009) +[2023-10-13 00:20:08,465][46663] Updated weights for policy 1, policy_version 3111 (0.0009) +[2023-10-13 00:20:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6356992. Throughput: 0: 1681.3, 1: 1678.5. Samples: 1605920. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 00:20:08,607][45375] Avg episode reward: [(0, '28.470'), (1, '32.220')] +[2023-10-13 00:20:08,613][46091] Saving new best policy, reward=28.470! +[2023-10-13 00:20:08,825][46663] Updated weights for policy 1, policy_version 3121 (0.0009) +[2023-10-13 00:20:09,199][46663] Updated weights for policy 1, policy_version 3131 (0.0008) +[2023-10-13 00:20:09,383][46384] Saving new best policy, reward=32.220! +[2023-10-13 00:20:09,388][46662] Updated weights for policy 0, policy_version 3110 (0.0007) +[2023-10-13 00:20:09,774][46662] Updated weights for policy 0, policy_version 3120 (0.0007) +[2023-10-13 00:20:10,153][46662] Updated weights for policy 0, policy_version 3130 (0.0008) +[2023-10-13 00:20:13,181][46663] Updated weights for policy 1, policy_version 3141 (0.0007) +[2023-10-13 00:20:13,580][46663] Updated weights for policy 1, policy_version 3151 (0.0009) +[2023-10-13 00:20:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6422528. Throughput: 0: 1667.9, 1: 1684.2. Samples: 1615096. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:20:13,607][45375] Avg episode reward: [(0, '26.940'), (1, '31.060')] +[2023-10-13 00:20:13,949][46663] Updated weights for policy 1, policy_version 3161 (0.0007) +[2023-10-13 00:20:14,037][46662] Updated weights for policy 0, policy_version 3140 (0.0010) +[2023-10-13 00:20:14,396][46662] Updated weights for policy 0, policy_version 3150 (0.0009) +[2023-10-13 00:20:14,766][46662] Updated weights for policy 0, policy_version 3160 (0.0010) +[2023-10-13 00:20:17,952][46663] Updated weights for policy 1, policy_version 3171 (0.0008) +[2023-10-13 00:20:18,319][46663] Updated weights for policy 1, policy_version 3181 (0.0008) +[2023-10-13 00:20:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6488064. Throughput: 0: 1678.8, 1: 1683.9. Samples: 1635690. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:20:18,607][45375] Avg episode reward: [(0, '26.570'), (1, '31.190')] +[2023-10-13 00:20:18,684][46663] Updated weights for policy 1, policy_version 3191 (0.0010) +[2023-10-13 00:20:18,962][46662] Updated weights for policy 0, policy_version 3170 (0.0009) +[2023-10-13 00:20:19,341][46662] Updated weights for policy 0, policy_version 3180 (0.0008) +[2023-10-13 00:20:19,713][46662] Updated weights for policy 0, policy_version 3190 (0.0007) +[2023-10-13 00:20:20,086][46662] Updated weights for policy 0, policy_version 3200 (0.0007) +[2023-10-13 00:20:22,888][46663] Updated weights for policy 1, policy_version 3201 (0.0008) +[2023-10-13 00:20:23,255][46663] Updated weights for policy 1, policy_version 3211 (0.0007) +[2023-10-13 00:20:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 6553600. Throughput: 0: 1674.8, 1: 1669.1. Samples: 1655578. 
Policy #0 lag: (min: 29.0, avg: 33.9, max: 61.0) +[2023-10-13 00:20:23,607][45375] Avg episode reward: [(0, '27.450'), (1, '30.500')] +[2023-10-13 00:20:23,633][46663] Updated weights for policy 1, policy_version 3221 (0.0008) +[2023-10-13 00:20:24,001][46663] Updated weights for policy 1, policy_version 3231 (0.0009) +[2023-10-13 00:20:24,121][46662] Updated weights for policy 0, policy_version 3210 (0.0007) +[2023-10-13 00:20:24,480][46662] Updated weights for policy 0, policy_version 3220 (0.0007) +[2023-10-13 00:20:24,863][46662] Updated weights for policy 0, policy_version 3230 (0.0008) +[2023-10-13 00:20:27,955][46663] Updated weights for policy 1, policy_version 3241 (0.0010) +[2023-10-13 00:20:28,330][46663] Updated weights for policy 1, policy_version 3251 (0.0008) +[2023-10-13 00:20:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6619136. Throughput: 0: 1679.8, 1: 1681.7. Samples: 1665392. Policy #0 lag: (min: 29.0, avg: 33.9, max: 61.0) +[2023-10-13 00:20:28,607][45375] Avg episode reward: [(0, '25.940'), (1, '30.910')] +[2023-10-13 00:20:28,700][46663] Updated weights for policy 1, policy_version 3261 (0.0007) +[2023-10-13 00:20:28,787][46662] Updated weights for policy 0, policy_version 3240 (0.0008) +[2023-10-13 00:20:29,161][46662] Updated weights for policy 0, policy_version 3250 (0.0007) +[2023-10-13 00:20:29,539][46662] Updated weights for policy 0, policy_version 3260 (0.0007) +[2023-10-13 00:20:32,894][46663] Updated weights for policy 1, policy_version 3271 (0.0007) +[2023-10-13 00:20:33,258][46663] Updated weights for policy 1, policy_version 3281 (0.0009) +[2023-10-13 00:20:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6684672. Throughput: 0: 1677.9, 1: 1673.0. Samples: 1685944. 
Policy #0 lag: (min: 19.0, avg: 23.8, max: 51.0) +[2023-10-13 00:20:33,608][45375] Avg episode reward: [(0, '26.050'), (1, '32.870')] +[2023-10-13 00:20:33,622][46662] Updated weights for policy 0, policy_version 3270 (0.0008) +[2023-10-13 00:20:33,627][46663] Updated weights for policy 1, policy_version 3291 (0.0007) +[2023-10-13 00:20:33,817][46384] Saving new best policy, reward=32.870! +[2023-10-13 00:20:33,991][46662] Updated weights for policy 0, policy_version 3280 (0.0008) +[2023-10-13 00:20:34,361][46662] Updated weights for policy 0, policy_version 3290 (0.0009) +[2023-10-13 00:20:37,604][46663] Updated weights for policy 1, policy_version 3301 (0.0007) +[2023-10-13 00:20:37,957][46663] Updated weights for policy 1, policy_version 3311 (0.0008) +[2023-10-13 00:20:38,330][46663] Updated weights for policy 1, policy_version 3321 (0.0008) +[2023-10-13 00:20:38,464][46662] Updated weights for policy 0, policy_version 3300 (0.0008) +[2023-10-13 00:20:38,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 6782976. Throughput: 0: 1676.7, 1: 1654.5. Samples: 1705786. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:20:38,607][45375] Avg episode reward: [(0, '25.080'), (1, '32.710')] +[2023-10-13 00:20:38,835][46662] Updated weights for policy 0, policy_version 3310 (0.0009) +[2023-10-13 00:20:39,207][46662] Updated weights for policy 0, policy_version 3320 (0.0007) +[2023-10-13 00:20:42,330][46663] Updated weights for policy 1, policy_version 3331 (0.0009) +[2023-10-13 00:20:42,701][46663] Updated weights for policy 1, policy_version 3341 (0.0009) +[2023-10-13 00:20:43,072][46663] Updated weights for policy 1, policy_version 3351 (0.0010) +[2023-10-13 00:20:43,203][46662] Updated weights for policy 0, policy_version 3330 (0.0008) +[2023-10-13 00:20:43,564][46662] Updated weights for policy 0, policy_version 3340 (0.0008) +[2023-10-13 00:20:43,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 6848512. Throughput: 0: 1683.2, 1: 1679.3. Samples: 1716264. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:20:43,608][45375] Avg episode reward: [(0, '24.290'), (1, '33.720')] +[2023-10-13 00:20:43,609][46384] Saving new best policy, reward=33.720! +[2023-10-13 00:20:43,938][46662] Updated weights for policy 0, policy_version 3350 (0.0008) +[2023-10-13 00:20:44,306][46662] Updated weights for policy 0, policy_version 3360 (0.0007) +[2023-10-13 00:20:47,123][46663] Updated weights for policy 1, policy_version 3361 (0.0009) +[2023-10-13 00:20:47,489][46663] Updated weights for policy 1, policy_version 3371 (0.0010) +[2023-10-13 00:20:47,863][46663] Updated weights for policy 1, policy_version 3381 (0.0008) +[2023-10-13 00:20:48,233][46663] Updated weights for policy 1, policy_version 3391 (0.0008) +[2023-10-13 00:20:48,376][46662] Updated weights for policy 0, policy_version 3370 (0.0009) +[2023-10-13 00:20:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 6914048. Throughput: 0: 1682.9, 1: 1676.3. Samples: 1736642. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:20:48,607][45375] Avg episode reward: [(0, '25.340'), (1, '34.610')] +[2023-10-13 00:20:48,608][46384] Saving new best policy, reward=34.610! +[2023-10-13 00:20:48,753][46662] Updated weights for policy 0, policy_version 3380 (0.0008) +[2023-10-13 00:20:49,125][46662] Updated weights for policy 0, policy_version 3390 (0.0009) +[2023-10-13 00:20:52,298][46663] Updated weights for policy 1, policy_version 3401 (0.0009) +[2023-10-13 00:20:52,663][46663] Updated weights for policy 1, policy_version 3411 (0.0010) +[2023-10-13 00:20:53,031][46663] Updated weights for policy 1, policy_version 3421 (0.0008) +[2023-10-13 00:20:53,245][46662] Updated weights for policy 0, policy_version 3400 (0.0009) +[2023-10-13 00:20:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 6979584. Throughput: 0: 1684.1, 1: 1658.1. Samples: 1756320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:20:53,608][45375] Avg episode reward: [(0, '26.610'), (1, '34.310')] +[2023-10-13 00:20:53,616][46662] Updated weights for policy 0, policy_version 3410 (0.0007) +[2023-10-13 00:20:53,982][46662] Updated weights for policy 0, policy_version 3420 (0.0007) +[2023-10-13 00:20:57,178][46663] Updated weights for policy 1, policy_version 3431 (0.0009) +[2023-10-13 00:20:57,556][46663] Updated weights for policy 1, policy_version 3441 (0.0010) +[2023-10-13 00:20:57,927][46663] Updated weights for policy 1, policy_version 3451 (0.0009) +[2023-10-13 00:20:58,193][46662] Updated weights for policy 0, policy_version 3430 (0.0008) +[2023-10-13 00:20:58,571][46662] Updated weights for policy 0, policy_version 3440 (0.0008) +[2023-10-13 00:20:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 7045120. Throughput: 0: 1687.0, 1: 1677.9. Samples: 1766516. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:20:58,607][45375] Avg episode reward: [(0, '26.150'), (1, '34.360')] +[2023-10-13 00:20:58,951][46662] Updated weights for policy 0, policy_version 3450 (0.0008) +[2023-10-13 00:21:02,159][46663] Updated weights for policy 1, policy_version 3461 (0.0007) +[2023-10-13 00:21:02,553][46663] Updated weights for policy 1, policy_version 3471 (0.0009) +[2023-10-13 00:21:02,922][46663] Updated weights for policy 1, policy_version 3481 (0.0008) +[2023-10-13 00:21:03,023][46662] Updated weights for policy 0, policy_version 3460 (0.0008) +[2023-10-13 00:21:03,392][46662] Updated weights for policy 0, policy_version 3470 (0.0009) +[2023-10-13 00:21:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 7110656. Throughput: 0: 1686.8, 1: 1667.7. Samples: 1786646. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:21:03,608][45375] Avg episode reward: [(0, '26.460'), (1, '34.000')] +[2023-10-13 00:21:03,755][46662] Updated weights for policy 0, policy_version 3480 (0.0007) +[2023-10-13 00:21:06,919][46663] Updated weights for policy 1, policy_version 3491 (0.0008) +[2023-10-13 00:21:07,292][46663] Updated weights for policy 1, policy_version 3501 (0.0008) +[2023-10-13 00:21:07,668][46663] Updated weights for policy 1, policy_version 3511 (0.0008) +[2023-10-13 00:21:07,709][46662] Updated weights for policy 0, policy_version 3490 (0.0008) +[2023-10-13 00:21:08,082][46662] Updated weights for policy 0, policy_version 3500 (0.0008) +[2023-10-13 00:21:08,457][46662] Updated weights for policy 0, policy_version 3510 (0.0008) +[2023-10-13 00:21:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 7176192. Throughput: 0: 1685.8, 1: 1664.0. Samples: 1806318. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 00:21:08,607][45375] Avg episode reward: [(0, '27.420'), (1, '33.700')] +[2023-10-13 00:21:08,828][46662] Updated weights for policy 0, policy_version 3520 (0.0008) +[2023-10-13 00:21:11,717][46663] Updated weights for policy 1, policy_version 3521 (0.0009) +[2023-10-13 00:21:12,082][46663] Updated weights for policy 1, policy_version 3531 (0.0007) +[2023-10-13 00:21:12,457][46663] Updated weights for policy 1, policy_version 3541 (0.0009) +[2023-10-13 00:21:12,815][46663] Updated weights for policy 1, policy_version 3551 (0.0007) +[2023-10-13 00:21:12,848][46662] Updated weights for policy 0, policy_version 3530 (0.0007) +[2023-10-13 00:21:13,222][46662] Updated weights for policy 0, policy_version 3540 (0.0009) +[2023-10-13 00:21:13,591][46662] Updated weights for policy 0, policy_version 3550 (0.0009) +[2023-10-13 00:21:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 7241728. Throughput: 0: 1685.4, 1: 1680.9. Samples: 1816874. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:21:13,607][45375] Avg episode reward: [(0, '28.240'), (1, '33.570')] +[2023-10-13 00:21:16,768][46663] Updated weights for policy 1, policy_version 3561 (0.0008) +[2023-10-13 00:21:17,152][46663] Updated weights for policy 1, policy_version 3571 (0.0007) +[2023-10-13 00:21:17,522][46663] Updated weights for policy 1, policy_version 3581 (0.0009) +[2023-10-13 00:21:17,624][46662] Updated weights for policy 0, policy_version 3560 (0.0009) +[2023-10-13 00:21:17,995][46662] Updated weights for policy 0, policy_version 3570 (0.0008) +[2023-10-13 00:21:18,369][46662] Updated weights for policy 0, policy_version 3580 (0.0008) +[2023-10-13 00:21:18,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 7340032. Throughput: 0: 1684.9, 1: 1665.4. Samples: 1836710. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:21:18,607][45375] Avg episode reward: [(0, '28.180'), (1, '34.530')] +[2023-10-13 00:21:21,649][46663] Updated weights for policy 1, policy_version 3591 (0.0008) +[2023-10-13 00:21:22,022][46663] Updated weights for policy 1, policy_version 3601 (0.0008) +[2023-10-13 00:21:22,383][46663] Updated weights for policy 1, policy_version 3611 (0.0008) +[2023-10-13 00:21:22,449][46662] Updated weights for policy 0, policy_version 3590 (0.0008) +[2023-10-13 00:21:22,817][46662] Updated weights for policy 0, policy_version 3600 (0.0007) +[2023-10-13 00:21:23,201][46662] Updated weights for policy 0, policy_version 3610 (0.0007) +[2023-10-13 00:21:23,607][45375] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 7405568. Throughput: 0: 1671.3, 1: 1677.5. Samples: 1856482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:21:23,608][45375] Avg episode reward: [(0, '28.810'), (1, '33.890')] +[2023-10-13 00:21:23,619][46091] Saving new best policy, reward=28.810! +[2023-10-13 00:21:26,441][46663] Updated weights for policy 1, policy_version 3621 (0.0008) +[2023-10-13 00:21:26,815][46663] Updated weights for policy 1, policy_version 3631 (0.0010) +[2023-10-13 00:21:27,147][46662] Updated weights for policy 0, policy_version 3620 (0.0008) +[2023-10-13 00:21:27,192][46663] Updated weights for policy 1, policy_version 3641 (0.0008) +[2023-10-13 00:21:27,513][46662] Updated weights for policy 0, policy_version 3630 (0.0010) +[2023-10-13 00:21:27,893][46662] Updated weights for policy 0, policy_version 3640 (0.0009) +[2023-10-13 00:21:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 7471104. Throughput: 0: 1681.4, 1: 1678.6. Samples: 1867462. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:21:28,607][45375] Avg episode reward: [(0, '27.830'), (1, '32.520')] +[2023-10-13 00:21:31,153][46663] Updated weights for policy 1, policy_version 3651 (0.0008) +[2023-10-13 00:21:31,529][46663] Updated weights for policy 1, policy_version 3661 (0.0009) +[2023-10-13 00:21:31,898][46663] Updated weights for policy 1, policy_version 3671 (0.0009) +[2023-10-13 00:21:32,121][46662] Updated weights for policy 0, policy_version 3650 (0.0010) +[2023-10-13 00:21:32,493][46662] Updated weights for policy 0, policy_version 3660 (0.0011) +[2023-10-13 00:21:32,881][46662] Updated weights for policy 0, policy_version 3670 (0.0009) +[2023-10-13 00:21:33,246][46662] Updated weights for policy 0, policy_version 3680 (0.0007) +[2023-10-13 00:21:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 7536640. Throughput: 0: 1680.8, 1: 1662.1. Samples: 1887070. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:21:33,607][45375] Avg episode reward: [(0, '27.740'), (1, '31.490')] +[2023-10-13 00:21:35,903][46663] Updated weights for policy 1, policy_version 3681 (0.0009) +[2023-10-13 00:21:36,271][46663] Updated weights for policy 1, policy_version 3691 (0.0008) +[2023-10-13 00:21:36,643][46663] Updated weights for policy 1, policy_version 3701 (0.0008) +[2023-10-13 00:21:37,006][46663] Updated weights for policy 1, policy_version 3711 (0.0008) +[2023-10-13 00:21:37,293][46662] Updated weights for policy 0, policy_version 3690 (0.0008) +[2023-10-13 00:21:37,670][46662] Updated weights for policy 0, policy_version 3700 (0.0007) +[2023-10-13 00:21:38,043][46662] Updated weights for policy 0, policy_version 3710 (0.0007) +[2023-10-13 00:21:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 7602176. Throughput: 0: 1662.7, 1: 1686.5. Samples: 1907030. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:21:38,607][45375] Avg episode reward: [(0, '27.250'), (1, '31.470')] +[2023-10-13 00:21:40,928][46663] Updated weights for policy 1, policy_version 3721 (0.0009) +[2023-10-13 00:21:41,304][46663] Updated weights for policy 1, policy_version 3731 (0.0010) +[2023-10-13 00:21:41,663][46663] Updated weights for policy 1, policy_version 3741 (0.0010) +[2023-10-13 00:21:42,050][46662] Updated weights for policy 0, policy_version 3720 (0.0008) +[2023-10-13 00:21:42,426][46662] Updated weights for policy 0, policy_version 3730 (0.0009) +[2023-10-13 00:21:42,793][46662] Updated weights for policy 0, policy_version 3740 (0.0009) +[2023-10-13 00:21:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 7667712. Throughput: 0: 1688.8, 1: 1672.3. Samples: 1917766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:21:43,607][45375] Avg episode reward: [(0, '25.830'), (1, '31.300')] +[2023-10-13 00:21:45,874][46663] Updated weights for policy 1, policy_version 3751 (0.0010) +[2023-10-13 00:21:46,246][46663] Updated weights for policy 1, policy_version 3761 (0.0011) +[2023-10-13 00:21:46,615][46663] Updated weights for policy 1, policy_version 3771 (0.0008) +[2023-10-13 00:21:46,810][46662] Updated weights for policy 0, policy_version 3750 (0.0008) +[2023-10-13 00:21:47,201][46662] Updated weights for policy 0, policy_version 3760 (0.0010) +[2023-10-13 00:21:47,574][46662] Updated weights for policy 0, policy_version 3770 (0.0010) +[2023-10-13 00:21:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 7733248. Throughput: 0: 1685.3, 1: 1674.9. Samples: 1937854. 
Policy #0 lag: (min: 16.0, avg: 42.7, max: 48.0) +[2023-10-13 00:21:48,607][45375] Avg episode reward: [(0, '25.930'), (1, '31.980')] +[2023-10-13 00:21:50,795][46663] Updated weights for policy 1, policy_version 3781 (0.0008) +[2023-10-13 00:21:51,170][46663] Updated weights for policy 1, policy_version 3791 (0.0010) +[2023-10-13 00:21:51,535][46663] Updated weights for policy 1, policy_version 3801 (0.0009) +[2023-10-13 00:21:51,547][46662] Updated weights for policy 0, policy_version 3780 (0.0007) +[2023-10-13 00:21:51,924][46662] Updated weights for policy 0, policy_version 3790 (0.0007) +[2023-10-13 00:21:52,292][46662] Updated weights for policy 0, policy_version 3800 (0.0008) +[2023-10-13 00:21:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 7798784. Throughput: 0: 1666.7, 1: 1689.2. Samples: 1957336. Policy #0 lag: (min: 16.0, avg: 42.7, max: 48.0) +[2023-10-13 00:21:53,607][45375] Avg episode reward: [(0, '25.800'), (1, '31.300')] +[2023-10-13 00:21:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000003808_3899392.pth... +[2023-10-13 00:21:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000003808_3899392.pth... 
+[2023-10-13 00:21:53,652][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth +[2023-10-13 00:21:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000002240_2293760.pth +[2023-10-13 00:21:55,726][46663] Updated weights for policy 1, policy_version 3811 (0.0009) +[2023-10-13 00:21:56,099][46663] Updated weights for policy 1, policy_version 3821 (0.0011) +[2023-10-13 00:21:56,352][46662] Updated weights for policy 0, policy_version 3810 (0.0010) +[2023-10-13 00:21:56,460][46663] Updated weights for policy 1, policy_version 3831 (0.0008) +[2023-10-13 00:21:56,716][46662] Updated weights for policy 0, policy_version 3820 (0.0007) +[2023-10-13 00:21:57,084][46662] Updated weights for policy 0, policy_version 3830 (0.0009) +[2023-10-13 00:21:57,451][46662] Updated weights for policy 0, policy_version 3840 (0.0007) +[2023-10-13 00:21:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 7864320. Throughput: 0: 1693.7, 1: 1671.4. Samples: 1968304. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 00:21:58,607][45375] Avg episode reward: [(0, '25.260'), (1, '32.530')] +[2023-10-13 00:22:00,341][46663] Updated weights for policy 1, policy_version 3841 (0.0007) +[2023-10-13 00:22:00,709][46663] Updated weights for policy 1, policy_version 3851 (0.0009) +[2023-10-13 00:22:01,075][46663] Updated weights for policy 1, policy_version 3861 (0.0008) +[2023-10-13 00:22:01,445][46663] Updated weights for policy 1, policy_version 3871 (0.0008) +[2023-10-13 00:22:01,591][46662] Updated weights for policy 0, policy_version 3850 (0.0007) +[2023-10-13 00:22:01,966][46662] Updated weights for policy 0, policy_version 3860 (0.0009) +[2023-10-13 00:22:02,329][46662] Updated weights for policy 0, policy_version 3870 (0.0008) +[2023-10-13 00:22:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 7929856. 
Throughput: 0: 1677.1, 1: 1686.9. Samples: 1988092. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 00:22:03,608][45375] Avg episode reward: [(0, '24.810'), (1, '31.470')] +[2023-10-13 00:22:05,617][46663] Updated weights for policy 1, policy_version 3881 (0.0008) +[2023-10-13 00:22:05,986][46663] Updated weights for policy 1, policy_version 3891 (0.0008) +[2023-10-13 00:22:06,353][46663] Updated weights for policy 1, policy_version 3901 (0.0008) +[2023-10-13 00:22:06,397][46662] Updated weights for policy 0, policy_version 3880 (0.0008) +[2023-10-13 00:22:06,774][46662] Updated weights for policy 0, policy_version 3890 (0.0008) +[2023-10-13 00:22:07,146][46662] Updated weights for policy 0, policy_version 3900 (0.0008) +[2023-10-13 00:22:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 7995392. Throughput: 0: 1679.7, 1: 1690.5. Samples: 2008140. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:22:08,607][45375] Avg episode reward: [(0, '25.570'), (1, '31.030')] +[2023-10-13 00:22:10,427][46663] Updated weights for policy 1, policy_version 3911 (0.0009) +[2023-10-13 00:22:10,791][46663] Updated weights for policy 1, policy_version 3921 (0.0009) +[2023-10-13 00:22:11,157][46663] Updated weights for policy 1, policy_version 3931 (0.0008) +[2023-10-13 00:22:11,177][46662] Updated weights for policy 0, policy_version 3910 (0.0008) +[2023-10-13 00:22:11,551][46662] Updated weights for policy 0, policy_version 3920 (0.0010) +[2023-10-13 00:22:11,931][46662] Updated weights for policy 0, policy_version 3930 (0.0009) +[2023-10-13 00:22:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 8060928. Throughput: 0: 1692.3, 1: 1662.6. Samples: 2018432. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:22:13,608][45375] Avg episode reward: [(0, '25.070'), (1, '31.340')] +[2023-10-13 00:22:15,196][46663] Updated weights for policy 1, policy_version 3941 (0.0008) +[2023-10-13 00:22:15,560][46663] Updated weights for policy 1, policy_version 3951 (0.0008) +[2023-10-13 00:22:15,860][46662] Updated weights for policy 0, policy_version 3940 (0.0008) +[2023-10-13 00:22:15,926][46663] Updated weights for policy 1, policy_version 3961 (0.0008) +[2023-10-13 00:22:16,233][46662] Updated weights for policy 0, policy_version 3950 (0.0007) +[2023-10-13 00:22:16,602][46662] Updated weights for policy 0, policy_version 3960 (0.0007) +[2023-10-13 00:22:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 8126464. Throughput: 0: 1674.1, 1: 1683.4. Samples: 2038160. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:22:18,607][45375] Avg episode reward: [(0, '26.460'), (1, '30.600')] +[2023-10-13 00:22:20,137][46663] Updated weights for policy 1, policy_version 3971 (0.0009) +[2023-10-13 00:22:20,501][46663] Updated weights for policy 1, policy_version 3981 (0.0010) +[2023-10-13 00:22:20,618][46662] Updated weights for policy 0, policy_version 3970 (0.0009) +[2023-10-13 00:22:20,871][46663] Updated weights for policy 1, policy_version 3991 (0.0007) +[2023-10-13 00:22:20,995][46662] Updated weights for policy 0, policy_version 3980 (0.0008) +[2023-10-13 00:22:21,359][46662] Updated weights for policy 0, policy_version 3990 (0.0009) +[2023-10-13 00:22:21,732][46662] Updated weights for policy 0, policy_version 4000 (0.0007) +[2023-10-13 00:22:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 8192000. Throughput: 0: 1688.8, 1: 1680.3. Samples: 2058640. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:22:23,607][45375] Avg episode reward: [(0, '27.260'), (1, '30.940')] +[2023-10-13 00:22:24,827][46663] Updated weights for policy 1, policy_version 4001 (0.0007) +[2023-10-13 00:22:25,197][46663] Updated weights for policy 1, policy_version 4011 (0.0011) +[2023-10-13 00:22:25,558][46663] Updated weights for policy 1, policy_version 4021 (0.0007) +[2023-10-13 00:22:25,741][46662] Updated weights for policy 0, policy_version 4010 (0.0008) +[2023-10-13 00:22:25,932][46663] Updated weights for policy 1, policy_version 4031 (0.0008) +[2023-10-13 00:22:26,115][46662] Updated weights for policy 0, policy_version 4020 (0.0007) +[2023-10-13 00:22:26,491][46662] Updated weights for policy 0, policy_version 4030 (0.0008) +[2023-10-13 00:22:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 8257536. Throughput: 0: 1683.7, 1: 1666.3. Samples: 2068518. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 00:22:28,607][45375] Avg episode reward: [(0, '27.110'), (1, '29.730')] +[2023-10-13 00:22:29,987][46663] Updated weights for policy 1, policy_version 4041 (0.0008) +[2023-10-13 00:22:30,347][46663] Updated weights for policy 1, policy_version 4051 (0.0008) +[2023-10-13 00:22:30,479][46662] Updated weights for policy 0, policy_version 4040 (0.0008) +[2023-10-13 00:22:30,725][46663] Updated weights for policy 1, policy_version 4061 (0.0008) +[2023-10-13 00:22:30,858][46662] Updated weights for policy 0, policy_version 4050 (0.0007) +[2023-10-13 00:22:31,224][46662] Updated weights for policy 0, policy_version 4060 (0.0007) +[2023-10-13 00:22:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 8323072. Throughput: 0: 1666.8, 1: 1676.5. Samples: 2088304. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 00:22:33,607][45375] Avg episode reward: [(0, '27.460'), (1, '30.090')] +[2023-10-13 00:22:34,627][46663] Updated weights for policy 1, policy_version 4071 (0.0009) +[2023-10-13 00:22:34,992][46663] Updated weights for policy 1, policy_version 4081 (0.0008) +[2023-10-13 00:22:35,367][46663] Updated weights for policy 1, policy_version 4091 (0.0009) +[2023-10-13 00:22:35,419][46662] Updated weights for policy 0, policy_version 4070 (0.0007) +[2023-10-13 00:22:35,811][46662] Updated weights for policy 0, policy_version 4080 (0.0008) +[2023-10-13 00:22:36,178][46662] Updated weights for policy 0, policy_version 4090 (0.0008) +[2023-10-13 00:22:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 8388608. Throughput: 0: 1687.2, 1: 1685.9. Samples: 2109124. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 00:22:38,607][45375] Avg episode reward: [(0, '27.560'), (1, '30.220')] +[2023-10-13 00:22:39,536][46663] Updated weights for policy 1, policy_version 4101 (0.0008) +[2023-10-13 00:22:39,914][46663] Updated weights for policy 1, policy_version 4111 (0.0010) +[2023-10-13 00:22:40,223][46662] Updated weights for policy 0, policy_version 4100 (0.0009) +[2023-10-13 00:22:40,286][46663] Updated weights for policy 1, policy_version 4121 (0.0007) +[2023-10-13 00:22:40,589][46662] Updated weights for policy 0, policy_version 4110 (0.0009) +[2023-10-13 00:22:40,960][46662] Updated weights for policy 0, policy_version 4120 (0.0010) +[2023-10-13 00:22:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8454144. Throughput: 0: 1667.9, 1: 1669.2. Samples: 2118472. 
Policy #0 lag: (min: 7.0, avg: 10.2, max: 39.0) +[2023-10-13 00:22:43,607][45375] Avg episode reward: [(0, '28.020'), (1, '29.160')] +[2023-10-13 00:22:44,377][46663] Updated weights for policy 1, policy_version 4131 (0.0008) +[2023-10-13 00:22:44,758][46663] Updated weights for policy 1, policy_version 4141 (0.0010) +[2023-10-13 00:22:45,135][46663] Updated weights for policy 1, policy_version 4151 (0.0010) +[2023-10-13 00:22:45,147][46662] Updated weights for policy 0, policy_version 4130 (0.0009) +[2023-10-13 00:22:45,522][46662] Updated weights for policy 0, policy_version 4140 (0.0008) +[2023-10-13 00:22:45,891][46662] Updated weights for policy 0, policy_version 4150 (0.0011) +[2023-10-13 00:22:46,263][46662] Updated weights for policy 0, policy_version 4160 (0.0010) +[2023-10-13 00:22:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8519680. Throughput: 0: 1670.4, 1: 1666.6. Samples: 2138258. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:22:48,608][45375] Avg episode reward: [(0, '28.670'), (1, '28.330')] +[2023-10-13 00:22:49,287][46663] Updated weights for policy 1, policy_version 4161 (0.0008) +[2023-10-13 00:22:49,657][46663] Updated weights for policy 1, policy_version 4171 (0.0010) +[2023-10-13 00:22:50,025][46663] Updated weights for policy 1, policy_version 4181 (0.0008) +[2023-10-13 00:22:50,287][46662] Updated weights for policy 0, policy_version 4170 (0.0007) +[2023-10-13 00:22:50,396][46663] Updated weights for policy 1, policy_version 4191 (0.0008) +[2023-10-13 00:22:50,660][46662] Updated weights for policy 0, policy_version 4180 (0.0008) +[2023-10-13 00:22:51,029][46662] Updated weights for policy 0, policy_version 4190 (0.0007) +[2023-10-13 00:22:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8585216. Throughput: 0: 1680.8, 1: 1668.8. Samples: 2158872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:22:53,607][45375] Avg episode reward: [(0, '29.180'), (1, '28.710')] +[2023-10-13 00:22:53,615][46091] Saving new best policy, reward=29.180! +[2023-10-13 00:22:54,455][46663] Updated weights for policy 1, policy_version 4201 (0.0009) +[2023-10-13 00:22:54,831][46663] Updated weights for policy 1, policy_version 4211 (0.0009) +[2023-10-13 00:22:54,945][46662] Updated weights for policy 0, policy_version 4200 (0.0008) +[2023-10-13 00:22:55,192][46663] Updated weights for policy 1, policy_version 4221 (0.0009) +[2023-10-13 00:22:55,316][46662] Updated weights for policy 0, policy_version 4210 (0.0010) +[2023-10-13 00:22:55,706][46662] Updated weights for policy 0, policy_version 4220 (0.0007) +[2023-10-13 00:22:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8650752. Throughput: 0: 1658.3, 1: 1672.7. Samples: 2168324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:22:58,607][45375] Avg episode reward: [(0, '28.870'), (1, '30.510')] +[2023-10-13 00:22:59,432][46663] Updated weights for policy 1, policy_version 4231 (0.0008) +[2023-10-13 00:22:59,769][46662] Updated weights for policy 0, policy_version 4230 (0.0007) +[2023-10-13 00:22:59,809][46663] Updated weights for policy 1, policy_version 4241 (0.0009) +[2023-10-13 00:23:00,136][46662] Updated weights for policy 0, policy_version 4240 (0.0008) +[2023-10-13 00:23:00,175][46663] Updated weights for policy 1, policy_version 4251 (0.0007) +[2023-10-13 00:23:00,506][46662] Updated weights for policy 0, policy_version 4250 (0.0008) +[2023-10-13 00:23:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8716288. Throughput: 0: 1676.1, 1: 1668.6. Samples: 2188670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:03,607][45375] Avg episode reward: [(0, '30.080'), (1, '30.880')] +[2023-10-13 00:23:03,608][46091] Saving new best policy, reward=30.080! +[2023-10-13 00:23:04,395][46663] Updated weights for policy 1, policy_version 4261 (0.0007) +[2023-10-13 00:23:04,580][46662] Updated weights for policy 0, policy_version 4260 (0.0007) +[2023-10-13 00:23:04,766][46663] Updated weights for policy 1, policy_version 4271 (0.0007) +[2023-10-13 00:23:04,949][46662] Updated weights for policy 0, policy_version 4270 (0.0007) +[2023-10-13 00:23:05,135][46663] Updated weights for policy 1, policy_version 4281 (0.0009) +[2023-10-13 00:23:05,309][46662] Updated weights for policy 0, policy_version 4280 (0.0010) +[2023-10-13 00:23:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8781824. Throughput: 0: 1679.6, 1: 1668.3. Samples: 2209292. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:08,607][45375] Avg episode reward: [(0, '30.000'), (1, '31.150')] +[2023-10-13 00:23:09,314][46663] Updated weights for policy 1, policy_version 4291 (0.0007) +[2023-10-13 00:23:09,497][46662] Updated weights for policy 0, policy_version 4290 (0.0009) +[2023-10-13 00:23:09,689][46663] Updated weights for policy 1, policy_version 4301 (0.0007) +[2023-10-13 00:23:09,856][46662] Updated weights for policy 0, policy_version 4300 (0.0007) +[2023-10-13 00:23:10,062][46663] Updated weights for policy 1, policy_version 4311 (0.0008) +[2023-10-13 00:23:10,233][46662] Updated weights for policy 0, policy_version 4310 (0.0008) +[2023-10-13 00:23:10,599][46662] Updated weights for policy 0, policy_version 4320 (0.0007) +[2023-10-13 00:23:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8847360. Throughput: 0: 1657.1, 1: 1668.2. Samples: 2218156. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:13,608][45375] Avg episode reward: [(0, '29.710'), (1, '31.970')] +[2023-10-13 00:23:14,240][46663] Updated weights for policy 1, policy_version 4321 (0.0008) +[2023-10-13 00:23:14,608][46663] Updated weights for policy 1, policy_version 4331 (0.0008) +[2023-10-13 00:23:14,629][46662] Updated weights for policy 0, policy_version 4330 (0.0009) +[2023-10-13 00:23:14,977][46663] Updated weights for policy 1, policy_version 4341 (0.0008) +[2023-10-13 00:23:15,005][46662] Updated weights for policy 0, policy_version 4340 (0.0008) +[2023-10-13 00:23:15,337][46663] Updated weights for policy 1, policy_version 4351 (0.0008) +[2023-10-13 00:23:15,376][46662] Updated weights for policy 0, policy_version 4350 (0.0007) +[2023-10-13 00:23:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8912896. Throughput: 0: 1679.7, 1: 1663.3. Samples: 2238740. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 00:23:18,607][45375] Avg episode reward: [(0, '30.000'), (1, '31.790')] +[2023-10-13 00:23:19,360][46663] Updated weights for policy 1, policy_version 4361 (0.0009) +[2023-10-13 00:23:19,507][46662] Updated weights for policy 0, policy_version 4360 (0.0009) +[2023-10-13 00:23:19,730][46663] Updated weights for policy 1, policy_version 4371 (0.0008) +[2023-10-13 00:23:19,874][46662] Updated weights for policy 0, policy_version 4370 (0.0008) +[2023-10-13 00:23:20,098][46663] Updated weights for policy 1, policy_version 4381 (0.0008) +[2023-10-13 00:23:20,242][46662] Updated weights for policy 0, policy_version 4380 (0.0009) +[2023-10-13 00:23:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8978432. Throughput: 0: 1673.4, 1: 1658.9. Samples: 2259078. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 00:23:23,607][45375] Avg episode reward: [(0, '29.720'), (1, '31.730')] +[2023-10-13 00:23:24,243][46663] Updated weights for policy 1, policy_version 4391 (0.0008) +[2023-10-13 00:23:24,447][46662] Updated weights for policy 0, policy_version 4390 (0.0008) +[2023-10-13 00:23:24,628][46663] Updated weights for policy 1, policy_version 4401 (0.0007) +[2023-10-13 00:23:24,834][46662] Updated weights for policy 0, policy_version 4400 (0.0008) +[2023-10-13 00:23:24,994][46663] Updated weights for policy 1, policy_version 4411 (0.0008) +[2023-10-13 00:23:25,211][46662] Updated weights for policy 0, policy_version 4410 (0.0009) +[2023-10-13 00:23:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9043968. Throughput: 0: 1655.9, 1: 1662.7. Samples: 2267810. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:23:28,607][45375] Avg episode reward: [(0, '30.850'), (1, '31.830')] +[2023-10-13 00:23:28,608][46091] Saving new best policy, reward=30.850! +[2023-10-13 00:23:29,033][46663] Updated weights for policy 1, policy_version 4421 (0.0008) +[2023-10-13 00:23:29,339][46662] Updated weights for policy 0, policy_version 4420 (0.0007) +[2023-10-13 00:23:29,398][46663] Updated weights for policy 1, policy_version 4431 (0.0008) +[2023-10-13 00:23:29,705][46662] Updated weights for policy 0, policy_version 4430 (0.0007) +[2023-10-13 00:23:29,767][46663] Updated weights for policy 1, policy_version 4441 (0.0007) +[2023-10-13 00:23:30,078][46662] Updated weights for policy 0, policy_version 4440 (0.0008) +[2023-10-13 00:23:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9109504. Throughput: 0: 1668.4, 1: 1669.8. Samples: 2288476. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:23:33,607][45375] Avg episode reward: [(0, '30.260'), (1, '32.630')] +[2023-10-13 00:23:33,720][46663] Updated weights for policy 1, policy_version 4451 (0.0009) +[2023-10-13 00:23:34,075][46663] Updated weights for policy 1, policy_version 4461 (0.0009) +[2023-10-13 00:23:34,200][46662] Updated weights for policy 0, policy_version 4450 (0.0007) +[2023-10-13 00:23:34,448][46663] Updated weights for policy 1, policy_version 4471 (0.0008) +[2023-10-13 00:23:34,573][46662] Updated weights for policy 0, policy_version 4460 (0.0009) +[2023-10-13 00:23:34,955][46662] Updated weights for policy 0, policy_version 4470 (0.0008) +[2023-10-13 00:23:35,329][46662] Updated weights for policy 0, policy_version 4480 (0.0010) +[2023-10-13 00:23:38,378][46663] Updated weights for policy 1, policy_version 4481 (0.0008) +[2023-10-13 00:23:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9175040. Throughput: 0: 1664.9, 1: 1675.9. Samples: 2309208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:38,607][45375] Avg episode reward: [(0, '30.230'), (1, '34.290')] +[2023-10-13 00:23:38,747][46663] Updated weights for policy 1, policy_version 4491 (0.0010) +[2023-10-13 00:23:39,104][46663] Updated weights for policy 1, policy_version 4501 (0.0009) +[2023-10-13 00:23:39,473][46663] Updated weights for policy 1, policy_version 4511 (0.0009) +[2023-10-13 00:23:39,481][46662] Updated weights for policy 0, policy_version 4490 (0.0007) +[2023-10-13 00:23:39,852][46662] Updated weights for policy 0, policy_version 4500 (0.0009) +[2023-10-13 00:23:40,215][46662] Updated weights for policy 0, policy_version 4510 (0.0010) +[2023-10-13 00:23:43,603][46663] Updated weights for policy 1, policy_version 4521 (0.0007) +[2023-10-13 00:23:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 9240576. Throughput: 0: 1658.9, 1: 1674.4. 
Samples: 2318324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:43,608][45375] Avg episode reward: [(0, '30.650'), (1, '35.130')] +[2023-10-13 00:23:43,962][46663] Updated weights for policy 1, policy_version 4531 (0.0008) +[2023-10-13 00:23:44,287][46662] Updated weights for policy 0, policy_version 4520 (0.0009) +[2023-10-13 00:23:44,332][46663] Updated weights for policy 1, policy_version 4541 (0.0009) +[2023-10-13 00:23:44,441][46384] Saving new best policy, reward=35.130! +[2023-10-13 00:23:44,645][46662] Updated weights for policy 0, policy_version 4530 (0.0009) +[2023-10-13 00:23:45,024][46662] Updated weights for policy 0, policy_version 4540 (0.0011) +[2023-10-13 00:23:48,434][46663] Updated weights for policy 1, policy_version 4551 (0.0008) +[2023-10-13 00:23:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9306112. Throughput: 0: 1659.5, 1: 1678.4. Samples: 2338874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:48,607][45375] Avg episode reward: [(0, '29.620'), (1, '34.250')] +[2023-10-13 00:23:48,798][46663] Updated weights for policy 1, policy_version 4561 (0.0008) +[2023-10-13 00:23:49,100][46662] Updated weights for policy 0, policy_version 4550 (0.0009) +[2023-10-13 00:23:49,160][46663] Updated weights for policy 1, policy_version 4571 (0.0008) +[2023-10-13 00:23:49,464][46662] Updated weights for policy 0, policy_version 4560 (0.0008) +[2023-10-13 00:23:49,836][46662] Updated weights for policy 0, policy_version 4570 (0.0008) +[2023-10-13 00:23:53,433][46663] Updated weights for policy 1, policy_version 4581 (0.0009) +[2023-10-13 00:23:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 9371648. Throughput: 0: 1661.3, 1: 1670.8. Samples: 2359240. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:23:53,608][45375] Avg episode reward: [(0, '29.830'), (1, '34.650')] +[2023-10-13 00:23:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000004576_4685824.pth... +[2023-10-13 00:23:53,649][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000003008_3080192.pth +[2023-10-13 00:23:53,806][46663] Updated weights for policy 1, policy_version 4591 (0.0007) +[2023-10-13 00:23:53,981][46662] Updated weights for policy 0, policy_version 4580 (0.0007) +[2023-10-13 00:23:54,165][46663] Updated weights for policy 1, policy_version 4601 (0.0007) +[2023-10-13 00:23:54,344][46662] Updated weights for policy 0, policy_version 4590 (0.0007) +[2023-10-13 00:23:54,423][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000004608_4718592.pth... +[2023-10-13 00:23:54,452][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000003040_3112960.pth +[2023-10-13 00:23:54,714][46662] Updated weights for policy 0, policy_version 4600 (0.0008) +[2023-10-13 00:23:58,366][46663] Updated weights for policy 1, policy_version 4611 (0.0008) +[2023-10-13 00:23:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9437184. Throughput: 0: 1662.9, 1: 1674.9. Samples: 2368360. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 00:23:58,607][45375] Avg episode reward: [(0, '31.700'), (1, '34.670')] +[2023-10-13 00:23:58,736][46663] Updated weights for policy 1, policy_version 4621 (0.0009) +[2023-10-13 00:23:58,762][46662] Updated weights for policy 0, policy_version 4610 (0.0007) +[2023-10-13 00:23:59,104][46663] Updated weights for policy 1, policy_version 4631 (0.0010) +[2023-10-13 00:23:59,134][46662] Updated weights for policy 0, policy_version 4620 (0.0008) +[2023-10-13 00:23:59,514][46662] Updated weights for policy 0, policy_version 4630 (0.0008) +[2023-10-13 00:23:59,888][46091] Saving new best policy, reward=31.700! +[2023-10-13 00:23:59,894][46662] Updated weights for policy 0, policy_version 4640 (0.0008) +[2023-10-13 00:24:03,143][46663] Updated weights for policy 1, policy_version 4641 (0.0009) +[2023-10-13 00:24:03,511][46663] Updated weights for policy 1, policy_version 4651 (0.0010) +[2023-10-13 00:24:03,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9502720. Throughput: 0: 1665.4, 1: 1674.8. Samples: 2389052. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 00:24:03,607][45375] Avg episode reward: [(0, '32.130'), (1, '35.270')] +[2023-10-13 00:24:03,879][46663] Updated weights for policy 1, policy_version 4661 (0.0008) +[2023-10-13 00:24:03,970][46662] Updated weights for policy 0, policy_version 4650 (0.0009) +[2023-10-13 00:24:04,243][46663] Updated weights for policy 1, policy_version 4671 (0.0008) +[2023-10-13 00:24:04,280][46384] Saving new best policy, reward=35.270! +[2023-10-13 00:24:04,343][46662] Updated weights for policy 0, policy_version 4660 (0.0010) +[2023-10-13 00:24:04,717][46662] Updated weights for policy 0, policy_version 4670 (0.0008) +[2023-10-13 00:24:04,792][46091] Saving new best policy, reward=32.130! 
+[2023-10-13 00:24:08,424][46663] Updated weights for policy 1, policy_version 4681 (0.0008) +[2023-10-13 00:24:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9568256. Throughput: 0: 1664.1, 1: 1664.4. Samples: 2408864. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-13 00:24:08,607][45375] Avg episode reward: [(0, '31.210'), (1, '35.570')] +[2023-10-13 00:24:08,787][46663] Updated weights for policy 1, policy_version 4691 (0.0007) +[2023-10-13 00:24:08,976][46662] Updated weights for policy 0, policy_version 4680 (0.0009) +[2023-10-13 00:24:09,157][46663] Updated weights for policy 1, policy_version 4701 (0.0007) +[2023-10-13 00:24:09,261][46384] Saving new best policy, reward=35.570! +[2023-10-13 00:24:09,345][46662] Updated weights for policy 0, policy_version 4690 (0.0007) +[2023-10-13 00:24:09,716][46662] Updated weights for policy 0, policy_version 4700 (0.0007) +[2023-10-13 00:24:13,207][46663] Updated weights for policy 1, policy_version 4711 (0.0007) +[2023-10-13 00:24:13,586][46663] Updated weights for policy 1, policy_version 4721 (0.0008) +[2023-10-13 00:24:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9633792. Throughput: 0: 1668.0, 1: 1672.4. Samples: 2418124. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-13 00:24:13,607][45375] Avg episode reward: [(0, '29.780'), (1, '35.790')] +[2023-10-13 00:24:13,869][46662] Updated weights for policy 0, policy_version 4710 (0.0008) +[2023-10-13 00:24:13,946][46663] Updated weights for policy 1, policy_version 4731 (0.0007) +[2023-10-13 00:24:14,129][46384] Saving new best policy, reward=35.790! 
+[2023-10-13 00:24:14,235][46662] Updated weights for policy 0, policy_version 4720 (0.0010) +[2023-10-13 00:24:14,611][46662] Updated weights for policy 0, policy_version 4730 (0.0007) +[2023-10-13 00:24:18,064][46663] Updated weights for policy 1, policy_version 4741 (0.0008) +[2023-10-13 00:24:18,428][46663] Updated weights for policy 1, policy_version 4751 (0.0008) +[2023-10-13 00:24:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9699328. Throughput: 0: 1669.1, 1: 1671.1. Samples: 2438784. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:24:18,607][45375] Avg episode reward: [(0, '30.600'), (1, '37.480')] +[2023-10-13 00:24:18,738][46662] Updated weights for policy 0, policy_version 4740 (0.0010) +[2023-10-13 00:24:18,802][46663] Updated weights for policy 1, policy_version 4761 (0.0008) +[2023-10-13 00:24:19,049][46384] Saving new best policy, reward=37.480! +[2023-10-13 00:24:19,099][46662] Updated weights for policy 0, policy_version 4750 (0.0008) +[2023-10-13 00:24:19,466][46662] Updated weights for policy 0, policy_version 4760 (0.0009) +[2023-10-13 00:24:22,927][46663] Updated weights for policy 1, policy_version 4771 (0.0008) +[2023-10-13 00:24:23,301][46663] Updated weights for policy 1, policy_version 4781 (0.0009) +[2023-10-13 00:24:23,565][46662] Updated weights for policy 0, policy_version 4770 (0.0007) +[2023-10-13 00:24:23,606][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9764864. Throughput: 0: 1671.2, 1: 1649.9. Samples: 2458654. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:24:23,607][45375] Avg episode reward: [(0, '30.030'), (1, '38.490')] +[2023-10-13 00:24:23,673][46663] Updated weights for policy 1, policy_version 4791 (0.0007) +[2023-10-13 00:24:23,937][46662] Updated weights for policy 0, policy_version 4780 (0.0008) +[2023-10-13 00:24:23,999][46384] Saving new best policy, reward=38.490! 
+[2023-10-13 00:24:24,310][46662] Updated weights for policy 0, policy_version 4790 (0.0009) +[2023-10-13 00:24:24,670][46662] Updated weights for policy 0, policy_version 4800 (0.0008) +[2023-10-13 00:24:27,641][46663] Updated weights for policy 1, policy_version 4801 (0.0008) +[2023-10-13 00:24:28,002][46663] Updated weights for policy 1, policy_version 4811 (0.0008) +[2023-10-13 00:24:28,379][46663] Updated weights for policy 1, policy_version 4821 (0.0009) +[2023-10-13 00:24:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9830400. Throughput: 0: 1670.2, 1: 1667.3. Samples: 2468512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 00:24:28,607][45375] Avg episode reward: [(0, '28.720'), (1, '37.950')] +[2023-10-13 00:24:28,745][46663] Updated weights for policy 1, policy_version 4831 (0.0009) +[2023-10-13 00:24:28,888][46662] Updated weights for policy 0, policy_version 4810 (0.0008) +[2023-10-13 00:24:29,268][46662] Updated weights for policy 0, policy_version 4820 (0.0007) +[2023-10-13 00:24:29,638][46662] Updated weights for policy 0, policy_version 4830 (0.0007) +[2023-10-13 00:24:32,939][46663] Updated weights for policy 1, policy_version 4841 (0.0008) +[2023-10-13 00:24:33,301][46663] Updated weights for policy 1, policy_version 4851 (0.0010) +[2023-10-13 00:24:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9895936. Throughput: 0: 1669.3, 1: 1665.4. Samples: 2488938. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 00:24:33,607][45375] Avg episode reward: [(0, '28.640'), (1, '36.130')] +[2023-10-13 00:24:33,680][46663] Updated weights for policy 1, policy_version 4861 (0.0009) +[2023-10-13 00:24:33,760][46662] Updated weights for policy 0, policy_version 4840 (0.0008) +[2023-10-13 00:24:34,137][46662] Updated weights for policy 0, policy_version 4850 (0.0010) +[2023-10-13 00:24:34,509][46662] Updated weights for policy 0, policy_version 4860 (0.0007) +[2023-10-13 00:24:37,904][46663] Updated weights for policy 1, policy_version 4871 (0.0008) +[2023-10-13 00:24:38,282][46663] Updated weights for policy 1, policy_version 4881 (0.0008) +[2023-10-13 00:24:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9961472. Throughput: 0: 1666.7, 1: 1653.3. Samples: 2508638. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 00:24:38,607][45375] Avg episode reward: [(0, '29.430'), (1, '36.530')] +[2023-10-13 00:24:38,650][46663] Updated weights for policy 1, policy_version 4891 (0.0008) +[2023-10-13 00:24:38,657][46662] Updated weights for policy 0, policy_version 4870 (0.0009) +[2023-10-13 00:24:39,018][46662] Updated weights for policy 0, policy_version 4880 (0.0009) +[2023-10-13 00:24:39,400][46662] Updated weights for policy 0, policy_version 4890 (0.0009) +[2023-10-13 00:24:42,740][46663] Updated weights for policy 1, policy_version 4901 (0.0008) +[2023-10-13 00:24:43,110][46663] Updated weights for policy 1, policy_version 4911 (0.0008) +[2023-10-13 00:24:43,402][46662] Updated weights for policy 0, policy_version 4900 (0.0008) +[2023-10-13 00:24:43,487][46663] Updated weights for policy 1, policy_version 4921 (0.0010) +[2023-10-13 00:24:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10027008. Throughput: 0: 1669.0, 1: 1668.2. Samples: 2518536. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 00:24:43,607][45375] Avg episode reward: [(0, '28.630'), (1, '35.450')] +[2023-10-13 00:24:43,773][46662] Updated weights for policy 0, policy_version 4910 (0.0007) +[2023-10-13 00:24:44,140][46662] Updated weights for policy 0, policy_version 4920 (0.0010) +[2023-10-13 00:24:47,453][46663] Updated weights for policy 1, policy_version 4931 (0.0009) +[2023-10-13 00:24:47,816][46663] Updated weights for policy 1, policy_version 4941 (0.0008) +[2023-10-13 00:24:48,180][46663] Updated weights for policy 1, policy_version 4951 (0.0010) +[2023-10-13 00:24:48,206][46662] Updated weights for policy 0, policy_version 4930 (0.0010) +[2023-10-13 00:24:48,566][46662] Updated weights for policy 0, policy_version 4940 (0.0009) +[2023-10-13 00:24:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10125312. Throughput: 0: 1662.8, 1: 1671.3. Samples: 2539086. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 00:24:48,607][45375] Avg episode reward: [(0, '28.600'), (1, '35.830')] +[2023-10-13 00:24:48,945][46662] Updated weights for policy 0, policy_version 4950 (0.0010) +[2023-10-13 00:24:49,313][46662] Updated weights for policy 0, policy_version 4960 (0.0010) +[2023-10-13 00:24:52,209][46663] Updated weights for policy 1, policy_version 4961 (0.0009) +[2023-10-13 00:24:52,572][46663] Updated weights for policy 1, policy_version 4971 (0.0009) +[2023-10-13 00:24:52,949][46663] Updated weights for policy 1, policy_version 4981 (0.0009) +[2023-10-13 00:24:53,316][46663] Updated weights for policy 1, policy_version 4991 (0.0008) +[2023-10-13 00:24:53,561][46662] Updated weights for policy 0, policy_version 4970 (0.0009) +[2023-10-13 00:24:53,607][45375] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10190848. Throughput: 0: 1667.9, 1: 1661.2. Samples: 2558676. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 00:24:53,608][45375] Avg episode reward: [(0, '30.230'), (1, '36.740')] +[2023-10-13 00:24:53,946][46662] Updated weights for policy 0, policy_version 4980 (0.0009) +[2023-10-13 00:24:54,321][46662] Updated weights for policy 0, policy_version 4990 (0.0010) +[2023-10-13 00:24:57,369][46663] Updated weights for policy 1, policy_version 5001 (0.0009) +[2023-10-13 00:24:57,749][46663] Updated weights for policy 1, policy_version 5011 (0.0011) +[2023-10-13 00:24:58,127][46663] Updated weights for policy 1, policy_version 5021 (0.0009) +[2023-10-13 00:24:58,602][46662] Updated weights for policy 0, policy_version 5000 (0.0010) +[2023-10-13 00:24:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10256384. Throughput: 0: 1667.8, 1: 1682.4. Samples: 2568882. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) +[2023-10-13 00:24:58,607][45375] Avg episode reward: [(0, '31.490'), (1, '36.250')] +[2023-10-13 00:24:58,977][46662] Updated weights for policy 0, policy_version 5010 (0.0007) +[2023-10-13 00:24:59,344][46662] Updated weights for policy 0, policy_version 5020 (0.0010) +[2023-10-13 00:25:02,231][46663] Updated weights for policy 1, policy_version 5031 (0.0008) +[2023-10-13 00:25:02,603][46663] Updated weights for policy 1, policy_version 5041 (0.0010) +[2023-10-13 00:25:02,980][46663] Updated weights for policy 1, policy_version 5051 (0.0009) +[2023-10-13 00:25:03,223][46662] Updated weights for policy 0, policy_version 5030 (0.0010) +[2023-10-13 00:25:03,594][46662] Updated weights for policy 0, policy_version 5040 (0.0007) +[2023-10-13 00:25:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10321920. Throughput: 0: 1661.3, 1: 1671.5. Samples: 2588760. 
Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) +[2023-10-13 00:25:03,607][45375] Avg episode reward: [(0, '31.570'), (1, '33.700')] +[2023-10-13 00:25:03,973][46662] Updated weights for policy 0, policy_version 5050 (0.0009) +[2023-10-13 00:25:06,877][46663] Updated weights for policy 1, policy_version 5061 (0.0008) +[2023-10-13 00:25:07,245][46663] Updated weights for policy 1, policy_version 5071 (0.0010) +[2023-10-13 00:25:07,618][46663] Updated weights for policy 1, policy_version 5081 (0.0009) +[2023-10-13 00:25:08,069][46662] Updated weights for policy 0, policy_version 5060 (0.0008) +[2023-10-13 00:25:08,438][46662] Updated weights for policy 0, policy_version 5070 (0.0007) +[2023-10-13 00:25:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10387456. Throughput: 0: 1662.6, 1: 1670.8. Samples: 2608654. Policy #0 lag: (min: 27.0, avg: 35.0, max: 59.0) +[2023-10-13 00:25:08,607][45375] Avg episode reward: [(0, '31.530'), (1, '33.000')] +[2023-10-13 00:25:08,806][46662] Updated weights for policy 0, policy_version 5080 (0.0008) +[2023-10-13 00:25:11,761][46663] Updated weights for policy 1, policy_version 5091 (0.0010) +[2023-10-13 00:25:12,128][46663] Updated weights for policy 1, policy_version 5101 (0.0011) +[2023-10-13 00:25:12,505][46663] Updated weights for policy 1, policy_version 5111 (0.0009) +[2023-10-13 00:25:12,952][46662] Updated weights for policy 0, policy_version 5090 (0.0007) +[2023-10-13 00:25:13,314][46662] Updated weights for policy 0, policy_version 5100 (0.0009) +[2023-10-13 00:25:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10452992. Throughput: 0: 1665.2, 1: 1680.1. Samples: 2619054. 
Policy #0 lag: (min: 27.0, avg: 35.0, max: 59.0) +[2023-10-13 00:25:13,608][45375] Avg episode reward: [(0, '32.160'), (1, '33.290')] +[2023-10-13 00:25:13,692][46662] Updated weights for policy 0, policy_version 5110 (0.0008) +[2023-10-13 00:25:14,058][46091] Saving new best policy, reward=32.160! +[2023-10-13 00:25:14,060][46662] Updated weights for policy 0, policy_version 5120 (0.0007) +[2023-10-13 00:25:16,522][46663] Updated weights for policy 1, policy_version 5121 (0.0009) +[2023-10-13 00:25:16,885][46663] Updated weights for policy 1, policy_version 5131 (0.0009) +[2023-10-13 00:25:17,253][46663] Updated weights for policy 1, policy_version 5141 (0.0010) +[2023-10-13 00:25:17,620][46663] Updated weights for policy 1, policy_version 5151 (0.0010) +[2023-10-13 00:25:18,262][46662] Updated weights for policy 0, policy_version 5130 (0.0008) +[2023-10-13 00:25:18,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10518528. Throughput: 0: 1667.1, 1: 1663.8. Samples: 2638832. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:25:18,608][45375] Avg episode reward: [(0, '33.070'), (1, '33.230')] +[2023-10-13 00:25:18,622][46662] Updated weights for policy 0, policy_version 5140 (0.0009) +[2023-10-13 00:25:18,996][46662] Updated weights for policy 0, policy_version 5150 (0.0008) +[2023-10-13 00:25:19,068][46091] Saving new best policy, reward=33.070! +[2023-10-13 00:25:21,851][46663] Updated weights for policy 1, policy_version 5161 (0.0009) +[2023-10-13 00:25:22,233][46663] Updated weights for policy 1, policy_version 5171 (0.0008) +[2023-10-13 00:25:22,598][46663] Updated weights for policy 1, policy_version 5181 (0.0009) +[2023-10-13 00:25:23,066][46662] Updated weights for policy 0, policy_version 5160 (0.0008) +[2023-10-13 00:25:23,444][46662] Updated weights for policy 0, policy_version 5170 (0.0008) +[2023-10-13 00:25:23,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 10584064. Throughput: 0: 1665.9, 1: 1671.6. Samples: 2658826. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:25:23,607][45375] Avg episode reward: [(0, '31.000'), (1, '34.300')] +[2023-10-13 00:25:23,813][46662] Updated weights for policy 0, policy_version 5180 (0.0009) +[2023-10-13 00:25:26,562][46663] Updated weights for policy 1, policy_version 5191 (0.0009) +[2023-10-13 00:25:26,921][46663] Updated weights for policy 1, policy_version 5201 (0.0008) +[2023-10-13 00:25:27,286][46663] Updated weights for policy 1, policy_version 5211 (0.0007) +[2023-10-13 00:25:27,720][46662] Updated weights for policy 0, policy_version 5190 (0.0007) +[2023-10-13 00:25:28,098][46662] Updated weights for policy 0, policy_version 5200 (0.0007) +[2023-10-13 00:25:28,469][46662] Updated weights for policy 0, policy_version 5210 (0.0009) +[2023-10-13 00:25:28,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10649600. Throughput: 0: 1666.8, 1: 1683.8. Samples: 2669314. Policy #0 lag: (min: 26.0, avg: 29.6, max: 56.0) +[2023-10-13 00:25:28,607][45375] Avg episode reward: [(0, '31.160'), (1, '34.460')] +[2023-10-13 00:25:31,388][46663] Updated weights for policy 1, policy_version 5221 (0.0009) +[2023-10-13 00:25:31,760][46663] Updated weights for policy 1, policy_version 5231 (0.0007) +[2023-10-13 00:25:32,129][46663] Updated weights for policy 1, policy_version 5241 (0.0007) +[2023-10-13 00:25:32,527][46662] Updated weights for policy 0, policy_version 5220 (0.0009) +[2023-10-13 00:25:32,909][46662] Updated weights for policy 0, policy_version 5230 (0.0010) +[2023-10-13 00:25:33,281][46662] Updated weights for policy 0, policy_version 5240 (0.0010) +[2023-10-13 00:25:33,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 10747904. Throughput: 0: 1671.7, 1: 1659.9. Samples: 2689008. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 33.0) +[2023-10-13 00:25:33,608][45375] Avg episode reward: [(0, '31.050'), (1, '34.370')] +[2023-10-13 00:25:36,065][46663] Updated weights for policy 1, policy_version 5251 (0.0009) +[2023-10-13 00:25:36,443][46663] Updated weights for policy 1, policy_version 5261 (0.0010) +[2023-10-13 00:25:36,805][46663] Updated weights for policy 1, policy_version 5271 (0.0010) +[2023-10-13 00:25:37,368][46662] Updated weights for policy 0, policy_version 5250 (0.0009) +[2023-10-13 00:25:37,740][46662] Updated weights for policy 0, policy_version 5260 (0.0008) +[2023-10-13 00:25:38,102][46662] Updated weights for policy 0, policy_version 5270 (0.0009) +[2023-10-13 00:25:38,482][46662] Updated weights for policy 0, policy_version 5280 (0.0010) +[2023-10-13 00:25:38,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 10813440. Throughput: 0: 1667.6, 1: 1676.8. Samples: 2709174. Policy #0 lag: (min: 1.0, avg: 11.8, max: 33.0) +[2023-10-13 00:25:38,607][45375] Avg episode reward: [(0, '30.320'), (1, '33.560')] +[2023-10-13 00:25:40,852][46663] Updated weights for policy 1, policy_version 5281 (0.0011) +[2023-10-13 00:25:41,222][46663] Updated weights for policy 1, policy_version 5291 (0.0008) +[2023-10-13 00:25:41,595][46663] Updated weights for policy 1, policy_version 5301 (0.0008) +[2023-10-13 00:25:41,966][46663] Updated weights for policy 1, policy_version 5311 (0.0008) +[2023-10-13 00:25:42,538][46662] Updated weights for policy 0, policy_version 5290 (0.0011) +[2023-10-13 00:25:42,915][46662] Updated weights for policy 0, policy_version 5300 (0.0010) +[2023-10-13 00:25:43,277][46662] Updated weights for policy 0, policy_version 5310 (0.0010) +[2023-10-13 00:25:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 10878976. Throughput: 0: 1680.6, 1: 1669.3. Samples: 2719628. 
Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) +[2023-10-13 00:25:43,608][45375] Avg episode reward: [(0, '28.710'), (1, '32.940')] +[2023-10-13 00:25:46,062][46663] Updated weights for policy 1, policy_version 5321 (0.0009) +[2023-10-13 00:25:46,420][46663] Updated weights for policy 1, policy_version 5331 (0.0010) +[2023-10-13 00:25:46,790][46663] Updated weights for policy 1, policy_version 5341 (0.0007) +[2023-10-13 00:25:47,397][46662] Updated weights for policy 0, policy_version 5320 (0.0007) +[2023-10-13 00:25:47,780][46662] Updated weights for policy 0, policy_version 5330 (0.0010) +[2023-10-13 00:25:48,142][46662] Updated weights for policy 0, policy_version 5340 (0.0008) +[2023-10-13 00:25:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10944512. Throughput: 0: 1687.0, 1: 1665.8. Samples: 2739634. Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) +[2023-10-13 00:25:48,607][45375] Avg episode reward: [(0, '29.960'), (1, '33.620')] +[2023-10-13 00:25:51,073][46663] Updated weights for policy 1, policy_version 5351 (0.0009) +[2023-10-13 00:25:51,457][46663] Updated weights for policy 1, policy_version 5361 (0.0008) +[2023-10-13 00:25:51,831][46663] Updated weights for policy 1, policy_version 5371 (0.0007) +[2023-10-13 00:25:52,192][46662] Updated weights for policy 0, policy_version 5350 (0.0008) +[2023-10-13 00:25:52,562][46662] Updated weights for policy 0, policy_version 5360 (0.0010) +[2023-10-13 00:25:52,927][46662] Updated weights for policy 0, policy_version 5370 (0.0009) +[2023-10-13 00:25:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11010048. Throughput: 0: 1667.0, 1: 1680.7. Samples: 2759300. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-13 00:25:53,607][45375] Avg episode reward: [(0, '30.650'), (1, '33.400')] +[2023-10-13 00:25:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000005376_5505024.pth... 
+[2023-10-13 00:25:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000005376_5505024.pth... +[2023-10-13 00:25:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000003808_3899392.pth +[2023-10-13 00:25:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000003808_3899392.pth +[2023-10-13 00:25:55,782][46663] Updated weights for policy 1, policy_version 5381 (0.0009) +[2023-10-13 00:25:56,159][46663] Updated weights for policy 1, policy_version 5391 (0.0008) +[2023-10-13 00:25:56,538][46663] Updated weights for policy 1, policy_version 5401 (0.0007) +[2023-10-13 00:25:56,863][46662] Updated weights for policy 0, policy_version 5380 (0.0009) +[2023-10-13 00:25:57,233][46662] Updated weights for policy 0, policy_version 5390 (0.0008) +[2023-10-13 00:25:57,606][46662] Updated weights for policy 0, policy_version 5400 (0.0008) +[2023-10-13 00:25:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11075584. Throughput: 0: 1685.8, 1: 1666.4. Samples: 2769902. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-13 00:25:58,607][45375] Avg episode reward: [(0, '29.350'), (1, '33.700')] +[2023-10-13 00:26:00,671][46663] Updated weights for policy 1, policy_version 5411 (0.0007) +[2023-10-13 00:26:01,044][46663] Updated weights for policy 1, policy_version 5421 (0.0007) +[2023-10-13 00:26:01,414][46663] Updated weights for policy 1, policy_version 5431 (0.0010) +[2023-10-13 00:26:01,715][46662] Updated weights for policy 0, policy_version 5410 (0.0008) +[2023-10-13 00:26:02,083][46662] Updated weights for policy 0, policy_version 5420 (0.0007) +[2023-10-13 00:26:02,463][46662] Updated weights for policy 0, policy_version 5430 (0.0008) +[2023-10-13 00:26:02,833][46662] Updated weights for policy 0, policy_version 5440 (0.0011) +[2023-10-13 00:26:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 11141120. Throughput: 0: 1681.1, 1: 1674.7. Samples: 2789842. Policy #0 lag: (min: 2.0, avg: 16.9, max: 34.0) +[2023-10-13 00:26:03,607][45375] Avg episode reward: [(0, '30.180'), (1, '33.290')] +[2023-10-13 00:26:05,592][46663] Updated weights for policy 1, policy_version 5441 (0.0009) +[2023-10-13 00:26:05,960][46663] Updated weights for policy 1, policy_version 5451 (0.0010) +[2023-10-13 00:26:06,333][46663] Updated weights for policy 1, policy_version 5461 (0.0008) +[2023-10-13 00:26:06,699][46663] Updated weights for policy 1, policy_version 5471 (0.0009) +[2023-10-13 00:26:06,789][46662] Updated weights for policy 0, policy_version 5450 (0.0007) +[2023-10-13 00:26:07,153][46662] Updated weights for policy 0, policy_version 5460 (0.0007) +[2023-10-13 00:26:07,527][46662] Updated weights for policy 0, policy_version 5470 (0.0008) +[2023-10-13 00:26:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11206656. Throughput: 0: 1659.7, 1: 1687.0. Samples: 2809430. Policy #0 lag: (min: 2.0, avg: 16.9, max: 34.0) +[2023-10-13 00:26:08,607][45375] Avg episode reward: [(0, '31.190'), (1, '33.280')] +[2023-10-13 00:26:10,741][46663] Updated weights for policy 1, policy_version 5481 (0.0009) +[2023-10-13 00:26:11,111][46663] Updated weights for policy 1, policy_version 5491 (0.0009) +[2023-10-13 00:26:11,478][46662] Updated weights for policy 0, policy_version 5480 (0.0009) +[2023-10-13 00:26:11,479][46663] Updated weights for policy 1, policy_version 5501 (0.0007) +[2023-10-13 00:26:11,849][46662] Updated weights for policy 0, policy_version 5490 (0.0007) +[2023-10-13 00:26:12,227][46662] Updated weights for policy 0, policy_version 5500 (0.0008) +[2023-10-13 00:26:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 11272192. Throughput: 0: 1687.2, 1: 1666.6. Samples: 2820236. 
Policy #0 lag: (min: 17.0, avg: 22.6, max: 49.0) +[2023-10-13 00:26:13,607][45375] Avg episode reward: [(0, '33.650'), (1, '33.370')] +[2023-10-13 00:26:13,608][46091] Saving new best policy, reward=33.650! +[2023-10-13 00:26:15,481][46663] Updated weights for policy 1, policy_version 5511 (0.0007) +[2023-10-13 00:26:15,851][46663] Updated weights for policy 1, policy_version 5521 (0.0008) +[2023-10-13 00:26:16,228][46663] Updated weights for policy 1, policy_version 5531 (0.0008) +[2023-10-13 00:26:16,300][46662] Updated weights for policy 0, policy_version 5510 (0.0008) +[2023-10-13 00:26:16,681][46662] Updated weights for policy 0, policy_version 5520 (0.0007) +[2023-10-13 00:26:17,054][46662] Updated weights for policy 0, policy_version 5530 (0.0008) +[2023-10-13 00:26:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 11337728. Throughput: 0: 1671.4, 1: 1688.6. Samples: 2840206. Policy #0 lag: (min: 17.0, avg: 22.6, max: 49.0) +[2023-10-13 00:26:18,607][45375] Avg episode reward: [(0, '33.610'), (1, '33.190')] +[2023-10-13 00:26:20,100][46663] Updated weights for policy 1, policy_version 5541 (0.0008) +[2023-10-13 00:26:20,479][46663] Updated weights for policy 1, policy_version 5551 (0.0011) +[2023-10-13 00:26:20,849][46663] Updated weights for policy 1, policy_version 5561 (0.0008) +[2023-10-13 00:26:21,312][46662] Updated weights for policy 0, policy_version 5540 (0.0009) +[2023-10-13 00:26:21,684][46662] Updated weights for policy 0, policy_version 5550 (0.0010) +[2023-10-13 00:26:22,060][46662] Updated weights for policy 0, policy_version 5560 (0.0008) +[2023-10-13 00:26:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11403264. Throughput: 0: 1663.1, 1: 1692.5. Samples: 2860178. 
Policy #0 lag: (min: 17.0, avg: 18.5, max: 40.0) +[2023-10-13 00:26:23,607][45375] Avg episode reward: [(0, '33.930'), (1, '32.660')] +[2023-10-13 00:26:23,615][46091] Saving new best policy, reward=33.930! +[2023-10-13 00:26:24,842][46663] Updated weights for policy 1, policy_version 5571 (0.0007) +[2023-10-13 00:26:25,215][46663] Updated weights for policy 1, policy_version 5581 (0.0008) +[2023-10-13 00:26:25,581][46663] Updated weights for policy 1, policy_version 5591 (0.0007) +[2023-10-13 00:26:25,973][46662] Updated weights for policy 0, policy_version 5570 (0.0009) +[2023-10-13 00:26:26,338][46662] Updated weights for policy 0, policy_version 5580 (0.0008) +[2023-10-13 00:26:26,714][46662] Updated weights for policy 0, policy_version 5590 (0.0009) +[2023-10-13 00:26:27,080][46662] Updated weights for policy 0, policy_version 5600 (0.0008) +[2023-10-13 00:26:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11468800. Throughput: 0: 1684.5, 1: 1675.5. Samples: 2870828. Policy #0 lag: (min: 17.0, avg: 18.5, max: 40.0) +[2023-10-13 00:26:28,607][45375] Avg episode reward: [(0, '34.320'), (1, '33.060')] +[2023-10-13 00:26:28,608][46091] Saving new best policy, reward=34.320! +[2023-10-13 00:26:29,755][46663] Updated weights for policy 1, policy_version 5601 (0.0010) +[2023-10-13 00:26:30,117][46663] Updated weights for policy 1, policy_version 5611 (0.0010) +[2023-10-13 00:26:30,485][46663] Updated weights for policy 1, policy_version 5621 (0.0011) +[2023-10-13 00:26:30,858][46663] Updated weights for policy 1, policy_version 5631 (0.0009) +[2023-10-13 00:26:31,205][46662] Updated weights for policy 0, policy_version 5610 (0.0008) +[2023-10-13 00:26:31,574][46662] Updated weights for policy 0, policy_version 5620 (0.0008) +[2023-10-13 00:26:31,941][46662] Updated weights for policy 0, policy_version 5630 (0.0007) +[2023-10-13 00:26:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 11534336. Throughput: 0: 1662.6, 1: 1694.5. Samples: 2890702. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:26:33,608][45375] Avg episode reward: [(0, '33.330'), (1, '32.770')] +[2023-10-13 00:26:34,988][46663] Updated weights for policy 1, policy_version 5641 (0.0007) +[2023-10-13 00:26:35,359][46663] Updated weights for policy 1, policy_version 5651 (0.0008) +[2023-10-13 00:26:35,721][46663] Updated weights for policy 1, policy_version 5661 (0.0011) +[2023-10-13 00:26:36,179][46662] Updated weights for policy 0, policy_version 5640 (0.0008) +[2023-10-13 00:26:36,552][46662] Updated weights for policy 0, policy_version 5650 (0.0007) +[2023-10-13 00:26:36,930][46662] Updated weights for policy 0, policy_version 5660 (0.0008) +[2023-10-13 00:26:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 11599872. Throughput: 0: 1672.8, 1: 1692.2. Samples: 2910726. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:26:38,608][45375] Avg episode reward: [(0, '32.900'), (1, '33.880')] +[2023-10-13 00:26:39,855][46663] Updated weights for policy 1, policy_version 5671 (0.0009) +[2023-10-13 00:26:40,222][46663] Updated weights for policy 1, policy_version 5681 (0.0008) +[2023-10-13 00:26:40,595][46663] Updated weights for policy 1, policy_version 5691 (0.0009) +[2023-10-13 00:26:41,010][46662] Updated weights for policy 0, policy_version 5670 (0.0008) +[2023-10-13 00:26:41,377][46662] Updated weights for policy 0, policy_version 5680 (0.0009) +[2023-10-13 00:26:41,765][46662] Updated weights for policy 0, policy_version 5690 (0.0010) +[2023-10-13 00:26:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11665408. Throughput: 0: 1681.2, 1: 1674.4. Samples: 2920902. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:26:43,607][45375] Avg episode reward: [(0, '33.360'), (1, '33.020')] +[2023-10-13 00:26:44,548][46663] Updated weights for policy 1, policy_version 5701 (0.0008) +[2023-10-13 00:26:44,907][46663] Updated weights for policy 1, policy_version 5711 (0.0008) +[2023-10-13 00:26:45,279][46663] Updated weights for policy 1, policy_version 5721 (0.0009) +[2023-10-13 00:26:45,759][46662] Updated weights for policy 0, policy_version 5700 (0.0010) +[2023-10-13 00:26:46,126][46662] Updated weights for policy 0, policy_version 5710 (0.0009) +[2023-10-13 00:26:46,496][46662] Updated weights for policy 0, policy_version 5720 (0.0008) +[2023-10-13 00:26:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11730944. Throughput: 0: 1659.2, 1: 1686.4. Samples: 2940396. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:26:48,607][45375] Avg episode reward: [(0, '32.690'), (1, '33.270')] +[2023-10-13 00:26:49,420][46663] Updated weights for policy 1, policy_version 5731 (0.0009) +[2023-10-13 00:26:49,791][46663] Updated weights for policy 1, policy_version 5741 (0.0010) +[2023-10-13 00:26:50,157][46663] Updated weights for policy 1, policy_version 5751 (0.0010) +[2023-10-13 00:26:50,613][46662] Updated weights for policy 0, policy_version 5730 (0.0008) +[2023-10-13 00:26:50,976][46662] Updated weights for policy 0, policy_version 5740 (0.0007) +[2023-10-13 00:26:51,349][46662] Updated weights for policy 0, policy_version 5750 (0.0009) +[2023-10-13 00:26:51,717][46662] Updated weights for policy 0, policy_version 5760 (0.0009) +[2023-10-13 00:26:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 11796480. Throughput: 0: 1681.7, 1: 1691.3. Samples: 2961218. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-13 00:26:53,608][45375] Avg episode reward: [(0, '31.120'), (1, '33.140')] +[2023-10-13 00:26:54,033][46663] Updated weights for policy 1, policy_version 5761 (0.0009) +[2023-10-13 00:26:54,396][46663] Updated weights for policy 1, policy_version 5771 (0.0007) +[2023-10-13 00:26:54,763][46663] Updated weights for policy 1, policy_version 5781 (0.0009) +[2023-10-13 00:26:55,138][46663] Updated weights for policy 1, policy_version 5791 (0.0011) +[2023-10-13 00:26:55,821][46662] Updated weights for policy 0, policy_version 5770 (0.0011) +[2023-10-13 00:26:56,186][46662] Updated weights for policy 0, policy_version 5780 (0.0010) +[2023-10-13 00:26:56,561][46662] Updated weights for policy 0, policy_version 5790 (0.0009) +[2023-10-13 00:26:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11862016. Throughput: 0: 1669.8, 1: 1683.1. Samples: 2971118. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-13 00:26:58,607][45375] Avg episode reward: [(0, '29.290'), (1, '33.390')] +[2023-10-13 00:26:59,377][46663] Updated weights for policy 1, policy_version 5801 (0.0008) +[2023-10-13 00:26:59,742][46663] Updated weights for policy 1, policy_version 5811 (0.0008) +[2023-10-13 00:27:00,113][46663] Updated weights for policy 1, policy_version 5821 (0.0009) +[2023-10-13 00:27:00,803][46662] Updated weights for policy 0, policy_version 5800 (0.0009) +[2023-10-13 00:27:01,171][46662] Updated weights for policy 0, policy_version 5810 (0.0009) +[2023-10-13 00:27:01,539][46662] Updated weights for policy 0, policy_version 5820 (0.0010) +[2023-10-13 00:27:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 11927552. Throughput: 0: 1657.2, 1: 1687.2. Samples: 2990708. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:27:03,608][45375] Avg episode reward: [(0, '27.870'), (1, '33.600')] +[2023-10-13 00:27:04,034][46663] Updated weights for policy 1, policy_version 5831 (0.0007) +[2023-10-13 00:27:04,398][46663] Updated weights for policy 1, policy_version 5841 (0.0009) +[2023-10-13 00:27:04,758][46663] Updated weights for policy 1, policy_version 5851 (0.0008) +[2023-10-13 00:27:05,462][46662] Updated weights for policy 0, policy_version 5830 (0.0008) +[2023-10-13 00:27:05,827][46662] Updated weights for policy 0, policy_version 5840 (0.0008) +[2023-10-13 00:27:06,199][46662] Updated weights for policy 0, policy_version 5850 (0.0009) +[2023-10-13 00:27:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11993088. Throughput: 0: 1675.9, 1: 1688.4. Samples: 3011572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:27:08,607][45375] Avg episode reward: [(0, '28.670'), (1, '34.790')] +[2023-10-13 00:27:08,839][46663] Updated weights for policy 1, policy_version 5861 (0.0008) +[2023-10-13 00:27:09,207][46663] Updated weights for policy 1, policy_version 5871 (0.0009) +[2023-10-13 00:27:09,585][46663] Updated weights for policy 1, policy_version 5881 (0.0008) +[2023-10-13 00:27:10,422][46662] Updated weights for policy 0, policy_version 5860 (0.0007) +[2023-10-13 00:27:10,799][46662] Updated weights for policy 0, policy_version 5870 (0.0008) +[2023-10-13 00:27:11,171][46662] Updated weights for policy 0, policy_version 5880 (0.0011) +[2023-10-13 00:27:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 12058624. Throughput: 0: 1660.0, 1: 1685.6. Samples: 3021384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:27:13,608][45375] Avg episode reward: [(0, '28.520'), (1, '35.220')] +[2023-10-13 00:27:13,742][46663] Updated weights for policy 1, policy_version 5891 (0.0009) +[2023-10-13 00:27:14,115][46663] Updated weights for policy 1, policy_version 5901 (0.0007) +[2023-10-13 00:27:14,478][46663] Updated weights for policy 1, policy_version 5911 (0.0007) +[2023-10-13 00:27:15,033][46662] Updated weights for policy 0, policy_version 5890 (0.0009) +[2023-10-13 00:27:15,401][46662] Updated weights for policy 0, policy_version 5900 (0.0007) +[2023-10-13 00:27:15,781][46662] Updated weights for policy 0, policy_version 5910 (0.0009) +[2023-10-13 00:27:16,161][46662] Updated weights for policy 0, policy_version 5920 (0.0008) +[2023-10-13 00:27:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12124160. Throughput: 0: 1666.9, 1: 1680.9. Samples: 3041352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:27:18,607][45375] Avg episode reward: [(0, '28.470'), (1, '34.690')] +[2023-10-13 00:27:18,679][46663] Updated weights for policy 1, policy_version 5921 (0.0008) +[2023-10-13 00:27:19,053][46663] Updated weights for policy 1, policy_version 5931 (0.0011) +[2023-10-13 00:27:19,433][46663] Updated weights for policy 1, policy_version 5941 (0.0009) +[2023-10-13 00:27:19,800][46663] Updated weights for policy 1, policy_version 5951 (0.0009) +[2023-10-13 00:27:20,134][46662] Updated weights for policy 0, policy_version 5930 (0.0009) +[2023-10-13 00:27:20,514][46662] Updated weights for policy 0, policy_version 5940 (0.0007) +[2023-10-13 00:27:20,888][46662] Updated weights for policy 0, policy_version 5950 (0.0008) +[2023-10-13 00:27:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12189696. Throughput: 0: 1675.0, 1: 1684.2. Samples: 3061892. 
Policy #0 lag: (min: 17.0, avg: 29.8, max: 49.0) +[2023-10-13 00:27:23,607][45375] Avg episode reward: [(0, '28.010'), (1, '34.850')] +[2023-10-13 00:27:23,676][46663] Updated weights for policy 1, policy_version 5961 (0.0009) +[2023-10-13 00:27:24,050][46663] Updated weights for policy 1, policy_version 5971 (0.0008) +[2023-10-13 00:27:24,412][46663] Updated weights for policy 1, policy_version 5981 (0.0010) +[2023-10-13 00:27:25,119][46662] Updated weights for policy 0, policy_version 5960 (0.0008) +[2023-10-13 00:27:25,494][46662] Updated weights for policy 0, policy_version 5970 (0.0009) +[2023-10-13 00:27:25,869][46662] Updated weights for policy 0, policy_version 5980 (0.0007) +[2023-10-13 00:27:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12255232. Throughput: 0: 1651.1, 1: 1685.7. Samples: 3071058. Policy #0 lag: (min: 17.0, avg: 29.8, max: 49.0) +[2023-10-13 00:27:28,607][45375] Avg episode reward: [(0, '29.120'), (1, '36.220')] +[2023-10-13 00:27:28,825][46663] Updated weights for policy 1, policy_version 5991 (0.0010) +[2023-10-13 00:27:29,198][46663] Updated weights for policy 1, policy_version 6001 (0.0009) +[2023-10-13 00:27:29,566][46663] Updated weights for policy 1, policy_version 6011 (0.0010) +[2023-10-13 00:27:29,801][46662] Updated weights for policy 0, policy_version 5990 (0.0009) +[2023-10-13 00:27:30,172][46662] Updated weights for policy 0, policy_version 6000 (0.0009) +[2023-10-13 00:27:30,532][46662] Updated weights for policy 0, policy_version 6010 (0.0009) +[2023-10-13 00:27:33,604][46663] Updated weights for policy 1, policy_version 6021 (0.0009) +[2023-10-13 00:27:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12320768. Throughput: 0: 1673.8, 1: 1678.5. Samples: 3091252. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 00:27:33,608][45375] Avg episode reward: [(0, '29.230'), (1, '35.630')] +[2023-10-13 00:27:33,965][46663] Updated weights for policy 1, policy_version 6031 (0.0010) +[2023-10-13 00:27:34,330][46663] Updated weights for policy 1, policy_version 6041 (0.0008) +[2023-10-13 00:27:34,854][46662] Updated weights for policy 0, policy_version 6020 (0.0010) +[2023-10-13 00:27:35,233][46662] Updated weights for policy 0, policy_version 6030 (0.0010) +[2023-10-13 00:27:35,604][46662] Updated weights for policy 0, policy_version 6040 (0.0010) +[2023-10-13 00:27:38,583][46663] Updated weights for policy 1, policy_version 6051 (0.0010) +[2023-10-13 00:27:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12386304. Throughput: 0: 1674.6, 1: 1670.6. Samples: 3111750. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 00:27:38,607][45375] Avg episode reward: [(0, '29.970'), (1, '35.950')] +[2023-10-13 00:27:38,959][46663] Updated weights for policy 1, policy_version 6061 (0.0010) +[2023-10-13 00:27:39,333][46663] Updated weights for policy 1, policy_version 6071 (0.0009) +[2023-10-13 00:27:39,521][46662] Updated weights for policy 0, policy_version 6050 (0.0010) +[2023-10-13 00:27:39,883][46662] Updated weights for policy 0, policy_version 6060 (0.0008) +[2023-10-13 00:27:40,258][46662] Updated weights for policy 0, policy_version 6070 (0.0008) +[2023-10-13 00:27:40,632][46662] Updated weights for policy 0, policy_version 6080 (0.0009) +[2023-10-13 00:27:43,311][46663] Updated weights for policy 1, policy_version 6081 (0.0009) +[2023-10-13 00:27:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12451840. Throughput: 0: 1656.9, 1: 1673.0. Samples: 3120964. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:27:43,608][45375] Avg episode reward: [(0, '30.050'), (1, '35.690')] +[2023-10-13 00:27:43,670][46663] Updated weights for policy 1, policy_version 6091 (0.0010) +[2023-10-13 00:27:44,044][46663] Updated weights for policy 1, policy_version 6101 (0.0008) +[2023-10-13 00:27:44,418][46663] Updated weights for policy 1, policy_version 6111 (0.0010) +[2023-10-13 00:27:44,754][46662] Updated weights for policy 0, policy_version 6090 (0.0009) +[2023-10-13 00:27:45,129][46662] Updated weights for policy 0, policy_version 6100 (0.0008) +[2023-10-13 00:27:45,491][46662] Updated weights for policy 0, policy_version 6110 (0.0009) +[2023-10-13 00:27:48,436][46663] Updated weights for policy 1, policy_version 6121 (0.0009) +[2023-10-13 00:27:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12517376. Throughput: 0: 1679.5, 1: 1674.9. Samples: 3141656. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:27:48,607][45375] Avg episode reward: [(0, '30.560'), (1, '35.440')] +[2023-10-13 00:27:48,806][46663] Updated weights for policy 1, policy_version 6131 (0.0008) +[2023-10-13 00:27:49,174][46663] Updated weights for policy 1, policy_version 6141 (0.0008) +[2023-10-13 00:27:49,578][46662] Updated weights for policy 0, policy_version 6120 (0.0008) +[2023-10-13 00:27:49,966][46662] Updated weights for policy 0, policy_version 6130 (0.0009) +[2023-10-13 00:27:50,342][46662] Updated weights for policy 0, policy_version 6140 (0.0009) +[2023-10-13 00:27:53,180][46663] Updated weights for policy 1, policy_version 6151 (0.0009) +[2023-10-13 00:27:53,559][46663] Updated weights for policy 1, policy_version 6161 (0.0009) +[2023-10-13 00:27:53,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 12582912. Throughput: 0: 1675.2, 1: 1663.5. Samples: 3161812. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 00:27:53,607][45375] Avg episode reward: [(0, '31.380'), (1, '35.360')] +[2023-10-13 00:27:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000006144_6291456.pth... +[2023-10-13 00:27:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000004576_4685824.pth +[2023-10-13 00:27:53,924][46663] Updated weights for policy 1, policy_version 6171 (0.0008) +[2023-10-13 00:27:54,099][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000006176_6324224.pth... +[2023-10-13 00:27:54,135][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000004608_4718592.pth +[2023-10-13 00:27:54,388][46662] Updated weights for policy 0, policy_version 6150 (0.0010) +[2023-10-13 00:27:54,754][46662] Updated weights for policy 0, policy_version 6160 (0.0009) +[2023-10-13 00:27:55,127][46662] Updated weights for policy 0, policy_version 6170 (0.0008) +[2023-10-13 00:27:57,840][46663] Updated weights for policy 1, policy_version 6181 (0.0009) +[2023-10-13 00:27:58,203][46663] Updated weights for policy 1, policy_version 6191 (0.0007) +[2023-10-13 00:27:58,577][46663] Updated weights for policy 1, policy_version 6201 (0.0007) +[2023-10-13 00:27:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12648448. Throughput: 0: 1657.1, 1: 1678.4. Samples: 3171478. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 00:27:58,607][45375] Avg episode reward: [(0, '31.840'), (1, '35.320')] +[2023-10-13 00:27:59,376][46662] Updated weights for policy 0, policy_version 6180 (0.0008) +[2023-10-13 00:27:59,757][46662] Updated weights for policy 0, policy_version 6190 (0.0009) +[2023-10-13 00:28:00,135][46662] Updated weights for policy 0, policy_version 6200 (0.0009) +[2023-10-13 00:28:02,603][46663] Updated weights for policy 1, policy_version 6211 (0.0009) +[2023-10-13 00:28:02,968][46663] Updated weights for policy 1, policy_version 6221 (0.0010) +[2023-10-13 00:28:03,342][46663] Updated weights for policy 1, policy_version 6231 (0.0007) +[2023-10-13 00:28:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12713984. Throughput: 0: 1666.8, 1: 1682.7. Samples: 3192078. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 00:28:03,608][45375] Avg episode reward: [(0, '32.510'), (1, '33.400')] +[2023-10-13 00:28:04,264][46662] Updated weights for policy 0, policy_version 6210 (0.0010) +[2023-10-13 00:28:04,639][46662] Updated weights for policy 0, policy_version 6220 (0.0010) +[2023-10-13 00:28:05,011][46662] Updated weights for policy 0, policy_version 6230 (0.0008) +[2023-10-13 00:28:05,375][46662] Updated weights for policy 0, policy_version 6240 (0.0010) +[2023-10-13 00:28:07,336][46663] Updated weights for policy 1, policy_version 6241 (0.0009) +[2023-10-13 00:28:07,701][46663] Updated weights for policy 1, policy_version 6251 (0.0007) +[2023-10-13 00:28:08,078][46663] Updated weights for policy 1, policy_version 6261 (0.0010) +[2023-10-13 00:28:08,456][46663] Updated weights for policy 1, policy_version 6271 (0.0007) +[2023-10-13 00:28:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12812288. Throughput: 0: 1668.4, 1: 1663.7. Samples: 3211836. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 00:28:08,607][45375] Avg episode reward: [(0, '34.010'), (1, '32.630')] +[2023-10-13 00:28:09,522][46662] Updated weights for policy 0, policy_version 6250 (0.0007) +[2023-10-13 00:28:09,890][46662] Updated weights for policy 0, policy_version 6260 (0.0007) +[2023-10-13 00:28:10,274][46662] Updated weights for policy 0, policy_version 6270 (0.0008) +[2023-10-13 00:28:12,564][46663] Updated weights for policy 1, policy_version 6281 (0.0009) +[2023-10-13 00:28:12,928][46663] Updated weights for policy 1, policy_version 6291 (0.0009) +[2023-10-13 00:28:13,303][46663] Updated weights for policy 1, policy_version 6301 (0.0008) +[2023-10-13 00:28:13,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 12877824. Throughput: 0: 1662.5, 1: 1689.8. Samples: 3221910. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:28:13,607][45375] Avg episode reward: [(0, '33.110'), (1, '32.490')] +[2023-10-13 00:28:14,310][46662] Updated weights for policy 0, policy_version 6280 (0.0009) +[2023-10-13 00:28:14,683][46662] Updated weights for policy 0, policy_version 6290 (0.0008) +[2023-10-13 00:28:15,056][46662] Updated weights for policy 0, policy_version 6300 (0.0009) +[2023-10-13 00:28:17,419][46663] Updated weights for policy 1, policy_version 6311 (0.0009) +[2023-10-13 00:28:17,802][46663] Updated weights for policy 1, policy_version 6321 (0.0008) +[2023-10-13 00:28:18,165][46663] Updated weights for policy 1, policy_version 6331 (0.0011) +[2023-10-13 00:28:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12943360. Throughput: 0: 1668.5, 1: 1688.4. Samples: 3242314. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:28:18,608][45375] Avg episode reward: [(0, '31.860'), (1, '33.880')] +[2023-10-13 00:28:19,178][46662] Updated weights for policy 0, policy_version 6310 (0.0008) +[2023-10-13 00:28:19,559][46662] Updated weights for policy 0, policy_version 6320 (0.0009) +[2023-10-13 00:28:19,924][46662] Updated weights for policy 0, policy_version 6330 (0.0009) +[2023-10-13 00:28:22,347][46663] Updated weights for policy 1, policy_version 6341 (0.0009) +[2023-10-13 00:28:22,717][46663] Updated weights for policy 1, policy_version 6351 (0.0011) +[2023-10-13 00:28:23,084][46663] Updated weights for policy 1, policy_version 6361 (0.0007) +[2023-10-13 00:28:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13008896. Throughput: 0: 1666.2, 1: 1664.4. Samples: 3261628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:28:23,607][45375] Avg episode reward: [(0, '31.150'), (1, '35.360')] +[2023-10-13 00:28:24,080][46662] Updated weights for policy 0, policy_version 6340 (0.0010) +[2023-10-13 00:28:24,451][46662] Updated weights for policy 0, policy_version 6350 (0.0008) +[2023-10-13 00:28:24,829][46662] Updated weights for policy 0, policy_version 6360 (0.0008) +[2023-10-13 00:28:27,078][46663] Updated weights for policy 1, policy_version 6371 (0.0010) +[2023-10-13 00:28:27,458][46663] Updated weights for policy 1, policy_version 6381 (0.0010) +[2023-10-13 00:28:27,822][46663] Updated weights for policy 1, policy_version 6391 (0.0010) +[2023-10-13 00:28:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13074432. Throughput: 0: 1664.8, 1: 1689.3. Samples: 3271898. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:28:28,608][45375] Avg episode reward: [(0, '30.460'), (1, '37.120')] +[2023-10-13 00:28:28,848][46662] Updated weights for policy 0, policy_version 6370 (0.0009) +[2023-10-13 00:28:29,227][46662] Updated weights for policy 0, policy_version 6380 (0.0010) +[2023-10-13 00:28:29,597][46662] Updated weights for policy 0, policy_version 6390 (0.0008) +[2023-10-13 00:28:29,963][46662] Updated weights for policy 0, policy_version 6400 (0.0007) +[2023-10-13 00:28:31,870][46663] Updated weights for policy 1, policy_version 6401 (0.0009) +[2023-10-13 00:28:32,240][46663] Updated weights for policy 1, policy_version 6411 (0.0009) +[2023-10-13 00:28:32,601][46663] Updated weights for policy 1, policy_version 6421 (0.0009) +[2023-10-13 00:28:32,971][46663] Updated weights for policy 1, policy_version 6431 (0.0009) +[2023-10-13 00:28:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13139968. Throughput: 0: 1668.3, 1: 1672.9. Samples: 3292012. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 00:28:33,607][45375] Avg episode reward: [(0, '29.040'), (1, '37.740')] +[2023-10-13 00:28:34,049][46662] Updated weights for policy 0, policy_version 6410 (0.0009) +[2023-10-13 00:28:34,420][46662] Updated weights for policy 0, policy_version 6420 (0.0009) +[2023-10-13 00:28:34,804][46662] Updated weights for policy 0, policy_version 6430 (0.0007) +[2023-10-13 00:28:37,091][46663] Updated weights for policy 1, policy_version 6441 (0.0007) +[2023-10-13 00:28:37,455][46663] Updated weights for policy 1, policy_version 6451 (0.0007) +[2023-10-13 00:28:37,822][46663] Updated weights for policy 1, policy_version 6461 (0.0008) +[2023-10-13 00:28:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13205504. Throughput: 0: 1672.9, 1: 1664.6. Samples: 3311998. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 00:28:38,607][45375] Avg episode reward: [(0, '28.120'), (1, '38.280')] +[2023-10-13 00:28:38,842][46662] Updated weights for policy 0, policy_version 6440 (0.0008) +[2023-10-13 00:28:39,208][46662] Updated weights for policy 0, policy_version 6450 (0.0008) +[2023-10-13 00:28:39,588][46662] Updated weights for policy 0, policy_version 6460 (0.0008) +[2023-10-13 00:28:41,870][46663] Updated weights for policy 1, policy_version 6471 (0.0008) +[2023-10-13 00:28:42,235][46663] Updated weights for policy 1, policy_version 6481 (0.0011) +[2023-10-13 00:28:42,602][46663] Updated weights for policy 1, policy_version 6491 (0.0009) +[2023-10-13 00:28:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13271040. Throughput: 0: 1675.3, 1: 1681.5. Samples: 3322534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:28:43,607][45375] Avg episode reward: [(0, '29.010'), (1, '39.510')] +[2023-10-13 00:28:43,608][46384] Saving new best policy, reward=39.510! +[2023-10-13 00:28:43,758][46662] Updated weights for policy 0, policy_version 6470 (0.0007) +[2023-10-13 00:28:44,128][46662] Updated weights for policy 0, policy_version 6480 (0.0008) +[2023-10-13 00:28:44,502][46662] Updated weights for policy 0, policy_version 6490 (0.0008) +[2023-10-13 00:28:46,693][46663] Updated weights for policy 1, policy_version 6501 (0.0010) +[2023-10-13 00:28:47,062][46663] Updated weights for policy 1, policy_version 6511 (0.0011) +[2023-10-13 00:28:47,435][46663] Updated weights for policy 1, policy_version 6521 (0.0008) +[2023-10-13 00:28:48,601][46662] Updated weights for policy 0, policy_version 6500 (0.0007) +[2023-10-13 00:28:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 13336576. Throughput: 0: 1681.9, 1: 1661.2. Samples: 3342518. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:28:48,607][45375] Avg episode reward: [(0, '29.250'), (1, '39.920')] +[2023-10-13 00:28:48,608][46384] Saving new best policy, reward=39.920! +[2023-10-13 00:28:48,973][46662] Updated weights for policy 0, policy_version 6510 (0.0010) +[2023-10-13 00:28:49,349][46662] Updated weights for policy 0, policy_version 6520 (0.0008) +[2023-10-13 00:28:51,443][46663] Updated weights for policy 1, policy_version 6531 (0.0009) +[2023-10-13 00:28:51,809][46663] Updated weights for policy 1, policy_version 6541 (0.0008) +[2023-10-13 00:28:52,181][46663] Updated weights for policy 1, policy_version 6551 (0.0007) +[2023-10-13 00:28:53,283][46662] Updated weights for policy 0, policy_version 6530 (0.0009) +[2023-10-13 00:28:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13402112. Throughput: 0: 1682.7, 1: 1675.1. Samples: 3362936. Policy #0 lag: (min: 0.0, avg: 28.5, max: 32.0) +[2023-10-13 00:28:53,607][45375] Avg episode reward: [(0, '29.640'), (1, '39.150')] +[2023-10-13 00:28:53,660][46662] Updated weights for policy 0, policy_version 6540 (0.0011) +[2023-10-13 00:28:54,027][46662] Updated weights for policy 0, policy_version 6550 (0.0009) +[2023-10-13 00:28:54,397][46662] Updated weights for policy 0, policy_version 6560 (0.0008) +[2023-10-13 00:28:56,221][46663] Updated weights for policy 1, policy_version 6561 (0.0008) +[2023-10-13 00:28:56,596][46663] Updated weights for policy 1, policy_version 6571 (0.0008) +[2023-10-13 00:28:56,963][46663] Updated weights for policy 1, policy_version 6581 (0.0008) +[2023-10-13 00:28:57,332][46663] Updated weights for policy 1, policy_version 6591 (0.0008) +[2023-10-13 00:28:58,360][46662] Updated weights for policy 0, policy_version 6570 (0.0009) +[2023-10-13 00:28:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13467648. Throughput: 0: 1682.2, 1: 1677.4. Samples: 3373092. 
Policy #0 lag: (min: 0.0, avg: 28.5, max: 32.0) +[2023-10-13 00:28:58,607][45375] Avg episode reward: [(0, '30.030'), (1, '40.540')] +[2023-10-13 00:28:58,608][46384] Saving new best policy, reward=40.540! +[2023-10-13 00:28:58,731][46662] Updated weights for policy 0, policy_version 6580 (0.0009) +[2023-10-13 00:28:59,103][46662] Updated weights for policy 0, policy_version 6590 (0.0011) +[2023-10-13 00:29:01,465][46663] Updated weights for policy 1, policy_version 6601 (0.0007) +[2023-10-13 00:29:01,825][46663] Updated weights for policy 1, policy_version 6611 (0.0009) +[2023-10-13 00:29:02,194][46663] Updated weights for policy 1, policy_version 6621 (0.0008) +[2023-10-13 00:29:03,253][46662] Updated weights for policy 0, policy_version 6600 (0.0009) +[2023-10-13 00:29:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13533184. Throughput: 0: 1682.3, 1: 1659.3. Samples: 3392686. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:29:03,608][45375] Avg episode reward: [(0, '31.140'), (1, '42.070')] +[2023-10-13 00:29:03,610][46384] Saving new best policy, reward=42.070! +[2023-10-13 00:29:03,629][46662] Updated weights for policy 0, policy_version 6610 (0.0007) +[2023-10-13 00:29:04,005][46662] Updated weights for policy 0, policy_version 6620 (0.0008) +[2023-10-13 00:29:06,348][46663] Updated weights for policy 1, policy_version 6631 (0.0009) +[2023-10-13 00:29:06,733][46663] Updated weights for policy 1, policy_version 6641 (0.0008) +[2023-10-13 00:29:07,107][46663] Updated weights for policy 1, policy_version 6651 (0.0009) +[2023-10-13 00:29:08,068][46662] Updated weights for policy 0, policy_version 6630 (0.0007) +[2023-10-13 00:29:08,437][46662] Updated weights for policy 0, policy_version 6640 (0.0008) +[2023-10-13 00:29:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13598720. Throughput: 0: 1685.6, 1: 1683.1. Samples: 3413220. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:29:08,607][45375] Avg episode reward: [(0, '32.150'), (1, '41.990')] +[2023-10-13 00:29:08,819][46662] Updated weights for policy 0, policy_version 6650 (0.0008) +[2023-10-13 00:29:11,205][46663] Updated weights for policy 1, policy_version 6661 (0.0008) +[2023-10-13 00:29:11,571][46663] Updated weights for policy 1, policy_version 6671 (0.0008) +[2023-10-13 00:29:11,944][46663] Updated weights for policy 1, policy_version 6681 (0.0010) +[2023-10-13 00:29:12,980][46662] Updated weights for policy 0, policy_version 6660 (0.0009) +[2023-10-13 00:29:13,362][46662] Updated weights for policy 0, policy_version 6670 (0.0009) +[2023-10-13 00:29:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13664256. Throughput: 0: 1686.1, 1: 1675.2. Samples: 3423158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:29:13,607][45375] Avg episode reward: [(0, '32.220'), (1, '40.900')] +[2023-10-13 00:29:13,724][46662] Updated weights for policy 0, policy_version 6680 (0.0007) +[2023-10-13 00:29:15,952][46663] Updated weights for policy 1, policy_version 6691 (0.0010) +[2023-10-13 00:29:16,332][46663] Updated weights for policy 1, policy_version 6701 (0.0010) +[2023-10-13 00:29:16,701][46663] Updated weights for policy 1, policy_version 6711 (0.0009) +[2023-10-13 00:29:17,542][46662] Updated weights for policy 0, policy_version 6690 (0.0008) +[2023-10-13 00:29:17,916][46662] Updated weights for policy 0, policy_version 6700 (0.0007) +[2023-10-13 00:29:18,293][46662] Updated weights for policy 0, policy_version 6710 (0.0010) +[2023-10-13 00:29:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13729792. Throughput: 0: 1688.3, 1: 1666.9. Samples: 3442996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:29:18,607][45375] Avg episode reward: [(0, '31.450'), (1, '38.850')] +[2023-10-13 00:29:18,662][46662] Updated weights for policy 0, policy_version 6720 (0.0009) +[2023-10-13 00:29:20,822][46663] Updated weights for policy 1, policy_version 6721 (0.0010) +[2023-10-13 00:29:21,187][46663] Updated weights for policy 1, policy_version 6731 (0.0007) +[2023-10-13 00:29:21,548][46663] Updated weights for policy 1, policy_version 6741 (0.0007) +[2023-10-13 00:29:21,927][46663] Updated weights for policy 1, policy_version 6751 (0.0009) +[2023-10-13 00:29:22,645][46662] Updated weights for policy 0, policy_version 6730 (0.0010) +[2023-10-13 00:29:23,016][46662] Updated weights for policy 0, policy_version 6740 (0.0009) +[2023-10-13 00:29:23,387][46662] Updated weights for policy 0, policy_version 6750 (0.0011) +[2023-10-13 00:29:23,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 13828096. Throughput: 0: 1677.2, 1: 1686.9. Samples: 3463384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:29:23,607][45375] Avg episode reward: [(0, '30.100'), (1, '38.380')] +[2023-10-13 00:29:25,906][46663] Updated weights for policy 1, policy_version 6761 (0.0007) +[2023-10-13 00:29:26,275][46663] Updated weights for policy 1, policy_version 6771 (0.0009) +[2023-10-13 00:29:26,646][46663] Updated weights for policy 1, policy_version 6781 (0.0007) +[2023-10-13 00:29:27,503][46662] Updated weights for policy 0, policy_version 6760 (0.0008) +[2023-10-13 00:29:27,879][46662] Updated weights for policy 0, policy_version 6770 (0.0007) +[2023-10-13 00:29:28,255][46662] Updated weights for policy 0, policy_version 6780 (0.0007) +[2023-10-13 00:29:28,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 13893632. Throughput: 0: 1688.7, 1: 1666.3. Samples: 3473506. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:29:28,607][45375] Avg episode reward: [(0, '29.860'), (1, '37.940')] +[2023-10-13 00:29:30,671][46663] Updated weights for policy 1, policy_version 6791 (0.0009) +[2023-10-13 00:29:31,045][46663] Updated weights for policy 1, policy_version 6801 (0.0007) +[2023-10-13 00:29:31,407][46663] Updated weights for policy 1, policy_version 6811 (0.0008) +[2023-10-13 00:29:32,438][46662] Updated weights for policy 0, policy_version 6790 (0.0008) +[2023-10-13 00:29:32,802][46662] Updated weights for policy 0, policy_version 6800 (0.0007) +[2023-10-13 00:29:33,174][46662] Updated weights for policy 0, policy_version 6810 (0.0007) +[2023-10-13 00:29:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 13959168. Throughput: 0: 1684.6, 1: 1675.6. Samples: 3493728. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:29:33,608][45375] Avg episode reward: [(0, '28.990'), (1, '38.170')] +[2023-10-13 00:29:35,469][46663] Updated weights for policy 1, policy_version 6821 (0.0008) +[2023-10-13 00:29:35,840][46663] Updated weights for policy 1, policy_version 6831 (0.0008) +[2023-10-13 00:29:36,212][46663] Updated weights for policy 1, policy_version 6841 (0.0008) +[2023-10-13 00:29:37,111][46662] Updated weights for policy 0, policy_version 6820 (0.0010) +[2023-10-13 00:29:37,490][46662] Updated weights for policy 0, policy_version 6830 (0.0009) +[2023-10-13 00:29:37,859][46662] Updated weights for policy 0, policy_version 6840 (0.0009) +[2023-10-13 00:29:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 14024704. Throughput: 0: 1667.5, 1: 1684.6. Samples: 3513778. 
Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) +[2023-10-13 00:29:38,607][45375] Avg episode reward: [(0, '28.240'), (1, '36.910')] +[2023-10-13 00:29:40,175][46663] Updated weights for policy 1, policy_version 6851 (0.0008) +[2023-10-13 00:29:40,533][46663] Updated weights for policy 1, policy_version 6861 (0.0009) +[2023-10-13 00:29:40,906][46663] Updated weights for policy 1, policy_version 6871 (0.0009) +[2023-10-13 00:29:42,013][46662] Updated weights for policy 0, policy_version 6850 (0.0008) +[2023-10-13 00:29:42,388][46662] Updated weights for policy 0, policy_version 6860 (0.0009) +[2023-10-13 00:29:42,753][46662] Updated weights for policy 0, policy_version 6870 (0.0007) +[2023-10-13 00:29:43,132][46662] Updated weights for policy 0, policy_version 6880 (0.0009) +[2023-10-13 00:29:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14090240. Throughput: 0: 1687.9, 1: 1661.6. Samples: 3523820. Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) +[2023-10-13 00:29:43,608][45375] Avg episode reward: [(0, '27.820'), (1, '37.430')] +[2023-10-13 00:29:45,047][46663] Updated weights for policy 1, policy_version 6881 (0.0008) +[2023-10-13 00:29:45,420][46663] Updated weights for policy 1, policy_version 6891 (0.0009) +[2023-10-13 00:29:45,795][46663] Updated weights for policy 1, policy_version 6901 (0.0009) +[2023-10-13 00:29:46,165][46663] Updated weights for policy 1, policy_version 6911 (0.0007) +[2023-10-13 00:29:47,195][46662] Updated weights for policy 0, policy_version 6890 (0.0008) +[2023-10-13 00:29:47,560][46662] Updated weights for policy 0, policy_version 6900 (0.0009) +[2023-10-13 00:29:47,932][46662] Updated weights for policy 0, policy_version 6910 (0.0009) +[2023-10-13 00:29:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14155776. Throughput: 0: 1683.3, 1: 1691.2. Samples: 3544540. 
Policy #0 lag: (min: 28.0, avg: 52.9, max: 56.0) +[2023-10-13 00:29:48,607][45375] Avg episode reward: [(0, '28.060'), (1, '37.210')] +[2023-10-13 00:29:50,254][46663] Updated weights for policy 1, policy_version 6921 (0.0009) +[2023-10-13 00:29:50,614][46663] Updated weights for policy 1, policy_version 6931 (0.0009) +[2023-10-13 00:29:50,988][46663] Updated weights for policy 1, policy_version 6941 (0.0009) +[2023-10-13 00:29:52,094][46662] Updated weights for policy 0, policy_version 6920 (0.0009) +[2023-10-13 00:29:52,457][46662] Updated weights for policy 0, policy_version 6930 (0.0011) +[2023-10-13 00:29:52,825][46662] Updated weights for policy 0, policy_version 6940 (0.0008) +[2023-10-13 00:29:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14221312. Throughput: 0: 1659.9, 1: 1692.6. Samples: 3564080. Policy #0 lag: (min: 28.0, avg: 52.9, max: 56.0) +[2023-10-13 00:29:53,608][45375] Avg episode reward: [(0, '29.340'), (1, '36.220')] +[2023-10-13 00:29:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000006944_7110656.pth... +[2023-10-13 00:29:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000006944_7110656.pth... 
+[2023-10-13 00:29:53,671][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000005376_5505024.pth +[2023-10-13 00:29:53,671][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000005376_5505024.pth +[2023-10-13 00:29:55,257][46663] Updated weights for policy 1, policy_version 6951 (0.0008) +[2023-10-13 00:29:55,624][46663] Updated weights for policy 1, policy_version 6961 (0.0009) +[2023-10-13 00:29:55,986][46663] Updated weights for policy 1, policy_version 6971 (0.0011) +[2023-10-13 00:29:56,749][46662] Updated weights for policy 0, policy_version 6950 (0.0008) +[2023-10-13 00:29:57,108][46662] Updated weights for policy 0, policy_version 6960 (0.0008) +[2023-10-13 00:29:57,484][46662] Updated weights for policy 0, policy_version 6970 (0.0007) +[2023-10-13 00:29:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14286848. Throughput: 0: 1682.3, 1: 1668.4. Samples: 3573942. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 00:29:58,607][45375] Avg episode reward: [(0, '30.090'), (1, '36.670')] +[2023-10-13 00:30:00,019][46663] Updated weights for policy 1, policy_version 6981 (0.0010) +[2023-10-13 00:30:00,394][46663] Updated weights for policy 1, policy_version 6991 (0.0008) +[2023-10-13 00:30:00,761][46663] Updated weights for policy 1, policy_version 7001 (0.0009) +[2023-10-13 00:30:01,579][46662] Updated weights for policy 0, policy_version 6980 (0.0008) +[2023-10-13 00:30:01,951][46662] Updated weights for policy 0, policy_version 6990 (0.0007) +[2023-10-13 00:30:02,322][46662] Updated weights for policy 0, policy_version 7000 (0.0007) +[2023-10-13 00:30:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 14352384. Throughput: 0: 1670.1, 1: 1690.3. Samples: 3594216. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 00:30:03,607][45375] Avg episode reward: [(0, '31.960'), (1, '37.910')] +[2023-10-13 00:30:04,766][46663] Updated weights for policy 1, policy_version 7011 (0.0008) +[2023-10-13 00:30:05,141][46663] Updated weights for policy 1, policy_version 7021 (0.0009) +[2023-10-13 00:30:05,508][46663] Updated weights for policy 1, policy_version 7031 (0.0008) +[2023-10-13 00:30:06,323][46662] Updated weights for policy 0, policy_version 7010 (0.0008) +[2023-10-13 00:30:06,705][46662] Updated weights for policy 0, policy_version 7020 (0.0007) +[2023-10-13 00:30:07,067][46662] Updated weights for policy 0, policy_version 7030 (0.0007) +[2023-10-13 00:30:07,440][46662] Updated weights for policy 0, policy_version 7040 (0.0008) +[2023-10-13 00:30:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 14417920. Throughput: 0: 1660.3, 1: 1688.3. Samples: 3614068. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:30:08,607][45375] Avg episode reward: [(0, '33.520'), (1, '36.940')] +[2023-10-13 00:30:09,597][46663] Updated weights for policy 1, policy_version 7041 (0.0008) +[2023-10-13 00:30:09,962][46663] Updated weights for policy 1, policy_version 7051 (0.0010) +[2023-10-13 00:30:10,336][46663] Updated weights for policy 1, policy_version 7061 (0.0008) +[2023-10-13 00:30:10,700][46663] Updated weights for policy 1, policy_version 7071 (0.0009) +[2023-10-13 00:30:11,503][46662] Updated weights for policy 0, policy_version 7050 (0.0009) +[2023-10-13 00:30:11,873][46662] Updated weights for policy 0, policy_version 7060 (0.0010) +[2023-10-13 00:30:12,250][46662] Updated weights for policy 0, policy_version 7070 (0.0008) +[2023-10-13 00:30:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14483456. Throughput: 0: 1680.7, 1: 1674.8. Samples: 3624504. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:30:13,607][45375] Avg episode reward: [(0, '33.120'), (1, '37.220')] +[2023-10-13 00:30:14,604][46663] Updated weights for policy 1, policy_version 7081 (0.0009) +[2023-10-13 00:30:14,966][46663] Updated weights for policy 1, policy_version 7091 (0.0008) +[2023-10-13 00:30:15,336][46663] Updated weights for policy 1, policy_version 7101 (0.0011) +[2023-10-13 00:30:16,264][46662] Updated weights for policy 0, policy_version 7080 (0.0008) +[2023-10-13 00:30:16,634][46662] Updated weights for policy 0, policy_version 7090 (0.0011) +[2023-10-13 00:30:17,011][46662] Updated weights for policy 0, policy_version 7100 (0.0007) +[2023-10-13 00:30:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14548992. Throughput: 0: 1666.0, 1: 1683.2. Samples: 3644442. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) +[2023-10-13 00:30:18,607][45375] Avg episode reward: [(0, '32.290'), (1, '37.990')] +[2023-10-13 00:30:19,489][46663] Updated weights for policy 1, policy_version 7111 (0.0008) +[2023-10-13 00:30:19,858][46663] Updated weights for policy 1, policy_version 7121 (0.0007) +[2023-10-13 00:30:20,228][46663] Updated weights for policy 1, policy_version 7131 (0.0007) +[2023-10-13 00:30:21,128][46662] Updated weights for policy 0, policy_version 7110 (0.0009) +[2023-10-13 00:30:21,502][46662] Updated weights for policy 0, policy_version 7120 (0.0007) +[2023-10-13 00:30:21,884][46662] Updated weights for policy 0, policy_version 7130 (0.0012) +[2023-10-13 00:30:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14614528. Throughput: 0: 1671.0, 1: 1678.4. Samples: 3664504. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) +[2023-10-13 00:30:23,608][45375] Avg episode reward: [(0, '32.980'), (1, '37.380')] +[2023-10-13 00:30:24,235][46663] Updated weights for policy 1, policy_version 7141 (0.0007) +[2023-10-13 00:30:24,609][46663] Updated weights for policy 1, policy_version 7151 (0.0008) +[2023-10-13 00:30:24,978][46663] Updated weights for policy 1, policy_version 7161 (0.0009) +[2023-10-13 00:30:25,910][46662] Updated weights for policy 0, policy_version 7140 (0.0008) +[2023-10-13 00:30:26,285][46662] Updated weights for policy 0, policy_version 7150 (0.0010) +[2023-10-13 00:30:26,655][46662] Updated weights for policy 0, policy_version 7160 (0.0009) +[2023-10-13 00:30:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14680064. Throughput: 0: 1683.4, 1: 1674.0. Samples: 3674900. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:30:28,607][45375] Avg episode reward: [(0, '33.670'), (1, '36.380')] +[2023-10-13 00:30:29,193][46663] Updated weights for policy 1, policy_version 7171 (0.0009) +[2023-10-13 00:30:29,556][46663] Updated weights for policy 1, policy_version 7181 (0.0007) +[2023-10-13 00:30:29,922][46663] Updated weights for policy 1, policy_version 7191 (0.0007) +[2023-10-13 00:30:30,968][46662] Updated weights for policy 0, policy_version 7170 (0.0008) +[2023-10-13 00:30:31,342][46662] Updated weights for policy 0, policy_version 7180 (0.0008) +[2023-10-13 00:30:31,713][46662] Updated weights for policy 0, policy_version 7190 (0.0007) +[2023-10-13 00:30:32,083][46662] Updated weights for policy 0, policy_version 7200 (0.0008) +[2023-10-13 00:30:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 14745600. Throughput: 0: 1661.4, 1: 1669.1. Samples: 3694414. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:30:33,608][45375] Avg episode reward: [(0, '32.840'), (1, '37.030')] +[2023-10-13 00:30:34,068][46663] Updated weights for policy 1, policy_version 7201 (0.0009) +[2023-10-13 00:30:34,440][46663] Updated weights for policy 1, policy_version 7211 (0.0008) +[2023-10-13 00:30:34,815][46663] Updated weights for policy 1, policy_version 7221 (0.0008) +[2023-10-13 00:30:35,179][46663] Updated weights for policy 1, policy_version 7231 (0.0008) +[2023-10-13 00:30:36,152][46662] Updated weights for policy 0, policy_version 7210 (0.0009) +[2023-10-13 00:30:36,518][46662] Updated weights for policy 0, policy_version 7220 (0.0007) +[2023-10-13 00:30:36,886][46662] Updated weights for policy 0, policy_version 7230 (0.0008) +[2023-10-13 00:30:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14811136. Throughput: 0: 1679.9, 1: 1672.1. Samples: 3714922. Policy #0 lag: (min: 29.0, avg: 33.2, max: 61.0) +[2023-10-13 00:30:38,607][45375] Avg episode reward: [(0, '32.720'), (1, '37.260')] +[2023-10-13 00:30:39,181][46663] Updated weights for policy 1, policy_version 7241 (0.0008) +[2023-10-13 00:30:39,558][46663] Updated weights for policy 1, policy_version 7251 (0.0007) +[2023-10-13 00:30:39,916][46663] Updated weights for policy 1, policy_version 7261 (0.0007) +[2023-10-13 00:30:40,955][46662] Updated weights for policy 0, policy_version 7240 (0.0009) +[2023-10-13 00:30:41,341][46662] Updated weights for policy 0, policy_version 7250 (0.0007) +[2023-10-13 00:30:41,707][46662] Updated weights for policy 0, policy_version 7260 (0.0008) +[2023-10-13 00:30:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 14876672. Throughput: 0: 1680.3, 1: 1678.2. Samples: 3725072. 
Policy #0 lag: (min: 29.0, avg: 33.2, max: 61.0) +[2023-10-13 00:30:43,608][45375] Avg episode reward: [(0, '33.380'), (1, '38.560')] +[2023-10-13 00:30:44,123][46663] Updated weights for policy 1, policy_version 7271 (0.0009) +[2023-10-13 00:30:44,502][46663] Updated weights for policy 1, policy_version 7281 (0.0007) +[2023-10-13 00:30:44,874][46663] Updated weights for policy 1, policy_version 7291 (0.0008) +[2023-10-13 00:30:45,790][46662] Updated weights for policy 0, policy_version 7270 (0.0008) +[2023-10-13 00:30:46,166][46662] Updated weights for policy 0, policy_version 7280 (0.0007) +[2023-10-13 00:30:46,532][46662] Updated weights for policy 0, policy_version 7290 (0.0008) +[2023-10-13 00:30:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14942208. Throughput: 0: 1661.6, 1: 1672.0. Samples: 3744228. Policy #0 lag: (min: 1.0, avg: 11.6, max: 33.0) +[2023-10-13 00:30:48,607][45375] Avg episode reward: [(0, '32.660'), (1, '37.150')] +[2023-10-13 00:30:49,014][46663] Updated weights for policy 1, policy_version 7301 (0.0009) +[2023-10-13 00:30:49,379][46663] Updated weights for policy 1, policy_version 7311 (0.0007) +[2023-10-13 00:30:49,749][46663] Updated weights for policy 1, policy_version 7321 (0.0007) +[2023-10-13 00:30:50,560][46662] Updated weights for policy 0, policy_version 7300 (0.0009) +[2023-10-13 00:30:50,931][46662] Updated weights for policy 0, policy_version 7310 (0.0010) +[2023-10-13 00:30:51,314][46662] Updated weights for policy 0, policy_version 7320 (0.0008) +[2023-10-13 00:30:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 15007744. Throughput: 0: 1682.5, 1: 1665.7. Samples: 3764738. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 33.0) +[2023-10-13 00:30:53,608][45375] Avg episode reward: [(0, '32.370'), (1, '37.050')] +[2023-10-13 00:30:53,923][46663] Updated weights for policy 1, policy_version 7331 (0.0007) +[2023-10-13 00:30:54,295][46663] Updated weights for policy 1, policy_version 7341 (0.0007) +[2023-10-13 00:30:54,665][46663] Updated weights for policy 1, policy_version 7351 (0.0008) +[2023-10-13 00:30:55,465][46662] Updated weights for policy 0, policy_version 7330 (0.0009) +[2023-10-13 00:30:55,831][46662] Updated weights for policy 0, policy_version 7340 (0.0008) +[2023-10-13 00:30:56,196][46662] Updated weights for policy 0, policy_version 7350 (0.0009) +[2023-10-13 00:30:56,573][46662] Updated weights for policy 0, policy_version 7360 (0.0008) +[2023-10-13 00:30:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15073280. Throughput: 0: 1670.1, 1: 1667.6. Samples: 3774704. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 00:30:58,607][45375] Avg episode reward: [(0, '34.420'), (1, '38.140')] +[2023-10-13 00:30:58,608][46091] Saving new best policy, reward=34.420! +[2023-10-13 00:30:58,734][46663] Updated weights for policy 1, policy_version 7361 (0.0010) +[2023-10-13 00:30:59,110][46663] Updated weights for policy 1, policy_version 7371 (0.0010) +[2023-10-13 00:30:59,481][46663] Updated weights for policy 1, policy_version 7381 (0.0008) +[2023-10-13 00:30:59,842][46663] Updated weights for policy 1, policy_version 7391 (0.0007) +[2023-10-13 00:31:00,555][46662] Updated weights for policy 0, policy_version 7370 (0.0008) +[2023-10-13 00:31:00,928][46662] Updated weights for policy 0, policy_version 7380 (0.0009) +[2023-10-13 00:31:01,305][46662] Updated weights for policy 0, policy_version 7390 (0.0009) +[2023-10-13 00:31:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15138816. Throughput: 0: 1668.2, 1: 1667.0. Samples: 3794526. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 00:31:03,607][45375] Avg episode reward: [(0, '33.980'), (1, '39.100')] +[2023-10-13 00:31:03,917][46663] Updated weights for policy 1, policy_version 7401 (0.0009) +[2023-10-13 00:31:04,294][46663] Updated weights for policy 1, policy_version 7411 (0.0007) +[2023-10-13 00:31:04,665][46663] Updated weights for policy 1, policy_version 7421 (0.0008) +[2023-10-13 00:31:05,417][46662] Updated weights for policy 0, policy_version 7400 (0.0009) +[2023-10-13 00:31:05,789][46662] Updated weights for policy 0, policy_version 7410 (0.0008) +[2023-10-13 00:31:06,165][46662] Updated weights for policy 0, policy_version 7420 (0.0009) +[2023-10-13 00:31:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15204352. Throughput: 0: 1681.4, 1: 1668.5. Samples: 3815246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:31:08,607][45375] Avg episode reward: [(0, '34.530'), (1, '38.930')] +[2023-10-13 00:31:08,615][46091] Saving new best policy, reward=34.530! +[2023-10-13 00:31:08,770][46663] Updated weights for policy 1, policy_version 7431 (0.0010) +[2023-10-13 00:31:09,137][46663] Updated weights for policy 1, policy_version 7441 (0.0009) +[2023-10-13 00:31:09,506][46663] Updated weights for policy 1, policy_version 7451 (0.0008) +[2023-10-13 00:31:10,126][46662] Updated weights for policy 0, policy_version 7430 (0.0009) +[2023-10-13 00:31:10,505][46662] Updated weights for policy 0, policy_version 7440 (0.0008) +[2023-10-13 00:31:10,881][46662] Updated weights for policy 0, policy_version 7450 (0.0007) +[2023-10-13 00:31:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15269888. Throughput: 0: 1663.3, 1: 1669.5. Samples: 3824878. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:31:13,608][45375] Avg episode reward: [(0, '36.030'), (1, '38.560')] +[2023-10-13 00:31:13,609][46091] Saving new best policy, reward=36.030! +[2023-10-13 00:31:13,658][46663] Updated weights for policy 1, policy_version 7461 (0.0009) +[2023-10-13 00:31:14,027][46663] Updated weights for policy 1, policy_version 7471 (0.0008) +[2023-10-13 00:31:14,405][46663] Updated weights for policy 1, policy_version 7481 (0.0007) +[2023-10-13 00:31:14,765][46662] Updated weights for policy 0, policy_version 7460 (0.0009) +[2023-10-13 00:31:15,143][46662] Updated weights for policy 0, policy_version 7470 (0.0009) +[2023-10-13 00:31:15,508][46662] Updated weights for policy 0, policy_version 7480 (0.0009) +[2023-10-13 00:31:18,338][46663] Updated weights for policy 1, policy_version 7491 (0.0007) +[2023-10-13 00:31:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15335424. Throughput: 0: 1684.2, 1: 1671.8. Samples: 3845432. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 00:31:18,607][45375] Avg episode reward: [(0, '35.830'), (1, '38.280')] +[2023-10-13 00:31:18,712][46663] Updated weights for policy 1, policy_version 7501 (0.0010) +[2023-10-13 00:31:19,077][46663] Updated weights for policy 1, policy_version 7511 (0.0009) +[2023-10-13 00:31:19,549][46662] Updated weights for policy 0, policy_version 7490 (0.0009) +[2023-10-13 00:31:19,922][46662] Updated weights for policy 0, policy_version 7500 (0.0008) +[2023-10-13 00:31:20,285][46662] Updated weights for policy 0, policy_version 7510 (0.0008) +[2023-10-13 00:31:20,653][46662] Updated weights for policy 0, policy_version 7520 (0.0007) +[2023-10-13 00:31:23,122][46663] Updated weights for policy 1, policy_version 7521 (0.0009) +[2023-10-13 00:31:23,494][46663] Updated weights for policy 1, policy_version 7531 (0.0007) +[2023-10-13 00:31:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 15400960. Throughput: 0: 1690.3, 1: 1663.1. Samples: 3865824. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 00:31:23,608][45375] Avg episode reward: [(0, '34.640'), (1, '36.400')] +[2023-10-13 00:31:23,869][46663] Updated weights for policy 1, policy_version 7541 (0.0008) +[2023-10-13 00:31:24,231][46663] Updated weights for policy 1, policy_version 7551 (0.0008) +[2023-10-13 00:31:24,756][46662] Updated weights for policy 0, policy_version 7530 (0.0008) +[2023-10-13 00:31:25,125][46662] Updated weights for policy 0, policy_version 7540 (0.0009) +[2023-10-13 00:31:25,503][46662] Updated weights for policy 0, policy_version 7550 (0.0010) +[2023-10-13 00:31:28,254][46663] Updated weights for policy 1, policy_version 7561 (0.0008) +[2023-10-13 00:31:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15466496. Throughput: 0: 1665.3, 1: 1668.8. Samples: 3875102. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:31:28,607][45375] Avg episode reward: [(0, '34.530'), (1, '36.800')] +[2023-10-13 00:31:28,623][46663] Updated weights for policy 1, policy_version 7571 (0.0009) +[2023-10-13 00:31:28,986][46663] Updated weights for policy 1, policy_version 7581 (0.0009) +[2023-10-13 00:31:29,539][46662] Updated weights for policy 0, policy_version 7560 (0.0010) +[2023-10-13 00:31:29,911][46662] Updated weights for policy 0, policy_version 7570 (0.0009) +[2023-10-13 00:31:30,284][46662] Updated weights for policy 0, policy_version 7580 (0.0010) +[2023-10-13 00:31:33,067][46663] Updated weights for policy 1, policy_version 7591 (0.0010) +[2023-10-13 00:31:33,436][46663] Updated weights for policy 1, policy_version 7601 (0.0011) +[2023-10-13 00:31:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15532032. Throughput: 0: 1693.9, 1: 1676.4. Samples: 3895894. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:31:33,607][45375] Avg episode reward: [(0, '35.740'), (1, '36.700')] +[2023-10-13 00:31:33,798][46663] Updated weights for policy 1, policy_version 7611 (0.0009) +[2023-10-13 00:31:34,228][46662] Updated weights for policy 0, policy_version 7590 (0.0008) +[2023-10-13 00:31:34,614][46662] Updated weights for policy 0, policy_version 7600 (0.0008) +[2023-10-13 00:31:34,982][46662] Updated weights for policy 0, policy_version 7610 (0.0008) +[2023-10-13 00:31:37,923][46663] Updated weights for policy 1, policy_version 7621 (0.0008) +[2023-10-13 00:31:38,279][46663] Updated weights for policy 1, policy_version 7631 (0.0010) +[2023-10-13 00:31:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15597568. Throughput: 0: 1693.7, 1: 1664.4. Samples: 3915852. 
Policy #0 lag: (min: 28.0, avg: 34.7, max: 60.0) +[2023-10-13 00:31:38,607][45375] Avg episode reward: [(0, '35.400'), (1, '37.340')] +[2023-10-13 00:31:38,643][46663] Updated weights for policy 1, policy_version 7641 (0.0008) +[2023-10-13 00:31:39,123][46662] Updated weights for policy 0, policy_version 7620 (0.0007) +[2023-10-13 00:31:39,495][46662] Updated weights for policy 0, policy_version 7630 (0.0007) +[2023-10-13 00:31:39,862][46662] Updated weights for policy 0, policy_version 7640 (0.0007) +[2023-10-13 00:31:42,748][46663] Updated weights for policy 1, policy_version 7651 (0.0010) +[2023-10-13 00:31:43,119][46663] Updated weights for policy 1, policy_version 7661 (0.0009) +[2023-10-13 00:31:43,489][46663] Updated weights for policy 1, policy_version 7671 (0.0010) +[2023-10-13 00:31:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15663104. Throughput: 0: 1672.8, 1: 1681.9. Samples: 3925662. Policy #0 lag: (min: 28.0, avg: 34.7, max: 60.0) +[2023-10-13 00:31:43,607][45375] Avg episode reward: [(0, '34.970'), (1, '37.710')] +[2023-10-13 00:31:43,842][46662] Updated weights for policy 0, policy_version 7650 (0.0009) +[2023-10-13 00:31:44,219][46662] Updated weights for policy 0, policy_version 7660 (0.0008) +[2023-10-13 00:31:44,594][46662] Updated weights for policy 0, policy_version 7670 (0.0009) +[2023-10-13 00:31:44,961][46662] Updated weights for policy 0, policy_version 7680 (0.0007) +[2023-10-13 00:31:47,542][46663] Updated weights for policy 1, policy_version 7681 (0.0008) +[2023-10-13 00:31:47,898][46663] Updated weights for policy 1, policy_version 7691 (0.0010) +[2023-10-13 00:31:48,277][46663] Updated weights for policy 1, policy_version 7701 (0.0009) +[2023-10-13 00:31:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15728640. Throughput: 0: 1695.2, 1: 1683.1. Samples: 3946548. 
Policy #0 lag: (min: 25.0, avg: 32.4, max: 57.0) +[2023-10-13 00:31:48,607][45375] Avg episode reward: [(0, '35.610'), (1, '36.530')] +[2023-10-13 00:31:48,640][46663] Updated weights for policy 1, policy_version 7711 (0.0007) +[2023-10-13 00:31:49,177][46662] Updated weights for policy 0, policy_version 7690 (0.0008) +[2023-10-13 00:31:49,553][46662] Updated weights for policy 0, policy_version 7700 (0.0009) +[2023-10-13 00:31:49,921][46662] Updated weights for policy 0, policy_version 7710 (0.0008) +[2023-10-13 00:31:52,751][46663] Updated weights for policy 1, policy_version 7721 (0.0007) +[2023-10-13 00:31:53,121][46663] Updated weights for policy 1, policy_version 7731 (0.0009) +[2023-10-13 00:31:53,490][46663] Updated weights for policy 1, policy_version 7741 (0.0007) +[2023-10-13 00:31:53,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 15826944. Throughput: 0: 1692.7, 1: 1662.0. Samples: 3966210. Policy #0 lag: (min: 25.0, avg: 32.4, max: 57.0) +[2023-10-13 00:31:53,607][45375] Avg episode reward: [(0, '35.130'), (1, '35.800')] +[2023-10-13 00:31:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000007712_7897088.pth... +[2023-10-13 00:31:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000007744_7929856.pth... 
+[2023-10-13 00:31:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000006144_6291456.pth +[2023-10-13 00:31:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000006176_6324224.pth +[2023-10-13 00:31:53,660][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000007744_7929856.pth +[2023-10-13 00:31:53,662][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000007712_7897088.pth +[2023-10-13 00:31:54,129][46662] Updated weights for policy 0, policy_version 7720 (0.0010) +[2023-10-13 00:31:54,506][46662] Updated weights for policy 0, policy_version 7730 (0.0010) +[2023-10-13 00:31:54,878][46662] Updated weights for policy 0, policy_version 7740 (0.0007) +[2023-10-13 00:31:57,390][46663] Updated weights for policy 1, policy_version 7751 (0.0010) +[2023-10-13 00:31:57,750][46663] Updated weights for policy 1, policy_version 7761 (0.0009) +[2023-10-13 00:31:58,122][46663] Updated weights for policy 1, policy_version 7771 (0.0010) +[2023-10-13 00:31:58,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15892480. Throughput: 0: 1678.6, 1: 1686.0. Samples: 3976286. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-13 00:31:58,608][45375] Avg episode reward: [(0, '34.210'), (1, '36.680')] +[2023-10-13 00:31:58,839][46662] Updated weights for policy 0, policy_version 7750 (0.0010) +[2023-10-13 00:31:59,201][46662] Updated weights for policy 0, policy_version 7760 (0.0010) +[2023-10-13 00:31:59,591][46662] Updated weights for policy 0, policy_version 7770 (0.0008) +[2023-10-13 00:32:02,204][46663] Updated weights for policy 1, policy_version 7781 (0.0008) +[2023-10-13 00:32:02,567][46663] Updated weights for policy 1, policy_version 7791 (0.0009) +[2023-10-13 00:32:02,942][46663] Updated weights for policy 1, policy_version 7801 (0.0007) +[2023-10-13 00:32:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15958016. Throughput: 0: 1687.2, 1: 1679.0. Samples: 3996910. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-13 00:32:03,607][45375] Avg episode reward: [(0, '35.100'), (1, '38.040')] +[2023-10-13 00:32:03,634][46662] Updated weights for policy 0, policy_version 7780 (0.0008) +[2023-10-13 00:32:04,003][46662] Updated weights for policy 0, policy_version 7790 (0.0008) +[2023-10-13 00:32:04,381][46662] Updated weights for policy 0, policy_version 7800 (0.0009) +[2023-10-13 00:32:06,953][46663] Updated weights for policy 1, policy_version 7811 (0.0007) +[2023-10-13 00:32:07,316][46663] Updated weights for policy 1, policy_version 7821 (0.0009) +[2023-10-13 00:32:07,684][46663] Updated weights for policy 1, policy_version 7831 (0.0008) +[2023-10-13 00:32:08,383][46662] Updated weights for policy 0, policy_version 7810 (0.0008) +[2023-10-13 00:32:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16023552. Throughput: 0: 1688.5, 1: 1667.3. Samples: 4016834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:32:08,607][45375] Avg episode reward: [(0, '34.000'), (1, '38.340')] +[2023-10-13 00:32:08,751][46662] Updated weights for policy 0, policy_version 7820 (0.0010) +[2023-10-13 00:32:09,130][46662] Updated weights for policy 0, policy_version 7830 (0.0010) +[2023-10-13 00:32:09,496][46662] Updated weights for policy 0, policy_version 7840 (0.0010) +[2023-10-13 00:32:11,773][46663] Updated weights for policy 1, policy_version 7841 (0.0008) +[2023-10-13 00:32:12,143][46663] Updated weights for policy 1, policy_version 7851 (0.0007) +[2023-10-13 00:32:12,508][46663] Updated weights for policy 1, policy_version 7861 (0.0008) +[2023-10-13 00:32:12,878][46663] Updated weights for policy 1, policy_version 7871 (0.0009) +[2023-10-13 00:32:13,495][46662] Updated weights for policy 0, policy_version 7850 (0.0008) +[2023-10-13 00:32:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16089088. Throughput: 0: 1692.2, 1: 1689.5. Samples: 4027280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:32:13,608][45375] Avg episode reward: [(0, '34.300'), (1, '39.420')] +[2023-10-13 00:32:13,859][46662] Updated weights for policy 0, policy_version 7860 (0.0008) +[2023-10-13 00:32:14,239][46662] Updated weights for policy 0, policy_version 7870 (0.0008) +[2023-10-13 00:32:16,903][46663] Updated weights for policy 1, policy_version 7881 (0.0009) +[2023-10-13 00:32:17,283][46663] Updated weights for policy 1, policy_version 7891 (0.0010) +[2023-10-13 00:32:17,657][46663] Updated weights for policy 1, policy_version 7901 (0.0008) +[2023-10-13 00:32:18,322][46662] Updated weights for policy 0, policy_version 7880 (0.0008) +[2023-10-13 00:32:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16154624. Throughput: 0: 1694.8, 1: 1672.2. Samples: 4047408. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:32:18,607][45375] Avg episode reward: [(0, '33.760'), (1, '40.720')] +[2023-10-13 00:32:18,696][46662] Updated weights for policy 0, policy_version 7890 (0.0008) +[2023-10-13 00:32:19,062][46662] Updated weights for policy 0, policy_version 7900 (0.0009) +[2023-10-13 00:32:21,739][46663] Updated weights for policy 1, policy_version 7911 (0.0009) +[2023-10-13 00:32:22,120][46663] Updated weights for policy 1, policy_version 7921 (0.0007) +[2023-10-13 00:32:22,485][46663] Updated weights for policy 1, policy_version 7931 (0.0008) +[2023-10-13 00:32:23,112][46662] Updated weights for policy 0, policy_version 7910 (0.0009) +[2023-10-13 00:32:23,489][46662] Updated weights for policy 0, policy_version 7920 (0.0009) +[2023-10-13 00:32:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16220160. Throughput: 0: 1694.0, 1: 1681.3. Samples: 4067742. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:32:23,607][45375] Avg episode reward: [(0, '33.400'), (1, '41.160')] +[2023-10-13 00:32:23,865][46662] Updated weights for policy 0, policy_version 7930 (0.0007) +[2023-10-13 00:32:26,343][46663] Updated weights for policy 1, policy_version 7941 (0.0009) +[2023-10-13 00:32:26,719][46663] Updated weights for policy 1, policy_version 7951 (0.0008) +[2023-10-13 00:32:27,088][46663] Updated weights for policy 1, policy_version 7961 (0.0007) +[2023-10-13 00:32:28,000][46662] Updated weights for policy 0, policy_version 7940 (0.0007) +[2023-10-13 00:32:28,372][46662] Updated weights for policy 0, policy_version 7950 (0.0007) +[2023-10-13 00:32:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16285696. Throughput: 0: 1689.9, 1: 1686.9. Samples: 4077618. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:32:28,607][45375] Avg episode reward: [(0, '32.930'), (1, '40.200')] +[2023-10-13 00:32:28,748][46662] Updated weights for policy 0, policy_version 7960 (0.0008) +[2023-10-13 00:32:31,184][46663] Updated weights for policy 1, policy_version 7971 (0.0008) +[2023-10-13 00:32:31,558][46663] Updated weights for policy 1, policy_version 7981 (0.0008) +[2023-10-13 00:32:31,925][46663] Updated weights for policy 1, policy_version 7991 (0.0007) +[2023-10-13 00:32:32,696][46662] Updated weights for policy 0, policy_version 7970 (0.0010) +[2023-10-13 00:32:33,059][46662] Updated weights for policy 0, policy_version 7980 (0.0009) +[2023-10-13 00:32:33,437][46662] Updated weights for policy 0, policy_version 7990 (0.0010) +[2023-10-13 00:32:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16351232. Throughput: 0: 1687.6, 1: 1663.1. Samples: 4097330. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:32:33,607][45375] Avg episode reward: [(0, '32.270'), (1, '38.370')] +[2023-10-13 00:32:33,806][46662] Updated weights for policy 0, policy_version 8000 (0.0010) +[2023-10-13 00:32:35,905][46663] Updated weights for policy 1, policy_version 8001 (0.0008) +[2023-10-13 00:32:36,285][46663] Updated weights for policy 1, policy_version 8011 (0.0009) +[2023-10-13 00:32:36,652][46663] Updated weights for policy 1, policy_version 8021 (0.0007) +[2023-10-13 00:32:37,026][46663] Updated weights for policy 1, policy_version 8031 (0.0007) +[2023-10-13 00:32:37,833][46662] Updated weights for policy 0, policy_version 8010 (0.0008) +[2023-10-13 00:32:38,210][46662] Updated weights for policy 0, policy_version 8020 (0.0007) +[2023-10-13 00:32:38,589][46662] Updated weights for policy 0, policy_version 8030 (0.0007) +[2023-10-13 00:32:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16416768. Throughput: 0: 1681.5, 1: 1688.0. 
Samples: 4117838. Policy #0 lag: (min: 1.0, avg: 13.2, max: 33.0) +[2023-10-13 00:32:38,607][45375] Avg episode reward: [(0, '31.720'), (1, '40.260')] +[2023-10-13 00:32:41,232][46663] Updated weights for policy 1, policy_version 8041 (0.0007) +[2023-10-13 00:32:41,596][46663] Updated weights for policy 1, policy_version 8051 (0.0007) +[2023-10-13 00:32:41,968][46663] Updated weights for policy 1, policy_version 8061 (0.0007) +[2023-10-13 00:32:42,679][46662] Updated weights for policy 0, policy_version 8040 (0.0009) +[2023-10-13 00:32:43,053][46662] Updated weights for policy 0, policy_version 8050 (0.0007) +[2023-10-13 00:32:43,428][46662] Updated weights for policy 0, policy_version 8060 (0.0008) +[2023-10-13 00:32:43,607][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 16515072. Throughput: 0: 1685.8, 1: 1679.9. Samples: 4127742. Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-13 00:32:43,607][45375] Avg episode reward: [(0, '32.110'), (1, '39.040')] +[2023-10-13 00:32:46,031][46663] Updated weights for policy 1, policy_version 8071 (0.0009) +[2023-10-13 00:32:46,405][46663] Updated weights for policy 1, policy_version 8081 (0.0008) +[2023-10-13 00:32:46,782][46663] Updated weights for policy 1, policy_version 8091 (0.0008) +[2023-10-13 00:32:47,382][46662] Updated weights for policy 0, policy_version 8070 (0.0010) +[2023-10-13 00:32:47,755][46662] Updated weights for policy 0, policy_version 8080 (0.0009) +[2023-10-13 00:32:48,119][46662] Updated weights for policy 0, policy_version 8090 (0.0007) +[2023-10-13 00:32:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 16580608. Throughput: 0: 1683.9, 1: 1667.8. Samples: 4147734. 
Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-13 00:32:48,607][45375] Avg episode reward: [(0, '30.950'), (1, '38.390')] +[2023-10-13 00:32:50,889][46663] Updated weights for policy 1, policy_version 8101 (0.0009) +[2023-10-13 00:32:51,249][46663] Updated weights for policy 1, policy_version 8111 (0.0011) +[2023-10-13 00:32:51,616][46663] Updated weights for policy 1, policy_version 8121 (0.0009) +[2023-10-13 00:32:52,263][46662] Updated weights for policy 0, policy_version 8100 (0.0010) +[2023-10-13 00:32:52,626][46662] Updated weights for policy 0, policy_version 8110 (0.0008) +[2023-10-13 00:32:52,996][46662] Updated weights for policy 0, policy_version 8120 (0.0008) +[2023-10-13 00:32:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16646144. Throughput: 0: 1665.8, 1: 1687.3. Samples: 4167724. Policy #0 lag: (min: 2.0, avg: 4.8, max: 34.0) +[2023-10-13 00:32:53,607][45375] Avg episode reward: [(0, '31.620'), (1, '38.180')] +[2023-10-13 00:32:55,613][46663] Updated weights for policy 1, policy_version 8131 (0.0009) +[2023-10-13 00:32:55,976][46663] Updated weights for policy 1, policy_version 8141 (0.0009) +[2023-10-13 00:32:56,343][46663] Updated weights for policy 1, policy_version 8151 (0.0008) +[2023-10-13 00:32:57,202][46662] Updated weights for policy 0, policy_version 8130 (0.0010) +[2023-10-13 00:32:57,578][46662] Updated weights for policy 0, policy_version 8140 (0.0008) +[2023-10-13 00:32:57,955][46662] Updated weights for policy 0, policy_version 8150 (0.0011) +[2023-10-13 00:32:58,326][46662] Updated weights for policy 0, policy_version 8160 (0.0008) +[2023-10-13 00:32:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 16711680. Throughput: 0: 1680.8, 1: 1667.7. Samples: 4177960. 
Policy #0 lag: (min: 2.0, avg: 4.8, max: 34.0) +[2023-10-13 00:32:58,607][45375] Avg episode reward: [(0, '32.520'), (1, '37.810')] +[2023-10-13 00:33:00,633][46663] Updated weights for policy 1, policy_version 8161 (0.0010) +[2023-10-13 00:33:00,993][46663] Updated weights for policy 1, policy_version 8171 (0.0010) +[2023-10-13 00:33:01,373][46663] Updated weights for policy 1, policy_version 8181 (0.0010) +[2023-10-13 00:33:01,743][46663] Updated weights for policy 1, policy_version 8191 (0.0007) +[2023-10-13 00:33:02,479][46662] Updated weights for policy 0, policy_version 8170 (0.0009) +[2023-10-13 00:33:02,860][46662] Updated weights for policy 0, policy_version 8180 (0.0009) +[2023-10-13 00:33:03,242][46662] Updated weights for policy 0, policy_version 8190 (0.0007) +[2023-10-13 00:33:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16777216. Throughput: 0: 1678.4, 1: 1674.9. Samples: 4198308. Policy #0 lag: (min: 15.0, avg: 17.8, max: 47.0) +[2023-10-13 00:33:03,608][45375] Avg episode reward: [(0, '34.070'), (1, '35.920')] +[2023-10-13 00:33:05,865][46663] Updated weights for policy 1, policy_version 8201 (0.0007) +[2023-10-13 00:33:06,229][46663] Updated weights for policy 1, policy_version 8211 (0.0008) +[2023-10-13 00:33:06,591][46663] Updated weights for policy 1, policy_version 8221 (0.0007) +[2023-10-13 00:33:07,402][46662] Updated weights for policy 0, policy_version 8200 (0.0008) +[2023-10-13 00:33:07,764][46662] Updated weights for policy 0, policy_version 8210 (0.0008) +[2023-10-13 00:33:08,151][46662] Updated weights for policy 0, policy_version 8220 (0.0009) +[2023-10-13 00:33:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16842752. Throughput: 0: 1659.2, 1: 1685.8. Samples: 4218266. 
Policy #0 lag: (min: 15.0, avg: 17.8, max: 47.0) +[2023-10-13 00:33:08,607][45375] Avg episode reward: [(0, '34.530'), (1, '34.710')] +[2023-10-13 00:33:10,798][46663] Updated weights for policy 1, policy_version 8231 (0.0008) +[2023-10-13 00:33:11,185][46663] Updated weights for policy 1, policy_version 8241 (0.0010) +[2023-10-13 00:33:11,558][46663] Updated weights for policy 1, policy_version 8251 (0.0009) +[2023-10-13 00:33:12,026][46662] Updated weights for policy 0, policy_version 8230 (0.0009) +[2023-10-13 00:33:12,392][46662] Updated weights for policy 0, policy_version 8240 (0.0007) +[2023-10-13 00:33:12,758][46662] Updated weights for policy 0, policy_version 8250 (0.0008) +[2023-10-13 00:33:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16908288. Throughput: 0: 1681.9, 1: 1667.6. Samples: 4228346. Policy #0 lag: (min: 30.0, avg: 46.0, max: 62.0) +[2023-10-13 00:33:13,608][45375] Avg episode reward: [(0, '34.610'), (1, '35.860')] +[2023-10-13 00:33:15,503][46663] Updated weights for policy 1, policy_version 8261 (0.0009) +[2023-10-13 00:33:15,875][46663] Updated weights for policy 1, policy_version 8271 (0.0008) +[2023-10-13 00:33:16,240][46663] Updated weights for policy 1, policy_version 8281 (0.0007) +[2023-10-13 00:33:16,751][46662] Updated weights for policy 0, policy_version 8260 (0.0008) +[2023-10-13 00:33:17,134][46662] Updated weights for policy 0, policy_version 8270 (0.0008) +[2023-10-13 00:33:17,499][46662] Updated weights for policy 0, policy_version 8280 (0.0009) +[2023-10-13 00:33:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16973824. Throughput: 0: 1676.4, 1: 1685.1. Samples: 4248598. 
Policy #0 lag: (min: 30.0, avg: 46.0, max: 62.0) +[2023-10-13 00:33:18,607][45375] Avg episode reward: [(0, '35.370'), (1, '36.950')] +[2023-10-13 00:33:20,328][46663] Updated weights for policy 1, policy_version 8291 (0.0009) +[2023-10-13 00:33:20,696][46663] Updated weights for policy 1, policy_version 8301 (0.0008) +[2023-10-13 00:33:21,063][46663] Updated weights for policy 1, policy_version 8311 (0.0008) +[2023-10-13 00:33:21,644][46662] Updated weights for policy 0, policy_version 8290 (0.0010) +[2023-10-13 00:33:22,024][46662] Updated weights for policy 0, policy_version 8300 (0.0010) +[2023-10-13 00:33:22,388][46662] Updated weights for policy 0, policy_version 8310 (0.0011) +[2023-10-13 00:33:22,764][46662] Updated weights for policy 0, policy_version 8320 (0.0010) +[2023-10-13 00:33:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17039360. Throughput: 0: 1661.7, 1: 1679.4. Samples: 4268190. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:33:23,608][45375] Avg episode reward: [(0, '35.590'), (1, '36.930')] +[2023-10-13 00:33:25,009][46663] Updated weights for policy 1, policy_version 8321 (0.0008) +[2023-10-13 00:33:25,375][46663] Updated weights for policy 1, policy_version 8331 (0.0011) +[2023-10-13 00:33:25,748][46663] Updated weights for policy 1, policy_version 8341 (0.0010) +[2023-10-13 00:33:26,125][46663] Updated weights for policy 1, policy_version 8351 (0.0007) +[2023-10-13 00:33:26,879][46662] Updated weights for policy 0, policy_version 8330 (0.0008) +[2023-10-13 00:33:27,246][46662] Updated weights for policy 0, policy_version 8340 (0.0008) +[2023-10-13 00:33:27,611][46662] Updated weights for policy 0, policy_version 8350 (0.0009) +[2023-10-13 00:33:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17104896. Throughput: 0: 1685.5, 1: 1662.8. Samples: 4278414. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:33:28,607][45375] Avg episode reward: [(0, '36.030'), (1, '37.450')] +[2023-10-13 00:33:30,190][46663] Updated weights for policy 1, policy_version 8361 (0.0007) +[2023-10-13 00:33:30,561][46663] Updated weights for policy 1, policy_version 8371 (0.0008) +[2023-10-13 00:33:30,931][46663] Updated weights for policy 1, policy_version 8381 (0.0009) +[2023-10-13 00:33:31,750][46662] Updated weights for policy 0, policy_version 8360 (0.0008) +[2023-10-13 00:33:32,128][46662] Updated weights for policy 0, policy_version 8370 (0.0008) +[2023-10-13 00:33:32,501][46662] Updated weights for policy 0, policy_version 8380 (0.0008) +[2023-10-13 00:33:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17170432. Throughput: 0: 1676.5, 1: 1678.7. Samples: 4298718. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:33:33,607][45375] Avg episode reward: [(0, '36.650'), (1, '37.450')] +[2023-10-13 00:33:33,608][46091] Saving new best policy, reward=36.650! +[2023-10-13 00:33:35,114][46663] Updated weights for policy 1, policy_version 8391 (0.0008) +[2023-10-13 00:33:35,489][46663] Updated weights for policy 1, policy_version 8401 (0.0009) +[2023-10-13 00:33:35,850][46663] Updated weights for policy 1, policy_version 8411 (0.0012) +[2023-10-13 00:33:36,612][46662] Updated weights for policy 0, policy_version 8390 (0.0008) +[2023-10-13 00:33:36,985][46662] Updated weights for policy 0, policy_version 8400 (0.0007) +[2023-10-13 00:33:37,360][46662] Updated weights for policy 0, policy_version 8410 (0.0008) +[2023-10-13 00:33:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17235968. Throughput: 0: 1667.6, 1: 1679.4. Samples: 4318338. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:33:38,607][45375] Avg episode reward: [(0, '36.730'), (1, '36.940')] +[2023-10-13 00:33:38,615][46091] Saving new best policy, reward=36.730! +[2023-10-13 00:33:40,042][46663] Updated weights for policy 1, policy_version 8421 (0.0009) +[2023-10-13 00:33:40,414][46663] Updated weights for policy 1, policy_version 8431 (0.0007) +[2023-10-13 00:33:40,788][46663] Updated weights for policy 1, policy_version 8441 (0.0009) +[2023-10-13 00:33:41,217][46662] Updated weights for policy 0, policy_version 8420 (0.0009) +[2023-10-13 00:33:41,583][46662] Updated weights for policy 0, policy_version 8430 (0.0008) +[2023-10-13 00:33:41,947][46662] Updated weights for policy 0, policy_version 8440 (0.0007) +[2023-10-13 00:33:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17301504. Throughput: 0: 1682.9, 1: 1667.7. Samples: 4328736. Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-13 00:33:43,607][45375] Avg episode reward: [(0, '36.570'), (1, '36.580')] +[2023-10-13 00:33:44,735][46663] Updated weights for policy 1, policy_version 8451 (0.0009) +[2023-10-13 00:33:45,113][46663] Updated weights for policy 1, policy_version 8461 (0.0008) +[2023-10-13 00:33:45,474][46663] Updated weights for policy 1, policy_version 8471 (0.0009) +[2023-10-13 00:33:46,158][46662] Updated weights for policy 0, policy_version 8450 (0.0008) +[2023-10-13 00:33:46,533][46662] Updated weights for policy 0, policy_version 8460 (0.0008) +[2023-10-13 00:33:46,896][46662] Updated weights for policy 0, policy_version 8470 (0.0010) +[2023-10-13 00:33:47,265][46662] Updated weights for policy 0, policy_version 8480 (0.0007) +[2023-10-13 00:33:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17367040. Throughput: 0: 1664.0, 1: 1676.4. Samples: 4348626. 
Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-13 00:33:48,608][45375] Avg episode reward: [(0, '36.990'), (1, '36.330')] +[2023-10-13 00:33:48,609][46091] Saving new best policy, reward=36.990! +[2023-10-13 00:33:49,529][46663] Updated weights for policy 1, policy_version 8481 (0.0008) +[2023-10-13 00:33:49,895][46663] Updated weights for policy 1, policy_version 8491 (0.0007) +[2023-10-13 00:33:50,258][46663] Updated weights for policy 1, policy_version 8501 (0.0009) +[2023-10-13 00:33:50,629][46663] Updated weights for policy 1, policy_version 8511 (0.0011) +[2023-10-13 00:33:51,393][46662] Updated weights for policy 0, policy_version 8490 (0.0010) +[2023-10-13 00:33:51,778][46662] Updated weights for policy 0, policy_version 8500 (0.0008) +[2023-10-13 00:33:52,157][46662] Updated weights for policy 0, policy_version 8510 (0.0008) +[2023-10-13 00:33:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17432576. Throughput: 0: 1668.5, 1: 1674.2. Samples: 4368688. Policy #0 lag: (min: 26.0, avg: 28.7, max: 54.0) +[2023-10-13 00:33:53,608][45375] Avg episode reward: [(0, '36.440'), (1, '36.640')] +[2023-10-13 00:33:53,622][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000008512_8716288.pth... +[2023-10-13 00:33:53,622][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000008512_8716288.pth... 
+[2023-10-13 00:33:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000006944_7110656.pth +[2023-10-13 00:33:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000006944_7110656.pth +[2023-10-13 00:33:54,872][46663] Updated weights for policy 1, policy_version 8521 (0.0009) +[2023-10-13 00:33:55,249][46663] Updated weights for policy 1, policy_version 8531 (0.0009) +[2023-10-13 00:33:55,618][46663] Updated weights for policy 1, policy_version 8541 (0.0007) +[2023-10-13 00:33:56,021][46662] Updated weights for policy 0, policy_version 8520 (0.0008) +[2023-10-13 00:33:56,401][46662] Updated weights for policy 0, policy_version 8530 (0.0007) +[2023-10-13 00:33:56,773][46662] Updated weights for policy 0, policy_version 8540 (0.0008) +[2023-10-13 00:33:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17498112. Throughput: 0: 1679.9, 1: 1667.5. Samples: 4378980. Policy #0 lag: (min: 26.0, avg: 28.7, max: 54.0) +[2023-10-13 00:33:58,607][45375] Avg episode reward: [(0, '37.130'), (1, '36.500')] +[2023-10-13 00:33:58,608][46091] Saving new best policy, reward=37.130! +[2023-10-13 00:33:59,706][46663] Updated weights for policy 1, policy_version 8551 (0.0007) +[2023-10-13 00:34:00,083][46663] Updated weights for policy 1, policy_version 8561 (0.0010) +[2023-10-13 00:34:00,448][46663] Updated weights for policy 1, policy_version 8571 (0.0011) +[2023-10-13 00:34:00,783][46662] Updated weights for policy 0, policy_version 8550 (0.0009) +[2023-10-13 00:34:01,158][46662] Updated weights for policy 0, policy_version 8560 (0.0007) +[2023-10-13 00:34:01,533][46662] Updated weights for policy 0, policy_version 8570 (0.0007) +[2023-10-13 00:34:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17563648. Throughput: 0: 1658.6, 1: 1670.8. Samples: 4398422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:34:03,607][45375] Avg episode reward: [(0, '36.410'), (1, '36.220')] +[2023-10-13 00:34:04,655][46663] Updated weights for policy 1, policy_version 8581 (0.0009) +[2023-10-13 00:34:05,057][46663] Updated weights for policy 1, policy_version 8591 (0.0010) +[2023-10-13 00:34:05,419][46663] Updated weights for policy 1, policy_version 8601 (0.0008) +[2023-10-13 00:34:05,556][46662] Updated weights for policy 0, policy_version 8580 (0.0007) +[2023-10-13 00:34:05,954][46662] Updated weights for policy 0, policy_version 8590 (0.0007) +[2023-10-13 00:34:06,324][46662] Updated weights for policy 0, policy_version 8600 (0.0007) +[2023-10-13 00:34:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17629184. Throughput: 0: 1681.5, 1: 1667.3. Samples: 4418886. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:34:08,607][45375] Avg episode reward: [(0, '36.390'), (1, '36.490')] +[2023-10-13 00:34:09,528][46663] Updated weights for policy 1, policy_version 8611 (0.0008) +[2023-10-13 00:34:09,907][46663] Updated weights for policy 1, policy_version 8621 (0.0010) +[2023-10-13 00:34:10,271][46663] Updated weights for policy 1, policy_version 8631 (0.0008) +[2023-10-13 00:34:10,401][46662] Updated weights for policy 0, policy_version 8610 (0.0007) +[2023-10-13 00:34:10,771][46662] Updated weights for policy 0, policy_version 8620 (0.0007) +[2023-10-13 00:34:11,143][46662] Updated weights for policy 0, policy_version 8630 (0.0008) +[2023-10-13 00:34:11,514][46662] Updated weights for policy 0, policy_version 8640 (0.0007) +[2023-10-13 00:34:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17694720. Throughput: 0: 1672.1, 1: 1666.4. Samples: 4428646. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-13 00:34:13,608][45375] Avg episode reward: [(0, '36.300'), (1, '36.670')] +[2023-10-13 00:34:14,456][46663] Updated weights for policy 1, policy_version 8641 (0.0008) +[2023-10-13 00:34:14,815][46663] Updated weights for policy 1, policy_version 8651 (0.0007) +[2023-10-13 00:34:15,177][46663] Updated weights for policy 1, policy_version 8661 (0.0010) +[2023-10-13 00:34:15,550][46663] Updated weights for policy 1, policy_version 8671 (0.0008) +[2023-10-13 00:34:15,651][46662] Updated weights for policy 0, policy_version 8650 (0.0007) +[2023-10-13 00:34:16,017][46662] Updated weights for policy 0, policy_version 8660 (0.0009) +[2023-10-13 00:34:16,399][46662] Updated weights for policy 0, policy_version 8670 (0.0011) +[2023-10-13 00:34:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 17760256. Throughput: 0: 1660.4, 1: 1665.9. Samples: 4448402. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-13 00:34:18,607][45375] Avg episode reward: [(0, '37.560'), (1, '35.350')] +[2023-10-13 00:34:18,608][46091] Saving new best policy, reward=37.560! +[2023-10-13 00:34:19,540][46663] Updated weights for policy 1, policy_version 8681 (0.0009) +[2023-10-13 00:34:19,917][46663] Updated weights for policy 1, policy_version 8691 (0.0010) +[2023-10-13 00:34:20,285][46663] Updated weights for policy 1, policy_version 8701 (0.0010) +[2023-10-13 00:34:20,440][46662] Updated weights for policy 0, policy_version 8680 (0.0008) +[2023-10-13 00:34:20,812][46662] Updated weights for policy 0, policy_version 8690 (0.0009) +[2023-10-13 00:34:21,184][46662] Updated weights for policy 0, policy_version 8700 (0.0009) +[2023-10-13 00:34:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 17825792. Throughput: 0: 1681.2, 1: 1669.2. Samples: 4469106. 
Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-13 00:34:23,608][45375] Avg episode reward: [(0, '36.880'), (1, '36.260')] +[2023-10-13 00:34:24,090][46663] Updated weights for policy 1, policy_version 8711 (0.0009) +[2023-10-13 00:34:24,459][46663] Updated weights for policy 1, policy_version 8721 (0.0008) +[2023-10-13 00:34:24,823][46663] Updated weights for policy 1, policy_version 8731 (0.0008) +[2023-10-13 00:34:25,225][46662] Updated weights for policy 0, policy_version 8710 (0.0009) +[2023-10-13 00:34:25,595][46662] Updated weights for policy 0, policy_version 8720 (0.0008) +[2023-10-13 00:34:25,975][46662] Updated weights for policy 0, policy_version 8730 (0.0008) +[2023-10-13 00:34:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 17891328. Throughput: 0: 1661.7, 1: 1671.6. Samples: 4478736. Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-13 00:34:28,607][45375] Avg episode reward: [(0, '37.730'), (1, '35.980')] +[2023-10-13 00:34:28,608][46091] Saving new best policy, reward=37.730! +[2023-10-13 00:34:29,069][46663] Updated weights for policy 1, policy_version 8741 (0.0009) +[2023-10-13 00:34:29,444][46663] Updated weights for policy 1, policy_version 8751 (0.0008) +[2023-10-13 00:34:29,805][46663] Updated weights for policy 1, policy_version 8761 (0.0009) +[2023-10-13 00:34:30,061][46662] Updated weights for policy 0, policy_version 8740 (0.0007) +[2023-10-13 00:34:30,433][46662] Updated weights for policy 0, policy_version 8750 (0.0007) +[2023-10-13 00:34:30,807][46662] Updated weights for policy 0, policy_version 8760 (0.0009) +[2023-10-13 00:34:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 17956864. Throughput: 0: 1668.2, 1: 1666.9. Samples: 4498706. 
Policy #0 lag: (min: 9.0, avg: 15.5, max: 41.0) +[2023-10-13 00:34:33,607][45375] Avg episode reward: [(0, '37.450'), (1, '37.280')] +[2023-10-13 00:34:33,856][46663] Updated weights for policy 1, policy_version 8771 (0.0009) +[2023-10-13 00:34:34,231][46663] Updated weights for policy 1, policy_version 8781 (0.0007) +[2023-10-13 00:34:34,593][46663] Updated weights for policy 1, policy_version 8791 (0.0009) +[2023-10-13 00:34:34,630][46662] Updated weights for policy 0, policy_version 8770 (0.0007) +[2023-10-13 00:34:35,004][46662] Updated weights for policy 0, policy_version 8780 (0.0009) +[2023-10-13 00:34:35,371][46662] Updated weights for policy 0, policy_version 8790 (0.0007) +[2023-10-13 00:34:35,748][46662] Updated weights for policy 0, policy_version 8800 (0.0009) +[2023-10-13 00:34:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18022400. Throughput: 0: 1690.4, 1: 1666.6. Samples: 4519754. Policy #0 lag: (min: 9.0, avg: 15.5, max: 41.0) +[2023-10-13 00:34:38,607][45375] Avg episode reward: [(0, '38.180'), (1, '38.250')] +[2023-10-13 00:34:38,615][46091] Saving new best policy, reward=38.180! +[2023-10-13 00:34:38,655][46663] Updated weights for policy 1, policy_version 8801 (0.0008) +[2023-10-13 00:34:39,032][46663] Updated weights for policy 1, policy_version 8811 (0.0009) +[2023-10-13 00:34:39,398][46663] Updated weights for policy 1, policy_version 8821 (0.0008) +[2023-10-13 00:34:39,776][46663] Updated weights for policy 1, policy_version 8831 (0.0009) +[2023-10-13 00:34:39,923][46662] Updated weights for policy 0, policy_version 8810 (0.0007) +[2023-10-13 00:34:40,288][46662] Updated weights for policy 0, policy_version 8820 (0.0007) +[2023-10-13 00:34:40,663][46662] Updated weights for policy 0, policy_version 8830 (0.0007) +[2023-10-13 00:34:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18087936. Throughput: 0: 1663.0, 1: 1667.0. Samples: 4528830. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:34:43,607][45375] Avg episode reward: [(0, '38.070'), (1, '38.130')] +[2023-10-13 00:34:44,024][46663] Updated weights for policy 1, policy_version 8841 (0.0010) +[2023-10-13 00:34:44,388][46663] Updated weights for policy 1, policy_version 8851 (0.0008) +[2023-10-13 00:34:44,757][46663] Updated weights for policy 1, policy_version 8861 (0.0007) +[2023-10-13 00:34:44,765][46662] Updated weights for policy 0, policy_version 8840 (0.0008) +[2023-10-13 00:34:45,129][46662] Updated weights for policy 0, policy_version 8850 (0.0009) +[2023-10-13 00:34:45,514][46662] Updated weights for policy 0, policy_version 8860 (0.0008) +[2023-10-13 00:34:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18153472. Throughput: 0: 1690.0, 1: 1667.8. Samples: 4549524. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:34:48,607][45375] Avg episode reward: [(0, '39.200'), (1, '39.400')] +[2023-10-13 00:34:48,608][46091] Saving new best policy, reward=39.200! +[2023-10-13 00:34:48,818][46663] Updated weights for policy 1, policy_version 8871 (0.0010) +[2023-10-13 00:34:49,194][46663] Updated weights for policy 1, policy_version 8881 (0.0010) +[2023-10-13 00:34:49,330][46662] Updated weights for policy 0, policy_version 8870 (0.0009) +[2023-10-13 00:34:49,569][46663] Updated weights for policy 1, policy_version 8891 (0.0008) +[2023-10-13 00:34:49,708][46662] Updated weights for policy 0, policy_version 8880 (0.0010) +[2023-10-13 00:34:50,079][46662] Updated weights for policy 0, policy_version 8890 (0.0008) +[2023-10-13 00:34:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 18219008. Throughput: 0: 1686.3, 1: 1673.0. Samples: 4570054. 
Policy #0 lag: (min: 17.0, avg: 26.2, max: 49.0) +[2023-10-13 00:34:53,608][45375] Avg episode reward: [(0, '38.770'), (1, '38.360')] +[2023-10-13 00:34:53,801][46663] Updated weights for policy 1, policy_version 8901 (0.0008) +[2023-10-13 00:34:54,193][46663] Updated weights for policy 1, policy_version 8911 (0.0007) +[2023-10-13 00:34:54,237][46662] Updated weights for policy 0, policy_version 8900 (0.0007) +[2023-10-13 00:34:54,564][46663] Updated weights for policy 1, policy_version 8921 (0.0009) +[2023-10-13 00:34:54,640][46662] Updated weights for policy 0, policy_version 8910 (0.0009) +[2023-10-13 00:34:55,011][46662] Updated weights for policy 0, policy_version 8920 (0.0007) +[2023-10-13 00:34:58,588][46663] Updated weights for policy 1, policy_version 8931 (0.0008) +[2023-10-13 00:34:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18284544. Throughput: 0: 1667.1, 1: 1676.2. Samples: 4579094. Policy #0 lag: (min: 17.0, avg: 26.2, max: 49.0) +[2023-10-13 00:34:58,607][45375] Avg episode reward: [(0, '40.040'), (1, '37.320')] +[2023-10-13 00:34:58,608][46091] Saving new best policy, reward=40.040! +[2023-10-13 00:34:58,955][46663] Updated weights for policy 1, policy_version 8941 (0.0009) +[2023-10-13 00:34:59,124][46662] Updated weights for policy 0, policy_version 8930 (0.0007) +[2023-10-13 00:34:59,325][46663] Updated weights for policy 1, policy_version 8951 (0.0008) +[2023-10-13 00:34:59,481][46662] Updated weights for policy 0, policy_version 8940 (0.0007) +[2023-10-13 00:34:59,848][46662] Updated weights for policy 0, policy_version 8950 (0.0007) +[2023-10-13 00:35:00,226][46662] Updated weights for policy 0, policy_version 8960 (0.0011) +[2023-10-13 00:35:03,462][46663] Updated weights for policy 1, policy_version 8961 (0.0007) +[2023-10-13 00:35:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 18350080. Throughput: 0: 1683.4, 1: 1679.2. Samples: 4599720. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) +[2023-10-13 00:35:03,608][45375] Avg episode reward: [(0, '41.060'), (1, '38.250')] +[2023-10-13 00:35:03,609][46091] Saving new best policy, reward=41.060! +[2023-10-13 00:35:03,835][46663] Updated weights for policy 1, policy_version 8971 (0.0010) +[2023-10-13 00:35:04,207][46663] Updated weights for policy 1, policy_version 8981 (0.0008) +[2023-10-13 00:35:04,254][46662] Updated weights for policy 0, policy_version 8970 (0.0008) +[2023-10-13 00:35:04,565][46663] Updated weights for policy 1, policy_version 8991 (0.0007) +[2023-10-13 00:35:04,627][46662] Updated weights for policy 0, policy_version 8980 (0.0007) +[2023-10-13 00:35:04,998][46662] Updated weights for policy 0, policy_version 8990 (0.0007) +[2023-10-13 00:35:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 18415616. Throughput: 0: 1688.2, 1: 1670.6. Samples: 4620252. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) +[2023-10-13 00:35:08,608][45375] Avg episode reward: [(0, '39.550'), (1, '38.760')] +[2023-10-13 00:35:08,661][46663] Updated weights for policy 1, policy_version 9001 (0.0007) +[2023-10-13 00:35:09,038][46663] Updated weights for policy 1, policy_version 9011 (0.0009) +[2023-10-13 00:35:09,142][46662] Updated weights for policy 0, policy_version 9000 (0.0008) +[2023-10-13 00:35:09,401][46663] Updated weights for policy 1, policy_version 9021 (0.0008) +[2023-10-13 00:35:09,515][46662] Updated weights for policy 0, policy_version 9010 (0.0009) +[2023-10-13 00:35:09,885][46662] Updated weights for policy 0, policy_version 9020 (0.0008) +[2023-10-13 00:35:13,513][46663] Updated weights for policy 1, policy_version 9031 (0.0007) +[2023-10-13 00:35:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18481152. Throughput: 0: 1675.3, 1: 1670.8. Samples: 4629312. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 00:35:13,607][45375] Avg episode reward: [(0, '39.350'), (1, '38.490')] +[2023-10-13 00:35:13,873][46663] Updated weights for policy 1, policy_version 9041 (0.0007) +[2023-10-13 00:35:13,931][46662] Updated weights for policy 0, policy_version 9030 (0.0009) +[2023-10-13 00:35:14,236][46663] Updated weights for policy 1, policy_version 9051 (0.0008) +[2023-10-13 00:35:14,303][46662] Updated weights for policy 0, policy_version 9040 (0.0008) +[2023-10-13 00:35:14,671][46662] Updated weights for policy 0, policy_version 9050 (0.0007) +[2023-10-13 00:35:18,411][46663] Updated weights for policy 1, policy_version 9061 (0.0007) +[2023-10-13 00:35:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18546688. Throughput: 0: 1687.5, 1: 1672.4. Samples: 4649900. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 00:35:18,607][45375] Avg episode reward: [(0, '38.880'), (1, '38.800')] +[2023-10-13 00:35:18,661][46662] Updated weights for policy 0, policy_version 9060 (0.0009) +[2023-10-13 00:35:18,771][46663] Updated weights for policy 1, policy_version 9071 (0.0008) +[2023-10-13 00:35:19,028][46662] Updated weights for policy 0, policy_version 9070 (0.0009) +[2023-10-13 00:35:19,137][46663] Updated weights for policy 1, policy_version 9081 (0.0008) +[2023-10-13 00:35:19,401][46662] Updated weights for policy 0, policy_version 9080 (0.0007) +[2023-10-13 00:35:23,326][46663] Updated weights for policy 1, policy_version 9091 (0.0007) +[2023-10-13 00:35:23,517][46662] Updated weights for policy 0, policy_version 9090 (0.0007) +[2023-10-13 00:35:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 18612224. Throughput: 0: 1674.7, 1: 1664.6. Samples: 4670022. 
Policy #0 lag: (min: 26.0, avg: 30.9, max: 58.0) +[2023-10-13 00:35:23,608][45375] Avg episode reward: [(0, '38.410'), (1, '38.820')] +[2023-10-13 00:35:23,688][46663] Updated weights for policy 1, policy_version 9101 (0.0007) +[2023-10-13 00:35:23,890][46662] Updated weights for policy 0, policy_version 9100 (0.0009) +[2023-10-13 00:35:24,049][46663] Updated weights for policy 1, policy_version 9111 (0.0007) +[2023-10-13 00:35:24,246][46662] Updated weights for policy 0, policy_version 9110 (0.0007) +[2023-10-13 00:35:24,611][46662] Updated weights for policy 0, policy_version 9120 (0.0010) +[2023-10-13 00:35:28,251][46663] Updated weights for policy 1, policy_version 9121 (0.0009) +[2023-10-13 00:35:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18677760. Throughput: 0: 1672.1, 1: 1668.4. Samples: 4679152. Policy #0 lag: (min: 26.0, avg: 30.9, max: 58.0) +[2023-10-13 00:35:28,607][45375] Avg episode reward: [(0, '38.430'), (1, '38.190')] +[2023-10-13 00:35:28,614][46663] Updated weights for policy 1, policy_version 9131 (0.0009) +[2023-10-13 00:35:28,790][46662] Updated weights for policy 0, policy_version 9130 (0.0009) +[2023-10-13 00:35:28,983][46663] Updated weights for policy 1, policy_version 9141 (0.0008) +[2023-10-13 00:35:29,164][46662] Updated weights for policy 0, policy_version 9140 (0.0007) +[2023-10-13 00:35:29,357][46663] Updated weights for policy 1, policy_version 9151 (0.0008) +[2023-10-13 00:35:29,541][46662] Updated weights for policy 0, policy_version 9150 (0.0007) +[2023-10-13 00:35:33,378][46663] Updated weights for policy 1, policy_version 9161 (0.0009) +[2023-10-13 00:35:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18743296. Throughput: 0: 1675.4, 1: 1665.3. Samples: 4699856. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-13 00:35:33,607][45375] Avg episode reward: [(0, '38.470'), (1, '38.950')] +[2023-10-13 00:35:33,673][46662] Updated weights for policy 0, policy_version 9160 (0.0009) +[2023-10-13 00:35:33,743][46663] Updated weights for policy 1, policy_version 9171 (0.0007) +[2023-10-13 00:35:34,037][46662] Updated weights for policy 0, policy_version 9170 (0.0009) +[2023-10-13 00:35:34,114][46663] Updated weights for policy 1, policy_version 9181 (0.0007) +[2023-10-13 00:35:34,406][46662] Updated weights for policy 0, policy_version 9180 (0.0008) +[2023-10-13 00:35:38,332][46663] Updated weights for policy 1, policy_version 9191 (0.0008) +[2023-10-13 00:35:38,559][46662] Updated weights for policy 0, policy_version 9190 (0.0010) +[2023-10-13 00:35:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18808832. Throughput: 0: 1678.2, 1: 1652.5. Samples: 4719936. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-13 00:35:38,607][45375] Avg episode reward: [(0, '38.590'), (1, '38.300')] +[2023-10-13 00:35:38,697][46663] Updated weights for policy 1, policy_version 9201 (0.0009) +[2023-10-13 00:35:38,923][46662] Updated weights for policy 0, policy_version 9200 (0.0010) +[2023-10-13 00:35:39,079][46663] Updated weights for policy 1, policy_version 9211 (0.0007) +[2023-10-13 00:35:39,300][46662] Updated weights for policy 0, policy_version 9210 (0.0007) +[2023-10-13 00:35:43,268][46663] Updated weights for policy 1, policy_version 9221 (0.0008) +[2023-10-13 00:35:43,475][46662] Updated weights for policy 0, policy_version 9220 (0.0007) +[2023-10-13 00:35:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18874368. Throughput: 0: 1681.1, 1: 1656.7. Samples: 4729292. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-13 00:35:43,607][45375] Avg episode reward: [(0, '37.260'), (1, '38.840')] +[2023-10-13 00:35:43,660][46663] Updated weights for policy 1, policy_version 9231 (0.0008) +[2023-10-13 00:35:43,876][46662] Updated weights for policy 0, policy_version 9230 (0.0007) +[2023-10-13 00:35:44,028][46663] Updated weights for policy 1, policy_version 9241 (0.0009) +[2023-10-13 00:35:44,248][46662] Updated weights for policy 0, policy_version 9240 (0.0009) +[2023-10-13 00:35:48,166][46663] Updated weights for policy 1, policy_version 9251 (0.0007) +[2023-10-13 00:35:48,216][46662] Updated weights for policy 0, policy_version 9250 (0.0010) +[2023-10-13 00:35:48,542][46663] Updated weights for policy 1, policy_version 9261 (0.0008) +[2023-10-13 00:35:48,582][46662] Updated weights for policy 0, policy_version 9260 (0.0010) +[2023-10-13 00:35:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18939904. Throughput: 0: 1681.0, 1: 1646.1. Samples: 4749438. Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-13 00:35:48,607][45375] Avg episode reward: [(0, '36.290'), (1, '38.990')] +[2023-10-13 00:35:48,906][46663] Updated weights for policy 1, policy_version 9271 (0.0008) +[2023-10-13 00:35:48,954][46662] Updated weights for policy 0, policy_version 9270 (0.0008) +[2023-10-13 00:35:49,323][46662] Updated weights for policy 0, policy_version 9280 (0.0008) +[2023-10-13 00:35:53,145][46663] Updated weights for policy 1, policy_version 9281 (0.0010) +[2023-10-13 00:35:53,375][46662] Updated weights for policy 0, policy_version 9290 (0.0007) +[2023-10-13 00:35:53,520][46663] Updated weights for policy 1, policy_version 9291 (0.0009) +[2023-10-13 00:35:53,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 19005440. Throughput: 0: 1681.8, 1: 1636.0. Samples: 4769556. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:35:53,608][45375] Avg episode reward: [(0, '35.960'), (1, '40.480')] +[2023-10-13 00:35:53,743][46662] Updated weights for policy 0, policy_version 9300 (0.0010) +[2023-10-13 00:35:53,881][46663] Updated weights for policy 1, policy_version 9301 (0.0008) +[2023-10-13 00:35:54,116][46662] Updated weights for policy 0, policy_version 9310 (0.0008) +[2023-10-13 00:35:54,185][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000009312_9535488.pth... +[2023-10-13 00:35:54,224][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000007712_7897088.pth +[2023-10-13 00:35:54,251][46663] Updated weights for policy 1, policy_version 9311 (0.0007) +[2023-10-13 00:35:54,280][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000009312_9535488.pth... +[2023-10-13 00:35:54,309][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000007744_7929856.pth +[2023-10-13 00:35:58,174][46662] Updated weights for policy 0, policy_version 9320 (0.0008) +[2023-10-13 00:35:58,431][46663] Updated weights for policy 1, policy_version 9321 (0.0008) +[2023-10-13 00:35:58,545][46662] Updated weights for policy 0, policy_version 9330 (0.0007) +[2023-10-13 00:35:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19070976. Throughput: 0: 1680.5, 1: 1640.6. Samples: 4778762. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:35:58,607][45375] Avg episode reward: [(0, '36.110'), (1, '39.010')] +[2023-10-13 00:35:58,794][46663] Updated weights for policy 1, policy_version 9331 (0.0009) +[2023-10-13 00:35:58,919][46662] Updated weights for policy 0, policy_version 9340 (0.0008) +[2023-10-13 00:35:59,170][46663] Updated weights for policy 1, policy_version 9341 (0.0008) +[2023-10-13 00:36:02,744][46662] Updated weights for policy 0, policy_version 9350 (0.0008) +[2023-10-13 00:36:03,113][46662] Updated weights for policy 0, policy_version 9360 (0.0007) +[2023-10-13 00:36:03,364][46663] Updated weights for policy 1, policy_version 9351 (0.0009) +[2023-10-13 00:36:03,480][46662] Updated weights for policy 0, policy_version 9370 (0.0008) +[2023-10-13 00:36:03,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19136512. Throughput: 0: 1686.9, 1: 1641.4. Samples: 4799674. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 00:36:03,607][45375] Avg episode reward: [(0, '37.830'), (1, '39.410')] +[2023-10-13 00:36:03,735][46663] Updated weights for policy 1, policy_version 9361 (0.0008) +[2023-10-13 00:36:04,102][46663] Updated weights for policy 1, policy_version 9371 (0.0008) +[2023-10-13 00:36:07,528][46662] Updated weights for policy 0, policy_version 9380 (0.0008) +[2023-10-13 00:36:07,891][46662] Updated weights for policy 0, policy_version 9390 (0.0011) +[2023-10-13 00:36:08,207][46663] Updated weights for policy 1, policy_version 9381 (0.0007) +[2023-10-13 00:36:08,268][46662] Updated weights for policy 0, policy_version 9400 (0.0009) +[2023-10-13 00:36:08,573][46663] Updated weights for policy 1, policy_version 9391 (0.0008) +[2023-10-13 00:36:08,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19234816. Throughput: 0: 1684.8, 1: 1640.1. Samples: 4819642. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:36:08,608][45375] Avg episode reward: [(0, '38.860'), (1, '39.970')] +[2023-10-13 00:36:08,940][46663] Updated weights for policy 1, policy_version 9401 (0.0007) +[2023-10-13 00:36:12,286][46662] Updated weights for policy 0, policy_version 9410 (0.0007) +[2023-10-13 00:36:12,658][46662] Updated weights for policy 0, policy_version 9420 (0.0009) +[2023-10-13 00:36:12,733][46663] Updated weights for policy 1, policy_version 9411 (0.0007) +[2023-10-13 00:36:13,015][46662] Updated weights for policy 0, policy_version 9430 (0.0008) +[2023-10-13 00:36:13,091][46663] Updated weights for policy 1, policy_version 9421 (0.0007) +[2023-10-13 00:36:13,401][46662] Updated weights for policy 0, policy_version 9440 (0.0009) +[2023-10-13 00:36:13,465][46663] Updated weights for policy 1, policy_version 9431 (0.0007) +[2023-10-13 00:36:13,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19300352. Throughput: 0: 1696.8, 1: 1648.2. Samples: 4829678. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:36:13,607][45375] Avg episode reward: [(0, '39.130'), (1, '40.160')] +[2023-10-13 00:36:17,634][46662] Updated weights for policy 0, policy_version 9450 (0.0010) +[2023-10-13 00:36:17,763][46663] Updated weights for policy 1, policy_version 9441 (0.0007) +[2023-10-13 00:36:18,000][46662] Updated weights for policy 0, policy_version 9460 (0.0008) +[2023-10-13 00:36:18,132][46663] Updated weights for policy 1, policy_version 9451 (0.0008) +[2023-10-13 00:36:18,373][46662] Updated weights for policy 0, policy_version 9470 (0.0009) +[2023-10-13 00:36:18,499][46663] Updated weights for policy 1, policy_version 9461 (0.0007) +[2023-10-13 00:36:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19365888. Throughput: 0: 1691.2, 1: 1655.5. Samples: 4850454. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:36:18,607][45375] Avg episode reward: [(0, '40.090'), (1, '41.240')] +[2023-10-13 00:36:18,873][46663] Updated weights for policy 1, policy_version 9471 (0.0007) +[2023-10-13 00:36:22,499][46662] Updated weights for policy 0, policy_version 9480 (0.0009) +[2023-10-13 00:36:22,851][46663] Updated weights for policy 1, policy_version 9481 (0.0007) +[2023-10-13 00:36:22,867][46662] Updated weights for policy 0, policy_version 9490 (0.0009) +[2023-10-13 00:36:23,209][46663] Updated weights for policy 1, policy_version 9491 (0.0009) +[2023-10-13 00:36:23,234][46662] Updated weights for policy 0, policy_version 9500 (0.0007) +[2023-10-13 00:36:23,582][46663] Updated weights for policy 1, policy_version 9501 (0.0009) +[2023-10-13 00:36:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19431424. Throughput: 0: 1672.7, 1: 1651.1. Samples: 4869508. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:36:23,608][45375] Avg episode reward: [(0, '40.950'), (1, '40.920')] +[2023-10-13 00:36:27,310][46662] Updated weights for policy 0, policy_version 9510 (0.0008) +[2023-10-13 00:36:27,682][46662] Updated weights for policy 0, policy_version 9520 (0.0008) +[2023-10-13 00:36:27,695][46663] Updated weights for policy 1, policy_version 9511 (0.0009) +[2023-10-13 00:36:28,052][46662] Updated weights for policy 0, policy_version 9530 (0.0010) +[2023-10-13 00:36:28,093][46663] Updated weights for policy 1, policy_version 9521 (0.0009) +[2023-10-13 00:36:28,472][46663] Updated weights for policy 1, policy_version 9531 (0.0010) +[2023-10-13 00:36:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19496960. Throughput: 0: 1684.6, 1: 1666.1. Samples: 4880074. 
Policy #0 lag: (min: 9.0, avg: 13.9, max: 41.0) +[2023-10-13 00:36:28,607][45375] Avg episode reward: [(0, '40.660'), (1, '40.750')] +[2023-10-13 00:36:32,255][46662] Updated weights for policy 0, policy_version 9540 (0.0010) +[2023-10-13 00:36:32,547][46663] Updated weights for policy 1, policy_version 9541 (0.0008) +[2023-10-13 00:36:32,644][46662] Updated weights for policy 0, policy_version 9550 (0.0008) +[2023-10-13 00:36:32,914][46663] Updated weights for policy 1, policy_version 9551 (0.0008) +[2023-10-13 00:36:33,018][46662] Updated weights for policy 0, policy_version 9560 (0.0008) +[2023-10-13 00:36:33,284][46663] Updated weights for policy 1, policy_version 9561 (0.0008) +[2023-10-13 00:36:33,607][45375] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 19595264. Throughput: 0: 1683.7, 1: 1676.8. Samples: 4900662. Policy #0 lag: (min: 9.0, avg: 13.9, max: 41.0) +[2023-10-13 00:36:33,608][45375] Avg episode reward: [(0, '40.760'), (1, '41.110')] +[2023-10-13 00:36:37,133][46662] Updated weights for policy 0, policy_version 9570 (0.0010) +[2023-10-13 00:36:37,367][46663] Updated weights for policy 1, policy_version 9571 (0.0009) +[2023-10-13 00:36:37,495][46662] Updated weights for policy 0, policy_version 9580 (0.0008) +[2023-10-13 00:36:37,736][46663] Updated weights for policy 1, policy_version 9581 (0.0007) +[2023-10-13 00:36:37,865][46662] Updated weights for policy 0, policy_version 9590 (0.0007) +[2023-10-13 00:36:38,100][46663] Updated weights for policy 1, policy_version 9591 (0.0008) +[2023-10-13 00:36:38,238][46662] Updated weights for policy 0, policy_version 9600 (0.0007) +[2023-10-13 00:36:38,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 19660800. Throughput: 0: 1661.5, 1: 1663.9. Samples: 4919196. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-13 00:36:38,607][45375] Avg episode reward: [(0, '40.320'), (1, '40.820')] +[2023-10-13 00:36:42,093][46663] Updated weights for policy 1, policy_version 9601 (0.0008) +[2023-10-13 00:36:42,289][46662] Updated weights for policy 0, policy_version 9610 (0.0008) +[2023-10-13 00:36:42,466][46663] Updated weights for policy 1, policy_version 9611 (0.0008) +[2023-10-13 00:36:42,663][46662] Updated weights for policy 0, policy_version 9620 (0.0008) +[2023-10-13 00:36:42,841][46663] Updated weights for policy 1, policy_version 9621 (0.0008) +[2023-10-13 00:36:43,025][46662] Updated weights for policy 0, policy_version 9630 (0.0008) +[2023-10-13 00:36:43,212][46663] Updated weights for policy 1, policy_version 9631 (0.0008) +[2023-10-13 00:36:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 19726336. Throughput: 0: 1678.4, 1: 1686.5. Samples: 4930184. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-13 00:36:43,607][45375] Avg episode reward: [(0, '41.170'), (1, '37.910')] +[2023-10-13 00:36:43,608][46091] Saving new best policy, reward=41.170! +[2023-10-13 00:36:47,122][46663] Updated weights for policy 1, policy_version 9641 (0.0008) +[2023-10-13 00:36:47,131][46662] Updated weights for policy 0, policy_version 9640 (0.0008) +[2023-10-13 00:36:47,494][46663] Updated weights for policy 1, policy_version 9651 (0.0007) +[2023-10-13 00:36:47,505][46662] Updated weights for policy 0, policy_version 9650 (0.0007) +[2023-10-13 00:36:47,851][46663] Updated weights for policy 1, policy_version 9661 (0.0009) +[2023-10-13 00:36:47,871][46662] Updated weights for policy 0, policy_version 9660 (0.0009) +[2023-10-13 00:36:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 19791872. Throughput: 0: 1669.9, 1: 1677.6. Samples: 4950310. 
Policy #0 lag: (min: 25.0, avg: 34.7, max: 57.0) +[2023-10-13 00:36:48,607][45375] Avg episode reward: [(0, '41.530'), (1, '37.880')] +[2023-10-13 00:36:48,608][46091] Saving new best policy, reward=41.530! +[2023-10-13 00:36:52,033][46662] Updated weights for policy 0, policy_version 9670 (0.0008) +[2023-10-13 00:36:52,147][46663] Updated weights for policy 1, policy_version 9671 (0.0008) +[2023-10-13 00:36:52,398][46662] Updated weights for policy 0, policy_version 9680 (0.0007) +[2023-10-13 00:36:52,513][46663] Updated weights for policy 1, policy_version 9681 (0.0009) +[2023-10-13 00:36:52,771][46662] Updated weights for policy 0, policy_version 9690 (0.0007) +[2023-10-13 00:36:52,897][46663] Updated weights for policy 1, policy_version 9691 (0.0008) +[2023-10-13 00:36:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 19857408. Throughput: 0: 1651.1, 1: 1666.5. Samples: 4968934. Policy #0 lag: (min: 25.0, avg: 34.7, max: 57.0) +[2023-10-13 00:36:53,607][45375] Avg episode reward: [(0, '41.050'), (1, '37.070')] +[2023-10-13 00:36:56,745][46662] Updated weights for policy 0, policy_version 9700 (0.0008) +[2023-10-13 00:36:56,945][46663] Updated weights for policy 1, policy_version 9701 (0.0009) +[2023-10-13 00:36:57,108][46662] Updated weights for policy 0, policy_version 9710 (0.0007) +[2023-10-13 00:36:57,315][46663] Updated weights for policy 1, policy_version 9711 (0.0007) +[2023-10-13 00:36:57,477][46662] Updated weights for policy 0, policy_version 9720 (0.0007) +[2023-10-13 00:36:57,682][46663] Updated weights for policy 1, policy_version 9721 (0.0009) +[2023-10-13 00:36:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 19922944. Throughput: 0: 1664.4, 1: 1687.9. Samples: 4980528. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:36:58,607][45375] Avg episode reward: [(0, '41.510'), (1, '36.890')] +[2023-10-13 00:37:01,629][46662] Updated weights for policy 0, policy_version 9730 (0.0008) +[2023-10-13 00:37:01,755][46663] Updated weights for policy 1, policy_version 9731 (0.0008) +[2023-10-13 00:37:01,997][46662] Updated weights for policy 0, policy_version 9740 (0.0008) +[2023-10-13 00:37:02,117][46663] Updated weights for policy 1, policy_version 9741 (0.0009) +[2023-10-13 00:37:02,369][46662] Updated weights for policy 0, policy_version 9750 (0.0009) +[2023-10-13 00:37:02,480][46663] Updated weights for policy 1, policy_version 9751 (0.0007) +[2023-10-13 00:37:02,736][46662] Updated weights for policy 0, policy_version 9760 (0.0010) +[2023-10-13 00:37:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 19988480. Throughput: 0: 1662.7, 1: 1666.8. Samples: 5000282. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:37:03,607][45375] Avg episode reward: [(0, '41.640'), (1, '35.870')] +[2023-10-13 00:37:03,608][46091] Saving new best policy, reward=41.640! +[2023-10-13 00:37:06,411][46663] Updated weights for policy 1, policy_version 9761 (0.0008) +[2023-10-13 00:37:06,781][46663] Updated weights for policy 1, policy_version 9771 (0.0007) +[2023-10-13 00:37:06,859][46662] Updated weights for policy 0, policy_version 9770 (0.0008) +[2023-10-13 00:37:07,156][46663] Updated weights for policy 1, policy_version 9781 (0.0007) +[2023-10-13 00:37:07,231][46662] Updated weights for policy 0, policy_version 9780 (0.0008) +[2023-10-13 00:37:07,519][46663] Updated weights for policy 1, policy_version 9791 (0.0008) +[2023-10-13 00:37:07,600][46662] Updated weights for policy 0, policy_version 9790 (0.0007) +[2023-10-13 00:37:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 20054016. Throughput: 0: 1660.5, 1: 1675.3. Samples: 5019620. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:37:08,607][45375] Avg episode reward: [(0, '41.560'), (1, '35.340')] +[2023-10-13 00:37:11,641][46662] Updated weights for policy 0, policy_version 9800 (0.0007) +[2023-10-13 00:37:11,671][46663] Updated weights for policy 1, policy_version 9801 (0.0007) +[2023-10-13 00:37:12,011][46662] Updated weights for policy 0, policy_version 9810 (0.0007) +[2023-10-13 00:37:12,031][46663] Updated weights for policy 1, policy_version 9811 (0.0008) +[2023-10-13 00:37:12,378][46662] Updated weights for policy 0, policy_version 9820 (0.0007) +[2023-10-13 00:37:12,399][46663] Updated weights for policy 1, policy_version 9821 (0.0008) +[2023-10-13 00:37:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20119552. Throughput: 0: 1673.9, 1: 1679.9. Samples: 5030992. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:37:13,607][45375] Avg episode reward: [(0, '40.110'), (1, '35.550')] +[2023-10-13 00:37:16,460][46663] Updated weights for policy 1, policy_version 9831 (0.0008) +[2023-10-13 00:37:16,584][46662] Updated weights for policy 0, policy_version 9830 (0.0009) +[2023-10-13 00:37:16,822][46663] Updated weights for policy 1, policy_version 9841 (0.0007) +[2023-10-13 00:37:16,945][46662] Updated weights for policy 0, policy_version 9840 (0.0008) +[2023-10-13 00:37:17,182][46663] Updated weights for policy 1, policy_version 9851 (0.0008) +[2023-10-13 00:37:17,314][46662] Updated weights for policy 0, policy_version 9850 (0.0008) +[2023-10-13 00:37:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20185088. Throughput: 0: 1666.2, 1: 1655.5. Samples: 5050140. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 00:37:18,607][45375] Avg episode reward: [(0, '38.820'), (1, '34.910')] +[2023-10-13 00:37:21,278][46663] Updated weights for policy 1, policy_version 9861 (0.0010) +[2023-10-13 00:37:21,461][46662] Updated weights for policy 0, policy_version 9860 (0.0007) +[2023-10-13 00:37:21,656][46663] Updated weights for policy 1, policy_version 9871 (0.0007) +[2023-10-13 00:37:21,850][46662] Updated weights for policy 0, policy_version 9870 (0.0007) +[2023-10-13 00:37:22,023][46663] Updated weights for policy 1, policy_version 9881 (0.0009) +[2023-10-13 00:37:22,215][46662] Updated weights for policy 0, policy_version 9880 (0.0007) +[2023-10-13 00:37:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20250624. Throughput: 0: 1662.4, 1: 1684.2. Samples: 5069792. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 00:37:23,608][45375] Avg episode reward: [(0, '38.730'), (1, '35.000')] +[2023-10-13 00:37:25,985][46663] Updated weights for policy 1, policy_version 9891 (0.0009) +[2023-10-13 00:37:26,097][46662] Updated weights for policy 0, policy_version 9890 (0.0010) +[2023-10-13 00:37:26,358][46663] Updated weights for policy 1, policy_version 9901 (0.0011) +[2023-10-13 00:37:26,474][46662] Updated weights for policy 0, policy_version 9900 (0.0007) +[2023-10-13 00:37:26,715][46663] Updated weights for policy 1, policy_version 9911 (0.0009) +[2023-10-13 00:37:26,832][46662] Updated weights for policy 0, policy_version 9910 (0.0008) +[2023-10-13 00:37:27,209][46662] Updated weights for policy 0, policy_version 9920 (0.0010) +[2023-10-13 00:37:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20316160. Throughput: 0: 1679.8, 1: 1675.6. Samples: 5081178. 
Policy #0 lag: (min: 35.0, avg: 54.8, max: 56.0) +[2023-10-13 00:37:28,607][45375] Avg episode reward: [(0, '38.790'), (1, '34.720')] +[2023-10-13 00:37:30,693][46663] Updated weights for policy 1, policy_version 9921 (0.0009) +[2023-10-13 00:37:31,058][46663] Updated weights for policy 1, policy_version 9931 (0.0010) +[2023-10-13 00:37:31,244][46662] Updated weights for policy 0, policy_version 9930 (0.0009) +[2023-10-13 00:37:31,426][46663] Updated weights for policy 1, policy_version 9941 (0.0010) +[2023-10-13 00:37:31,619][46662] Updated weights for policy 0, policy_version 9940 (0.0009) +[2023-10-13 00:37:31,794][46663] Updated weights for policy 1, policy_version 9951 (0.0008) +[2023-10-13 00:37:31,988][46662] Updated weights for policy 0, policy_version 9950 (0.0007) +[2023-10-13 00:37:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20381696. Throughput: 0: 1656.5, 1: 1672.4. Samples: 5100112. Policy #0 lag: (min: 35.0, avg: 54.8, max: 56.0) +[2023-10-13 00:37:33,607][45375] Avg episode reward: [(0, '38.430'), (1, '35.470')] +[2023-10-13 00:37:35,773][46663] Updated weights for policy 1, policy_version 9961 (0.0009) +[2023-10-13 00:37:36,025][46662] Updated weights for policy 0, policy_version 9960 (0.0007) +[2023-10-13 00:37:36,133][46663] Updated weights for policy 1, policy_version 9971 (0.0008) +[2023-10-13 00:37:36,402][46662] Updated weights for policy 0, policy_version 9970 (0.0008) +[2023-10-13 00:37:36,500][46663] Updated weights for policy 1, policy_version 9981 (0.0009) +[2023-10-13 00:37:36,766][46662] Updated weights for policy 0, policy_version 9980 (0.0009) +[2023-10-13 00:37:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20447232. Throughput: 0: 1675.8, 1: 1697.7. Samples: 5120742. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 00:37:38,608][45375] Avg episode reward: [(0, '38.480'), (1, '35.990')] +[2023-10-13 00:37:40,482][46663] Updated weights for policy 1, policy_version 9991 (0.0007) +[2023-10-13 00:37:40,852][46663] Updated weights for policy 1, policy_version 10001 (0.0010) +[2023-10-13 00:37:40,953][46662] Updated weights for policy 0, policy_version 9990 (0.0010) +[2023-10-13 00:37:41,223][46663] Updated weights for policy 1, policy_version 10011 (0.0008) +[2023-10-13 00:37:41,317][46662] Updated weights for policy 0, policy_version 10000 (0.0007) +[2023-10-13 00:37:41,691][46662] Updated weights for policy 0, policy_version 10010 (0.0007) +[2023-10-13 00:37:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20512768. Throughput: 0: 1680.4, 1: 1669.8. Samples: 5131290. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 00:37:43,607][45375] Avg episode reward: [(0, '38.900'), (1, '36.760')] +[2023-10-13 00:37:45,237][46663] Updated weights for policy 1, policy_version 10021 (0.0009) +[2023-10-13 00:37:45,612][46663] Updated weights for policy 1, policy_version 10031 (0.0008) +[2023-10-13 00:37:45,788][46662] Updated weights for policy 0, policy_version 10020 (0.0011) +[2023-10-13 00:37:45,973][46663] Updated weights for policy 1, policy_version 10041 (0.0007) +[2023-10-13 00:37:46,159][46662] Updated weights for policy 0, policy_version 10030 (0.0009) +[2023-10-13 00:37:46,532][46662] Updated weights for policy 0, policy_version 10040 (0.0008) +[2023-10-13 00:37:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20578304. Throughput: 0: 1652.8, 1: 1692.5. Samples: 5150820. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:37:48,607][45375] Avg episode reward: [(0, '40.580'), (1, '37.670')] +[2023-10-13 00:37:49,960][46663] Updated weights for policy 1, policy_version 10051 (0.0008) +[2023-10-13 00:37:50,332][46663] Updated weights for policy 1, policy_version 10061 (0.0007) +[2023-10-13 00:37:50,660][46662] Updated weights for policy 0, policy_version 10050 (0.0008) +[2023-10-13 00:37:50,692][46663] Updated weights for policy 1, policy_version 10071 (0.0007) +[2023-10-13 00:37:51,016][46662] Updated weights for policy 0, policy_version 10060 (0.0007) +[2023-10-13 00:37:51,386][46662] Updated weights for policy 0, policy_version 10070 (0.0008) +[2023-10-13 00:37:51,757][46662] Updated weights for policy 0, policy_version 10080 (0.0007) +[2023-10-13 00:37:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20643840. Throughput: 0: 1669.3, 1: 1703.7. Samples: 5171408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:37:53,607][45375] Avg episode reward: [(0, '39.580'), (1, '39.020')] +[2023-10-13 00:37:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000010080_10321920.pth... +[2023-10-13 00:37:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000010080_10321920.pth... 
+[2023-10-13 00:37:53,651][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000008512_8716288.pth +[2023-10-13 00:37:53,660][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000008512_8716288.pth +[2023-10-13 00:37:54,697][46663] Updated weights for policy 1, policy_version 10081 (0.0008) +[2023-10-13 00:37:55,055][46663] Updated weights for policy 1, policy_version 10091 (0.0010) +[2023-10-13 00:37:55,430][46663] Updated weights for policy 1, policy_version 10101 (0.0008) +[2023-10-13 00:37:55,788][46663] Updated weights for policy 1, policy_version 10111 (0.0008) +[2023-10-13 00:37:55,888][46662] Updated weights for policy 0, policy_version 10090 (0.0009) +[2023-10-13 00:37:56,260][46662] Updated weights for policy 0, policy_version 10100 (0.0009) +[2023-10-13 00:37:56,636][46662] Updated weights for policy 0, policy_version 10110 (0.0010) +[2023-10-13 00:37:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20709376. Throughput: 0: 1664.0, 1: 1680.9. Samples: 5181514. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:37:58,607][45375] Avg episode reward: [(0, '39.790'), (1, '37.980')] +[2023-10-13 00:37:59,969][46663] Updated weights for policy 1, policy_version 10121 (0.0011) +[2023-10-13 00:38:00,332][46663] Updated weights for policy 1, policy_version 10131 (0.0011) +[2023-10-13 00:38:00,698][46663] Updated weights for policy 1, policy_version 10141 (0.0007) +[2023-10-13 00:38:00,791][46662] Updated weights for policy 0, policy_version 10120 (0.0008) +[2023-10-13 00:38:01,160][46662] Updated weights for policy 0, policy_version 10130 (0.0008) +[2023-10-13 00:38:01,531][46662] Updated weights for policy 0, policy_version 10140 (0.0009) +[2023-10-13 00:38:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20774912. Throughput: 0: 1652.0, 1: 1704.1. Samples: 5201164. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:38:03,607][45375] Avg episode reward: [(0, '39.860'), (1, '37.960')] +[2023-10-13 00:38:04,700][46663] Updated weights for policy 1, policy_version 10151 (0.0007) +[2023-10-13 00:38:05,084][46663] Updated weights for policy 1, policy_version 10161 (0.0007) +[2023-10-13 00:38:05,444][46663] Updated weights for policy 1, policy_version 10171 (0.0010) +[2023-10-13 00:38:05,561][46662] Updated weights for policy 0, policy_version 10150 (0.0008) +[2023-10-13 00:38:05,917][46662] Updated weights for policy 0, policy_version 10160 (0.0011) +[2023-10-13 00:38:06,299][46662] Updated weights for policy 0, policy_version 10170 (0.0011) +[2023-10-13 00:38:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20840448. Throughput: 0: 1673.7, 1: 1706.3. Samples: 5221892. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:38:08,607][45375] Avg episode reward: [(0, '39.840'), (1, '38.590')] +[2023-10-13 00:38:09,362][46663] Updated weights for policy 1, policy_version 10181 (0.0008) +[2023-10-13 00:38:09,731][46663] Updated weights for policy 1, policy_version 10191 (0.0007) +[2023-10-13 00:38:10,102][46663] Updated weights for policy 1, policy_version 10201 (0.0010) +[2023-10-13 00:38:10,535][46662] Updated weights for policy 0, policy_version 10180 (0.0010) +[2023-10-13 00:38:10,923][46662] Updated weights for policy 0, policy_version 10190 (0.0009) +[2023-10-13 00:38:11,281][46662] Updated weights for policy 0, policy_version 10200 (0.0009) +[2023-10-13 00:38:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 20905984. Throughput: 0: 1660.8, 1: 1689.1. Samples: 5231926. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:38:13,608][45375] Avg episode reward: [(0, '41.370'), (1, '39.120')] +[2023-10-13 00:38:14,245][46663] Updated weights for policy 1, policy_version 10211 (0.0010) +[2023-10-13 00:38:14,623][46663] Updated weights for policy 1, policy_version 10221 (0.0009) +[2023-10-13 00:38:14,994][46663] Updated weights for policy 1, policy_version 10231 (0.0008) +[2023-10-13 00:38:15,112][46662] Updated weights for policy 0, policy_version 10210 (0.0008) +[2023-10-13 00:38:15,491][46662] Updated weights for policy 0, policy_version 10220 (0.0010) +[2023-10-13 00:38:15,864][46662] Updated weights for policy 0, policy_version 10230 (0.0008) +[2023-10-13 00:38:16,224][46662] Updated weights for policy 0, policy_version 10240 (0.0007) +[2023-10-13 00:38:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20971520. Throughput: 0: 1669.1, 1: 1707.8. Samples: 5252072. Policy #0 lag: (min: 5.0, avg: 11.9, max: 37.0) +[2023-10-13 00:38:18,607][45375] Avg episode reward: [(0, '40.100'), (1, '39.380')] +[2023-10-13 00:38:19,100][46663] Updated weights for policy 1, policy_version 10241 (0.0009) +[2023-10-13 00:38:19,461][46663] Updated weights for policy 1, policy_version 10251 (0.0010) +[2023-10-13 00:38:19,829][46663] Updated weights for policy 1, policy_version 10261 (0.0009) +[2023-10-13 00:38:20,166][46662] Updated weights for policy 0, policy_version 10250 (0.0008) +[2023-10-13 00:38:20,192][46663] Updated weights for policy 1, policy_version 10271 (0.0008) +[2023-10-13 00:38:20,531][46662] Updated weights for policy 0, policy_version 10260 (0.0008) +[2023-10-13 00:38:20,907][46662] Updated weights for policy 0, policy_version 10270 (0.0008) +[2023-10-13 00:38:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 21037056. Throughput: 0: 1678.8, 1: 1698.8. Samples: 5272732. 
Policy #0 lag: (min: 5.0, avg: 11.9, max: 37.0) +[2023-10-13 00:38:23,608][45375] Avg episode reward: [(0, '40.860'), (1, '41.390')] +[2023-10-13 00:38:24,402][46663] Updated weights for policy 1, policy_version 10281 (0.0009) +[2023-10-13 00:38:24,769][46663] Updated weights for policy 1, policy_version 10291 (0.0007) +[2023-10-13 00:38:24,893][46662] Updated weights for policy 0, policy_version 10280 (0.0008) +[2023-10-13 00:38:25,135][46663] Updated weights for policy 1, policy_version 10301 (0.0007) +[2023-10-13 00:38:25,270][46662] Updated weights for policy 0, policy_version 10290 (0.0009) +[2023-10-13 00:38:25,648][46662] Updated weights for policy 0, policy_version 10300 (0.0010) +[2023-10-13 00:38:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21102592. Throughput: 0: 1652.9, 1: 1695.2. Samples: 5281956. Policy #0 lag: (min: 19.0, avg: 21.9, max: 51.0) +[2023-10-13 00:38:28,607][45375] Avg episode reward: [(0, '41.460'), (1, '42.590')] +[2023-10-13 00:38:28,608][46384] Saving new best policy, reward=42.590! +[2023-10-13 00:38:29,262][46663] Updated weights for policy 1, policy_version 10311 (0.0007) +[2023-10-13 00:38:29,625][46663] Updated weights for policy 1, policy_version 10321 (0.0007) +[2023-10-13 00:38:29,868][46662] Updated weights for policy 0, policy_version 10310 (0.0009) +[2023-10-13 00:38:29,991][46663] Updated weights for policy 1, policy_version 10331 (0.0007) +[2023-10-13 00:38:30,240][46662] Updated weights for policy 0, policy_version 10320 (0.0009) +[2023-10-13 00:38:30,610][46662] Updated weights for policy 0, policy_version 10330 (0.0009) +[2023-10-13 00:38:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21168128. Throughput: 0: 1677.2, 1: 1689.2. Samples: 5302312. 
Policy #0 lag: (min: 19.0, avg: 21.9, max: 51.0) +[2023-10-13 00:38:33,607][45375] Avg episode reward: [(0, '41.740'), (1, '41.660')] +[2023-10-13 00:38:33,608][46091] Saving new best policy, reward=41.740! +[2023-10-13 00:38:34,052][46663] Updated weights for policy 1, policy_version 10341 (0.0010) +[2023-10-13 00:38:34,423][46663] Updated weights for policy 1, policy_version 10351 (0.0010) +[2023-10-13 00:38:34,605][46662] Updated weights for policy 0, policy_version 10340 (0.0009) +[2023-10-13 00:38:34,783][46663] Updated weights for policy 1, policy_version 10361 (0.0009) +[2023-10-13 00:38:34,975][46662] Updated weights for policy 0, policy_version 10350 (0.0007) +[2023-10-13 00:38:35,350][46662] Updated weights for policy 0, policy_version 10360 (0.0010) +[2023-10-13 00:38:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 21233664. Throughput: 0: 1684.8, 1: 1683.2. Samples: 5322968. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 00:38:38,608][45375] Avg episode reward: [(0, '42.990'), (1, '42.310')] +[2023-10-13 00:38:38,618][46091] Saving new best policy, reward=42.990! +[2023-10-13 00:38:38,949][46663] Updated weights for policy 1, policy_version 10371 (0.0009) +[2023-10-13 00:38:39,318][46663] Updated weights for policy 1, policy_version 10381 (0.0009) +[2023-10-13 00:38:39,554][46662] Updated weights for policy 0, policy_version 10370 (0.0010) +[2023-10-13 00:38:39,681][46663] Updated weights for policy 1, policy_version 10391 (0.0008) +[2023-10-13 00:38:39,924][46662] Updated weights for policy 0, policy_version 10380 (0.0008) +[2023-10-13 00:38:40,299][46662] Updated weights for policy 0, policy_version 10390 (0.0010) +[2023-10-13 00:38:40,682][46662] Updated weights for policy 0, policy_version 10400 (0.0010) +[2023-10-13 00:38:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21299200. Throughput: 0: 1665.1, 1: 1679.9. Samples: 5332040. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 00:38:43,608][45375] Avg episode reward: [(0, '41.540'), (1, '40.910')] +[2023-10-13 00:38:43,698][46663] Updated weights for policy 1, policy_version 10401 (0.0007) +[2023-10-13 00:38:44,066][46663] Updated weights for policy 1, policy_version 10411 (0.0009) +[2023-10-13 00:38:44,421][46663] Updated weights for policy 1, policy_version 10421 (0.0009) +[2023-10-13 00:38:44,617][46662] Updated weights for policy 0, policy_version 10410 (0.0008) +[2023-10-13 00:38:44,796][46663] Updated weights for policy 1, policy_version 10431 (0.0008) +[2023-10-13 00:38:44,991][46662] Updated weights for policy 0, policy_version 10420 (0.0007) +[2023-10-13 00:38:45,361][46662] Updated weights for policy 0, policy_version 10430 (0.0009) +[2023-10-13 00:38:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21364736. Throughput: 0: 1693.3, 1: 1677.9. Samples: 5352866. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 00:38:48,607][45375] Avg episode reward: [(0, '42.600'), (1, '40.740')] +[2023-10-13 00:38:48,839][46663] Updated weights for policy 1, policy_version 10441 (0.0008) +[2023-10-13 00:38:49,205][46663] Updated weights for policy 1, policy_version 10451 (0.0007) +[2023-10-13 00:38:49,433][46662] Updated weights for policy 0, policy_version 10440 (0.0008) +[2023-10-13 00:38:49,576][46663] Updated weights for policy 1, policy_version 10461 (0.0007) +[2023-10-13 00:38:49,815][46662] Updated weights for policy 0, policy_version 10450 (0.0007) +[2023-10-13 00:38:50,186][46662] Updated weights for policy 0, policy_version 10460 (0.0009) +[2023-10-13 00:38:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21430272. Throughput: 0: 1694.8, 1: 1672.9. Samples: 5373440. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 00:38:53,607][45375] Avg episode reward: [(0, '41.550'), (1, '41.120')] +[2023-10-13 00:38:53,917][46663] Updated weights for policy 1, policy_version 10471 (0.0007) +[2023-10-13 00:38:54,245][46662] Updated weights for policy 0, policy_version 10470 (0.0008) +[2023-10-13 00:38:54,301][46663] Updated weights for policy 1, policy_version 10481 (0.0007) +[2023-10-13 00:38:54,612][46662] Updated weights for policy 0, policy_version 10480 (0.0008) +[2023-10-13 00:38:54,658][46663] Updated weights for policy 1, policy_version 10491 (0.0008) +[2023-10-13 00:38:54,992][46662] Updated weights for policy 0, policy_version 10490 (0.0009) +[2023-10-13 00:38:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21495808. Throughput: 0: 1679.5, 1: 1666.3. Samples: 5382486. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 00:38:58,607][45375] Avg episode reward: [(0, '41.510'), (1, '41.390')] +[2023-10-13 00:38:58,751][46663] Updated weights for policy 1, policy_version 10501 (0.0009) +[2023-10-13 00:38:59,060][46662] Updated weights for policy 0, policy_version 10500 (0.0008) +[2023-10-13 00:38:59,118][46663] Updated weights for policy 1, policy_version 10511 (0.0009) +[2023-10-13 00:38:59,440][46662] Updated weights for policy 0, policy_version 10510 (0.0009) +[2023-10-13 00:38:59,474][46663] Updated weights for policy 1, policy_version 10521 (0.0008) +[2023-10-13 00:38:59,808][46662] Updated weights for policy 0, policy_version 10520 (0.0007) +[2023-10-13 00:39:03,455][46663] Updated weights for policy 1, policy_version 10531 (0.0008) +[2023-10-13 00:39:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21561344. Throughput: 0: 1690.5, 1: 1662.3. Samples: 5402946. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 00:39:03,607][45375] Avg episode reward: [(0, '40.660'), (1, '41.010')] +[2023-10-13 00:39:03,784][46662] Updated weights for policy 0, policy_version 10530 (0.0007) +[2023-10-13 00:39:03,824][46663] Updated weights for policy 1, policy_version 10541 (0.0009) +[2023-10-13 00:39:04,160][46662] Updated weights for policy 0, policy_version 10540 (0.0008) +[2023-10-13 00:39:04,190][46663] Updated weights for policy 1, policy_version 10551 (0.0008) +[2023-10-13 00:39:04,529][46662] Updated weights for policy 0, policy_version 10550 (0.0010) +[2023-10-13 00:39:04,902][46662] Updated weights for policy 0, policy_version 10560 (0.0009) +[2023-10-13 00:39:08,319][46663] Updated weights for policy 1, policy_version 10561 (0.0008) +[2023-10-13 00:39:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21626880. Throughput: 0: 1689.6, 1: 1664.7. Samples: 5423672. Policy #0 lag: (min: 25.0, avg: 39.4, max: 57.0) +[2023-10-13 00:39:08,608][45375] Avg episode reward: [(0, '40.530'), (1, '40.740')] +[2023-10-13 00:39:08,688][46663] Updated weights for policy 1, policy_version 10571 (0.0008) +[2023-10-13 00:39:08,998][46662] Updated weights for policy 0, policy_version 10570 (0.0010) +[2023-10-13 00:39:09,066][46663] Updated weights for policy 1, policy_version 10581 (0.0007) +[2023-10-13 00:39:09,364][46662] Updated weights for policy 0, policy_version 10580 (0.0007) +[2023-10-13 00:39:09,433][46663] Updated weights for policy 1, policy_version 10591 (0.0007) +[2023-10-13 00:39:09,744][46662] Updated weights for policy 0, policy_version 10590 (0.0007) +[2023-10-13 00:39:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21692416. Throughput: 0: 1688.7, 1: 1664.4. Samples: 5432844. 
Policy #0 lag: (min: 25.0, avg: 39.4, max: 57.0) +[2023-10-13 00:39:13,608][45375] Avg episode reward: [(0, '42.390'), (1, '40.810')] +[2023-10-13 00:39:13,617][46663] Updated weights for policy 1, policy_version 10601 (0.0009) +[2023-10-13 00:39:13,783][46662] Updated weights for policy 0, policy_version 10600 (0.0007) +[2023-10-13 00:39:13,990][46663] Updated weights for policy 1, policy_version 10611 (0.0007) +[2023-10-13 00:39:14,142][46662] Updated weights for policy 0, policy_version 10610 (0.0007) +[2023-10-13 00:39:14,350][46663] Updated weights for policy 1, policy_version 10621 (0.0008) +[2023-10-13 00:39:14,520][46662] Updated weights for policy 0, policy_version 10620 (0.0008) +[2023-10-13 00:39:18,322][46663] Updated weights for policy 1, policy_version 10631 (0.0008) +[2023-10-13 00:39:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21757952. Throughput: 0: 1690.6, 1: 1667.6. Samples: 5453434. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 00:39:18,607][45375] Avg episode reward: [(0, '43.450'), (1, '38.390')] +[2023-10-13 00:39:18,645][46662] Updated weights for policy 0, policy_version 10630 (0.0008) +[2023-10-13 00:39:18,687][46663] Updated weights for policy 1, policy_version 10641 (0.0009) +[2023-10-13 00:39:19,015][46662] Updated weights for policy 0, policy_version 10640 (0.0009) +[2023-10-13 00:39:19,060][46663] Updated weights for policy 1, policy_version 10651 (0.0008) +[2023-10-13 00:39:19,393][46662] Updated weights for policy 0, policy_version 10650 (0.0008) +[2023-10-13 00:39:19,604][46091] Saving new best policy, reward=43.450! +[2023-10-13 00:39:23,258][46663] Updated weights for policy 1, policy_version 10661 (0.0008) +[2023-10-13 00:39:23,328][46662] Updated weights for policy 0, policy_version 10660 (0.0007) +[2023-10-13 00:39:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 21823488. Throughput: 0: 1690.1, 1: 1663.7. 
Samples: 5473890. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 00:39:23,607][45375] Avg episode reward: [(0, '42.270'), (1, '39.290')] +[2023-10-13 00:39:23,626][46663] Updated weights for policy 1, policy_version 10671 (0.0007) +[2023-10-13 00:39:23,707][46662] Updated weights for policy 0, policy_version 10670 (0.0007) +[2023-10-13 00:39:23,995][46663] Updated weights for policy 1, policy_version 10681 (0.0008) +[2023-10-13 00:39:24,083][46662] Updated weights for policy 0, policy_version 10680 (0.0009) +[2023-10-13 00:39:28,085][46663] Updated weights for policy 1, policy_version 10691 (0.0009) +[2023-10-13 00:39:28,110][46662] Updated weights for policy 0, policy_version 10690 (0.0008) +[2023-10-13 00:39:28,449][46663] Updated weights for policy 1, policy_version 10701 (0.0009) +[2023-10-13 00:39:28,473][46662] Updated weights for policy 0, policy_version 10700 (0.0007) +[2023-10-13 00:39:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21889024. Throughput: 0: 1688.0, 1: 1672.7. Samples: 5483270. Policy #0 lag: (min: 27.0, avg: 34.5, max: 59.0) +[2023-10-13 00:39:28,607][45375] Avg episode reward: [(0, '39.710'), (1, '38.880')] +[2023-10-13 00:39:28,811][46663] Updated weights for policy 1, policy_version 10711 (0.0009) +[2023-10-13 00:39:28,842][46662] Updated weights for policy 0, policy_version 10710 (0.0010) +[2023-10-13 00:39:29,217][46662] Updated weights for policy 0, policy_version 10720 (0.0009) +[2023-10-13 00:39:32,981][46663] Updated weights for policy 1, policy_version 10721 (0.0009) +[2023-10-13 00:39:33,347][46663] Updated weights for policy 1, policy_version 10731 (0.0009) +[2023-10-13 00:39:33,371][46662] Updated weights for policy 0, policy_version 10730 (0.0008) +[2023-10-13 00:39:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21954560. Throughput: 0: 1680.7, 1: 1669.9. Samples: 5503646. 
Policy #0 lag: (min: 27.0, avg: 34.5, max: 59.0) +[2023-10-13 00:39:33,608][45375] Avg episode reward: [(0, '39.700'), (1, '37.530')] +[2023-10-13 00:39:33,714][46663] Updated weights for policy 1, policy_version 10741 (0.0008) +[2023-10-13 00:39:33,747][46662] Updated weights for policy 0, policy_version 10740 (0.0009) +[2023-10-13 00:39:34,071][46663] Updated weights for policy 1, policy_version 10751 (0.0008) +[2023-10-13 00:39:34,117][46662] Updated weights for policy 0, policy_version 10750 (0.0008) +[2023-10-13 00:39:38,122][46662] Updated weights for policy 0, policy_version 10760 (0.0008) +[2023-10-13 00:39:38,218][46663] Updated weights for policy 1, policy_version 10761 (0.0008) +[2023-10-13 00:39:38,494][46662] Updated weights for policy 0, policy_version 10770 (0.0007) +[2023-10-13 00:39:38,577][46663] Updated weights for policy 1, policy_version 10771 (0.0009) +[2023-10-13 00:39:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 22020096. Throughput: 0: 1683.9, 1: 1658.3. Samples: 5523840. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 00:39:38,607][45375] Avg episode reward: [(0, '40.440'), (1, '38.100')] +[2023-10-13 00:39:38,869][46662] Updated weights for policy 0, policy_version 10780 (0.0007) +[2023-10-13 00:39:38,948][46663] Updated weights for policy 1, policy_version 10781 (0.0009) +[2023-10-13 00:39:42,879][46662] Updated weights for policy 0, policy_version 10790 (0.0007) +[2023-10-13 00:39:43,131][46663] Updated weights for policy 1, policy_version 10791 (0.0008) +[2023-10-13 00:39:43,240][46662] Updated weights for policy 0, policy_version 10800 (0.0007) +[2023-10-13 00:39:43,513][46663] Updated weights for policy 1, policy_version 10801 (0.0007) +[2023-10-13 00:39:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 22085632. Throughput: 0: 1680.2, 1: 1674.1. Samples: 5533432. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 00:39:43,608][45375] Avg episode reward: [(0, '39.680'), (1, '38.560')] +[2023-10-13 00:39:43,614][46662] Updated weights for policy 0, policy_version 10810 (0.0007) +[2023-10-13 00:39:43,887][46663] Updated weights for policy 1, policy_version 10811 (0.0009) +[2023-10-13 00:39:47,803][46663] Updated weights for policy 1, policy_version 10821 (0.0009) +[2023-10-13 00:39:47,819][46662] Updated weights for policy 0, policy_version 10820 (0.0008) +[2023-10-13 00:39:48,169][46663] Updated weights for policy 1, policy_version 10831 (0.0008) +[2023-10-13 00:39:48,196][46662] Updated weights for policy 0, policy_version 10830 (0.0008) +[2023-10-13 00:39:48,534][46663] Updated weights for policy 1, policy_version 10841 (0.0007) +[2023-10-13 00:39:48,578][46662] Updated weights for policy 0, policy_version 10840 (0.0008) +[2023-10-13 00:39:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 22151168. Throughput: 0: 1685.6, 1: 1675.2. Samples: 5554182. Policy #0 lag: (min: 29.0, avg: 34.1, max: 61.0) +[2023-10-13 00:39:48,607][45375] Avg episode reward: [(0, '40.720'), (1, '39.350')] +[2023-10-13 00:39:52,518][46663] Updated weights for policy 1, policy_version 10851 (0.0009) +[2023-10-13 00:39:52,744][46662] Updated weights for policy 0, policy_version 10850 (0.0009) +[2023-10-13 00:39:52,887][46663] Updated weights for policy 1, policy_version 10861 (0.0008) +[2023-10-13 00:39:53,111][46662] Updated weights for policy 0, policy_version 10860 (0.0007) +[2023-10-13 00:39:53,258][46663] Updated weights for policy 1, policy_version 10871 (0.0008) +[2023-10-13 00:39:53,484][46662] Updated weights for policy 0, policy_version 10870 (0.0008) +[2023-10-13 00:39:53,607][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22249472. Throughput: 0: 1675.7, 1: 1658.0. Samples: 5573688. 
Policy #0 lag: (min: 29.0, avg: 34.1, max: 61.0) +[2023-10-13 00:39:53,607][45375] Avg episode reward: [(0, '41.350'), (1, '39.540')] +[2023-10-13 00:39:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000010880_11141120.pth... +[2023-10-13 00:39:53,650][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000009312_9535488.pth +[2023-10-13 00:39:53,865][46662] Updated weights for policy 0, policy_version 10880 (0.0008) +[2023-10-13 00:39:53,865][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000010880_11141120.pth... +[2023-10-13 00:39:53,904][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000009312_9535488.pth +[2023-10-13 00:39:57,471][46663] Updated weights for policy 1, policy_version 10881 (0.0008) +[2023-10-13 00:39:57,839][46663] Updated weights for policy 1, policy_version 10891 (0.0008) +[2023-10-13 00:39:57,857][46662] Updated weights for policy 0, policy_version 10890 (0.0009) +[2023-10-13 00:39:58,198][46663] Updated weights for policy 1, policy_version 10901 (0.0007) +[2023-10-13 00:39:58,225][46662] Updated weights for policy 0, policy_version 10900 (0.0007) +[2023-10-13 00:39:58,575][46663] Updated weights for policy 1, policy_version 10911 (0.0008) +[2023-10-13 00:39:58,591][46662] Updated weights for policy 0, policy_version 10910 (0.0007) +[2023-10-13 00:39:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22315008. Throughput: 0: 1679.4, 1: 1675.3. Samples: 5583804. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:39:58,607][45375] Avg episode reward: [(0, '42.060'), (1, '38.670')] +[2023-10-13 00:40:02,600][46662] Updated weights for policy 0, policy_version 10920 (0.0007) +[2023-10-13 00:40:02,673][46663] Updated weights for policy 1, policy_version 10921 (0.0010) +[2023-10-13 00:40:02,974][46662] Updated weights for policy 0, policy_version 10930 (0.0010) +[2023-10-13 00:40:03,047][46663] Updated weights for policy 1, policy_version 10931 (0.0007) +[2023-10-13 00:40:03,336][46662] Updated weights for policy 0, policy_version 10940 (0.0008) +[2023-10-13 00:40:03,416][46663] Updated weights for policy 1, policy_version 10941 (0.0008) +[2023-10-13 00:40:03,607][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22413312. Throughput: 0: 1681.1, 1: 1668.4. Samples: 5604160. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:40:03,607][45375] Avg episode reward: [(0, '42.080'), (1, '37.180')] +[2023-10-13 00:40:07,331][46662] Updated weights for policy 0, policy_version 10950 (0.0007) +[2023-10-13 00:40:07,387][46663] Updated weights for policy 1, policy_version 10951 (0.0008) +[2023-10-13 00:40:07,705][46662] Updated weights for policy 0, policy_version 10960 (0.0008) +[2023-10-13 00:40:07,755][46663] Updated weights for policy 1, policy_version 10961 (0.0009) +[2023-10-13 00:40:08,079][46662] Updated weights for policy 0, policy_version 10970 (0.0007) +[2023-10-13 00:40:08,124][46663] Updated weights for policy 1, policy_version 10971 (0.0008) +[2023-10-13 00:40:08,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22478848. Throughput: 0: 1664.0, 1: 1652.0. Samples: 5623114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:40:08,607][45375] Avg episode reward: [(0, '42.370'), (1, '38.200')] +[2023-10-13 00:40:12,120][46662] Updated weights for policy 0, policy_version 10980 (0.0007) +[2023-10-13 00:40:12,127][46663] Updated weights for policy 1, policy_version 10981 (0.0008) +[2023-10-13 00:40:12,484][46662] Updated weights for policy 0, policy_version 10990 (0.0008) +[2023-10-13 00:40:12,489][46663] Updated weights for policy 1, policy_version 10991 (0.0009) +[2023-10-13 00:40:12,844][46662] Updated weights for policy 0, policy_version 11000 (0.0008) +[2023-10-13 00:40:12,854][46663] Updated weights for policy 1, policy_version 11001 (0.0009) +[2023-10-13 00:40:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22544384. Throughput: 0: 1683.6, 1: 1676.0. Samples: 5634452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:40:13,607][45375] Avg episode reward: [(0, '42.230'), (1, '38.390')] +[2023-10-13 00:40:16,775][46662] Updated weights for policy 0, policy_version 11010 (0.0009) +[2023-10-13 00:40:16,928][46663] Updated weights for policy 1, policy_version 11011 (0.0008) +[2023-10-13 00:40:17,156][46662] Updated weights for policy 0, policy_version 11020 (0.0008) +[2023-10-13 00:40:17,297][46663] Updated weights for policy 1, policy_version 11021 (0.0008) +[2023-10-13 00:40:17,520][46662] Updated weights for policy 0, policy_version 11030 (0.0008) +[2023-10-13 00:40:17,654][46663] Updated weights for policy 1, policy_version 11031 (0.0009) +[2023-10-13 00:40:17,895][46662] Updated weights for policy 0, policy_version 11040 (0.0007) +[2023-10-13 00:40:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22609920. Throughput: 0: 1684.9, 1: 1670.2. Samples: 5654628. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:40:18,607][45375] Avg episode reward: [(0, '41.690'), (1, '38.980')] +[2023-10-13 00:40:21,701][46663] Updated weights for policy 1, policy_version 11041 (0.0007) +[2023-10-13 00:40:22,068][46663] Updated weights for policy 1, policy_version 11051 (0.0008) +[2023-10-13 00:40:22,187][46662] Updated weights for policy 0, policy_version 11050 (0.0008) +[2023-10-13 00:40:22,441][46663] Updated weights for policy 1, policy_version 11061 (0.0009) +[2023-10-13 00:40:22,559][46662] Updated weights for policy 0, policy_version 11060 (0.0008) +[2023-10-13 00:40:22,817][46663] Updated weights for policy 1, policy_version 11071 (0.0008) +[2023-10-13 00:40:22,930][46662] Updated weights for policy 0, policy_version 11070 (0.0008) +[2023-10-13 00:40:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 22675456. Throughput: 0: 1659.5, 1: 1672.3. Samples: 5673770. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) +[2023-10-13 00:40:23,608][45375] Avg episode reward: [(0, '43.990'), (1, '37.140')] +[2023-10-13 00:40:23,621][46091] Saving new best policy, reward=43.990! +[2023-10-13 00:40:26,772][46663] Updated weights for policy 1, policy_version 11081 (0.0009) +[2023-10-13 00:40:27,137][46663] Updated weights for policy 1, policy_version 11091 (0.0009) +[2023-10-13 00:40:27,157][46662] Updated weights for policy 0, policy_version 11080 (0.0008) +[2023-10-13 00:40:27,505][46663] Updated weights for policy 1, policy_version 11101 (0.0009) +[2023-10-13 00:40:27,530][46662] Updated weights for policy 0, policy_version 11090 (0.0009) +[2023-10-13 00:40:27,896][46662] Updated weights for policy 0, policy_version 11100 (0.0008) +[2023-10-13 00:40:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 22740992. Throughput: 0: 1679.7, 1: 1691.0. Samples: 5685116. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) +[2023-10-13 00:40:28,607][45375] Avg episode reward: [(0, '44.010'), (1, '35.900')] +[2023-10-13 00:40:28,608][46091] Saving new best policy, reward=44.010! +[2023-10-13 00:40:31,943][46663] Updated weights for policy 1, policy_version 11111 (0.0008) +[2023-10-13 00:40:31,964][46662] Updated weights for policy 0, policy_version 11110 (0.0007) +[2023-10-13 00:40:32,309][46663] Updated weights for policy 1, policy_version 11121 (0.0009) +[2023-10-13 00:40:32,340][46662] Updated weights for policy 0, policy_version 11120 (0.0008) +[2023-10-13 00:40:32,680][46663] Updated weights for policy 1, policy_version 11131 (0.0009) +[2023-10-13 00:40:32,708][46662] Updated weights for policy 0, policy_version 11130 (0.0007) +[2023-10-13 00:40:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22806528. Throughput: 0: 1679.8, 1: 1665.0. Samples: 5704698. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:40:33,608][45375] Avg episode reward: [(0, '43.280'), (1, '34.900')] +[2023-10-13 00:40:36,654][46662] Updated weights for policy 0, policy_version 11140 (0.0008) +[2023-10-13 00:40:36,787][46663] Updated weights for policy 1, policy_version 11141 (0.0008) +[2023-10-13 00:40:37,054][46662] Updated weights for policy 0, policy_version 11150 (0.0008) +[2023-10-13 00:40:37,153][46663] Updated weights for policy 1, policy_version 11151 (0.0008) +[2023-10-13 00:40:37,422][46662] Updated weights for policy 0, policy_version 11160 (0.0008) +[2023-10-13 00:40:37,521][46663] Updated weights for policy 1, policy_version 11161 (0.0009) +[2023-10-13 00:40:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22872064. Throughput: 0: 1659.2, 1: 1667.9. Samples: 5723410. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:40:38,607][45375] Avg episode reward: [(0, '43.630'), (1, '35.490')] +[2023-10-13 00:40:41,558][46663] Updated weights for policy 1, policy_version 11171 (0.0008) +[2023-10-13 00:40:41,686][46662] Updated weights for policy 0, policy_version 11170 (0.0009) +[2023-10-13 00:40:41,928][46663] Updated weights for policy 1, policy_version 11181 (0.0009) +[2023-10-13 00:40:42,066][46662] Updated weights for policy 0, policy_version 11180 (0.0009) +[2023-10-13 00:40:42,299][46663] Updated weights for policy 1, policy_version 11191 (0.0009) +[2023-10-13 00:40:42,428][46662] Updated weights for policy 0, policy_version 11190 (0.0008) +[2023-10-13 00:40:42,802][46662] Updated weights for policy 0, policy_version 11200 (0.0008) +[2023-10-13 00:40:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 22937600. Throughput: 0: 1678.0, 1: 1681.8. Samples: 5734996. Policy #0 lag: (min: 10.0, avg: 11.0, max: 27.0) +[2023-10-13 00:40:43,607][45375] Avg episode reward: [(0, '43.660'), (1, '35.390')] +[2023-10-13 00:40:46,428][46663] Updated weights for policy 1, policy_version 11201 (0.0010) +[2023-10-13 00:40:46,801][46663] Updated weights for policy 1, policy_version 11211 (0.0008) +[2023-10-13 00:40:46,979][46662] Updated weights for policy 0, policy_version 11210 (0.0009) +[2023-10-13 00:40:47,162][46663] Updated weights for policy 1, policy_version 11221 (0.0010) +[2023-10-13 00:40:47,348][46662] Updated weights for policy 0, policy_version 11220 (0.0010) +[2023-10-13 00:40:47,537][46663] Updated weights for policy 1, policy_version 11231 (0.0008) +[2023-10-13 00:40:47,716][46662] Updated weights for policy 0, policy_version 11230 (0.0008) +[2023-10-13 00:40:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 23003136. Throughput: 0: 1674.7, 1: 1664.1. Samples: 5754408. 
Policy #0 lag: (min: 10.0, avg: 11.0, max: 27.0) +[2023-10-13 00:40:48,607][45375] Avg episode reward: [(0, '42.680'), (1, '37.200')] +[2023-10-13 00:40:51,565][46663] Updated weights for policy 1, policy_version 11241 (0.0008) +[2023-10-13 00:40:51,886][46662] Updated weights for policy 0, policy_version 11240 (0.0010) +[2023-10-13 00:40:51,933][46663] Updated weights for policy 1, policy_version 11251 (0.0007) +[2023-10-13 00:40:52,257][46662] Updated weights for policy 0, policy_version 11250 (0.0009) +[2023-10-13 00:40:52,298][46663] Updated weights for policy 1, policy_version 11261 (0.0007) +[2023-10-13 00:40:52,623][46662] Updated weights for policy 0, policy_version 11260 (0.0008) +[2023-10-13 00:40:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 23068672. Throughput: 0: 1663.6, 1: 1685.8. Samples: 5773840. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 00:40:53,608][45375] Avg episode reward: [(0, '43.050'), (1, '37.350')] +[2023-10-13 00:40:56,312][46663] Updated weights for policy 1, policy_version 11271 (0.0008) +[2023-10-13 00:40:56,680][46663] Updated weights for policy 1, policy_version 11281 (0.0007) +[2023-10-13 00:40:56,844][46662] Updated weights for policy 0, policy_version 11270 (0.0009) +[2023-10-13 00:40:57,042][46663] Updated weights for policy 1, policy_version 11291 (0.0009) +[2023-10-13 00:40:57,220][46662] Updated weights for policy 0, policy_version 11280 (0.0008) +[2023-10-13 00:40:57,582][46662] Updated weights for policy 0, policy_version 11290 (0.0008) +[2023-10-13 00:40:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 23134208. Throughput: 0: 1667.4, 1: 1674.4. Samples: 5784834. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 00:40:58,608][45375] Avg episode reward: [(0, '42.090'), (1, '37.070')] +[2023-10-13 00:41:01,039][46663] Updated weights for policy 1, policy_version 11301 (0.0008) +[2023-10-13 00:41:01,404][46663] Updated weights for policy 1, policy_version 11311 (0.0007) +[2023-10-13 00:41:01,704][46662] Updated weights for policy 0, policy_version 11300 (0.0010) +[2023-10-13 00:41:01,780][46663] Updated weights for policy 1, policy_version 11321 (0.0008) +[2023-10-13 00:41:02,080][46662] Updated weights for policy 0, policy_version 11310 (0.0009) +[2023-10-13 00:41:02,442][46662] Updated weights for policy 0, policy_version 11320 (0.0009) +[2023-10-13 00:41:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23199744. Throughput: 0: 1661.4, 1: 1665.4. Samples: 5804336. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:41:03,607][45375] Avg episode reward: [(0, '41.600'), (1, '37.600')] +[2023-10-13 00:41:06,074][46663] Updated weights for policy 1, policy_version 11331 (0.0008) +[2023-10-13 00:41:06,401][46662] Updated weights for policy 0, policy_version 11330 (0.0008) +[2023-10-13 00:41:06,446][46663] Updated weights for policy 1, policy_version 11341 (0.0007) +[2023-10-13 00:41:06,781][46662] Updated weights for policy 0, policy_version 11340 (0.0007) +[2023-10-13 00:41:06,816][46663] Updated weights for policy 1, policy_version 11351 (0.0007) +[2023-10-13 00:41:07,153][46662] Updated weights for policy 0, policy_version 11350 (0.0007) +[2023-10-13 00:41:07,518][46662] Updated weights for policy 0, policy_version 11360 (0.0010) +[2023-10-13 00:41:08,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23265280. Throughput: 0: 1660.2, 1: 1672.5. Samples: 5823738. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:41:08,607][45375] Avg episode reward: [(0, '41.330'), (1, '37.390')] +[2023-10-13 00:41:10,860][46663] Updated weights for policy 1, policy_version 11361 (0.0007) +[2023-10-13 00:41:11,233][46663] Updated weights for policy 1, policy_version 11371 (0.0009) +[2023-10-13 00:41:11,422][46662] Updated weights for policy 0, policy_version 11370 (0.0008) +[2023-10-13 00:41:11,600][46663] Updated weights for policy 1, policy_version 11381 (0.0009) +[2023-10-13 00:41:11,790][46662] Updated weights for policy 0, policy_version 11380 (0.0010) +[2023-10-13 00:41:11,967][46663] Updated weights for policy 1, policy_version 11391 (0.0009) +[2023-10-13 00:41:12,166][46662] Updated weights for policy 0, policy_version 11390 (0.0010) +[2023-10-13 00:41:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23330816. Throughput: 0: 1670.0, 1: 1654.8. Samples: 5834728. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-13 00:41:13,608][45375] Avg episode reward: [(0, '41.200'), (1, '37.730')] +[2023-10-13 00:41:16,147][46663] Updated weights for policy 1, policy_version 11401 (0.0009) +[2023-10-13 00:41:16,299][46662] Updated weights for policy 0, policy_version 11400 (0.0009) +[2023-10-13 00:41:16,517][46663] Updated weights for policy 1, policy_version 11411 (0.0008) +[2023-10-13 00:41:16,664][46662] Updated weights for policy 0, policy_version 11410 (0.0009) +[2023-10-13 00:41:16,877][46663] Updated weights for policy 1, policy_version 11421 (0.0009) +[2023-10-13 00:41:17,038][46662] Updated weights for policy 0, policy_version 11420 (0.0009) +[2023-10-13 00:41:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23396352. Throughput: 0: 1649.7, 1: 1660.4. Samples: 5853650. 
Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 00:41:18,607][45375] Avg episode reward: [(0, '39.510'), (1, '39.580')] +[2023-10-13 00:41:20,975][46663] Updated weights for policy 1, policy_version 11431 (0.0007) +[2023-10-13 00:41:21,335][46662] Updated weights for policy 0, policy_version 11430 (0.0009) +[2023-10-13 00:41:21,339][46663] Updated weights for policy 1, policy_version 11441 (0.0008) +[2023-10-13 00:41:21,716][46663] Updated weights for policy 1, policy_version 11451 (0.0007) +[2023-10-13 00:41:21,727][46662] Updated weights for policy 0, policy_version 11440 (0.0009) +[2023-10-13 00:41:22,094][46662] Updated weights for policy 0, policy_version 11450 (0.0007) +[2023-10-13 00:41:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23461888. Throughput: 0: 1662.8, 1: 1673.7. Samples: 5873552. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 00:41:23,607][45375] Avg episode reward: [(0, '41.150'), (1, '39.910')] +[2023-10-13 00:41:25,761][46663] Updated weights for policy 1, policy_version 11461 (0.0008) +[2023-10-13 00:41:26,037][46662] Updated weights for policy 0, policy_version 11460 (0.0009) +[2023-10-13 00:41:26,121][46663] Updated weights for policy 1, policy_version 11471 (0.0009) +[2023-10-13 00:41:26,413][46662] Updated weights for policy 0, policy_version 11470 (0.0010) +[2023-10-13 00:41:26,500][46663] Updated weights for policy 1, policy_version 11481 (0.0009) +[2023-10-13 00:41:26,772][46662] Updated weights for policy 0, policy_version 11480 (0.0009) +[2023-10-13 00:41:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23527424. Throughput: 0: 1668.6, 1: 1654.0. Samples: 5884512. 
Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) +[2023-10-13 00:41:28,607][45375] Avg episode reward: [(0, '41.500'), (1, '40.060')] +[2023-10-13 00:41:30,532][46662] Updated weights for policy 0, policy_version 11490 (0.0009) +[2023-10-13 00:41:30,657][46663] Updated weights for policy 1, policy_version 11491 (0.0009) +[2023-10-13 00:41:30,902][46662] Updated weights for policy 0, policy_version 11500 (0.0009) +[2023-10-13 00:41:31,024][46663] Updated weights for policy 1, policy_version 11501 (0.0008) +[2023-10-13 00:41:31,279][46662] Updated weights for policy 0, policy_version 11510 (0.0009) +[2023-10-13 00:41:31,389][46663] Updated weights for policy 1, policy_version 11511 (0.0009) +[2023-10-13 00:41:31,645][46662] Updated weights for policy 0, policy_version 11520 (0.0009) +[2023-10-13 00:41:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23592960. Throughput: 0: 1650.4, 1: 1665.0. Samples: 5903600. Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) +[2023-10-13 00:41:33,607][45375] Avg episode reward: [(0, '40.310'), (1, '39.020')] +[2023-10-13 00:41:35,509][46663] Updated weights for policy 1, policy_version 11521 (0.0008) +[2023-10-13 00:41:35,872][46663] Updated weights for policy 1, policy_version 11531 (0.0007) +[2023-10-13 00:41:36,002][46662] Updated weights for policy 0, policy_version 11530 (0.0010) +[2023-10-13 00:41:36,241][46663] Updated weights for policy 1, policy_version 11541 (0.0008) +[2023-10-13 00:41:36,366][46662] Updated weights for policy 0, policy_version 11540 (0.0009) +[2023-10-13 00:41:36,613][46663] Updated weights for policy 1, policy_version 11551 (0.0007) +[2023-10-13 00:41:36,727][46662] Updated weights for policy 0, policy_version 11550 (0.0009) +[2023-10-13 00:41:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23658496. Throughput: 0: 1668.4, 1: 1667.4. Samples: 5923952. 
Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) +[2023-10-13 00:41:38,607][45375] Avg episode reward: [(0, '42.130'), (1, '39.100')] +[2023-10-13 00:41:40,391][46663] Updated weights for policy 1, policy_version 11561 (0.0007) +[2023-10-13 00:41:40,762][46663] Updated weights for policy 1, policy_version 11571 (0.0009) +[2023-10-13 00:41:40,855][46662] Updated weights for policy 0, policy_version 11560 (0.0010) +[2023-10-13 00:41:41,118][46663] Updated weights for policy 1, policy_version 11581 (0.0009) +[2023-10-13 00:41:41,226][46662] Updated weights for policy 0, policy_version 11570 (0.0009) +[2023-10-13 00:41:41,606][46662] Updated weights for policy 0, policy_version 11580 (0.0009) +[2023-10-13 00:41:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 23724032. Throughput: 0: 1668.8, 1: 1651.7. Samples: 5934260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:41:43,608][45375] Avg episode reward: [(0, '41.990'), (1, '39.010')] +[2023-10-13 00:41:45,149][46663] Updated weights for policy 1, policy_version 11591 (0.0009) +[2023-10-13 00:41:45,532][46663] Updated weights for policy 1, policy_version 11601 (0.0008) +[2023-10-13 00:41:45,652][46662] Updated weights for policy 0, policy_version 11590 (0.0010) +[2023-10-13 00:41:45,890][46663] Updated weights for policy 1, policy_version 11611 (0.0008) +[2023-10-13 00:41:46,018][46662] Updated weights for policy 0, policy_version 11600 (0.0009) +[2023-10-13 00:41:46,385][46662] Updated weights for policy 0, policy_version 11610 (0.0010) +[2023-10-13 00:41:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23789568. Throughput: 0: 1652.0, 1: 1673.6. Samples: 5953988. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:41:48,607][45375] Avg episode reward: [(0, '42.040'), (1, '38.250')] +[2023-10-13 00:41:49,972][46663] Updated weights for policy 1, policy_version 11621 (0.0010) +[2023-10-13 00:41:50,337][46663] Updated weights for policy 1, policy_version 11631 (0.0010) +[2023-10-13 00:41:50,497][46662] Updated weights for policy 0, policy_version 11620 (0.0009) +[2023-10-13 00:41:50,704][46663] Updated weights for policy 1, policy_version 11641 (0.0008) +[2023-10-13 00:41:50,871][46662] Updated weights for policy 0, policy_version 11630 (0.0009) +[2023-10-13 00:41:51,240][46662] Updated weights for policy 0, policy_version 11640 (0.0008) +[2023-10-13 00:41:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 23855104. Throughput: 0: 1674.7, 1: 1674.3. Samples: 5974446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:41:53,608][45375] Avg episode reward: [(0, '41.240'), (1, '38.070')] +[2023-10-13 00:41:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000011648_11927552.pth... +[2023-10-13 00:41:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000011648_11927552.pth... 
+[2023-10-13 00:41:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000010080_10321920.pth +[2023-10-13 00:41:53,665][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000010080_10321920.pth +[2023-10-13 00:41:54,762][46663] Updated weights for policy 1, policy_version 11651 (0.0007) +[2023-10-13 00:41:55,125][46663] Updated weights for policy 1, policy_version 11661 (0.0008) +[2023-10-13 00:41:55,262][46662] Updated weights for policy 0, policy_version 11650 (0.0009) +[2023-10-13 00:41:55,502][46663] Updated weights for policy 1, policy_version 11671 (0.0009) +[2023-10-13 00:41:55,635][46662] Updated weights for policy 0, policy_version 11660 (0.0008) +[2023-10-13 00:41:56,008][46662] Updated weights for policy 0, policy_version 11670 (0.0011) +[2023-10-13 00:41:56,380][46662] Updated weights for policy 0, policy_version 11680 (0.0009) +[2023-10-13 00:41:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23920640. Throughput: 0: 1663.7, 1: 1662.3. Samples: 5984396. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:41:58,607][45375] Avg episode reward: [(0, '41.670'), (1, '38.460')] +[2023-10-13 00:41:59,643][46663] Updated weights for policy 1, policy_version 11681 (0.0008) +[2023-10-13 00:42:00,007][46663] Updated weights for policy 1, policy_version 11691 (0.0007) +[2023-10-13 00:42:00,363][46662] Updated weights for policy 0, policy_version 11690 (0.0007) +[2023-10-13 00:42:00,369][46663] Updated weights for policy 1, policy_version 11701 (0.0009) +[2023-10-13 00:42:00,734][46662] Updated weights for policy 0, policy_version 11700 (0.0007) +[2023-10-13 00:42:00,739][46663] Updated weights for policy 1, policy_version 11711 (0.0008) +[2023-10-13 00:42:01,107][46662] Updated weights for policy 0, policy_version 11710 (0.0010) +[2023-10-13 00:42:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 23986176. Throughput: 0: 1671.6, 1: 1687.8. Samples: 6004820. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:03,607][45375] Avg episode reward: [(0, '41.360'), (1, '38.280')] +[2023-10-13 00:42:04,766][46663] Updated weights for policy 1, policy_version 11721 (0.0008) +[2023-10-13 00:42:05,135][46663] Updated weights for policy 1, policy_version 11731 (0.0007) +[2023-10-13 00:42:05,180][46662] Updated weights for policy 0, policy_version 11720 (0.0009) +[2023-10-13 00:42:05,504][46663] Updated weights for policy 1, policy_version 11741 (0.0008) +[2023-10-13 00:42:05,552][46662] Updated weights for policy 0, policy_version 11730 (0.0008) +[2023-10-13 00:42:05,926][46662] Updated weights for policy 0, policy_version 11740 (0.0009) +[2023-10-13 00:42:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24051712. Throughput: 0: 1683.6, 1: 1690.0. Samples: 6025364. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:08,607][45375] Avg episode reward: [(0, '42.560'), (1, '38.020')] +[2023-10-13 00:42:09,726][46663] Updated weights for policy 1, policy_version 11751 (0.0008) +[2023-10-13 00:42:10,013][46662] Updated weights for policy 0, policy_version 11750 (0.0009) +[2023-10-13 00:42:10,101][46663] Updated weights for policy 1, policy_version 11761 (0.0007) +[2023-10-13 00:42:10,401][46662] Updated weights for policy 0, policy_version 11760 (0.0007) +[2023-10-13 00:42:10,462][46663] Updated weights for policy 1, policy_version 11771 (0.0007) +[2023-10-13 00:42:10,763][46662] Updated weights for policy 0, policy_version 11770 (0.0009) +[2023-10-13 00:42:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24117248. Throughput: 0: 1656.6, 1: 1673.9. Samples: 6034388. 
Policy #0 lag: (min: 9.0, avg: 17.1, max: 41.0) +[2023-10-13 00:42:13,607][45375] Avg episode reward: [(0, '42.880'), (1, '38.280')] +[2023-10-13 00:42:14,656][46663] Updated weights for policy 1, policy_version 11781 (0.0009) +[2023-10-13 00:42:14,889][46662] Updated weights for policy 0, policy_version 11780 (0.0008) +[2023-10-13 00:42:15,021][46663] Updated weights for policy 1, policy_version 11791 (0.0007) +[2023-10-13 00:42:15,250][46662] Updated weights for policy 0, policy_version 11790 (0.0008) +[2023-10-13 00:42:15,390][46663] Updated weights for policy 1, policy_version 11801 (0.0008) +[2023-10-13 00:42:15,629][46662] Updated weights for policy 0, policy_version 11800 (0.0008) +[2023-10-13 00:42:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24182784. Throughput: 0: 1673.0, 1: 1684.4. Samples: 6054686. Policy #0 lag: (min: 9.0, avg: 17.1, max: 41.0) +[2023-10-13 00:42:18,607][45375] Avg episode reward: [(0, '41.740'), (1, '37.440')] +[2023-10-13 00:42:19,627][46663] Updated weights for policy 1, policy_version 11811 (0.0007) +[2023-10-13 00:42:19,631][46662] Updated weights for policy 0, policy_version 11810 (0.0011) +[2023-10-13 00:42:19,998][46663] Updated weights for policy 1, policy_version 11821 (0.0009) +[2023-10-13 00:42:19,998][46662] Updated weights for policy 0, policy_version 11820 (0.0007) +[2023-10-13 00:42:20,366][46662] Updated weights for policy 0, policy_version 11830 (0.0009) +[2023-10-13 00:42:20,371][46663] Updated weights for policy 1, policy_version 11831 (0.0007) +[2023-10-13 00:42:20,739][46662] Updated weights for policy 0, policy_version 11840 (0.0009) +[2023-10-13 00:42:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 24248320. Throughput: 0: 1689.7, 1: 1680.2. Samples: 6075598. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:23,608][45375] Avg episode reward: [(0, '41.150'), (1, '37.400')] +[2023-10-13 00:42:24,572][46663] Updated weights for policy 1, policy_version 11841 (0.0008) +[2023-10-13 00:42:24,645][46662] Updated weights for policy 0, policy_version 11850 (0.0010) +[2023-10-13 00:42:24,938][46663] Updated weights for policy 1, policy_version 11851 (0.0008) +[2023-10-13 00:42:25,005][46662] Updated weights for policy 0, policy_version 11860 (0.0009) +[2023-10-13 00:42:25,294][46663] Updated weights for policy 1, policy_version 11861 (0.0009) +[2023-10-13 00:42:25,372][46662] Updated weights for policy 0, policy_version 11870 (0.0008) +[2023-10-13 00:42:25,667][46663] Updated weights for policy 1, policy_version 11871 (0.0008) +[2023-10-13 00:42:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24313856. Throughput: 0: 1668.4, 1: 1673.7. Samples: 6084650. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:28,607][45375] Avg episode reward: [(0, '41.480'), (1, '37.950')] +[2023-10-13 00:42:29,610][46663] Updated weights for policy 1, policy_version 11881 (0.0008) +[2023-10-13 00:42:29,632][46662] Updated weights for policy 0, policy_version 11880 (0.0010) +[2023-10-13 00:42:29,984][46663] Updated weights for policy 1, policy_version 11891 (0.0009) +[2023-10-13 00:42:30,003][46662] Updated weights for policy 0, policy_version 11890 (0.0010) +[2023-10-13 00:42:30,348][46663] Updated weights for policy 1, policy_version 11901 (0.0008) +[2023-10-13 00:42:30,373][46662] Updated weights for policy 0, policy_version 11900 (0.0009) +[2023-10-13 00:42:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24379392. Throughput: 0: 1691.7, 1: 1671.4. Samples: 6105328. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:33,607][45375] Avg episode reward: [(0, '41.970'), (1, '38.170')] +[2023-10-13 00:42:34,368][46662] Updated weights for policy 0, policy_version 11910 (0.0010) +[2023-10-13 00:42:34,460][46663] Updated weights for policy 1, policy_version 11911 (0.0008) +[2023-10-13 00:42:34,740][46662] Updated weights for policy 0, policy_version 11920 (0.0008) +[2023-10-13 00:42:34,834][46663] Updated weights for policy 1, policy_version 11921 (0.0008) +[2023-10-13 00:42:35,116][46662] Updated weights for policy 0, policy_version 11930 (0.0007) +[2023-10-13 00:42:35,201][46663] Updated weights for policy 1, policy_version 11931 (0.0008) +[2023-10-13 00:42:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24444928. Throughput: 0: 1694.1, 1: 1674.9. Samples: 6126052. Policy #0 lag: (min: 1.0, avg: 10.9, max: 33.0) +[2023-10-13 00:42:38,607][45375] Avg episode reward: [(0, '42.490'), (1, '38.790')] +[2023-10-13 00:42:39,218][46662] Updated weights for policy 0, policy_version 11940 (0.0007) +[2023-10-13 00:42:39,245][46663] Updated weights for policy 1, policy_version 11941 (0.0008) +[2023-10-13 00:42:39,590][46662] Updated weights for policy 0, policy_version 11950 (0.0008) +[2023-10-13 00:42:39,620][46663] Updated weights for policy 1, policy_version 11951 (0.0007) +[2023-10-13 00:42:39,965][46662] Updated weights for policy 0, policy_version 11960 (0.0009) +[2023-10-13 00:42:39,984][46663] Updated weights for policy 1, policy_version 11961 (0.0007) +[2023-10-13 00:42:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24510464. Throughput: 0: 1672.6, 1: 1676.0. Samples: 6135082. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 33.0) +[2023-10-13 00:42:43,607][45375] Avg episode reward: [(0, '42.820'), (1, '39.910')] +[2023-10-13 00:42:44,090][46663] Updated weights for policy 1, policy_version 11971 (0.0008) +[2023-10-13 00:42:44,158][46662] Updated weights for policy 0, policy_version 11970 (0.0008) +[2023-10-13 00:42:44,450][46663] Updated weights for policy 1, policy_version 11981 (0.0008) +[2023-10-13 00:42:44,527][46662] Updated weights for policy 0, policy_version 11980 (0.0008) +[2023-10-13 00:42:44,825][46663] Updated weights for policy 1, policy_version 11991 (0.0007) +[2023-10-13 00:42:44,902][46662] Updated weights for policy 0, policy_version 11990 (0.0009) +[2023-10-13 00:42:45,266][46662] Updated weights for policy 0, policy_version 12000 (0.0009) +[2023-10-13 00:42:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24576000. Throughput: 0: 1686.1, 1: 1669.8. Samples: 6155836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:48,608][45375] Avg episode reward: [(0, '43.540'), (1, '40.740')] +[2023-10-13 00:42:48,842][46663] Updated weights for policy 1, policy_version 12001 (0.0010) +[2023-10-13 00:42:49,207][46663] Updated weights for policy 1, policy_version 12011 (0.0009) +[2023-10-13 00:42:49,519][46662] Updated weights for policy 0, policy_version 12010 (0.0009) +[2023-10-13 00:42:49,571][46663] Updated weights for policy 1, policy_version 12021 (0.0007) +[2023-10-13 00:42:49,900][46662] Updated weights for policy 0, policy_version 12020 (0.0007) +[2023-10-13 00:42:49,941][46663] Updated weights for policy 1, policy_version 12031 (0.0007) +[2023-10-13 00:42:50,267][46662] Updated weights for policy 0, policy_version 12030 (0.0008) +[2023-10-13 00:42:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 24641536. Throughput: 0: 1686.0, 1: 1674.4. Samples: 6176584. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:53,607][45375] Avg episode reward: [(0, '43.870'), (1, '40.520')] +[2023-10-13 00:42:53,965][46663] Updated weights for policy 1, policy_version 12041 (0.0008) +[2023-10-13 00:42:54,231][46662] Updated weights for policy 0, policy_version 12040 (0.0009) +[2023-10-13 00:42:54,333][46663] Updated weights for policy 1, policy_version 12051 (0.0008) +[2023-10-13 00:42:54,606][46662] Updated weights for policy 0, policy_version 12050 (0.0008) +[2023-10-13 00:42:54,690][46663] Updated weights for policy 1, policy_version 12061 (0.0007) +[2023-10-13 00:42:54,984][46662] Updated weights for policy 0, policy_version 12060 (0.0008) +[2023-10-13 00:42:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24707072. Throughput: 0: 1678.1, 1: 1680.9. Samples: 6185538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:42:58,607][45375] Avg episode reward: [(0, '44.000'), (1, '41.700')] +[2023-10-13 00:42:58,741][46663] Updated weights for policy 1, policy_version 12071 (0.0009) +[2023-10-13 00:42:58,915][46662] Updated weights for policy 0, policy_version 12070 (0.0007) +[2023-10-13 00:42:59,106][46663] Updated weights for policy 1, policy_version 12081 (0.0009) +[2023-10-13 00:42:59,290][46662] Updated weights for policy 0, policy_version 12080 (0.0008) +[2023-10-13 00:42:59,473][46663] Updated weights for policy 1, policy_version 12091 (0.0008) +[2023-10-13 00:42:59,656][46662] Updated weights for policy 0, policy_version 12090 (0.0007) +[2023-10-13 00:43:03,607][46662] Updated weights for policy 0, policy_version 12100 (0.0007) +[2023-10-13 00:43:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24772608. Throughput: 0: 1686.9, 1: 1678.8. Samples: 6206144. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 00:43:03,608][45375] Avg episode reward: [(0, '44.130'), (1, '41.460')] +[2023-10-13 00:43:03,638][46663] Updated weights for policy 1, policy_version 12101 (0.0009) +[2023-10-13 00:43:03,977][46662] Updated weights for policy 0, policy_version 12110 (0.0007) +[2023-10-13 00:43:04,009][46663] Updated weights for policy 1, policy_version 12111 (0.0007) +[2023-10-13 00:43:04,351][46662] Updated weights for policy 0, policy_version 12120 (0.0008) +[2023-10-13 00:43:04,379][46663] Updated weights for policy 1, policy_version 12121 (0.0008) +[2023-10-13 00:43:04,636][46091] Saving new best policy, reward=44.130! +[2023-10-13 00:43:08,497][46662] Updated weights for policy 0, policy_version 12130 (0.0008) +[2023-10-13 00:43:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24838144. Throughput: 0: 1678.3, 1: 1677.8. Samples: 6226622. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 00:43:08,607][45375] Avg episode reward: [(0, '44.530'), (1, '44.080')] +[2023-10-13 00:43:08,619][46663] Updated weights for policy 1, policy_version 12131 (0.0010) +[2023-10-13 00:43:08,864][46662] Updated weights for policy 0, policy_version 12140 (0.0008) +[2023-10-13 00:43:08,978][46663] Updated weights for policy 1, policy_version 12141 (0.0008) +[2023-10-13 00:43:09,229][46662] Updated weights for policy 0, policy_version 12150 (0.0010) +[2023-10-13 00:43:09,344][46663] Updated weights for policy 1, policy_version 12151 (0.0007) +[2023-10-13 00:43:09,602][46091] Saving new best policy, reward=44.530! +[2023-10-13 00:43:09,606][46662] Updated weights for policy 0, policy_version 12160 (0.0008) +[2023-10-13 00:43:09,673][46384] Saving new best policy, reward=44.080! +[2023-10-13 00:43:13,368][46663] Updated weights for policy 1, policy_version 12161 (0.0007) +[2023-10-13 00:43:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 24903680. Throughput: 0: 1676.8, 1: 1679.6. Samples: 6235688. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 00:43:13,607][45375] Avg episode reward: [(0, '44.360'), (1, '44.470')] +[2023-10-13 00:43:13,734][46662] Updated weights for policy 0, policy_version 12170 (0.0009) +[2023-10-13 00:43:13,735][46663] Updated weights for policy 1, policy_version 12171 (0.0009) +[2023-10-13 00:43:14,097][46663] Updated weights for policy 1, policy_version 12181 (0.0008) +[2023-10-13 00:43:14,115][46662] Updated weights for policy 0, policy_version 12180 (0.0007) +[2023-10-13 00:43:14,459][46663] Updated weights for policy 1, policy_version 12191 (0.0008) +[2023-10-13 00:43:14,485][46662] Updated weights for policy 0, policy_version 12190 (0.0008) +[2023-10-13 00:43:14,495][46384] Saving new best policy, reward=44.470! +[2023-10-13 00:43:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 24969216. Throughput: 0: 1675.4, 1: 1678.4. Samples: 6256250. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:43:18,607][45375] Avg episode reward: [(0, '46.020'), (1, '44.470')] +[2023-10-13 00:43:18,681][46662] Updated weights for policy 0, policy_version 12200 (0.0007) +[2023-10-13 00:43:18,728][46663] Updated weights for policy 1, policy_version 12201 (0.0008) +[2023-10-13 00:43:19,046][46662] Updated weights for policy 0, policy_version 12210 (0.0007) +[2023-10-13 00:43:19,088][46663] Updated weights for policy 1, policy_version 12211 (0.0008) +[2023-10-13 00:43:19,425][46662] Updated weights for policy 0, policy_version 12220 (0.0007) +[2023-10-13 00:43:19,454][46663] Updated weights for policy 1, policy_version 12221 (0.0007) +[2023-10-13 00:43:19,564][46091] Saving new best policy, reward=46.020! 
+[2023-10-13 00:43:23,475][46662] Updated weights for policy 0, policy_version 12230 (0.0009) +[2023-10-13 00:43:23,509][46663] Updated weights for policy 1, policy_version 12231 (0.0008) +[2023-10-13 00:43:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25034752. Throughput: 0: 1673.4, 1: 1674.0. Samples: 6276686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:43:23,607][45375] Avg episode reward: [(0, '45.550'), (1, '46.050')] +[2023-10-13 00:43:23,845][46662] Updated weights for policy 0, policy_version 12240 (0.0008) +[2023-10-13 00:43:23,882][46663] Updated weights for policy 1, policy_version 12241 (0.0010) +[2023-10-13 00:43:24,219][46662] Updated weights for policy 0, policy_version 12250 (0.0009) +[2023-10-13 00:43:24,247][46663] Updated weights for policy 1, policy_version 12251 (0.0008) +[2023-10-13 00:43:24,426][46384] Saving new best policy, reward=46.050! +[2023-10-13 00:43:28,209][46662] Updated weights for policy 0, policy_version 12260 (0.0010) +[2023-10-13 00:43:28,473][46663] Updated weights for policy 1, policy_version 12261 (0.0008) +[2023-10-13 00:43:28,578][46662] Updated weights for policy 0, policy_version 12270 (0.0008) +[2023-10-13 00:43:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25100288. Throughput: 0: 1678.3, 1: 1672.6. Samples: 6285870. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:43:28,607][45375] Avg episode reward: [(0, '44.970'), (1, '46.010')] +[2023-10-13 00:43:28,842][46663] Updated weights for policy 1, policy_version 12271 (0.0007) +[2023-10-13 00:43:28,947][46662] Updated weights for policy 0, policy_version 12280 (0.0009) +[2023-10-13 00:43:29,209][46663] Updated weights for policy 1, policy_version 12281 (0.0007) +[2023-10-13 00:43:33,080][46662] Updated weights for policy 0, policy_version 12290 (0.0008) +[2023-10-13 00:43:33,279][46663] Updated weights for policy 1, policy_version 12291 (0.0009) +[2023-10-13 00:43:33,457][46662] Updated weights for policy 0, policy_version 12300 (0.0008) +[2023-10-13 00:43:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25165824. Throughput: 0: 1673.1, 1: 1665.7. Samples: 6306082. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:43:33,608][45375] Avg episode reward: [(0, '45.890'), (1, '45.450')] +[2023-10-13 00:43:33,644][46663] Updated weights for policy 1, policy_version 12301 (0.0007) +[2023-10-13 00:43:33,824][46662] Updated weights for policy 0, policy_version 12310 (0.0009) +[2023-10-13 00:43:34,015][46663] Updated weights for policy 1, policy_version 12311 (0.0007) +[2023-10-13 00:43:34,201][46662] Updated weights for policy 0, policy_version 12320 (0.0009) +[2023-10-13 00:43:37,872][46663] Updated weights for policy 1, policy_version 12321 (0.0010) +[2023-10-13 00:43:38,230][46663] Updated weights for policy 1, policy_version 12331 (0.0010) +[2023-10-13 00:43:38,368][46662] Updated weights for policy 0, policy_version 12330 (0.0008) +[2023-10-13 00:43:38,602][46663] Updated weights for policy 1, policy_version 12341 (0.0008) +[2023-10-13 00:43:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25231360. Throughput: 0: 1674.8, 1: 1653.2. Samples: 6326342. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:43:38,607][45375] Avg episode reward: [(0, '45.990'), (1, '45.920')] +[2023-10-13 00:43:38,730][46662] Updated weights for policy 0, policy_version 12340 (0.0008) +[2023-10-13 00:43:38,970][46663] Updated weights for policy 1, policy_version 12351 (0.0007) +[2023-10-13 00:43:39,099][46662] Updated weights for policy 0, policy_version 12350 (0.0009) +[2023-10-13 00:43:42,930][46663] Updated weights for policy 1, policy_version 12361 (0.0008) +[2023-10-13 00:43:43,213][46662] Updated weights for policy 0, policy_version 12360 (0.0007) +[2023-10-13 00:43:43,299][46663] Updated weights for policy 1, policy_version 12371 (0.0009) +[2023-10-13 00:43:43,584][46662] Updated weights for policy 0, policy_version 12370 (0.0007) +[2023-10-13 00:43:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25296896. Throughput: 0: 1677.8, 1: 1668.0. Samples: 6336102. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 00:43:43,607][45375] Avg episode reward: [(0, '46.390'), (1, '43.070')] +[2023-10-13 00:43:43,665][46663] Updated weights for policy 1, policy_version 12381 (0.0009) +[2023-10-13 00:43:43,960][46662] Updated weights for policy 0, policy_version 12380 (0.0007) +[2023-10-13 00:43:44,103][46091] Saving new best policy, reward=46.390! 
+[2023-10-13 00:43:47,695][46663] Updated weights for policy 1, policy_version 12391 (0.0008) +[2023-10-13 00:43:47,802][46662] Updated weights for policy 0, policy_version 12390 (0.0008) +[2023-10-13 00:43:48,074][46663] Updated weights for policy 1, policy_version 12401 (0.0007) +[2023-10-13 00:43:48,172][46662] Updated weights for policy 0, policy_version 12400 (0.0009) +[2023-10-13 00:43:48,443][46663] Updated weights for policy 1, policy_version 12411 (0.0007) +[2023-10-13 00:43:48,544][46662] Updated weights for policy 0, policy_version 12410 (0.0008) +[2023-10-13 00:43:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 25362432. Throughput: 0: 1673.3, 1: 1672.5. Samples: 6356706. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:43:48,607][45375] Avg episode reward: [(0, '47.770'), (1, '43.290')] +[2023-10-13 00:43:48,756][46091] Saving new best policy, reward=47.770! +[2023-10-13 00:43:52,617][46663] Updated weights for policy 1, policy_version 12421 (0.0008) +[2023-10-13 00:43:52,801][46662] Updated weights for policy 0, policy_version 12420 (0.0009) +[2023-10-13 00:43:52,978][46663] Updated weights for policy 1, policy_version 12431 (0.0010) +[2023-10-13 00:43:53,162][46662] Updated weights for policy 0, policy_version 12430 (0.0009) +[2023-10-13 00:43:53,348][46663] Updated weights for policy 1, policy_version 12441 (0.0010) +[2023-10-13 00:43:53,535][46662] Updated weights for policy 0, policy_version 12440 (0.0009) +[2023-10-13 00:43:53,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 25460736. Throughput: 0: 1667.6, 1: 1652.8. Samples: 6376042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:43:53,607][45375] Avg episode reward: [(0, '46.460'), (1, '43.650')] +[2023-10-13 00:43:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000012448_12746752.pth... 
+[2023-10-13 00:43:53,655][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000010880_11141120.pth +[2023-10-13 00:43:53,821][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000012448_12746752.pth... +[2023-10-13 00:43:53,849][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000010880_11141120.pth +[2023-10-13 00:43:57,588][46663] Updated weights for policy 1, policy_version 12451 (0.0007) +[2023-10-13 00:43:57,676][46662] Updated weights for policy 0, policy_version 12450 (0.0008) +[2023-10-13 00:43:57,954][46663] Updated weights for policy 1, policy_version 12461 (0.0007) +[2023-10-13 00:43:58,045][46662] Updated weights for policy 0, policy_version 12460 (0.0009) +[2023-10-13 00:43:58,328][46663] Updated weights for policy 1, policy_version 12471 (0.0007) +[2023-10-13 00:43:58,416][46662] Updated weights for policy 0, policy_version 12470 (0.0008) +[2023-10-13 00:43:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25493504. Throughput: 0: 1668.1, 1: 1676.1. Samples: 6386178. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:43:58,607][45375] Avg episode reward: [(0, '45.950'), (1, '43.580')] +[2023-10-13 00:43:58,780][46662] Updated weights for policy 0, policy_version 12480 (0.0008) +[2023-10-13 00:44:02,413][46663] Updated weights for policy 1, policy_version 12481 (0.0008) +[2023-10-13 00:44:02,777][46663] Updated weights for policy 1, policy_version 12491 (0.0008) +[2023-10-13 00:44:02,809][46662] Updated weights for policy 0, policy_version 12490 (0.0008) +[2023-10-13 00:44:03,145][46663] Updated weights for policy 1, policy_version 12501 (0.0007) +[2023-10-13 00:44:03,179][46662] Updated weights for policy 0, policy_version 12500 (0.0009) +[2023-10-13 00:44:03,512][46663] Updated weights for policy 1, policy_version 12511 (0.0007) +[2023-10-13 00:44:03,551][46662] Updated weights for policy 0, policy_version 12510 (0.0008) +[2023-10-13 00:44:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 25591808. Throughput: 0: 1672.6, 1: 1676.3. Samples: 6406950. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:44:03,607][45375] Avg episode reward: [(0, '46.320'), (1, '42.710')] +[2023-10-13 00:44:07,557][46663] Updated weights for policy 1, policy_version 12521 (0.0009) +[2023-10-13 00:44:07,615][46662] Updated weights for policy 0, policy_version 12520 (0.0008) +[2023-10-13 00:44:07,923][46663] Updated weights for policy 1, policy_version 12531 (0.0009) +[2023-10-13 00:44:07,983][46662] Updated weights for policy 0, policy_version 12530 (0.0007) +[2023-10-13 00:44:08,293][46663] Updated weights for policy 1, policy_version 12541 (0.0008) +[2023-10-13 00:44:08,350][46662] Updated weights for policy 0, policy_version 12540 (0.0007) +[2023-10-13 00:44:08,607][45375] Fps is (10 sec: 19660.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 25690112. Throughput: 0: 1663.0, 1: 1653.9. Samples: 6425946. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 00:44:08,607][45375] Avg episode reward: [(0, '47.530'), (1, '42.190')] +[2023-10-13 00:44:12,108][46663] Updated weights for policy 1, policy_version 12551 (0.0008) +[2023-10-13 00:44:12,429][46662] Updated weights for policy 0, policy_version 12550 (0.0009) +[2023-10-13 00:44:12,483][46663] Updated weights for policy 1, policy_version 12561 (0.0008) +[2023-10-13 00:44:12,799][46662] Updated weights for policy 0, policy_version 12560 (0.0008) +[2023-10-13 00:44:12,840][46663] Updated weights for policy 1, policy_version 12571 (0.0009) +[2023-10-13 00:44:13,177][46662] Updated weights for policy 0, policy_version 12570 (0.0009) +[2023-10-13 00:44:13,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 25755648. Throughput: 0: 1672.7, 1: 1685.5. Samples: 6436988. Policy #0 lag: (min: 12.0, avg: 14.4, max: 44.0) +[2023-10-13 00:44:13,608][45375] Avg episode reward: [(0, '47.750'), (1, '43.770')] +[2023-10-13 00:44:17,071][46663] Updated weights for policy 1, policy_version 12581 (0.0009) +[2023-10-13 00:44:17,205][46662] Updated weights for policy 0, policy_version 12580 (0.0007) +[2023-10-13 00:44:17,441][46663] Updated weights for policy 1, policy_version 12591 (0.0008) +[2023-10-13 00:44:17,581][46662] Updated weights for policy 0, policy_version 12590 (0.0008) +[2023-10-13 00:44:17,800][46663] Updated weights for policy 1, policy_version 12601 (0.0007) +[2023-10-13 00:44:17,954][46662] Updated weights for policy 0, policy_version 12600 (0.0009) +[2023-10-13 00:44:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 25821184. Throughput: 0: 1680.8, 1: 1686.2. Samples: 6457596. 
Policy #0 lag: (min: 12.0, avg: 14.4, max: 44.0) +[2023-10-13 00:44:18,607][45375] Avg episode reward: [(0, '47.160'), (1, '43.110')] +[2023-10-13 00:44:21,826][46663] Updated weights for policy 1, policy_version 12611 (0.0008) +[2023-10-13 00:44:22,183][46662] Updated weights for policy 0, policy_version 12610 (0.0007) +[2023-10-13 00:44:22,196][46663] Updated weights for policy 1, policy_version 12621 (0.0008) +[2023-10-13 00:44:22,550][46662] Updated weights for policy 0, policy_version 12620 (0.0008) +[2023-10-13 00:44:22,557][46663] Updated weights for policy 1, policy_version 12631 (0.0009) +[2023-10-13 00:44:22,923][46662] Updated weights for policy 0, policy_version 12630 (0.0007) +[2023-10-13 00:44:23,292][46662] Updated weights for policy 0, policy_version 12640 (0.0009) +[2023-10-13 00:44:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 25886720. Throughput: 0: 1660.3, 1: 1677.7. Samples: 6476552. Policy #0 lag: (min: 19.0, avg: 19.8, max: 39.0) +[2023-10-13 00:44:23,607][45375] Avg episode reward: [(0, '47.470'), (1, '43.180')] +[2023-10-13 00:44:26,619][46663] Updated weights for policy 1, policy_version 12641 (0.0009) +[2023-10-13 00:44:26,985][46663] Updated weights for policy 1, policy_version 12651 (0.0009) +[2023-10-13 00:44:27,350][46663] Updated weights for policy 1, policy_version 12661 (0.0009) +[2023-10-13 00:44:27,496][46662] Updated weights for policy 0, policy_version 12650 (0.0007) +[2023-10-13 00:44:27,715][46663] Updated weights for policy 1, policy_version 12671 (0.0009) +[2023-10-13 00:44:27,863][46662] Updated weights for policy 0, policy_version 12660 (0.0009) +[2023-10-13 00:44:28,237][46662] Updated weights for policy 0, policy_version 12670 (0.0007) +[2023-10-13 00:44:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 25952256. Throughput: 0: 1678.1, 1: 1688.7. Samples: 6487608. 
Policy #0 lag: (min: 19.0, avg: 19.8, max: 39.0) +[2023-10-13 00:44:28,607][45375] Avg episode reward: [(0, '46.790'), (1, '43.200')] +[2023-10-13 00:44:31,755][46663] Updated weights for policy 1, policy_version 12681 (0.0009) +[2023-10-13 00:44:32,112][46663] Updated weights for policy 1, policy_version 12691 (0.0008) +[2023-10-13 00:44:32,356][46662] Updated weights for policy 0, policy_version 12680 (0.0008) +[2023-10-13 00:44:32,478][46663] Updated weights for policy 1, policy_version 12701 (0.0010) +[2023-10-13 00:44:32,728][46662] Updated weights for policy 0, policy_version 12690 (0.0010) +[2023-10-13 00:44:33,101][46662] Updated weights for policy 0, policy_version 12700 (0.0008) +[2023-10-13 00:44:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26017792. Throughput: 0: 1674.0, 1: 1669.8. Samples: 6507178. Policy #0 lag: (min: 19.0, avg: 19.8, max: 39.0) +[2023-10-13 00:44:33,607][45375] Avg episode reward: [(0, '45.510'), (1, '43.840')] +[2023-10-13 00:44:36,731][46663] Updated weights for policy 1, policy_version 12711 (0.0008) +[2023-10-13 00:44:36,944][46662] Updated weights for policy 0, policy_version 12710 (0.0007) +[2023-10-13 00:44:37,099][46663] Updated weights for policy 1, policy_version 12721 (0.0008) +[2023-10-13 00:44:37,312][46662] Updated weights for policy 0, policy_version 12720 (0.0008) +[2023-10-13 00:44:37,466][46663] Updated weights for policy 1, policy_version 12731 (0.0008) +[2023-10-13 00:44:37,678][46662] Updated weights for policy 0, policy_version 12730 (0.0008) +[2023-10-13 00:44:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26083328. Throughput: 0: 1657.2, 1: 1685.8. Samples: 6526476. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:44:38,607][45375] Avg episode reward: [(0, '43.460'), (1, '43.830')] +[2023-10-13 00:44:41,537][46663] Updated weights for policy 1, policy_version 12741 (0.0007) +[2023-10-13 00:44:41,684][46662] Updated weights for policy 0, policy_version 12740 (0.0008) +[2023-10-13 00:44:41,901][46663] Updated weights for policy 1, policy_version 12751 (0.0008) +[2023-10-13 00:44:42,052][46662] Updated weights for policy 0, policy_version 12750 (0.0008) +[2023-10-13 00:44:42,263][46663] Updated weights for policy 1, policy_version 12761 (0.0007) +[2023-10-13 00:44:42,417][46662] Updated weights for policy 0, policy_version 12760 (0.0008) +[2023-10-13 00:44:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26148864. Throughput: 0: 1684.7, 1: 1690.2. Samples: 6538048. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:44:43,607][45375] Avg episode reward: [(0, '42.670'), (1, '44.450')] +[2023-10-13 00:44:46,397][46663] Updated weights for policy 1, policy_version 12771 (0.0009) +[2023-10-13 00:44:46,515][46662] Updated weights for policy 0, policy_version 12770 (0.0008) +[2023-10-13 00:44:46,766][46663] Updated weights for policy 1, policy_version 12781 (0.0009) +[2023-10-13 00:44:46,885][46662] Updated weights for policy 0, policy_version 12780 (0.0008) +[2023-10-13 00:44:47,127][46663] Updated weights for policy 1, policy_version 12791 (0.0008) +[2023-10-13 00:44:47,258][46662] Updated weights for policy 0, policy_version 12790 (0.0008) +[2023-10-13 00:44:47,631][46662] Updated weights for policy 0, policy_version 12800 (0.0011) +[2023-10-13 00:44:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 26214400. Throughput: 0: 1672.5, 1: 1666.0. Samples: 6557182. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 00:44:48,607][45375] Avg episode reward: [(0, '42.540'), (1, '46.450')] +[2023-10-13 00:44:48,608][46384] Saving new best policy, reward=46.450! +[2023-10-13 00:44:51,130][46663] Updated weights for policy 1, policy_version 12801 (0.0009) +[2023-10-13 00:44:51,496][46663] Updated weights for policy 1, policy_version 12811 (0.0009) +[2023-10-13 00:44:51,715][46662] Updated weights for policy 0, policy_version 12810 (0.0009) +[2023-10-13 00:44:51,860][46663] Updated weights for policy 1, policy_version 12821 (0.0008) +[2023-10-13 00:44:52,078][46662] Updated weights for policy 0, policy_version 12820 (0.0010) +[2023-10-13 00:44:52,231][46663] Updated weights for policy 1, policy_version 12831 (0.0008) +[2023-10-13 00:44:52,445][46662] Updated weights for policy 0, policy_version 12830 (0.0009) +[2023-10-13 00:44:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26279936. Throughput: 0: 1664.7, 1: 1686.0. Samples: 6576726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:44:53,607][45375] Avg episode reward: [(0, '44.040'), (1, '46.080')] +[2023-10-13 00:44:56,240][46663] Updated weights for policy 1, policy_version 12841 (0.0008) +[2023-10-13 00:44:56,513][46662] Updated weights for policy 0, policy_version 12840 (0.0007) +[2023-10-13 00:44:56,601][46663] Updated weights for policy 1, policy_version 12851 (0.0007) +[2023-10-13 00:44:56,884][46662] Updated weights for policy 0, policy_version 12850 (0.0007) +[2023-10-13 00:44:56,964][46663] Updated weights for policy 1, policy_version 12861 (0.0008) +[2023-10-13 00:44:57,256][46662] Updated weights for policy 0, policy_version 12860 (0.0007) +[2023-10-13 00:44:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 26345472. Throughput: 0: 1679.8, 1: 1679.3. Samples: 6588148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:44:58,607][45375] Avg episode reward: [(0, '44.050'), (1, '45.310')] +[2023-10-13 00:45:00,896][46663] Updated weights for policy 1, policy_version 12871 (0.0010) +[2023-10-13 00:45:01,274][46663] Updated weights for policy 1, policy_version 12881 (0.0008) +[2023-10-13 00:45:01,419][46662] Updated weights for policy 0, policy_version 12870 (0.0008) +[2023-10-13 00:45:01,646][46663] Updated weights for policy 1, policy_version 12891 (0.0009) +[2023-10-13 00:45:01,780][46662] Updated weights for policy 0, policy_version 12880 (0.0009) +[2023-10-13 00:45:02,163][46662] Updated weights for policy 0, policy_version 12890 (0.0010) +[2023-10-13 00:45:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 26411008. Throughput: 0: 1655.9, 1: 1669.8. Samples: 6607252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:03,607][45375] Avg episode reward: [(0, '43.300'), (1, '45.730')] +[2023-10-13 00:45:05,678][46663] Updated weights for policy 1, policy_version 12901 (0.0007) +[2023-10-13 00:45:06,060][46663] Updated weights for policy 1, policy_version 12911 (0.0008) +[2023-10-13 00:45:06,416][46662] Updated weights for policy 0, policy_version 12900 (0.0008) +[2023-10-13 00:45:06,427][46663] Updated weights for policy 1, policy_version 12921 (0.0011) +[2023-10-13 00:45:06,788][46662] Updated weights for policy 0, policy_version 12910 (0.0007) +[2023-10-13 00:45:07,150][46662] Updated weights for policy 0, policy_version 12920 (0.0007) +[2023-10-13 00:45:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26476544. Throughput: 0: 1659.1, 1: 1687.7. Samples: 6627160. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:08,607][45375] Avg episode reward: [(0, '41.550'), (1, '43.730')] +[2023-10-13 00:45:10,635][46663] Updated weights for policy 1, policy_version 12931 (0.0009) +[2023-10-13 00:45:10,995][46663] Updated weights for policy 1, policy_version 12941 (0.0008) +[2023-10-13 00:45:11,347][46662] Updated weights for policy 0, policy_version 12930 (0.0008) +[2023-10-13 00:45:11,368][46663] Updated weights for policy 1, policy_version 12951 (0.0008) +[2023-10-13 00:45:11,708][46662] Updated weights for policy 0, policy_version 12940 (0.0009) +[2023-10-13 00:45:12,077][46662] Updated weights for policy 0, policy_version 12950 (0.0010) +[2023-10-13 00:45:12,450][46662] Updated weights for policy 0, policy_version 12960 (0.0010) +[2023-10-13 00:45:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 26542080. Throughput: 0: 1674.6, 1: 1666.4. Samples: 6637952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:13,608][45375] Avg episode reward: [(0, '41.630'), (1, '43.120')] +[2023-10-13 00:45:15,298][46663] Updated weights for policy 1, policy_version 12961 (0.0007) +[2023-10-13 00:45:15,666][46663] Updated weights for policy 1, policy_version 12971 (0.0008) +[2023-10-13 00:45:16,039][46663] Updated weights for policy 1, policy_version 12981 (0.0009) +[2023-10-13 00:45:16,406][46663] Updated weights for policy 1, policy_version 12991 (0.0007) +[2023-10-13 00:45:16,606][46662] Updated weights for policy 0, policy_version 12970 (0.0008) +[2023-10-13 00:45:16,969][46662] Updated weights for policy 0, policy_version 12980 (0.0007) +[2023-10-13 00:45:17,343][46662] Updated weights for policy 0, policy_version 12990 (0.0007) +[2023-10-13 00:45:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26607616. Throughput: 0: 1666.1, 1: 1675.8. Samples: 6657564. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:18,607][45375] Avg episode reward: [(0, '40.820'), (1, '41.880')] +[2023-10-13 00:45:20,551][46663] Updated weights for policy 1, policy_version 13001 (0.0008) +[2023-10-13 00:45:20,909][46663] Updated weights for policy 1, policy_version 13011 (0.0010) +[2023-10-13 00:45:21,280][46663] Updated weights for policy 1, policy_version 13021 (0.0008) +[2023-10-13 00:45:21,357][46662] Updated weights for policy 0, policy_version 13000 (0.0008) +[2023-10-13 00:45:21,723][46662] Updated weights for policy 0, policy_version 13010 (0.0009) +[2023-10-13 00:45:22,097][46662] Updated weights for policy 0, policy_version 13020 (0.0009) +[2023-10-13 00:45:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26673152. Throughput: 0: 1671.0, 1: 1683.6. Samples: 6677436. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-13 00:45:23,607][45375] Avg episode reward: [(0, '41.260'), (1, '43.290')] +[2023-10-13 00:45:25,548][46663] Updated weights for policy 1, policy_version 13031 (0.0010) +[2023-10-13 00:45:25,937][46663] Updated weights for policy 1, policy_version 13041 (0.0010) +[2023-10-13 00:45:26,122][46662] Updated weights for policy 0, policy_version 13030 (0.0010) +[2023-10-13 00:45:26,301][46663] Updated weights for policy 1, policy_version 13051 (0.0010) +[2023-10-13 00:45:26,495][46662] Updated weights for policy 0, policy_version 13040 (0.0009) +[2023-10-13 00:45:26,871][46662] Updated weights for policy 0, policy_version 13050 (0.0009) +[2023-10-13 00:45:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26738688. Throughput: 0: 1672.9, 1: 1654.7. Samples: 6687788. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-13 00:45:28,607][45375] Avg episode reward: [(0, '40.380'), (1, '43.240')] +[2023-10-13 00:45:30,523][46663] Updated weights for policy 1, policy_version 13061 (0.0008) +[2023-10-13 00:45:30,798][46662] Updated weights for policy 0, policy_version 13060 (0.0007) +[2023-10-13 00:45:30,891][46663] Updated weights for policy 1, policy_version 13071 (0.0011) +[2023-10-13 00:45:31,171][46662] Updated weights for policy 0, policy_version 13070 (0.0009) +[2023-10-13 00:45:31,263][46663] Updated weights for policy 1, policy_version 13081 (0.0010) +[2023-10-13 00:45:31,541][46662] Updated weights for policy 0, policy_version 13080 (0.0008) +[2023-10-13 00:45:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26804224. Throughput: 0: 1654.0, 1: 1674.7. Samples: 6706974. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-13 00:45:33,607][45375] Avg episode reward: [(0, '40.480'), (1, '42.020')] +[2023-10-13 00:45:35,368][46663] Updated weights for policy 1, policy_version 13091 (0.0008) +[2023-10-13 00:45:35,591][46662] Updated weights for policy 0, policy_version 13090 (0.0008) +[2023-10-13 00:45:35,738][46663] Updated weights for policy 1, policy_version 13101 (0.0009) +[2023-10-13 00:45:35,968][46662] Updated weights for policy 0, policy_version 13100 (0.0009) +[2023-10-13 00:45:36,096][46663] Updated weights for policy 1, policy_version 13111 (0.0010) +[2023-10-13 00:45:36,333][46662] Updated weights for policy 0, policy_version 13110 (0.0009) +[2023-10-13 00:45:36,702][46662] Updated weights for policy 0, policy_version 13120 (0.0009) +[2023-10-13 00:45:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26869760. Throughput: 0: 1671.7, 1: 1680.9. Samples: 6727594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:38,607][45375] Avg episode reward: [(0, '41.520'), (1, '42.750')] +[2023-10-13 00:45:40,105][46663] Updated weights for policy 1, policy_version 13121 (0.0008) +[2023-10-13 00:45:40,469][46663] Updated weights for policy 1, policy_version 13131 (0.0007) +[2023-10-13 00:45:40,704][46662] Updated weights for policy 0, policy_version 13130 (0.0009) +[2023-10-13 00:45:40,837][46663] Updated weights for policy 1, policy_version 13141 (0.0008) +[2023-10-13 00:45:41,072][46662] Updated weights for policy 0, policy_version 13140 (0.0009) +[2023-10-13 00:45:41,204][46663] Updated weights for policy 1, policy_version 13151 (0.0007) +[2023-10-13 00:45:41,450][46662] Updated weights for policy 0, policy_version 13150 (0.0009) +[2023-10-13 00:45:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26935296. Throughput: 0: 1664.2, 1: 1656.4. Samples: 6737576. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:43,607][45375] Avg episode reward: [(0, '43.290'), (1, '40.310')] +[2023-10-13 00:45:45,294][46663] Updated weights for policy 1, policy_version 13161 (0.0010) +[2023-10-13 00:45:45,624][46662] Updated weights for policy 0, policy_version 13160 (0.0009) +[2023-10-13 00:45:45,659][46663] Updated weights for policy 1, policy_version 13171 (0.0008) +[2023-10-13 00:45:45,998][46662] Updated weights for policy 0, policy_version 13170 (0.0007) +[2023-10-13 00:45:46,027][46663] Updated weights for policy 1, policy_version 13181 (0.0010) +[2023-10-13 00:45:46,361][46662] Updated weights for policy 0, policy_version 13180 (0.0007) +[2023-10-13 00:45:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27000832. Throughput: 0: 1667.9, 1: 1673.0. Samples: 6757594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:45:48,607][45375] Avg episode reward: [(0, '42.620'), (1, '41.240')] +[2023-10-13 00:45:50,068][46663] Updated weights for policy 1, policy_version 13191 (0.0007) +[2023-10-13 00:45:50,434][46662] Updated weights for policy 0, policy_version 13190 (0.0008) +[2023-10-13 00:45:50,441][46663] Updated weights for policy 1, policy_version 13201 (0.0008) +[2023-10-13 00:45:50,801][46662] Updated weights for policy 0, policy_version 13200 (0.0008) +[2023-10-13 00:45:50,814][46663] Updated weights for policy 1, policy_version 13211 (0.0008) +[2023-10-13 00:45:51,178][46662] Updated weights for policy 0, policy_version 13210 (0.0007) +[2023-10-13 00:45:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27066368. Throughput: 0: 1681.4, 1: 1672.8. Samples: 6778100. Policy #0 lag: (min: 11.0, avg: 18.9, max: 43.0) +[2023-10-13 00:45:53,608][45375] Avg episode reward: [(0, '42.200'), (1, '42.320')] +[2023-10-13 00:45:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000013216_13533184.pth... +[2023-10-13 00:45:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000013216_13533184.pth... 
+[2023-10-13 00:45:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000011648_11927552.pth +[2023-10-13 00:45:53,659][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000011648_11927552.pth +[2023-10-13 00:45:54,782][46663] Updated weights for policy 1, policy_version 13221 (0.0008) +[2023-10-13 00:45:55,150][46663] Updated weights for policy 1, policy_version 13231 (0.0010) +[2023-10-13 00:45:55,281][46662] Updated weights for policy 0, policy_version 13220 (0.0009) +[2023-10-13 00:45:55,528][46663] Updated weights for policy 1, policy_version 13241 (0.0007) +[2023-10-13 00:45:55,650][46662] Updated weights for policy 0, policy_version 13230 (0.0007) +[2023-10-13 00:45:56,013][46662] Updated weights for policy 0, policy_version 13240 (0.0008) +[2023-10-13 00:45:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27131904. Throughput: 0: 1661.9, 1: 1669.8. Samples: 6787878. Policy #0 lag: (min: 11.0, avg: 18.9, max: 43.0) +[2023-10-13 00:45:58,607][45375] Avg episode reward: [(0, '44.790'), (1, '42.590')] +[2023-10-13 00:45:59,508][46663] Updated weights for policy 1, policy_version 13251 (0.0008) +[2023-10-13 00:45:59,870][46663] Updated weights for policy 1, policy_version 13261 (0.0007) +[2023-10-13 00:46:00,076][46662] Updated weights for policy 0, policy_version 13250 (0.0008) +[2023-10-13 00:46:00,244][46663] Updated weights for policy 1, policy_version 13271 (0.0007) +[2023-10-13 00:46:00,448][46662] Updated weights for policy 0, policy_version 13260 (0.0008) +[2023-10-13 00:46:00,814][46662] Updated weights for policy 0, policy_version 13270 (0.0009) +[2023-10-13 00:46:01,194][46662] Updated weights for policy 0, policy_version 13280 (0.0009) +[2023-10-13 00:46:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27197440. Throughput: 0: 1665.3, 1: 1682.7. Samples: 6808226. 
Policy #0 lag: (min: 11.0, avg: 18.9, max: 43.0) +[2023-10-13 00:46:03,607][45375] Avg episode reward: [(0, '45.710'), (1, '42.300')] +[2023-10-13 00:46:04,296][46663] Updated weights for policy 1, policy_version 13281 (0.0008) +[2023-10-13 00:46:04,671][46663] Updated weights for policy 1, policy_version 13291 (0.0010) +[2023-10-13 00:46:05,036][46663] Updated weights for policy 1, policy_version 13301 (0.0007) +[2023-10-13 00:46:05,231][46662] Updated weights for policy 0, policy_version 13290 (0.0009) +[2023-10-13 00:46:05,402][46663] Updated weights for policy 1, policy_version 13311 (0.0008) +[2023-10-13 00:46:05,591][46662] Updated weights for policy 0, policy_version 13300 (0.0009) +[2023-10-13 00:46:05,973][46662] Updated weights for policy 0, policy_version 13310 (0.0010) +[2023-10-13 00:46:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27262976. Throughput: 0: 1681.7, 1: 1684.9. Samples: 6828936. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:46:08,607][45375] Avg episode reward: [(0, '46.480'), (1, '43.490')] +[2023-10-13 00:46:09,442][46663] Updated weights for policy 1, policy_version 13321 (0.0007) +[2023-10-13 00:46:09,806][46663] Updated weights for policy 1, policy_version 13331 (0.0008) +[2023-10-13 00:46:10,139][46662] Updated weights for policy 0, policy_version 13320 (0.0008) +[2023-10-13 00:46:10,174][46663] Updated weights for policy 1, policy_version 13341 (0.0008) +[2023-10-13 00:46:10,506][46662] Updated weights for policy 0, policy_version 13330 (0.0007) +[2023-10-13 00:46:10,876][46662] Updated weights for policy 0, policy_version 13340 (0.0007) +[2023-10-13 00:46:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27328512. Throughput: 0: 1657.5, 1: 1685.4. Samples: 6838218. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:46:13,608][45375] Avg episode reward: [(0, '46.500'), (1, '44.130')] +[2023-10-13 00:46:14,390][46663] Updated weights for policy 1, policy_version 13351 (0.0009) +[2023-10-13 00:46:14,673][46662] Updated weights for policy 0, policy_version 13350 (0.0007) +[2023-10-13 00:46:14,758][46663] Updated weights for policy 1, policy_version 13361 (0.0008) +[2023-10-13 00:46:15,043][46662] Updated weights for policy 0, policy_version 13360 (0.0008) +[2023-10-13 00:46:15,122][46663] Updated weights for policy 1, policy_version 13371 (0.0008) +[2023-10-13 00:46:15,416][46662] Updated weights for policy 0, policy_version 13370 (0.0009) +[2023-10-13 00:46:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27394048. Throughput: 0: 1681.3, 1: 1685.8. Samples: 6858496. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:46:18,607][45375] Avg episode reward: [(0, '47.960'), (1, '44.640')] +[2023-10-13 00:46:18,608][46091] Saving new best policy, reward=47.960! +[2023-10-13 00:46:19,198][46663] Updated weights for policy 1, policy_version 13381 (0.0008) +[2023-10-13 00:46:19,572][46663] Updated weights for policy 1, policy_version 13391 (0.0007) +[2023-10-13 00:46:19,580][46662] Updated weights for policy 0, policy_version 13380 (0.0010) +[2023-10-13 00:46:19,929][46663] Updated weights for policy 1, policy_version 13401 (0.0008) +[2023-10-13 00:46:19,955][46662] Updated weights for policy 0, policy_version 13390 (0.0008) +[2023-10-13 00:46:20,332][46662] Updated weights for policy 0, policy_version 13400 (0.0008) +[2023-10-13 00:46:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 27459584. Throughput: 0: 1682.0, 1: 1687.4. Samples: 6879218. 
Policy #0 lag: (min: 20.0, avg: 24.2, max: 52.0) +[2023-10-13 00:46:23,608][45375] Avg episode reward: [(0, '48.150'), (1, '44.910')] +[2023-10-13 00:46:23,618][46091] Saving new best policy, reward=48.150! +[2023-10-13 00:46:24,004][46663] Updated weights for policy 1, policy_version 13411 (0.0008) +[2023-10-13 00:46:24,364][46663] Updated weights for policy 1, policy_version 13421 (0.0007) +[2023-10-13 00:46:24,514][46662] Updated weights for policy 0, policy_version 13410 (0.0007) +[2023-10-13 00:46:24,728][46663] Updated weights for policy 1, policy_version 13431 (0.0007) +[2023-10-13 00:46:24,882][46662] Updated weights for policy 0, policy_version 13420 (0.0009) +[2023-10-13 00:46:25,255][46662] Updated weights for policy 0, policy_version 13430 (0.0010) +[2023-10-13 00:46:25,632][46662] Updated weights for policy 0, policy_version 13440 (0.0011) +[2023-10-13 00:46:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27525120. Throughput: 0: 1657.8, 1: 1690.9. Samples: 6888268. Policy #0 lag: (min: 20.0, avg: 24.2, max: 52.0) +[2023-10-13 00:46:28,607][45375] Avg episode reward: [(0, '48.360'), (1, '44.890')] +[2023-10-13 00:46:28,608][46091] Saving new best policy, reward=48.360! +[2023-10-13 00:46:28,740][46663] Updated weights for policy 1, policy_version 13441 (0.0009) +[2023-10-13 00:46:29,101][46663] Updated weights for policy 1, policy_version 13451 (0.0011) +[2023-10-13 00:46:29,471][46663] Updated weights for policy 1, policy_version 13461 (0.0008) +[2023-10-13 00:46:29,832][46663] Updated weights for policy 1, policy_version 13471 (0.0008) +[2023-10-13 00:46:29,840][46662] Updated weights for policy 0, policy_version 13450 (0.0009) +[2023-10-13 00:46:30,216][46662] Updated weights for policy 0, policy_version 13460 (0.0010) +[2023-10-13 00:46:30,589][46662] Updated weights for policy 0, policy_version 13470 (0.0008) +[2023-10-13 00:46:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13329.3). 
Total num frames: 27590656. Throughput: 0: 1670.9, 1: 1690.8. Samples: 6908872. Policy #0 lag: (min: 20.0, avg: 24.2, max: 52.0) +[2023-10-13 00:46:33,608][45375] Avg episode reward: [(0, '50.160'), (1, '44.260')] +[2023-10-13 00:46:33,609][46091] Saving new best policy, reward=50.160! +[2023-10-13 00:46:33,800][46663] Updated weights for policy 1, policy_version 13481 (0.0009) +[2023-10-13 00:46:34,173][46663] Updated weights for policy 1, policy_version 13491 (0.0009) +[2023-10-13 00:46:34,538][46663] Updated weights for policy 1, policy_version 13501 (0.0009) +[2023-10-13 00:46:34,727][46662] Updated weights for policy 0, policy_version 13480 (0.0007) +[2023-10-13 00:46:35,101][46662] Updated weights for policy 0, policy_version 13490 (0.0008) +[2023-10-13 00:46:35,467][46662] Updated weights for policy 0, policy_version 13500 (0.0010) +[2023-10-13 00:46:38,604][46663] Updated weights for policy 1, policy_version 13511 (0.0008) +[2023-10-13 00:46:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27656192. Throughput: 0: 1676.7, 1: 1690.7. Samples: 6929632. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-13 00:46:38,607][45375] Avg episode reward: [(0, '51.130'), (1, '43.940')] +[2023-10-13 00:46:38,614][46091] Saving new best policy, reward=51.130! +[2023-10-13 00:46:38,965][46663] Updated weights for policy 1, policy_version 13521 (0.0011) +[2023-10-13 00:46:39,335][46663] Updated weights for policy 1, policy_version 13531 (0.0010) +[2023-10-13 00:46:39,384][46662] Updated weights for policy 0, policy_version 13510 (0.0008) +[2023-10-13 00:46:39,755][46662] Updated weights for policy 0, policy_version 13520 (0.0009) +[2023-10-13 00:46:40,134][46662] Updated weights for policy 0, policy_version 13530 (0.0009) +[2023-10-13 00:46:43,434][46663] Updated weights for policy 1, policy_version 13541 (0.0008) +[2023-10-13 00:46:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 27721728. Throughput: 0: 1665.6, 1: 1687.6. Samples: 6938776. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-13 00:46:43,607][45375] Avg episode reward: [(0, '49.880'), (1, '43.400')] +[2023-10-13 00:46:43,799][46663] Updated weights for policy 1, policy_version 13551 (0.0009) +[2023-10-13 00:46:44,072][46662] Updated weights for policy 0, policy_version 13540 (0.0009) +[2023-10-13 00:46:44,176][46663] Updated weights for policy 1, policy_version 13561 (0.0007) +[2023-10-13 00:46:44,446][46662] Updated weights for policy 0, policy_version 13550 (0.0008) +[2023-10-13 00:46:44,818][46662] Updated weights for policy 0, policy_version 13560 (0.0009) +[2023-10-13 00:46:48,402][46663] Updated weights for policy 1, policy_version 13571 (0.0009) +[2023-10-13 00:46:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27787264. Throughput: 0: 1682.3, 1: 1683.3. Samples: 6959680. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-13 00:46:48,607][45375] Avg episode reward: [(0, '49.060'), (1, '43.740')] +[2023-10-13 00:46:48,775][46663] Updated weights for policy 1, policy_version 13581 (0.0010) +[2023-10-13 00:46:48,985][46662] Updated weights for policy 0, policy_version 13570 (0.0008) +[2023-10-13 00:46:49,133][46663] Updated weights for policy 1, policy_version 13591 (0.0008) +[2023-10-13 00:46:49,349][46662] Updated weights for policy 0, policy_version 13580 (0.0008) +[2023-10-13 00:46:49,719][46662] Updated weights for policy 0, policy_version 13590 (0.0008) +[2023-10-13 00:46:50,083][46662] Updated weights for policy 0, policy_version 13600 (0.0009) +[2023-10-13 00:46:53,136][46663] Updated weights for policy 1, policy_version 13601 (0.0008) +[2023-10-13 00:46:53,501][46663] Updated weights for policy 1, policy_version 13611 (0.0008) +[2023-10-13 00:46:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27852800. Throughput: 0: 1680.1, 1: 1674.1. 
Samples: 6979876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:46:53,607][45375] Avg episode reward: [(0, '48.640'), (1, '42.890')] +[2023-10-13 00:46:53,871][46663] Updated weights for policy 1, policy_version 13621 (0.0007) +[2023-10-13 00:46:54,159][46662] Updated weights for policy 0, policy_version 13610 (0.0008) +[2023-10-13 00:46:54,230][46663] Updated weights for policy 1, policy_version 13631 (0.0008) +[2023-10-13 00:46:54,524][46662] Updated weights for policy 0, policy_version 13620 (0.0008) +[2023-10-13 00:46:54,902][46662] Updated weights for policy 0, policy_version 13630 (0.0009) +[2023-10-13 00:46:58,301][46663] Updated weights for policy 1, policy_version 13641 (0.0008) +[2023-10-13 00:46:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27918336. Throughput: 0: 1673.9, 1: 1679.0. Samples: 6989098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:46:58,607][45375] Avg episode reward: [(0, '48.200'), (1, '42.520')] +[2023-10-13 00:46:58,660][46663] Updated weights for policy 1, policy_version 13651 (0.0008) +[2023-10-13 00:46:58,914][46662] Updated weights for policy 0, policy_version 13640 (0.0007) +[2023-10-13 00:46:59,026][46663] Updated weights for policy 1, policy_version 13661 (0.0008) +[2023-10-13 00:46:59,285][46662] Updated weights for policy 0, policy_version 13650 (0.0009) +[2023-10-13 00:46:59,665][46662] Updated weights for policy 0, policy_version 13660 (0.0009) +[2023-10-13 00:47:03,209][46663] Updated weights for policy 1, policy_version 13671 (0.0009) +[2023-10-13 00:47:03,586][46663] Updated weights for policy 1, policy_version 13681 (0.0009) +[2023-10-13 00:47:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27983872. Throughput: 0: 1680.7, 1: 1686.2. Samples: 7010008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:47:03,608][45375] Avg episode reward: [(0, '47.050'), (1, '41.970')] +[2023-10-13 00:47:03,892][46662] Updated weights for policy 0, policy_version 13670 (0.0008) +[2023-10-13 00:47:03,956][46663] Updated weights for policy 1, policy_version 13691 (0.0007) +[2023-10-13 00:47:04,267][46662] Updated weights for policy 0, policy_version 13680 (0.0009) +[2023-10-13 00:47:04,638][46662] Updated weights for policy 0, policy_version 13690 (0.0008) +[2023-10-13 00:47:07,966][46663] Updated weights for policy 1, policy_version 13701 (0.0007) +[2023-10-13 00:47:08,335][46663] Updated weights for policy 1, policy_version 13711 (0.0008) +[2023-10-13 00:47:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28049408. Throughput: 0: 1679.0, 1: 1675.3. Samples: 7030164. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:47:08,607][45375] Avg episode reward: [(0, '47.700'), (1, '41.050')] +[2023-10-13 00:47:08,716][46663] Updated weights for policy 1, policy_version 13721 (0.0009) +[2023-10-13 00:47:08,855][46662] Updated weights for policy 0, policy_version 13700 (0.0007) +[2023-10-13 00:47:09,218][46662] Updated weights for policy 0, policy_version 13710 (0.0008) +[2023-10-13 00:47:09,597][46662] Updated weights for policy 0, policy_version 13720 (0.0009) +[2023-10-13 00:47:12,842][46663] Updated weights for policy 1, policy_version 13731 (0.0008) +[2023-10-13 00:47:13,212][46663] Updated weights for policy 1, policy_version 13741 (0.0008) +[2023-10-13 00:47:13,584][46663] Updated weights for policy 1, policy_version 13751 (0.0009) +[2023-10-13 00:47:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28114944. Throughput: 0: 1683.3, 1: 1683.4. Samples: 7039768. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:47:13,607][45375] Avg episode reward: [(0, '47.790'), (1, '40.970')] +[2023-10-13 00:47:13,702][46662] Updated weights for policy 0, policy_version 13730 (0.0008) +[2023-10-13 00:47:14,074][46662] Updated weights for policy 0, policy_version 13740 (0.0008) +[2023-10-13 00:47:14,448][46662] Updated weights for policy 0, policy_version 13750 (0.0008) +[2023-10-13 00:47:14,825][46662] Updated weights for policy 0, policy_version 13760 (0.0009) +[2023-10-13 00:47:17,518][46663] Updated weights for policy 1, policy_version 13761 (0.0007) +[2023-10-13 00:47:17,883][46663] Updated weights for policy 1, policy_version 13771 (0.0008) +[2023-10-13 00:47:18,260][46663] Updated weights for policy 1, policy_version 13781 (0.0009) +[2023-10-13 00:47:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28180480. Throughput: 0: 1684.5, 1: 1685.3. Samples: 7060510. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:47:18,607][45375] Avg episode reward: [(0, '47.190'), (1, '41.500')] +[2023-10-13 00:47:18,628][46663] Updated weights for policy 1, policy_version 13791 (0.0008) +[2023-10-13 00:47:18,749][46662] Updated weights for policy 0, policy_version 13770 (0.0008) +[2023-10-13 00:47:19,122][46662] Updated weights for policy 0, policy_version 13780 (0.0008) +[2023-10-13 00:47:19,494][46662] Updated weights for policy 0, policy_version 13790 (0.0009) +[2023-10-13 00:47:22,767][46663] Updated weights for policy 1, policy_version 13801 (0.0008) +[2023-10-13 00:47:23,132][46663] Updated weights for policy 1, policy_version 13811 (0.0007) +[2023-10-13 00:47:23,501][46663] Updated weights for policy 1, policy_version 13821 (0.0007) +[2023-10-13 00:47:23,505][46662] Updated weights for policy 0, policy_version 13800 (0.0008) +[2023-10-13 00:47:23,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28278784. 
Throughput: 0: 1683.2, 1: 1662.6. Samples: 7080190. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 00:47:23,608][45375] Avg episode reward: [(0, '46.720'), (1, '40.980')] +[2023-10-13 00:47:23,873][46662] Updated weights for policy 0, policy_version 13810 (0.0009) +[2023-10-13 00:47:24,242][46662] Updated weights for policy 0, policy_version 13820 (0.0010) +[2023-10-13 00:47:27,593][46663] Updated weights for policy 1, policy_version 13831 (0.0009) +[2023-10-13 00:47:27,973][46663] Updated weights for policy 1, policy_version 13841 (0.0010) +[2023-10-13 00:47:28,337][46663] Updated weights for policy 1, policy_version 13851 (0.0007) +[2023-10-13 00:47:28,363][46662] Updated weights for policy 0, policy_version 13830 (0.0008) +[2023-10-13 00:47:28,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28344320. Throughput: 0: 1682.0, 1: 1683.5. Samples: 7090222. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 00:47:28,607][45375] Avg episode reward: [(0, '47.260'), (1, '41.870')] +[2023-10-13 00:47:28,744][46662] Updated weights for policy 0, policy_version 13840 (0.0008) +[2023-10-13 00:47:29,119][46662] Updated weights for policy 0, policy_version 13850 (0.0010) +[2023-10-13 00:47:32,447][46663] Updated weights for policy 1, policy_version 13861 (0.0008) +[2023-10-13 00:47:32,814][46663] Updated weights for policy 1, policy_version 13871 (0.0009) +[2023-10-13 00:47:33,194][46663] Updated weights for policy 1, policy_version 13881 (0.0009) +[2023-10-13 00:47:33,296][46662] Updated weights for policy 0, policy_version 13860 (0.0008) +[2023-10-13 00:47:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28409856. Throughput: 0: 1675.6, 1: 1673.1. Samples: 7110374. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 00:47:33,608][45375] Avg episode reward: [(0, '47.130'), (1, '42.830')] +[2023-10-13 00:47:33,667][46662] Updated weights for policy 0, policy_version 13870 (0.0007) +[2023-10-13 00:47:34,040][46662] Updated weights for policy 0, policy_version 13880 (0.0008) +[2023-10-13 00:47:37,215][46663] Updated weights for policy 1, policy_version 13891 (0.0009) +[2023-10-13 00:47:37,577][46663] Updated weights for policy 1, policy_version 13901 (0.0009) +[2023-10-13 00:47:37,939][46663] Updated weights for policy 1, policy_version 13911 (0.0007) +[2023-10-13 00:47:38,101][46662] Updated weights for policy 0, policy_version 13890 (0.0009) +[2023-10-13 00:47:38,495][46662] Updated weights for policy 0, policy_version 13900 (0.0009) +[2023-10-13 00:47:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28475392. Throughput: 0: 1678.2, 1: 1655.9. Samples: 7129912. Policy #0 lag: (min: 18.0, avg: 26.0, max: 50.0) +[2023-10-13 00:47:38,607][45375] Avg episode reward: [(0, '46.080'), (1, '44.480')] +[2023-10-13 00:47:38,863][46662] Updated weights for policy 0, policy_version 13910 (0.0008) +[2023-10-13 00:47:39,235][46662] Updated weights for policy 0, policy_version 13920 (0.0008) +[2023-10-13 00:47:42,148][46663] Updated weights for policy 1, policy_version 13921 (0.0007) +[2023-10-13 00:47:42,510][46663] Updated weights for policy 1, policy_version 13931 (0.0009) +[2023-10-13 00:47:42,882][46663] Updated weights for policy 1, policy_version 13941 (0.0010) +[2023-10-13 00:47:43,253][46663] Updated weights for policy 1, policy_version 13951 (0.0008) +[2023-10-13 00:47:43,305][46662] Updated weights for policy 0, policy_version 13930 (0.0007) +[2023-10-13 00:47:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28540928. Throughput: 0: 1675.3, 1: 1679.2. Samples: 7140052. 
Policy #0 lag: (min: 18.0, avg: 26.0, max: 50.0) +[2023-10-13 00:47:43,607][45375] Avg episode reward: [(0, '45.740'), (1, '43.110')] +[2023-10-13 00:47:43,675][46662] Updated weights for policy 0, policy_version 13940 (0.0007) +[2023-10-13 00:47:44,041][46662] Updated weights for policy 0, policy_version 13950 (0.0007) +[2023-10-13 00:47:47,431][46663] Updated weights for policy 1, policy_version 13961 (0.0011) +[2023-10-13 00:47:47,805][46663] Updated weights for policy 1, policy_version 13971 (0.0009) +[2023-10-13 00:47:47,994][46662] Updated weights for policy 0, policy_version 13960 (0.0008) +[2023-10-13 00:47:48,166][46663] Updated weights for policy 1, policy_version 13981 (0.0009) +[2023-10-13 00:47:48,366][46662] Updated weights for policy 0, policy_version 13970 (0.0009) +[2023-10-13 00:47:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28606464. Throughput: 0: 1674.9, 1: 1666.5. Samples: 7160372. Policy #0 lag: (min: 18.0, avg: 26.0, max: 50.0) +[2023-10-13 00:47:48,608][45375] Avg episode reward: [(0, '47.070'), (1, '42.420')] +[2023-10-13 00:47:48,730][46662] Updated weights for policy 0, policy_version 13980 (0.0010) +[2023-10-13 00:47:52,278][46663] Updated weights for policy 1, policy_version 13991 (0.0008) +[2023-10-13 00:47:52,657][46663] Updated weights for policy 1, policy_version 14001 (0.0009) +[2023-10-13 00:47:52,817][46662] Updated weights for policy 0, policy_version 13990 (0.0008) +[2023-10-13 00:47:53,015][46663] Updated weights for policy 1, policy_version 14011 (0.0008) +[2023-10-13 00:47:53,183][46662] Updated weights for policy 0, policy_version 14000 (0.0007) +[2023-10-13 00:47:53,554][46662] Updated weights for policy 0, policy_version 14010 (0.0008) +[2023-10-13 00:47:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 28672000. Throughput: 0: 1674.6, 1: 1648.7. Samples: 7179716. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-13 00:47:53,608][45375] Avg episode reward: [(0, '47.630'), (1, '42.660')] +[2023-10-13 00:47:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000014016_14352384.pth... +[2023-10-13 00:47:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000012448_12746752.pth +[2023-10-13 00:47:53,771][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000014016_14352384.pth... +[2023-10-13 00:47:53,800][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000012448_12746752.pth +[2023-10-13 00:47:56,971][46663] Updated weights for policy 1, policy_version 14021 (0.0010) +[2023-10-13 00:47:57,345][46663] Updated weights for policy 1, policy_version 14031 (0.0009) +[2023-10-13 00:47:57,599][46662] Updated weights for policy 0, policy_version 14020 (0.0008) +[2023-10-13 00:47:57,709][46663] Updated weights for policy 1, policy_version 14041 (0.0009) +[2023-10-13 00:47:57,967][46662] Updated weights for policy 0, policy_version 14030 (0.0008) +[2023-10-13 00:47:58,346][46662] Updated weights for policy 0, policy_version 14040 (0.0007) +[2023-10-13 00:47:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 28737536. Throughput: 0: 1678.1, 1: 1667.7. Samples: 7190330. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-13 00:47:58,607][45375] Avg episode reward: [(0, '47.420'), (1, '43.320')] +[2023-10-13 00:48:01,782][46663] Updated weights for policy 1, policy_version 14051 (0.0008) +[2023-10-13 00:48:02,142][46663] Updated weights for policy 1, policy_version 14061 (0.0007) +[2023-10-13 00:48:02,332][46662] Updated weights for policy 0, policy_version 14050 (0.0009) +[2023-10-13 00:48:02,509][46663] Updated weights for policy 1, policy_version 14071 (0.0009) +[2023-10-13 00:48:02,711][46662] Updated weights for policy 0, policy_version 14060 (0.0008) +[2023-10-13 00:48:03,074][46662] Updated weights for policy 0, policy_version 14070 (0.0008) +[2023-10-13 00:48:03,446][46662] Updated weights for policy 0, policy_version 14080 (0.0008) +[2023-10-13 00:48:03,606][45375] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 28835840. Throughput: 0: 1682.3, 1: 1646.4. Samples: 7210304. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 00:48:03,607][45375] Avg episode reward: [(0, '48.110'), (1, '43.630')] +[2023-10-13 00:48:06,633][46663] Updated weights for policy 1, policy_version 14081 (0.0008) +[2023-10-13 00:48:06,999][46663] Updated weights for policy 1, policy_version 14091 (0.0007) +[2023-10-13 00:48:07,355][46663] Updated weights for policy 1, policy_version 14101 (0.0009) +[2023-10-13 00:48:07,524][46662] Updated weights for policy 0, policy_version 14090 (0.0008) +[2023-10-13 00:48:07,722][46663] Updated weights for policy 1, policy_version 14111 (0.0007) +[2023-10-13 00:48:07,893][46662] Updated weights for policy 0, policy_version 14100 (0.0009) +[2023-10-13 00:48:08,268][46662] Updated weights for policy 0, policy_version 14110 (0.0008) +[2023-10-13 00:48:08,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 28901376. Throughput: 0: 1669.0, 1: 1657.3. Samples: 7229874. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 00:48:08,607][45375] Avg episode reward: [(0, '48.720'), (1, '42.860')] +[2023-10-13 00:48:12,020][46663] Updated weights for policy 1, policy_version 14121 (0.0008) +[2023-10-13 00:48:12,368][46662] Updated weights for policy 0, policy_version 14120 (0.0008) +[2023-10-13 00:48:12,394][46663] Updated weights for policy 1, policy_version 14131 (0.0009) +[2023-10-13 00:48:12,733][46662] Updated weights for policy 0, policy_version 14130 (0.0008) +[2023-10-13 00:48:12,750][46663] Updated weights for policy 1, policy_version 14141 (0.0008) +[2023-10-13 00:48:13,107][46662] Updated weights for policy 0, policy_version 14140 (0.0007) +[2023-10-13 00:48:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 28966912. Throughput: 0: 1682.1, 1: 1663.6. Samples: 7240780. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 00:48:13,607][45375] Avg episode reward: [(0, '47.510'), (1, '43.870')] +[2023-10-13 00:48:16,833][46663] Updated weights for policy 1, policy_version 14151 (0.0008) +[2023-10-13 00:48:16,993][46662] Updated weights for policy 0, policy_version 14150 (0.0009) +[2023-10-13 00:48:17,204][46663] Updated weights for policy 1, policy_version 14161 (0.0007) +[2023-10-13 00:48:17,371][46662] Updated weights for policy 0, policy_version 14160 (0.0009) +[2023-10-13 00:48:17,571][46663] Updated weights for policy 1, policy_version 14171 (0.0008) +[2023-10-13 00:48:17,737][46662] Updated weights for policy 0, policy_version 14170 (0.0008) +[2023-10-13 00:48:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 29032448. Throughput: 0: 1689.6, 1: 1658.8. Samples: 7261056. 
Policy #0 lag: (min: 30.0, avg: 31.8, max: 59.0) +[2023-10-13 00:48:18,607][45375] Avg episode reward: [(0, '48.200'), (1, '45.340')] +[2023-10-13 00:48:21,727][46663] Updated weights for policy 1, policy_version 14181 (0.0007) +[2023-10-13 00:48:21,859][46662] Updated weights for policy 0, policy_version 14180 (0.0007) +[2023-10-13 00:48:22,097][46663] Updated weights for policy 1, policy_version 14191 (0.0007) +[2023-10-13 00:48:22,229][46662] Updated weights for policy 0, policy_version 14190 (0.0007) +[2023-10-13 00:48:22,460][46663] Updated weights for policy 1, policy_version 14201 (0.0008) +[2023-10-13 00:48:22,599][46662] Updated weights for policy 0, policy_version 14200 (0.0008) +[2023-10-13 00:48:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29097984. Throughput: 0: 1666.3, 1: 1671.7. Samples: 7280120. Policy #0 lag: (min: 30.0, avg: 31.8, max: 59.0) +[2023-10-13 00:48:23,607][45375] Avg episode reward: [(0, '48.920'), (1, '46.040')] +[2023-10-13 00:48:26,489][46663] Updated weights for policy 1, policy_version 14211 (0.0010) +[2023-10-13 00:48:26,749][46662] Updated weights for policy 0, policy_version 14210 (0.0007) +[2023-10-13 00:48:26,862][46663] Updated weights for policy 1, policy_version 14221 (0.0009) +[2023-10-13 00:48:27,168][46662] Updated weights for policy 0, policy_version 14220 (0.0008) +[2023-10-13 00:48:27,223][46663] Updated weights for policy 1, policy_version 14231 (0.0009) +[2023-10-13 00:48:27,540][46662] Updated weights for policy 0, policy_version 14230 (0.0008) +[2023-10-13 00:48:27,905][46662] Updated weights for policy 0, policy_version 14240 (0.0008) +[2023-10-13 00:48:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29163520. Throughput: 0: 1693.4, 1: 1670.7. Samples: 7291438. 
Policy #0 lag: (min: 30.0, avg: 31.8, max: 59.0) +[2023-10-13 00:48:28,608][45375] Avg episode reward: [(0, '48.960'), (1, '45.220')] +[2023-10-13 00:48:31,259][46663] Updated weights for policy 1, policy_version 14241 (0.0008) +[2023-10-13 00:48:31,624][46663] Updated weights for policy 1, policy_version 14251 (0.0007) +[2023-10-13 00:48:32,000][46663] Updated weights for policy 1, policy_version 14261 (0.0007) +[2023-10-13 00:48:32,046][46662] Updated weights for policy 0, policy_version 14250 (0.0007) +[2023-10-13 00:48:32,356][46663] Updated weights for policy 1, policy_version 14271 (0.0007) +[2023-10-13 00:48:32,419][46662] Updated weights for policy 0, policy_version 14260 (0.0007) +[2023-10-13 00:48:32,786][46662] Updated weights for policy 0, policy_version 14270 (0.0008) +[2023-10-13 00:48:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29229056. Throughput: 0: 1683.7, 1: 1655.4. Samples: 7310634. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:48:33,608][45375] Avg episode reward: [(0, '48.940'), (1, '44.490')] +[2023-10-13 00:48:36,452][46663] Updated weights for policy 1, policy_version 14281 (0.0009) +[2023-10-13 00:48:36,716][46662] Updated weights for policy 0, policy_version 14280 (0.0010) +[2023-10-13 00:48:36,826][46663] Updated weights for policy 1, policy_version 14291 (0.0007) +[2023-10-13 00:48:37,085][46662] Updated weights for policy 0, policy_version 14290 (0.0008) +[2023-10-13 00:48:37,190][46663] Updated weights for policy 1, policy_version 14301 (0.0008) +[2023-10-13 00:48:37,459][46662] Updated weights for policy 0, policy_version 14300 (0.0008) +[2023-10-13 00:48:38,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29294592. Throughput: 0: 1662.0, 1: 1678.1. Samples: 7330020. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:48:38,607][45375] Avg episode reward: [(0, '47.970'), (1, '45.120')] +[2023-10-13 00:48:41,326][46663] Updated weights for policy 1, policy_version 14311 (0.0007) +[2023-10-13 00:48:41,488][46662] Updated weights for policy 0, policy_version 14310 (0.0008) +[2023-10-13 00:48:41,717][46663] Updated weights for policy 1, policy_version 14321 (0.0008) +[2023-10-13 00:48:41,859][46662] Updated weights for policy 0, policy_version 14320 (0.0007) +[2023-10-13 00:48:42,088][46663] Updated weights for policy 1, policy_version 14331 (0.0009) +[2023-10-13 00:48:42,228][46662] Updated weights for policy 0, policy_version 14330 (0.0007) +[2023-10-13 00:48:43,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29360128. Throughput: 0: 1689.1, 1: 1668.4. Samples: 7341416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:48:43,607][45375] Avg episode reward: [(0, '47.120'), (1, '45.140')] +[2023-10-13 00:48:46,066][46663] Updated weights for policy 1, policy_version 14341 (0.0010) +[2023-10-13 00:48:46,109][46662] Updated weights for policy 0, policy_version 14340 (0.0009) +[2023-10-13 00:48:46,431][46663] Updated weights for policy 1, policy_version 14351 (0.0008) +[2023-10-13 00:48:46,479][46662] Updated weights for policy 0, policy_version 14350 (0.0007) +[2023-10-13 00:48:46,802][46663] Updated weights for policy 1, policy_version 14361 (0.0007) +[2023-10-13 00:48:46,855][46662] Updated weights for policy 0, policy_version 14360 (0.0008) +[2023-10-13 00:48:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29425664. Throughput: 0: 1671.3, 1: 1661.0. Samples: 7360256. Policy #0 lag: (min: 27.0, avg: 33.2, max: 59.0) +[2023-10-13 00:48:48,608][45375] Avg episode reward: [(0, '47.850'), (1, '48.000')] +[2023-10-13 00:48:48,609][46384] Saving new best policy, reward=48.000! 
+[2023-10-13 00:48:51,006][46663] Updated weights for policy 1, policy_version 14371 (0.0009) +[2023-10-13 00:48:51,010][46662] Updated weights for policy 0, policy_version 14370 (0.0007) +[2023-10-13 00:48:51,369][46662] Updated weights for policy 0, policy_version 14380 (0.0009) +[2023-10-13 00:48:51,369][46663] Updated weights for policy 1, policy_version 14381 (0.0008) +[2023-10-13 00:48:51,735][46663] Updated weights for policy 1, policy_version 14391 (0.0007) +[2023-10-13 00:48:51,747][46662] Updated weights for policy 0, policy_version 14390 (0.0008) +[2023-10-13 00:48:52,107][46662] Updated weights for policy 0, policy_version 14400 (0.0007) +[2023-10-13 00:48:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29491200. Throughput: 0: 1674.3, 1: 1668.8. Samples: 7380312. Policy #0 lag: (min: 27.0, avg: 33.2, max: 59.0) +[2023-10-13 00:48:53,608][45375] Avg episode reward: [(0, '47.400'), (1, '47.340')] +[2023-10-13 00:48:55,857][46663] Updated weights for policy 1, policy_version 14401 (0.0009) +[2023-10-13 00:48:56,109][46662] Updated weights for policy 0, policy_version 14410 (0.0008) +[2023-10-13 00:48:56,228][46663] Updated weights for policy 1, policy_version 14411 (0.0008) +[2023-10-13 00:48:56,489][46662] Updated weights for policy 0, policy_version 14420 (0.0010) +[2023-10-13 00:48:56,596][46663] Updated weights for policy 1, policy_version 14421 (0.0008) +[2023-10-13 00:48:56,864][46662] Updated weights for policy 0, policy_version 14430 (0.0007) +[2023-10-13 00:48:56,953][46663] Updated weights for policy 1, policy_version 14431 (0.0007) +[2023-10-13 00:48:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29556736. Throughput: 0: 1687.4, 1: 1655.8. Samples: 7391226. 
Policy #0 lag: (min: 27.0, avg: 33.2, max: 59.0) +[2023-10-13 00:48:58,607][45375] Avg episode reward: [(0, '48.750'), (1, '46.820')] +[2023-10-13 00:49:01,051][46662] Updated weights for policy 0, policy_version 14440 (0.0009) +[2023-10-13 00:49:01,074][46663] Updated weights for policy 1, policy_version 14441 (0.0007) +[2023-10-13 00:49:01,414][46662] Updated weights for policy 0, policy_version 14450 (0.0008) +[2023-10-13 00:49:01,439][46663] Updated weights for policy 1, policy_version 14451 (0.0007) +[2023-10-13 00:49:01,792][46662] Updated weights for policy 0, policy_version 14460 (0.0007) +[2023-10-13 00:49:01,801][46663] Updated weights for policy 1, policy_version 14461 (0.0008) +[2023-10-13 00:49:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 29622272. Throughput: 0: 1653.5, 1: 1654.2. Samples: 7409904. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:49:03,608][45375] Avg episode reward: [(0, '48.330'), (1, '47.490')] +[2023-10-13 00:49:06,019][46662] Updated weights for policy 0, policy_version 14470 (0.0008) +[2023-10-13 00:49:06,073][46663] Updated weights for policy 1, policy_version 14471 (0.0010) +[2023-10-13 00:49:06,394][46662] Updated weights for policy 0, policy_version 14480 (0.0008) +[2023-10-13 00:49:06,450][46663] Updated weights for policy 1, policy_version 14481 (0.0008) +[2023-10-13 00:49:06,761][46662] Updated weights for policy 0, policy_version 14490 (0.0008) +[2023-10-13 00:49:06,817][46663] Updated weights for policy 1, policy_version 14491 (0.0008) +[2023-10-13 00:49:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 29687808. Throughput: 0: 1672.4, 1: 1663.2. Samples: 7430220. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:49:08,608][45375] Avg episode reward: [(0, '48.400'), (1, '49.010')] +[2023-10-13 00:49:08,619][46384] Saving new best policy, reward=49.010! 
+[2023-10-13 00:49:10,836][46662] Updated weights for policy 0, policy_version 14500 (0.0009) +[2023-10-13 00:49:11,033][46663] Updated weights for policy 1, policy_version 14501 (0.0007) +[2023-10-13 00:49:11,208][46662] Updated weights for policy 0, policy_version 14510 (0.0008) +[2023-10-13 00:49:11,402][46663] Updated weights for policy 1, policy_version 14511 (0.0009) +[2023-10-13 00:49:11,581][46662] Updated weights for policy 0, policy_version 14520 (0.0008) +[2023-10-13 00:49:11,755][46663] Updated weights for policy 1, policy_version 14521 (0.0007) +[2023-10-13 00:49:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 29753344. Throughput: 0: 1672.2, 1: 1650.3. Samples: 7440950. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:49:13,607][45375] Avg episode reward: [(0, '50.050'), (1, '47.830')] +[2023-10-13 00:49:15,849][46662] Updated weights for policy 0, policy_version 14530 (0.0007) +[2023-10-13 00:49:15,901][46663] Updated weights for policy 1, policy_version 14531 (0.0007) +[2023-10-13 00:49:16,250][46662] Updated weights for policy 0, policy_version 14540 (0.0009) +[2023-10-13 00:49:16,269][46663] Updated weights for policy 1, policy_version 14541 (0.0009) +[2023-10-13 00:49:16,622][46662] Updated weights for policy 0, policy_version 14550 (0.0009) +[2023-10-13 00:49:16,632][46663] Updated weights for policy 1, policy_version 14551 (0.0009) +[2023-10-13 00:49:16,994][46662] Updated weights for policy 0, policy_version 14560 (0.0009) +[2023-10-13 00:49:18,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 29818880. Throughput: 0: 1651.4, 1: 1658.2. Samples: 7459568. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:49:18,607][45375] Avg episode reward: [(0, '47.180'), (1, '47.290')] +[2023-10-13 00:49:20,656][46663] Updated weights for policy 1, policy_version 14561 (0.0007) +[2023-10-13 00:49:21,023][46663] Updated weights for policy 1, policy_version 14571 (0.0009) +[2023-10-13 00:49:21,099][46662] Updated weights for policy 0, policy_version 14570 (0.0009) +[2023-10-13 00:49:21,397][46663] Updated weights for policy 1, policy_version 14581 (0.0010) +[2023-10-13 00:49:21,476][46662] Updated weights for policy 0, policy_version 14580 (0.0008) +[2023-10-13 00:49:21,761][46663] Updated weights for policy 1, policy_version 14591 (0.0007) +[2023-10-13 00:49:21,839][46662] Updated weights for policy 0, policy_version 14590 (0.0008) +[2023-10-13 00:49:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 29884416. Throughput: 0: 1666.5, 1: 1660.8. Samples: 7479750. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:49:23,608][45375] Avg episode reward: [(0, '47.270'), (1, '46.320')] +[2023-10-13 00:49:25,962][46662] Updated weights for policy 0, policy_version 14600 (0.0008) +[2023-10-13 00:49:25,990][46663] Updated weights for policy 1, policy_version 14601 (0.0008) +[2023-10-13 00:49:26,339][46662] Updated weights for policy 0, policy_version 14610 (0.0009) +[2023-10-13 00:49:26,364][46663] Updated weights for policy 1, policy_version 14611 (0.0009) +[2023-10-13 00:49:26,712][46662] Updated weights for policy 0, policy_version 14620 (0.0008) +[2023-10-13 00:49:26,741][46663] Updated weights for policy 1, policy_version 14621 (0.0008) +[2023-10-13 00:49:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 29949952. Throughput: 0: 1660.8, 1: 1649.2. Samples: 7490368. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 00:49:28,607][45375] Avg episode reward: [(0, '48.970'), (1, '45.870')] +[2023-10-13 00:49:30,671][46663] Updated weights for policy 1, policy_version 14631 (0.0007) +[2023-10-13 00:49:30,839][46662] Updated weights for policy 0, policy_version 14630 (0.0008) +[2023-10-13 00:49:31,040][46663] Updated weights for policy 1, policy_version 14641 (0.0008) +[2023-10-13 00:49:31,213][46662] Updated weights for policy 0, policy_version 14640 (0.0008) +[2023-10-13 00:49:31,406][46663] Updated weights for policy 1, policy_version 14651 (0.0010) +[2023-10-13 00:49:31,582][46662] Updated weights for policy 0, policy_version 14650 (0.0007) +[2023-10-13 00:49:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30015488. Throughput: 0: 1648.5, 1: 1663.8. Samples: 7509310. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:49:33,608][45375] Avg episode reward: [(0, '48.920'), (1, '46.000')] +[2023-10-13 00:49:35,594][46663] Updated weights for policy 1, policy_version 14661 (0.0009) +[2023-10-13 00:49:35,762][46662] Updated weights for policy 0, policy_version 14660 (0.0008) +[2023-10-13 00:49:35,964][46663] Updated weights for policy 1, policy_version 14671 (0.0010) +[2023-10-13 00:49:36,134][46662] Updated weights for policy 0, policy_version 14670 (0.0009) +[2023-10-13 00:49:36,336][46663] Updated weights for policy 1, policy_version 14681 (0.0008) +[2023-10-13 00:49:36,515][46662] Updated weights for policy 0, policy_version 14680 (0.0008) +[2023-10-13 00:49:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 30081024. Throughput: 0: 1655.5, 1: 1662.8. Samples: 7529634. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:49:38,608][45375] Avg episode reward: [(0, '48.640'), (1, '45.990')] +[2023-10-13 00:49:40,523][46663] Updated weights for policy 1, policy_version 14691 (0.0008) +[2023-10-13 00:49:40,722][46662] Updated weights for policy 0, policy_version 14690 (0.0007) +[2023-10-13 00:49:40,888][46663] Updated weights for policy 1, policy_version 14701 (0.0009) +[2023-10-13 00:49:41,096][46662] Updated weights for policy 0, policy_version 14700 (0.0007) +[2023-10-13 00:49:41,256][46663] Updated weights for policy 1, policy_version 14711 (0.0008) +[2023-10-13 00:49:41,471][46662] Updated weights for policy 0, policy_version 14710 (0.0007) +[2023-10-13 00:49:41,845][46662] Updated weights for policy 0, policy_version 14720 (0.0008) +[2023-10-13 00:49:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30146560. Throughput: 0: 1652.4, 1: 1653.5. Samples: 7539990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:49:43,607][45375] Avg episode reward: [(0, '48.530'), (1, '48.040')] +[2023-10-13 00:49:45,501][46663] Updated weights for policy 1, policy_version 14721 (0.0010) +[2023-10-13 00:49:45,800][46662] Updated weights for policy 0, policy_version 14730 (0.0010) +[2023-10-13 00:49:45,865][46663] Updated weights for policy 1, policy_version 14731 (0.0008) +[2023-10-13 00:49:46,165][46662] Updated weights for policy 0, policy_version 14740 (0.0009) +[2023-10-13 00:49:46,235][46663] Updated weights for policy 1, policy_version 14741 (0.0007) +[2023-10-13 00:49:46,545][46662] Updated weights for policy 0, policy_version 14750 (0.0007) +[2023-10-13 00:49:46,595][46663] Updated weights for policy 1, policy_version 14751 (0.0008) +[2023-10-13 00:49:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30212096. Throughput: 0: 1654.5, 1: 1660.1. Samples: 7559058. 
Policy #0 lag: (min: 5.0, avg: 14.4, max: 37.0) +[2023-10-13 00:49:48,607][45375] Avg episode reward: [(0, '49.080'), (1, '47.870')] +[2023-10-13 00:49:50,556][46662] Updated weights for policy 0, policy_version 14760 (0.0008) +[2023-10-13 00:49:50,765][46663] Updated weights for policy 1, policy_version 14761 (0.0008) +[2023-10-13 00:49:50,932][46662] Updated weights for policy 0, policy_version 14770 (0.0008) +[2023-10-13 00:49:51,127][46663] Updated weights for policy 1, policy_version 14771 (0.0008) +[2023-10-13 00:49:51,289][46662] Updated weights for policy 0, policy_version 14780 (0.0007) +[2023-10-13 00:49:51,500][46663] Updated weights for policy 1, policy_version 14781 (0.0009) +[2023-10-13 00:49:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30277632. Throughput: 0: 1660.4, 1: 1658.6. Samples: 7579578. Policy #0 lag: (min: 5.0, avg: 14.4, max: 37.0) +[2023-10-13 00:49:53,607][45375] Avg episode reward: [(0, '49.380'), (1, '47.060')] +[2023-10-13 00:49:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000014784_15138816.pth... +[2023-10-13 00:49:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000014784_15138816.pth... 
+[2023-10-13 00:49:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000013216_13533184.pth +[2023-10-13 00:49:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000013216_13533184.pth +[2023-10-13 00:49:55,186][46662] Updated weights for policy 0, policy_version 14790 (0.0009) +[2023-10-13 00:49:55,517][46663] Updated weights for policy 1, policy_version 14791 (0.0008) +[2023-10-13 00:49:55,563][46662] Updated weights for policy 0, policy_version 14800 (0.0007) +[2023-10-13 00:49:55,886][46663] Updated weights for policy 1, policy_version 14801 (0.0007) +[2023-10-13 00:49:55,933][46662] Updated weights for policy 0, policy_version 14810 (0.0008) +[2023-10-13 00:49:56,248][46663] Updated weights for policy 1, policy_version 14811 (0.0008) +[2023-10-13 00:49:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30343168. Throughput: 0: 1649.5, 1: 1651.4. Samples: 7589490. Policy #0 lag: (min: 5.0, avg: 14.4, max: 37.0) +[2023-10-13 00:49:58,607][45375] Avg episode reward: [(0, '49.640'), (1, '50.000')] +[2023-10-13 00:49:58,608][46384] Saving new best policy, reward=50.000! +[2023-10-13 00:50:00,111][46662] Updated weights for policy 0, policy_version 14820 (0.0008) +[2023-10-13 00:50:00,277][46663] Updated weights for policy 1, policy_version 14821 (0.0008) +[2023-10-13 00:50:00,482][46662] Updated weights for policy 0, policy_version 14830 (0.0007) +[2023-10-13 00:50:00,651][46663] Updated weights for policy 1, policy_version 14831 (0.0007) +[2023-10-13 00:50:00,862][46662] Updated weights for policy 0, policy_version 14840 (0.0008) +[2023-10-13 00:50:01,018][46663] Updated weights for policy 1, policy_version 14841 (0.0008) +[2023-10-13 00:50:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30408704. Throughput: 0: 1668.0, 1: 1665.8. Samples: 7609588. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-13 00:50:03,607][45375] Avg episode reward: [(0, '48.730'), (1, '50.410')] +[2023-10-13 00:50:03,608][46384] Saving new best policy, reward=50.410! +[2023-10-13 00:50:04,746][46662] Updated weights for policy 0, policy_version 14850 (0.0008) +[2023-10-13 00:50:05,124][46663] Updated weights for policy 1, policy_version 14851 (0.0007) +[2023-10-13 00:50:05,132][46662] Updated weights for policy 0, policy_version 14860 (0.0008) +[2023-10-13 00:50:05,488][46663] Updated weights for policy 1, policy_version 14861 (0.0007) +[2023-10-13 00:50:05,492][46662] Updated weights for policy 0, policy_version 14870 (0.0008) +[2023-10-13 00:50:05,852][46663] Updated weights for policy 1, policy_version 14871 (0.0007) +[2023-10-13 00:50:05,861][46662] Updated weights for policy 0, policy_version 14880 (0.0009) +[2023-10-13 00:50:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30474240. Throughput: 0: 1680.0, 1: 1668.0. Samples: 7630408. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-13 00:50:08,607][45375] Avg episode reward: [(0, '46.520'), (1, '50.140')] +[2023-10-13 00:50:09,883][46663] Updated weights for policy 1, policy_version 14881 (0.0008) +[2023-10-13 00:50:10,008][46662] Updated weights for policy 0, policy_version 14890 (0.0007) +[2023-10-13 00:50:10,251][46663] Updated weights for policy 1, policy_version 14891 (0.0007) +[2023-10-13 00:50:10,374][46662] Updated weights for policy 0, policy_version 14900 (0.0007) +[2023-10-13 00:50:10,619][46663] Updated weights for policy 1, policy_version 14901 (0.0008) +[2023-10-13 00:50:10,741][46662] Updated weights for policy 0, policy_version 14910 (0.0008) +[2023-10-13 00:50:10,981][46663] Updated weights for policy 1, policy_version 14911 (0.0010) +[2023-10-13 00:50:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30539776. Throughput: 0: 1654.3, 1: 1658.9. 
Samples: 7639462. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-13 00:50:13,607][45375] Avg episode reward: [(0, '44.000'), (1, '50.160')] +[2023-10-13 00:50:14,848][46662] Updated weights for policy 0, policy_version 14920 (0.0007) +[2023-10-13 00:50:15,042][46663] Updated weights for policy 1, policy_version 14921 (0.0008) +[2023-10-13 00:50:15,209][46662] Updated weights for policy 0, policy_version 14930 (0.0009) +[2023-10-13 00:50:15,407][46663] Updated weights for policy 1, policy_version 14931 (0.0007) +[2023-10-13 00:50:15,590][46662] Updated weights for policy 0, policy_version 14940 (0.0008) +[2023-10-13 00:50:15,787][46663] Updated weights for policy 1, policy_version 14941 (0.0008) +[2023-10-13 00:50:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30605312. Throughput: 0: 1682.3, 1: 1667.5. Samples: 7660048. Policy #0 lag: (min: 5.0, avg: 15.9, max: 37.0) +[2023-10-13 00:50:18,607][45375] Avg episode reward: [(0, '43.630'), (1, '50.540')] +[2023-10-13 00:50:18,609][46384] Saving new best policy, reward=50.540! +[2023-10-13 00:50:19,778][46662] Updated weights for policy 0, policy_version 14950 (0.0009) +[2023-10-13 00:50:19,951][46663] Updated weights for policy 1, policy_version 14951 (0.0008) +[2023-10-13 00:50:20,149][46662] Updated weights for policy 0, policy_version 14960 (0.0008) +[2023-10-13 00:50:20,317][46663] Updated weights for policy 1, policy_version 14961 (0.0009) +[2023-10-13 00:50:20,531][46662] Updated weights for policy 0, policy_version 14970 (0.0007) +[2023-10-13 00:50:20,681][46663] Updated weights for policy 1, policy_version 14971 (0.0009) +[2023-10-13 00:50:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30670848. Throughput: 0: 1680.9, 1: 1673.2. Samples: 7680570. 
Policy #0 lag: (min: 5.0, avg: 15.9, max: 37.0) +[2023-10-13 00:50:23,608][45375] Avg episode reward: [(0, '42.830'), (1, '50.110')] +[2023-10-13 00:50:24,542][46662] Updated weights for policy 0, policy_version 14980 (0.0008) +[2023-10-13 00:50:24,612][46663] Updated weights for policy 1, policy_version 14981 (0.0007) +[2023-10-13 00:50:24,905][46662] Updated weights for policy 0, policy_version 14990 (0.0009) +[2023-10-13 00:50:24,979][46663] Updated weights for policy 1, policy_version 14991 (0.0009) +[2023-10-13 00:50:25,283][46662] Updated weights for policy 0, policy_version 15000 (0.0009) +[2023-10-13 00:50:25,358][46663] Updated weights for policy 1, policy_version 15001 (0.0008) +[2023-10-13 00:50:28,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30736384. Throughput: 0: 1658.1, 1: 1668.4. Samples: 7689682. Policy #0 lag: (min: 5.0, avg: 15.9, max: 37.0) +[2023-10-13 00:50:28,607][45375] Avg episode reward: [(0, '41.840'), (1, '51.080')] +[2023-10-13 00:50:28,608][46384] Saving new best policy, reward=51.080! +[2023-10-13 00:50:29,462][46662] Updated weights for policy 0, policy_version 15010 (0.0009) +[2023-10-13 00:50:29,539][46663] Updated weights for policy 1, policy_version 15011 (0.0008) +[2023-10-13 00:50:29,833][46662] Updated weights for policy 0, policy_version 15020 (0.0009) +[2023-10-13 00:50:29,910][46663] Updated weights for policy 1, policy_version 15021 (0.0010) +[2023-10-13 00:50:30,210][46662] Updated weights for policy 0, policy_version 15030 (0.0009) +[2023-10-13 00:50:30,283][46663] Updated weights for policy 1, policy_version 15031 (0.0008) +[2023-10-13 00:50:30,583][46662] Updated weights for policy 0, policy_version 15040 (0.0007) +[2023-10-13 00:50:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30801920. Throughput: 0: 1684.3, 1: 1674.2. Samples: 7710190. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-13 00:50:33,608][45375] Avg episode reward: [(0, '41.290'), (1, '50.550')] +[2023-10-13 00:50:34,548][46663] Updated weights for policy 1, policy_version 15041 (0.0009) +[2023-10-13 00:50:34,595][46662] Updated weights for policy 0, policy_version 15050 (0.0007) +[2023-10-13 00:50:34,914][46663] Updated weights for policy 1, policy_version 15051 (0.0010) +[2023-10-13 00:50:34,957][46662] Updated weights for policy 0, policy_version 15060 (0.0007) +[2023-10-13 00:50:35,277][46663] Updated weights for policy 1, policy_version 15061 (0.0009) +[2023-10-13 00:50:35,332][46662] Updated weights for policy 0, policy_version 15070 (0.0009) +[2023-10-13 00:50:35,646][46663] Updated weights for policy 1, policy_version 15071 (0.0008) +[2023-10-13 00:50:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 30867456. Throughput: 0: 1674.8, 1: 1676.5. Samples: 7730384. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-13 00:50:38,607][45375] Avg episode reward: [(0, '42.440'), (1, '50.540')] +[2023-10-13 00:50:39,546][46662] Updated weights for policy 0, policy_version 15080 (0.0008) +[2023-10-13 00:50:39,743][46663] Updated weights for policy 1, policy_version 15081 (0.0008) +[2023-10-13 00:50:39,922][46662] Updated weights for policy 0, policy_version 15090 (0.0009) +[2023-10-13 00:50:40,111][46663] Updated weights for policy 1, policy_version 15091 (0.0009) +[2023-10-13 00:50:40,287][46662] Updated weights for policy 0, policy_version 15100 (0.0008) +[2023-10-13 00:50:40,487][46663] Updated weights for policy 1, policy_version 15101 (0.0008) +[2023-10-13 00:50:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30932992. Throughput: 0: 1660.2, 1: 1668.6. Samples: 7739288. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-13 00:50:43,608][45375] Avg episode reward: [(0, '42.080'), (1, '50.100')] +[2023-10-13 00:50:44,490][46662] Updated weights for policy 0, policy_version 15110 (0.0008) +[2023-10-13 00:50:44,558][46663] Updated weights for policy 1, policy_version 15111 (0.0009) +[2023-10-13 00:50:44,855][46662] Updated weights for policy 0, policy_version 15120 (0.0010) +[2023-10-13 00:50:44,923][46663] Updated weights for policy 1, policy_version 15121 (0.0008) +[2023-10-13 00:50:45,231][46662] Updated weights for policy 0, policy_version 15130 (0.0007) +[2023-10-13 00:50:45,294][46663] Updated weights for policy 1, policy_version 15131 (0.0009) +[2023-10-13 00:50:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30998528. Throughput: 0: 1667.2, 1: 1664.8. Samples: 7759526. Policy #0 lag: (min: 24.0, avg: 43.4, max: 56.0) +[2023-10-13 00:50:48,607][45375] Avg episode reward: [(0, '43.040'), (1, '48.410')] +[2023-10-13 00:50:49,537][46663] Updated weights for policy 1, policy_version 15141 (0.0007) +[2023-10-13 00:50:49,554][46662] Updated weights for policy 0, policy_version 15140 (0.0008) +[2023-10-13 00:50:49,911][46663] Updated weights for policy 1, policy_version 15151 (0.0008) +[2023-10-13 00:50:49,935][46662] Updated weights for policy 0, policy_version 15150 (0.0007) +[2023-10-13 00:50:50,279][46663] Updated weights for policy 1, policy_version 15161 (0.0008) +[2023-10-13 00:50:50,303][46662] Updated weights for policy 0, policy_version 15160 (0.0009) +[2023-10-13 00:50:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31064064. Throughput: 0: 1658.2, 1: 1664.3. Samples: 7779920. 
Policy #0 lag: (min: 24.0, avg: 43.4, max: 56.0) +[2023-10-13 00:50:53,607][45375] Avg episode reward: [(0, '42.030'), (1, '46.620')] +[2023-10-13 00:50:54,327][46663] Updated weights for policy 1, policy_version 15171 (0.0009) +[2023-10-13 00:50:54,343][46662] Updated weights for policy 0, policy_version 15170 (0.0007) +[2023-10-13 00:50:54,679][46663] Updated weights for policy 1, policy_version 15181 (0.0009) +[2023-10-13 00:50:54,714][46662] Updated weights for policy 0, policy_version 15180 (0.0008) +[2023-10-13 00:50:55,051][46663] Updated weights for policy 1, policy_version 15191 (0.0009) +[2023-10-13 00:50:55,089][46662] Updated weights for policy 0, policy_version 15190 (0.0008) +[2023-10-13 00:50:55,463][46662] Updated weights for policy 0, policy_version 15200 (0.0010) +[2023-10-13 00:50:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31129600. Throughput: 0: 1660.2, 1: 1660.0. Samples: 7788870. Policy #0 lag: (min: 24.0, avg: 43.4, max: 56.0) +[2023-10-13 00:50:58,607][45375] Avg episode reward: [(0, '41.800'), (1, '46.150')] +[2023-10-13 00:50:59,229][46663] Updated weights for policy 1, policy_version 15201 (0.0007) +[2023-10-13 00:50:59,589][46663] Updated weights for policy 1, policy_version 15211 (0.0009) +[2023-10-13 00:50:59,616][46662] Updated weights for policy 0, policy_version 15210 (0.0009) +[2023-10-13 00:50:59,960][46663] Updated weights for policy 1, policy_version 15221 (0.0009) +[2023-10-13 00:50:59,992][46662] Updated weights for policy 0, policy_version 15220 (0.0009) +[2023-10-13 00:51:00,330][46663] Updated weights for policy 1, policy_version 15231 (0.0008) +[2023-10-13 00:51:00,360][46662] Updated weights for policy 0, policy_version 15230 (0.0009) +[2023-10-13 00:51:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31195136. Throughput: 0: 1659.1, 1: 1660.2. Samples: 7809418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:03,607][45375] Avg episode reward: [(0, '41.470'), (1, '45.310')] +[2023-10-13 00:51:04,307][46662] Updated weights for policy 0, policy_version 15240 (0.0008) +[2023-10-13 00:51:04,576][46663] Updated weights for policy 1, policy_version 15241 (0.0009) +[2023-10-13 00:51:04,672][46662] Updated weights for policy 0, policy_version 15250 (0.0007) +[2023-10-13 00:51:04,955][46663] Updated weights for policy 1, policy_version 15251 (0.0009) +[2023-10-13 00:51:05,054][46662] Updated weights for policy 0, policy_version 15260 (0.0008) +[2023-10-13 00:51:05,322][46663] Updated weights for policy 1, policy_version 15261 (0.0007) +[2023-10-13 00:51:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31260672. Throughput: 0: 1662.7, 1: 1659.9. Samples: 7830084. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:08,607][45375] Avg episode reward: [(0, '41.720'), (1, '44.230')] +[2023-10-13 00:51:08,984][46662] Updated weights for policy 0, policy_version 15270 (0.0008) +[2023-10-13 00:51:09,367][46662] Updated weights for policy 0, policy_version 15280 (0.0008) +[2023-10-13 00:51:09,409][46663] Updated weights for policy 1, policy_version 15271 (0.0008) +[2023-10-13 00:51:09,739][46662] Updated weights for policy 0, policy_version 15290 (0.0007) +[2023-10-13 00:51:09,783][46663] Updated weights for policy 1, policy_version 15281 (0.0008) +[2023-10-13 00:51:10,145][46663] Updated weights for policy 1, policy_version 15291 (0.0008) +[2023-10-13 00:51:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 31326208. Throughput: 0: 1660.7, 1: 1657.5. Samples: 7839000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:13,608][45375] Avg episode reward: [(0, '42.120'), (1, '42.730')] +[2023-10-13 00:51:13,945][46662] Updated weights for policy 0, policy_version 15300 (0.0007) +[2023-10-13 00:51:14,313][46662] Updated weights for policy 0, policy_version 15310 (0.0011) +[2023-10-13 00:51:14,355][46663] Updated weights for policy 1, policy_version 15301 (0.0008) +[2023-10-13 00:51:14,672][46662] Updated weights for policy 0, policy_version 15320 (0.0008) +[2023-10-13 00:51:14,722][46663] Updated weights for policy 1, policy_version 15311 (0.0008) +[2023-10-13 00:51:15,103][46663] Updated weights for policy 1, policy_version 15321 (0.0007) +[2023-10-13 00:51:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31391744. Throughput: 0: 1660.7, 1: 1662.3. Samples: 7859724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:18,608][45375] Avg episode reward: [(0, '43.690'), (1, '42.260')] +[2023-10-13 00:51:18,865][46662] Updated weights for policy 0, policy_version 15330 (0.0008) +[2023-10-13 00:51:19,100][46663] Updated weights for policy 1, policy_version 15331 (0.0009) +[2023-10-13 00:51:19,230][46662] Updated weights for policy 0, policy_version 15340 (0.0008) +[2023-10-13 00:51:19,478][46663] Updated weights for policy 1, policy_version 15341 (0.0007) +[2023-10-13 00:51:19,600][46662] Updated weights for policy 0, policy_version 15350 (0.0008) +[2023-10-13 00:51:19,843][46663] Updated weights for policy 1, policy_version 15351 (0.0007) +[2023-10-13 00:51:19,972][46662] Updated weights for policy 0, policy_version 15360 (0.0007) +[2023-10-13 00:51:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31457280. Throughput: 0: 1664.4, 1: 1663.7. Samples: 7880150. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:23,607][45375] Avg episode reward: [(0, '44.600'), (1, '42.380')] +[2023-10-13 00:51:23,824][46663] Updated weights for policy 1, policy_version 15361 (0.0009) +[2023-10-13 00:51:24,079][46662] Updated weights for policy 0, policy_version 15370 (0.0009) +[2023-10-13 00:51:24,197][46663] Updated weights for policy 1, policy_version 15371 (0.0009) +[2023-10-13 00:51:24,462][46662] Updated weights for policy 0, policy_version 15380 (0.0009) +[2023-10-13 00:51:24,582][46663] Updated weights for policy 1, policy_version 15381 (0.0009) +[2023-10-13 00:51:24,829][46662] Updated weights for policy 0, policy_version 15390 (0.0009) +[2023-10-13 00:51:24,934][46663] Updated weights for policy 1, policy_version 15391 (0.0009) +[2023-10-13 00:51:28,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31522816. Throughput: 0: 1667.5, 1: 1664.2. Samples: 7889216. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:28,607][45375] Avg episode reward: [(0, '43.920'), (1, '43.080')] +[2023-10-13 00:51:28,787][46662] Updated weights for policy 0, policy_version 15400 (0.0009) +[2023-10-13 00:51:29,159][46662] Updated weights for policy 0, policy_version 15410 (0.0009) +[2023-10-13 00:51:29,168][46663] Updated weights for policy 1, policy_version 15401 (0.0007) +[2023-10-13 00:51:29,522][46662] Updated weights for policy 0, policy_version 15420 (0.0008) +[2023-10-13 00:51:29,533][46663] Updated weights for policy 1, policy_version 15411 (0.0009) +[2023-10-13 00:51:29,910][46663] Updated weights for policy 1, policy_version 15421 (0.0011) +[2023-10-13 00:51:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31588352. Throughput: 0: 1672.0, 1: 1667.0. Samples: 7909782. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:33,607][45375] Avg episode reward: [(0, '44.810'), (1, '42.310')] +[2023-10-13 00:51:33,787][46662] Updated weights for policy 0, policy_version 15430 (0.0008) +[2023-10-13 00:51:34,074][46663] Updated weights for policy 1, policy_version 15431 (0.0009) +[2023-10-13 00:51:34,155][46662] Updated weights for policy 0, policy_version 15440 (0.0007) +[2023-10-13 00:51:34,439][46663] Updated weights for policy 1, policy_version 15441 (0.0009) +[2023-10-13 00:51:34,526][46662] Updated weights for policy 0, policy_version 15450 (0.0009) +[2023-10-13 00:51:34,810][46663] Updated weights for policy 1, policy_version 15451 (0.0008) +[2023-10-13 00:51:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31653888. Throughput: 0: 1675.6, 1: 1664.9. Samples: 7930240. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:38,607][45375] Avg episode reward: [(0, '44.320'), (1, '40.740')] +[2023-10-13 00:51:38,668][46662] Updated weights for policy 0, policy_version 15460 (0.0008) +[2023-10-13 00:51:38,921][46663] Updated weights for policy 1, policy_version 15461 (0.0009) +[2023-10-13 00:51:39,052][46662] Updated weights for policy 0, policy_version 15470 (0.0007) +[2023-10-13 00:51:39,298][46663] Updated weights for policy 1, policy_version 15471 (0.0008) +[2023-10-13 00:51:39,429][46662] Updated weights for policy 0, policy_version 15480 (0.0010) +[2023-10-13 00:51:39,668][46663] Updated weights for policy 1, policy_version 15481 (0.0007) +[2023-10-13 00:51:43,375][46662] Updated weights for policy 0, policy_version 15490 (0.0008) +[2023-10-13 00:51:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31719424. Throughput: 0: 1671.9, 1: 1667.6. Samples: 7939150. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:43,607][45375] Avg episode reward: [(0, '42.890'), (1, '41.370')] +[2023-10-13 00:51:43,754][46662] Updated weights for policy 0, policy_version 15500 (0.0009) +[2023-10-13 00:51:43,824][46663] Updated weights for policy 1, policy_version 15491 (0.0009) +[2023-10-13 00:51:44,123][46662] Updated weights for policy 0, policy_version 15510 (0.0010) +[2023-10-13 00:51:44,186][46663] Updated weights for policy 1, policy_version 15501 (0.0007) +[2023-10-13 00:51:44,498][46662] Updated weights for policy 0, policy_version 15520 (0.0008) +[2023-10-13 00:51:44,551][46663] Updated weights for policy 1, policy_version 15511 (0.0009) +[2023-10-13 00:51:48,579][46662] Updated weights for policy 0, policy_version 15530 (0.0007) +[2023-10-13 00:51:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31784960. Throughput: 0: 1675.3, 1: 1668.4. Samples: 7959882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:48,607][45375] Avg episode reward: [(0, '41.980'), (1, '41.950')] +[2023-10-13 00:51:48,815][46663] Updated weights for policy 1, policy_version 15521 (0.0010) +[2023-10-13 00:51:48,949][46662] Updated weights for policy 0, policy_version 15540 (0.0010) +[2023-10-13 00:51:49,234][46663] Updated weights for policy 1, policy_version 15531 (0.0010) +[2023-10-13 00:51:49,321][46662] Updated weights for policy 0, policy_version 15550 (0.0009) +[2023-10-13 00:51:49,601][46663] Updated weights for policy 1, policy_version 15541 (0.0009) +[2023-10-13 00:51:49,973][46663] Updated weights for policy 1, policy_version 15551 (0.0008) +[2023-10-13 00:51:53,325][46662] Updated weights for policy 0, policy_version 15560 (0.0009) +[2023-10-13 00:51:53,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 31850496. Throughput: 0: 1674.7, 1: 1663.4. Samples: 7980298. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:53,608][45375] Avg episode reward: [(0, '42.480'), (1, '42.770')] +[2023-10-13 00:51:53,702][46662] Updated weights for policy 0, policy_version 15570 (0.0010) +[2023-10-13 00:51:53,983][46663] Updated weights for policy 1, policy_version 15561 (0.0007) +[2023-10-13 00:51:54,068][46662] Updated weights for policy 0, policy_version 15580 (0.0007) +[2023-10-13 00:51:54,211][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000015584_15958016.pth... +[2023-10-13 00:51:54,241][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000014016_14352384.pth +[2023-10-13 00:51:54,245][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000015584_15958016.pth +[2023-10-13 00:51:54,345][46663] Updated weights for policy 1, policy_version 15571 (0.0008) +[2023-10-13 00:51:54,720][46663] Updated weights for policy 1, policy_version 15581 (0.0009) +[2023-10-13 00:51:54,829][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000015584_15958016.pth... +[2023-10-13 00:51:54,867][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000014016_14352384.pth +[2023-10-13 00:51:54,873][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000015584_15958016.pth +[2023-10-13 00:51:58,115][46662] Updated weights for policy 0, policy_version 15590 (0.0007) +[2023-10-13 00:51:58,494][46662] Updated weights for policy 0, policy_version 15600 (0.0008) +[2023-10-13 00:51:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31916032. Throughput: 0: 1677.3, 1: 1668.0. Samples: 7989534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:51:58,607][45375] Avg episode reward: [(0, '42.990'), (1, '42.560')] +[2023-10-13 00:51:58,794][46663] Updated weights for policy 1, policy_version 15591 (0.0007) +[2023-10-13 00:51:58,860][46662] Updated weights for policy 0, policy_version 15610 (0.0009) +[2023-10-13 00:51:59,170][46663] Updated weights for policy 1, policy_version 15601 (0.0009) +[2023-10-13 00:51:59,534][46663] Updated weights for policy 1, policy_version 15611 (0.0009) +[2023-10-13 00:52:03,008][46662] Updated weights for policy 0, policy_version 15620 (0.0010) +[2023-10-13 00:52:03,382][46662] Updated weights for policy 0, policy_version 15630 (0.0009) +[2023-10-13 00:52:03,522][46663] Updated weights for policy 1, policy_version 15621 (0.0009) +[2023-10-13 00:52:03,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31981568. Throughput: 0: 1675.8, 1: 1663.6. Samples: 8009996. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:52:03,607][45375] Avg episode reward: [(0, '43.600'), (1, '42.520')] +[2023-10-13 00:52:03,750][46662] Updated weights for policy 0, policy_version 15640 (0.0008) +[2023-10-13 00:52:03,892][46663] Updated weights for policy 1, policy_version 15631 (0.0007) +[2023-10-13 00:52:04,257][46663] Updated weights for policy 1, policy_version 15641 (0.0008) +[2023-10-13 00:52:07,895][46662] Updated weights for policy 0, policy_version 15650 (0.0009) +[2023-10-13 00:52:08,262][46662] Updated weights for policy 0, policy_version 15660 (0.0008) +[2023-10-13 00:52:08,308][46663] Updated weights for policy 1, policy_version 15651 (0.0008) +[2023-10-13 00:52:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32047104. Throughput: 0: 1678.8, 1: 1662.4. Samples: 8030506. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:52:08,607][45375] Avg episode reward: [(0, '45.090'), (1, '44.670')] +[2023-10-13 00:52:08,627][46662] Updated weights for policy 0, policy_version 15670 (0.0008) +[2023-10-13 00:52:08,677][46663] Updated weights for policy 1, policy_version 15661 (0.0009) +[2023-10-13 00:52:08,996][46662] Updated weights for policy 0, policy_version 15680 (0.0009) +[2023-10-13 00:52:09,037][46663] Updated weights for policy 1, policy_version 15671 (0.0007) +[2023-10-13 00:52:12,911][46662] Updated weights for policy 0, policy_version 15690 (0.0007) +[2023-10-13 00:52:13,064][46663] Updated weights for policy 1, policy_version 15681 (0.0008) +[2023-10-13 00:52:13,273][46662] Updated weights for policy 0, policy_version 15700 (0.0007) +[2023-10-13 00:52:13,428][46663] Updated weights for policy 1, policy_version 15691 (0.0009) +[2023-10-13 00:52:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32112640. Throughput: 0: 1680.0, 1: 1669.8. Samples: 8039958. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 00:52:13,607][45375] Avg episode reward: [(0, '44.220'), (1, '45.790')] +[2023-10-13 00:52:13,638][46662] Updated weights for policy 0, policy_version 15710 (0.0008) +[2023-10-13 00:52:13,794][46663] Updated weights for policy 1, policy_version 15701 (0.0010) +[2023-10-13 00:52:14,162][46663] Updated weights for policy 1, policy_version 15711 (0.0008) +[2023-10-13 00:52:17,842][46662] Updated weights for policy 0, policy_version 15720 (0.0008) +[2023-10-13 00:52:18,212][46662] Updated weights for policy 0, policy_version 15730 (0.0009) +[2023-10-13 00:52:18,286][46663] Updated weights for policy 1, policy_version 15721 (0.0009) +[2023-10-13 00:52:18,584][46662] Updated weights for policy 0, policy_version 15740 (0.0007) +[2023-10-13 00:52:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 32178176. 
Throughput: 0: 1686.0, 1: 1671.3. Samples: 8060860. Policy #0 lag: (min: 1.0, avg: 18.7, max: 33.0) +[2023-10-13 00:52:18,607][45375] Avg episode reward: [(0, '43.890'), (1, '45.450')] +[2023-10-13 00:52:18,661][46663] Updated weights for policy 1, policy_version 15731 (0.0007) +[2023-10-13 00:52:19,020][46663] Updated weights for policy 1, policy_version 15741 (0.0008) +[2023-10-13 00:52:22,542][46662] Updated weights for policy 0, policy_version 15750 (0.0011) +[2023-10-13 00:52:22,915][46662] Updated weights for policy 0, policy_version 15760 (0.0008) +[2023-10-13 00:52:23,246][46663] Updated weights for policy 1, policy_version 15751 (0.0007) +[2023-10-13 00:52:23,286][46662] Updated weights for policy 0, policy_version 15770 (0.0008) +[2023-10-13 00:52:23,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 32276480. Throughput: 0: 1675.5, 1: 1661.0. Samples: 8080382. Policy #0 lag: (min: 1.0, avg: 18.7, max: 33.0) +[2023-10-13 00:52:23,607][45375] Avg episode reward: [(0, '43.640'), (1, '45.610')] +[2023-10-13 00:52:23,612][46663] Updated weights for policy 1, policy_version 15761 (0.0007) +[2023-10-13 00:52:23,983][46663] Updated weights for policy 1, policy_version 15771 (0.0009) +[2023-10-13 00:52:27,376][46662] Updated weights for policy 0, policy_version 15780 (0.0010) +[2023-10-13 00:52:27,764][46662] Updated weights for policy 0, policy_version 15790 (0.0009) +[2023-10-13 00:52:27,925][46663] Updated weights for policy 1, policy_version 15781 (0.0010) +[2023-10-13 00:52:28,135][46662] Updated weights for policy 0, policy_version 15800 (0.0008) +[2023-10-13 00:52:28,281][46663] Updated weights for policy 1, policy_version 15791 (0.0008) +[2023-10-13 00:52:28,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 32342016. Throughput: 0: 1690.4, 1: 1672.4. Samples: 8090478. 
Policy #0 lag: (min: 2.0, avg: 3.7, max: 29.0) +[2023-10-13 00:52:28,607][45375] Avg episode reward: [(0, '43.520'), (1, '45.160')] +[2023-10-13 00:52:28,653][46663] Updated weights for policy 1, policy_version 15801 (0.0008) +[2023-10-13 00:52:32,124][46662] Updated weights for policy 0, policy_version 15810 (0.0007) +[2023-10-13 00:52:32,485][46662] Updated weights for policy 0, policy_version 15820 (0.0009) +[2023-10-13 00:52:32,862][46662] Updated weights for policy 0, policy_version 15830 (0.0008) +[2023-10-13 00:52:33,002][46663] Updated weights for policy 1, policy_version 15811 (0.0008) +[2023-10-13 00:52:33,229][46662] Updated weights for policy 0, policy_version 15840 (0.0007) +[2023-10-13 00:52:33,369][46663] Updated weights for policy 1, policy_version 15821 (0.0008) +[2023-10-13 00:52:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 32407552. Throughput: 0: 1687.5, 1: 1672.2. Samples: 8111068. Policy #0 lag: (min: 2.0, avg: 3.7, max: 29.0) +[2023-10-13 00:52:33,608][45375] Avg episode reward: [(0, '43.670'), (1, '45.210')] +[2023-10-13 00:52:33,740][46663] Updated weights for policy 1, policy_version 15831 (0.0009) +[2023-10-13 00:52:37,297][46662] Updated weights for policy 0, policy_version 15850 (0.0008) +[2023-10-13 00:52:37,662][46662] Updated weights for policy 0, policy_version 15860 (0.0009) +[2023-10-13 00:52:37,688][46663] Updated weights for policy 1, policy_version 15841 (0.0008) +[2023-10-13 00:52:38,037][46662] Updated weights for policy 0, policy_version 15870 (0.0008) +[2023-10-13 00:52:38,111][46663] Updated weights for policy 1, policy_version 15851 (0.0008) +[2023-10-13 00:52:38,478][46663] Updated weights for policy 1, policy_version 15861 (0.0009) +[2023-10-13 00:52:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 32473088. Throughput: 0: 1669.9, 1: 1660.1. Samples: 8130148. 
Policy #0 lag: (min: 2.0, avg: 3.7, max: 29.0) +[2023-10-13 00:52:38,607][45375] Avg episode reward: [(0, '43.480'), (1, '45.450')] +[2023-10-13 00:52:38,847][46663] Updated weights for policy 1, policy_version 15871 (0.0010) +[2023-10-13 00:52:42,037][46662] Updated weights for policy 0, policy_version 15880 (0.0009) +[2023-10-13 00:52:42,413][46662] Updated weights for policy 0, policy_version 15890 (0.0010) +[2023-10-13 00:52:42,786][46662] Updated weights for policy 0, policy_version 15900 (0.0010) +[2023-10-13 00:52:42,817][46663] Updated weights for policy 1, policy_version 15881 (0.0009) +[2023-10-13 00:52:43,175][46663] Updated weights for policy 1, policy_version 15891 (0.0008) +[2023-10-13 00:52:43,542][46663] Updated weights for policy 1, policy_version 15901 (0.0007) +[2023-10-13 00:52:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 32538624. Throughput: 0: 1693.8, 1: 1674.8. Samples: 8141120. Policy #0 lag: (min: 19.0, avg: 31.6, max: 32.0) +[2023-10-13 00:52:43,607][45375] Avg episode reward: [(0, '43.680'), (1, '46.030')] +[2023-10-13 00:52:46,741][46662] Updated weights for policy 0, policy_version 15910 (0.0010) +[2023-10-13 00:52:47,098][46662] Updated weights for policy 0, policy_version 15920 (0.0009) +[2023-10-13 00:52:47,467][46662] Updated weights for policy 0, policy_version 15930 (0.0009) +[2023-10-13 00:52:47,585][46663] Updated weights for policy 1, policy_version 15911 (0.0008) +[2023-10-13 00:52:47,946][46663] Updated weights for policy 1, policy_version 15921 (0.0009) +[2023-10-13 00:52:48,313][46663] Updated weights for policy 1, policy_version 15931 (0.0011) +[2023-10-13 00:52:48,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 32636928. Throughput: 0: 1693.6, 1: 1679.3. Samples: 8161774. 
Policy #0 lag: (min: 19.0, avg: 31.6, max: 32.0) +[2023-10-13 00:52:48,607][45375] Avg episode reward: [(0, '43.680'), (1, '45.070')] +[2023-10-13 00:52:51,658][46662] Updated weights for policy 0, policy_version 15940 (0.0009) +[2023-10-13 00:52:52,026][46662] Updated weights for policy 0, policy_version 15950 (0.0011) +[2023-10-13 00:52:52,352][46663] Updated weights for policy 1, policy_version 15941 (0.0010) +[2023-10-13 00:52:52,397][46662] Updated weights for policy 0, policy_version 15960 (0.0009) +[2023-10-13 00:52:52,706][46663] Updated weights for policy 1, policy_version 15951 (0.0009) +[2023-10-13 00:52:53,077][46663] Updated weights for policy 1, policy_version 15961 (0.0010) +[2023-10-13 00:52:53,607][45375] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 32702464. Throughput: 0: 1671.6, 1: 1654.5. Samples: 8180182. Policy #0 lag: (min: 19.0, avg: 31.6, max: 32.0) +[2023-10-13 00:52:53,608][45375] Avg episode reward: [(0, '44.810'), (1, '45.160')] +[2023-10-13 00:52:56,357][46662] Updated weights for policy 0, policy_version 15970 (0.0008) +[2023-10-13 00:52:56,724][46662] Updated weights for policy 0, policy_version 15980 (0.0007) +[2023-10-13 00:52:57,099][46662] Updated weights for policy 0, policy_version 15990 (0.0007) +[2023-10-13 00:52:57,214][46663] Updated weights for policy 1, policy_version 15971 (0.0009) +[2023-10-13 00:52:57,459][46662] Updated weights for policy 0, policy_version 16000 (0.0008) +[2023-10-13 00:52:57,571][46663] Updated weights for policy 1, policy_version 15981 (0.0007) +[2023-10-13 00:52:57,947][46663] Updated weights for policy 1, policy_version 15991 (0.0008) +[2023-10-13 00:52:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 32768000. Throughput: 0: 1700.1, 1: 1674.1. Samples: 8191800. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 00:52:58,607][45375] Avg episode reward: [(0, '44.210'), (1, '45.260')] +[2023-10-13 00:53:01,654][46662] Updated weights for policy 0, policy_version 16010 (0.0008) +[2023-10-13 00:53:02,023][46662] Updated weights for policy 0, policy_version 16020 (0.0009) +[2023-10-13 00:53:02,106][46663] Updated weights for policy 1, policy_version 16001 (0.0007) +[2023-10-13 00:53:02,398][46662] Updated weights for policy 0, policy_version 16030 (0.0008) +[2023-10-13 00:53:02,471][46663] Updated weights for policy 1, policy_version 16011 (0.0008) +[2023-10-13 00:53:02,830][46663] Updated weights for policy 1, policy_version 16021 (0.0010) +[2023-10-13 00:53:03,197][46663] Updated weights for policy 1, policy_version 16031 (0.0007) +[2023-10-13 00:53:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 32833536. Throughput: 0: 1679.9, 1: 1669.1. Samples: 8211566. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 00:53:03,607][45375] Avg episode reward: [(0, '43.110'), (1, '45.000')] +[2023-10-13 00:53:06,507][46662] Updated weights for policy 0, policy_version 16040 (0.0008) +[2023-10-13 00:53:06,882][46662] Updated weights for policy 0, policy_version 16050 (0.0007) +[2023-10-13 00:53:07,202][46663] Updated weights for policy 1, policy_version 16041 (0.0008) +[2023-10-13 00:53:07,257][46662] Updated weights for policy 0, policy_version 16060 (0.0007) +[2023-10-13 00:53:07,568][46663] Updated weights for policy 1, policy_version 16051 (0.0007) +[2023-10-13 00:53:07,940][46663] Updated weights for policy 1, policy_version 16061 (0.0010) +[2023-10-13 00:53:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13329.3). Total num frames: 32899072. Throughput: 0: 1671.5, 1: 1665.9. Samples: 8230564. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 00:53:08,608][45375] Avg episode reward: [(0, '43.340'), (1, '44.390')] +[2023-10-13 00:53:11,359][46662] Updated weights for policy 0, policy_version 16070 (0.0010) +[2023-10-13 00:53:11,737][46662] Updated weights for policy 0, policy_version 16080 (0.0010) +[2023-10-13 00:53:12,059][46663] Updated weights for policy 1, policy_version 16071 (0.0010) +[2023-10-13 00:53:12,098][46662] Updated weights for policy 0, policy_version 16090 (0.0009) +[2023-10-13 00:53:12,427][46663] Updated weights for policy 1, policy_version 16081 (0.0010) +[2023-10-13 00:53:12,796][46663] Updated weights for policy 1, policy_version 16091 (0.0010) +[2023-10-13 00:53:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 32964608. Throughput: 0: 1687.7, 1: 1683.8. Samples: 8242198. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-13 00:53:13,608][45375] Avg episode reward: [(0, '43.890'), (1, '43.780')] +[2023-10-13 00:53:16,255][46662] Updated weights for policy 0, policy_version 16100 (0.0007) +[2023-10-13 00:53:16,638][46662] Updated weights for policy 0, policy_version 16110 (0.0007) +[2023-10-13 00:53:16,826][46663] Updated weights for policy 1, policy_version 16101 (0.0009) +[2023-10-13 00:53:17,004][46662] Updated weights for policy 0, policy_version 16120 (0.0008) +[2023-10-13 00:53:17,185][46663] Updated weights for policy 1, policy_version 16111 (0.0008) +[2023-10-13 00:53:17,557][46663] Updated weights for policy 1, policy_version 16121 (0.0007) +[2023-10-13 00:53:18,607][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 33030144. Throughput: 0: 1669.1, 1: 1671.5. Samples: 8261396. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-13 00:53:18,607][45375] Avg episode reward: [(0, '43.350'), (1, '41.930')] +[2023-10-13 00:53:21,142][46662] Updated weights for policy 0, policy_version 16130 (0.0009) +[2023-10-13 00:53:21,511][46662] Updated weights for policy 0, policy_version 16140 (0.0007) +[2023-10-13 00:53:21,598][46663] Updated weights for policy 1, policy_version 16131 (0.0009) +[2023-10-13 00:53:21,874][46662] Updated weights for policy 0, policy_version 16150 (0.0007) +[2023-10-13 00:53:21,963][46663] Updated weights for policy 1, policy_version 16141 (0.0008) +[2023-10-13 00:53:22,231][46662] Updated weights for policy 0, policy_version 16160 (0.0008) +[2023-10-13 00:53:22,340][46663] Updated weights for policy 1, policy_version 16151 (0.0008) +[2023-10-13 00:53:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 33095680. Throughput: 0: 1671.9, 1: 1678.2. Samples: 8280906. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-13 00:53:23,608][45375] Avg episode reward: [(0, '44.700'), (1, '42.520')] +[2023-10-13 00:53:26,255][46662] Updated weights for policy 0, policy_version 16170 (0.0009) +[2023-10-13 00:53:26,441][46663] Updated weights for policy 1, policy_version 16161 (0.0010) +[2023-10-13 00:53:26,620][46662] Updated weights for policy 0, policy_version 16180 (0.0008) +[2023-10-13 00:53:26,852][46663] Updated weights for policy 1, policy_version 16171 (0.0008) +[2023-10-13 00:53:26,986][46662] Updated weights for policy 0, policy_version 16190 (0.0009) +[2023-10-13 00:53:27,216][46663] Updated weights for policy 1, policy_version 16181 (0.0009) +[2023-10-13 00:53:27,577][46663] Updated weights for policy 1, policy_version 16191 (0.0009) +[2023-10-13 00:53:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 33161216. Throughput: 0: 1674.7, 1: 1688.5. Samples: 8292466. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 00:53:28,607][45375] Avg episode reward: [(0, '44.630'), (1, '42.600')] +[2023-10-13 00:53:31,219][46662] Updated weights for policy 0, policy_version 16200 (0.0008) +[2023-10-13 00:53:31,592][46662] Updated weights for policy 0, policy_version 16210 (0.0010) +[2023-10-13 00:53:31,633][46663] Updated weights for policy 1, policy_version 16201 (0.0008) +[2023-10-13 00:53:31,966][46662] Updated weights for policy 0, policy_version 16220 (0.0008) +[2023-10-13 00:53:31,998][46663] Updated weights for policy 1, policy_version 16211 (0.0010) +[2023-10-13 00:53:32,369][46663] Updated weights for policy 1, policy_version 16221 (0.0007) +[2023-10-13 00:53:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 33226752. Throughput: 0: 1650.4, 1: 1657.4. Samples: 8310624. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 00:53:33,608][45375] Avg episode reward: [(0, '44.040'), (1, '42.180')] +[2023-10-13 00:53:35,915][46662] Updated weights for policy 0, policy_version 16230 (0.0007) +[2023-10-13 00:53:36,286][46662] Updated weights for policy 0, policy_version 16240 (0.0009) +[2023-10-13 00:53:36,475][46663] Updated weights for policy 1, policy_version 16231 (0.0008) +[2023-10-13 00:53:36,663][46662] Updated weights for policy 0, policy_version 16250 (0.0008) +[2023-10-13 00:53:36,837][46663] Updated weights for policy 1, policy_version 16241 (0.0010) +[2023-10-13 00:53:37,205][46663] Updated weights for policy 1, policy_version 16251 (0.0008) +[2023-10-13 00:53:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 33292288. Throughput: 0: 1666.6, 1: 1679.8. Samples: 8330772. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 00:53:38,607][45375] Avg episode reward: [(0, '43.920'), (1, '40.990')] +[2023-10-13 00:53:40,710][46662] Updated weights for policy 0, policy_version 16260 (0.0008) +[2023-10-13 00:53:41,074][46662] Updated weights for policy 0, policy_version 16270 (0.0008) +[2023-10-13 00:53:41,382][46663] Updated weights for policy 1, policy_version 16261 (0.0008) +[2023-10-13 00:53:41,438][46662] Updated weights for policy 0, policy_version 16280 (0.0007) +[2023-10-13 00:53:41,753][46663] Updated weights for policy 1, policy_version 16271 (0.0008) +[2023-10-13 00:53:42,112][46663] Updated weights for policy 1, policy_version 16281 (0.0009) +[2023-10-13 00:53:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 33357824. Throughput: 0: 1659.9, 1: 1676.7. Samples: 8341946. Policy #0 lag: (min: 27.0, avg: 35.6, max: 59.0) +[2023-10-13 00:53:43,607][45375] Avg episode reward: [(0, '44.650'), (1, '42.270')] +[2023-10-13 00:53:45,498][46662] Updated weights for policy 0, policy_version 16290 (0.0008) +[2023-10-13 00:53:45,858][46662] Updated weights for policy 0, policy_version 16300 (0.0007) +[2023-10-13 00:53:46,144][46663] Updated weights for policy 1, policy_version 16291 (0.0009) +[2023-10-13 00:53:46,224][46662] Updated weights for policy 0, policy_version 16310 (0.0007) +[2023-10-13 00:53:46,502][46663] Updated weights for policy 1, policy_version 16301 (0.0008) +[2023-10-13 00:53:46,601][46662] Updated weights for policy 0, policy_version 16320 (0.0007) +[2023-10-13 00:53:46,870][46663] Updated weights for policy 1, policy_version 16311 (0.0008) +[2023-10-13 00:53:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33423360. Throughput: 0: 1649.6, 1: 1661.2. Samples: 8360550. 
Policy #0 lag: (min: 27.0, avg: 35.6, max: 59.0) +[2023-10-13 00:53:48,607][45375] Avg episode reward: [(0, '45.990'), (1, '43.620')] +[2023-10-13 00:53:50,575][46662] Updated weights for policy 0, policy_version 16330 (0.0008) +[2023-10-13 00:53:50,937][46662] Updated weights for policy 0, policy_version 16340 (0.0008) +[2023-10-13 00:53:51,030][46663] Updated weights for policy 1, policy_version 16321 (0.0010) +[2023-10-13 00:53:51,311][46662] Updated weights for policy 0, policy_version 16350 (0.0008) +[2023-10-13 00:53:51,384][46663] Updated weights for policy 1, policy_version 16331 (0.0009) +[2023-10-13 00:53:51,749][46663] Updated weights for policy 1, policy_version 16341 (0.0009) +[2023-10-13 00:53:52,123][46663] Updated weights for policy 1, policy_version 16351 (0.0009) +[2023-10-13 00:53:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 33488896. Throughput: 0: 1677.3, 1: 1671.6. Samples: 8381266. Policy #0 lag: (min: 27.0, avg: 35.6, max: 59.0) +[2023-10-13 00:53:53,607][45375] Avg episode reward: [(0, '44.120'), (1, '42.690')] +[2023-10-13 00:53:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000016352_16744448.pth... +[2023-10-13 00:53:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000016352_16744448.pth... 
+[2023-10-13 00:53:53,647][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000014784_15138816.pth +[2023-10-13 00:53:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000014784_15138816.pth +[2023-10-13 00:53:55,193][46662] Updated weights for policy 0, policy_version 16360 (0.0010) +[2023-10-13 00:53:55,576][46662] Updated weights for policy 0, policy_version 16370 (0.0010) +[2023-10-13 00:53:55,946][46662] Updated weights for policy 0, policy_version 16380 (0.0011) +[2023-10-13 00:53:56,493][46663] Updated weights for policy 1, policy_version 16361 (0.0008) +[2023-10-13 00:53:56,859][46663] Updated weights for policy 1, policy_version 16371 (0.0008) +[2023-10-13 00:53:57,224][46663] Updated weights for policy 1, policy_version 16381 (0.0008) +[2023-10-13 00:53:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33554432. Throughput: 0: 1659.6, 1: 1664.0. Samples: 8391760. Policy #0 lag: (min: 8.0, avg: 35.4, max: 40.0) +[2023-10-13 00:53:58,607][45375] Avg episode reward: [(0, '43.010'), (1, '43.100')] +[2023-10-13 00:53:59,984][46662] Updated weights for policy 0, policy_version 16390 (0.0008) +[2023-10-13 00:54:00,350][46662] Updated weights for policy 0, policy_version 16400 (0.0007) +[2023-10-13 00:54:00,722][46662] Updated weights for policy 0, policy_version 16410 (0.0007) +[2023-10-13 00:54:01,165][46663] Updated weights for policy 1, policy_version 16391 (0.0010) +[2023-10-13 00:54:01,542][46663] Updated weights for policy 1, policy_version 16401 (0.0010) +[2023-10-13 00:54:01,913][46663] Updated weights for policy 1, policy_version 16411 (0.0008) +[2023-10-13 00:54:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33619968. Throughput: 0: 1675.0, 1: 1656.0. Samples: 8411290. 
Policy #0 lag: (min: 8.0, avg: 35.4, max: 40.0) +[2023-10-13 00:54:03,608][45375] Avg episode reward: [(0, '43.800'), (1, '42.240')] +[2023-10-13 00:54:04,710][46662] Updated weights for policy 0, policy_version 16420 (0.0008) +[2023-10-13 00:54:05,107][46662] Updated weights for policy 0, policy_version 16430 (0.0008) +[2023-10-13 00:54:05,483][46662] Updated weights for policy 0, policy_version 16440 (0.0008) +[2023-10-13 00:54:05,797][46663] Updated weights for policy 1, policy_version 16421 (0.0008) +[2023-10-13 00:54:06,172][46663] Updated weights for policy 1, policy_version 16431 (0.0008) +[2023-10-13 00:54:06,544][46663] Updated weights for policy 1, policy_version 16441 (0.0009) +[2023-10-13 00:54:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33685504. Throughput: 0: 1689.0, 1: 1667.7. Samples: 8431954. Policy #0 lag: (min: 8.0, avg: 35.4, max: 40.0) +[2023-10-13 00:54:08,607][45375] Avg episode reward: [(0, '43.780'), (1, '43.350')] +[2023-10-13 00:54:09,574][46662] Updated weights for policy 0, policy_version 16450 (0.0008) +[2023-10-13 00:54:09,952][46662] Updated weights for policy 0, policy_version 16460 (0.0007) +[2023-10-13 00:54:10,326][46662] Updated weights for policy 0, policy_version 16470 (0.0008) +[2023-10-13 00:54:10,543][46663] Updated weights for policy 1, policy_version 16451 (0.0009) +[2023-10-13 00:54:10,686][46662] Updated weights for policy 0, policy_version 16480 (0.0007) +[2023-10-13 00:54:10,956][46663] Updated weights for policy 1, policy_version 16461 (0.0008) +[2023-10-13 00:54:11,319][46663] Updated weights for policy 1, policy_version 16471 (0.0008) +[2023-10-13 00:54:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 33751040. Throughput: 0: 1660.3, 1: 1646.6. Samples: 8441274. 
Policy #0 lag: (min: 6.0, avg: 9.0, max: 38.0) +[2023-10-13 00:54:13,608][45375] Avg episode reward: [(0, '45.190'), (1, '42.820')] +[2023-10-13 00:54:14,780][46662] Updated weights for policy 0, policy_version 16490 (0.0009) +[2023-10-13 00:54:15,151][46662] Updated weights for policy 0, policy_version 16500 (0.0008) +[2023-10-13 00:54:15,509][46663] Updated weights for policy 1, policy_version 16481 (0.0007) +[2023-10-13 00:54:15,530][46662] Updated weights for policy 0, policy_version 16510 (0.0009) +[2023-10-13 00:54:15,869][46663] Updated weights for policy 1, policy_version 16491 (0.0007) +[2023-10-13 00:54:16,236][46663] Updated weights for policy 1, policy_version 16501 (0.0007) +[2023-10-13 00:54:16,606][46663] Updated weights for policy 1, policy_version 16511 (0.0008) +[2023-10-13 00:54:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33816576. Throughput: 0: 1690.0, 1: 1667.3. Samples: 8461704. Policy #0 lag: (min: 6.0, avg: 9.0, max: 38.0) +[2023-10-13 00:54:18,607][45375] Avg episode reward: [(0, '45.640'), (1, '45.280')] +[2023-10-13 00:54:19,496][46662] Updated weights for policy 0, policy_version 16520 (0.0009) +[2023-10-13 00:54:19,872][46662] Updated weights for policy 0, policy_version 16530 (0.0009) +[2023-10-13 00:54:20,238][46662] Updated weights for policy 0, policy_version 16540 (0.0009) +[2023-10-13 00:54:20,673][46663] Updated weights for policy 1, policy_version 16521 (0.0009) +[2023-10-13 00:54:21,048][46663] Updated weights for policy 1, policy_version 16531 (0.0007) +[2023-10-13 00:54:21,415][46663] Updated weights for policy 1, policy_version 16541 (0.0008) +[2023-10-13 00:54:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 33882112. Throughput: 0: 1696.1, 1: 1671.6. Samples: 8482322. 
Policy #0 lag: (min: 6.0, avg: 9.0, max: 38.0) +[2023-10-13 00:54:23,608][45375] Avg episode reward: [(0, '44.530'), (1, '43.440')] +[2023-10-13 00:54:24,331][46662] Updated weights for policy 0, policy_version 16550 (0.0008) +[2023-10-13 00:54:24,707][46662] Updated weights for policy 0, policy_version 16560 (0.0009) +[2023-10-13 00:54:25,083][46662] Updated weights for policy 0, policy_version 16570 (0.0007) +[2023-10-13 00:54:25,516][46663] Updated weights for policy 1, policy_version 16551 (0.0008) +[2023-10-13 00:54:25,886][46663] Updated weights for policy 1, policy_version 16561 (0.0008) +[2023-10-13 00:54:26,257][46663] Updated weights for policy 1, policy_version 16571 (0.0010) +[2023-10-13 00:54:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33947648. Throughput: 0: 1672.0, 1: 1655.6. Samples: 8491686. Policy #0 lag: (min: 9.0, avg: 26.0, max: 41.0) +[2023-10-13 00:54:28,607][45375] Avg episode reward: [(0, '44.670'), (1, '45.020')] +[2023-10-13 00:54:29,370][46662] Updated weights for policy 0, policy_version 16580 (0.0008) +[2023-10-13 00:54:29,730][46662] Updated weights for policy 0, policy_version 16590 (0.0010) +[2023-10-13 00:54:30,100][46662] Updated weights for policy 0, policy_version 16600 (0.0010) +[2023-10-13 00:54:30,273][46663] Updated weights for policy 1, policy_version 16581 (0.0009) +[2023-10-13 00:54:30,652][46663] Updated weights for policy 1, policy_version 16591 (0.0008) +[2023-10-13 00:54:31,016][46663] Updated weights for policy 1, policy_version 16601 (0.0010) +[2023-10-13 00:54:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34013184. Throughput: 0: 1690.6, 1: 1676.9. Samples: 8512090. 
Policy #0 lag: (min: 9.0, avg: 26.0, max: 41.0) +[2023-10-13 00:54:33,607][45375] Avg episode reward: [(0, '46.410'), (1, '45.630')] +[2023-10-13 00:54:34,384][46662] Updated weights for policy 0, policy_version 16610 (0.0009) +[2023-10-13 00:54:34,763][46662] Updated weights for policy 0, policy_version 16620 (0.0007) +[2023-10-13 00:54:35,134][46662] Updated weights for policy 0, policy_version 16630 (0.0007) +[2023-10-13 00:54:35,204][46663] Updated weights for policy 1, policy_version 16611 (0.0010) +[2023-10-13 00:54:35,498][46662] Updated weights for policy 0, policy_version 16640 (0.0008) +[2023-10-13 00:54:35,570][46663] Updated weights for policy 1, policy_version 16621 (0.0007) +[2023-10-13 00:54:35,941][46663] Updated weights for policy 1, policy_version 16631 (0.0009) +[2023-10-13 00:54:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 34078720. Throughput: 0: 1681.9, 1: 1681.5. Samples: 8532618. Policy #0 lag: (min: 9.0, avg: 26.0, max: 41.0) +[2023-10-13 00:54:38,608][45375] Avg episode reward: [(0, '46.280'), (1, '46.940')] +[2023-10-13 00:54:39,538][46662] Updated weights for policy 0, policy_version 16650 (0.0009) +[2023-10-13 00:54:39,912][46662] Updated weights for policy 0, policy_version 16660 (0.0008) +[2023-10-13 00:54:39,981][46663] Updated weights for policy 1, policy_version 16641 (0.0008) +[2023-10-13 00:54:40,278][46662] Updated weights for policy 0, policy_version 16670 (0.0008) +[2023-10-13 00:54:40,350][46663] Updated weights for policy 1, policy_version 16651 (0.0009) +[2023-10-13 00:54:40,712][46663] Updated weights for policy 1, policy_version 16661 (0.0010) +[2023-10-13 00:54:41,084][46663] Updated weights for policy 1, policy_version 16671 (0.0008) +[2023-10-13 00:54:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34144256. Throughput: 0: 1670.1, 1: 1660.6. Samples: 8541644. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 00:54:43,607][45375] Avg episode reward: [(0, '45.800'), (1, '47.050')] +[2023-10-13 00:54:44,404][46662] Updated weights for policy 0, policy_version 16680 (0.0009) +[2023-10-13 00:54:44,768][46662] Updated weights for policy 0, policy_version 16690 (0.0008) +[2023-10-13 00:54:45,142][46662] Updated weights for policy 0, policy_version 16700 (0.0008) +[2023-10-13 00:54:45,199][46663] Updated weights for policy 1, policy_version 16681 (0.0007) +[2023-10-13 00:54:45,556][46663] Updated weights for policy 1, policy_version 16691 (0.0010) +[2023-10-13 00:54:45,929][46663] Updated weights for policy 1, policy_version 16701 (0.0008) +[2023-10-13 00:54:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34209792. Throughput: 0: 1676.7, 1: 1685.8. Samples: 8562602. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 00:54:48,608][45375] Avg episode reward: [(0, '45.640'), (1, '46.020')] +[2023-10-13 00:54:49,229][46662] Updated weights for policy 0, policy_version 16710 (0.0008) +[2023-10-13 00:54:49,610][46662] Updated weights for policy 0, policy_version 16720 (0.0007) +[2023-10-13 00:54:49,985][46662] Updated weights for policy 0, policy_version 16730 (0.0008) +[2023-10-13 00:54:50,129][46663] Updated weights for policy 1, policy_version 16711 (0.0009) +[2023-10-13 00:54:50,504][46663] Updated weights for policy 1, policy_version 16721 (0.0008) +[2023-10-13 00:54:50,866][46663] Updated weights for policy 1, policy_version 16731 (0.0008) +[2023-10-13 00:54:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34275328. Throughput: 0: 1674.5, 1: 1682.8. Samples: 8583032. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 00:54:53,607][45375] Avg episode reward: [(0, '45.810'), (1, '45.810')] +[2023-10-13 00:54:54,081][46662] Updated weights for policy 0, policy_version 16740 (0.0010) +[2023-10-13 00:54:54,454][46662] Updated weights for policy 0, policy_version 16750 (0.0011) +[2023-10-13 00:54:54,825][46662] Updated weights for policy 0, policy_version 16760 (0.0008) +[2023-10-13 00:54:54,998][46663] Updated weights for policy 1, policy_version 16741 (0.0007) +[2023-10-13 00:54:55,363][46663] Updated weights for policy 1, policy_version 16751 (0.0008) +[2023-10-13 00:54:55,734][46663] Updated weights for policy 1, policy_version 16761 (0.0009) +[2023-10-13 00:54:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34340864. Throughput: 0: 1674.9, 1: 1676.9. Samples: 8592106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:54:58,607][45375] Avg episode reward: [(0, '46.910'), (1, '45.560')] +[2023-10-13 00:54:58,998][46662] Updated weights for policy 0, policy_version 16770 (0.0008) +[2023-10-13 00:54:59,378][46662] Updated weights for policy 0, policy_version 16780 (0.0010) +[2023-10-13 00:54:59,682][46663] Updated weights for policy 1, policy_version 16771 (0.0010) +[2023-10-13 00:54:59,752][46662] Updated weights for policy 0, policy_version 16790 (0.0009) +[2023-10-13 00:55:00,045][46663] Updated weights for policy 1, policy_version 16781 (0.0007) +[2023-10-13 00:55:00,126][46662] Updated weights for policy 0, policy_version 16800 (0.0009) +[2023-10-13 00:55:00,416][46663] Updated weights for policy 1, policy_version 16791 (0.0010) +[2023-10-13 00:55:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34406400. Throughput: 0: 1670.7, 1: 1684.1. Samples: 8612674. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:03,608][45375] Avg episode reward: [(0, '48.570'), (1, '46.560')] +[2023-10-13 00:55:04,018][46662] Updated weights for policy 0, policy_version 16810 (0.0007) +[2023-10-13 00:55:04,386][46662] Updated weights for policy 0, policy_version 16820 (0.0008) +[2023-10-13 00:55:04,453][46663] Updated weights for policy 1, policy_version 16801 (0.0011) +[2023-10-13 00:55:04,751][46662] Updated weights for policy 0, policy_version 16830 (0.0008) +[2023-10-13 00:55:04,828][46663] Updated weights for policy 1, policy_version 16811 (0.0010) +[2023-10-13 00:55:05,192][46663] Updated weights for policy 1, policy_version 16821 (0.0010) +[2023-10-13 00:55:05,556][46663] Updated weights for policy 1, policy_version 16831 (0.0011) +[2023-10-13 00:55:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34471936. Throughput: 0: 1672.9, 1: 1682.8. Samples: 8633326. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:08,607][45375] Avg episode reward: [(0, '48.160'), (1, '46.370')] +[2023-10-13 00:55:08,807][46662] Updated weights for policy 0, policy_version 16840 (0.0007) +[2023-10-13 00:55:09,186][46662] Updated weights for policy 0, policy_version 16850 (0.0007) +[2023-10-13 00:55:09,557][46662] Updated weights for policy 0, policy_version 16860 (0.0009) +[2023-10-13 00:55:09,733][46663] Updated weights for policy 1, policy_version 16841 (0.0007) +[2023-10-13 00:55:10,096][46663] Updated weights for policy 1, policy_version 16851 (0.0008) +[2023-10-13 00:55:10,475][46663] Updated weights for policy 1, policy_version 16861 (0.0007) +[2023-10-13 00:55:13,502][46662] Updated weights for policy 0, policy_version 16870 (0.0010) +[2023-10-13 00:55:13,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34537472. Throughput: 0: 1674.2, 1: 1675.9. Samples: 8642440. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:13,607][45375] Avg episode reward: [(0, '48.730'), (1, '47.820')] +[2023-10-13 00:55:13,876][46662] Updated weights for policy 0, policy_version 16880 (0.0010) +[2023-10-13 00:55:14,250][46662] Updated weights for policy 0, policy_version 16890 (0.0010) +[2023-10-13 00:55:14,489][46663] Updated weights for policy 1, policy_version 16871 (0.0007) +[2023-10-13 00:55:14,857][46663] Updated weights for policy 1, policy_version 16881 (0.0008) +[2023-10-13 00:55:15,224][46663] Updated weights for policy 1, policy_version 16891 (0.0011) +[2023-10-13 00:55:18,250][46662] Updated weights for policy 0, policy_version 16900 (0.0008) +[2023-10-13 00:55:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34603008. Throughput: 0: 1679.4, 1: 1679.2. Samples: 8663224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:18,607][45375] Avg episode reward: [(0, '47.130'), (1, '46.810')] +[2023-10-13 00:55:18,622][46662] Updated weights for policy 0, policy_version 16910 (0.0008) +[2023-10-13 00:55:18,984][46662] Updated weights for policy 0, policy_version 16920 (0.0009) +[2023-10-13 00:55:19,270][46663] Updated weights for policy 1, policy_version 16901 (0.0008) +[2023-10-13 00:55:19,636][46663] Updated weights for policy 1, policy_version 16911 (0.0009) +[2023-10-13 00:55:20,012][46663] Updated weights for policy 1, policy_version 16921 (0.0007) +[2023-10-13 00:55:22,944][46662] Updated weights for policy 0, policy_version 16930 (0.0007) +[2023-10-13 00:55:23,313][46662] Updated weights for policy 0, policy_version 16940 (0.0011) +[2023-10-13 00:55:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34668544. Throughput: 0: 1688.7, 1: 1680.7. Samples: 8684240. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:23,607][45375] Avg episode reward: [(0, '47.290'), (1, '47.000')] +[2023-10-13 00:55:23,684][46662] Updated weights for policy 0, policy_version 16950 (0.0009) +[2023-10-13 00:55:23,952][46663] Updated weights for policy 1, policy_version 16931 (0.0009) +[2023-10-13 00:55:24,049][46662] Updated weights for policy 0, policy_version 16960 (0.0008) +[2023-10-13 00:55:24,320][46663] Updated weights for policy 1, policy_version 16941 (0.0008) +[2023-10-13 00:55:24,687][46663] Updated weights for policy 1, policy_version 16951 (0.0007) +[2023-10-13 00:55:28,051][46662] Updated weights for policy 0, policy_version 16970 (0.0008) +[2023-10-13 00:55:28,421][46662] Updated weights for policy 0, policy_version 16980 (0.0009) +[2023-10-13 00:55:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34734080. Throughput: 0: 1689.6, 1: 1679.5. Samples: 8693254. Policy #0 lag: (min: 12.0, avg: 18.2, max: 44.0) +[2023-10-13 00:55:28,608][45375] Avg episode reward: [(0, '48.060'), (1, '46.710')] +[2023-10-13 00:55:28,793][46662] Updated weights for policy 0, policy_version 16990 (0.0008) +[2023-10-13 00:55:28,944][46663] Updated weights for policy 1, policy_version 16961 (0.0009) +[2023-10-13 00:55:29,317][46663] Updated weights for policy 1, policy_version 16971 (0.0008) +[2023-10-13 00:55:29,690][46663] Updated weights for policy 1, policy_version 16981 (0.0007) +[2023-10-13 00:55:30,060][46663] Updated weights for policy 1, policy_version 16991 (0.0009) +[2023-10-13 00:55:32,864][46662] Updated weights for policy 0, policy_version 17000 (0.0008) +[2023-10-13 00:55:33,247][46662] Updated weights for policy 0, policy_version 17010 (0.0007) +[2023-10-13 00:55:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34799616. Throughput: 0: 1688.6, 1: 1679.4. Samples: 8714160. 
Policy #0 lag: (min: 12.0, avg: 18.2, max: 44.0) +[2023-10-13 00:55:33,608][45375] Avg episode reward: [(0, '47.240'), (1, '46.960')] +[2023-10-13 00:55:33,621][46662] Updated weights for policy 0, policy_version 17020 (0.0008) +[2023-10-13 00:55:34,103][46663] Updated weights for policy 1, policy_version 17001 (0.0009) +[2023-10-13 00:55:34,468][46663] Updated weights for policy 1, policy_version 17011 (0.0007) +[2023-10-13 00:55:34,827][46663] Updated weights for policy 1, policy_version 17021 (0.0008) +[2023-10-13 00:55:37,725][46662] Updated weights for policy 0, policy_version 17030 (0.0008) +[2023-10-13 00:55:38,115][46662] Updated weights for policy 0, policy_version 17040 (0.0010) +[2023-10-13 00:55:38,488][46662] Updated weights for policy 0, policy_version 17050 (0.0010) +[2023-10-13 00:55:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34865152. Throughput: 0: 1683.9, 1: 1687.4. Samples: 8734742. Policy #0 lag: (min: 12.0, avg: 18.2, max: 44.0) +[2023-10-13 00:55:38,607][45375] Avg episode reward: [(0, '45.630'), (1, '47.620')] +[2023-10-13 00:55:38,616][46663] Updated weights for policy 1, policy_version 17031 (0.0008) +[2023-10-13 00:55:38,987][46663] Updated weights for policy 1, policy_version 17041 (0.0009) +[2023-10-13 00:55:39,354][46663] Updated weights for policy 1, policy_version 17051 (0.0007) +[2023-10-13 00:55:42,655][46662] Updated weights for policy 0, policy_version 17060 (0.0008) +[2023-10-13 00:55:43,032][46662] Updated weights for policy 0, policy_version 17070 (0.0007) +[2023-10-13 00:55:43,320][46663] Updated weights for policy 1, policy_version 17061 (0.0009) +[2023-10-13 00:55:43,402][46662] Updated weights for policy 0, policy_version 17080 (0.0007) +[2023-10-13 00:55:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34930688. Throughput: 0: 1692.6, 1: 1687.5. Samples: 8744210. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 00:55:43,608][45375] Avg episode reward: [(0, '46.140'), (1, '46.770')] +[2023-10-13 00:55:43,691][46663] Updated weights for policy 1, policy_version 17071 (0.0007) +[2023-10-13 00:55:44,062][46663] Updated weights for policy 1, policy_version 17081 (0.0007) +[2023-10-13 00:55:47,304][46662] Updated weights for policy 0, policy_version 17090 (0.0008) +[2023-10-13 00:55:47,669][46662] Updated weights for policy 0, policy_version 17100 (0.0008) +[2023-10-13 00:55:48,017][46663] Updated weights for policy 1, policy_version 17091 (0.0007) +[2023-10-13 00:55:48,038][46662] Updated weights for policy 0, policy_version 17110 (0.0009) +[2023-10-13 00:55:48,404][46662] Updated weights for policy 0, policy_version 17120 (0.0008) +[2023-10-13 00:55:48,404][46663] Updated weights for policy 1, policy_version 17101 (0.0009) +[2023-10-13 00:55:48,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 35028992. Throughput: 0: 1692.6, 1: 1697.5. Samples: 8765228. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 00:55:48,607][45375] Avg episode reward: [(0, '46.130'), (1, '47.360')] +[2023-10-13 00:55:48,776][46663] Updated weights for policy 1, policy_version 17111 (0.0008) +[2023-10-13 00:55:52,333][46662] Updated weights for policy 0, policy_version 17130 (0.0009) +[2023-10-13 00:55:52,708][46662] Updated weights for policy 0, policy_version 17140 (0.0008) +[2023-10-13 00:55:52,808][46663] Updated weights for policy 1, policy_version 17121 (0.0008) +[2023-10-13 00:55:53,073][46662] Updated weights for policy 0, policy_version 17150 (0.0008) +[2023-10-13 00:55:53,181][46663] Updated weights for policy 1, policy_version 17131 (0.0008) +[2023-10-13 00:55:53,542][46663] Updated weights for policy 1, policy_version 17141 (0.0008) +[2023-10-13 00:55:53,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35094528. 
Throughput: 0: 1675.5, 1: 1687.7. Samples: 8784668. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:53,607][45375] Avg episode reward: [(0, '46.830'), (1, '47.310')] +[2023-10-13 00:55:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000017152_17563648.pth... +[2023-10-13 00:55:53,647][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000015584_15958016.pth +[2023-10-13 00:55:53,903][46663] Updated weights for policy 1, policy_version 17151 (0.0008) +[2023-10-13 00:55:53,945][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000017152_17563648.pth... +[2023-10-13 00:55:53,983][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000015584_15958016.pth +[2023-10-13 00:55:57,074][46662] Updated weights for policy 0, policy_version 17160 (0.0010) +[2023-10-13 00:55:57,448][46662] Updated weights for policy 0, policy_version 17170 (0.0008) +[2023-10-13 00:55:57,824][46662] Updated weights for policy 0, policy_version 17180 (0.0008) +[2023-10-13 00:55:57,962][46663] Updated weights for policy 1, policy_version 17161 (0.0010) +[2023-10-13 00:55:58,338][46663] Updated weights for policy 1, policy_version 17171 (0.0008) +[2023-10-13 00:55:58,606][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 35160064. Throughput: 0: 1695.0, 1: 1701.0. Samples: 8795258. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:55:58,607][45375] Avg episode reward: [(0, '46.080'), (1, '49.190')] +[2023-10-13 00:55:58,707][46663] Updated weights for policy 1, policy_version 17181 (0.0011) +[2023-10-13 00:56:01,934][46662] Updated weights for policy 0, policy_version 17190 (0.0009) +[2023-10-13 00:56:02,311][46662] Updated weights for policy 0, policy_version 17200 (0.0009) +[2023-10-13 00:56:02,684][46662] Updated weights for policy 0, policy_version 17210 (0.0010) +[2023-10-13 00:56:02,828][46663] Updated weights for policy 1, policy_version 17191 (0.0009) +[2023-10-13 00:56:03,207][46663] Updated weights for policy 1, policy_version 17201 (0.0007) +[2023-10-13 00:56:03,579][46663] Updated weights for policy 1, policy_version 17211 (0.0009) +[2023-10-13 00:56:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 35225600. Throughput: 0: 1692.7, 1: 1700.8. Samples: 8815928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:56:03,607][45375] Avg episode reward: [(0, '45.730'), (1, '49.570')] +[2023-10-13 00:56:06,848][46662] Updated weights for policy 0, policy_version 17220 (0.0009) +[2023-10-13 00:56:07,216][46662] Updated weights for policy 0, policy_version 17230 (0.0008) +[2023-10-13 00:56:07,579][46662] Updated weights for policy 0, policy_version 17240 (0.0009) +[2023-10-13 00:56:07,655][46663] Updated weights for policy 1, policy_version 17221 (0.0008) +[2023-10-13 00:56:08,028][46663] Updated weights for policy 1, policy_version 17231 (0.0008) +[2023-10-13 00:56:08,394][46663] Updated weights for policy 1, policy_version 17241 (0.0008) +[2023-10-13 00:56:08,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35291136. Throughput: 0: 1662.0, 1: 1679.8. Samples: 8834622. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 00:56:08,608][45375] Avg episode reward: [(0, '45.490'), (1, '48.030')] +[2023-10-13 00:56:11,786][46662] Updated weights for policy 0, policy_version 17250 (0.0008) +[2023-10-13 00:56:12,145][46662] Updated weights for policy 0, policy_version 17260 (0.0007) +[2023-10-13 00:56:12,356][46663] Updated weights for policy 1, policy_version 17251 (0.0008) +[2023-10-13 00:56:12,517][46662] Updated weights for policy 0, policy_version 17270 (0.0008) +[2023-10-13 00:56:12,721][46663] Updated weights for policy 1, policy_version 17261 (0.0010) +[2023-10-13 00:56:12,879][46662] Updated weights for policy 0, policy_version 17280 (0.0009) +[2023-10-13 00:56:13,095][46663] Updated weights for policy 1, policy_version 17271 (0.0007) +[2023-10-13 00:56:13,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 35389440. Throughput: 0: 1686.6, 1: 1700.2. Samples: 8845658. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 00:56:13,608][45375] Avg episode reward: [(0, '45.980'), (1, '47.180')] +[2023-10-13 00:56:16,995][46662] Updated weights for policy 0, policy_version 17290 (0.0010) +[2023-10-13 00:56:17,238][46663] Updated weights for policy 1, policy_version 17281 (0.0007) +[2023-10-13 00:56:17,366][46662] Updated weights for policy 0, policy_version 17300 (0.0009) +[2023-10-13 00:56:17,595][46663] Updated weights for policy 1, policy_version 17291 (0.0008) +[2023-10-13 00:56:17,736][46662] Updated weights for policy 0, policy_version 17310 (0.0009) +[2023-10-13 00:56:17,958][46663] Updated weights for policy 1, policy_version 17301 (0.0010) +[2023-10-13 00:56:18,332][46663] Updated weights for policy 1, policy_version 17311 (0.0009) +[2023-10-13 00:56:18,606][45375] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35454976. Throughput: 0: 1680.9, 1: 1694.2. Samples: 8866036. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 00:56:18,607][45375] Avg episode reward: [(0, '46.630'), (1, '45.700')] +[2023-10-13 00:56:21,682][46662] Updated weights for policy 0, policy_version 17320 (0.0008) +[2023-10-13 00:56:22,053][46662] Updated weights for policy 0, policy_version 17330 (0.0008) +[2023-10-13 00:56:22,428][46662] Updated weights for policy 0, policy_version 17340 (0.0009) +[2023-10-13 00:56:22,463][46663] Updated weights for policy 1, policy_version 17321 (0.0008) +[2023-10-13 00:56:22,825][46663] Updated weights for policy 1, policy_version 17331 (0.0008) +[2023-10-13 00:56:23,190][46663] Updated weights for policy 1, policy_version 17341 (0.0011) +[2023-10-13 00:56:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35520512. Throughput: 0: 1666.5, 1: 1664.2. Samples: 8884624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:56:23,607][45375] Avg episode reward: [(0, '46.450'), (1, '46.460')] +[2023-10-13 00:56:26,423][46662] Updated weights for policy 0, policy_version 17350 (0.0008) +[2023-10-13 00:56:26,813][46662] Updated weights for policy 0, policy_version 17360 (0.0007) +[2023-10-13 00:56:27,186][46662] Updated weights for policy 0, policy_version 17370 (0.0007) +[2023-10-13 00:56:27,385][46663] Updated weights for policy 1, policy_version 17351 (0.0009) +[2023-10-13 00:56:27,757][46663] Updated weights for policy 1, policy_version 17361 (0.0009) +[2023-10-13 00:56:28,120][46663] Updated weights for policy 1, policy_version 17371 (0.0009) +[2023-10-13 00:56:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35586048. Throughput: 0: 1688.4, 1: 1687.7. Samples: 8896132. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:56:28,607][45375] Avg episode reward: [(0, '45.150'), (1, '44.880')] +[2023-10-13 00:56:31,381][46662] Updated weights for policy 0, policy_version 17380 (0.0007) +[2023-10-13 00:56:31,753][46662] Updated weights for policy 0, policy_version 17390 (0.0009) +[2023-10-13 00:56:32,130][46662] Updated weights for policy 0, policy_version 17400 (0.0010) +[2023-10-13 00:56:32,224][46663] Updated weights for policy 1, policy_version 17381 (0.0008) +[2023-10-13 00:56:32,595][46663] Updated weights for policy 1, policy_version 17391 (0.0010) +[2023-10-13 00:56:32,963][46663] Updated weights for policy 1, policy_version 17401 (0.0010) +[2023-10-13 00:56:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35651584. Throughput: 0: 1671.1, 1: 1672.5. Samples: 8915692. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 00:56:33,607][45375] Avg episode reward: [(0, '45.680'), (1, '43.510')] +[2023-10-13 00:56:36,240][46662] Updated weights for policy 0, policy_version 17410 (0.0010) +[2023-10-13 00:56:36,613][46662] Updated weights for policy 0, policy_version 17420 (0.0007) +[2023-10-13 00:56:36,983][46662] Updated weights for policy 0, policy_version 17430 (0.0007) +[2023-10-13 00:56:37,078][46663] Updated weights for policy 1, policy_version 17411 (0.0007) +[2023-10-13 00:56:37,355][46662] Updated weights for policy 0, policy_version 17440 (0.0009) +[2023-10-13 00:56:37,480][46663] Updated weights for policy 1, policy_version 17421 (0.0009) +[2023-10-13 00:56:37,845][46663] Updated weights for policy 1, policy_version 17431 (0.0008) +[2023-10-13 00:56:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35717120. Throughput: 0: 1670.6, 1: 1661.6. Samples: 8934618. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:56:38,607][45375] Avg episode reward: [(0, '46.330'), (1, '41.790')] +[2023-10-13 00:56:41,363][46662] Updated weights for policy 0, policy_version 17450 (0.0007) +[2023-10-13 00:56:41,739][46662] Updated weights for policy 0, policy_version 17460 (0.0007) +[2023-10-13 00:56:41,892][46663] Updated weights for policy 1, policy_version 17441 (0.0009) +[2023-10-13 00:56:42,106][46662] Updated weights for policy 0, policy_version 17470 (0.0009) +[2023-10-13 00:56:42,262][46663] Updated weights for policy 1, policy_version 17451 (0.0008) +[2023-10-13 00:56:42,628][46663] Updated weights for policy 1, policy_version 17461 (0.0008) +[2023-10-13 00:56:43,006][46663] Updated weights for policy 1, policy_version 17471 (0.0008) +[2023-10-13 00:56:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35782656. Throughput: 0: 1677.9, 1: 1677.5. Samples: 8946256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:56:43,608][45375] Avg episode reward: [(0, '47.370'), (1, '40.610')] +[2023-10-13 00:56:46,341][46662] Updated weights for policy 0, policy_version 17480 (0.0007) +[2023-10-13 00:56:46,717][46662] Updated weights for policy 0, policy_version 17490 (0.0009) +[2023-10-13 00:56:47,098][46662] Updated weights for policy 0, policy_version 17500 (0.0009) +[2023-10-13 00:56:47,178][46663] Updated weights for policy 1, policy_version 17481 (0.0008) +[2023-10-13 00:56:47,545][46663] Updated weights for policy 1, policy_version 17491 (0.0009) +[2023-10-13 00:56:47,911][46663] Updated weights for policy 1, policy_version 17501 (0.0008) +[2023-10-13 00:56:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 35848192. Throughput: 0: 1659.8, 1: 1664.0. Samples: 8965496. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:56:48,607][45375] Avg episode reward: [(0, '48.080'), (1, '38.630')] +[2023-10-13 00:56:51,099][46662] Updated weights for policy 0, policy_version 17510 (0.0009) +[2023-10-13 00:56:51,473][46662] Updated weights for policy 0, policy_version 17520 (0.0009) +[2023-10-13 00:56:51,846][46662] Updated weights for policy 0, policy_version 17530 (0.0008) +[2023-10-13 00:56:52,058][46663] Updated weights for policy 1, policy_version 17511 (0.0007) +[2023-10-13 00:56:52,430][46663] Updated weights for policy 1, policy_version 17521 (0.0008) +[2023-10-13 00:56:52,801][46663] Updated weights for policy 1, policy_version 17531 (0.0008) +[2023-10-13 00:56:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 35913728. Throughput: 0: 1673.4, 1: 1669.2. Samples: 8985042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:56:53,608][45375] Avg episode reward: [(0, '47.010'), (1, '38.370')] +[2023-10-13 00:56:55,862][46662] Updated weights for policy 0, policy_version 17540 (0.0008) +[2023-10-13 00:56:56,243][46662] Updated weights for policy 0, policy_version 17550 (0.0009) +[2023-10-13 00:56:56,621][46662] Updated weights for policy 0, policy_version 17560 (0.0008) +[2023-10-13 00:56:56,798][46663] Updated weights for policy 1, policy_version 17541 (0.0008) +[2023-10-13 00:56:57,153][46663] Updated weights for policy 1, policy_version 17551 (0.0009) +[2023-10-13 00:56:57,527][46663] Updated weights for policy 1, policy_version 17561 (0.0008) +[2023-10-13 00:56:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 35979264. Throughput: 0: 1675.4, 1: 1679.6. Samples: 8996636. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:56:58,607][45375] Avg episode reward: [(0, '47.180'), (1, '37.370')] +[2023-10-13 00:57:00,772][46662] Updated weights for policy 0, policy_version 17570 (0.0009) +[2023-10-13 00:57:01,154][46662] Updated weights for policy 0, policy_version 17580 (0.0010) +[2023-10-13 00:57:01,516][46662] Updated weights for policy 0, policy_version 17590 (0.0007) +[2023-10-13 00:57:01,608][46663] Updated weights for policy 1, policy_version 17571 (0.0008) +[2023-10-13 00:57:01,888][46662] Updated weights for policy 0, policy_version 17600 (0.0007) +[2023-10-13 00:57:01,980][46663] Updated weights for policy 1, policy_version 17581 (0.0007) +[2023-10-13 00:57:02,356][46663] Updated weights for policy 1, policy_version 17591 (0.0009) +[2023-10-13 00:57:03,606][45375] Fps is (10 sec: 13107.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 36044800. Throughput: 0: 1657.3, 1: 1658.2. Samples: 9015232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:57:03,607][45375] Avg episode reward: [(0, '47.900'), (1, '38.380')] +[2023-10-13 00:57:06,041][46662] Updated weights for policy 0, policy_version 17610 (0.0011) +[2023-10-13 00:57:06,292][46663] Updated weights for policy 1, policy_version 17601 (0.0009) +[2023-10-13 00:57:06,424][46662] Updated weights for policy 0, policy_version 17620 (0.0010) +[2023-10-13 00:57:06,652][46663] Updated weights for policy 1, policy_version 17611 (0.0010) +[2023-10-13 00:57:06,802][46662] Updated weights for policy 0, policy_version 17630 (0.0008) +[2023-10-13 00:57:07,027][46663] Updated weights for policy 1, policy_version 17621 (0.0008) +[2023-10-13 00:57:07,396][46663] Updated weights for policy 1, policy_version 17631 (0.0009) +[2023-10-13 00:57:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 36110336. Throughput: 0: 1673.5, 1: 1673.4. Samples: 9035232. 
Policy #0 lag: (min: 40.0, avg: 55.1, max: 56.0) +[2023-10-13 00:57:08,607][45375] Avg episode reward: [(0, '47.490'), (1, '40.230')] +[2023-10-13 00:57:10,991][46662] Updated weights for policy 0, policy_version 17640 (0.0010) +[2023-10-13 00:57:11,356][46662] Updated weights for policy 0, policy_version 17650 (0.0009) +[2023-10-13 00:57:11,498][46663] Updated weights for policy 1, policy_version 17641 (0.0009) +[2023-10-13 00:57:11,732][46662] Updated weights for policy 0, policy_version 17660 (0.0007) +[2023-10-13 00:57:11,870][46663] Updated weights for policy 1, policy_version 17651 (0.0009) +[2023-10-13 00:57:12,239][46663] Updated weights for policy 1, policy_version 17661 (0.0010) +[2023-10-13 00:57:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 36175872. Throughput: 0: 1665.4, 1: 1673.5. Samples: 9046382. Policy #0 lag: (min: 40.0, avg: 55.1, max: 56.0) +[2023-10-13 00:57:13,607][45375] Avg episode reward: [(0, '47.300'), (1, '39.930')] +[2023-10-13 00:57:15,852][46662] Updated weights for policy 0, policy_version 17670 (0.0007) +[2023-10-13 00:57:16,240][46662] Updated weights for policy 0, policy_version 17680 (0.0010) +[2023-10-13 00:57:16,414][46663] Updated weights for policy 1, policy_version 17671 (0.0008) +[2023-10-13 00:57:16,610][46662] Updated weights for policy 0, policy_version 17690 (0.0008) +[2023-10-13 00:57:16,781][46663] Updated weights for policy 1, policy_version 17681 (0.0009) +[2023-10-13 00:57:17,149][46663] Updated weights for policy 1, policy_version 17691 (0.0009) +[2023-10-13 00:57:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 36241408. Throughput: 0: 1657.6, 1: 1659.4. Samples: 9064954. 
Policy #0 lag: (min: 40.0, avg: 55.1, max: 56.0) +[2023-10-13 00:57:18,607][45375] Avg episode reward: [(0, '46.380'), (1, '40.550')] +[2023-10-13 00:57:20,651][46662] Updated weights for policy 0, policy_version 17700 (0.0008) +[2023-10-13 00:57:21,022][46662] Updated weights for policy 0, policy_version 17710 (0.0008) +[2023-10-13 00:57:21,181][46663] Updated weights for policy 1, policy_version 17701 (0.0009) +[2023-10-13 00:57:21,379][46662] Updated weights for policy 0, policy_version 17720 (0.0007) +[2023-10-13 00:57:21,544][46663] Updated weights for policy 1, policy_version 17711 (0.0008) +[2023-10-13 00:57:21,912][46663] Updated weights for policy 1, policy_version 17721 (0.0007) +[2023-10-13 00:57:23,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 36306944. Throughput: 0: 1671.7, 1: 1682.5. Samples: 9085558. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-13 00:57:23,608][45375] Avg episode reward: [(0, '46.600'), (1, '41.010')] +[2023-10-13 00:57:25,521][46662] Updated weights for policy 0, policy_version 17730 (0.0007) +[2023-10-13 00:57:25,890][46662] Updated weights for policy 0, policy_version 17740 (0.0009) +[2023-10-13 00:57:26,103][46663] Updated weights for policy 1, policy_version 17731 (0.0008) +[2023-10-13 00:57:26,265][46662] Updated weights for policy 0, policy_version 17750 (0.0008) +[2023-10-13 00:57:26,503][46663] Updated weights for policy 1, policy_version 17741 (0.0007) +[2023-10-13 00:57:26,627][46662] Updated weights for policy 0, policy_version 17760 (0.0008) +[2023-10-13 00:57:26,872][46663] Updated weights for policy 1, policy_version 17751 (0.0007) +[2023-10-13 00:57:28,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 36372480. Throughput: 0: 1662.8, 1: 1673.8. Samples: 9096402. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-13 00:57:28,608][45375] Avg episode reward: [(0, '47.230'), (1, '41.620')] +[2023-10-13 00:57:30,705][46662] Updated weights for policy 0, policy_version 17770 (0.0008) +[2023-10-13 00:57:30,965][46663] Updated weights for policy 1, policy_version 17761 (0.0008) +[2023-10-13 00:57:31,080][46662] Updated weights for policy 0, policy_version 17780 (0.0007) +[2023-10-13 00:57:31,331][46663] Updated weights for policy 1, policy_version 17771 (0.0009) +[2023-10-13 00:57:31,444][46662] Updated weights for policy 0, policy_version 17790 (0.0008) +[2023-10-13 00:57:31,688][46663] Updated weights for policy 1, policy_version 17781 (0.0009) +[2023-10-13 00:57:32,054][46663] Updated weights for policy 1, policy_version 17791 (0.0009) +[2023-10-13 00:57:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 36438016. Throughput: 0: 1662.8, 1: 1662.2. Samples: 9115120. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-13 00:57:33,607][45375] Avg episode reward: [(0, '45.650'), (1, '42.040')] +[2023-10-13 00:57:35,408][46662] Updated weights for policy 0, policy_version 17800 (0.0009) +[2023-10-13 00:57:35,783][46662] Updated weights for policy 0, policy_version 17810 (0.0010) +[2023-10-13 00:57:36,093][46663] Updated weights for policy 1, policy_version 17801 (0.0008) +[2023-10-13 00:57:36,159][46662] Updated weights for policy 0, policy_version 17820 (0.0008) +[2023-10-13 00:57:36,469][46663] Updated weights for policy 1, policy_version 17811 (0.0009) +[2023-10-13 00:57:36,840][46663] Updated weights for policy 1, policy_version 17821 (0.0010) +[2023-10-13 00:57:38,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 36503552. Throughput: 0: 1679.1, 1: 1678.2. Samples: 9136118. 
Policy #0 lag: (min: 18.0, avg: 20.4, max: 50.0) +[2023-10-13 00:57:38,607][45375] Avg episode reward: [(0, '45.870'), (1, '42.960')] +[2023-10-13 00:57:40,095][46662] Updated weights for policy 0, policy_version 17830 (0.0008) +[2023-10-13 00:57:40,468][46662] Updated weights for policy 0, policy_version 17840 (0.0009) +[2023-10-13 00:57:40,842][46662] Updated weights for policy 0, policy_version 17850 (0.0009) +[2023-10-13 00:57:40,862][46663] Updated weights for policy 1, policy_version 17831 (0.0008) +[2023-10-13 00:57:41,236][46663] Updated weights for policy 1, policy_version 17841 (0.0009) +[2023-10-13 00:57:41,606][46663] Updated weights for policy 1, policy_version 17851 (0.0010) +[2023-10-13 00:57:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36569088. Throughput: 0: 1659.8, 1: 1660.4. Samples: 9146044. Policy #0 lag: (min: 18.0, avg: 20.4, max: 50.0) +[2023-10-13 00:57:43,607][45375] Avg episode reward: [(0, '46.170'), (1, '43.090')] +[2023-10-13 00:57:44,882][46662] Updated weights for policy 0, policy_version 17860 (0.0007) +[2023-10-13 00:57:45,254][46662] Updated weights for policy 0, policy_version 17870 (0.0008) +[2023-10-13 00:57:45,611][46663] Updated weights for policy 1, policy_version 17861 (0.0008) +[2023-10-13 00:57:45,614][46662] Updated weights for policy 0, policy_version 17880 (0.0009) +[2023-10-13 00:57:45,975][46663] Updated weights for policy 1, policy_version 17871 (0.0008) +[2023-10-13 00:57:46,344][46663] Updated weights for policy 1, policy_version 17881 (0.0010) +[2023-10-13 00:57:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36634624. Throughput: 0: 1676.0, 1: 1669.2. Samples: 9165768. 
Policy #0 lag: (min: 18.0, avg: 20.4, max: 50.0) +[2023-10-13 00:57:48,607][45375] Avg episode reward: [(0, '44.710'), (1, '44.670')] +[2023-10-13 00:57:49,607][46662] Updated weights for policy 0, policy_version 17890 (0.0007) +[2023-10-13 00:57:49,966][46662] Updated weights for policy 0, policy_version 17900 (0.0009) +[2023-10-13 00:57:50,341][46662] Updated weights for policy 0, policy_version 17910 (0.0010) +[2023-10-13 00:57:50,431][46663] Updated weights for policy 1, policy_version 17891 (0.0009) +[2023-10-13 00:57:50,710][46662] Updated weights for policy 0, policy_version 17920 (0.0009) +[2023-10-13 00:57:50,800][46663] Updated weights for policy 1, policy_version 17901 (0.0010) +[2023-10-13 00:57:51,164][46663] Updated weights for policy 1, policy_version 17911 (0.0008) +[2023-10-13 00:57:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 36700160. Throughput: 0: 1683.0, 1: 1676.2. Samples: 9186396. Policy #0 lag: (min: 22.0, avg: 27.9, max: 54.0) +[2023-10-13 00:57:53,608][45375] Avg episode reward: [(0, '44.660'), (1, '44.690')] +[2023-10-13 00:57:53,621][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000017920_18350080.pth... +[2023-10-13 00:57:53,621][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000017920_18350080.pth... 
+[2023-10-13 00:57:53,650][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000016352_16744448.pth +[2023-10-13 00:57:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000016352_16744448.pth +[2023-10-13 00:57:54,889][46662] Updated weights for policy 0, policy_version 17930 (0.0008) +[2023-10-13 00:57:55,255][46662] Updated weights for policy 0, policy_version 17940 (0.0008) +[2023-10-13 00:57:55,277][46663] Updated weights for policy 1, policy_version 17921 (0.0008) +[2023-10-13 00:57:55,631][46662] Updated weights for policy 0, policy_version 17950 (0.0009) +[2023-10-13 00:57:55,649][46663] Updated weights for policy 1, policy_version 17931 (0.0008) +[2023-10-13 00:57:56,009][46663] Updated weights for policy 1, policy_version 17941 (0.0009) +[2023-10-13 00:57:56,383][46663] Updated weights for policy 1, policy_version 17951 (0.0010) +[2023-10-13 00:57:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36765696. Throughput: 0: 1659.5, 1: 1653.8. Samples: 9195480. Policy #0 lag: (min: 22.0, avg: 27.9, max: 54.0) +[2023-10-13 00:57:58,607][45375] Avg episode reward: [(0, '43.680'), (1, '44.870')] +[2023-10-13 00:57:59,873][46662] Updated weights for policy 0, policy_version 17960 (0.0007) +[2023-10-13 00:58:00,241][46662] Updated weights for policy 0, policy_version 17970 (0.0008) +[2023-10-13 00:58:00,497][46663] Updated weights for policy 1, policy_version 17961 (0.0008) +[2023-10-13 00:58:00,613][46662] Updated weights for policy 0, policy_version 17980 (0.0009) +[2023-10-13 00:58:00,865][46663] Updated weights for policy 1, policy_version 17971 (0.0008) +[2023-10-13 00:58:01,235][46663] Updated weights for policy 1, policy_version 17981 (0.0009) +[2023-10-13 00:58:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36831232. Throughput: 0: 1683.7, 1: 1667.4. Samples: 9215752. 
Policy #0 lag: (min: 22.0, avg: 27.9, max: 54.0) +[2023-10-13 00:58:03,607][45375] Avg episode reward: [(0, '42.590'), (1, '44.100')] +[2023-10-13 00:58:04,568][46662] Updated weights for policy 0, policy_version 17990 (0.0008) +[2023-10-13 00:58:04,964][46662] Updated weights for policy 0, policy_version 18000 (0.0010) +[2023-10-13 00:58:05,301][46663] Updated weights for policy 1, policy_version 17991 (0.0010) +[2023-10-13 00:58:05,334][46662] Updated weights for policy 0, policy_version 18010 (0.0007) +[2023-10-13 00:58:05,676][46663] Updated weights for policy 1, policy_version 18001 (0.0008) +[2023-10-13 00:58:06,035][46663] Updated weights for policy 1, policy_version 18011 (0.0009) +[2023-10-13 00:58:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36896768. Throughput: 0: 1682.2, 1: 1671.2. Samples: 9236462. Policy #0 lag: (min: 24.0, avg: 48.6, max: 56.0) +[2023-10-13 00:58:08,608][45375] Avg episode reward: [(0, '41.960'), (1, '43.840')] +[2023-10-13 00:58:09,509][46662] Updated weights for policy 0, policy_version 18020 (0.0008) +[2023-10-13 00:58:09,884][46662] Updated weights for policy 0, policy_version 18030 (0.0007) +[2023-10-13 00:58:10,203][46663] Updated weights for policy 1, policy_version 18021 (0.0009) +[2023-10-13 00:58:10,252][46662] Updated weights for policy 0, policy_version 18040 (0.0007) +[2023-10-13 00:58:10,557][46663] Updated weights for policy 1, policy_version 18031 (0.0008) +[2023-10-13 00:58:10,927][46663] Updated weights for policy 1, policy_version 18041 (0.0008) +[2023-10-13 00:58:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36962304. Throughput: 0: 1664.0, 1: 1649.6. Samples: 9245512. 
Policy #0 lag: (min: 24.0, avg: 48.6, max: 56.0) +[2023-10-13 00:58:13,607][45375] Avg episode reward: [(0, '42.470'), (1, '42.970')] +[2023-10-13 00:58:14,316][46662] Updated weights for policy 0, policy_version 18050 (0.0009) +[2023-10-13 00:58:14,679][46662] Updated weights for policy 0, policy_version 18060 (0.0010) +[2023-10-13 00:58:14,992][46663] Updated weights for policy 1, policy_version 18051 (0.0009) +[2023-10-13 00:58:15,052][46662] Updated weights for policy 0, policy_version 18070 (0.0008) +[2023-10-13 00:58:15,383][46663] Updated weights for policy 1, policy_version 18061 (0.0007) +[2023-10-13 00:58:15,425][46662] Updated weights for policy 0, policy_version 18080 (0.0007) +[2023-10-13 00:58:15,749][46663] Updated weights for policy 1, policy_version 18071 (0.0008) +[2023-10-13 00:58:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37027840. Throughput: 0: 1683.2, 1: 1672.2. Samples: 9266110. Policy #0 lag: (min: 24.0, avg: 48.6, max: 56.0) +[2023-10-13 00:58:18,607][45375] Avg episode reward: [(0, '42.190'), (1, '42.990')] +[2023-10-13 00:58:19,538][46662] Updated weights for policy 0, policy_version 18090 (0.0008) +[2023-10-13 00:58:19,901][46662] Updated weights for policy 0, policy_version 18100 (0.0009) +[2023-10-13 00:58:19,920][46663] Updated weights for policy 1, policy_version 18081 (0.0009) +[2023-10-13 00:58:20,285][46662] Updated weights for policy 0, policy_version 18110 (0.0008) +[2023-10-13 00:58:20,286][46663] Updated weights for policy 1, policy_version 18091 (0.0008) +[2023-10-13 00:58:20,660][46663] Updated weights for policy 1, policy_version 18101 (0.0008) +[2023-10-13 00:58:21,029][46663] Updated weights for policy 1, policy_version 18111 (0.0008) +[2023-10-13 00:58:23,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 37093376. Throughput: 0: 1670.3, 1: 1671.0. Samples: 9286476. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-13 00:58:23,608][45375] Avg episode reward: [(0, '42.380'), (1, '41.770')] +[2023-10-13 00:58:24,410][46662] Updated weights for policy 0, policy_version 18120 (0.0009) +[2023-10-13 00:58:24,783][46662] Updated weights for policy 0, policy_version 18130 (0.0007) +[2023-10-13 00:58:25,096][46663] Updated weights for policy 1, policy_version 18121 (0.0009) +[2023-10-13 00:58:25,154][46662] Updated weights for policy 0, policy_version 18140 (0.0008) +[2023-10-13 00:58:25,460][46663] Updated weights for policy 1, policy_version 18131 (0.0007) +[2023-10-13 00:58:25,824][46663] Updated weights for policy 1, policy_version 18141 (0.0008) +[2023-10-13 00:58:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37158912. Throughput: 0: 1665.4, 1: 1659.0. Samples: 9295644. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-13 00:58:28,607][45375] Avg episode reward: [(0, '43.140'), (1, '39.810')] +[2023-10-13 00:58:29,291][46662] Updated weights for policy 0, policy_version 18150 (0.0010) +[2023-10-13 00:58:29,669][46662] Updated weights for policy 0, policy_version 18160 (0.0007) +[2023-10-13 00:58:30,021][46663] Updated weights for policy 1, policy_version 18151 (0.0008) +[2023-10-13 00:58:30,037][46662] Updated weights for policy 0, policy_version 18170 (0.0008) +[2023-10-13 00:58:30,388][46663] Updated weights for policy 1, policy_version 18161 (0.0008) +[2023-10-13 00:58:30,747][46663] Updated weights for policy 1, policy_version 18171 (0.0009) +[2023-10-13 00:58:33,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37224448. Throughput: 0: 1669.0, 1: 1672.6. Samples: 9316140. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-13 00:58:33,607][45375] Avg episode reward: [(0, '43.530'), (1, '40.520')] +[2023-10-13 00:58:33,992][46662] Updated weights for policy 0, policy_version 18180 (0.0008) +[2023-10-13 00:58:34,364][46662] Updated weights for policy 0, policy_version 18190 (0.0011) +[2023-10-13 00:58:34,745][46662] Updated weights for policy 0, policy_version 18200 (0.0010) +[2023-10-13 00:58:34,944][46663] Updated weights for policy 1, policy_version 18181 (0.0009) +[2023-10-13 00:58:35,309][46663] Updated weights for policy 1, policy_version 18191 (0.0008) +[2023-10-13 00:58:35,683][46663] Updated weights for policy 1, policy_version 18201 (0.0008) +[2023-10-13 00:58:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37289984. Throughput: 0: 1668.3, 1: 1670.6. Samples: 9336644. Policy #0 lag: (min: 16.0, avg: 42.3, max: 48.0) +[2023-10-13 00:58:38,607][45375] Avg episode reward: [(0, '43.870'), (1, '41.200')] +[2023-10-13 00:58:38,810][46662] Updated weights for policy 0, policy_version 18210 (0.0009) +[2023-10-13 00:58:39,182][46662] Updated weights for policy 0, policy_version 18220 (0.0009) +[2023-10-13 00:58:39,553][46662] Updated weights for policy 0, policy_version 18230 (0.0008) +[2023-10-13 00:58:39,726][46663] Updated weights for policy 1, policy_version 18211 (0.0008) +[2023-10-13 00:58:39,919][46662] Updated weights for policy 0, policy_version 18240 (0.0009) +[2023-10-13 00:58:40,085][46663] Updated weights for policy 1, policy_version 18221 (0.0007) +[2023-10-13 00:58:40,454][46663] Updated weights for policy 1, policy_version 18231 (0.0009) +[2023-10-13 00:58:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 37355520. Throughput: 0: 1672.0, 1: 1669.2. Samples: 9345832. 
Policy #0 lag: (min: 16.0, avg: 42.3, max: 48.0) +[2023-10-13 00:58:43,608][45375] Avg episode reward: [(0, '43.150'), (1, '40.200')] +[2023-10-13 00:58:44,027][46662] Updated weights for policy 0, policy_version 18250 (0.0007) +[2023-10-13 00:58:44,396][46662] Updated weights for policy 0, policy_version 18260 (0.0008) +[2023-10-13 00:58:44,646][46663] Updated weights for policy 1, policy_version 18241 (0.0010) +[2023-10-13 00:58:44,766][46662] Updated weights for policy 0, policy_version 18270 (0.0008) +[2023-10-13 00:58:45,010][46663] Updated weights for policy 1, policy_version 18251 (0.0008) +[2023-10-13 00:58:45,386][46663] Updated weights for policy 1, policy_version 18261 (0.0009) +[2023-10-13 00:58:45,752][46663] Updated weights for policy 1, policy_version 18271 (0.0008) +[2023-10-13 00:58:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37421056. Throughput: 0: 1675.3, 1: 1679.2. Samples: 9366706. Policy #0 lag: (min: 16.0, avg: 42.3, max: 48.0) +[2023-10-13 00:58:48,607][45375] Avg episode reward: [(0, '44.140'), (1, '40.080')] +[2023-10-13 00:58:48,810][46662] Updated weights for policy 0, policy_version 18280 (0.0008) +[2023-10-13 00:58:49,173][46662] Updated weights for policy 0, policy_version 18290 (0.0010) +[2023-10-13 00:58:49,542][46662] Updated weights for policy 0, policy_version 18300 (0.0010) +[2023-10-13 00:58:49,740][46663] Updated weights for policy 1, policy_version 18281 (0.0008) +[2023-10-13 00:58:50,112][46663] Updated weights for policy 1, policy_version 18291 (0.0007) +[2023-10-13 00:58:50,484][46663] Updated weights for policy 1, policy_version 18301 (0.0008) +[2023-10-13 00:58:53,594][46662] Updated weights for policy 0, policy_version 18310 (0.0008) +[2023-10-13 00:58:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37486592. Throughput: 0: 1682.5, 1: 1669.0. Samples: 9387280. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:58:53,607][45375] Avg episode reward: [(0, '46.330'), (1, '39.310')] +[2023-10-13 00:58:53,976][46662] Updated weights for policy 0, policy_version 18320 (0.0010) +[2023-10-13 00:58:54,352][46662] Updated weights for policy 0, policy_version 18330 (0.0010) +[2023-10-13 00:58:54,655][46663] Updated weights for policy 1, policy_version 18311 (0.0009) +[2023-10-13 00:58:55,025][46663] Updated weights for policy 1, policy_version 18321 (0.0009) +[2023-10-13 00:58:55,401][46663] Updated weights for policy 1, policy_version 18331 (0.0008) +[2023-10-13 00:58:58,486][46662] Updated weights for policy 0, policy_version 18340 (0.0008) +[2023-10-13 00:58:58,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 37552128. Throughput: 0: 1678.1, 1: 1674.1. Samples: 9396362. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:58:58,608][45375] Avg episode reward: [(0, '46.470'), (1, '39.640')] +[2023-10-13 00:58:58,857][46662] Updated weights for policy 0, policy_version 18350 (0.0008) +[2023-10-13 00:58:59,230][46662] Updated weights for policy 0, policy_version 18360 (0.0007) +[2023-10-13 00:58:59,571][46663] Updated weights for policy 1, policy_version 18341 (0.0007) +[2023-10-13 00:58:59,944][46663] Updated weights for policy 1, policy_version 18351 (0.0011) +[2023-10-13 00:59:00,318][46663] Updated weights for policy 1, policy_version 18361 (0.0008) +[2023-10-13 00:59:03,204][46662] Updated weights for policy 0, policy_version 18370 (0.0008) +[2023-10-13 00:59:03,575][46662] Updated weights for policy 0, policy_version 18380 (0.0007) +[2023-10-13 00:59:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37617664. Throughput: 0: 1681.3, 1: 1672.4. Samples: 9417026. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:59:03,607][45375] Avg episode reward: [(0, '47.000'), (1, '39.170')] +[2023-10-13 00:59:03,951][46662] Updated weights for policy 0, policy_version 18390 (0.0008) +[2023-10-13 00:59:04,322][46663] Updated weights for policy 1, policy_version 18371 (0.0008) +[2023-10-13 00:59:04,329][46662] Updated weights for policy 0, policy_version 18400 (0.0007) +[2023-10-13 00:59:04,736][46663] Updated weights for policy 1, policy_version 18381 (0.0009) +[2023-10-13 00:59:05,109][46663] Updated weights for policy 1, policy_version 18391 (0.0008) +[2023-10-13 00:59:08,247][46662] Updated weights for policy 0, policy_version 18410 (0.0008) +[2023-10-13 00:59:08,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37683200. Throughput: 0: 1685.2, 1: 1667.2. Samples: 9437332. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:59:08,607][45375] Avg episode reward: [(0, '48.670'), (1, '39.170')] +[2023-10-13 00:59:08,614][46662] Updated weights for policy 0, policy_version 18420 (0.0008) +[2023-10-13 00:59:08,979][46662] Updated weights for policy 0, policy_version 18430 (0.0007) +[2023-10-13 00:59:09,152][46663] Updated weights for policy 1, policy_version 18401 (0.0008) +[2023-10-13 00:59:09,519][46663] Updated weights for policy 1, policy_version 18411 (0.0009) +[2023-10-13 00:59:09,894][46663] Updated weights for policy 1, policy_version 18421 (0.0009) +[2023-10-13 00:59:10,266][46663] Updated weights for policy 1, policy_version 18431 (0.0007) +[2023-10-13 00:59:13,035][46662] Updated weights for policy 0, policy_version 18440 (0.0009) +[2023-10-13 00:59:13,399][46662] Updated weights for policy 0, policy_version 18450 (0.0008) +[2023-10-13 00:59:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37748736. Throughput: 0: 1687.2, 1: 1668.4. Samples: 9446646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:59:13,607][45375] Avg episode reward: [(0, '49.520'), (1, '39.310')] +[2023-10-13 00:59:13,775][46662] Updated weights for policy 0, policy_version 18460 (0.0008) +[2023-10-13 00:59:14,478][46663] Updated weights for policy 1, policy_version 18441 (0.0009) +[2023-10-13 00:59:14,851][46663] Updated weights for policy 1, policy_version 18451 (0.0009) +[2023-10-13 00:59:15,222][46663] Updated weights for policy 1, policy_version 18461 (0.0008) +[2023-10-13 00:59:18,000][46662] Updated weights for policy 0, policy_version 18470 (0.0007) +[2023-10-13 00:59:18,373][46662] Updated weights for policy 0, policy_version 18480 (0.0010) +[2023-10-13 00:59:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37814272. Throughput: 0: 1690.7, 1: 1667.6. Samples: 9467268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 00:59:18,608][45375] Avg episode reward: [(0, '49.430'), (1, '40.160')] +[2023-10-13 00:59:18,744][46662] Updated weights for policy 0, policy_version 18490 (0.0010) +[2023-10-13 00:59:19,197][46663] Updated weights for policy 1, policy_version 18471 (0.0008) +[2023-10-13 00:59:19,568][46663] Updated weights for policy 1, policy_version 18481 (0.0008) +[2023-10-13 00:59:19,924][46663] Updated weights for policy 1, policy_version 18491 (0.0008) +[2023-10-13 00:59:22,821][46662] Updated weights for policy 0, policy_version 18500 (0.0008) +[2023-10-13 00:59:23,191][46662] Updated weights for policy 0, policy_version 18510 (0.0008) +[2023-10-13 00:59:23,569][46662] Updated weights for policy 0, policy_version 18520 (0.0007) +[2023-10-13 00:59:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37879808. Throughput: 0: 1685.6, 1: 1671.8. Samples: 9487724. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:59:23,607][45375] Avg episode reward: [(0, '49.680'), (1, '42.080')] +[2023-10-13 00:59:24,043][46663] Updated weights for policy 1, policy_version 18501 (0.0009) +[2023-10-13 00:59:24,409][46663] Updated weights for policy 1, policy_version 18511 (0.0007) +[2023-10-13 00:59:24,785][46663] Updated weights for policy 1, policy_version 18521 (0.0008) +[2023-10-13 00:59:27,557][46662] Updated weights for policy 0, policy_version 18530 (0.0009) +[2023-10-13 00:59:27,934][46662] Updated weights for policy 0, policy_version 18540 (0.0008) +[2023-10-13 00:59:28,306][46662] Updated weights for policy 0, policy_version 18550 (0.0009) +[2023-10-13 00:59:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37945344. Throughput: 0: 1687.9, 1: 1671.1. Samples: 9496986. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 00:59:28,607][45375] Avg episode reward: [(0, '50.000'), (1, '42.900')] +[2023-10-13 00:59:28,681][46662] Updated weights for policy 0, policy_version 18560 (0.0009) +[2023-10-13 00:59:28,690][46663] Updated weights for policy 1, policy_version 18531 (0.0009) +[2023-10-13 00:59:29,058][46663] Updated weights for policy 1, policy_version 18541 (0.0009) +[2023-10-13 00:59:29,427][46663] Updated weights for policy 1, policy_version 18551 (0.0007) +[2023-10-13 00:59:32,734][46662] Updated weights for policy 0, policy_version 18570 (0.0010) +[2023-10-13 00:59:33,110][46662] Updated weights for policy 0, policy_version 18580 (0.0007) +[2023-10-13 00:59:33,357][46663] Updated weights for policy 1, policy_version 18561 (0.0009) +[2023-10-13 00:59:33,482][46662] Updated weights for policy 0, policy_version 18590 (0.0007) +[2023-10-13 00:59:33,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38043648. Throughput: 0: 1685.1, 1: 1672.4. Samples: 9517794. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:59:33,607][45375] Avg episode reward: [(0, '49.790'), (1, '43.310')] +[2023-10-13 00:59:33,724][46663] Updated weights for policy 1, policy_version 18571 (0.0007) +[2023-10-13 00:59:34,084][46663] Updated weights for policy 1, policy_version 18581 (0.0008) +[2023-10-13 00:59:34,456][46663] Updated weights for policy 1, policy_version 18591 (0.0010) +[2023-10-13 00:59:37,529][46662] Updated weights for policy 0, policy_version 18600 (0.0009) +[2023-10-13 00:59:37,902][46662] Updated weights for policy 0, policy_version 18610 (0.0011) +[2023-10-13 00:59:38,275][46662] Updated weights for policy 0, policy_version 18620 (0.0010) +[2023-10-13 00:59:38,538][46663] Updated weights for policy 1, policy_version 18601 (0.0008) +[2023-10-13 00:59:38,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38109184. Throughput: 0: 1670.8, 1: 1676.9. Samples: 9537928. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:59:38,607][45375] Avg episode reward: [(0, '50.550'), (1, '45.290')] +[2023-10-13 00:59:38,905][46663] Updated weights for policy 1, policy_version 18611 (0.0011) +[2023-10-13 00:59:39,278][46663] Updated weights for policy 1, policy_version 18621 (0.0009) +[2023-10-13 00:59:42,587][46662] Updated weights for policy 0, policy_version 18630 (0.0009) +[2023-10-13 00:59:42,963][46662] Updated weights for policy 0, policy_version 18640 (0.0010) +[2023-10-13 00:59:43,343][46662] Updated weights for policy 0, policy_version 18650 (0.0008) +[2023-10-13 00:59:43,529][46663] Updated weights for policy 1, policy_version 18631 (0.0008) +[2023-10-13 00:59:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38174720. Throughput: 0: 1688.0, 1: 1674.1. Samples: 9547656. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 00:59:43,608][45375] Avg episode reward: [(0, '52.280'), (1, '45.320')] +[2023-10-13 00:59:43,609][46091] Saving new best policy, reward=52.280! +[2023-10-13 00:59:43,890][46663] Updated weights for policy 1, policy_version 18641 (0.0007) +[2023-10-13 00:59:44,263][46663] Updated weights for policy 1, policy_version 18651 (0.0009) +[2023-10-13 00:59:47,304][46662] Updated weights for policy 0, policy_version 18660 (0.0008) +[2023-10-13 00:59:47,671][46662] Updated weights for policy 0, policy_version 18670 (0.0010) +[2023-10-13 00:59:48,040][46662] Updated weights for policy 0, policy_version 18680 (0.0010) +[2023-10-13 00:59:48,598][46663] Updated weights for policy 1, policy_version 18661 (0.0008) +[2023-10-13 00:59:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38240256. Throughput: 0: 1685.2, 1: 1672.8. Samples: 9568140. Policy #0 lag: (min: 11.0, avg: 13.1, max: 43.0) +[2023-10-13 00:59:48,607][45375] Avg episode reward: [(0, '53.060'), (1, '44.850')] +[2023-10-13 00:59:48,608][46091] Saving new best policy, reward=53.060! +[2023-10-13 00:59:48,964][46663] Updated weights for policy 1, policy_version 18671 (0.0009) +[2023-10-13 00:59:49,334][46663] Updated weights for policy 1, policy_version 18681 (0.0008) +[2023-10-13 00:59:51,916][46662] Updated weights for policy 0, policy_version 18690 (0.0008) +[2023-10-13 00:59:52,285][46662] Updated weights for policy 0, policy_version 18700 (0.0008) +[2023-10-13 00:59:52,655][46662] Updated weights for policy 0, policy_version 18710 (0.0010) +[2023-10-13 00:59:53,028][46662] Updated weights for policy 0, policy_version 18720 (0.0009) +[2023-10-13 00:59:53,519][46663] Updated weights for policy 1, policy_version 18691 (0.0010) +[2023-10-13 00:59:53,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38305792. Throughput: 0: 1669.8, 1: 1674.0. Samples: 9587804. 
Policy #0 lag: (min: 11.0, avg: 13.1, max: 43.0) +[2023-10-13 00:59:53,607][45375] Avg episode reward: [(0, '52.450'), (1, '45.140')] +[2023-10-13 00:59:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000018720_19169280.pth... +[2023-10-13 00:59:53,650][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000017152_17563648.pth +[2023-10-13 00:59:53,950][46663] Updated weights for policy 1, policy_version 18701 (0.0011) +[2023-10-13 00:59:54,306][46663] Updated weights for policy 1, policy_version 18711 (0.0010) +[2023-10-13 00:59:54,632][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000018720_19169280.pth... +[2023-10-13 00:59:54,661][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000017152_17563648.pth +[2023-10-13 00:59:57,143][46662] Updated weights for policy 0, policy_version 18730 (0.0007) +[2023-10-13 00:59:57,512][46662] Updated weights for policy 0, policy_version 18740 (0.0007) +[2023-10-13 00:59:57,885][46662] Updated weights for policy 0, policy_version 18750 (0.0008) +[2023-10-13 00:59:58,470][46663] Updated weights for policy 1, policy_version 18721 (0.0009) +[2023-10-13 00:59:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38371328. Throughput: 0: 1687.3, 1: 1670.6. Samples: 9597752. 
Policy #0 lag: (min: 11.0, avg: 13.1, max: 43.0) +[2023-10-13 00:59:58,607][45375] Avg episode reward: [(0, '52.630'), (1, '45.680')] +[2023-10-13 00:59:58,831][46663] Updated weights for policy 1, policy_version 18731 (0.0010) +[2023-10-13 00:59:59,195][46663] Updated weights for policy 1, policy_version 18741 (0.0011) +[2023-10-13 00:59:59,569][46663] Updated weights for policy 1, policy_version 18751 (0.0011) +[2023-10-13 01:00:01,855][46662] Updated weights for policy 0, policy_version 18760 (0.0009) +[2023-10-13 01:00:02,224][46662] Updated weights for policy 0, policy_version 18770 (0.0007) +[2023-10-13 01:00:02,598][46662] Updated weights for policy 0, policy_version 18780 (0.0010) +[2023-10-13 01:00:03,578][46663] Updated weights for policy 1, policy_version 18761 (0.0007) +[2023-10-13 01:00:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38436864. Throughput: 0: 1682.5, 1: 1669.4. Samples: 9618106. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-13 01:00:03,607][45375] Avg episode reward: [(0, '51.320'), (1, '46.110')] +[2023-10-13 01:00:03,952][46663] Updated weights for policy 1, policy_version 18771 (0.0008) +[2023-10-13 01:00:04,321][46663] Updated weights for policy 1, policy_version 18781 (0.0007) +[2023-10-13 01:00:06,754][46662] Updated weights for policy 0, policy_version 18790 (0.0008) +[2023-10-13 01:00:07,129][46662] Updated weights for policy 0, policy_version 18800 (0.0008) +[2023-10-13 01:00:07,496][46662] Updated weights for policy 0, policy_version 18810 (0.0010) +[2023-10-13 01:00:08,426][46663] Updated weights for policy 1, policy_version 18791 (0.0010) +[2023-10-13 01:00:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38502400. Throughput: 0: 1663.0, 1: 1664.2. Samples: 9637448. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-13 01:00:08,607][45375] Avg episode reward: [(0, '51.570'), (1, '45.490')] +[2023-10-13 01:00:08,801][46663] Updated weights for policy 1, policy_version 18801 (0.0009) +[2023-10-13 01:00:09,177][46663] Updated weights for policy 1, policy_version 18811 (0.0011) +[2023-10-13 01:00:11,513][46662] Updated weights for policy 0, policy_version 18820 (0.0009) +[2023-10-13 01:00:11,884][46662] Updated weights for policy 0, policy_version 18830 (0.0007) +[2023-10-13 01:00:12,256][46662] Updated weights for policy 0, policy_version 18840 (0.0009) +[2023-10-13 01:00:13,313][46663] Updated weights for policy 1, policy_version 18821 (0.0008) +[2023-10-13 01:00:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38567936. Throughput: 0: 1691.9, 1: 1667.3. Samples: 9648150. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-13 01:00:13,607][45375] Avg episode reward: [(0, '51.650'), (1, '46.140')] +[2023-10-13 01:00:13,698][46663] Updated weights for policy 1, policy_version 18831 (0.0008) +[2023-10-13 01:00:14,070][46663] Updated weights for policy 1, policy_version 18841 (0.0011) +[2023-10-13 01:00:16,273][46662] Updated weights for policy 0, policy_version 18850 (0.0010) +[2023-10-13 01:00:16,641][46662] Updated weights for policy 0, policy_version 18860 (0.0007) +[2023-10-13 01:00:17,001][46662] Updated weights for policy 0, policy_version 18870 (0.0008) +[2023-10-13 01:00:17,378][46662] Updated weights for policy 0, policy_version 18880 (0.0008) +[2023-10-13 01:00:18,221][46663] Updated weights for policy 1, policy_version 18851 (0.0010) +[2023-10-13 01:00:18,587][46663] Updated weights for policy 1, policy_version 18861 (0.0009) +[2023-10-13 01:00:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38633472. Throughput: 0: 1680.3, 1: 1660.1. Samples: 9668112. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 01:00:18,607][45375] Avg episode reward: [(0, '52.410'), (1, '45.030')] +[2023-10-13 01:00:18,956][46663] Updated weights for policy 1, policy_version 18871 (0.0007) +[2023-10-13 01:00:21,470][46662] Updated weights for policy 0, policy_version 18890 (0.0008) +[2023-10-13 01:00:21,843][46662] Updated weights for policy 0, policy_version 18900 (0.0010) +[2023-10-13 01:00:22,206][46662] Updated weights for policy 0, policy_version 18910 (0.0008) +[2023-10-13 01:00:22,998][46663] Updated weights for policy 1, policy_version 18881 (0.0009) +[2023-10-13 01:00:23,358][46663] Updated weights for policy 1, policy_version 18891 (0.0009) +[2023-10-13 01:00:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38699008. Throughput: 0: 1676.7, 1: 1654.5. Samples: 9687830. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 01:00:23,607][45375] Avg episode reward: [(0, '53.900'), (1, '44.030')] +[2023-10-13 01:00:23,615][46091] Saving new best policy, reward=53.900! +[2023-10-13 01:00:23,734][46663] Updated weights for policy 1, policy_version 18901 (0.0010) +[2023-10-13 01:00:24,098][46663] Updated weights for policy 1, policy_version 18911 (0.0008) +[2023-10-13 01:00:26,258][46662] Updated weights for policy 0, policy_version 18920 (0.0009) +[2023-10-13 01:00:26,624][46662] Updated weights for policy 0, policy_version 18930 (0.0009) +[2023-10-13 01:00:27,000][46662] Updated weights for policy 0, policy_version 18940 (0.0008) +[2023-10-13 01:00:28,051][46663] Updated weights for policy 1, policy_version 18921 (0.0011) +[2023-10-13 01:00:28,420][46663] Updated weights for policy 1, policy_version 18931 (0.0009) +[2023-10-13 01:00:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38764544. Throughput: 0: 1692.8, 1: 1662.5. Samples: 9698642. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 01:00:28,607][45375] Avg episode reward: [(0, '52.080'), (1, '43.490')] +[2023-10-13 01:00:28,785][46663] Updated weights for policy 1, policy_version 18941 (0.0007) +[2023-10-13 01:00:31,045][46662] Updated weights for policy 0, policy_version 18950 (0.0009) +[2023-10-13 01:00:31,414][46662] Updated weights for policy 0, policy_version 18960 (0.0008) +[2023-10-13 01:00:31,795][46662] Updated weights for policy 0, policy_version 18970 (0.0007) +[2023-10-13 01:00:33,018][46663] Updated weights for policy 1, policy_version 18951 (0.0007) +[2023-10-13 01:00:33,380][46663] Updated weights for policy 1, policy_version 18961 (0.0008) +[2023-10-13 01:00:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38830080. Throughput: 0: 1666.5, 1: 1667.9. Samples: 9718190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:00:33,607][45375] Avg episode reward: [(0, '51.990'), (1, '41.570')] +[2023-10-13 01:00:33,751][46663] Updated weights for policy 1, policy_version 18971 (0.0008) +[2023-10-13 01:00:35,922][46662] Updated weights for policy 0, policy_version 18980 (0.0009) +[2023-10-13 01:00:36,303][46662] Updated weights for policy 0, policy_version 18990 (0.0007) +[2023-10-13 01:00:36,676][46662] Updated weights for policy 0, policy_version 19000 (0.0007) +[2023-10-13 01:00:38,056][46663] Updated weights for policy 1, policy_version 18981 (0.0010) +[2023-10-13 01:00:38,445][46663] Updated weights for policy 1, policy_version 18991 (0.0009) +[2023-10-13 01:00:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38895616. Throughput: 0: 1672.4, 1: 1653.9. Samples: 9737490. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:00:38,607][45375] Avg episode reward: [(0, '51.560'), (1, '41.820')] +[2023-10-13 01:00:38,804][46663] Updated weights for policy 1, policy_version 19001 (0.0009) +[2023-10-13 01:00:40,684][46662] Updated weights for policy 0, policy_version 19010 (0.0009) +[2023-10-13 01:00:41,058][46662] Updated weights for policy 0, policy_version 19020 (0.0008) +[2023-10-13 01:00:41,430][46662] Updated weights for policy 0, policy_version 19030 (0.0007) +[2023-10-13 01:00:41,804][46662] Updated weights for policy 0, policy_version 19040 (0.0008) +[2023-10-13 01:00:42,669][46663] Updated weights for policy 1, policy_version 19011 (0.0008) +[2023-10-13 01:00:43,044][46663] Updated weights for policy 1, policy_version 19021 (0.0007) +[2023-10-13 01:00:43,407][46663] Updated weights for policy 1, policy_version 19031 (0.0007) +[2023-10-13 01:00:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38961152. Throughput: 0: 1672.7, 1: 1672.9. Samples: 9748302. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:00:43,607][45375] Avg episode reward: [(0, '50.930'), (1, '43.300')] +[2023-10-13 01:00:45,828][46662] Updated weights for policy 0, policy_version 19050 (0.0011) +[2023-10-13 01:00:46,207][46662] Updated weights for policy 0, policy_version 19060 (0.0009) +[2023-10-13 01:00:46,579][46662] Updated weights for policy 0, policy_version 19070 (0.0007) +[2023-10-13 01:00:47,436][46663] Updated weights for policy 1, policy_version 19041 (0.0007) +[2023-10-13 01:00:47,814][46663] Updated weights for policy 1, policy_version 19051 (0.0008) +[2023-10-13 01:00:48,180][46663] Updated weights for policy 1, policy_version 19061 (0.0009) +[2023-10-13 01:00:48,543][46663] Updated weights for policy 1, policy_version 19071 (0.0009) +[2023-10-13 01:00:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39059456. 
Throughput: 0: 1648.8, 1: 1675.4. Samples: 9767696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:00:48,607][45375] Avg episode reward: [(0, '49.490'), (1, '44.540')] +[2023-10-13 01:00:50,745][46662] Updated weights for policy 0, policy_version 19080 (0.0007) +[2023-10-13 01:00:51,108][46662] Updated weights for policy 0, policy_version 19090 (0.0007) +[2023-10-13 01:00:51,475][46662] Updated weights for policy 0, policy_version 19100 (0.0007) +[2023-10-13 01:00:52,386][46663] Updated weights for policy 1, policy_version 19081 (0.0008) +[2023-10-13 01:00:52,748][46663] Updated weights for policy 1, policy_version 19091 (0.0008) +[2023-10-13 01:00:53,117][46663] Updated weights for policy 1, policy_version 19101 (0.0008) +[2023-10-13 01:00:53,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39124992. Throughput: 0: 1672.9, 1: 1653.7. Samples: 9787148. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:00:53,607][45375] Avg episode reward: [(0, '48.870'), (1, '45.160')] +[2023-10-13 01:00:55,682][46662] Updated weights for policy 0, policy_version 19110 (0.0010) +[2023-10-13 01:00:56,058][46662] Updated weights for policy 0, policy_version 19120 (0.0010) +[2023-10-13 01:00:56,422][46662] Updated weights for policy 0, policy_version 19130 (0.0010) +[2023-10-13 01:00:57,315][46663] Updated weights for policy 1, policy_version 19111 (0.0009) +[2023-10-13 01:00:57,688][46663] Updated weights for policy 1, policy_version 19121 (0.0009) +[2023-10-13 01:00:58,060][46663] Updated weights for policy 1, policy_version 19131 (0.0010) +[2023-10-13 01:00:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39190528. Throughput: 0: 1656.6, 1: 1678.0. Samples: 9798204. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:00:58,607][45375] Avg episode reward: [(0, '48.820'), (1, '44.860')] +[2023-10-13 01:01:00,765][46662] Updated weights for policy 0, policy_version 19140 (0.0007) +[2023-10-13 01:01:01,136][46662] Updated weights for policy 0, policy_version 19150 (0.0009) +[2023-10-13 01:01:01,510][46662] Updated weights for policy 0, policy_version 19160 (0.0007) +[2023-10-13 01:01:02,267][46663] Updated weights for policy 1, policy_version 19141 (0.0008) +[2023-10-13 01:01:02,629][46663] Updated weights for policy 1, policy_version 19151 (0.0008) +[2023-10-13 01:01:02,991][46663] Updated weights for policy 1, policy_version 19161 (0.0007) +[2023-10-13 01:01:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39256064. Throughput: 0: 1650.4, 1: 1671.6. Samples: 9817604. Policy #0 lag: (min: 31.0, avg: 46.9, max: 63.0) +[2023-10-13 01:01:03,607][45375] Avg episode reward: [(0, '49.250'), (1, '45.770')] +[2023-10-13 01:01:05,467][46662] Updated weights for policy 0, policy_version 19170 (0.0007) +[2023-10-13 01:01:05,843][46662] Updated weights for policy 0, policy_version 19180 (0.0008) +[2023-10-13 01:01:06,224][46662] Updated weights for policy 0, policy_version 19190 (0.0007) +[2023-10-13 01:01:06,595][46662] Updated weights for policy 0, policy_version 19200 (0.0010) +[2023-10-13 01:01:07,091][46663] Updated weights for policy 1, policy_version 19171 (0.0008) +[2023-10-13 01:01:07,462][46663] Updated weights for policy 1, policy_version 19181 (0.0009) +[2023-10-13 01:01:07,837][46663] Updated weights for policy 1, policy_version 19191 (0.0009) +[2023-10-13 01:01:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39321600. Throughput: 0: 1669.5, 1: 1655.1. Samples: 9837436. 
Policy #0 lag: (min: 31.0, avg: 46.9, max: 63.0) +[2023-10-13 01:01:08,607][45375] Avg episode reward: [(0, '50.880'), (1, '46.300')] +[2023-10-13 01:01:10,527][46662] Updated weights for policy 0, policy_version 19210 (0.0008) +[2023-10-13 01:01:10,896][46662] Updated weights for policy 0, policy_version 19220 (0.0009) +[2023-10-13 01:01:11,267][46662] Updated weights for policy 0, policy_version 19230 (0.0010) +[2023-10-13 01:01:11,847][46663] Updated weights for policy 1, policy_version 19201 (0.0008) +[2023-10-13 01:01:12,216][46663] Updated weights for policy 1, policy_version 19211 (0.0007) +[2023-10-13 01:01:12,587][46663] Updated weights for policy 1, policy_version 19221 (0.0009) +[2023-10-13 01:01:12,953][46663] Updated weights for policy 1, policy_version 19231 (0.0007) +[2023-10-13 01:01:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39387136. Throughput: 0: 1658.6, 1: 1677.7. Samples: 9848774. Policy #0 lag: (min: 31.0, avg: 46.9, max: 63.0) +[2023-10-13 01:01:13,607][45375] Avg episode reward: [(0, '50.960'), (1, '46.180')] +[2023-10-13 01:01:15,181][46662] Updated weights for policy 0, policy_version 19240 (0.0010) +[2023-10-13 01:01:15,555][46662] Updated weights for policy 0, policy_version 19250 (0.0009) +[2023-10-13 01:01:15,920][46662] Updated weights for policy 0, policy_version 19260 (0.0009) +[2023-10-13 01:01:17,033][46663] Updated weights for policy 1, policy_version 19241 (0.0007) +[2023-10-13 01:01:17,401][46663] Updated weights for policy 1, policy_version 19251 (0.0008) +[2023-10-13 01:01:17,766][46663] Updated weights for policy 1, policy_version 19261 (0.0009) +[2023-10-13 01:01:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39452672. Throughput: 0: 1674.6, 1: 1663.4. Samples: 9868398. 
Policy #0 lag: (min: 31.0, avg: 46.9, max: 63.0) +[2023-10-13 01:01:18,607][45375] Avg episode reward: [(0, '51.770'), (1, '46.400')] +[2023-10-13 01:01:19,952][46662] Updated weights for policy 0, policy_version 19270 (0.0009) +[2023-10-13 01:01:20,325][46662] Updated weights for policy 0, policy_version 19280 (0.0008) +[2023-10-13 01:01:20,694][46662] Updated weights for policy 0, policy_version 19290 (0.0008) +[2023-10-13 01:01:21,850][46663] Updated weights for policy 1, policy_version 19271 (0.0008) +[2023-10-13 01:01:22,218][46663] Updated weights for policy 1, policy_version 19281 (0.0009) +[2023-10-13 01:01:22,573][46663] Updated weights for policy 1, policy_version 19291 (0.0009) +[2023-10-13 01:01:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 39518208. Throughput: 0: 1686.4, 1: 1667.1. Samples: 9888398. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:23,608][45375] Avg episode reward: [(0, '49.360'), (1, '47.710')] +[2023-10-13 01:01:24,711][46662] Updated weights for policy 0, policy_version 19300 (0.0008) +[2023-10-13 01:01:25,096][46662] Updated weights for policy 0, policy_version 19310 (0.0010) +[2023-10-13 01:01:25,463][46662] Updated weights for policy 0, policy_version 19320 (0.0007) +[2023-10-13 01:01:26,934][46663] Updated weights for policy 1, policy_version 19301 (0.0008) +[2023-10-13 01:01:27,325][46663] Updated weights for policy 1, policy_version 19311 (0.0007) +[2023-10-13 01:01:27,701][46663] Updated weights for policy 1, policy_version 19321 (0.0008) +[2023-10-13 01:01:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39583744. Throughput: 0: 1664.4, 1: 1675.9. Samples: 9898616. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:28,607][45375] Avg episode reward: [(0, '49.270'), (1, '46.900')] +[2023-10-13 01:01:29,543][46662] Updated weights for policy 0, policy_version 19330 (0.0010) +[2023-10-13 01:01:29,916][46662] Updated weights for policy 0, policy_version 19340 (0.0011) +[2023-10-13 01:01:30,279][46662] Updated weights for policy 0, policy_version 19350 (0.0011) +[2023-10-13 01:01:30,646][46662] Updated weights for policy 0, policy_version 19360 (0.0008) +[2023-10-13 01:01:31,716][46663] Updated weights for policy 1, policy_version 19331 (0.0008) +[2023-10-13 01:01:32,087][46663] Updated weights for policy 1, policy_version 19341 (0.0010) +[2023-10-13 01:01:32,462][46663] Updated weights for policy 1, policy_version 19351 (0.0010) +[2023-10-13 01:01:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39649280. Throughput: 0: 1689.9, 1: 1659.7. Samples: 9918428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:33,609][45375] Avg episode reward: [(0, '49.450'), (1, '45.680')] +[2023-10-13 01:01:34,662][46662] Updated weights for policy 0, policy_version 19370 (0.0009) +[2023-10-13 01:01:35,025][46662] Updated weights for policy 0, policy_version 19380 (0.0008) +[2023-10-13 01:01:35,390][46662] Updated weights for policy 0, policy_version 19390 (0.0009) +[2023-10-13 01:01:36,359][46663] Updated weights for policy 1, policy_version 19361 (0.0009) +[2023-10-13 01:01:36,725][46663] Updated weights for policy 1, policy_version 19371 (0.0008) +[2023-10-13 01:01:37,101][46663] Updated weights for policy 1, policy_version 19381 (0.0008) +[2023-10-13 01:01:37,469][46663] Updated weights for policy 1, policy_version 19391 (0.0010) +[2023-10-13 01:01:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39714816. Throughput: 0: 1695.2, 1: 1673.8. Samples: 9938756. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:38,607][45375] Avg episode reward: [(0, '49.270'), (1, '43.800')] +[2023-10-13 01:01:39,411][46662] Updated weights for policy 0, policy_version 19400 (0.0008) +[2023-10-13 01:01:39,780][46662] Updated weights for policy 0, policy_version 19410 (0.0010) +[2023-10-13 01:01:40,151][46662] Updated weights for policy 0, policy_version 19420 (0.0008) +[2023-10-13 01:01:41,503][46663] Updated weights for policy 1, policy_version 19401 (0.0008) +[2023-10-13 01:01:41,878][46663] Updated weights for policy 1, policy_version 19411 (0.0007) +[2023-10-13 01:01:42,246][46663] Updated weights for policy 1, policy_version 19421 (0.0007) +[2023-10-13 01:01:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 39780352. Throughput: 0: 1680.5, 1: 1673.2. Samples: 9949124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:43,607][45375] Avg episode reward: [(0, '49.090'), (1, '42.240')] +[2023-10-13 01:01:44,238][46662] Updated weights for policy 0, policy_version 19430 (0.0007) +[2023-10-13 01:01:44,605][46662] Updated weights for policy 0, policy_version 19440 (0.0007) +[2023-10-13 01:01:44,964][46662] Updated weights for policy 0, policy_version 19450 (0.0009) +[2023-10-13 01:01:46,341][46663] Updated weights for policy 1, policy_version 19431 (0.0008) +[2023-10-13 01:01:46,711][46663] Updated weights for policy 1, policy_version 19441 (0.0007) +[2023-10-13 01:01:47,073][46663] Updated weights for policy 1, policy_version 19451 (0.0008) +[2023-10-13 01:01:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39845888. Throughput: 0: 1699.5, 1: 1657.4. Samples: 9968664. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:48,607][45375] Avg episode reward: [(0, '49.790'), (1, '42.230')] +[2023-10-13 01:01:49,114][46662] Updated weights for policy 0, policy_version 19460 (0.0008) +[2023-10-13 01:01:49,489][46662] Updated weights for policy 0, policy_version 19470 (0.0008) +[2023-10-13 01:01:49,859][46662] Updated weights for policy 0, policy_version 19480 (0.0009) +[2023-10-13 01:01:51,199][46663] Updated weights for policy 1, policy_version 19461 (0.0011) +[2023-10-13 01:01:51,576][46663] Updated weights for policy 1, policy_version 19471 (0.0007) +[2023-10-13 01:01:51,944][46663] Updated weights for policy 1, policy_version 19481 (0.0010) +[2023-10-13 01:01:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39911424. Throughput: 0: 1694.6, 1: 1679.5. Samples: 9989268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:53,607][45375] Avg episode reward: [(0, '50.920'), (1, '41.080')] +[2023-10-13 01:01:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000019488_19955712.pth... +[2023-10-13 01:01:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000019488_19955712.pth... 
+[2023-10-13 01:01:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000017920_18350080.pth +[2023-10-13 01:01:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000017920_18350080.pth +[2023-10-13 01:01:53,849][46662] Updated weights for policy 0, policy_version 19490 (0.0008) +[2023-10-13 01:01:54,224][46662] Updated weights for policy 0, policy_version 19500 (0.0008) +[2023-10-13 01:01:54,597][46662] Updated weights for policy 0, policy_version 19510 (0.0007) +[2023-10-13 01:01:54,957][46662] Updated weights for policy 0, policy_version 19520 (0.0008) +[2023-10-13 01:01:56,067][46663] Updated weights for policy 1, policy_version 19491 (0.0009) +[2023-10-13 01:01:56,433][46663] Updated weights for policy 1, policy_version 19501 (0.0009) +[2023-10-13 01:01:56,795][46663] Updated weights for policy 1, policy_version 19511 (0.0008) +[2023-10-13 01:01:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39976960. Throughput: 0: 1676.1, 1: 1669.7. Samples: 9999336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:01:58,607][45375] Avg episode reward: [(0, '52.510'), (1, '42.560')] +[2023-10-13 01:01:59,109][46662] Updated weights for policy 0, policy_version 19530 (0.0009) +[2023-10-13 01:01:59,486][46662] Updated weights for policy 0, policy_version 19540 (0.0011) +[2023-10-13 01:01:59,860][46662] Updated weights for policy 0, policy_version 19550 (0.0011) +[2023-10-13 01:02:00,751][46663] Updated weights for policy 1, policy_version 19521 (0.0008) +[2023-10-13 01:02:01,119][46663] Updated weights for policy 1, policy_version 19531 (0.0009) +[2023-10-13 01:02:01,484][46663] Updated weights for policy 1, policy_version 19541 (0.0008) +[2023-10-13 01:02:01,848][46663] Updated weights for policy 1, policy_version 19551 (0.0008) +[2023-10-13 01:02:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 40042496. Throughput: 0: 1689.1, 1: 1669.6. Samples: 10019540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:02:03,607][45375] Avg episode reward: [(0, '52.600'), (1, '40.040')] +[2023-10-13 01:02:03,871][46662] Updated weights for policy 0, policy_version 19560 (0.0008) +[2023-10-13 01:02:04,245][46662] Updated weights for policy 0, policy_version 19570 (0.0008) +[2023-10-13 01:02:04,612][46662] Updated weights for policy 0, policy_version 19580 (0.0009) +[2023-10-13 01:02:05,815][46663] Updated weights for policy 1, policy_version 19561 (0.0009) +[2023-10-13 01:02:06,174][46663] Updated weights for policy 1, policy_version 19571 (0.0008) +[2023-10-13 01:02:06,547][46663] Updated weights for policy 1, policy_version 19581 (0.0007) +[2023-10-13 01:02:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40108032. Throughput: 0: 1689.0, 1: 1680.9. Samples: 10040042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:02:08,607][45375] Avg episode reward: [(0, '52.220'), (1, '39.850')] +[2023-10-13 01:02:08,640][46662] Updated weights for policy 0, policy_version 19590 (0.0011) +[2023-10-13 01:02:09,005][46662] Updated weights for policy 0, policy_version 19600 (0.0009) +[2023-10-13 01:02:09,375][46662] Updated weights for policy 0, policy_version 19610 (0.0009) +[2023-10-13 01:02:10,528][46663] Updated weights for policy 1, policy_version 19591 (0.0008) +[2023-10-13 01:02:10,896][46663] Updated weights for policy 1, policy_version 19601 (0.0010) +[2023-10-13 01:02:11,255][46663] Updated weights for policy 1, policy_version 19611 (0.0007) +[2023-10-13 01:02:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40173568. Throughput: 0: 1689.1, 1: 1662.2. Samples: 10049424. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:02:13,607][45375] Avg episode reward: [(0, '50.830'), (1, '40.750')] +[2023-10-13 01:02:13,636][46662] Updated weights for policy 0, policy_version 19620 (0.0008) +[2023-10-13 01:02:14,032][46662] Updated weights for policy 0, policy_version 19630 (0.0008) +[2023-10-13 01:02:14,393][46662] Updated weights for policy 0, policy_version 19640 (0.0010) +[2023-10-13 01:02:15,357][46663] Updated weights for policy 1, policy_version 19621 (0.0009) +[2023-10-13 01:02:15,730][46663] Updated weights for policy 1, policy_version 19631 (0.0011) +[2023-10-13 01:02:16,095][46663] Updated weights for policy 1, policy_version 19641 (0.0008) +[2023-10-13 01:02:18,577][46662] Updated weights for policy 0, policy_version 19650 (0.0009) +[2023-10-13 01:02:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40239104. Throughput: 0: 1686.2, 1: 1679.8. Samples: 10069900. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:02:18,607][45375] Avg episode reward: [(0, '49.570'), (1, '39.310')] +[2023-10-13 01:02:18,946][46662] Updated weights for policy 0, policy_version 19660 (0.0007) +[2023-10-13 01:02:19,327][46662] Updated weights for policy 0, policy_version 19670 (0.0008) +[2023-10-13 01:02:19,697][46662] Updated weights for policy 0, policy_version 19680 (0.0008) +[2023-10-13 01:02:20,278][46663] Updated weights for policy 1, policy_version 19651 (0.0007) +[2023-10-13 01:02:20,640][46663] Updated weights for policy 1, policy_version 19661 (0.0008) +[2023-10-13 01:02:21,013][46663] Updated weights for policy 1, policy_version 19671 (0.0008) +[2023-10-13 01:02:23,606][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 40304640. Throughput: 0: 1684.8, 1: 1690.4. Samples: 10090636. 
Policy #0 lag: (min: 31.0, avg: 46.3, max: 63.0) +[2023-10-13 01:02:23,607][45375] Avg episode reward: [(0, '49.030'), (1, '39.430')] +[2023-10-13 01:02:23,619][46662] Updated weights for policy 0, policy_version 19690 (0.0008) +[2023-10-13 01:02:23,993][46662] Updated weights for policy 0, policy_version 19700 (0.0008) +[2023-10-13 01:02:24,363][46662] Updated weights for policy 0, policy_version 19710 (0.0008) +[2023-10-13 01:02:24,964][46663] Updated weights for policy 1, policy_version 19681 (0.0009) +[2023-10-13 01:02:25,333][46663] Updated weights for policy 1, policy_version 19691 (0.0009) +[2023-10-13 01:02:25,693][46663] Updated weights for policy 1, policy_version 19701 (0.0007) +[2023-10-13 01:02:26,071][46663] Updated weights for policy 1, policy_version 19711 (0.0009) +[2023-10-13 01:02:28,299][46662] Updated weights for policy 0, policy_version 19720 (0.0008) +[2023-10-13 01:02:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40370176. Throughput: 0: 1684.0, 1: 1665.9. Samples: 10099868. Policy #0 lag: (min: 31.0, avg: 46.3, max: 63.0) +[2023-10-13 01:02:28,607][45375] Avg episode reward: [(0, '49.130'), (1, '40.690')] +[2023-10-13 01:02:28,669][46662] Updated weights for policy 0, policy_version 19730 (0.0007) +[2023-10-13 01:02:29,040][46662] Updated weights for policy 0, policy_version 19740 (0.0009) +[2023-10-13 01:02:30,102][46663] Updated weights for policy 1, policy_version 19721 (0.0008) +[2023-10-13 01:02:30,467][46663] Updated weights for policy 1, policy_version 19731 (0.0007) +[2023-10-13 01:02:30,829][46663] Updated weights for policy 1, policy_version 19741 (0.0011) +[2023-10-13 01:02:33,158][46662] Updated weights for policy 0, policy_version 19750 (0.0009) +[2023-10-13 01:02:33,527][46662] Updated weights for policy 0, policy_version 19760 (0.0007) +[2023-10-13 01:02:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40435712. 
Throughput: 0: 1683.6, 1: 1690.0. Samples: 10120478. Policy #0 lag: (min: 31.0, avg: 46.3, max: 63.0) +[2023-10-13 01:02:33,607][45375] Avg episode reward: [(0, '49.720'), (1, '41.620')] +[2023-10-13 01:02:33,902][46662] Updated weights for policy 0, policy_version 19770 (0.0008) +[2023-10-13 01:02:35,047][46663] Updated weights for policy 1, policy_version 19751 (0.0008) +[2023-10-13 01:02:35,403][46663] Updated weights for policy 1, policy_version 19761 (0.0008) +[2023-10-13 01:02:35,778][46663] Updated weights for policy 1, policy_version 19771 (0.0010) +[2023-10-13 01:02:38,000][46662] Updated weights for policy 0, policy_version 19780 (0.0008) +[2023-10-13 01:02:38,371][46662] Updated weights for policy 0, policy_version 19790 (0.0008) +[2023-10-13 01:02:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40501248. Throughput: 0: 1680.7, 1: 1691.1. Samples: 10141000. Policy #0 lag: (min: 31.0, avg: 46.3, max: 63.0) +[2023-10-13 01:02:38,607][45375] Avg episode reward: [(0, '48.160'), (1, '41.990')] +[2023-10-13 01:02:38,734][46662] Updated weights for policy 0, policy_version 19800 (0.0009) +[2023-10-13 01:02:39,946][46663] Updated weights for policy 1, policy_version 19781 (0.0008) +[2023-10-13 01:02:40,305][46663] Updated weights for policy 1, policy_version 19791 (0.0009) +[2023-10-13 01:02:40,667][46663] Updated weights for policy 1, policy_version 19801 (0.0010) +[2023-10-13 01:02:42,804][46662] Updated weights for policy 0, policy_version 19810 (0.0009) +[2023-10-13 01:02:43,180][46662] Updated weights for policy 0, policy_version 19820 (0.0007) +[2023-10-13 01:02:43,544][46662] Updated weights for policy 0, policy_version 19830 (0.0007) +[2023-10-13 01:02:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40566784. Throughput: 0: 1681.3, 1: 1670.5. Samples: 10150170. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:02:43,607][45375] Avg episode reward: [(0, '49.320'), (1, '42.100')] +[2023-10-13 01:02:43,918][46662] Updated weights for policy 0, policy_version 19840 (0.0007) +[2023-10-13 01:02:44,730][46663] Updated weights for policy 1, policy_version 19811 (0.0009) +[2023-10-13 01:02:45,098][46663] Updated weights for policy 1, policy_version 19821 (0.0009) +[2023-10-13 01:02:45,479][46663] Updated weights for policy 1, policy_version 19831 (0.0008) +[2023-10-13 01:02:47,920][46662] Updated weights for policy 0, policy_version 19850 (0.0010) +[2023-10-13 01:02:48,281][46662] Updated weights for policy 0, policy_version 19860 (0.0011) +[2023-10-13 01:02:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40632320. Throughput: 0: 1677.9, 1: 1683.6. Samples: 10170808. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:02:48,608][45375] Avg episode reward: [(0, '48.440'), (1, '42.430')] +[2023-10-13 01:02:48,658][46662] Updated weights for policy 0, policy_version 19870 (0.0010) +[2023-10-13 01:02:49,623][46663] Updated weights for policy 1, policy_version 19841 (0.0008) +[2023-10-13 01:02:49,991][46663] Updated weights for policy 1, policy_version 19851 (0.0011) +[2023-10-13 01:02:50,367][46663] Updated weights for policy 1, policy_version 19861 (0.0009) +[2023-10-13 01:02:50,728][46663] Updated weights for policy 1, policy_version 19871 (0.0010) +[2023-10-13 01:02:52,702][46662] Updated weights for policy 0, policy_version 19880 (0.0007) +[2023-10-13 01:02:53,072][46662] Updated weights for policy 0, policy_version 19890 (0.0007) +[2023-10-13 01:02:53,440][46662] Updated weights for policy 0, policy_version 19900 (0.0009) +[2023-10-13 01:02:53,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40730624. Throughput: 0: 1673.2, 1: 1686.2. Samples: 10191214. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:02:53,607][45375] Avg episode reward: [(0, '45.750'), (1, '43.120')] +[2023-10-13 01:02:54,929][46663] Updated weights for policy 1, policy_version 19881 (0.0009) +[2023-10-13 01:02:55,300][46663] Updated weights for policy 1, policy_version 19891 (0.0010) +[2023-10-13 01:02:55,660][46663] Updated weights for policy 1, policy_version 19901 (0.0008) +[2023-10-13 01:02:57,574][46662] Updated weights for policy 0, policy_version 19910 (0.0008) +[2023-10-13 01:02:57,937][46662] Updated weights for policy 0, policy_version 19920 (0.0008) +[2023-10-13 01:02:58,308][46662] Updated weights for policy 0, policy_version 19930 (0.0007) +[2023-10-13 01:02:58,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40796160. Throughput: 0: 1682.7, 1: 1678.7. Samples: 10200688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:02:58,608][45375] Avg episode reward: [(0, '45.950'), (1, '42.360')] +[2023-10-13 01:02:59,699][46663] Updated weights for policy 1, policy_version 19911 (0.0008) +[2023-10-13 01:03:00,064][46663] Updated weights for policy 1, policy_version 19921 (0.0007) +[2023-10-13 01:03:00,430][46663] Updated weights for policy 1, policy_version 19931 (0.0008) +[2023-10-13 01:03:02,568][46662] Updated weights for policy 0, policy_version 19940 (0.0008) +[2023-10-13 01:03:02,949][46662] Updated weights for policy 0, policy_version 19950 (0.0008) +[2023-10-13 01:03:03,314][46662] Updated weights for policy 0, policy_version 19960 (0.0007) +[2023-10-13 01:03:03,607][45375] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40828928. Throughput: 0: 1684.3, 1: 1682.9. Samples: 10221422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:03,607][45375] Avg episode reward: [(0, '45.180'), (1, '43.290')] +[2023-10-13 01:03:04,510][46663] Updated weights for policy 1, policy_version 19941 (0.0009) +[2023-10-13 01:03:04,912][46663] Updated weights for policy 1, policy_version 19951 (0.0008) +[2023-10-13 01:03:05,281][46663] Updated weights for policy 1, policy_version 19961 (0.0008) +[2023-10-13 01:03:07,195][46662] Updated weights for policy 0, policy_version 19970 (0.0008) +[2023-10-13 01:03:07,560][46662] Updated weights for policy 0, policy_version 19980 (0.0008) +[2023-10-13 01:03:07,940][46662] Updated weights for policy 0, policy_version 19990 (0.0010) +[2023-10-13 01:03:08,311][46662] Updated weights for policy 0, policy_version 20000 (0.0011) +[2023-10-13 01:03:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40927232. Throughput: 0: 1668.8, 1: 1683.7. Samples: 10241500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:08,608][45375] Avg episode reward: [(0, '45.130'), (1, '42.850')] +[2023-10-13 01:03:09,324][46663] Updated weights for policy 1, policy_version 19971 (0.0009) +[2023-10-13 01:03:09,694][46663] Updated weights for policy 1, policy_version 19981 (0.0007) +[2023-10-13 01:03:10,059][46663] Updated weights for policy 1, policy_version 19991 (0.0009) +[2023-10-13 01:03:12,304][46662] Updated weights for policy 0, policy_version 20010 (0.0008) +[2023-10-13 01:03:12,679][46662] Updated weights for policy 0, policy_version 20020 (0.0008) +[2023-10-13 01:03:13,050][46662] Updated weights for policy 0, policy_version 20030 (0.0009) +[2023-10-13 01:03:13,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40992768. Throughput: 0: 1683.6, 1: 1681.8. Samples: 10251312. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:13,608][45375] Avg episode reward: [(0, '45.790'), (1, '43.600')] +[2023-10-13 01:03:14,169][46663] Updated weights for policy 1, policy_version 20001 (0.0008) +[2023-10-13 01:03:14,539][46663] Updated weights for policy 1, policy_version 20011 (0.0010) +[2023-10-13 01:03:14,906][46663] Updated weights for policy 1, policy_version 20021 (0.0011) +[2023-10-13 01:03:15,272][46663] Updated weights for policy 1, policy_version 20031 (0.0010) +[2023-10-13 01:03:16,981][46662] Updated weights for policy 0, policy_version 20040 (0.0008) +[2023-10-13 01:03:17,353][46662] Updated weights for policy 0, policy_version 20050 (0.0008) +[2023-10-13 01:03:17,722][46662] Updated weights for policy 0, policy_version 20060 (0.0009) +[2023-10-13 01:03:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41058304. Throughput: 0: 1683.6, 1: 1682.8. Samples: 10271966. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 01:03:18,607][45375] Avg episode reward: [(0, '46.400'), (1, '42.780')] +[2023-10-13 01:03:19,415][46663] Updated weights for policy 1, policy_version 20041 (0.0009) +[2023-10-13 01:03:19,788][46663] Updated weights for policy 1, policy_version 20051 (0.0009) +[2023-10-13 01:03:20,158][46663] Updated weights for policy 1, policy_version 20061 (0.0008) +[2023-10-13 01:03:21,860][46662] Updated weights for policy 0, policy_version 20070 (0.0009) +[2023-10-13 01:03:22,239][46662] Updated weights for policy 0, policy_version 20080 (0.0009) +[2023-10-13 01:03:22,605][46662] Updated weights for policy 0, policy_version 20090 (0.0008) +[2023-10-13 01:03:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41123840. Throughput: 0: 1661.6, 1: 1680.3. Samples: 10291382. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 01:03:23,607][45375] Avg episode reward: [(0, '46.560'), (1, '43.110')] +[2023-10-13 01:03:24,251][46663] Updated weights for policy 1, policy_version 20071 (0.0008) +[2023-10-13 01:03:24,619][46663] Updated weights for policy 1, policy_version 20081 (0.0007) +[2023-10-13 01:03:24,991][46663] Updated weights for policy 1, policy_version 20091 (0.0007) +[2023-10-13 01:03:26,825][46662] Updated weights for policy 0, policy_version 20100 (0.0008) +[2023-10-13 01:03:27,194][46662] Updated weights for policy 0, policy_version 20110 (0.0007) +[2023-10-13 01:03:27,556][46662] Updated weights for policy 0, policy_version 20120 (0.0009) +[2023-10-13 01:03:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 41189376. Throughput: 0: 1688.8, 1: 1678.4. Samples: 10301694. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-13 01:03:28,607][45375] Avg episode reward: [(0, '46.190'), (1, '42.410')] +[2023-10-13 01:03:29,063][46663] Updated weights for policy 1, policy_version 20101 (0.0008) +[2023-10-13 01:03:29,429][46663] Updated weights for policy 1, policy_version 20111 (0.0007) +[2023-10-13 01:03:29,810][46663] Updated weights for policy 1, policy_version 20121 (0.0007) +[2023-10-13 01:03:31,687][46662] Updated weights for policy 0, policy_version 20130 (0.0007) +[2023-10-13 01:03:32,056][46662] Updated weights for policy 0, policy_version 20140 (0.0007) +[2023-10-13 01:03:32,420][46662] Updated weights for policy 0, policy_version 20150 (0.0008) +[2023-10-13 01:03:32,797][46662] Updated weights for policy 0, policy_version 20160 (0.0007) +[2023-10-13 01:03:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41254912. Throughput: 0: 1683.7, 1: 1680.0. Samples: 10322176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:33,607][45375] Avg episode reward: [(0, '46.780'), (1, '42.110')] +[2023-10-13 01:03:33,745][46663] Updated weights for policy 1, policy_version 20131 (0.0010) +[2023-10-13 01:03:34,107][46663] Updated weights for policy 1, policy_version 20141 (0.0009) +[2023-10-13 01:03:34,467][46663] Updated weights for policy 1, policy_version 20151 (0.0010) +[2023-10-13 01:03:36,677][46662] Updated weights for policy 0, policy_version 20170 (0.0008) +[2023-10-13 01:03:37,040][46662] Updated weights for policy 0, policy_version 20180 (0.0008) +[2023-10-13 01:03:37,420][46662] Updated weights for policy 0, policy_version 20190 (0.0009) +[2023-10-13 01:03:38,527][46663] Updated weights for policy 1, policy_version 20161 (0.0010) +[2023-10-13 01:03:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41320448. Throughput: 0: 1669.5, 1: 1678.5. Samples: 10341874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:38,607][45375] Avg episode reward: [(0, '47.280'), (1, '41.940')] +[2023-10-13 01:03:38,901][46663] Updated weights for policy 1, policy_version 20171 (0.0010) +[2023-10-13 01:03:39,281][46663] Updated weights for policy 1, policy_version 20181 (0.0011) +[2023-10-13 01:03:39,643][46663] Updated weights for policy 1, policy_version 20191 (0.0009) +[2023-10-13 01:03:41,511][46662] Updated weights for policy 0, policy_version 20200 (0.0008) +[2023-10-13 01:03:41,876][46662] Updated weights for policy 0, policy_version 20210 (0.0010) +[2023-10-13 01:03:42,243][46662] Updated weights for policy 0, policy_version 20220 (0.0009) +[2023-10-13 01:03:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41385984. Throughput: 0: 1687.6, 1: 1679.8. Samples: 10352224. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:43,608][45375] Avg episode reward: [(0, '47.050'), (1, '41.670')] +[2023-10-13 01:03:43,845][46663] Updated weights for policy 1, policy_version 20201 (0.0009) +[2023-10-13 01:03:44,217][46663] Updated weights for policy 1, policy_version 20211 (0.0009) +[2023-10-13 01:03:44,580][46663] Updated weights for policy 1, policy_version 20221 (0.0009) +[2023-10-13 01:03:46,317][46662] Updated weights for policy 0, policy_version 20230 (0.0008) +[2023-10-13 01:03:46,689][46662] Updated weights for policy 0, policy_version 20240 (0.0008) +[2023-10-13 01:03:47,056][46662] Updated weights for policy 0, policy_version 20250 (0.0009) +[2023-10-13 01:03:48,466][46663] Updated weights for policy 1, policy_version 20231 (0.0009) +[2023-10-13 01:03:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 41451520. Throughput: 0: 1679.6, 1: 1673.5. Samples: 10372312. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:03:48,607][45375] Avg episode reward: [(0, '46.810'), (1, '40.860')] +[2023-10-13 01:03:48,830][46663] Updated weights for policy 1, policy_version 20241 (0.0008) +[2023-10-13 01:03:49,191][46663] Updated weights for policy 1, policy_version 20251 (0.0007) +[2023-10-13 01:03:51,246][46662] Updated weights for policy 0, policy_version 20260 (0.0009) +[2023-10-13 01:03:51,645][46662] Updated weights for policy 0, policy_version 20270 (0.0007) +[2023-10-13 01:03:52,018][46662] Updated weights for policy 0, policy_version 20280 (0.0007) +[2023-10-13 01:03:53,331][46663] Updated weights for policy 1, policy_version 20261 (0.0007) +[2023-10-13 01:03:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41517056. Throughput: 0: 1673.6, 1: 1672.4. Samples: 10392070. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 01:03:53,607][45375] Avg episode reward: [(0, '47.140'), (1, '42.130')] +[2023-10-13 01:03:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000020288_20774912.pth... +[2023-10-13 01:03:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000018720_19169280.pth +[2023-10-13 01:03:53,719][46663] Updated weights for policy 1, policy_version 20271 (0.0009) +[2023-10-13 01:03:54,091][46663] Updated weights for policy 1, policy_version 20281 (0.0008) +[2023-10-13 01:03:54,347][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000020288_20774912.pth... +[2023-10-13 01:03:54,377][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000018720_19169280.pth +[2023-10-13 01:03:55,943][46662] Updated weights for policy 0, policy_version 20290 (0.0008) +[2023-10-13 01:03:56,318][46662] Updated weights for policy 0, policy_version 20300 (0.0010) +[2023-10-13 01:03:56,687][46662] Updated weights for policy 0, policy_version 20310 (0.0007) +[2023-10-13 01:03:57,052][46662] Updated weights for policy 0, policy_version 20320 (0.0007) +[2023-10-13 01:03:58,133][46663] Updated weights for policy 1, policy_version 20291 (0.0008) +[2023-10-13 01:03:58,495][46663] Updated weights for policy 1, policy_version 20301 (0.0008) +[2023-10-13 01:03:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 41582592. Throughput: 0: 1687.6, 1: 1676.9. Samples: 10402714. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 01:03:58,607][45375] Avg episode reward: [(0, '49.250'), (1, '41.680')] +[2023-10-13 01:03:58,860][46663] Updated weights for policy 1, policy_version 20311 (0.0010) +[2023-10-13 01:04:00,990][46662] Updated weights for policy 0, policy_version 20330 (0.0008) +[2023-10-13 01:04:01,358][46662] Updated weights for policy 0, policy_version 20340 (0.0009) +[2023-10-13 01:04:01,730][46662] Updated weights for policy 0, policy_version 20350 (0.0010) +[2023-10-13 01:04:03,011][46663] Updated weights for policy 1, policy_version 20321 (0.0010) +[2023-10-13 01:04:03,372][46663] Updated weights for policy 1, policy_version 20331 (0.0010) +[2023-10-13 01:04:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41648128. Throughput: 0: 1662.8, 1: 1676.4. Samples: 10422230. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 01:04:03,608][45375] Avg episode reward: [(0, '49.720'), (1, '42.480')] +[2023-10-13 01:04:03,745][46663] Updated weights for policy 1, policy_version 20341 (0.0009) +[2023-10-13 01:04:04,120][46663] Updated weights for policy 1, policy_version 20351 (0.0008) +[2023-10-13 01:04:05,728][46662] Updated weights for policy 0, policy_version 20360 (0.0008) +[2023-10-13 01:04:06,095][46662] Updated weights for policy 0, policy_version 20370 (0.0009) +[2023-10-13 01:04:06,472][46662] Updated weights for policy 0, policy_version 20380 (0.0010) +[2023-10-13 01:04:08,060][46663] Updated weights for policy 1, policy_version 20361 (0.0009) +[2023-10-13 01:04:08,431][46663] Updated weights for policy 1, policy_version 20371 (0.0008) +[2023-10-13 01:04:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 41713664. Throughput: 0: 1689.8, 1: 1665.6. Samples: 10442374. 
Policy #0 lag: (min: 18.0, avg: 18.1, max: 26.0) +[2023-10-13 01:04:08,607][45375] Avg episode reward: [(0, '49.170'), (1, '42.820')] +[2023-10-13 01:04:08,794][46663] Updated weights for policy 1, policy_version 20381 (0.0007) +[2023-10-13 01:04:10,561][46662] Updated weights for policy 0, policy_version 20390 (0.0008) +[2023-10-13 01:04:10,923][46662] Updated weights for policy 0, policy_version 20400 (0.0008) +[2023-10-13 01:04:11,296][46662] Updated weights for policy 0, policy_version 20410 (0.0007) +[2023-10-13 01:04:12,800][46663] Updated weights for policy 1, policy_version 20391 (0.0009) +[2023-10-13 01:04:13,168][46663] Updated weights for policy 1, policy_version 20401 (0.0011) +[2023-10-13 01:04:13,536][46663] Updated weights for policy 1, policy_version 20411 (0.0009) +[2023-10-13 01:04:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41779200. Throughput: 0: 1678.2, 1: 1682.7. Samples: 10452934. Policy #0 lag: (min: 18.0, avg: 18.1, max: 26.0) +[2023-10-13 01:04:13,607][45375] Avg episode reward: [(0, '49.350'), (1, '42.810')] +[2023-10-13 01:04:15,428][46662] Updated weights for policy 0, policy_version 20420 (0.0009) +[2023-10-13 01:04:15,793][46662] Updated weights for policy 0, policy_version 20430 (0.0009) +[2023-10-13 01:04:16,166][46662] Updated weights for policy 0, policy_version 20440 (0.0007) +[2023-10-13 01:04:17,694][46663] Updated weights for policy 1, policy_version 20421 (0.0010) +[2023-10-13 01:04:18,068][46663] Updated weights for policy 1, policy_version 20431 (0.0011) +[2023-10-13 01:04:18,428][46663] Updated weights for policy 1, policy_version 20441 (0.0010) +[2023-10-13 01:04:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41844736. Throughput: 0: 1665.5, 1: 1682.1. Samples: 10472818. 
Policy #0 lag: (min: 18.0, avg: 18.1, max: 26.0) +[2023-10-13 01:04:18,607][45375] Avg episode reward: [(0, '49.630'), (1, '43.250')] +[2023-10-13 01:04:20,111][46662] Updated weights for policy 0, policy_version 20450 (0.0007) +[2023-10-13 01:04:20,483][46662] Updated weights for policy 0, policy_version 20460 (0.0009) +[2023-10-13 01:04:20,853][46662] Updated weights for policy 0, policy_version 20470 (0.0010) +[2023-10-13 01:04:21,233][46662] Updated weights for policy 0, policy_version 20480 (0.0009) +[2023-10-13 01:04:22,565][46663] Updated weights for policy 1, policy_version 20451 (0.0009) +[2023-10-13 01:04:22,943][46663] Updated weights for policy 1, policy_version 20461 (0.0009) +[2023-10-13 01:04:23,322][46663] Updated weights for policy 1, policy_version 20471 (0.0008) +[2023-10-13 01:04:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41910272. Throughput: 0: 1685.2, 1: 1666.4. Samples: 10492694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:04:23,607][45375] Avg episode reward: [(0, '51.330'), (1, '42.540')] +[2023-10-13 01:04:25,358][46662] Updated weights for policy 0, policy_version 20490 (0.0008) +[2023-10-13 01:04:25,736][46662] Updated weights for policy 0, policy_version 20500 (0.0009) +[2023-10-13 01:04:26,095][46662] Updated weights for policy 0, policy_version 20510 (0.0009) +[2023-10-13 01:04:27,338][46663] Updated weights for policy 1, policy_version 20481 (0.0009) +[2023-10-13 01:04:27,698][46663] Updated weights for policy 1, policy_version 20491 (0.0008) +[2023-10-13 01:04:28,073][46663] Updated weights for policy 1, policy_version 20501 (0.0007) +[2023-10-13 01:04:28,444][46663] Updated weights for policy 1, policy_version 20511 (0.0009) +[2023-10-13 01:04:28,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42008576. Throughput: 0: 1667.5, 1: 1684.9. Samples: 10503080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:04:28,607][45375] Avg episode reward: [(0, '51.260'), (1, '42.910')] +[2023-10-13 01:04:30,077][46662] Updated weights for policy 0, policy_version 20520 (0.0007) +[2023-10-13 01:04:30,461][46662] Updated weights for policy 0, policy_version 20530 (0.0011) +[2023-10-13 01:04:30,822][46662] Updated weights for policy 0, policy_version 20540 (0.0008) +[2023-10-13 01:04:32,564][46663] Updated weights for policy 1, policy_version 20521 (0.0009) +[2023-10-13 01:04:32,940][46663] Updated weights for policy 1, policy_version 20531 (0.0009) +[2023-10-13 01:04:33,308][46663] Updated weights for policy 1, policy_version 20541 (0.0008) +[2023-10-13 01:04:33,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42074112. Throughput: 0: 1673.2, 1: 1685.7. Samples: 10523466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:04:33,607][45375] Avg episode reward: [(0, '52.050'), (1, '43.830')] +[2023-10-13 01:04:34,824][46662] Updated weights for policy 0, policy_version 20550 (0.0008) +[2023-10-13 01:04:35,186][46662] Updated weights for policy 0, policy_version 20560 (0.0008) +[2023-10-13 01:04:35,555][46662] Updated weights for policy 0, policy_version 20570 (0.0007) +[2023-10-13 01:04:37,238][46663] Updated weights for policy 1, policy_version 20551 (0.0009) +[2023-10-13 01:04:37,608][46663] Updated weights for policy 1, policy_version 20561 (0.0008) +[2023-10-13 01:04:37,980][46663] Updated weights for policy 1, policy_version 20571 (0.0008) +[2023-10-13 01:04:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42139648. Throughput: 0: 1698.0, 1: 1660.5. Samples: 10543202. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:04:38,607][45375] Avg episode reward: [(0, '53.710'), (1, '43.750')] +[2023-10-13 01:04:39,653][46662] Updated weights for policy 0, policy_version 20580 (0.0010) +[2023-10-13 01:04:40,050][46662] Updated weights for policy 0, policy_version 20590 (0.0008) +[2023-10-13 01:04:40,427][46662] Updated weights for policy 0, policy_version 20600 (0.0010) +[2023-10-13 01:04:42,166][46663] Updated weights for policy 1, policy_version 20581 (0.0009) +[2023-10-13 01:04:42,557][46663] Updated weights for policy 1, policy_version 20591 (0.0010) +[2023-10-13 01:04:42,931][46663] Updated weights for policy 1, policy_version 20601 (0.0009) +[2023-10-13 01:04:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42205184. Throughput: 0: 1665.7, 1: 1686.9. Samples: 10553582. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-13 01:04:43,607][45375] Avg episode reward: [(0, '53.460'), (1, '43.800')] +[2023-10-13 01:04:44,583][46662] Updated weights for policy 0, policy_version 20610 (0.0008) +[2023-10-13 01:04:44,955][46662] Updated weights for policy 0, policy_version 20620 (0.0009) +[2023-10-13 01:04:45,335][46662] Updated weights for policy 0, policy_version 20630 (0.0011) +[2023-10-13 01:04:45,700][46662] Updated weights for policy 0, policy_version 20640 (0.0008) +[2023-10-13 01:04:46,940][46663] Updated weights for policy 1, policy_version 20611 (0.0008) +[2023-10-13 01:04:47,304][46663] Updated weights for policy 1, policy_version 20621 (0.0010) +[2023-10-13 01:04:47,684][46663] Updated weights for policy 1, policy_version 20631 (0.0009) +[2023-10-13 01:04:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42270720. Throughput: 0: 1687.3, 1: 1676.2. Samples: 10573588. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-13 01:04:48,607][45375] Avg episode reward: [(0, '54.460'), (1, '43.250')] +[2023-10-13 01:04:48,608][46091] Saving new best policy, reward=54.460! +[2023-10-13 01:04:49,844][46662] Updated weights for policy 0, policy_version 20650 (0.0008) +[2023-10-13 01:04:50,212][46662] Updated weights for policy 0, policy_version 20660 (0.0008) +[2023-10-13 01:04:50,576][46662] Updated weights for policy 0, policy_version 20670 (0.0008) +[2023-10-13 01:04:51,746][46663] Updated weights for policy 1, policy_version 20641 (0.0007) +[2023-10-13 01:04:52,112][46663] Updated weights for policy 1, policy_version 20651 (0.0009) +[2023-10-13 01:04:52,470][46663] Updated weights for policy 1, policy_version 20661 (0.0009) +[2023-10-13 01:04:52,842][46663] Updated weights for policy 1, policy_version 20671 (0.0007) +[2023-10-13 01:04:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42336256. Throughput: 0: 1684.2, 1: 1674.8. Samples: 10593530. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-13 01:04:53,608][45375] Avg episode reward: [(0, '52.820'), (1, '42.290')] +[2023-10-13 01:04:54,616][46662] Updated weights for policy 0, policy_version 20680 (0.0008) +[2023-10-13 01:04:54,980][46662] Updated weights for policy 0, policy_version 20690 (0.0009) +[2023-10-13 01:04:55,366][46662] Updated weights for policy 0, policy_version 20700 (0.0009) +[2023-10-13 01:04:57,032][46663] Updated weights for policy 1, policy_version 20681 (0.0009) +[2023-10-13 01:04:57,395][46663] Updated weights for policy 1, policy_version 20691 (0.0009) +[2023-10-13 01:04:57,758][46663] Updated weights for policy 1, policy_version 20701 (0.0009) +[2023-10-13 01:04:58,606][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42401792. Throughput: 0: 1668.1, 1: 1684.8. Samples: 10603818. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-13 01:04:58,607][45375] Avg episode reward: [(0, '54.310'), (1, '43.480')] +[2023-10-13 01:04:59,423][46662] Updated weights for policy 0, policy_version 20710 (0.0009) +[2023-10-13 01:04:59,795][46662] Updated weights for policy 0, policy_version 20720 (0.0010) +[2023-10-13 01:05:00,167][46662] Updated weights for policy 0, policy_version 20730 (0.0009) +[2023-10-13 01:05:01,707][46663] Updated weights for policy 1, policy_version 20711 (0.0008) +[2023-10-13 01:05:02,076][46663] Updated weights for policy 1, policy_version 20721 (0.0007) +[2023-10-13 01:05:02,438][46663] Updated weights for policy 1, policy_version 20731 (0.0008) +[2023-10-13 01:05:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 42467328. Throughput: 0: 1688.5, 1: 1663.2. Samples: 10623644. Policy #0 lag: (min: 0.0, avg: 20.4, max: 32.0) +[2023-10-13 01:05:03,607][45375] Avg episode reward: [(0, '52.920'), (1, '42.230')] +[2023-10-13 01:05:04,223][46662] Updated weights for policy 0, policy_version 20740 (0.0009) +[2023-10-13 01:05:04,598][46662] Updated weights for policy 0, policy_version 20750 (0.0010) +[2023-10-13 01:05:04,963][46662] Updated weights for policy 0, policy_version 20760 (0.0009) +[2023-10-13 01:05:06,589][46663] Updated weights for policy 1, policy_version 20741 (0.0008) +[2023-10-13 01:05:06,950][46663] Updated weights for policy 1, policy_version 20751 (0.0009) +[2023-10-13 01:05:07,328][46663] Updated weights for policy 1, policy_version 20761 (0.0010) +[2023-10-13 01:05:08,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42532864. Throughput: 0: 1692.1, 1: 1674.3. Samples: 10644184. 
Policy #0 lag: (min: 0.0, avg: 20.4, max: 32.0) +[2023-10-13 01:05:08,608][45375] Avg episode reward: [(0, '53.260'), (1, '41.830')] +[2023-10-13 01:05:08,980][46662] Updated weights for policy 0, policy_version 20770 (0.0009) +[2023-10-13 01:05:09,354][46662] Updated weights for policy 0, policy_version 20780 (0.0008) +[2023-10-13 01:05:09,726][46662] Updated weights for policy 0, policy_version 20790 (0.0007) +[2023-10-13 01:05:10,100][46662] Updated weights for policy 0, policy_version 20800 (0.0009) +[2023-10-13 01:05:11,378][46663] Updated weights for policy 1, policy_version 20771 (0.0009) +[2023-10-13 01:05:11,747][46663] Updated weights for policy 1, policy_version 20781 (0.0008) +[2023-10-13 01:05:12,121][46663] Updated weights for policy 1, policy_version 20791 (0.0007) +[2023-10-13 01:05:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42598400. Throughput: 0: 1678.8, 1: 1682.6. Samples: 10654346. Policy #0 lag: (min: 0.0, avg: 20.4, max: 32.0) +[2023-10-13 01:05:13,607][45375] Avg episode reward: [(0, '54.180'), (1, '41.160')] +[2023-10-13 01:05:14,285][46662] Updated weights for policy 0, policy_version 20810 (0.0009) +[2023-10-13 01:05:14,655][46662] Updated weights for policy 0, policy_version 20820 (0.0008) +[2023-10-13 01:05:15,023][46662] Updated weights for policy 0, policy_version 20830 (0.0009) +[2023-10-13 01:05:16,107][46663] Updated weights for policy 1, policy_version 20801 (0.0010) +[2023-10-13 01:05:16,467][46663] Updated weights for policy 1, policy_version 20811 (0.0009) +[2023-10-13 01:05:16,839][46663] Updated weights for policy 1, policy_version 20821 (0.0007) +[2023-10-13 01:05:17,210][46663] Updated weights for policy 1, policy_version 20831 (0.0009) +[2023-10-13 01:05:18,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42663936. Throughput: 0: 1679.6, 1: 1659.2. Samples: 10673712. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:05:18,607][45375] Avg episode reward: [(0, '55.020'), (1, '41.610')] +[2023-10-13 01:05:18,608][46091] Saving new best policy, reward=55.020! +[2023-10-13 01:05:19,164][46662] Updated weights for policy 0, policy_version 20840 (0.0009) +[2023-10-13 01:05:19,535][46662] Updated weights for policy 0, policy_version 20850 (0.0010) +[2023-10-13 01:05:19,906][46662] Updated weights for policy 0, policy_version 20860 (0.0007) +[2023-10-13 01:05:21,167][46663] Updated weights for policy 1, policy_version 20841 (0.0010) +[2023-10-13 01:05:21,534][46663] Updated weights for policy 1, policy_version 20851 (0.0010) +[2023-10-13 01:05:21,910][46663] Updated weights for policy 1, policy_version 20861 (0.0010) +[2023-10-13 01:05:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42729472. Throughput: 0: 1670.2, 1: 1687.2. Samples: 10694286. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:05:23,607][45375] Avg episode reward: [(0, '54.570'), (1, '41.270')] +[2023-10-13 01:05:23,888][46662] Updated weights for policy 0, policy_version 20870 (0.0009) +[2023-10-13 01:05:24,258][46662] Updated weights for policy 0, policy_version 20880 (0.0008) +[2023-10-13 01:05:24,625][46662] Updated weights for policy 0, policy_version 20890 (0.0010) +[2023-10-13 01:05:25,990][46663] Updated weights for policy 1, policy_version 20871 (0.0009) +[2023-10-13 01:05:26,360][46663] Updated weights for policy 1, policy_version 20881 (0.0010) +[2023-10-13 01:05:26,734][46663] Updated weights for policy 1, policy_version 20891 (0.0010) +[2023-10-13 01:05:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42795008. Throughput: 0: 1669.1, 1: 1671.8. Samples: 10703922. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:05:28,607][45375] Avg episode reward: [(0, '53.990'), (1, '40.620')] +[2023-10-13 01:05:28,879][46662] Updated weights for policy 0, policy_version 20900 (0.0010) +[2023-10-13 01:05:29,253][46662] Updated weights for policy 0, policy_version 20910 (0.0008) +[2023-10-13 01:05:29,615][46662] Updated weights for policy 0, policy_version 20920 (0.0008) +[2023-10-13 01:05:31,005][46663] Updated weights for policy 1, policy_version 20901 (0.0009) +[2023-10-13 01:05:31,389][46663] Updated weights for policy 1, policy_version 20911 (0.0010) +[2023-10-13 01:05:31,749][46663] Updated weights for policy 1, policy_version 20921 (0.0011) +[2023-10-13 01:05:33,561][46662] Updated weights for policy 0, policy_version 20930 (0.0009) +[2023-10-13 01:05:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42860544. Throughput: 0: 1673.4, 1: 1662.2. Samples: 10723690. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:05:33,608][45375] Avg episode reward: [(0, '52.890'), (1, '40.380')] +[2023-10-13 01:05:33,931][46662] Updated weights for policy 0, policy_version 20940 (0.0007) +[2023-10-13 01:05:34,296][46662] Updated weights for policy 0, policy_version 20950 (0.0007) +[2023-10-13 01:05:34,667][46662] Updated weights for policy 0, policy_version 20960 (0.0008) +[2023-10-13 01:05:35,986][46663] Updated weights for policy 1, policy_version 20931 (0.0010) +[2023-10-13 01:05:36,351][46663] Updated weights for policy 1, policy_version 20941 (0.0008) +[2023-10-13 01:05:36,728][46663] Updated weights for policy 1, policy_version 20951 (0.0010) +[2023-10-13 01:05:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42926080. Throughput: 0: 1677.3, 1: 1672.5. Samples: 10744270. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:05:38,607][45375] Avg episode reward: [(0, '53.190'), (1, '42.470')] +[2023-10-13 01:05:38,617][46662] Updated weights for policy 0, policy_version 20970 (0.0010) +[2023-10-13 01:05:38,992][46662] Updated weights for policy 0, policy_version 20980 (0.0008) +[2023-10-13 01:05:39,350][46662] Updated weights for policy 0, policy_version 20990 (0.0009) +[2023-10-13 01:05:40,680][46663] Updated weights for policy 1, policy_version 20961 (0.0008) +[2023-10-13 01:05:41,040][46663] Updated weights for policy 1, policy_version 20971 (0.0009) +[2023-10-13 01:05:41,414][46663] Updated weights for policy 1, policy_version 20981 (0.0008) +[2023-10-13 01:05:41,774][46663] Updated weights for policy 1, policy_version 20991 (0.0008) +[2023-10-13 01:05:43,432][46662] Updated weights for policy 0, policy_version 21000 (0.0008) +[2023-10-13 01:05:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42991616. Throughput: 0: 1676.0, 1: 1661.4. Samples: 10754002. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:05:43,607][45375] Avg episode reward: [(0, '52.720'), (1, '42.940')] +[2023-10-13 01:05:43,796][46662] Updated weights for policy 0, policy_version 21010 (0.0009) +[2023-10-13 01:05:44,171][46662] Updated weights for policy 0, policy_version 21020 (0.0011) +[2023-10-13 01:05:45,916][46663] Updated weights for policy 1, policy_version 21001 (0.0009) +[2023-10-13 01:05:46,289][46663] Updated weights for policy 1, policy_version 21011 (0.0010) +[2023-10-13 01:05:46,652][46663] Updated weights for policy 1, policy_version 21021 (0.0007) +[2023-10-13 01:05:48,245][46662] Updated weights for policy 0, policy_version 21030 (0.0011) +[2023-10-13 01:05:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43057152. Throughput: 0: 1676.0, 1: 1670.7. Samples: 10774248. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:05:48,607][45375] Avg episode reward: [(0, '52.270'), (1, '45.550')] +[2023-10-13 01:05:48,612][46662] Updated weights for policy 0, policy_version 21040 (0.0009) +[2023-10-13 01:05:48,992][46662] Updated weights for policy 0, policy_version 21050 (0.0007) +[2023-10-13 01:05:50,673][46663] Updated weights for policy 1, policy_version 21031 (0.0011) +[2023-10-13 01:05:51,043][46663] Updated weights for policy 1, policy_version 21041 (0.0008) +[2023-10-13 01:05:51,406][46663] Updated weights for policy 1, policy_version 21051 (0.0009) +[2023-10-13 01:05:53,070][46662] Updated weights for policy 0, policy_version 21060 (0.0009) +[2023-10-13 01:05:53,458][46662] Updated weights for policy 0, policy_version 21070 (0.0011) +[2023-10-13 01:05:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43122688. Throughput: 0: 1669.3, 1: 1681.9. Samples: 10794986. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:05:53,607][45375] Avg episode reward: [(0, '50.850'), (1, '43.790')] +[2023-10-13 01:05:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000021056_21561344.pth... +[2023-10-13 01:05:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000019488_19955712.pth +[2023-10-13 01:05:53,831][46662] Updated weights for policy 0, policy_version 21080 (0.0008) +[2023-10-13 01:05:54,133][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000021088_21594112.pth... 
+[2023-10-13 01:05:54,172][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000019488_19955712.pth +[2023-10-13 01:05:55,425][46663] Updated weights for policy 1, policy_version 21061 (0.0008) +[2023-10-13 01:05:55,798][46663] Updated weights for policy 1, policy_version 21071 (0.0007) +[2023-10-13 01:05:56,161][46663] Updated weights for policy 1, policy_version 21081 (0.0009) +[2023-10-13 01:05:57,956][46662] Updated weights for policy 0, policy_version 21090 (0.0008) +[2023-10-13 01:05:58,326][46662] Updated weights for policy 0, policy_version 21100 (0.0007) +[2023-10-13 01:05:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 43188224. Throughput: 0: 1671.6, 1: 1656.3. Samples: 10804102. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:05:58,608][45375] Avg episode reward: [(0, '52.490'), (1, '43.720')] +[2023-10-13 01:05:58,694][46662] Updated weights for policy 0, policy_version 21110 (0.0010) +[2023-10-13 01:05:59,066][46662] Updated weights for policy 0, policy_version 21120 (0.0009) +[2023-10-13 01:06:00,300][46663] Updated weights for policy 1, policy_version 21091 (0.0008) +[2023-10-13 01:06:00,658][46663] Updated weights for policy 1, policy_version 21101 (0.0009) +[2023-10-13 01:06:01,027][46663] Updated weights for policy 1, policy_version 21111 (0.0009) +[2023-10-13 01:06:03,359][46662] Updated weights for policy 0, policy_version 21130 (0.0011) +[2023-10-13 01:06:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43253760. Throughput: 0: 1671.2, 1: 1681.2. Samples: 10824570. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:06:03,607][45375] Avg episode reward: [(0, '52.510'), (1, '45.180')] +[2023-10-13 01:06:03,738][46662] Updated weights for policy 0, policy_version 21140 (0.0008) +[2023-10-13 01:06:04,117][46662] Updated weights for policy 0, policy_version 21150 (0.0007) +[2023-10-13 01:06:05,037][46663] Updated weights for policy 1, policy_version 21121 (0.0008) +[2023-10-13 01:06:05,400][46663] Updated weights for policy 1, policy_version 21131 (0.0009) +[2023-10-13 01:06:05,764][46663] Updated weights for policy 1, policy_version 21141 (0.0010) +[2023-10-13 01:06:06,138][46663] Updated weights for policy 1, policy_version 21151 (0.0009) +[2023-10-13 01:06:08,091][46662] Updated weights for policy 0, policy_version 21160 (0.0008) +[2023-10-13 01:06:08,460][46662] Updated weights for policy 0, policy_version 21170 (0.0007) +[2023-10-13 01:06:08,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 43319296. Throughput: 0: 1672.7, 1: 1680.7. Samples: 10845186. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:06:08,607][45375] Avg episode reward: [(0, '53.240'), (1, '45.550')] +[2023-10-13 01:06:08,823][46662] Updated weights for policy 0, policy_version 21180 (0.0007) +[2023-10-13 01:06:10,129][46663] Updated weights for policy 1, policy_version 21161 (0.0008) +[2023-10-13 01:06:10,499][46663] Updated weights for policy 1, policy_version 21171 (0.0009) +[2023-10-13 01:06:10,866][46663] Updated weights for policy 1, policy_version 21181 (0.0008) +[2023-10-13 01:06:12,667][46662] Updated weights for policy 0, policy_version 21190 (0.0008) +[2023-10-13 01:06:13,030][46662] Updated weights for policy 0, policy_version 21200 (0.0008) +[2023-10-13 01:06:13,402][46662] Updated weights for policy 0, policy_version 21210 (0.0009) +[2023-10-13 01:06:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43384832. 
Throughput: 0: 1680.1, 1: 1664.5. Samples: 10854430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:06:13,608][45375] Avg episode reward: [(0, '53.690'), (1, '45.910')] +[2023-10-13 01:06:14,846][46663] Updated weights for policy 1, policy_version 21191 (0.0007) +[2023-10-13 01:06:15,211][46663] Updated weights for policy 1, policy_version 21201 (0.0010) +[2023-10-13 01:06:15,590][46663] Updated weights for policy 1, policy_version 21211 (0.0009) +[2023-10-13 01:06:17,512][46662] Updated weights for policy 0, policy_version 21220 (0.0008) +[2023-10-13 01:06:17,884][46662] Updated weights for policy 0, policy_version 21230 (0.0007) +[2023-10-13 01:06:18,248][46662] Updated weights for policy 0, policy_version 21240 (0.0007) +[2023-10-13 01:06:18,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43483136. Throughput: 0: 1683.9, 1: 1691.2. Samples: 10875568. Policy #0 lag: (min: 11.0, avg: 11.1, max: 18.0) +[2023-10-13 01:06:18,607][45375] Avg episode reward: [(0, '51.850'), (1, '44.780')] +[2023-10-13 01:06:19,655][46663] Updated weights for policy 1, policy_version 21221 (0.0008) +[2023-10-13 01:06:20,015][46663] Updated weights for policy 1, policy_version 21231 (0.0009) +[2023-10-13 01:06:20,384][46663] Updated weights for policy 1, policy_version 21241 (0.0009) +[2023-10-13 01:06:22,301][46662] Updated weights for policy 0, policy_version 21250 (0.0008) +[2023-10-13 01:06:22,673][46662] Updated weights for policy 0, policy_version 21260 (0.0009) +[2023-10-13 01:06:23,039][46662] Updated weights for policy 0, policy_version 21270 (0.0007) +[2023-10-13 01:06:23,404][46662] Updated weights for policy 0, policy_version 21280 (0.0008) +[2023-10-13 01:06:23,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43548672. Throughput: 0: 1671.4, 1: 1696.7. Samples: 10895834. 
Policy #0 lag: (min: 11.0, avg: 11.1, max: 18.0) +[2023-10-13 01:06:23,608][45375] Avg episode reward: [(0, '53.690'), (1, '44.870')] +[2023-10-13 01:06:24,580][46663] Updated weights for policy 1, policy_version 21251 (0.0009) +[2023-10-13 01:06:24,983][46663] Updated weights for policy 1, policy_version 21261 (0.0007) +[2023-10-13 01:06:25,351][46663] Updated weights for policy 1, policy_version 21271 (0.0007) +[2023-10-13 01:06:27,550][46662] Updated weights for policy 0, policy_version 21290 (0.0009) +[2023-10-13 01:06:27,920][46662] Updated weights for policy 0, policy_version 21300 (0.0009) +[2023-10-13 01:06:28,287][46662] Updated weights for policy 0, policy_version 21310 (0.0009) +[2023-10-13 01:06:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43614208. Throughput: 0: 1686.3, 1: 1678.2. Samples: 10905402. Policy #0 lag: (min: 11.0, avg: 11.1, max: 18.0) +[2023-10-13 01:06:28,607][45375] Avg episode reward: [(0, '54.120'), (1, '46.210')] +[2023-10-13 01:06:29,344][46663] Updated weights for policy 1, policy_version 21281 (0.0008) +[2023-10-13 01:06:29,714][46663] Updated weights for policy 1, policy_version 21291 (0.0007) +[2023-10-13 01:06:30,088][46663] Updated weights for policy 1, policy_version 21301 (0.0007) +[2023-10-13 01:06:30,459][46663] Updated weights for policy 1, policy_version 21311 (0.0007) +[2023-10-13 01:06:32,433][46662] Updated weights for policy 0, policy_version 21320 (0.0011) +[2023-10-13 01:06:32,808][46662] Updated weights for policy 0, policy_version 21330 (0.0009) +[2023-10-13 01:06:33,184][46662] Updated weights for policy 0, policy_version 21340 (0.0010) +[2023-10-13 01:06:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43679744. Throughput: 0: 1679.2, 1: 1697.4. Samples: 10926194. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 01:06:33,608][45375] Avg episode reward: [(0, '52.660'), (1, '43.770')] +[2023-10-13 01:06:34,335][46663] Updated weights for policy 1, policy_version 21321 (0.0008) +[2023-10-13 01:06:34,697][46663] Updated weights for policy 1, policy_version 21331 (0.0009) +[2023-10-13 01:06:35,073][46663] Updated weights for policy 1, policy_version 21341 (0.0009) +[2023-10-13 01:06:37,178][46662] Updated weights for policy 0, policy_version 21350 (0.0008) +[2023-10-13 01:06:37,550][46662] Updated weights for policy 0, policy_version 21360 (0.0007) +[2023-10-13 01:06:37,927][46662] Updated weights for policy 0, policy_version 21370 (0.0010) +[2023-10-13 01:06:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43745280. Throughput: 0: 1667.8, 1: 1694.2. Samples: 10946276. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 01:06:38,607][45375] Avg episode reward: [(0, '52.430'), (1, '41.510')] +[2023-10-13 01:06:39,178][46663] Updated weights for policy 1, policy_version 21351 (0.0009) +[2023-10-13 01:06:39,548][46663] Updated weights for policy 1, policy_version 21361 (0.0007) +[2023-10-13 01:06:39,917][46663] Updated weights for policy 1, policy_version 21371 (0.0007) +[2023-10-13 01:06:41,971][46662] Updated weights for policy 0, policy_version 21380 (0.0009) +[2023-10-13 01:06:42,348][46662] Updated weights for policy 0, policy_version 21390 (0.0010) +[2023-10-13 01:06:42,721][46662] Updated weights for policy 0, policy_version 21400 (0.0010) +[2023-10-13 01:06:43,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43810816. Throughput: 0: 1684.9, 1: 1692.0. Samples: 10956064. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 01:06:43,607][45375] Avg episode reward: [(0, '52.190'), (1, '41.250')] +[2023-10-13 01:06:43,965][46663] Updated weights for policy 1, policy_version 21381 (0.0009) +[2023-10-13 01:06:44,332][46663] Updated weights for policy 1, policy_version 21391 (0.0008) +[2023-10-13 01:06:44,701][46663] Updated weights for policy 1, policy_version 21401 (0.0010) +[2023-10-13 01:06:46,801][46662] Updated weights for policy 0, policy_version 21410 (0.0009) +[2023-10-13 01:06:47,177][46662] Updated weights for policy 0, policy_version 21420 (0.0010) +[2023-10-13 01:06:47,561][46662] Updated weights for policy 0, policy_version 21430 (0.0010) +[2023-10-13 01:06:47,929][46662] Updated weights for policy 0, policy_version 21440 (0.0008) +[2023-10-13 01:06:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43876352. Throughput: 0: 1688.4, 1: 1695.2. Samples: 10976828. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 01:06:48,607][45375] Avg episode reward: [(0, '51.830'), (1, '41.570')] +[2023-10-13 01:06:48,904][46663] Updated weights for policy 1, policy_version 21411 (0.0008) +[2023-10-13 01:06:49,279][46663] Updated weights for policy 1, policy_version 21421 (0.0008) +[2023-10-13 01:06:49,652][46663] Updated weights for policy 1, policy_version 21431 (0.0007) +[2023-10-13 01:06:52,047][46662] Updated weights for policy 0, policy_version 21450 (0.0008) +[2023-10-13 01:06:52,417][46662] Updated weights for policy 0, policy_version 21460 (0.0010) +[2023-10-13 01:06:52,797][46662] Updated weights for policy 0, policy_version 21470 (0.0008) +[2023-10-13 01:06:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43941888. Throughput: 0: 1667.7, 1: 1693.5. Samples: 10996440. 
Policy #0 lag: (min: 17.0, avg: 25.1, max: 49.0) +[2023-10-13 01:06:53,608][45375] Avg episode reward: [(0, '51.040'), (1, '42.430')] +[2023-10-13 01:06:53,709][46663] Updated weights for policy 1, policy_version 21441 (0.0008) +[2023-10-13 01:06:54,081][46663] Updated weights for policy 1, policy_version 21451 (0.0009) +[2023-10-13 01:06:54,448][46663] Updated weights for policy 1, policy_version 21461 (0.0007) +[2023-10-13 01:06:54,817][46663] Updated weights for policy 1, policy_version 21471 (0.0010) +[2023-10-13 01:06:56,880][46662] Updated weights for policy 0, policy_version 21480 (0.0009) +[2023-10-13 01:06:57,246][46662] Updated weights for policy 0, policy_version 21490 (0.0010) +[2023-10-13 01:06:57,610][46662] Updated weights for policy 0, policy_version 21500 (0.0010) +[2023-10-13 01:06:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 44007424. Throughput: 0: 1690.4, 1: 1694.3. Samples: 11006740. Policy #0 lag: (min: 17.0, avg: 25.1, max: 49.0) +[2023-10-13 01:06:58,607][45375] Avg episode reward: [(0, '51.860'), (1, '42.120')] +[2023-10-13 01:06:58,771][46663] Updated weights for policy 1, policy_version 21481 (0.0009) +[2023-10-13 01:06:59,138][46663] Updated weights for policy 1, policy_version 21491 (0.0008) +[2023-10-13 01:06:59,509][46663] Updated weights for policy 1, policy_version 21501 (0.0008) +[2023-10-13 01:07:01,612][46662] Updated weights for policy 0, policy_version 21510 (0.0008) +[2023-10-13 01:07:01,984][46662] Updated weights for policy 0, policy_version 21520 (0.0007) +[2023-10-13 01:07:02,353][46662] Updated weights for policy 0, policy_version 21530 (0.0010) +[2023-10-13 01:07:03,480][46663] Updated weights for policy 1, policy_version 21511 (0.0008) +[2023-10-13 01:07:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 44072960. Throughput: 0: 1676.7, 1: 1688.2. Samples: 11026986. 
Policy #0 lag: (min: 17.0, avg: 25.1, max: 49.0) +[2023-10-13 01:07:03,607][45375] Avg episode reward: [(0, '50.710'), (1, '43.260')] +[2023-10-13 01:07:03,843][46663] Updated weights for policy 1, policy_version 21521 (0.0008) +[2023-10-13 01:07:04,210][46663] Updated weights for policy 1, policy_version 21531 (0.0009) +[2023-10-13 01:07:06,458][46662] Updated weights for policy 0, policy_version 21540 (0.0010) +[2023-10-13 01:07:06,862][46662] Updated weights for policy 0, policy_version 21550 (0.0007) +[2023-10-13 01:07:07,236][46662] Updated weights for policy 0, policy_version 21560 (0.0007) +[2023-10-13 01:07:08,352][46663] Updated weights for policy 1, policy_version 21541 (0.0008) +[2023-10-13 01:07:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 44138496. Throughput: 0: 1666.3, 1: 1683.9. Samples: 11046592. Policy #0 lag: (min: 17.0, avg: 25.1, max: 49.0) +[2023-10-13 01:07:08,607][45375] Avg episode reward: [(0, '49.520'), (1, '43.030')] +[2023-10-13 01:07:08,720][46663] Updated weights for policy 1, policy_version 21551 (0.0008) +[2023-10-13 01:07:09,077][46663] Updated weights for policy 1, policy_version 21561 (0.0007) +[2023-10-13 01:07:11,172][46662] Updated weights for policy 0, policy_version 21570 (0.0009) +[2023-10-13 01:07:11,544][46662] Updated weights for policy 0, policy_version 21580 (0.0007) +[2023-10-13 01:07:11,920][46662] Updated weights for policy 0, policy_version 21590 (0.0007) +[2023-10-13 01:07:12,289][46662] Updated weights for policy 0, policy_version 21600 (0.0007) +[2023-10-13 01:07:13,186][46663] Updated weights for policy 1, policy_version 21571 (0.0008) +[2023-10-13 01:07:13,605][46663] Updated weights for policy 1, policy_version 21581 (0.0009) +[2023-10-13 01:07:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 44204032. Throughput: 0: 1685.5, 1: 1691.6. Samples: 11057374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:13,607][45375] Avg episode reward: [(0, '50.330'), (1, '43.360')] +[2023-10-13 01:07:13,980][46663] Updated weights for policy 1, policy_version 21591 (0.0010) +[2023-10-13 01:07:16,438][46662] Updated weights for policy 0, policy_version 21610 (0.0009) +[2023-10-13 01:07:16,809][46662] Updated weights for policy 0, policy_version 21620 (0.0009) +[2023-10-13 01:07:17,179][46662] Updated weights for policy 0, policy_version 21630 (0.0009) +[2023-10-13 01:07:18,276][46663] Updated weights for policy 1, policy_version 21601 (0.0009) +[2023-10-13 01:07:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44269568. Throughput: 0: 1674.9, 1: 1678.4. Samples: 11077088. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:18,607][45375] Avg episode reward: [(0, '50.080'), (1, '44.980')] +[2023-10-13 01:07:18,642][46663] Updated weights for policy 1, policy_version 21611 (0.0011) +[2023-10-13 01:07:19,003][46663] Updated weights for policy 1, policy_version 21621 (0.0010) +[2023-10-13 01:07:19,379][46663] Updated weights for policy 1, policy_version 21631 (0.0010) +[2023-10-13 01:07:21,162][46662] Updated weights for policy 0, policy_version 21640 (0.0009) +[2023-10-13 01:07:21,534][46662] Updated weights for policy 0, policy_version 21650 (0.0009) +[2023-10-13 01:07:21,906][46662] Updated weights for policy 0, policy_version 21660 (0.0010) +[2023-10-13 01:07:23,412][46663] Updated weights for policy 1, policy_version 21641 (0.0008) +[2023-10-13 01:07:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44335104. Throughput: 0: 1676.0, 1: 1669.7. Samples: 11096834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:23,608][45375] Avg episode reward: [(0, '50.760'), (1, '43.810')] +[2023-10-13 01:07:23,775][46663] Updated weights for policy 1, policy_version 21651 (0.0008) +[2023-10-13 01:07:24,143][46663] Updated weights for policy 1, policy_version 21661 (0.0007) +[2023-10-13 01:07:25,894][46662] Updated weights for policy 0, policy_version 21670 (0.0011) +[2023-10-13 01:07:26,253][46662] Updated weights for policy 0, policy_version 21680 (0.0011) +[2023-10-13 01:07:26,619][46662] Updated weights for policy 0, policy_version 21690 (0.0008) +[2023-10-13 01:07:28,108][46663] Updated weights for policy 1, policy_version 21671 (0.0010) +[2023-10-13 01:07:28,479][46663] Updated weights for policy 1, policy_version 21681 (0.0010) +[2023-10-13 01:07:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44400640. Throughput: 0: 1686.6, 1: 1679.0. Samples: 11107516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:28,608][45375] Avg episode reward: [(0, '50.350'), (1, '43.830')] +[2023-10-13 01:07:28,849][46663] Updated weights for policy 1, policy_version 21691 (0.0009) +[2023-10-13 01:07:30,676][46662] Updated weights for policy 0, policy_version 21700 (0.0008) +[2023-10-13 01:07:31,047][46662] Updated weights for policy 0, policy_version 21710 (0.0007) +[2023-10-13 01:07:31,417][46662] Updated weights for policy 0, policy_version 21720 (0.0007) +[2023-10-13 01:07:32,993][46663] Updated weights for policy 1, policy_version 21701 (0.0010) +[2023-10-13 01:07:33,359][46663] Updated weights for policy 1, policy_version 21711 (0.0010) +[2023-10-13 01:07:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44466176. Throughput: 0: 1662.6, 1: 1675.2. Samples: 11127032. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:33,608][45375] Avg episode reward: [(0, '49.110'), (1, '44.460')] +[2023-10-13 01:07:33,725][46663] Updated weights for policy 1, policy_version 21721 (0.0009) +[2023-10-13 01:07:35,421][46662] Updated weights for policy 0, policy_version 21730 (0.0008) +[2023-10-13 01:07:35,785][46662] Updated weights for policy 0, policy_version 21740 (0.0010) +[2023-10-13 01:07:36,159][46662] Updated weights for policy 0, policy_version 21750 (0.0008) +[2023-10-13 01:07:36,527][46662] Updated weights for policy 0, policy_version 21760 (0.0008) +[2023-10-13 01:07:37,678][46663] Updated weights for policy 1, policy_version 21731 (0.0010) +[2023-10-13 01:07:38,040][46663] Updated weights for policy 1, policy_version 21741 (0.0011) +[2023-10-13 01:07:38,419][46663] Updated weights for policy 1, policy_version 21751 (0.0008) +[2023-10-13 01:07:38,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44531712. Throughput: 0: 1683.9, 1: 1658.4. Samples: 11146844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:38,607][45375] Avg episode reward: [(0, '49.970'), (1, '43.880')] +[2023-10-13 01:07:40,656][46662] Updated weights for policy 0, policy_version 21770 (0.0007) +[2023-10-13 01:07:41,022][46662] Updated weights for policy 0, policy_version 21780 (0.0010) +[2023-10-13 01:07:41,399][46662] Updated weights for policy 0, policy_version 21790 (0.0008) +[2023-10-13 01:07:42,429][46663] Updated weights for policy 1, policy_version 21761 (0.0008) +[2023-10-13 01:07:42,801][46663] Updated weights for policy 1, policy_version 21771 (0.0008) +[2023-10-13 01:07:43,171][46663] Updated weights for policy 1, policy_version 21781 (0.0010) +[2023-10-13 01:07:43,534][46663] Updated weights for policy 1, policy_version 21791 (0.0009) +[2023-10-13 01:07:43,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44630016. 
Throughput: 0: 1678.5, 1: 1679.4. Samples: 11157846. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:43,607][45375] Avg episode reward: [(0, '50.300'), (1, '47.630')] +[2023-10-13 01:07:45,428][46662] Updated weights for policy 0, policy_version 21800 (0.0007) +[2023-10-13 01:07:45,793][46662] Updated weights for policy 0, policy_version 21810 (0.0007) +[2023-10-13 01:07:46,164][46662] Updated weights for policy 0, policy_version 21820 (0.0007) +[2023-10-13 01:07:47,627][46663] Updated weights for policy 1, policy_version 21801 (0.0010) +[2023-10-13 01:07:48,000][46663] Updated weights for policy 1, policy_version 21811 (0.0009) +[2023-10-13 01:07:48,372][46663] Updated weights for policy 1, policy_version 21821 (0.0008) +[2023-10-13 01:07:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 44695552. Throughput: 0: 1673.1, 1: 1678.1. Samples: 11177792. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:07:48,607][45375] Avg episode reward: [(0, '51.200'), (1, '48.720')] +[2023-10-13 01:07:50,104][46662] Updated weights for policy 0, policy_version 21830 (0.0007) +[2023-10-13 01:07:50,474][46662] Updated weights for policy 0, policy_version 21840 (0.0008) +[2023-10-13 01:07:50,862][46662] Updated weights for policy 0, policy_version 21850 (0.0008) +[2023-10-13 01:07:52,477][46663] Updated weights for policy 1, policy_version 21831 (0.0008) +[2023-10-13 01:07:52,852][46663] Updated weights for policy 1, policy_version 21841 (0.0010) +[2023-10-13 01:07:53,216][46663] Updated weights for policy 1, policy_version 21851 (0.0007) +[2023-10-13 01:07:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 44761088. Throughput: 0: 1695.3, 1: 1656.7. Samples: 11197432. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:07:53,607][45375] Avg episode reward: [(0, '51.910'), (1, '48.250')] +[2023-10-13 01:07:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000021856_22380544.pth... +[2023-10-13 01:07:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000021856_22380544.pth... +[2023-10-13 01:07:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000020288_20774912.pth +[2023-10-13 01:07:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000020288_20774912.pth +[2023-10-13 01:07:54,811][46662] Updated weights for policy 0, policy_version 21860 (0.0007) +[2023-10-13 01:07:55,187][46662] Updated weights for policy 0, policy_version 21870 (0.0008) +[2023-10-13 01:07:55,556][46662] Updated weights for policy 0, policy_version 21880 (0.0007) +[2023-10-13 01:07:57,348][46663] Updated weights for policy 1, policy_version 21861 (0.0008) +[2023-10-13 01:07:57,712][46663] Updated weights for policy 1, policy_version 21871 (0.0007) +[2023-10-13 01:07:58,078][46663] Updated weights for policy 1, policy_version 21881 (0.0008) +[2023-10-13 01:07:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44826624. Throughput: 0: 1667.3, 1: 1674.2. Samples: 11207744. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:07:58,607][45375] Avg episode reward: [(0, '51.630'), (1, '47.880')] +[2023-10-13 01:07:59,633][46662] Updated weights for policy 0, policy_version 21890 (0.0008) +[2023-10-13 01:08:00,009][46662] Updated weights for policy 0, policy_version 21900 (0.0008) +[2023-10-13 01:08:00,384][46662] Updated weights for policy 0, policy_version 21910 (0.0008) +[2023-10-13 01:08:00,754][46662] Updated weights for policy 0, policy_version 21920 (0.0007) +[2023-10-13 01:08:02,401][46663] Updated weights for policy 1, policy_version 21891 (0.0009) +[2023-10-13 01:08:02,818][46663] Updated weights for policy 1, policy_version 21901 (0.0008) +[2023-10-13 01:08:03,175][46663] Updated weights for policy 1, policy_version 21911 (0.0007) +[2023-10-13 01:08:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 44892160. Throughput: 0: 1679.9, 1: 1676.5. Samples: 11228128. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:08:03,607][45375] Avg episode reward: [(0, '52.900'), (1, '48.120')] +[2023-10-13 01:08:04,768][46662] Updated weights for policy 0, policy_version 21930 (0.0007) +[2023-10-13 01:08:05,129][46662] Updated weights for policy 0, policy_version 21940 (0.0008) +[2023-10-13 01:08:05,492][46662] Updated weights for policy 0, policy_version 21950 (0.0008) +[2023-10-13 01:08:07,138][46663] Updated weights for policy 1, policy_version 21921 (0.0008) +[2023-10-13 01:08:07,498][46663] Updated weights for policy 1, policy_version 21931 (0.0010) +[2023-10-13 01:08:07,872][46663] Updated weights for policy 1, policy_version 21941 (0.0008) +[2023-10-13 01:08:08,249][46663] Updated weights for policy 1, policy_version 21951 (0.0007) +[2023-10-13 01:08:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 44957696. Throughput: 0: 1696.5, 1: 1658.0. Samples: 11247786. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:08:08,607][45375] Avg episode reward: [(0, '54.510'), (1, '49.530')] +[2023-10-13 01:08:09,601][46662] Updated weights for policy 0, policy_version 21960 (0.0008) +[2023-10-13 01:08:09,985][46662] Updated weights for policy 0, policy_version 21970 (0.0007) +[2023-10-13 01:08:10,368][46662] Updated weights for policy 0, policy_version 21980 (0.0011) +[2023-10-13 01:08:12,333][46663] Updated weights for policy 1, policy_version 21961 (0.0009) +[2023-10-13 01:08:12,703][46663] Updated weights for policy 1, policy_version 21971 (0.0008) +[2023-10-13 01:08:13,072][46663] Updated weights for policy 1, policy_version 21981 (0.0007) +[2023-10-13 01:08:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45023232. Throughput: 0: 1668.5, 1: 1677.0. Samples: 11258064. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-13 01:08:13,607][45375] Avg episode reward: [(0, '54.820'), (1, '47.750')] +[2023-10-13 01:08:14,458][46662] Updated weights for policy 0, policy_version 21990 (0.0009) +[2023-10-13 01:08:14,831][46662] Updated weights for policy 0, policy_version 22000 (0.0007) +[2023-10-13 01:08:15,190][46662] Updated weights for policy 0, policy_version 22010 (0.0010) +[2023-10-13 01:08:17,163][46663] Updated weights for policy 1, policy_version 21991 (0.0008) +[2023-10-13 01:08:17,536][46663] Updated weights for policy 1, policy_version 22001 (0.0010) +[2023-10-13 01:08:17,903][46663] Updated weights for policy 1, policy_version 22011 (0.0009) +[2023-10-13 01:08:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45088768. Throughput: 0: 1697.2, 1: 1670.8. Samples: 11278594. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-13 01:08:18,607][45375] Avg episode reward: [(0, '53.220'), (1, '46.780')] +[2023-10-13 01:08:19,113][46662] Updated weights for policy 0, policy_version 22020 (0.0009) +[2023-10-13 01:08:19,486][46662] Updated weights for policy 0, policy_version 22030 (0.0010) +[2023-10-13 01:08:19,850][46662] Updated weights for policy 0, policy_version 22040 (0.0007) +[2023-10-13 01:08:21,872][46663] Updated weights for policy 1, policy_version 22021 (0.0009) +[2023-10-13 01:08:22,239][46663] Updated weights for policy 1, policy_version 22031 (0.0009) +[2023-10-13 01:08:22,609][46663] Updated weights for policy 1, policy_version 22041 (0.0010) +[2023-10-13 01:08:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45154304. Throughput: 0: 1707.3, 1: 1674.3. Samples: 11299014. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-13 01:08:23,608][45375] Avg episode reward: [(0, '52.070'), (1, '47.940')] +[2023-10-13 01:08:23,955][46662] Updated weights for policy 0, policy_version 22050 (0.0009) +[2023-10-13 01:08:24,338][46662] Updated weights for policy 0, policy_version 22060 (0.0010) +[2023-10-13 01:08:24,709][46662] Updated weights for policy 0, policy_version 22070 (0.0010) +[2023-10-13 01:08:25,073][46662] Updated weights for policy 0, policy_version 22080 (0.0008) +[2023-10-13 01:08:26,684][46663] Updated weights for policy 1, policy_version 22051 (0.0010) +[2023-10-13 01:08:27,045][46663] Updated weights for policy 1, policy_version 22061 (0.0008) +[2023-10-13 01:08:27,419][46663] Updated weights for policy 1, policy_version 22071 (0.0008) +[2023-10-13 01:08:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 45219840. Throughput: 0: 1686.6, 1: 1682.1. Samples: 11309438. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-13 01:08:28,607][45375] Avg episode reward: [(0, '52.530'), (1, '48.670')] +[2023-10-13 01:08:29,255][46662] Updated weights for policy 0, policy_version 22090 (0.0010) +[2023-10-13 01:08:29,620][46662] Updated weights for policy 0, policy_version 22100 (0.0007) +[2023-10-13 01:08:29,991][46662] Updated weights for policy 0, policy_version 22110 (0.0009) +[2023-10-13 01:08:31,607][46663] Updated weights for policy 1, policy_version 22081 (0.0008) +[2023-10-13 01:08:31,975][46663] Updated weights for policy 1, policy_version 22091 (0.0009) +[2023-10-13 01:08:32,344][46663] Updated weights for policy 1, policy_version 22101 (0.0007) +[2023-10-13 01:08:32,705][46663] Updated weights for policy 1, policy_version 22111 (0.0011) +[2023-10-13 01:08:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 45285376. Throughput: 0: 1699.3, 1: 1663.2. Samples: 11329102. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:08:33,607][45375] Avg episode reward: [(0, '50.890'), (1, '47.600')] +[2023-10-13 01:08:33,954][46662] Updated weights for policy 0, policy_version 22120 (0.0007) +[2023-10-13 01:08:34,315][46662] Updated weights for policy 0, policy_version 22130 (0.0008) +[2023-10-13 01:08:34,687][46662] Updated weights for policy 0, policy_version 22140 (0.0009) +[2023-10-13 01:08:36,647][46663] Updated weights for policy 1, policy_version 22121 (0.0008) +[2023-10-13 01:08:37,007][46663] Updated weights for policy 1, policy_version 22131 (0.0009) +[2023-10-13 01:08:37,376][46663] Updated weights for policy 1, policy_version 22141 (0.0009) +[2023-10-13 01:08:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45350912. Throughput: 0: 1697.1, 1: 1681.2. Samples: 11349452. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:08:38,607][45375] Avg episode reward: [(0, '50.940'), (1, '49.550')] +[2023-10-13 01:08:38,805][46662] Updated weights for policy 0, policy_version 22150 (0.0007) +[2023-10-13 01:08:39,179][46662] Updated weights for policy 0, policy_version 22160 (0.0008) +[2023-10-13 01:08:39,556][46662] Updated weights for policy 0, policy_version 22170 (0.0009) +[2023-10-13 01:08:41,344][46663] Updated weights for policy 1, policy_version 22151 (0.0011) +[2023-10-13 01:08:41,709][46663] Updated weights for policy 1, policy_version 22161 (0.0008) +[2023-10-13 01:08:42,086][46663] Updated weights for policy 1, policy_version 22171 (0.0009) +[2023-10-13 01:08:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45416448. Throughput: 0: 1690.8, 1: 1683.4. Samples: 11359584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:08:43,608][45375] Avg episode reward: [(0, '50.780'), (1, '47.200')] +[2023-10-13 01:08:43,632][46662] Updated weights for policy 0, policy_version 22180 (0.0008) +[2023-10-13 01:08:44,008][46662] Updated weights for policy 0, policy_version 22190 (0.0008) +[2023-10-13 01:08:44,379][46662] Updated weights for policy 0, policy_version 22200 (0.0007) +[2023-10-13 01:08:46,115][46663] Updated weights for policy 1, policy_version 22181 (0.0008) +[2023-10-13 01:08:46,488][46663] Updated weights for policy 1, policy_version 22191 (0.0008) +[2023-10-13 01:08:46,856][46663] Updated weights for policy 1, policy_version 22201 (0.0007) +[2023-10-13 01:08:48,344][46662] Updated weights for policy 0, policy_version 22210 (0.0008) +[2023-10-13 01:08:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45481984. Throughput: 0: 1696.4, 1: 1664.6. Samples: 11379376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:08:48,607][45375] Avg episode reward: [(0, '49.940'), (1, '46.080')] +[2023-10-13 01:08:48,718][46662] Updated weights for policy 0, policy_version 22220 (0.0008) +[2023-10-13 01:08:49,083][46662] Updated weights for policy 0, policy_version 22230 (0.0008) +[2023-10-13 01:08:49,458][46662] Updated weights for policy 0, policy_version 22240 (0.0009) +[2023-10-13 01:08:50,947][46663] Updated weights for policy 1, policy_version 22211 (0.0009) +[2023-10-13 01:08:51,361][46663] Updated weights for policy 1, policy_version 22221 (0.0009) +[2023-10-13 01:08:51,734][46663] Updated weights for policy 1, policy_version 22231 (0.0009) +[2023-10-13 01:08:53,522][46662] Updated weights for policy 0, policy_version 22250 (0.0007) +[2023-10-13 01:08:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45547520. Throughput: 0: 1698.4, 1: 1685.4. Samples: 11400056. Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-13 01:08:53,607][45375] Avg episode reward: [(0, '50.700'), (1, '46.560')] +[2023-10-13 01:08:53,886][46662] Updated weights for policy 0, policy_version 22260 (0.0007) +[2023-10-13 01:08:54,263][46662] Updated weights for policy 0, policy_version 22270 (0.0007) +[2023-10-13 01:08:55,776][46663] Updated weights for policy 1, policy_version 22241 (0.0009) +[2023-10-13 01:08:56,135][46663] Updated weights for policy 1, policy_version 22251 (0.0008) +[2023-10-13 01:08:56,498][46663] Updated weights for policy 1, policy_version 22261 (0.0010) +[2023-10-13 01:08:56,866][46663] Updated weights for policy 1, policy_version 22271 (0.0010) +[2023-10-13 01:08:58,379][46662] Updated weights for policy 0, policy_version 22280 (0.0009) +[2023-10-13 01:08:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45613056. Throughput: 0: 1697.5, 1: 1673.7. Samples: 11409768. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-13 01:08:58,607][45375] Avg episode reward: [(0, '48.490'), (1, '46.560')] +[2023-10-13 01:08:58,761][46662] Updated weights for policy 0, policy_version 22290 (0.0011) +[2023-10-13 01:08:59,135][46662] Updated weights for policy 0, policy_version 22300 (0.0011) +[2023-10-13 01:09:00,944][46663] Updated weights for policy 1, policy_version 22281 (0.0008) +[2023-10-13 01:09:01,309][46663] Updated weights for policy 1, policy_version 22291 (0.0008) +[2023-10-13 01:09:01,672][46663] Updated weights for policy 1, policy_version 22301 (0.0008) +[2023-10-13 01:09:03,096][46662] Updated weights for policy 0, policy_version 22310 (0.0010) +[2023-10-13 01:09:03,461][46662] Updated weights for policy 0, policy_version 22320 (0.0007) +[2023-10-13 01:09:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45678592. Throughput: 0: 1695.5, 1: 1667.0. Samples: 11429908. Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-13 01:09:03,608][45375] Avg episode reward: [(0, '47.600'), (1, '45.480')] +[2023-10-13 01:09:03,828][46662] Updated weights for policy 0, policy_version 22330 (0.0008) +[2023-10-13 01:09:05,881][46663] Updated weights for policy 1, policy_version 22311 (0.0009) +[2023-10-13 01:09:06,251][46663] Updated weights for policy 1, policy_version 22321 (0.0010) +[2023-10-13 01:09:06,627][46663] Updated weights for policy 1, policy_version 22331 (0.0010) +[2023-10-13 01:09:07,917][46662] Updated weights for policy 0, policy_version 22340 (0.0009) +[2023-10-13 01:09:08,276][46662] Updated weights for policy 0, policy_version 22350 (0.0011) +[2023-10-13 01:09:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45744128. Throughput: 0: 1686.4, 1: 1680.0. Samples: 11450502. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-13 01:09:08,607][45375] Avg episode reward: [(0, '47.190'), (1, '45.380')] +[2023-10-13 01:09:08,643][46662] Updated weights for policy 0, policy_version 22360 (0.0010) +[2023-10-13 01:09:10,751][46663] Updated weights for policy 1, policy_version 22341 (0.0009) +[2023-10-13 01:09:11,123][46663] Updated weights for policy 1, policy_version 22351 (0.0007) +[2023-10-13 01:09:11,490][46663] Updated weights for policy 1, policy_version 22361 (0.0007) +[2023-10-13 01:09:12,645][46662] Updated weights for policy 0, policy_version 22370 (0.0008) +[2023-10-13 01:09:13,012][46662] Updated weights for policy 0, policy_version 22380 (0.0007) +[2023-10-13 01:09:13,392][46662] Updated weights for policy 0, policy_version 22390 (0.0007) +[2023-10-13 01:09:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45809664. Throughput: 0: 1686.7, 1: 1662.3. Samples: 11460140. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:09:13,607][45375] Avg episode reward: [(0, '46.000'), (1, '45.300')] +[2023-10-13 01:09:13,766][46662] Updated weights for policy 0, policy_version 22400 (0.0007) +[2023-10-13 01:09:15,564][46663] Updated weights for policy 1, policy_version 22371 (0.0007) +[2023-10-13 01:09:15,927][46663] Updated weights for policy 1, policy_version 22381 (0.0007) +[2023-10-13 01:09:16,296][46663] Updated weights for policy 1, policy_version 22391 (0.0009) +[2023-10-13 01:09:17,726][46662] Updated weights for policy 0, policy_version 22410 (0.0009) +[2023-10-13 01:09:18,088][46662] Updated weights for policy 0, policy_version 22420 (0.0010) +[2023-10-13 01:09:18,463][46662] Updated weights for policy 0, policy_version 22430 (0.0010) +[2023-10-13 01:09:18,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 45907968. Throughput: 0: 1685.1, 1: 1674.0. Samples: 11480262. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:09:18,607][45375] Avg episode reward: [(0, '47.980'), (1, '46.680')] +[2023-10-13 01:09:20,429][46663] Updated weights for policy 1, policy_version 22401 (0.0009) +[2023-10-13 01:09:20,796][46663] Updated weights for policy 1, policy_version 22411 (0.0011) +[2023-10-13 01:09:21,167][46663] Updated weights for policy 1, policy_version 22421 (0.0011) +[2023-10-13 01:09:21,533][46663] Updated weights for policy 1, policy_version 22431 (0.0009) +[2023-10-13 01:09:22,524][46662] Updated weights for policy 0, policy_version 22440 (0.0010) +[2023-10-13 01:09:22,894][46662] Updated weights for policy 0, policy_version 22450 (0.0011) +[2023-10-13 01:09:23,265][46662] Updated weights for policy 0, policy_version 22460 (0.0011) +[2023-10-13 01:09:23,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45973504. Throughput: 0: 1676.6, 1: 1684.3. Samples: 11500694. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:09:23,608][45375] Avg episode reward: [(0, '49.320'), (1, '46.150')] +[2023-10-13 01:09:25,377][46663] Updated weights for policy 1, policy_version 22441 (0.0009) +[2023-10-13 01:09:25,746][46663] Updated weights for policy 1, policy_version 22451 (0.0008) +[2023-10-13 01:09:26,113][46663] Updated weights for policy 1, policy_version 22461 (0.0008) +[2023-10-13 01:09:27,350][46662] Updated weights for policy 0, policy_version 22470 (0.0011) +[2023-10-13 01:09:27,721][46662] Updated weights for policy 0, policy_version 22480 (0.0010) +[2023-10-13 01:09:28,096][46662] Updated weights for policy 0, policy_version 22490 (0.0007) +[2023-10-13 01:09:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46039040. Throughput: 0: 1691.4, 1: 1660.4. Samples: 11510416. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-13 01:09:28,607][45375] Avg episode reward: [(0, '49.110'), (1, '46.030')] +[2023-10-13 01:09:30,365][46663] Updated weights for policy 1, policy_version 22471 (0.0008) +[2023-10-13 01:09:30,741][46663] Updated weights for policy 1, policy_version 22481 (0.0008) +[2023-10-13 01:09:31,117][46663] Updated weights for policy 1, policy_version 22491 (0.0010) +[2023-10-13 01:09:32,001][46662] Updated weights for policy 0, policy_version 22500 (0.0007) +[2023-10-13 01:09:32,395][46662] Updated weights for policy 0, policy_version 22510 (0.0010) +[2023-10-13 01:09:32,773][46662] Updated weights for policy 0, policy_version 22520 (0.0010) +[2023-10-13 01:09:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46104576. Throughput: 0: 1688.8, 1: 1682.5. Samples: 11531086. Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-13 01:09:33,608][45375] Avg episode reward: [(0, '47.930'), (1, '47.830')] +[2023-10-13 01:09:34,886][46663] Updated weights for policy 1, policy_version 22501 (0.0009) +[2023-10-13 01:09:35,250][46663] Updated weights for policy 1, policy_version 22511 (0.0011) +[2023-10-13 01:09:35,631][46663] Updated weights for policy 1, policy_version 22521 (0.0009) +[2023-10-13 01:09:36,791][46662] Updated weights for policy 0, policy_version 22530 (0.0011) +[2023-10-13 01:09:37,171][46662] Updated weights for policy 0, policy_version 22540 (0.0007) +[2023-10-13 01:09:37,536][46662] Updated weights for policy 0, policy_version 22550 (0.0009) +[2023-10-13 01:09:37,903][46662] Updated weights for policy 0, policy_version 22560 (0.0009) +[2023-10-13 01:09:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46170112. Throughput: 0: 1661.6, 1: 1695.7. Samples: 11551134. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-13 01:09:38,607][45375] Avg episode reward: [(0, '49.790'), (1, '47.160')] +[2023-10-13 01:09:39,765][46663] Updated weights for policy 1, policy_version 22531 (0.0008) +[2023-10-13 01:09:40,148][46663] Updated weights for policy 1, policy_version 22541 (0.0007) +[2023-10-13 01:09:40,515][46663] Updated weights for policy 1, policy_version 22551 (0.0008) +[2023-10-13 01:09:42,088][46662] Updated weights for policy 0, policy_version 22570 (0.0010) +[2023-10-13 01:09:42,459][46662] Updated weights for policy 0, policy_version 22580 (0.0009) +[2023-10-13 01:09:42,835][46662] Updated weights for policy 0, policy_version 22590 (0.0010) +[2023-10-13 01:09:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46235648. Throughput: 0: 1686.9, 1: 1678.1. Samples: 11561194. Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-13 01:09:43,607][45375] Avg episode reward: [(0, '49.070'), (1, '47.340')] +[2023-10-13 01:09:44,443][46663] Updated weights for policy 1, policy_version 22561 (0.0010) +[2023-10-13 01:09:44,802][46663] Updated weights for policy 1, policy_version 22571 (0.0009) +[2023-10-13 01:09:45,167][46663] Updated weights for policy 1, policy_version 22581 (0.0008) +[2023-10-13 01:09:45,540][46663] Updated weights for policy 1, policy_version 22591 (0.0008) +[2023-10-13 01:09:46,841][46662] Updated weights for policy 0, policy_version 22600 (0.0008) +[2023-10-13 01:09:47,221][46662] Updated weights for policy 0, policy_version 22610 (0.0008) +[2023-10-13 01:09:47,595][46662] Updated weights for policy 0, policy_version 22620 (0.0007) +[2023-10-13 01:09:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46301184. Throughput: 0: 1681.1, 1: 1696.4. Samples: 11581896. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-13 01:09:48,608][45375] Avg episode reward: [(0, '50.830'), (1, '47.600')] +[2023-10-13 01:09:49,617][46663] Updated weights for policy 1, policy_version 22601 (0.0008) +[2023-10-13 01:09:49,989][46663] Updated weights for policy 1, policy_version 22611 (0.0008) +[2023-10-13 01:09:50,358][46663] Updated weights for policy 1, policy_version 22621 (0.0009) +[2023-10-13 01:09:51,714][46662] Updated weights for policy 0, policy_version 22630 (0.0009) +[2023-10-13 01:09:52,081][46662] Updated weights for policy 0, policy_version 22640 (0.0009) +[2023-10-13 01:09:52,454][46662] Updated weights for policy 0, policy_version 22650 (0.0009) +[2023-10-13 01:09:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46366720. Throughput: 0: 1664.7, 1: 1696.2. Samples: 11601744. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-13 01:09:53,607][45375] Avg episode reward: [(0, '51.570'), (1, '47.280')] +[2023-10-13 01:09:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000022624_23166976.pth... +[2023-10-13 01:09:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000022656_23199744.pth... 
+[2023-10-13 01:09:53,647][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000021056_21561344.pth +[2023-10-13 01:09:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000021088_21594112.pth +[2023-10-13 01:09:54,384][46663] Updated weights for policy 1, policy_version 22631 (0.0008) +[2023-10-13 01:09:54,759][46663] Updated weights for policy 1, policy_version 22641 (0.0009) +[2023-10-13 01:09:55,127][46663] Updated weights for policy 1, policy_version 22651 (0.0008) +[2023-10-13 01:09:56,497][46662] Updated weights for policy 0, policy_version 22660 (0.0009) +[2023-10-13 01:09:56,864][46662] Updated weights for policy 0, policy_version 22670 (0.0007) +[2023-10-13 01:09:57,239][46662] Updated weights for policy 0, policy_version 22680 (0.0007) +[2023-10-13 01:09:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46432256. Throughput: 0: 1692.4, 1: 1685.6. Samples: 11612148. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-13 01:09:58,607][45375] Avg episode reward: [(0, '49.460'), (1, '46.090')] +[2023-10-13 01:09:59,275][46663] Updated weights for policy 1, policy_version 22661 (0.0008) +[2023-10-13 01:09:59,644][46663] Updated weights for policy 1, policy_version 22671 (0.0009) +[2023-10-13 01:10:00,013][46663] Updated weights for policy 1, policy_version 22681 (0.0009) +[2023-10-13 01:10:01,262][46662] Updated weights for policy 0, policy_version 22690 (0.0007) +[2023-10-13 01:10:01,643][46662] Updated weights for policy 0, policy_version 22700 (0.0009) +[2023-10-13 01:10:02,009][46662] Updated weights for policy 0, policy_version 22710 (0.0010) +[2023-10-13 01:10:02,379][46662] Updated weights for policy 0, policy_version 22720 (0.0008) +[2023-10-13 01:10:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46497792. Throughput: 0: 1677.2, 1: 1698.3. Samples: 11632158. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-13 01:10:03,608][45375] Avg episode reward: [(0, '50.180'), (1, '47.260')] +[2023-10-13 01:10:04,082][46663] Updated weights for policy 1, policy_version 22691 (0.0009) +[2023-10-13 01:10:04,450][46663] Updated weights for policy 1, policy_version 22701 (0.0007) +[2023-10-13 01:10:04,819][46663] Updated weights for policy 1, policy_version 22711 (0.0009) +[2023-10-13 01:10:06,398][46662] Updated weights for policy 0, policy_version 22730 (0.0010) +[2023-10-13 01:10:06,774][46662] Updated weights for policy 0, policy_version 22740 (0.0009) +[2023-10-13 01:10:07,149][46662] Updated weights for policy 0, policy_version 22750 (0.0009) +[2023-10-13 01:10:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46563328. Throughput: 0: 1666.4, 1: 1693.3. Samples: 11651882. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 01:10:08,607][45375] Avg episode reward: [(0, '51.610'), (1, '48.200')] +[2023-10-13 01:10:08,877][46663] Updated weights for policy 1, policy_version 22721 (0.0007) +[2023-10-13 01:10:09,241][46663] Updated weights for policy 1, policy_version 22731 (0.0009) +[2023-10-13 01:10:09,617][46663] Updated weights for policy 1, policy_version 22741 (0.0008) +[2023-10-13 01:10:09,995][46663] Updated weights for policy 1, policy_version 22751 (0.0008) +[2023-10-13 01:10:11,398][46662] Updated weights for policy 0, policy_version 22760 (0.0008) +[2023-10-13 01:10:11,770][46662] Updated weights for policy 0, policy_version 22770 (0.0007) +[2023-10-13 01:10:12,133][46662] Updated weights for policy 0, policy_version 22780 (0.0008) +[2023-10-13 01:10:13,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46628864. Throughput: 0: 1682.1, 1: 1692.0. Samples: 11662250. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 01:10:13,607][45375] Avg episode reward: [(0, '52.440'), (1, '49.160')] +[2023-10-13 01:10:13,878][46663] Updated weights for policy 1, policy_version 22761 (0.0009) +[2023-10-13 01:10:14,243][46663] Updated weights for policy 1, policy_version 22771 (0.0010) +[2023-10-13 01:10:14,621][46663] Updated weights for policy 1, policy_version 22781 (0.0009) +[2023-10-13 01:10:16,335][46662] Updated weights for policy 0, policy_version 22790 (0.0011) +[2023-10-13 01:10:16,706][46662] Updated weights for policy 0, policy_version 22800 (0.0009) +[2023-10-13 01:10:17,084][46662] Updated weights for policy 0, policy_version 22810 (0.0008) +[2023-10-13 01:10:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46694400. Throughput: 0: 1659.7, 1: 1694.7. Samples: 11682032. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 01:10:18,607][45375] Avg episode reward: [(0, '52.040'), (1, '49.410')] +[2023-10-13 01:10:18,767][46663] Updated weights for policy 1, policy_version 22791 (0.0008) +[2023-10-13 01:10:19,134][46663] Updated weights for policy 1, policy_version 22801 (0.0007) +[2023-10-13 01:10:19,503][46663] Updated weights for policy 1, policy_version 22811 (0.0007) +[2023-10-13 01:10:21,264][46662] Updated weights for policy 0, policy_version 22820 (0.0011) +[2023-10-13 01:10:21,651][46662] Updated weights for policy 0, policy_version 22830 (0.0010) +[2023-10-13 01:10:22,023][46662] Updated weights for policy 0, policy_version 22840 (0.0009) +[2023-10-13 01:10:23,578][46663] Updated weights for policy 1, policy_version 22821 (0.0007) +[2023-10-13 01:10:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 46759936. Throughput: 0: 1665.2, 1: 1686.6. Samples: 11701968. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 01:10:23,607][45375] Avg episode reward: [(0, '51.890'), (1, '48.950')] +[2023-10-13 01:10:23,948][46663] Updated weights for policy 1, policy_version 22831 (0.0008) +[2023-10-13 01:10:24,316][46663] Updated weights for policy 1, policy_version 22841 (0.0008) +[2023-10-13 01:10:26,220][46662] Updated weights for policy 0, policy_version 22850 (0.0009) +[2023-10-13 01:10:26,582][46662] Updated weights for policy 0, policy_version 22860 (0.0008) +[2023-10-13 01:10:26,953][46662] Updated weights for policy 0, policy_version 22870 (0.0007) +[2023-10-13 01:10:27,330][46662] Updated weights for policy 0, policy_version 22880 (0.0007) +[2023-10-13 01:10:28,453][46663] Updated weights for policy 1, policy_version 22851 (0.0008) +[2023-10-13 01:10:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46825472. Throughput: 0: 1672.0, 1: 1689.9. Samples: 11712476. Policy #0 lag: (min: 16.0, avg: 36.2, max: 48.0) +[2023-10-13 01:10:28,607][45375] Avg episode reward: [(0, '50.630'), (1, '50.060')] +[2023-10-13 01:10:28,852][46663] Updated weights for policy 1, policy_version 22861 (0.0009) +[2023-10-13 01:10:29,214][46663] Updated weights for policy 1, policy_version 22871 (0.0008) +[2023-10-13 01:10:31,267][46662] Updated weights for policy 0, policy_version 22890 (0.0007) +[2023-10-13 01:10:31,634][46662] Updated weights for policy 0, policy_version 22900 (0.0008) +[2023-10-13 01:10:32,011][46662] Updated weights for policy 0, policy_version 22910 (0.0007) +[2023-10-13 01:10:33,092][46663] Updated weights for policy 1, policy_version 22881 (0.0008) +[2023-10-13 01:10:33,465][46663] Updated weights for policy 1, policy_version 22891 (0.0009) +[2023-10-13 01:10:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46891008. Throughput: 0: 1660.8, 1: 1685.7. Samples: 11732490. 
Policy #0 lag: (min: 16.0, avg: 36.2, max: 48.0) +[2023-10-13 01:10:33,608][45375] Avg episode reward: [(0, '52.560'), (1, '49.190')] +[2023-10-13 01:10:33,832][46663] Updated weights for policy 1, policy_version 22901 (0.0007) +[2023-10-13 01:10:34,197][46663] Updated weights for policy 1, policy_version 22911 (0.0008) +[2023-10-13 01:10:35,990][46662] Updated weights for policy 0, policy_version 22920 (0.0007) +[2023-10-13 01:10:36,364][46662] Updated weights for policy 0, policy_version 22930 (0.0008) +[2023-10-13 01:10:36,725][46662] Updated weights for policy 0, policy_version 22940 (0.0007) +[2023-10-13 01:10:38,323][46663] Updated weights for policy 1, policy_version 22921 (0.0009) +[2023-10-13 01:10:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46956544. Throughput: 0: 1674.3, 1: 1678.5. Samples: 11752620. Policy #0 lag: (min: 16.0, avg: 36.2, max: 48.0) +[2023-10-13 01:10:38,608][45375] Avg episode reward: [(0, '52.340'), (1, '49.070')] +[2023-10-13 01:10:38,687][46663] Updated weights for policy 1, policy_version 22931 (0.0011) +[2023-10-13 01:10:39,065][46663] Updated weights for policy 1, policy_version 22941 (0.0009) +[2023-10-13 01:10:40,833][46662] Updated weights for policy 0, policy_version 22950 (0.0009) +[2023-10-13 01:10:41,206][46662] Updated weights for policy 0, policy_version 22960 (0.0010) +[2023-10-13 01:10:41,571][46662] Updated weights for policy 0, policy_version 22970 (0.0008) +[2023-10-13 01:10:43,139][46663] Updated weights for policy 1, policy_version 22951 (0.0008) +[2023-10-13 01:10:43,516][46663] Updated weights for policy 1, policy_version 22961 (0.0009) +[2023-10-13 01:10:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47022080. Throughput: 0: 1668.2, 1: 1687.0. Samples: 11763130. 
Policy #0 lag: (min: 16.0, avg: 36.2, max: 48.0) +[2023-10-13 01:10:43,607][45375] Avg episode reward: [(0, '51.920'), (1, '49.530')] +[2023-10-13 01:10:43,897][46663] Updated weights for policy 1, policy_version 22971 (0.0011) +[2023-10-13 01:10:45,599][46662] Updated weights for policy 0, policy_version 22980 (0.0008) +[2023-10-13 01:10:45,963][46662] Updated weights for policy 0, policy_version 22990 (0.0007) +[2023-10-13 01:10:46,338][46662] Updated weights for policy 0, policy_version 23000 (0.0008) +[2023-10-13 01:10:48,126][46663] Updated weights for policy 1, policy_version 22981 (0.0010) +[2023-10-13 01:10:48,494][46663] Updated weights for policy 1, policy_version 22991 (0.0008) +[2023-10-13 01:10:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47087616. Throughput: 0: 1664.5, 1: 1686.1. Samples: 11782930. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-13 01:10:48,607][45375] Avg episode reward: [(0, '51.150'), (1, '49.120')] +[2023-10-13 01:10:48,872][46663] Updated weights for policy 1, policy_version 23001 (0.0008) +[2023-10-13 01:10:50,375][46662] Updated weights for policy 0, policy_version 23010 (0.0010) +[2023-10-13 01:10:50,740][46662] Updated weights for policy 0, policy_version 23020 (0.0009) +[2023-10-13 01:10:51,113][46662] Updated weights for policy 0, policy_version 23030 (0.0010) +[2023-10-13 01:10:51,483][46662] Updated weights for policy 0, policy_version 23040 (0.0008) +[2023-10-13 01:10:52,680][46663] Updated weights for policy 1, policy_version 23011 (0.0009) +[2023-10-13 01:10:53,039][46663] Updated weights for policy 1, policy_version 23021 (0.0010) +[2023-10-13 01:10:53,413][46663] Updated weights for policy 1, policy_version 23031 (0.0011) +[2023-10-13 01:10:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47153152. Throughput: 0: 1689.6, 1: 1672.3. Samples: 11803168. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-13 01:10:53,607][45375] Avg episode reward: [(0, '53.250'), (1, '49.480')] +[2023-10-13 01:10:55,297][46662] Updated weights for policy 0, policy_version 23050 (0.0007) +[2023-10-13 01:10:55,674][46662] Updated weights for policy 0, policy_version 23060 (0.0007) +[2023-10-13 01:10:56,046][46662] Updated weights for policy 0, policy_version 23070 (0.0008) +[2023-10-13 01:10:57,612][46663] Updated weights for policy 1, policy_version 23041 (0.0009) +[2023-10-13 01:10:57,987][46663] Updated weights for policy 1, policy_version 23051 (0.0009) +[2023-10-13 01:10:58,351][46663] Updated weights for policy 1, policy_version 23061 (0.0007) +[2023-10-13 01:10:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47218688. Throughput: 0: 1669.3, 1: 1690.4. Samples: 11813438. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-13 01:10:58,607][45375] Avg episode reward: [(0, '52.320'), (1, '48.640')] +[2023-10-13 01:10:58,723][46663] Updated weights for policy 1, policy_version 23071 (0.0009) +[2023-10-13 01:11:00,035][46662] Updated weights for policy 0, policy_version 23080 (0.0008) +[2023-10-13 01:11:00,403][46662] Updated weights for policy 0, policy_version 23090 (0.0007) +[2023-10-13 01:11:00,781][46662] Updated weights for policy 0, policy_version 23100 (0.0009) +[2023-10-13 01:11:02,738][46663] Updated weights for policy 1, policy_version 23081 (0.0009) +[2023-10-13 01:11:03,105][46663] Updated weights for policy 1, policy_version 23091 (0.0009) +[2023-10-13 01:11:03,478][46663] Updated weights for policy 1, policy_version 23101 (0.0008) +[2023-10-13 01:11:03,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 47316992. Throughput: 0: 1689.3, 1: 1686.0. Samples: 11833922. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-13 01:11:03,608][45375] Avg episode reward: [(0, '51.850'), (1, '48.020')] +[2023-10-13 01:11:04,737][46662] Updated weights for policy 0, policy_version 23110 (0.0007) +[2023-10-13 01:11:05,116][46662] Updated weights for policy 0, policy_version 23120 (0.0007) +[2023-10-13 01:11:05,490][46662] Updated weights for policy 0, policy_version 23130 (0.0007) +[2023-10-13 01:11:07,390][46663] Updated weights for policy 1, policy_version 23111 (0.0010) +[2023-10-13 01:11:07,754][46663] Updated weights for policy 1, policy_version 23121 (0.0011) +[2023-10-13 01:11:08,127][46663] Updated weights for policy 1, policy_version 23131 (0.0008) +[2023-10-13 01:11:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 47382528. Throughput: 0: 1707.3, 1: 1660.3. Samples: 11853510. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:11:08,607][45375] Avg episode reward: [(0, '52.350'), (1, '49.080')] +[2023-10-13 01:11:09,556][46662] Updated weights for policy 0, policy_version 23140 (0.0008) +[2023-10-13 01:11:09,949][46662] Updated weights for policy 0, policy_version 23150 (0.0008) +[2023-10-13 01:11:10,318][46662] Updated weights for policy 0, policy_version 23160 (0.0009) +[2023-10-13 01:11:12,242][46663] Updated weights for policy 1, policy_version 23141 (0.0008) +[2023-10-13 01:11:12,606][46663] Updated weights for policy 1, policy_version 23151 (0.0008) +[2023-10-13 01:11:12,987][46663] Updated weights for policy 1, policy_version 23161 (0.0009) +[2023-10-13 01:11:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47448064. Throughput: 0: 1672.1, 1: 1689.8. Samples: 11863762. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:11:13,607][45375] Avg episode reward: [(0, '50.940'), (1, '48.110')] +[2023-10-13 01:11:14,397][46662] Updated weights for policy 0, policy_version 23170 (0.0009) +[2023-10-13 01:11:14,776][46662] Updated weights for policy 0, policy_version 23180 (0.0008) +[2023-10-13 01:11:15,149][46662] Updated weights for policy 0, policy_version 23190 (0.0008) +[2023-10-13 01:11:15,519][46662] Updated weights for policy 0, policy_version 23200 (0.0007) +[2023-10-13 01:11:17,176][46663] Updated weights for policy 1, policy_version 23171 (0.0008) +[2023-10-13 01:11:17,581][46663] Updated weights for policy 1, policy_version 23181 (0.0009) +[2023-10-13 01:11:17,948][46663] Updated weights for policy 1, policy_version 23191 (0.0008) +[2023-10-13 01:11:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47513600. Throughput: 0: 1688.9, 1: 1676.6. Samples: 11883940. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:11:18,608][45375] Avg episode reward: [(0, '50.130'), (1, '47.140')] +[2023-10-13 01:11:19,539][46662] Updated weights for policy 0, policy_version 23210 (0.0009) +[2023-10-13 01:11:19,919][46662] Updated weights for policy 0, policy_version 23220 (0.0008) +[2023-10-13 01:11:20,291][46662] Updated weights for policy 0, policy_version 23230 (0.0007) +[2023-10-13 01:11:22,036][46663] Updated weights for policy 1, policy_version 23201 (0.0008) +[2023-10-13 01:11:22,401][46663] Updated weights for policy 1, policy_version 23211 (0.0008) +[2023-10-13 01:11:22,767][46663] Updated weights for policy 1, policy_version 23221 (0.0009) +[2023-10-13 01:11:23,128][46663] Updated weights for policy 1, policy_version 23231 (0.0008) +[2023-10-13 01:11:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47579136. Throughput: 0: 1693.9, 1: 1661.6. Samples: 11903616. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:11:23,608][45375] Avg episode reward: [(0, '49.610'), (1, '46.680')] +[2023-10-13 01:11:24,448][46662] Updated weights for policy 0, policy_version 23240 (0.0010) +[2023-10-13 01:11:24,821][46662] Updated weights for policy 0, policy_version 23250 (0.0008) +[2023-10-13 01:11:25,190][46662] Updated weights for policy 0, policy_version 23260 (0.0008) +[2023-10-13 01:11:27,229][46663] Updated weights for policy 1, policy_version 23241 (0.0008) +[2023-10-13 01:11:27,593][46663] Updated weights for policy 1, policy_version 23251 (0.0010) +[2023-10-13 01:11:27,964][46663] Updated weights for policy 1, policy_version 23261 (0.0007) +[2023-10-13 01:11:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47644672. Throughput: 0: 1668.8, 1: 1682.4. Samples: 11913934. Policy #0 lag: (min: 0.0, avg: 19.6, max: 32.0) +[2023-10-13 01:11:28,607][45375] Avg episode reward: [(0, '49.520'), (1, '45.100')] +[2023-10-13 01:11:29,274][46662] Updated weights for policy 0, policy_version 23270 (0.0011) +[2023-10-13 01:11:29,638][46662] Updated weights for policy 0, policy_version 23280 (0.0010) +[2023-10-13 01:11:30,014][46662] Updated weights for policy 0, policy_version 23290 (0.0010) +[2023-10-13 01:11:31,845][46663] Updated weights for policy 1, policy_version 23271 (0.0009) +[2023-10-13 01:11:32,215][46663] Updated weights for policy 1, policy_version 23281 (0.0011) +[2023-10-13 01:11:32,578][46663] Updated weights for policy 1, policy_version 23291 (0.0009) +[2023-10-13 01:11:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47710208. Throughput: 0: 1692.4, 1: 1665.9. Samples: 11934054. 
Policy #0 lag: (min: 0.0, avg: 19.6, max: 32.0) +[2023-10-13 01:11:33,608][45375] Avg episode reward: [(0, '48.910'), (1, '46.210')] +[2023-10-13 01:11:34,037][46662] Updated weights for policy 0, policy_version 23300 (0.0009) +[2023-10-13 01:11:34,419][46662] Updated weights for policy 0, policy_version 23310 (0.0007) +[2023-10-13 01:11:34,795][46662] Updated weights for policy 0, policy_version 23320 (0.0007) +[2023-10-13 01:11:36,591][46663] Updated weights for policy 1, policy_version 23301 (0.0008) +[2023-10-13 01:11:36,963][46663] Updated weights for policy 1, policy_version 23311 (0.0007) +[2023-10-13 01:11:37,336][46663] Updated weights for policy 1, policy_version 23321 (0.0007) +[2023-10-13 01:11:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 47775744. Throughput: 0: 1691.3, 1: 1670.3. Samples: 11954438. Policy #0 lag: (min: 0.0, avg: 19.6, max: 32.0) +[2023-10-13 01:11:38,607][45375] Avg episode reward: [(0, '48.320'), (1, '47.520')] +[2023-10-13 01:11:38,809][46662] Updated weights for policy 0, policy_version 23330 (0.0008) +[2023-10-13 01:11:39,180][46662] Updated weights for policy 0, policy_version 23340 (0.0011) +[2023-10-13 01:11:39,554][46662] Updated weights for policy 0, policy_version 23350 (0.0010) +[2023-10-13 01:11:39,930][46662] Updated weights for policy 0, policy_version 23360 (0.0010) +[2023-10-13 01:11:41,448][46663] Updated weights for policy 1, policy_version 23331 (0.0009) +[2023-10-13 01:11:41,825][46663] Updated weights for policy 1, policy_version 23341 (0.0010) +[2023-10-13 01:11:42,197][46663] Updated weights for policy 1, policy_version 23351 (0.0009) +[2023-10-13 01:11:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 47841280. Throughput: 0: 1680.4, 1: 1683.1. Samples: 11964794. 
Policy #0 lag: (min: 0.0, avg: 19.6, max: 32.0) +[2023-10-13 01:11:43,607][45375] Avg episode reward: [(0, '48.700'), (1, '46.470')] +[2023-10-13 01:11:43,917][46662] Updated weights for policy 0, policy_version 23370 (0.0007) +[2023-10-13 01:11:44,299][46662] Updated weights for policy 0, policy_version 23380 (0.0007) +[2023-10-13 01:11:44,676][46662] Updated weights for policy 0, policy_version 23390 (0.0010) +[2023-10-13 01:11:46,516][46663] Updated weights for policy 1, policy_version 23361 (0.0009) +[2023-10-13 01:11:46,888][46663] Updated weights for policy 1, policy_version 23371 (0.0009) +[2023-10-13 01:11:47,244][46663] Updated weights for policy 1, policy_version 23381 (0.0008) +[2023-10-13 01:11:47,608][46663] Updated weights for policy 1, policy_version 23391 (0.0010) +[2023-10-13 01:11:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47906816. Throughput: 0: 1685.3, 1: 1664.1. Samples: 11984646. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:11:48,607][45375] Avg episode reward: [(0, '48.780'), (1, '47.240')] +[2023-10-13 01:11:48,797][46662] Updated weights for policy 0, policy_version 23400 (0.0008) +[2023-10-13 01:11:49,163][46662] Updated weights for policy 0, policy_version 23410 (0.0008) +[2023-10-13 01:11:49,537][46662] Updated weights for policy 0, policy_version 23420 (0.0009) +[2023-10-13 01:11:51,653][46663] Updated weights for policy 1, policy_version 23401 (0.0009) +[2023-10-13 01:11:52,020][46663] Updated weights for policy 1, policy_version 23411 (0.0008) +[2023-10-13 01:11:52,390][46663] Updated weights for policy 1, policy_version 23421 (0.0009) +[2023-10-13 01:11:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47972352. Throughput: 0: 1683.1, 1: 1681.0. Samples: 12004896. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:11:53,607][45375] Avg episode reward: [(0, '47.190'), (1, '47.780')] +[2023-10-13 01:11:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000023424_23986176.pth... +[2023-10-13 01:11:53,658][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000021856_22380544.pth +[2023-10-13 01:11:53,664][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000023424_23986176.pth +[2023-10-13 01:11:53,741][46662] Updated weights for policy 0, policy_version 23430 (0.0009) +[2023-10-13 01:11:54,113][46662] Updated weights for policy 0, policy_version 23440 (0.0007) +[2023-10-13 01:11:54,478][46662] Updated weights for policy 0, policy_version 23450 (0.0010) +[2023-10-13 01:11:54,700][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000023456_24018944.pth... +[2023-10-13 01:11:54,738][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000021856_22380544.pth +[2023-10-13 01:11:54,744][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000023456_24018944.pth +[2023-10-13 01:11:56,520][46663] Updated weights for policy 1, policy_version 23431 (0.0009) +[2023-10-13 01:11:56,886][46663] Updated weights for policy 1, policy_version 23441 (0.0010) +[2023-10-13 01:11:57,255][46663] Updated weights for policy 1, policy_version 23451 (0.0008) +[2023-10-13 01:11:58,317][46662] Updated weights for policy 0, policy_version 23460 (0.0008) +[2023-10-13 01:11:58,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48037888. Throughput: 0: 1690.4, 1: 1677.5. Samples: 12015320. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:11:58,608][45375] Avg episode reward: [(0, '46.730'), (1, '48.110')] +[2023-10-13 01:11:58,707][46662] Updated weights for policy 0, policy_version 23470 (0.0008) +[2023-10-13 01:11:59,086][46662] Updated weights for policy 0, policy_version 23480 (0.0010) +[2023-10-13 01:12:01,303][46663] Updated weights for policy 1, policy_version 23461 (0.0008) +[2023-10-13 01:12:01,683][46663] Updated weights for policy 1, policy_version 23471 (0.0009) +[2023-10-13 01:12:02,053][46663] Updated weights for policy 1, policy_version 23481 (0.0008) +[2023-10-13 01:12:03,168][46662] Updated weights for policy 0, policy_version 23490 (0.0009) +[2023-10-13 01:12:03,538][46662] Updated weights for policy 0, policy_version 23500 (0.0007) +[2023-10-13 01:12:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48103424. Throughput: 0: 1688.3, 1: 1660.0. Samples: 12034612. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-13 01:12:03,608][45375] Avg episode reward: [(0, '45.420'), (1, '48.560')] +[2023-10-13 01:12:03,899][46662] Updated weights for policy 0, policy_version 23510 (0.0007) +[2023-10-13 01:12:04,267][46662] Updated weights for policy 0, policy_version 23520 (0.0008) +[2023-10-13 01:12:05,882][46663] Updated weights for policy 1, policy_version 23491 (0.0009) +[2023-10-13 01:12:06,249][46663] Updated weights for policy 1, policy_version 23501 (0.0009) +[2023-10-13 01:12:06,621][46663] Updated weights for policy 1, policy_version 23511 (0.0007) +[2023-10-13 01:12:08,253][46662] Updated weights for policy 0, policy_version 23530 (0.0011) +[2023-10-13 01:12:08,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48168960. Throughput: 0: 1692.4, 1: 1685.7. Samples: 12055628. 
Policy #0 lag: (min: 8.0, avg: 31.9, max: 40.0) +[2023-10-13 01:12:08,607][45375] Avg episode reward: [(0, '46.500'), (1, '47.290')] +[2023-10-13 01:12:08,616][46662] Updated weights for policy 0, policy_version 23540 (0.0008) +[2023-10-13 01:12:08,989][46662] Updated weights for policy 0, policy_version 23550 (0.0008) +[2023-10-13 01:12:10,788][46663] Updated weights for policy 1, policy_version 23521 (0.0007) +[2023-10-13 01:12:11,153][46663] Updated weights for policy 1, policy_version 23531 (0.0010) +[2023-10-13 01:12:11,537][46663] Updated weights for policy 1, policy_version 23541 (0.0009) +[2023-10-13 01:12:11,914][46663] Updated weights for policy 1, policy_version 23551 (0.0009) +[2023-10-13 01:12:13,123][46662] Updated weights for policy 0, policy_version 23560 (0.0008) +[2023-10-13 01:12:13,491][46662] Updated weights for policy 0, policy_version 23570 (0.0007) +[2023-10-13 01:12:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48234496. Throughput: 0: 1693.0, 1: 1673.6. Samples: 12065430. Policy #0 lag: (min: 8.0, avg: 31.9, max: 40.0) +[2023-10-13 01:12:13,608][45375] Avg episode reward: [(0, '47.100'), (1, '46.520')] +[2023-10-13 01:12:13,863][46662] Updated weights for policy 0, policy_version 23580 (0.0007) +[2023-10-13 01:12:15,854][46663] Updated weights for policy 1, policy_version 23561 (0.0009) +[2023-10-13 01:12:16,224][46663] Updated weights for policy 1, policy_version 23571 (0.0008) +[2023-10-13 01:12:16,584][46663] Updated weights for policy 1, policy_version 23581 (0.0008) +[2023-10-13 01:12:17,909][46662] Updated weights for policy 0, policy_version 23590 (0.0007) +[2023-10-13 01:12:18,277][46662] Updated weights for policy 0, policy_version 23600 (0.0008) +[2023-10-13 01:12:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48300032. Throughput: 0: 1697.5, 1: 1673.8. Samples: 12085762. 
Policy #0 lag: (min: 8.0, avg: 31.9, max: 40.0) +[2023-10-13 01:12:18,607][45375] Avg episode reward: [(0, '48.490'), (1, '45.330')] +[2023-10-13 01:12:18,653][46662] Updated weights for policy 0, policy_version 23610 (0.0008) +[2023-10-13 01:12:20,811][46663] Updated weights for policy 1, policy_version 23591 (0.0010) +[2023-10-13 01:12:21,189][46663] Updated weights for policy 1, policy_version 23601 (0.0008) +[2023-10-13 01:12:21,568][46663] Updated weights for policy 1, policy_version 23611 (0.0009) +[2023-10-13 01:12:22,773][46662] Updated weights for policy 0, policy_version 23620 (0.0007) +[2023-10-13 01:12:23,139][46662] Updated weights for policy 0, policy_version 23630 (0.0007) +[2023-10-13 01:12:23,518][46662] Updated weights for policy 0, policy_version 23640 (0.0008) +[2023-10-13 01:12:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48365568. Throughput: 0: 1686.0, 1: 1684.9. Samples: 12106126. Policy #0 lag: (min: 8.0, avg: 31.9, max: 40.0) +[2023-10-13 01:12:23,607][45375] Avg episode reward: [(0, '48.730'), (1, '46.550')] +[2023-10-13 01:12:25,542][46663] Updated weights for policy 1, policy_version 23621 (0.0008) +[2023-10-13 01:12:25,907][46663] Updated weights for policy 1, policy_version 23631 (0.0008) +[2023-10-13 01:12:26,279][46663] Updated weights for policy 1, policy_version 23641 (0.0009) +[2023-10-13 01:12:27,643][46662] Updated weights for policy 0, policy_version 23650 (0.0011) +[2023-10-13 01:12:28,021][46662] Updated weights for policy 0, policy_version 23660 (0.0009) +[2023-10-13 01:12:28,386][46662] Updated weights for policy 0, policy_version 23670 (0.0010) +[2023-10-13 01:12:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48431104. Throughput: 0: 1691.1, 1: 1662.5. Samples: 12115704. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:28,607][45375] Avg episode reward: [(0, '49.300'), (1, '45.670')] +[2023-10-13 01:12:28,761][46662] Updated weights for policy 0, policy_version 23680 (0.0009) +[2023-10-13 01:12:30,297][46663] Updated weights for policy 1, policy_version 23651 (0.0009) +[2023-10-13 01:12:30,667][46663] Updated weights for policy 1, policy_version 23661 (0.0007) +[2023-10-13 01:12:31,031][46663] Updated weights for policy 1, policy_version 23671 (0.0008) +[2023-10-13 01:12:32,864][46662] Updated weights for policy 0, policy_version 23690 (0.0008) +[2023-10-13 01:12:33,227][46662] Updated weights for policy 0, policy_version 23700 (0.0009) +[2023-10-13 01:12:33,600][46662] Updated weights for policy 0, policy_version 23710 (0.0008) +[2023-10-13 01:12:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 48496640. Throughput: 0: 1688.0, 1: 1679.6. Samples: 12136184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:33,607][45375] Avg episode reward: [(0, '48.780'), (1, '44.970')] +[2023-10-13 01:12:35,008][46663] Updated weights for policy 1, policy_version 23681 (0.0010) +[2023-10-13 01:12:35,371][46663] Updated weights for policy 1, policy_version 23691 (0.0007) +[2023-10-13 01:12:35,742][46663] Updated weights for policy 1, policy_version 23701 (0.0007) +[2023-10-13 01:12:36,110][46663] Updated weights for policy 1, policy_version 23711 (0.0008) +[2023-10-13 01:12:37,679][46662] Updated weights for policy 0, policy_version 23720 (0.0007) +[2023-10-13 01:12:38,045][46662] Updated weights for policy 0, policy_version 23730 (0.0007) +[2023-10-13 01:12:38,412][46662] Updated weights for policy 0, policy_version 23740 (0.0007) +[2023-10-13 01:12:38,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48594944. Throughput: 0: 1678.0, 1: 1698.3. Samples: 12156828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:38,607][45375] Avg episode reward: [(0, '48.470'), (1, '44.770')] +[2023-10-13 01:12:40,177][46663] Updated weights for policy 1, policy_version 23721 (0.0007) +[2023-10-13 01:12:40,546][46663] Updated weights for policy 1, policy_version 23731 (0.0009) +[2023-10-13 01:12:40,915][46663] Updated weights for policy 1, policy_version 23741 (0.0009) +[2023-10-13 01:12:42,392][46662] Updated weights for policy 0, policy_version 23750 (0.0009) +[2023-10-13 01:12:42,757][46662] Updated weights for policy 0, policy_version 23760 (0.0008) +[2023-10-13 01:12:43,130][46662] Updated weights for policy 0, policy_version 23770 (0.0007) +[2023-10-13 01:12:43,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48660480. Throughput: 0: 1686.5, 1: 1668.0. Samples: 12166274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:43,608][45375] Avg episode reward: [(0, '47.290'), (1, '44.780')] +[2023-10-13 01:12:45,039][46663] Updated weights for policy 1, policy_version 23751 (0.0011) +[2023-10-13 01:12:45,400][46663] Updated weights for policy 1, policy_version 23761 (0.0010) +[2023-10-13 01:12:45,773][46663] Updated weights for policy 1, policy_version 23771 (0.0009) +[2023-10-13 01:12:47,264][46662] Updated weights for policy 0, policy_version 23780 (0.0007) +[2023-10-13 01:12:47,654][46662] Updated weights for policy 0, policy_version 23790 (0.0009) +[2023-10-13 01:12:48,027][46662] Updated weights for policy 0, policy_version 23800 (0.0008) +[2023-10-13 01:12:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48726016. Throughput: 0: 1689.0, 1: 1695.4. Samples: 12186912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:48,608][45375] Avg episode reward: [(0, '47.170'), (1, '44.520')] +[2023-10-13 01:12:49,993][46663] Updated weights for policy 1, policy_version 23781 (0.0009) +[2023-10-13 01:12:50,388][46663] Updated weights for policy 1, policy_version 23791 (0.0009) +[2023-10-13 01:12:50,759][46663] Updated weights for policy 1, policy_version 23801 (0.0010) +[2023-10-13 01:12:52,033][46662] Updated weights for policy 0, policy_version 23810 (0.0011) +[2023-10-13 01:12:52,411][46662] Updated weights for policy 0, policy_version 23820 (0.0007) +[2023-10-13 01:12:52,778][46662] Updated weights for policy 0, policy_version 23830 (0.0009) +[2023-10-13 01:12:53,149][46662] Updated weights for policy 0, policy_version 23840 (0.0010) +[2023-10-13 01:12:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48791552. Throughput: 0: 1660.1, 1: 1691.1. Samples: 12206432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:53,607][45375] Avg episode reward: [(0, '47.450'), (1, '44.410')] +[2023-10-13 01:12:54,916][46663] Updated weights for policy 1, policy_version 23811 (0.0010) +[2023-10-13 01:12:55,294][46663] Updated weights for policy 1, policy_version 23821 (0.0007) +[2023-10-13 01:12:55,659][46663] Updated weights for policy 1, policy_version 23831 (0.0010) +[2023-10-13 01:12:57,120][46662] Updated weights for policy 0, policy_version 23850 (0.0007) +[2023-10-13 01:12:57,486][46662] Updated weights for policy 0, policy_version 23860 (0.0008) +[2023-10-13 01:12:57,858][46662] Updated weights for policy 0, policy_version 23870 (0.0011) +[2023-10-13 01:12:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48857088. Throughput: 0: 1682.4, 1: 1669.4. Samples: 12216262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:12:58,607][45375] Avg episode reward: [(0, '48.040'), (1, '45.230')] +[2023-10-13 01:12:59,765][46663] Updated weights for policy 1, policy_version 23841 (0.0010) +[2023-10-13 01:13:00,128][46663] Updated weights for policy 1, policy_version 23851 (0.0009) +[2023-10-13 01:13:00,499][46663] Updated weights for policy 1, policy_version 23861 (0.0008) +[2023-10-13 01:13:00,855][46663] Updated weights for policy 1, policy_version 23871 (0.0010) +[2023-10-13 01:13:01,855][46662] Updated weights for policy 0, policy_version 23880 (0.0009) +[2023-10-13 01:13:02,236][46662] Updated weights for policy 0, policy_version 23890 (0.0009) +[2023-10-13 01:13:02,611][46662] Updated weights for policy 0, policy_version 23900 (0.0010) +[2023-10-13 01:13:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48922624. Throughput: 0: 1675.1, 1: 1682.8. Samples: 12236868. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:13:03,607][45375] Avg episode reward: [(0, '48.400'), (1, '43.570')] +[2023-10-13 01:13:04,835][46663] Updated weights for policy 1, policy_version 23881 (0.0011) +[2023-10-13 01:13:05,203][46663] Updated weights for policy 1, policy_version 23891 (0.0010) +[2023-10-13 01:13:05,572][46663] Updated weights for policy 1, policy_version 23901 (0.0011) +[2023-10-13 01:13:06,612][46662] Updated weights for policy 0, policy_version 23910 (0.0010) +[2023-10-13 01:13:06,978][46662] Updated weights for policy 0, policy_version 23920 (0.0007) +[2023-10-13 01:13:07,351][46662] Updated weights for policy 0, policy_version 23930 (0.0011) +[2023-10-13 01:13:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48988160. Throughput: 0: 1656.3, 1: 1684.6. Samples: 12256470. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:13:08,608][45375] Avg episode reward: [(0, '47.800'), (1, '43.580')] +[2023-10-13 01:13:09,619][46663] Updated weights for policy 1, policy_version 23911 (0.0007) +[2023-10-13 01:13:09,985][46663] Updated weights for policy 1, policy_version 23921 (0.0009) +[2023-10-13 01:13:10,352][46663] Updated weights for policy 1, policy_version 23931 (0.0009) +[2023-10-13 01:13:11,544][46662] Updated weights for policy 0, policy_version 23940 (0.0007) +[2023-10-13 01:13:11,922][46662] Updated weights for policy 0, policy_version 23950 (0.0007) +[2023-10-13 01:13:12,297][46662] Updated weights for policy 0, policy_version 23960 (0.0008) +[2023-10-13 01:13:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49053696. Throughput: 0: 1680.3, 1: 1677.4. Samples: 12266800. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:13:13,608][45375] Avg episode reward: [(0, '47.510'), (1, '44.180')] +[2023-10-13 01:13:14,325][46663] Updated weights for policy 1, policy_version 23941 (0.0007) +[2023-10-13 01:13:14,700][46663] Updated weights for policy 1, policy_version 23951 (0.0007) +[2023-10-13 01:13:15,059][46663] Updated weights for policy 1, policy_version 23961 (0.0008) +[2023-10-13 01:13:16,342][46662] Updated weights for policy 0, policy_version 23970 (0.0008) +[2023-10-13 01:13:16,713][46662] Updated weights for policy 0, policy_version 23980 (0.0011) +[2023-10-13 01:13:17,082][46662] Updated weights for policy 0, policy_version 23990 (0.0010) +[2023-10-13 01:13:17,459][46662] Updated weights for policy 0, policy_version 24000 (0.0010) +[2023-10-13 01:13:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 49119232. Throughput: 0: 1666.0, 1: 1686.4. Samples: 12287044. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 01:13:18,607][45375] Avg episode reward: [(0, '45.990'), (1, '45.830')] +[2023-10-13 01:13:19,039][46663] Updated weights for policy 1, policy_version 23971 (0.0011) +[2023-10-13 01:13:19,406][46663] Updated weights for policy 1, policy_version 23981 (0.0011) +[2023-10-13 01:13:19,775][46663] Updated weights for policy 1, policy_version 23991 (0.0009) +[2023-10-13 01:13:21,755][46662] Updated weights for policy 0, policy_version 24010 (0.0007) +[2023-10-13 01:13:22,130][46662] Updated weights for policy 0, policy_version 24020 (0.0009) +[2023-10-13 01:13:22,493][46662] Updated weights for policy 0, policy_version 24030 (0.0011) +[2023-10-13 01:13:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49184768. Throughput: 0: 1656.9, 1: 1678.9. Samples: 12306942. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:13:23,608][45375] Avg episode reward: [(0, '45.810'), (1, '44.770')] +[2023-10-13 01:13:23,872][46663] Updated weights for policy 1, policy_version 24001 (0.0009) +[2023-10-13 01:13:24,244][46663] Updated weights for policy 1, policy_version 24011 (0.0007) +[2023-10-13 01:13:24,608][46663] Updated weights for policy 1, policy_version 24021 (0.0007) +[2023-10-13 01:13:24,975][46663] Updated weights for policy 1, policy_version 24031 (0.0008) +[2023-10-13 01:13:26,456][46662] Updated weights for policy 0, policy_version 24040 (0.0008) +[2023-10-13 01:13:26,833][46662] Updated weights for policy 0, policy_version 24050 (0.0007) +[2023-10-13 01:13:27,195][46662] Updated weights for policy 0, policy_version 24060 (0.0008) +[2023-10-13 01:13:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49250304. Throughput: 0: 1680.6, 1: 1681.5. Samples: 12317566. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:13:28,607][45375] Avg episode reward: [(0, '46.860'), (1, '44.810')] +[2023-10-13 01:13:28,983][46663] Updated weights for policy 1, policy_version 24041 (0.0009) +[2023-10-13 01:13:29,356][46663] Updated weights for policy 1, policy_version 24051 (0.0009) +[2023-10-13 01:13:29,730][46663] Updated weights for policy 1, policy_version 24061 (0.0011) +[2023-10-13 01:13:31,216][46662] Updated weights for policy 0, policy_version 24070 (0.0009) +[2023-10-13 01:13:31,586][46662] Updated weights for policy 0, policy_version 24080 (0.0008) +[2023-10-13 01:13:31,955][46662] Updated weights for policy 0, policy_version 24090 (0.0007) +[2023-10-13 01:13:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49315840. Throughput: 0: 1659.9, 1: 1681.2. Samples: 12337262. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:13:33,607][45375] Avg episode reward: [(0, '46.190'), (1, '43.740')] +[2023-10-13 01:13:33,836][46663] Updated weights for policy 1, policy_version 24071 (0.0009) +[2023-10-13 01:13:34,200][46663] Updated weights for policy 1, policy_version 24081 (0.0007) +[2023-10-13 01:13:34,580][46663] Updated weights for policy 1, policy_version 24091 (0.0008) +[2023-10-13 01:13:36,189][46662] Updated weights for policy 0, policy_version 24100 (0.0009) +[2023-10-13 01:13:36,573][46662] Updated weights for policy 0, policy_version 24110 (0.0011) +[2023-10-13 01:13:36,934][46662] Updated weights for policy 0, policy_version 24120 (0.0011) +[2023-10-13 01:13:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49381376. Throughput: 0: 1671.7, 1: 1683.7. Samples: 12357424. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:13:38,607][45375] Avg episode reward: [(0, '47.130'), (1, '42.820')] +[2023-10-13 01:13:38,739][46663] Updated weights for policy 1, policy_version 24101 (0.0009) +[2023-10-13 01:13:39,112][46663] Updated weights for policy 1, policy_version 24111 (0.0009) +[2023-10-13 01:13:39,475][46663] Updated weights for policy 1, policy_version 24121 (0.0010) +[2023-10-13 01:13:40,887][46662] Updated weights for policy 0, policy_version 24130 (0.0009) +[2023-10-13 01:13:41,256][46662] Updated weights for policy 0, policy_version 24140 (0.0007) +[2023-10-13 01:13:41,630][46662] Updated weights for policy 0, policy_version 24150 (0.0008) +[2023-10-13 01:13:41,998][46662] Updated weights for policy 0, policy_version 24160 (0.0009) +[2023-10-13 01:13:43,441][46663] Updated weights for policy 1, policy_version 24131 (0.0008) +[2023-10-13 01:13:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49446912. Throughput: 0: 1682.0, 1: 1685.8. Samples: 12367812. Policy #0 lag: (min: 5.0, avg: 7.5, max: 37.0) +[2023-10-13 01:13:43,607][45375] Avg episode reward: [(0, '46.920'), (1, '42.890')] +[2023-10-13 01:13:43,803][46663] Updated weights for policy 1, policy_version 24141 (0.0009) +[2023-10-13 01:13:44,181][46663] Updated weights for policy 1, policy_version 24151 (0.0007) +[2023-10-13 01:13:46,030][46662] Updated weights for policy 0, policy_version 24170 (0.0007) +[2023-10-13 01:13:46,410][46662] Updated weights for policy 0, policy_version 24180 (0.0007) +[2023-10-13 01:13:46,780][46662] Updated weights for policy 0, policy_version 24190 (0.0008) +[2023-10-13 01:13:48,449][46663] Updated weights for policy 1, policy_version 24161 (0.0009) +[2023-10-13 01:13:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49512448. Throughput: 0: 1657.5, 1: 1688.2. Samples: 12387424. 
Policy #0 lag: (min: 5.0, avg: 7.5, max: 37.0) +[2023-10-13 01:13:48,607][45375] Avg episode reward: [(0, '48.500'), (1, '42.170')] +[2023-10-13 01:13:48,809][46663] Updated weights for policy 1, policy_version 24171 (0.0009) +[2023-10-13 01:13:49,188][46663] Updated weights for policy 1, policy_version 24181 (0.0009) +[2023-10-13 01:13:49,547][46663] Updated weights for policy 1, policy_version 24191 (0.0009) +[2023-10-13 01:13:50,829][46662] Updated weights for policy 0, policy_version 24200 (0.0009) +[2023-10-13 01:13:51,204][46662] Updated weights for policy 0, policy_version 24210 (0.0009) +[2023-10-13 01:13:51,576][46662] Updated weights for policy 0, policy_version 24220 (0.0008) +[2023-10-13 01:13:53,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49577984. Throughput: 0: 1680.7, 1: 1685.9. Samples: 12407966. Policy #0 lag: (min: 5.0, avg: 7.5, max: 37.0) +[2023-10-13 01:13:53,608][45375] Avg episode reward: [(0, '50.400'), (1, '41.480')] +[2023-10-13 01:13:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000024224_24805376.pth... +[2023-10-13 01:13:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000022656_23199744.pth +[2023-10-13 01:13:53,725][46663] Updated weights for policy 1, policy_version 24201 (0.0011) +[2023-10-13 01:13:54,094][46663] Updated weights for policy 1, policy_version 24211 (0.0009) +[2023-10-13 01:13:54,468][46663] Updated weights for policy 1, policy_version 24221 (0.0008) +[2023-10-13 01:13:54,569][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000024224_24805376.pth... 
+[2023-10-13 01:13:54,609][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000022624_23166976.pth +[2023-10-13 01:13:55,610][46662] Updated weights for policy 0, policy_version 24230 (0.0008) +[2023-10-13 01:13:55,978][46662] Updated weights for policy 0, policy_version 24240 (0.0008) +[2023-10-13 01:13:56,353][46662] Updated weights for policy 0, policy_version 24250 (0.0009) +[2023-10-13 01:13:58,577][46663] Updated weights for policy 1, policy_version 24231 (0.0008) +[2023-10-13 01:13:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49643520. Throughput: 0: 1674.8, 1: 1686.1. Samples: 12418040. Policy #0 lag: (min: 5.0, avg: 7.5, max: 37.0) +[2023-10-13 01:13:58,607][45375] Avg episode reward: [(0, '49.370'), (1, '43.210')] +[2023-10-13 01:13:58,935][46663] Updated weights for policy 1, policy_version 24241 (0.0008) +[2023-10-13 01:13:59,301][46663] Updated weights for policy 1, policy_version 24251 (0.0007) +[2023-10-13 01:14:00,452][46662] Updated weights for policy 0, policy_version 24260 (0.0009) +[2023-10-13 01:14:00,812][46662] Updated weights for policy 0, policy_version 24270 (0.0008) +[2023-10-13 01:14:01,181][46662] Updated weights for policy 0, policy_version 24280 (0.0011) +[2023-10-13 01:14:03,453][46663] Updated weights for policy 1, policy_version 24261 (0.0008) +[2023-10-13 01:14:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49709056. Throughput: 0: 1668.6, 1: 1681.5. Samples: 12437798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:14:03,608][45375] Avg episode reward: [(0, '49.450'), (1, '43.230')] +[2023-10-13 01:14:03,826][46663] Updated weights for policy 1, policy_version 24271 (0.0008) +[2023-10-13 01:14:04,196][46663] Updated weights for policy 1, policy_version 24281 (0.0009) +[2023-10-13 01:14:05,299][46662] Updated weights for policy 0, policy_version 24290 (0.0010) +[2023-10-13 01:14:05,663][46662] Updated weights for policy 0, policy_version 24300 (0.0007) +[2023-10-13 01:14:06,028][46662] Updated weights for policy 0, policy_version 24310 (0.0009) +[2023-10-13 01:14:06,407][46662] Updated weights for policy 0, policy_version 24320 (0.0008) +[2023-10-13 01:14:08,298][46663] Updated weights for policy 1, policy_version 24291 (0.0009) +[2023-10-13 01:14:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49774592. Throughput: 0: 1687.4, 1: 1677.3. Samples: 12458354. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:14:08,607][45375] Avg episode reward: [(0, '49.740'), (1, '44.340')] +[2023-10-13 01:14:08,675][46663] Updated weights for policy 1, policy_version 24301 (0.0011) +[2023-10-13 01:14:09,053][46663] Updated weights for policy 1, policy_version 24311 (0.0011) +[2023-10-13 01:14:10,548][46662] Updated weights for policy 0, policy_version 24330 (0.0010) +[2023-10-13 01:14:10,923][46662] Updated weights for policy 0, policy_version 24340 (0.0009) +[2023-10-13 01:14:11,295][46662] Updated weights for policy 0, policy_version 24350 (0.0009) +[2023-10-13 01:14:12,822][46663] Updated weights for policy 1, policy_version 24321 (0.0011) +[2023-10-13 01:14:13,186][46663] Updated weights for policy 1, policy_version 24331 (0.0007) +[2023-10-13 01:14:13,560][46663] Updated weights for policy 1, policy_version 24341 (0.0008) +[2023-10-13 01:14:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49840128. 
Throughput: 0: 1668.9, 1: 1683.5. Samples: 12468426. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:14:13,607][45375] Avg episode reward: [(0, '50.340'), (1, '45.020')] +[2023-10-13 01:14:13,930][46663] Updated weights for policy 1, policy_version 24351 (0.0009) +[2023-10-13 01:14:15,175][46662] Updated weights for policy 0, policy_version 24360 (0.0009) +[2023-10-13 01:14:15,537][46662] Updated weights for policy 0, policy_version 24370 (0.0008) +[2023-10-13 01:14:15,913][46662] Updated weights for policy 0, policy_version 24380 (0.0010) +[2023-10-13 01:14:17,844][46663] Updated weights for policy 1, policy_version 24361 (0.0007) +[2023-10-13 01:14:18,213][46663] Updated weights for policy 1, policy_version 24371 (0.0007) +[2023-10-13 01:14:18,584][46663] Updated weights for policy 1, policy_version 24381 (0.0010) +[2023-10-13 01:14:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49905664. Throughput: 0: 1676.9, 1: 1688.2. Samples: 12488690. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:14:18,607][45375] Avg episode reward: [(0, '49.760'), (1, '45.620')] +[2023-10-13 01:14:20,173][46662] Updated weights for policy 0, policy_version 24390 (0.0008) +[2023-10-13 01:14:20,550][46662] Updated weights for policy 0, policy_version 24400 (0.0008) +[2023-10-13 01:14:20,921][46662] Updated weights for policy 0, policy_version 24410 (0.0007) +[2023-10-13 01:14:22,577][46663] Updated weights for policy 1, policy_version 24391 (0.0010) +[2023-10-13 01:14:22,957][46663] Updated weights for policy 1, policy_version 24401 (0.0010) +[2023-10-13 01:14:23,318][46663] Updated weights for policy 1, policy_version 24411 (0.0007) +[2023-10-13 01:14:23,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50003968. Throughput: 0: 1691.3, 1: 1667.0. Samples: 12508548. 
Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-13 01:14:23,607][45375] Avg episode reward: [(0, '49.940'), (1, '47.110')] +[2023-10-13 01:14:24,975][46662] Updated weights for policy 0, policy_version 24420 (0.0008) +[2023-10-13 01:14:25,373][46662] Updated weights for policy 0, policy_version 24430 (0.0007) +[2023-10-13 01:14:25,746][46662] Updated weights for policy 0, policy_version 24440 (0.0009) +[2023-10-13 01:14:27,277][46663] Updated weights for policy 1, policy_version 24421 (0.0008) +[2023-10-13 01:14:27,673][46663] Updated weights for policy 1, policy_version 24431 (0.0008) +[2023-10-13 01:14:28,033][46663] Updated weights for policy 1, policy_version 24441 (0.0009) +[2023-10-13 01:14:28,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50069504. Throughput: 0: 1665.1, 1: 1694.7. Samples: 12519000. Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-13 01:14:28,607][45375] Avg episode reward: [(0, '49.380'), (1, '47.230')] +[2023-10-13 01:14:29,737][46662] Updated weights for policy 0, policy_version 24450 (0.0008) +[2023-10-13 01:14:30,108][46662] Updated weights for policy 0, policy_version 24460 (0.0007) +[2023-10-13 01:14:30,493][46662] Updated weights for policy 0, policy_version 24470 (0.0007) +[2023-10-13 01:14:30,866][46662] Updated weights for policy 0, policy_version 24480 (0.0009) +[2023-10-13 01:14:32,120][46663] Updated weights for policy 1, policy_version 24451 (0.0009) +[2023-10-13 01:14:32,488][46663] Updated weights for policy 1, policy_version 24461 (0.0009) +[2023-10-13 01:14:32,864][46663] Updated weights for policy 1, policy_version 24471 (0.0008) +[2023-10-13 01:14:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50135040. Throughput: 0: 1683.5, 1: 1686.3. Samples: 12539062. 
Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-13 01:14:33,608][45375] Avg episode reward: [(0, '48.640'), (1, '47.580')] +[2023-10-13 01:14:34,860][46662] Updated weights for policy 0, policy_version 24490 (0.0010) +[2023-10-13 01:14:35,229][46662] Updated weights for policy 0, policy_version 24500 (0.0007) +[2023-10-13 01:14:35,599][46662] Updated weights for policy 0, policy_version 24510 (0.0007) +[2023-10-13 01:14:36,965][46663] Updated weights for policy 1, policy_version 24481 (0.0009) +[2023-10-13 01:14:37,332][46663] Updated weights for policy 1, policy_version 24491 (0.0010) +[2023-10-13 01:14:37,708][46663] Updated weights for policy 1, policy_version 24501 (0.0009) +[2023-10-13 01:14:38,069][46663] Updated weights for policy 1, policy_version 24511 (0.0012) +[2023-10-13 01:14:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50200576. Throughput: 0: 1690.9, 1: 1667.0. Samples: 12559072. Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-13 01:14:38,608][45375] Avg episode reward: [(0, '48.780'), (1, '49.110')] +[2023-10-13 01:14:39,512][46662] Updated weights for policy 0, policy_version 24520 (0.0009) +[2023-10-13 01:14:39,887][46662] Updated weights for policy 0, policy_version 24530 (0.0010) +[2023-10-13 01:14:40,265][46662] Updated weights for policy 0, policy_version 24540 (0.0008) +[2023-10-13 01:14:42,302][46663] Updated weights for policy 1, policy_version 24521 (0.0009) +[2023-10-13 01:14:42,671][46663] Updated weights for policy 1, policy_version 24531 (0.0008) +[2023-10-13 01:14:43,032][46663] Updated weights for policy 1, policy_version 24541 (0.0009) +[2023-10-13 01:14:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50266112. Throughput: 0: 1666.4, 1: 1693.3. Samples: 12569228. 
Policy #0 lag: (min: 3.0, avg: 3.5, max: 18.0) +[2023-10-13 01:14:43,607][45375] Avg episode reward: [(0, '48.210'), (1, '48.240')] +[2023-10-13 01:14:44,338][46662] Updated weights for policy 0, policy_version 24550 (0.0010) +[2023-10-13 01:14:44,704][46662] Updated weights for policy 0, policy_version 24560 (0.0008) +[2023-10-13 01:14:45,080][46662] Updated weights for policy 0, policy_version 24570 (0.0008) +[2023-10-13 01:14:47,052][46663] Updated weights for policy 1, policy_version 24551 (0.0007) +[2023-10-13 01:14:47,431][46663] Updated weights for policy 1, policy_version 24561 (0.0008) +[2023-10-13 01:14:47,794][46663] Updated weights for policy 1, policy_version 24571 (0.0010) +[2023-10-13 01:14:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50331648. Throughput: 0: 1686.3, 1: 1681.6. Samples: 12589352. Policy #0 lag: (min: 3.0, avg: 3.5, max: 18.0) +[2023-10-13 01:14:48,607][45375] Avg episode reward: [(0, '48.050'), (1, '50.400')] +[2023-10-13 01:14:49,045][46662] Updated weights for policy 0, policy_version 24580 (0.0009) +[2023-10-13 01:14:49,416][46662] Updated weights for policy 0, policy_version 24590 (0.0007) +[2023-10-13 01:14:49,789][46662] Updated weights for policy 0, policy_version 24600 (0.0009) +[2023-10-13 01:14:51,576][46663] Updated weights for policy 1, policy_version 24581 (0.0007) +[2023-10-13 01:14:51,943][46663] Updated weights for policy 1, policy_version 24591 (0.0008) +[2023-10-13 01:14:52,324][46663] Updated weights for policy 1, policy_version 24601 (0.0007) +[2023-10-13 01:14:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50397184. Throughput: 0: 1687.6, 1: 1674.3. Samples: 12609638. 
Policy #0 lag: (min: 3.0, avg: 3.5, max: 18.0) +[2023-10-13 01:14:53,607][45375] Avg episode reward: [(0, '47.350'), (1, '48.950')] +[2023-10-13 01:14:54,041][46662] Updated weights for policy 0, policy_version 24610 (0.0008) +[2023-10-13 01:14:54,416][46662] Updated weights for policy 0, policy_version 24620 (0.0007) +[2023-10-13 01:14:54,792][46662] Updated weights for policy 0, policy_version 24630 (0.0007) +[2023-10-13 01:14:55,166][46662] Updated weights for policy 0, policy_version 24640 (0.0007) +[2023-10-13 01:14:56,447][46663] Updated weights for policy 1, policy_version 24611 (0.0008) +[2023-10-13 01:14:56,825][46663] Updated weights for policy 1, policy_version 24621 (0.0009) +[2023-10-13 01:14:57,200][46663] Updated weights for policy 1, policy_version 24631 (0.0008) +[2023-10-13 01:14:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50462720. Throughput: 0: 1669.7, 1: 1695.5. Samples: 12619858. Policy #0 lag: (min: 3.0, avg: 3.5, max: 18.0) +[2023-10-13 01:14:58,608][45375] Avg episode reward: [(0, '48.390'), (1, '48.110')] +[2023-10-13 01:14:59,347][46662] Updated weights for policy 0, policy_version 24650 (0.0008) +[2023-10-13 01:14:59,724][46662] Updated weights for policy 0, policy_version 24660 (0.0011) +[2023-10-13 01:15:00,096][46662] Updated weights for policy 0, policy_version 24670 (0.0008) +[2023-10-13 01:15:01,291][46663] Updated weights for policy 1, policy_version 24641 (0.0008) +[2023-10-13 01:15:01,662][46663] Updated weights for policy 1, policy_version 24651 (0.0008) +[2023-10-13 01:15:02,031][46663] Updated weights for policy 1, policy_version 24661 (0.0010) +[2023-10-13 01:15:02,395][46663] Updated weights for policy 1, policy_version 24671 (0.0011) +[2023-10-13 01:15:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50528256. Throughput: 0: 1681.5, 1: 1669.0. Samples: 12639462. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 01:15:03,607][45375] Avg episode reward: [(0, '47.710'), (1, '47.650')] +[2023-10-13 01:15:04,122][46662] Updated weights for policy 0, policy_version 24680 (0.0010) +[2023-10-13 01:15:04,488][46662] Updated weights for policy 0, policy_version 24690 (0.0010) +[2023-10-13 01:15:04,854][46662] Updated weights for policy 0, policy_version 24700 (0.0007) +[2023-10-13 01:15:06,576][46663] Updated weights for policy 1, policy_version 24681 (0.0009) +[2023-10-13 01:15:06,948][46663] Updated weights for policy 1, policy_version 24691 (0.0008) +[2023-10-13 01:15:07,327][46663] Updated weights for policy 1, policy_version 24701 (0.0009) +[2023-10-13 01:15:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50593792. Throughput: 0: 1680.4, 1: 1683.0. Samples: 12659902. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 01:15:08,607][45375] Avg episode reward: [(0, '46.690'), (1, '45.520')] +[2023-10-13 01:15:08,862][46662] Updated weights for policy 0, policy_version 24710 (0.0009) +[2023-10-13 01:15:09,230][46662] Updated weights for policy 0, policy_version 24720 (0.0010) +[2023-10-13 01:15:09,605][46662] Updated weights for policy 0, policy_version 24730 (0.0008) +[2023-10-13 01:15:11,300][46663] Updated weights for policy 1, policy_version 24711 (0.0008) +[2023-10-13 01:15:11,669][46663] Updated weights for policy 1, policy_version 24721 (0.0007) +[2023-10-13 01:15:12,036][46663] Updated weights for policy 1, policy_version 24731 (0.0010) +[2023-10-13 01:15:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50659328. Throughput: 0: 1675.6, 1: 1680.2. Samples: 12670012. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 01:15:13,608][45375] Avg episode reward: [(0, '46.020'), (1, '46.620')] +[2023-10-13 01:15:13,872][46662] Updated weights for policy 0, policy_version 24740 (0.0008) +[2023-10-13 01:15:14,265][46662] Updated weights for policy 0, policy_version 24750 (0.0008) +[2023-10-13 01:15:14,642][46662] Updated weights for policy 0, policy_version 24760 (0.0009) +[2023-10-13 01:15:16,152][46663] Updated weights for policy 1, policy_version 24741 (0.0009) +[2023-10-13 01:15:16,548][46663] Updated weights for policy 1, policy_version 24751 (0.0008) +[2023-10-13 01:15:16,916][46663] Updated weights for policy 1, policy_version 24761 (0.0008) +[2023-10-13 01:15:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50724864. Throughput: 0: 1686.0, 1: 1660.9. Samples: 12689674. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-13 01:15:18,607][45375] Avg episode reward: [(0, '47.410'), (1, '45.380')] +[2023-10-13 01:15:18,661][46662] Updated weights for policy 0, policy_version 24770 (0.0010) +[2023-10-13 01:15:19,030][46662] Updated weights for policy 0, policy_version 24780 (0.0010) +[2023-10-13 01:15:19,398][46662] Updated weights for policy 0, policy_version 24790 (0.0008) +[2023-10-13 01:15:19,764][46662] Updated weights for policy 0, policy_version 24800 (0.0009) +[2023-10-13 01:15:20,775][46663] Updated weights for policy 1, policy_version 24771 (0.0008) +[2023-10-13 01:15:21,144][46663] Updated weights for policy 1, policy_version 24781 (0.0010) +[2023-10-13 01:15:21,506][46663] Updated weights for policy 1, policy_version 24791 (0.0010) +[2023-10-13 01:15:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50790400. Throughput: 0: 1679.6, 1: 1685.0. Samples: 12710480. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-13 01:15:23,607][45375] Avg episode reward: [(0, '47.600'), (1, '45.040')] +[2023-10-13 01:15:23,805][46662] Updated weights for policy 0, policy_version 24810 (0.0008) +[2023-10-13 01:15:24,179][46662] Updated weights for policy 0, policy_version 24820 (0.0008) +[2023-10-13 01:15:24,548][46662] Updated weights for policy 0, policy_version 24830 (0.0008) +[2023-10-13 01:15:25,646][46663] Updated weights for policy 1, policy_version 24801 (0.0011) +[2023-10-13 01:15:26,011][46663] Updated weights for policy 1, policy_version 24811 (0.0010) +[2023-10-13 01:15:26,385][46663] Updated weights for policy 1, policy_version 24821 (0.0008) +[2023-10-13 01:15:26,751][46663] Updated weights for policy 1, policy_version 24831 (0.0008) +[2023-10-13 01:15:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50855936. Throughput: 0: 1681.3, 1: 1670.6. Samples: 12720062. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-13 01:15:28,607][45375] Avg episode reward: [(0, '49.420'), (1, '45.780')] +[2023-10-13 01:15:28,715][46662] Updated weights for policy 0, policy_version 24840 (0.0008) +[2023-10-13 01:15:29,095][46662] Updated weights for policy 0, policy_version 24850 (0.0007) +[2023-10-13 01:15:29,468][46662] Updated weights for policy 0, policy_version 24860 (0.0009) +[2023-10-13 01:15:30,868][46663] Updated weights for policy 1, policy_version 24841 (0.0008) +[2023-10-13 01:15:31,223][46663] Updated weights for policy 1, policy_version 24851 (0.0008) +[2023-10-13 01:15:31,588][46663] Updated weights for policy 1, policy_version 24861 (0.0007) +[2023-10-13 01:15:33,414][46662] Updated weights for policy 0, policy_version 24870 (0.0008) +[2023-10-13 01:15:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50921472. Throughput: 0: 1686.6, 1: 1669.1. Samples: 12740358. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-13 01:15:33,608][45375] Avg episode reward: [(0, '49.290'), (1, '46.120')] +[2023-10-13 01:15:33,783][46662] Updated weights for policy 0, policy_version 24880 (0.0008) +[2023-10-13 01:15:34,156][46662] Updated weights for policy 0, policy_version 24890 (0.0009) +[2023-10-13 01:15:35,723][46663] Updated weights for policy 1, policy_version 24871 (0.0008) +[2023-10-13 01:15:36,088][46663] Updated weights for policy 1, policy_version 24881 (0.0008) +[2023-10-13 01:15:36,453][46663] Updated weights for policy 1, policy_version 24891 (0.0009) +[2023-10-13 01:15:38,118][46662] Updated weights for policy 0, policy_version 24900 (0.0008) +[2023-10-13 01:15:38,494][46662] Updated weights for policy 0, policy_version 24910 (0.0008) +[2023-10-13 01:15:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50987008. Throughput: 0: 1687.5, 1: 1677.2. Samples: 12761050. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-13 01:15:38,607][45375] Avg episode reward: [(0, '49.470'), (1, '45.660')] +[2023-10-13 01:15:38,855][46662] Updated weights for policy 0, policy_version 24920 (0.0008) +[2023-10-13 01:15:40,627][46663] Updated weights for policy 1, policy_version 24901 (0.0008) +[2023-10-13 01:15:40,993][46663] Updated weights for policy 1, policy_version 24911 (0.0009) +[2023-10-13 01:15:41,359][46663] Updated weights for policy 1, policy_version 24921 (0.0008) +[2023-10-13 01:15:42,784][46662] Updated weights for policy 0, policy_version 24930 (0.0010) +[2023-10-13 01:15:43,156][46662] Updated weights for policy 0, policy_version 24940 (0.0008) +[2023-10-13 01:15:43,523][46662] Updated weights for policy 0, policy_version 24950 (0.0007) +[2023-10-13 01:15:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51052544. Throughput: 0: 1686.6, 1: 1657.1. Samples: 12770326. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:15:43,607][45375] Avg episode reward: [(0, '48.880'), (1, '46.200')] +[2023-10-13 01:15:43,903][46662] Updated weights for policy 0, policy_version 24960 (0.0007) +[2023-10-13 01:15:45,390][46663] Updated weights for policy 1, policy_version 24931 (0.0011) +[2023-10-13 01:15:45,757][46663] Updated weights for policy 1, policy_version 24941 (0.0010) +[2023-10-13 01:15:46,131][46663] Updated weights for policy 1, policy_version 24951 (0.0008) +[2023-10-13 01:15:47,855][46662] Updated weights for policy 0, policy_version 24970 (0.0010) +[2023-10-13 01:15:48,230][46662] Updated weights for policy 0, policy_version 24980 (0.0009) +[2023-10-13 01:15:48,599][46662] Updated weights for policy 0, policy_version 24990 (0.0007) +[2023-10-13 01:15:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51118080. Throughput: 0: 1688.1, 1: 1670.8. Samples: 12790616. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:15:48,607][45375] Avg episode reward: [(0, '49.330'), (1, '44.890')] +[2023-10-13 01:15:50,315][46663] Updated weights for policy 1, policy_version 24961 (0.0007) +[2023-10-13 01:15:50,672][46663] Updated weights for policy 1, policy_version 24971 (0.0009) +[2023-10-13 01:15:51,044][46663] Updated weights for policy 1, policy_version 24981 (0.0011) +[2023-10-13 01:15:51,410][46663] Updated weights for policy 1, policy_version 24991 (0.0010) +[2023-10-13 01:15:52,740][46662] Updated weights for policy 0, policy_version 25000 (0.0009) +[2023-10-13 01:15:53,118][46662] Updated weights for policy 0, policy_version 25010 (0.0010) +[2023-10-13 01:15:53,481][46662] Updated weights for policy 0, policy_version 25020 (0.0009) +[2023-10-13 01:15:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51183616. Throughput: 0: 1680.1, 1: 1674.3. Samples: 12810848. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:15:53,607][45375] Avg episode reward: [(0, '49.600'), (1, '46.210')] +[2023-10-13 01:15:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000024992_25591808.pth... +[2023-10-13 01:15:53,624][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000025024_25624576.pth... +[2023-10-13 01:15:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000023456_24018944.pth +[2023-10-13 01:15:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000023424_23986176.pth +[2023-10-13 01:15:55,601][46663] Updated weights for policy 1, policy_version 25001 (0.0009) +[2023-10-13 01:15:55,974][46663] Updated weights for policy 1, policy_version 25011 (0.0008) +[2023-10-13 01:15:56,334][46663] Updated weights for policy 1, policy_version 25021 (0.0007) +[2023-10-13 01:15:57,656][46662] Updated weights for policy 0, policy_version 25030 (0.0009) +[2023-10-13 01:15:58,035][46662] Updated weights for policy 0, policy_version 25040 (0.0009) +[2023-10-13 01:15:58,404][46662] Updated weights for policy 0, policy_version 25050 (0.0007) +[2023-10-13 01:15:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 51249152. Throughput: 0: 1687.0, 1: 1652.6. Samples: 12820294. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 01:15:58,607][45375] Avg episode reward: [(0, '48.580'), (1, '46.190')] +[2023-10-13 01:16:00,339][46663] Updated weights for policy 1, policy_version 25031 (0.0008) +[2023-10-13 01:16:00,705][46663] Updated weights for policy 1, policy_version 25041 (0.0010) +[2023-10-13 01:16:01,077][46663] Updated weights for policy 1, policy_version 25051 (0.0010) +[2023-10-13 01:16:02,489][46662] Updated weights for policy 0, policy_version 25060 (0.0008) +[2023-10-13 01:16:02,864][46662] Updated weights for policy 0, policy_version 25070 (0.0008) +[2023-10-13 01:16:03,226][46662] Updated weights for policy 0, policy_version 25080 (0.0009) +[2023-10-13 01:16:03,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51347456. Throughput: 0: 1680.4, 1: 1677.2. Samples: 12840764. Policy #0 lag: (min: 1.0, avg: 18.1, max: 33.0) +[2023-10-13 01:16:03,607][45375] Avg episode reward: [(0, '48.140'), (1, '46.540')] +[2023-10-13 01:16:05,232][46663] Updated weights for policy 1, policy_version 25061 (0.0011) +[2023-10-13 01:16:05,606][46663] Updated weights for policy 1, policy_version 25071 (0.0009) +[2023-10-13 01:16:05,976][46663] Updated weights for policy 1, policy_version 25081 (0.0010) +[2023-10-13 01:16:07,346][46662] Updated weights for policy 0, policy_version 25090 (0.0008) +[2023-10-13 01:16:07,711][46662] Updated weights for policy 0, policy_version 25100 (0.0009) +[2023-10-13 01:16:08,082][46662] Updated weights for policy 0, policy_version 25110 (0.0009) +[2023-10-13 01:16:08,461][46662] Updated weights for policy 0, policy_version 25120 (0.0009) +[2023-10-13 01:16:08,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51412992. Throughput: 0: 1670.7, 1: 1672.9. Samples: 12860942. 
Policy #0 lag: (min: 1.0, avg: 18.1, max: 33.0) +[2023-10-13 01:16:08,608][45375] Avg episode reward: [(0, '47.540'), (1, '46.120')] +[2023-10-13 01:16:09,974][46663] Updated weights for policy 1, policy_version 25091 (0.0009) +[2023-10-13 01:16:10,352][46663] Updated weights for policy 1, policy_version 25101 (0.0007) +[2023-10-13 01:16:10,715][46663] Updated weights for policy 1, policy_version 25111 (0.0009) +[2023-10-13 01:16:12,578][46662] Updated weights for policy 0, policy_version 25130 (0.0008) +[2023-10-13 01:16:12,953][46662] Updated weights for policy 0, policy_version 25140 (0.0007) +[2023-10-13 01:16:13,335][46662] Updated weights for policy 0, policy_version 25150 (0.0009) +[2023-10-13 01:16:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51478528. Throughput: 0: 1683.1, 1: 1659.8. Samples: 12870494. Policy #0 lag: (min: 1.0, avg: 18.1, max: 33.0) +[2023-10-13 01:16:13,608][45375] Avg episode reward: [(0, '48.530'), (1, '44.110')] +[2023-10-13 01:16:14,796][46663] Updated weights for policy 1, policy_version 25121 (0.0009) +[2023-10-13 01:16:15,170][46663] Updated weights for policy 1, policy_version 25131 (0.0008) +[2023-10-13 01:16:15,548][46663] Updated weights for policy 1, policy_version 25141 (0.0008) +[2023-10-13 01:16:15,912][46663] Updated weights for policy 1, policy_version 25151 (0.0009) +[2023-10-13 01:16:17,351][46662] Updated weights for policy 0, policy_version 25160 (0.0011) +[2023-10-13 01:16:17,718][46662] Updated weights for policy 0, policy_version 25170 (0.0008) +[2023-10-13 01:16:18,095][46662] Updated weights for policy 0, policy_version 25180 (0.0010) +[2023-10-13 01:16:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51544064. Throughput: 0: 1682.8, 1: 1672.6. Samples: 12891350. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-13 01:16:18,607][45375] Avg episode reward: [(0, '47.330'), (1, '43.950')] +[2023-10-13 01:16:19,879][46663] Updated weights for policy 1, policy_version 25161 (0.0009) +[2023-10-13 01:16:20,250][46663] Updated weights for policy 1, policy_version 25171 (0.0009) +[2023-10-13 01:16:20,626][46663] Updated weights for policy 1, policy_version 25181 (0.0008) +[2023-10-13 01:16:22,113][46662] Updated weights for policy 0, policy_version 25190 (0.0008) +[2023-10-13 01:16:22,479][46662] Updated weights for policy 0, policy_version 25200 (0.0008) +[2023-10-13 01:16:22,856][46662] Updated weights for policy 0, policy_version 25210 (0.0010) +[2023-10-13 01:16:23,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51609600. Throughput: 0: 1660.8, 1: 1680.5. Samples: 12911410. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-13 01:16:23,607][45375] Avg episode reward: [(0, '46.760'), (1, '45.100')] +[2023-10-13 01:16:24,486][46663] Updated weights for policy 1, policy_version 25191 (0.0008) +[2023-10-13 01:16:24,850][46663] Updated weights for policy 1, policy_version 25201 (0.0009) +[2023-10-13 01:16:25,219][46663] Updated weights for policy 1, policy_version 25211 (0.0011) +[2023-10-13 01:16:26,908][46662] Updated weights for policy 0, policy_version 25220 (0.0009) +[2023-10-13 01:16:27,280][46662] Updated weights for policy 0, policy_version 25230 (0.0009) +[2023-10-13 01:16:27,638][46662] Updated weights for policy 0, policy_version 25240 (0.0007) +[2023-10-13 01:16:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51675136. Throughput: 0: 1687.0, 1: 1677.0. Samples: 12921706. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-13 01:16:28,607][45375] Avg episode reward: [(0, '45.540'), (1, '45.120')] +[2023-10-13 01:16:29,243][46663] Updated weights for policy 1, policy_version 25221 (0.0008) +[2023-10-13 01:16:29,615][46663] Updated weights for policy 1, policy_version 25231 (0.0010) +[2023-10-13 01:16:29,980][46663] Updated weights for policy 1, policy_version 25241 (0.0007) +[2023-10-13 01:16:31,721][46662] Updated weights for policy 0, policy_version 25250 (0.0009) +[2023-10-13 01:16:32,091][46662] Updated weights for policy 0, policy_version 25260 (0.0008) +[2023-10-13 01:16:32,468][46662] Updated weights for policy 0, policy_version 25270 (0.0008) +[2023-10-13 01:16:32,834][46662] Updated weights for policy 0, policy_version 25280 (0.0009) +[2023-10-13 01:16:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51740672. Throughput: 0: 1678.9, 1: 1690.6. Samples: 12942246. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-13 01:16:33,608][45375] Avg episode reward: [(0, '45.030'), (1, '45.740')] +[2023-10-13 01:16:34,036][46663] Updated weights for policy 1, policy_version 25251 (0.0008) +[2023-10-13 01:16:34,405][46663] Updated weights for policy 1, policy_version 25261 (0.0010) +[2023-10-13 01:16:34,778][46663] Updated weights for policy 1, policy_version 25271 (0.0010) +[2023-10-13 01:16:37,003][46662] Updated weights for policy 0, policy_version 25290 (0.0007) +[2023-10-13 01:16:37,362][46662] Updated weights for policy 0, policy_version 25300 (0.0008) +[2023-10-13 01:16:37,742][46662] Updated weights for policy 0, policy_version 25310 (0.0007) +[2023-10-13 01:16:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51806208. Throughput: 0: 1659.1, 1: 1696.9. Samples: 12961868. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:16:38,607][45375] Avg episode reward: [(0, '45.090'), (1, '46.590')] +[2023-10-13 01:16:39,049][46663] Updated weights for policy 1, policy_version 25281 (0.0008) +[2023-10-13 01:16:39,410][46663] Updated weights for policy 1, policy_version 25291 (0.0009) +[2023-10-13 01:16:39,772][46663] Updated weights for policy 1, policy_version 25301 (0.0007) +[2023-10-13 01:16:40,140][46663] Updated weights for policy 1, policy_version 25311 (0.0008) +[2023-10-13 01:16:41,803][46662] Updated weights for policy 0, policy_version 25320 (0.0008) +[2023-10-13 01:16:42,171][46662] Updated weights for policy 0, policy_version 25330 (0.0007) +[2023-10-13 01:16:42,538][46662] Updated weights for policy 0, policy_version 25340 (0.0012) +[2023-10-13 01:16:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51871744. Throughput: 0: 1679.6, 1: 1695.3. Samples: 12972166. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:16:43,608][45375] Avg episode reward: [(0, '46.770'), (1, '47.350')] +[2023-10-13 01:16:44,070][46663] Updated weights for policy 1, policy_version 25321 (0.0008) +[2023-10-13 01:16:44,440][46663] Updated weights for policy 1, policy_version 25331 (0.0007) +[2023-10-13 01:16:44,821][46663] Updated weights for policy 1, policy_version 25341 (0.0007) +[2023-10-13 01:16:46,703][46662] Updated weights for policy 0, policy_version 25350 (0.0008) +[2023-10-13 01:16:47,068][46662] Updated weights for policy 0, policy_version 25360 (0.0007) +[2023-10-13 01:16:47,440][46662] Updated weights for policy 0, policy_version 25370 (0.0008) +[2023-10-13 01:16:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51937280. Throughput: 0: 1675.3, 1: 1699.7. Samples: 12992638. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:16:48,607][45375] Avg episode reward: [(0, '49.120'), (1, '46.520')] +[2023-10-13 01:16:48,930][46663] Updated weights for policy 1, policy_version 25351 (0.0009) +[2023-10-13 01:16:49,284][46663] Updated weights for policy 1, policy_version 25361 (0.0007) +[2023-10-13 01:16:49,659][46663] Updated weights for policy 1, policy_version 25371 (0.0009) +[2023-10-13 01:16:51,535][46662] Updated weights for policy 0, policy_version 25380 (0.0009) +[2023-10-13 01:16:51,904][46662] Updated weights for policy 0, policy_version 25390 (0.0011) +[2023-10-13 01:16:52,280][46662] Updated weights for policy 0, policy_version 25400 (0.0007) +[2023-10-13 01:16:53,509][46663] Updated weights for policy 1, policy_version 25381 (0.0009) +[2023-10-13 01:16:53,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52002816. Throughput: 0: 1665.4, 1: 1703.0. Samples: 13012520. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:16:53,607][45375] Avg episode reward: [(0, '46.530'), (1, '47.520')] +[2023-10-13 01:16:53,884][46663] Updated weights for policy 1, policy_version 25391 (0.0008) +[2023-10-13 01:16:54,258][46663] Updated weights for policy 1, policy_version 25401 (0.0009) +[2023-10-13 01:16:56,178][46662] Updated weights for policy 0, policy_version 25410 (0.0008) +[2023-10-13 01:16:56,549][46662] Updated weights for policy 0, policy_version 25420 (0.0008) +[2023-10-13 01:16:56,928][46662] Updated weights for policy 0, policy_version 25430 (0.0010) +[2023-10-13 01:16:57,303][46662] Updated weights for policy 0, policy_version 25440 (0.0008) +[2023-10-13 01:16:58,257][46663] Updated weights for policy 1, policy_version 25411 (0.0010) +[2023-10-13 01:16:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52068352. Throughput: 0: 1685.9, 1: 1702.4. Samples: 13022966. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-13 01:16:58,607][45375] Avg episode reward: [(0, '48.080'), (1, '49.550')] +[2023-10-13 01:16:58,632][46663] Updated weights for policy 1, policy_version 25421 (0.0009) +[2023-10-13 01:16:59,000][46663] Updated weights for policy 1, policy_version 25431 (0.0009) +[2023-10-13 01:17:01,421][46662] Updated weights for policy 0, policy_version 25450 (0.0007) +[2023-10-13 01:17:01,781][46662] Updated weights for policy 0, policy_version 25460 (0.0008) +[2023-10-13 01:17:02,157][46662] Updated weights for policy 0, policy_version 25470 (0.0008) +[2023-10-13 01:17:03,137][46663] Updated weights for policy 1, policy_version 25441 (0.0008) +[2023-10-13 01:17:03,501][46663] Updated weights for policy 1, policy_version 25451 (0.0008) +[2023-10-13 01:17:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52133888. Throughput: 0: 1663.3, 1: 1705.9. Samples: 13042962. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-13 01:17:03,607][45375] Avg episode reward: [(0, '48.620'), (1, '48.810')] +[2023-10-13 01:17:03,879][46663] Updated weights for policy 1, policy_version 25461 (0.0007) +[2023-10-13 01:17:04,239][46663] Updated weights for policy 1, policy_version 25471 (0.0008) +[2023-10-13 01:17:06,180][46662] Updated weights for policy 0, policy_version 25480 (0.0011) +[2023-10-13 01:17:06,549][46662] Updated weights for policy 0, policy_version 25490 (0.0007) +[2023-10-13 01:17:06,921][46662] Updated weights for policy 0, policy_version 25500 (0.0007) +[2023-10-13 01:17:08,388][46663] Updated weights for policy 1, policy_version 25481 (0.0008) +[2023-10-13 01:17:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 52199424. Throughput: 0: 1677.8, 1: 1686.8. Samples: 13062816. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-13 01:17:08,607][45375] Avg episode reward: [(0, '48.410'), (1, '48.750')] +[2023-10-13 01:17:08,752][46663] Updated weights for policy 1, policy_version 25491 (0.0009) +[2023-10-13 01:17:09,124][46663] Updated weights for policy 1, policy_version 25501 (0.0011) +[2023-10-13 01:17:10,793][46662] Updated weights for policy 0, policy_version 25510 (0.0008) +[2023-10-13 01:17:11,165][46662] Updated weights for policy 0, policy_version 25520 (0.0008) +[2023-10-13 01:17:11,545][46662] Updated weights for policy 0, policy_version 25530 (0.0007) +[2023-10-13 01:17:13,206][46663] Updated weights for policy 1, policy_version 25511 (0.0008) +[2023-10-13 01:17:13,568][46663] Updated weights for policy 1, policy_version 25521 (0.0007) +[2023-10-13 01:17:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52264960. Throughput: 0: 1680.7, 1: 1690.4. Samples: 13073408. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-13 01:17:13,608][45375] Avg episode reward: [(0, '47.410'), (1, '49.410')] +[2023-10-13 01:17:13,936][46663] Updated weights for policy 1, policy_version 25531 (0.0009) +[2023-10-13 01:17:15,513][46662] Updated weights for policy 0, policy_version 25540 (0.0009) +[2023-10-13 01:17:15,893][46662] Updated weights for policy 0, policy_version 25550 (0.0010) +[2023-10-13 01:17:16,272][46662] Updated weights for policy 0, policy_version 25560 (0.0010) +[2023-10-13 01:17:17,935][46663] Updated weights for policy 1, policy_version 25541 (0.0008) +[2023-10-13 01:17:18,296][46663] Updated weights for policy 1, policy_version 25551 (0.0009) +[2023-10-13 01:17:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52330496. Throughput: 0: 1664.5, 1: 1690.8. Samples: 13093232. 
Policy #0 lag: (min: 15.0, avg: 20.2, max: 47.0) +[2023-10-13 01:17:18,607][45375] Avg episode reward: [(0, '48.380'), (1, '50.210')] +[2023-10-13 01:17:18,665][46663] Updated weights for policy 1, policy_version 25561 (0.0009) +[2023-10-13 01:17:20,233][46662] Updated weights for policy 0, policy_version 25570 (0.0008) +[2023-10-13 01:17:20,613][46662] Updated weights for policy 0, policy_version 25580 (0.0007) +[2023-10-13 01:17:20,984][46662] Updated weights for policy 0, policy_version 25590 (0.0009) +[2023-10-13 01:17:21,343][46662] Updated weights for policy 0, policy_version 25600 (0.0008) +[2023-10-13 01:17:22,778][46663] Updated weights for policy 1, policy_version 25571 (0.0008) +[2023-10-13 01:17:23,144][46663] Updated weights for policy 1, policy_version 25581 (0.0008) +[2023-10-13 01:17:23,524][46663] Updated weights for policy 1, policy_version 25591 (0.0009) +[2023-10-13 01:17:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 52396032. Throughput: 0: 1689.9, 1: 1677.2. Samples: 13113388. Policy #0 lag: (min: 15.0, avg: 20.2, max: 47.0) +[2023-10-13 01:17:23,608][45375] Avg episode reward: [(0, '47.800'), (1, '50.680')] +[2023-10-13 01:17:25,588][46662] Updated weights for policy 0, policy_version 25610 (0.0007) +[2023-10-13 01:17:25,959][46662] Updated weights for policy 0, policy_version 25620 (0.0009) +[2023-10-13 01:17:26,339][46662] Updated weights for policy 0, policy_version 25630 (0.0009) +[2023-10-13 01:17:27,581][46663] Updated weights for policy 1, policy_version 25601 (0.0009) +[2023-10-13 01:17:27,948][46663] Updated weights for policy 1, policy_version 25611 (0.0009) +[2023-10-13 01:17:28,308][46663] Updated weights for policy 1, policy_version 25621 (0.0008) +[2023-10-13 01:17:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52461568. Throughput: 0: 1676.6, 1: 1695.4. Samples: 13123904. 
Policy #0 lag: (min: 15.0, avg: 20.2, max: 47.0) +[2023-10-13 01:17:28,607][45375] Avg episode reward: [(0, '47.750'), (1, '50.200')] +[2023-10-13 01:17:28,682][46663] Updated weights for policy 1, policy_version 25631 (0.0008) +[2023-10-13 01:17:30,287][46662] Updated weights for policy 0, policy_version 25640 (0.0009) +[2023-10-13 01:17:30,652][46662] Updated weights for policy 0, policy_version 25650 (0.0009) +[2023-10-13 01:17:31,037][46662] Updated weights for policy 0, policy_version 25660 (0.0009) +[2023-10-13 01:17:32,719][46663] Updated weights for policy 1, policy_version 25641 (0.0011) +[2023-10-13 01:17:33,092][46663] Updated weights for policy 1, policy_version 25651 (0.0011) +[2023-10-13 01:17:33,465][46663] Updated weights for policy 1, policy_version 25661 (0.0009) +[2023-10-13 01:17:33,607][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 52559872. Throughput: 0: 1668.8, 1: 1691.9. Samples: 13143868. Policy #0 lag: (min: 15.0, avg: 20.2, max: 47.0) +[2023-10-13 01:17:33,607][45375] Avg episode reward: [(0, '49.470'), (1, '48.730')] +[2023-10-13 01:17:35,107][46662] Updated weights for policy 0, policy_version 25670 (0.0008) +[2023-10-13 01:17:35,470][46662] Updated weights for policy 0, policy_version 25680 (0.0010) +[2023-10-13 01:17:35,844][46662] Updated weights for policy 0, policy_version 25690 (0.0008) +[2023-10-13 01:17:37,432][46663] Updated weights for policy 1, policy_version 25671 (0.0010) +[2023-10-13 01:17:37,799][46663] Updated weights for policy 1, policy_version 25681 (0.0008) +[2023-10-13 01:17:38,166][46663] Updated weights for policy 1, policy_version 25691 (0.0008) +[2023-10-13 01:17:38,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52625408. Throughput: 0: 1687.9, 1: 1662.0. Samples: 13163262. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-13 01:17:38,607][45375] Avg episode reward: [(0, '48.460'), (1, '48.970')] +[2023-10-13 01:17:39,997][46662] Updated weights for policy 0, policy_version 25700 (0.0008) +[2023-10-13 01:17:40,395][46662] Updated weights for policy 0, policy_version 25710 (0.0009) +[2023-10-13 01:17:40,762][46662] Updated weights for policy 0, policy_version 25720 (0.0008) +[2023-10-13 01:17:42,259][46663] Updated weights for policy 1, policy_version 25701 (0.0008) +[2023-10-13 01:17:42,658][46663] Updated weights for policy 1, policy_version 25711 (0.0009) +[2023-10-13 01:17:43,022][46663] Updated weights for policy 1, policy_version 25721 (0.0008) +[2023-10-13 01:17:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 52690944. Throughput: 0: 1660.7, 1: 1690.8. Samples: 13173782. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-13 01:17:43,607][45375] Avg episode reward: [(0, '46.840'), (1, '47.680')] +[2023-10-13 01:17:44,957][46662] Updated weights for policy 0, policy_version 25730 (0.0008) +[2023-10-13 01:17:45,323][46662] Updated weights for policy 0, policy_version 25740 (0.0009) +[2023-10-13 01:17:45,691][46662] Updated weights for policy 0, policy_version 25750 (0.0007) +[2023-10-13 01:17:46,061][46662] Updated weights for policy 0, policy_version 25760 (0.0009) +[2023-10-13 01:17:47,081][46663] Updated weights for policy 1, policy_version 25731 (0.0009) +[2023-10-13 01:17:47,442][46663] Updated weights for policy 1, policy_version 25741 (0.0009) +[2023-10-13 01:17:47,806][46663] Updated weights for policy 1, policy_version 25751 (0.0008) +[2023-10-13 01:17:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52756480. Throughput: 0: 1672.0, 1: 1675.0. Samples: 13193580. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-13 01:17:48,607][45375] Avg episode reward: [(0, '44.050'), (1, '46.460')] +[2023-10-13 01:17:50,150][46662] Updated weights for policy 0, policy_version 25770 (0.0008) +[2023-10-13 01:17:50,521][46662] Updated weights for policy 0, policy_version 25780 (0.0008) +[2023-10-13 01:17:50,886][46662] Updated weights for policy 0, policy_version 25790 (0.0008) +[2023-10-13 01:17:51,825][46663] Updated weights for policy 1, policy_version 25761 (0.0008) +[2023-10-13 01:17:52,192][46663] Updated weights for policy 1, policy_version 25771 (0.0007) +[2023-10-13 01:17:52,557][46663] Updated weights for policy 1, policy_version 25781 (0.0007) +[2023-10-13 01:17:52,931][46663] Updated weights for policy 1, policy_version 25791 (0.0007) +[2023-10-13 01:17:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52822016. Throughput: 0: 1681.1, 1: 1666.8. Samples: 13213474. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-13 01:17:53,608][45375] Avg episode reward: [(0, '44.330'), (1, '46.350')] +[2023-10-13 01:17:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000025792_26411008.pth... +[2023-10-13 01:17:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000025792_26411008.pth... 
+[2023-10-13 01:17:53,655][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000024224_24805376.pth +[2023-10-13 01:17:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000024224_24805376.pth +[2023-10-13 01:17:54,895][46662] Updated weights for policy 0, policy_version 25800 (0.0008) +[2023-10-13 01:17:55,268][46662] Updated weights for policy 0, policy_version 25810 (0.0008) +[2023-10-13 01:17:55,631][46662] Updated weights for policy 0, policy_version 25820 (0.0010) +[2023-10-13 01:17:56,994][46663] Updated weights for policy 1, policy_version 25801 (0.0009) +[2023-10-13 01:17:57,371][46663] Updated weights for policy 1, policy_version 25811 (0.0009) +[2023-10-13 01:17:57,730][46663] Updated weights for policy 1, policy_version 25821 (0.0010) +[2023-10-13 01:17:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52887552. Throughput: 0: 1655.6, 1: 1689.0. Samples: 13223912. Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) +[2023-10-13 01:17:58,607][45375] Avg episode reward: [(0, '43.320'), (1, '45.410')] +[2023-10-13 01:17:59,812][46662] Updated weights for policy 0, policy_version 25830 (0.0008) +[2023-10-13 01:18:00,179][46662] Updated weights for policy 0, policy_version 25840 (0.0009) +[2023-10-13 01:18:00,546][46662] Updated weights for policy 0, policy_version 25850 (0.0009) +[2023-10-13 01:18:01,958][46663] Updated weights for policy 1, policy_version 25831 (0.0008) +[2023-10-13 01:18:02,326][46663] Updated weights for policy 1, policy_version 25841 (0.0007) +[2023-10-13 01:18:02,692][46663] Updated weights for policy 1, policy_version 25851 (0.0008) +[2023-10-13 01:18:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52953088. Throughput: 0: 1677.0, 1: 1672.3. Samples: 13243952. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) +[2023-10-13 01:18:03,608][45375] Avg episode reward: [(0, '43.240'), (1, '45.920')] +[2023-10-13 01:18:04,612][46662] Updated weights for policy 0, policy_version 25860 (0.0007) +[2023-10-13 01:18:04,984][46662] Updated weights for policy 0, policy_version 25870 (0.0008) +[2023-10-13 01:18:05,360][46662] Updated weights for policy 0, policy_version 25880 (0.0008) +[2023-10-13 01:18:06,827][46663] Updated weights for policy 1, policy_version 25861 (0.0008) +[2023-10-13 01:18:07,202][46663] Updated weights for policy 1, policy_version 25871 (0.0010) +[2023-10-13 01:18:07,561][46663] Updated weights for policy 1, policy_version 25881 (0.0008) +[2023-10-13 01:18:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53018624. Throughput: 0: 1676.6, 1: 1670.1. Samples: 13263988. Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) +[2023-10-13 01:18:08,607][45375] Avg episode reward: [(0, '44.480'), (1, '47.060')] +[2023-10-13 01:18:09,438][46662] Updated weights for policy 0, policy_version 25890 (0.0009) +[2023-10-13 01:18:09,814][46662] Updated weights for policy 0, policy_version 25900 (0.0007) +[2023-10-13 01:18:10,181][46662] Updated weights for policy 0, policy_version 25910 (0.0008) +[2023-10-13 01:18:10,549][46662] Updated weights for policy 0, policy_version 25920 (0.0008) +[2023-10-13 01:18:11,667][46663] Updated weights for policy 1, policy_version 25891 (0.0009) +[2023-10-13 01:18:12,033][46663] Updated weights for policy 1, policy_version 25901 (0.0011) +[2023-10-13 01:18:12,404][46663] Updated weights for policy 1, policy_version 25911 (0.0009) +[2023-10-13 01:18:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 53084160. Throughput: 0: 1660.1, 1: 1687.6. Samples: 13274550. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) +[2023-10-13 01:18:13,607][45375] Avg episode reward: [(0, '44.130'), (1, '50.660')] +[2023-10-13 01:18:14,696][46662] Updated weights for policy 0, policy_version 25930 (0.0010) +[2023-10-13 01:18:15,067][46662] Updated weights for policy 0, policy_version 25940 (0.0008) +[2023-10-13 01:18:15,436][46662] Updated weights for policy 0, policy_version 25950 (0.0009) +[2023-10-13 01:18:16,413][46663] Updated weights for policy 1, policy_version 25921 (0.0010) +[2023-10-13 01:18:16,779][46663] Updated weights for policy 1, policy_version 25931 (0.0012) +[2023-10-13 01:18:17,149][46663] Updated weights for policy 1, policy_version 25941 (0.0011) +[2023-10-13 01:18:17,522][46663] Updated weights for policy 1, policy_version 25951 (0.0008) +[2023-10-13 01:18:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53149696. Throughput: 0: 1677.6, 1: 1667.6. Samples: 13294402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:18,607][45375] Avg episode reward: [(0, '43.650'), (1, '50.640')] +[2023-10-13 01:18:19,458][46662] Updated weights for policy 0, policy_version 25960 (0.0009) +[2023-10-13 01:18:19,836][46662] Updated weights for policy 0, policy_version 25970 (0.0007) +[2023-10-13 01:18:20,207][46662] Updated weights for policy 0, policy_version 25980 (0.0009) +[2023-10-13 01:18:21,667][46663] Updated weights for policy 1, policy_version 25961 (0.0007) +[2023-10-13 01:18:22,039][46663] Updated weights for policy 1, policy_version 25971 (0.0008) +[2023-10-13 01:18:22,420][46663] Updated weights for policy 1, policy_version 25981 (0.0010) +[2023-10-13 01:18:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53215232. Throughput: 0: 1687.7, 1: 1687.8. Samples: 13315160. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:23,608][45375] Avg episode reward: [(0, '44.540'), (1, '50.000')] +[2023-10-13 01:18:24,152][46662] Updated weights for policy 0, policy_version 25990 (0.0008) +[2023-10-13 01:18:24,525][46662] Updated weights for policy 0, policy_version 26000 (0.0010) +[2023-10-13 01:18:24,901][46662] Updated weights for policy 0, policy_version 26010 (0.0008) +[2023-10-13 01:18:26,415][46663] Updated weights for policy 1, policy_version 25991 (0.0008) +[2023-10-13 01:18:26,786][46663] Updated weights for policy 1, policy_version 26001 (0.0008) +[2023-10-13 01:18:27,153][46663] Updated weights for policy 1, policy_version 26011 (0.0008) +[2023-10-13 01:18:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53280768. Throughput: 0: 1680.4, 1: 1684.0. Samples: 13325180. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:28,607][45375] Avg episode reward: [(0, '42.920'), (1, '49.580')] +[2023-10-13 01:18:29,062][46662] Updated weights for policy 0, policy_version 26020 (0.0010) +[2023-10-13 01:18:29,446][46662] Updated weights for policy 0, policy_version 26030 (0.0007) +[2023-10-13 01:18:29,810][46662] Updated weights for policy 0, policy_version 26040 (0.0009) +[2023-10-13 01:18:31,200][46663] Updated weights for policy 1, policy_version 26021 (0.0009) +[2023-10-13 01:18:31,557][46663] Updated weights for policy 1, policy_version 26031 (0.0007) +[2023-10-13 01:18:31,932][46663] Updated weights for policy 1, policy_version 26041 (0.0007) +[2023-10-13 01:18:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53346304. Throughput: 0: 1689.1, 1: 1673.3. Samples: 13344888. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:33,608][45375] Avg episode reward: [(0, '42.390'), (1, '49.330')] +[2023-10-13 01:18:33,774][46662] Updated weights for policy 0, policy_version 26050 (0.0010) +[2023-10-13 01:18:34,149][46662] Updated weights for policy 0, policy_version 26060 (0.0008) +[2023-10-13 01:18:34,517][46662] Updated weights for policy 0, policy_version 26070 (0.0008) +[2023-10-13 01:18:34,883][46662] Updated weights for policy 0, policy_version 26080 (0.0008) +[2023-10-13 01:18:36,133][46663] Updated weights for policy 1, policy_version 26051 (0.0008) +[2023-10-13 01:18:36,507][46663] Updated weights for policy 1, policy_version 26061 (0.0007) +[2023-10-13 01:18:36,879][46663] Updated weights for policy 1, policy_version 26071 (0.0007) +[2023-10-13 01:18:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53411840. Throughput: 0: 1683.9, 1: 1686.2. Samples: 13365130. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:38,607][45375] Avg episode reward: [(0, '43.540'), (1, '50.850')] +[2023-10-13 01:18:38,877][46662] Updated weights for policy 0, policy_version 26090 (0.0011) +[2023-10-13 01:18:39,245][46662] Updated weights for policy 0, policy_version 26100 (0.0010) +[2023-10-13 01:18:39,613][46662] Updated weights for policy 0, policy_version 26110 (0.0008) +[2023-10-13 01:18:40,800][46663] Updated weights for policy 1, policy_version 26081 (0.0008) +[2023-10-13 01:18:41,157][46663] Updated weights for policy 1, policy_version 26091 (0.0007) +[2023-10-13 01:18:41,531][46663] Updated weights for policy 1, policy_version 26101 (0.0007) +[2023-10-13 01:18:41,895][46663] Updated weights for policy 1, policy_version 26111 (0.0011) +[2023-10-13 01:18:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53477376. Throughput: 0: 1679.6, 1: 1672.5. Samples: 13374760. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:43,607][45375] Avg episode reward: [(0, '43.950'), (1, '50.710')] +[2023-10-13 01:18:43,646][46662] Updated weights for policy 0, policy_version 26120 (0.0008) +[2023-10-13 01:18:44,008][46662] Updated weights for policy 0, policy_version 26130 (0.0010) +[2023-10-13 01:18:44,376][46662] Updated weights for policy 0, policy_version 26140 (0.0010) +[2023-10-13 01:18:45,911][46663] Updated weights for policy 1, policy_version 26121 (0.0008) +[2023-10-13 01:18:46,271][46663] Updated weights for policy 1, policy_version 26131 (0.0011) +[2023-10-13 01:18:46,652][46663] Updated weights for policy 1, policy_version 26141 (0.0009) +[2023-10-13 01:18:48,530][46662] Updated weights for policy 0, policy_version 26150 (0.0009) +[2023-10-13 01:18:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53542912. Throughput: 0: 1681.3, 1: 1671.5. Samples: 13394826. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:48,607][45375] Avg episode reward: [(0, '44.160'), (1, '51.500')] +[2023-10-13 01:18:48,608][46384] Saving new best policy, reward=51.500! +[2023-10-13 01:18:48,892][46662] Updated weights for policy 0, policy_version 26160 (0.0010) +[2023-10-13 01:18:49,261][46662] Updated weights for policy 0, policy_version 26170 (0.0010) +[2023-10-13 01:18:50,781][46663] Updated weights for policy 1, policy_version 26151 (0.0009) +[2023-10-13 01:18:51,147][46663] Updated weights for policy 1, policy_version 26161 (0.0010) +[2023-10-13 01:18:51,513][46663] Updated weights for policy 1, policy_version 26171 (0.0009) +[2023-10-13 01:18:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53608448. Throughput: 0: 1677.1, 1: 1687.6. Samples: 13415404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:53,608][45375] Avg episode reward: [(0, '44.280'), (1, '50.950')] +[2023-10-13 01:18:53,693][46662] Updated weights for policy 0, policy_version 26180 (0.0009) +[2023-10-13 01:18:54,063][46662] Updated weights for policy 0, policy_version 26190 (0.0008) +[2023-10-13 01:18:54,434][46662] Updated weights for policy 0, policy_version 26200 (0.0007) +[2023-10-13 01:18:55,448][46663] Updated weights for policy 1, policy_version 26181 (0.0009) +[2023-10-13 01:18:55,820][46663] Updated weights for policy 1, policy_version 26191 (0.0008) +[2023-10-13 01:18:56,190][46663] Updated weights for policy 1, policy_version 26201 (0.0008) +[2023-10-13 01:18:58,484][46662] Updated weights for policy 0, policy_version 26210 (0.0009) +[2023-10-13 01:18:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53673984. Throughput: 0: 1681.3, 1: 1656.7. Samples: 13424762. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:18:58,608][45375] Avg episode reward: [(0, '45.760'), (1, '53.230')] +[2023-10-13 01:18:58,609][46384] Saving new best policy, reward=53.230! +[2023-10-13 01:18:58,861][46662] Updated weights for policy 0, policy_version 26220 (0.0011) +[2023-10-13 01:18:59,228][46662] Updated weights for policy 0, policy_version 26230 (0.0007) +[2023-10-13 01:18:59,591][46662] Updated weights for policy 0, policy_version 26240 (0.0008) +[2023-10-13 01:19:00,344][46663] Updated weights for policy 1, policy_version 26211 (0.0008) +[2023-10-13 01:19:00,708][46663] Updated weights for policy 1, policy_version 26221 (0.0012) +[2023-10-13 01:19:01,090][46663] Updated weights for policy 1, policy_version 26231 (0.0010) +[2023-10-13 01:19:03,371][46662] Updated weights for policy 0, policy_version 26250 (0.0008) +[2023-10-13 01:19:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53739520. Throughput: 0: 1685.9, 1: 1670.4. 
Samples: 13445434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:03,607][45375] Avg episode reward: [(0, '44.250'), (1, '52.500')] +[2023-10-13 01:19:03,746][46662] Updated weights for policy 0, policy_version 26260 (0.0009) +[2023-10-13 01:19:04,115][46662] Updated weights for policy 0, policy_version 26270 (0.0009) +[2023-10-13 01:19:05,046][46663] Updated weights for policy 1, policy_version 26241 (0.0010) +[2023-10-13 01:19:05,414][46663] Updated weights for policy 1, policy_version 26251 (0.0009) +[2023-10-13 01:19:05,782][46663] Updated weights for policy 1, policy_version 26261 (0.0010) +[2023-10-13 01:19:06,145][46663] Updated weights for policy 1, policy_version 26271 (0.0010) +[2023-10-13 01:19:08,193][46662] Updated weights for policy 0, policy_version 26280 (0.0009) +[2023-10-13 01:19:08,555][46662] Updated weights for policy 0, policy_version 26290 (0.0011) +[2023-10-13 01:19:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53805056. Throughput: 0: 1675.7, 1: 1679.1. Samples: 13466124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:08,607][45375] Avg episode reward: [(0, '44.920'), (1, '50.830')] +[2023-10-13 01:19:08,919][46662] Updated weights for policy 0, policy_version 26300 (0.0012) +[2023-10-13 01:19:10,188][46663] Updated weights for policy 1, policy_version 26281 (0.0009) +[2023-10-13 01:19:10,558][46663] Updated weights for policy 1, policy_version 26291 (0.0007) +[2023-10-13 01:19:10,929][46663] Updated weights for policy 1, policy_version 26301 (0.0009) +[2023-10-13 01:19:12,998][46662] Updated weights for policy 0, policy_version 26310 (0.0008) +[2023-10-13 01:19:13,369][46662] Updated weights for policy 0, policy_version 26320 (0.0008) +[2023-10-13 01:19:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53870592. Throughput: 0: 1678.8, 1: 1657.7. Samples: 13475322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:13,607][45375] Avg episode reward: [(0, '44.350'), (1, '48.140')] +[2023-10-13 01:19:13,748][46662] Updated weights for policy 0, policy_version 26330 (0.0007) +[2023-10-13 01:19:14,868][46663] Updated weights for policy 1, policy_version 26311 (0.0009) +[2023-10-13 01:19:15,242][46663] Updated weights for policy 1, policy_version 26321 (0.0010) +[2023-10-13 01:19:15,616][46663] Updated weights for policy 1, policy_version 26331 (0.0008) +[2023-10-13 01:19:17,855][46662] Updated weights for policy 0, policy_version 26340 (0.0009) +[2023-10-13 01:19:18,233][46662] Updated weights for policy 0, policy_version 26350 (0.0010) +[2023-10-13 01:19:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53936128. Throughput: 0: 1685.2, 1: 1682.6. Samples: 13496440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:18,607][45375] Avg episode reward: [(0, '43.970'), (1, '48.420')] +[2023-10-13 01:19:18,610][46662] Updated weights for policy 0, policy_version 26360 (0.0009) +[2023-10-13 01:19:19,700][46663] Updated weights for policy 1, policy_version 26341 (0.0007) +[2023-10-13 01:19:20,100][46663] Updated weights for policy 1, policy_version 26351 (0.0007) +[2023-10-13 01:19:20,472][46663] Updated weights for policy 1, policy_version 26361 (0.0011) +[2023-10-13 01:19:22,554][46662] Updated weights for policy 0, policy_version 26370 (0.0008) +[2023-10-13 01:19:22,921][46662] Updated weights for policy 0, policy_version 26380 (0.0009) +[2023-10-13 01:19:23,296][46662] Updated weights for policy 0, policy_version 26390 (0.0008) +[2023-10-13 01:19:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 54001664. Throughput: 0: 1678.5, 1: 1690.4. Samples: 13516732. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:23,608][45375] Avg episode reward: [(0, '44.210'), (1, '48.650')] +[2023-10-13 01:19:23,662][46662] Updated weights for policy 0, policy_version 26400 (0.0009) +[2023-10-13 01:19:24,623][46663] Updated weights for policy 1, policy_version 26371 (0.0009) +[2023-10-13 01:19:24,989][46663] Updated weights for policy 1, policy_version 26381 (0.0008) +[2023-10-13 01:19:25,345][46663] Updated weights for policy 1, policy_version 26391 (0.0008) +[2023-10-13 01:19:27,570][46662] Updated weights for policy 0, policy_version 26410 (0.0008) +[2023-10-13 01:19:27,937][46662] Updated weights for policy 0, policy_version 26420 (0.0009) +[2023-10-13 01:19:28,314][46662] Updated weights for policy 0, policy_version 26430 (0.0007) +[2023-10-13 01:19:28,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54099968. Throughput: 0: 1693.3, 1: 1675.4. Samples: 13526350. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:28,607][45375] Avg episode reward: [(0, '43.470'), (1, '49.270')] +[2023-10-13 01:19:29,458][46663] Updated weights for policy 1, policy_version 26401 (0.0009) +[2023-10-13 01:19:29,837][46663] Updated weights for policy 1, policy_version 26411 (0.0008) +[2023-10-13 01:19:30,205][46663] Updated weights for policy 1, policy_version 26421 (0.0009) +[2023-10-13 01:19:30,569][46663] Updated weights for policy 1, policy_version 26431 (0.0009) +[2023-10-13 01:19:32,304][46662] Updated weights for policy 0, policy_version 26440 (0.0008) +[2023-10-13 01:19:32,671][46662] Updated weights for policy 0, policy_version 26450 (0.0010) +[2023-10-13 01:19:33,036][46662] Updated weights for policy 0, policy_version 26460 (0.0010) +[2023-10-13 01:19:33,607][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54165504. Throughput: 0: 1698.2, 1: 1685.2. Samples: 13547080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:33,607][45375] Avg episode reward: [(0, '42.930'), (1, '48.460')] +[2023-10-13 01:19:34,772][46663] Updated weights for policy 1, policy_version 26441 (0.0008) +[2023-10-13 01:19:35,132][46663] Updated weights for policy 1, policy_version 26451 (0.0008) +[2023-10-13 01:19:35,508][46663] Updated weights for policy 1, policy_version 26461 (0.0008) +[2023-10-13 01:19:37,092][46662] Updated weights for policy 0, policy_version 26470 (0.0009) +[2023-10-13 01:19:37,465][46662] Updated weights for policy 0, policy_version 26480 (0.0008) +[2023-10-13 01:19:37,834][46662] Updated weights for policy 0, policy_version 26490 (0.0011) +[2023-10-13 01:19:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54231040. Throughput: 0: 1682.5, 1: 1690.8. Samples: 13567200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:38,607][45375] Avg episode reward: [(0, '42.310'), (1, '47.210')] +[2023-10-13 01:19:39,399][46663] Updated weights for policy 1, policy_version 26471 (0.0007) +[2023-10-13 01:19:39,758][46663] Updated weights for policy 1, policy_version 26481 (0.0009) +[2023-10-13 01:19:40,130][46663] Updated weights for policy 1, policy_version 26491 (0.0008) +[2023-10-13 01:19:41,865][46662] Updated weights for policy 0, policy_version 26500 (0.0010) +[2023-10-13 01:19:42,228][46662] Updated weights for policy 0, policy_version 26510 (0.0010) +[2023-10-13 01:19:42,606][46662] Updated weights for policy 0, policy_version 26520 (0.0008) +[2023-10-13 01:19:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54296576. Throughput: 0: 1703.4, 1: 1685.7. Samples: 13577270. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:43,607][45375] Avg episode reward: [(0, '42.570'), (1, '45.930')] +[2023-10-13 01:19:44,304][46663] Updated weights for policy 1, policy_version 26501 (0.0008) +[2023-10-13 01:19:44,681][46663] Updated weights for policy 1, policy_version 26511 (0.0009) +[2023-10-13 01:19:45,051][46663] Updated weights for policy 1, policy_version 26521 (0.0009) +[2023-10-13 01:19:46,647][46662] Updated weights for policy 0, policy_version 26530 (0.0008) +[2023-10-13 01:19:47,010][46662] Updated weights for policy 0, policy_version 26540 (0.0007) +[2023-10-13 01:19:47,376][46662] Updated weights for policy 0, policy_version 26550 (0.0007) +[2023-10-13 01:19:47,753][46662] Updated weights for policy 0, policy_version 26560 (0.0009) +[2023-10-13 01:19:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54362112. Throughput: 0: 1692.2, 1: 1693.5. Samples: 13597788. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:19:48,607][45375] Avg episode reward: [(0, '45.310'), (1, '45.600')] +[2023-10-13 01:19:49,179][46663] Updated weights for policy 1, policy_version 26531 (0.0008) +[2023-10-13 01:19:49,544][46663] Updated weights for policy 1, policy_version 26541 (0.0008) +[2023-10-13 01:19:49,910][46663] Updated weights for policy 1, policy_version 26551 (0.0011) +[2023-10-13 01:19:51,857][46662] Updated weights for policy 0, policy_version 26570 (0.0007) +[2023-10-13 01:19:52,224][46662] Updated weights for policy 0, policy_version 26580 (0.0008) +[2023-10-13 01:19:52,591][46662] Updated weights for policy 0, policy_version 26590 (0.0009) +[2023-10-13 01:19:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54427648. Throughput: 0: 1667.6, 1: 1689.5. Samples: 13617196. 
Policy #0 lag: (min: 7.0, avg: 14.4, max: 39.0) +[2023-10-13 01:19:53,607][45375] Avg episode reward: [(0, '44.320'), (1, '44.940')] +[2023-10-13 01:19:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000026592_27230208.pth... +[2023-10-13 01:19:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000026560_27197440.pth... +[2023-10-13 01:19:53,651][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000025024_25624576.pth +[2023-10-13 01:19:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000024992_25591808.pth +[2023-10-13 01:19:53,805][46663] Updated weights for policy 1, policy_version 26561 (0.0008) +[2023-10-13 01:19:54,180][46663] Updated weights for policy 1, policy_version 26571 (0.0009) +[2023-10-13 01:19:54,550][46663] Updated weights for policy 1, policy_version 26581 (0.0010) +[2023-10-13 01:19:54,906][46663] Updated weights for policy 1, policy_version 26591 (0.0010) +[2023-10-13 01:19:56,685][46662] Updated weights for policy 0, policy_version 26600 (0.0008) +[2023-10-13 01:19:57,060][46662] Updated weights for policy 0, policy_version 26610 (0.0009) +[2023-10-13 01:19:57,425][46662] Updated weights for policy 0, policy_version 26620 (0.0009) +[2023-10-13 01:19:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54493184. Throughput: 0: 1696.4, 1: 1682.9. Samples: 13627388. 
Policy #0 lag: (min: 7.0, avg: 14.4, max: 39.0) +[2023-10-13 01:19:58,607][45375] Avg episode reward: [(0, '44.030'), (1, '44.720')] +[2023-10-13 01:19:58,992][46663] Updated weights for policy 1, policy_version 26601 (0.0008) +[2023-10-13 01:19:59,366][46663] Updated weights for policy 1, policy_version 26611 (0.0007) +[2023-10-13 01:19:59,734][46663] Updated weights for policy 1, policy_version 26621 (0.0008) +[2023-10-13 01:20:01,631][46662] Updated weights for policy 0, policy_version 26630 (0.0008) +[2023-10-13 01:20:02,001][46662] Updated weights for policy 0, policy_version 26640 (0.0008) +[2023-10-13 01:20:02,369][46662] Updated weights for policy 0, policy_version 26650 (0.0008) +[2023-10-13 01:20:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54558720. Throughput: 0: 1674.1, 1: 1679.6. Samples: 13647360. Policy #0 lag: (min: 7.0, avg: 14.4, max: 39.0) +[2023-10-13 01:20:03,607][45375] Avg episode reward: [(0, '43.500'), (1, '45.230')] +[2023-10-13 01:20:03,854][46663] Updated weights for policy 1, policy_version 26631 (0.0009) +[2023-10-13 01:20:04,229][46663] Updated weights for policy 1, policy_version 26641 (0.0010) +[2023-10-13 01:20:04,591][46663] Updated weights for policy 1, policy_version 26651 (0.0011) +[2023-10-13 01:20:06,720][46662] Updated weights for policy 0, policy_version 26660 (0.0009) +[2023-10-13 01:20:07,120][46662] Updated weights for policy 0, policy_version 26670 (0.0011) +[2023-10-13 01:20:07,478][46662] Updated weights for policy 0, policy_version 26680 (0.0009) +[2023-10-13 01:20:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54624256. Throughput: 0: 1663.0, 1: 1677.7. Samples: 13667060. 
Policy #0 lag: (min: 7.0, avg: 14.4, max: 39.0) +[2023-10-13 01:20:08,607][45375] Avg episode reward: [(0, '43.120'), (1, '49.240')] +[2023-10-13 01:20:08,865][46663] Updated weights for policy 1, policy_version 26661 (0.0009) +[2023-10-13 01:20:09,255][46663] Updated weights for policy 1, policy_version 26671 (0.0007) +[2023-10-13 01:20:09,624][46663] Updated weights for policy 1, policy_version 26681 (0.0007) +[2023-10-13 01:20:11,567][46662] Updated weights for policy 0, policy_version 26690 (0.0010) +[2023-10-13 01:20:11,943][46662] Updated weights for policy 0, policy_version 26700 (0.0008) +[2023-10-13 01:20:12,313][46662] Updated weights for policy 0, policy_version 26710 (0.0010) +[2023-10-13 01:20:12,688][46662] Updated weights for policy 0, policy_version 26720 (0.0011) +[2023-10-13 01:20:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54689792. Throughput: 0: 1679.9, 1: 1674.3. Samples: 13677290. Policy #0 lag: (min: 28.0, avg: 52.9, max: 56.0) +[2023-10-13 01:20:13,608][45375] Avg episode reward: [(0, '43.140'), (1, '49.920')] +[2023-10-13 01:20:13,690][46663] Updated weights for policy 1, policy_version 26691 (0.0008) +[2023-10-13 01:20:14,049][46663] Updated weights for policy 1, policy_version 26701 (0.0011) +[2023-10-13 01:20:14,417][46663] Updated weights for policy 1, policy_version 26711 (0.0010) +[2023-10-13 01:20:16,519][46662] Updated weights for policy 0, policy_version 26730 (0.0010) +[2023-10-13 01:20:16,902][46662] Updated weights for policy 0, policy_version 26740 (0.0009) +[2023-10-13 01:20:17,261][46662] Updated weights for policy 0, policy_version 26750 (0.0010) +[2023-10-13 01:20:18,600][46663] Updated weights for policy 1, policy_version 26721 (0.0007) +[2023-10-13 01:20:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54755328. Throughput: 0: 1659.9, 1: 1680.0. Samples: 13697374. 
Policy #0 lag: (min: 28.0, avg: 52.9, max: 56.0) +[2023-10-13 01:20:18,607][45375] Avg episode reward: [(0, '42.160'), (1, '51.760')] +[2023-10-13 01:20:18,975][46663] Updated weights for policy 1, policy_version 26731 (0.0008) +[2023-10-13 01:20:19,351][46663] Updated weights for policy 1, policy_version 26741 (0.0007) +[2023-10-13 01:20:19,717][46663] Updated weights for policy 1, policy_version 26751 (0.0008) +[2023-10-13 01:20:21,407][46662] Updated weights for policy 0, policy_version 26760 (0.0009) +[2023-10-13 01:20:21,777][46662] Updated weights for policy 0, policy_version 26770 (0.0008) +[2023-10-13 01:20:22,153][46662] Updated weights for policy 0, policy_version 26780 (0.0008) +[2023-10-13 01:20:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54820864. Throughput: 0: 1667.4, 1: 1675.0. Samples: 13717610. Policy #0 lag: (min: 28.0, avg: 52.9, max: 56.0) +[2023-10-13 01:20:23,608][45375] Avg episode reward: [(0, '42.470'), (1, '51.930')] +[2023-10-13 01:20:23,735][46663] Updated weights for policy 1, policy_version 26761 (0.0010) +[2023-10-13 01:20:24,101][46663] Updated weights for policy 1, policy_version 26771 (0.0010) +[2023-10-13 01:20:24,470][46663] Updated weights for policy 1, policy_version 26781 (0.0011) +[2023-10-13 01:20:26,097][46662] Updated weights for policy 0, policy_version 26790 (0.0007) +[2023-10-13 01:20:26,470][46662] Updated weights for policy 0, policy_version 26800 (0.0008) +[2023-10-13 01:20:26,835][46662] Updated weights for policy 0, policy_version 26810 (0.0008) +[2023-10-13 01:20:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 54886400. Throughput: 0: 1677.8, 1: 1673.6. Samples: 13728084. 
Policy #0 lag: (min: 28.0, avg: 52.9, max: 56.0) +[2023-10-13 01:20:28,607][45375] Avg episode reward: [(0, '43.240'), (1, '52.360')] +[2023-10-13 01:20:28,667][46663] Updated weights for policy 1, policy_version 26791 (0.0008) +[2023-10-13 01:20:29,031][46663] Updated weights for policy 1, policy_version 26801 (0.0011) +[2023-10-13 01:20:29,397][46663] Updated weights for policy 1, policy_version 26811 (0.0010) +[2023-10-13 01:20:30,865][46662] Updated weights for policy 0, policy_version 26820 (0.0009) +[2023-10-13 01:20:31,242][46662] Updated weights for policy 0, policy_version 26830 (0.0009) +[2023-10-13 01:20:31,609][46662] Updated weights for policy 0, policy_version 26840 (0.0008) +[2023-10-13 01:20:33,502][46663] Updated weights for policy 1, policy_version 26821 (0.0009) +[2023-10-13 01:20:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 54951936. Throughput: 0: 1658.1, 1: 1670.3. Samples: 13747566. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:20:33,607][45375] Avg episode reward: [(0, '43.290'), (1, '54.040')] +[2023-10-13 01:20:33,864][46663] Updated weights for policy 1, policy_version 26831 (0.0009) +[2023-10-13 01:20:34,229][46663] Updated weights for policy 1, policy_version 26841 (0.0009) +[2023-10-13 01:20:34,490][46384] Saving new best policy, reward=54.040! +[2023-10-13 01:20:35,755][46662] Updated weights for policy 0, policy_version 26850 (0.0007) +[2023-10-13 01:20:36,129][46662] Updated weights for policy 0, policy_version 26860 (0.0009) +[2023-10-13 01:20:36,497][46662] Updated weights for policy 0, policy_version 26870 (0.0007) +[2023-10-13 01:20:36,872][46662] Updated weights for policy 0, policy_version 26880 (0.0008) +[2023-10-13 01:20:38,334][46663] Updated weights for policy 1, policy_version 26851 (0.0008) +[2023-10-13 01:20:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55017472. Throughput: 0: 1683.1, 1: 1663.7. 
Samples: 13767804. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:20:38,607][45375] Avg episode reward: [(0, '41.740'), (1, '54.100')] +[2023-10-13 01:20:38,707][46663] Updated weights for policy 1, policy_version 26861 (0.0010) +[2023-10-13 01:20:39,071][46663] Updated weights for policy 1, policy_version 26871 (0.0008) +[2023-10-13 01:20:39,401][46384] Saving new best policy, reward=54.100! +[2023-10-13 01:20:40,737][46662] Updated weights for policy 0, policy_version 26890 (0.0008) +[2023-10-13 01:20:41,108][46662] Updated weights for policy 0, policy_version 26900 (0.0008) +[2023-10-13 01:20:41,477][46662] Updated weights for policy 0, policy_version 26910 (0.0009) +[2023-10-13 01:20:43,266][46663] Updated weights for policy 1, policy_version 26881 (0.0010) +[2023-10-13 01:20:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55083008. Throughput: 0: 1676.7, 1: 1668.0. Samples: 13777896. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:20:43,608][45375] Avg episode reward: [(0, '42.020'), (1, '53.770')] +[2023-10-13 01:20:43,631][46663] Updated weights for policy 1, policy_version 26891 (0.0009) +[2023-10-13 01:20:43,999][46663] Updated weights for policy 1, policy_version 26901 (0.0010) +[2023-10-13 01:20:44,373][46663] Updated weights for policy 1, policy_version 26911 (0.0008) +[2023-10-13 01:20:45,490][46662] Updated weights for policy 0, policy_version 26920 (0.0007) +[2023-10-13 01:20:45,862][46662] Updated weights for policy 0, policy_version 26930 (0.0007) +[2023-10-13 01:20:46,220][46662] Updated weights for policy 0, policy_version 26940 (0.0007) +[2023-10-13 01:20:48,246][46663] Updated weights for policy 1, policy_version 26921 (0.0009) +[2023-10-13 01:20:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55148544. Throughput: 0: 1676.8, 1: 1669.4. Samples: 13797942. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:20:48,607][45375] Avg episode reward: [(0, '42.180'), (1, '54.420')] +[2023-10-13 01:20:48,612][46663] Updated weights for policy 1, policy_version 26931 (0.0009) +[2023-10-13 01:20:48,982][46663] Updated weights for policy 1, policy_version 26941 (0.0010) +[2023-10-13 01:20:49,094][46384] Saving new best policy, reward=54.420! +[2023-10-13 01:20:50,255][46662] Updated weights for policy 0, policy_version 26950 (0.0008) +[2023-10-13 01:20:50,634][46662] Updated weights for policy 0, policy_version 26960 (0.0008) +[2023-10-13 01:20:50,993][46662] Updated weights for policy 0, policy_version 26970 (0.0009) +[2023-10-13 01:20:53,301][46663] Updated weights for policy 1, policy_version 26951 (0.0010) +[2023-10-13 01:20:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55214080. Throughput: 0: 1696.0, 1: 1656.8. Samples: 13817936. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 01:20:53,607][45375] Avg episode reward: [(0, '42.840'), (1, '56.730')] +[2023-10-13 01:20:53,683][46663] Updated weights for policy 1, policy_version 26961 (0.0008) +[2023-10-13 01:20:54,056][46663] Updated weights for policy 1, policy_version 26971 (0.0008) +[2023-10-13 01:20:54,241][46384] Saving new best policy, reward=56.730! +[2023-10-13 01:20:55,122][46662] Updated weights for policy 0, policy_version 26980 (0.0008) +[2023-10-13 01:20:55,511][46662] Updated weights for policy 0, policy_version 26990 (0.0007) +[2023-10-13 01:20:55,886][46662] Updated weights for policy 0, policy_version 27000 (0.0008) +[2023-10-13 01:20:58,323][46663] Updated weights for policy 1, policy_version 26981 (0.0009) +[2023-10-13 01:20:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 55279616. Throughput: 0: 1676.0, 1: 1666.4. Samples: 13827698. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 01:20:58,607][45375] Avg episode reward: [(0, '43.590'), (1, '56.360')] +[2023-10-13 01:20:58,685][46663] Updated weights for policy 1, policy_version 26991 (0.0009) +[2023-10-13 01:20:59,058][46663] Updated weights for policy 1, policy_version 27001 (0.0009) +[2023-10-13 01:21:00,022][46662] Updated weights for policy 0, policy_version 27010 (0.0009) +[2023-10-13 01:21:00,398][46662] Updated weights for policy 0, policy_version 27020 (0.0009) +[2023-10-13 01:21:00,766][46662] Updated weights for policy 0, policy_version 27030 (0.0008) +[2023-10-13 01:21:01,134][46662] Updated weights for policy 0, policy_version 27040 (0.0007) +[2023-10-13 01:21:03,149][46663] Updated weights for policy 1, policy_version 27011 (0.0009) +[2023-10-13 01:21:03,510][46663] Updated weights for policy 1, policy_version 27021 (0.0009) +[2023-10-13 01:21:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 55345152. Throughput: 0: 1678.5, 1: 1662.9. Samples: 13847738. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 01:21:03,607][45375] Avg episode reward: [(0, '43.330'), (1, '56.460')] +[2023-10-13 01:21:03,876][46663] Updated weights for policy 1, policy_version 27031 (0.0011) +[2023-10-13 01:21:05,286][46662] Updated weights for policy 0, policy_version 27050 (0.0008) +[2023-10-13 01:21:05,664][46662] Updated weights for policy 0, policy_version 27060 (0.0009) +[2023-10-13 01:21:06,031][46662] Updated weights for policy 0, policy_version 27070 (0.0008) +[2023-10-13 01:21:08,004][46663] Updated weights for policy 1, policy_version 27041 (0.0007) +[2023-10-13 01:21:08,362][46663] Updated weights for policy 1, policy_version 27051 (0.0008) +[2023-10-13 01:21:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 55410688. Throughput: 0: 1689.7, 1: 1651.3. Samples: 13867950. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 01:21:08,607][45375] Avg episode reward: [(0, '43.330'), (1, '55.190')] +[2023-10-13 01:21:08,734][46663] Updated weights for policy 1, policy_version 27061 (0.0009) +[2023-10-13 01:21:09,091][46663] Updated weights for policy 1, policy_version 27071 (0.0007) +[2023-10-13 01:21:10,115][46662] Updated weights for policy 0, policy_version 27080 (0.0008) +[2023-10-13 01:21:10,489][46662] Updated weights for policy 0, policy_version 27090 (0.0008) +[2023-10-13 01:21:10,862][46662] Updated weights for policy 0, policy_version 27100 (0.0009) +[2023-10-13 01:21:13,055][46663] Updated weights for policy 1, policy_version 27081 (0.0008) +[2023-10-13 01:21:13,430][46663] Updated weights for policy 1, policy_version 27091 (0.0007) +[2023-10-13 01:21:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 55476224. Throughput: 0: 1664.3, 1: 1665.3. Samples: 13877920. Policy #0 lag: (min: 11.0, avg: 36.9, max: 40.0) +[2023-10-13 01:21:13,608][45375] Avg episode reward: [(0, '43.540'), (1, '53.480')] +[2023-10-13 01:21:13,790][46663] Updated weights for policy 1, policy_version 27101 (0.0011) +[2023-10-13 01:21:14,856][46662] Updated weights for policy 0, policy_version 27110 (0.0008) +[2023-10-13 01:21:15,218][46662] Updated weights for policy 0, policy_version 27120 (0.0008) +[2023-10-13 01:21:15,590][46662] Updated weights for policy 0, policy_version 27130 (0.0008) +[2023-10-13 01:21:17,856][46663] Updated weights for policy 1, policy_version 27111 (0.0008) +[2023-10-13 01:21:18,230][46663] Updated weights for policy 1, policy_version 27121 (0.0009) +[2023-10-13 01:21:18,592][46663] Updated weights for policy 1, policy_version 27131 (0.0008) +[2023-10-13 01:21:18,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 55541760. Throughput: 0: 1687.9, 1: 1665.9. Samples: 13898486. 
Policy #0 lag: (min: 11.0, avg: 36.9, max: 40.0) +[2023-10-13 01:21:18,608][45375] Avg episode reward: [(0, '43.720'), (1, '52.140')] +[2023-10-13 01:21:19,737][46662] Updated weights for policy 0, policy_version 27140 (0.0007) +[2023-10-13 01:21:20,103][46662] Updated weights for policy 0, policy_version 27150 (0.0007) +[2023-10-13 01:21:20,478][46662] Updated weights for policy 0, policy_version 27160 (0.0007) +[2023-10-13 01:21:22,695][46663] Updated weights for policy 1, policy_version 27141 (0.0008) +[2023-10-13 01:21:23,072][46663] Updated weights for policy 1, policy_version 27151 (0.0007) +[2023-10-13 01:21:23,441][46663] Updated weights for policy 1, policy_version 27161 (0.0009) +[2023-10-13 01:21:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 55607296. Throughput: 0: 1686.5, 1: 1654.5. Samples: 13918152. Policy #0 lag: (min: 11.0, avg: 36.9, max: 40.0) +[2023-10-13 01:21:23,608][45375] Avg episode reward: [(0, '44.610'), (1, '49.910')] +[2023-10-13 01:21:24,637][46662] Updated weights for policy 0, policy_version 27170 (0.0010) +[2023-10-13 01:21:25,005][46662] Updated weights for policy 0, policy_version 27180 (0.0007) +[2023-10-13 01:21:25,372][46662] Updated weights for policy 0, policy_version 27190 (0.0008) +[2023-10-13 01:21:25,744][46662] Updated weights for policy 0, policy_version 27200 (0.0007) +[2023-10-13 01:21:27,345][46663] Updated weights for policy 1, policy_version 27171 (0.0008) +[2023-10-13 01:21:27,711][46663] Updated weights for policy 1, policy_version 27181 (0.0011) +[2023-10-13 01:21:28,078][46663] Updated weights for policy 1, policy_version 27191 (0.0010) +[2023-10-13 01:21:28,606][45375] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55705600. Throughput: 0: 1664.3, 1: 1672.3. Samples: 13928042. 
Policy #0 lag: (min: 11.0, avg: 36.9, max: 40.0) +[2023-10-13 01:21:28,607][45375] Avg episode reward: [(0, '45.140'), (1, '49.320')] +[2023-10-13 01:21:29,644][46662] Updated weights for policy 0, policy_version 27210 (0.0009) +[2023-10-13 01:21:30,006][46662] Updated weights for policy 0, policy_version 27220 (0.0008) +[2023-10-13 01:21:30,383][46662] Updated weights for policy 0, policy_version 27230 (0.0008) +[2023-10-13 01:21:32,325][46663] Updated weights for policy 1, policy_version 27201 (0.0010) +[2023-10-13 01:21:32,692][46663] Updated weights for policy 1, policy_version 27211 (0.0011) +[2023-10-13 01:21:33,072][46663] Updated weights for policy 1, policy_version 27221 (0.0010) +[2023-10-13 01:21:33,448][46663] Updated weights for policy 1, policy_version 27231 (0.0011) +[2023-10-13 01:21:33,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55771136. Throughput: 0: 1680.7, 1: 1665.6. Samples: 13948528. Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-13 01:21:33,608][45375] Avg episode reward: [(0, '45.240'), (1, '47.670')] +[2023-10-13 01:21:34,577][46662] Updated weights for policy 0, policy_version 27240 (0.0008) +[2023-10-13 01:21:34,943][46662] Updated weights for policy 0, policy_version 27250 (0.0009) +[2023-10-13 01:21:35,301][46662] Updated weights for policy 0, policy_version 27260 (0.0008) +[2023-10-13 01:21:37,648][46663] Updated weights for policy 1, policy_version 27241 (0.0010) +[2023-10-13 01:21:38,026][46663] Updated weights for policy 1, policy_version 27251 (0.0010) +[2023-10-13 01:21:38,393][46663] Updated weights for policy 1, policy_version 27261 (0.0007) +[2023-10-13 01:21:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 55836672. Throughput: 0: 1682.0, 1: 1657.4. Samples: 13968208. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-13 01:21:38,607][45375] Avg episode reward: [(0, '47.110'), (1, '47.330')] +[2023-10-13 01:21:39,142][46662] Updated weights for policy 0, policy_version 27270 (0.0008) +[2023-10-13 01:21:39,526][46662] Updated weights for policy 0, policy_version 27280 (0.0008) +[2023-10-13 01:21:39,887][46662] Updated weights for policy 0, policy_version 27290 (0.0007) +[2023-10-13 01:21:42,313][46663] Updated weights for policy 1, policy_version 27271 (0.0009) +[2023-10-13 01:21:42,684][46663] Updated weights for policy 1, policy_version 27281 (0.0009) +[2023-10-13 01:21:43,055][46663] Updated weights for policy 1, policy_version 27291 (0.0008) +[2023-10-13 01:21:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55902208. Throughput: 0: 1672.2, 1: 1677.6. Samples: 13978442. Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-13 01:21:43,607][45375] Avg episode reward: [(0, '44.850'), (1, '48.510')] +[2023-10-13 01:21:44,015][46662] Updated weights for policy 0, policy_version 27300 (0.0009) +[2023-10-13 01:21:44,410][46662] Updated weights for policy 0, policy_version 27310 (0.0009) +[2023-10-13 01:21:44,783][46662] Updated weights for policy 0, policy_version 27320 (0.0009) +[2023-10-13 01:21:47,065][46663] Updated weights for policy 1, policy_version 27301 (0.0008) +[2023-10-13 01:21:47,436][46663] Updated weights for policy 1, policy_version 27311 (0.0008) +[2023-10-13 01:21:47,803][46663] Updated weights for policy 1, policy_version 27321 (0.0008) +[2023-10-13 01:21:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 55967744. Throughput: 0: 1681.1, 1: 1667.9. Samples: 13998444. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-13 01:21:48,607][45375] Avg episode reward: [(0, '44.770'), (1, '49.480')] +[2023-10-13 01:21:48,914][46662] Updated weights for policy 0, policy_version 27330 (0.0008) +[2023-10-13 01:21:49,284][46662] Updated weights for policy 0, policy_version 27340 (0.0007) +[2023-10-13 01:21:49,651][46662] Updated weights for policy 0, policy_version 27350 (0.0008) +[2023-10-13 01:21:50,027][46662] Updated weights for policy 0, policy_version 27360 (0.0010) +[2023-10-13 01:21:51,844][46663] Updated weights for policy 1, policy_version 27331 (0.0009) +[2023-10-13 01:21:52,215][46663] Updated weights for policy 1, policy_version 27341 (0.0008) +[2023-10-13 01:21:52,577][46663] Updated weights for policy 1, policy_version 27351 (0.0008) +[2023-10-13 01:21:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56033280. Throughput: 0: 1682.6, 1: 1664.4. Samples: 14018568. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:21:53,607][45375] Avg episode reward: [(0, '44.150'), (1, '49.160')] +[2023-10-13 01:21:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000027360_28016640.pth... +[2023-10-13 01:21:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000027360_28016640.pth... 
+[2023-10-13 01:21:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000025792_26411008.pth +[2023-10-13 01:21:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000025792_26411008.pth +[2023-10-13 01:21:54,236][46662] Updated weights for policy 0, policy_version 27370 (0.0008) +[2023-10-13 01:21:54,610][46662] Updated weights for policy 0, policy_version 27380 (0.0008) +[2023-10-13 01:21:54,986][46662] Updated weights for policy 0, policy_version 27390 (0.0008) +[2023-10-13 01:21:56,522][46663] Updated weights for policy 1, policy_version 27361 (0.0007) +[2023-10-13 01:21:56,884][46663] Updated weights for policy 1, policy_version 27371 (0.0008) +[2023-10-13 01:21:57,248][46663] Updated weights for policy 1, policy_version 27381 (0.0008) +[2023-10-13 01:21:57,620][46663] Updated weights for policy 1, policy_version 27391 (0.0008) +[2023-10-13 01:21:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56098816. Throughput: 0: 1677.4, 1: 1682.6. Samples: 14029118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:21:58,607][45375] Avg episode reward: [(0, '44.120'), (1, '47.650')] +[2023-10-13 01:21:59,108][46662] Updated weights for policy 0, policy_version 27400 (0.0008) +[2023-10-13 01:21:59,477][46662] Updated weights for policy 0, policy_version 27410 (0.0007) +[2023-10-13 01:21:59,851][46662] Updated weights for policy 0, policy_version 27420 (0.0009) +[2023-10-13 01:22:01,775][46663] Updated weights for policy 1, policy_version 27401 (0.0010) +[2023-10-13 01:22:02,136][46663] Updated weights for policy 1, policy_version 27411 (0.0010) +[2023-10-13 01:22:02,502][46663] Updated weights for policy 1, policy_version 27421 (0.0009) +[2023-10-13 01:22:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56164352. Throughput: 0: 1679.3, 1: 1660.3. Samples: 14048768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:22:03,607][45375] Avg episode reward: [(0, '44.690'), (1, '47.440')] +[2023-10-13 01:22:03,831][46662] Updated weights for policy 0, policy_version 27430 (0.0012) +[2023-10-13 01:22:04,201][46662] Updated weights for policy 0, policy_version 27440 (0.0008) +[2023-10-13 01:22:04,570][46662] Updated weights for policy 0, policy_version 27450 (0.0008) +[2023-10-13 01:22:06,642][46663] Updated weights for policy 1, policy_version 27431 (0.0011) +[2023-10-13 01:22:07,013][46663] Updated weights for policy 1, policy_version 27441 (0.0010) +[2023-10-13 01:22:07,383][46663] Updated weights for policy 1, policy_version 27451 (0.0008) +[2023-10-13 01:22:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56229888. Throughput: 0: 1681.7, 1: 1672.0. Samples: 14069068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:22:08,607][45375] Avg episode reward: [(0, '44.340'), (1, '47.370')] +[2023-10-13 01:22:08,691][46662] Updated weights for policy 0, policy_version 27460 (0.0009) +[2023-10-13 01:22:09,062][46662] Updated weights for policy 0, policy_version 27470 (0.0009) +[2023-10-13 01:22:09,428][46662] Updated weights for policy 0, policy_version 27480 (0.0009) +[2023-10-13 01:22:11,497][46663] Updated weights for policy 1, policy_version 27461 (0.0010) +[2023-10-13 01:22:11,867][46663] Updated weights for policy 1, policy_version 27471 (0.0008) +[2023-10-13 01:22:12,242][46663] Updated weights for policy 1, policy_version 27481 (0.0007) +[2023-10-13 01:22:13,515][46662] Updated weights for policy 0, policy_version 27490 (0.0008) +[2023-10-13 01:22:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56295424. Throughput: 0: 1679.5, 1: 1681.4. Samples: 14079284. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:22:13,607][45375] Avg episode reward: [(0, '44.800'), (1, '47.840')] +[2023-10-13 01:22:13,889][46662] Updated weights for policy 0, policy_version 27500 (0.0007) +[2023-10-13 01:22:14,261][46662] Updated weights for policy 0, policy_version 27510 (0.0007) +[2023-10-13 01:22:14,625][46662] Updated weights for policy 0, policy_version 27520 (0.0007) +[2023-10-13 01:22:16,316][46663] Updated weights for policy 1, policy_version 27491 (0.0009) +[2023-10-13 01:22:16,691][46663] Updated weights for policy 1, policy_version 27501 (0.0008) +[2023-10-13 01:22:17,058][46663] Updated weights for policy 1, policy_version 27511 (0.0009) +[2023-10-13 01:22:18,418][46662] Updated weights for policy 0, policy_version 27530 (0.0010) +[2023-10-13 01:22:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56360960. Throughput: 0: 1681.8, 1: 1662.4. Samples: 14099016. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:22:18,607][45375] Avg episode reward: [(0, '42.920'), (1, '48.180')] +[2023-10-13 01:22:18,790][46662] Updated weights for policy 0, policy_version 27540 (0.0010) +[2023-10-13 01:22:19,164][46662] Updated weights for policy 0, policy_version 27550 (0.0007) +[2023-10-13 01:22:21,312][46663] Updated weights for policy 1, policy_version 27521 (0.0008) +[2023-10-13 01:22:21,676][46663] Updated weights for policy 1, policy_version 27531 (0.0010) +[2023-10-13 01:22:22,040][46663] Updated weights for policy 1, policy_version 27541 (0.0010) +[2023-10-13 01:22:22,409][46663] Updated weights for policy 1, policy_version 27551 (0.0008) +[2023-10-13 01:22:23,227][46662] Updated weights for policy 0, policy_version 27560 (0.0010) +[2023-10-13 01:22:23,601][46662] Updated weights for policy 0, policy_version 27570 (0.0010) +[2023-10-13 01:22:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56426496. 
Throughput: 0: 1682.0, 1: 1684.0. Samples: 14119682. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:22:23,607][45375] Avg episode reward: [(0, '42.210'), (1, '48.680')] +[2023-10-13 01:22:23,971][46662] Updated weights for policy 0, policy_version 27580 (0.0009) +[2023-10-13 01:22:26,496][46663] Updated weights for policy 1, policy_version 27561 (0.0008) +[2023-10-13 01:22:26,857][46663] Updated weights for policy 1, policy_version 27571 (0.0011) +[2023-10-13 01:22:27,228][46663] Updated weights for policy 1, policy_version 27581 (0.0011) +[2023-10-13 01:22:28,054][46662] Updated weights for policy 0, policy_version 27590 (0.0007) +[2023-10-13 01:22:28,431][46662] Updated weights for policy 0, policy_version 27600 (0.0007) +[2023-10-13 01:22:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56492032. Throughput: 0: 1683.2, 1: 1679.6. Samples: 14129768. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-13 01:22:28,607][45375] Avg episode reward: [(0, '40.470'), (1, '49.250')] +[2023-10-13 01:22:28,809][46662] Updated weights for policy 0, policy_version 27610 (0.0010) +[2023-10-13 01:22:31,368][46663] Updated weights for policy 1, policy_version 27591 (0.0010) +[2023-10-13 01:22:31,750][46663] Updated weights for policy 1, policy_version 27601 (0.0008) +[2023-10-13 01:22:32,129][46663] Updated weights for policy 1, policy_version 27611 (0.0008) +[2023-10-13 01:22:33,043][46662] Updated weights for policy 0, policy_version 27620 (0.0009) +[2023-10-13 01:22:33,430][46662] Updated weights for policy 0, policy_version 27630 (0.0011) +[2023-10-13 01:22:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56557568. Throughput: 0: 1682.7, 1: 1664.4. Samples: 14149064. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 01:22:33,607][45375] Avg episode reward: [(0, '39.470'), (1, '50.430')] +[2023-10-13 01:22:33,808][46662] Updated weights for policy 0, policy_version 27640 (0.0010) +[2023-10-13 01:22:36,111][46663] Updated weights for policy 1, policy_version 27621 (0.0009) +[2023-10-13 01:22:36,475][46663] Updated weights for policy 1, policy_version 27631 (0.0008) +[2023-10-13 01:22:36,840][46663] Updated weights for policy 1, policy_version 27641 (0.0007) +[2023-10-13 01:22:37,777][46662] Updated weights for policy 0, policy_version 27650 (0.0009) +[2023-10-13 01:22:38,150][46662] Updated weights for policy 0, policy_version 27660 (0.0010) +[2023-10-13 01:22:38,525][46662] Updated weights for policy 0, policy_version 27670 (0.0009) +[2023-10-13 01:22:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56623104. Throughput: 0: 1678.0, 1: 1673.5. Samples: 14169388. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 01:22:38,607][45375] Avg episode reward: [(0, '38.980'), (1, '50.900')] +[2023-10-13 01:22:38,888][46662] Updated weights for policy 0, policy_version 27680 (0.0008) +[2023-10-13 01:22:40,918][46663] Updated weights for policy 1, policy_version 27651 (0.0010) +[2023-10-13 01:22:41,282][46663] Updated weights for policy 1, policy_version 27661 (0.0007) +[2023-10-13 01:22:41,644][46663] Updated weights for policy 1, policy_version 27671 (0.0008) +[2023-10-13 01:22:42,826][46662] Updated weights for policy 0, policy_version 27690 (0.0008) +[2023-10-13 01:22:43,200][46662] Updated weights for policy 0, policy_version 27700 (0.0010) +[2023-10-13 01:22:43,568][46662] Updated weights for policy 0, policy_version 27710 (0.0008) +[2023-10-13 01:22:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56688640. Throughput: 0: 1679.8, 1: 1657.1. Samples: 14179278. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 01:22:43,608][45375] Avg episode reward: [(0, '38.890'), (1, '50.280')] +[2023-10-13 01:22:45,769][46663] Updated weights for policy 1, policy_version 27681 (0.0007) +[2023-10-13 01:22:46,134][46663] Updated weights for policy 1, policy_version 27691 (0.0009) +[2023-10-13 01:22:46,514][46663] Updated weights for policy 1, policy_version 27701 (0.0009) +[2023-10-13 01:22:46,881][46663] Updated weights for policy 1, policy_version 27711 (0.0008) +[2023-10-13 01:22:47,530][46662] Updated weights for policy 0, policy_version 27720 (0.0008) +[2023-10-13 01:22:47,905][46662] Updated weights for policy 0, policy_version 27730 (0.0009) +[2023-10-13 01:22:48,276][46662] Updated weights for policy 0, policy_version 27740 (0.0009) +[2023-10-13 01:22:48,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56786944. Throughput: 0: 1686.0, 1: 1664.2. Samples: 14199526. Policy #0 lag: (min: 10.0, avg: 10.8, max: 26.0) +[2023-10-13 01:22:48,607][45375] Avg episode reward: [(0, '36.800'), (1, '50.980')] +[2023-10-13 01:22:50,986][46663] Updated weights for policy 1, policy_version 27721 (0.0008) +[2023-10-13 01:22:51,353][46663] Updated weights for policy 1, policy_version 27731 (0.0007) +[2023-10-13 01:22:51,713][46663] Updated weights for policy 1, policy_version 27741 (0.0007) +[2023-10-13 01:22:52,384][46662] Updated weights for policy 0, policy_version 27750 (0.0008) +[2023-10-13 01:22:52,755][46662] Updated weights for policy 0, policy_version 27760 (0.0008) +[2023-10-13 01:22:53,123][46662] Updated weights for policy 0, policy_version 27770 (0.0011) +[2023-10-13 01:22:53,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56852480. Throughput: 0: 1674.3, 1: 1667.6. Samples: 14219454. 
Policy #0 lag: (min: 10.0, avg: 10.8, max: 26.0) +[2023-10-13 01:22:53,608][45375] Avg episode reward: [(0, '34.970'), (1, '50.560')] +[2023-10-13 01:22:55,988][46663] Updated weights for policy 1, policy_version 27751 (0.0010) +[2023-10-13 01:22:56,351][46663] Updated weights for policy 1, policy_version 27761 (0.0007) +[2023-10-13 01:22:56,715][46663] Updated weights for policy 1, policy_version 27771 (0.0011) +[2023-10-13 01:22:57,217][46662] Updated weights for policy 0, policy_version 27780 (0.0011) +[2023-10-13 01:22:57,580][46662] Updated weights for policy 0, policy_version 27790 (0.0009) +[2023-10-13 01:22:57,963][46662] Updated weights for policy 0, policy_version 27800 (0.0010) +[2023-10-13 01:22:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56918016. Throughput: 0: 1690.9, 1: 1648.7. Samples: 14229564. Policy #0 lag: (min: 10.0, avg: 10.8, max: 26.0) +[2023-10-13 01:22:58,607][45375] Avg episode reward: [(0, '34.630'), (1, '49.470')] +[2023-10-13 01:23:00,675][46663] Updated weights for policy 1, policy_version 27781 (0.0009) +[2023-10-13 01:23:01,053][46663] Updated weights for policy 1, policy_version 27791 (0.0010) +[2023-10-13 01:23:01,409][46663] Updated weights for policy 1, policy_version 27801 (0.0008) +[2023-10-13 01:23:02,064][46662] Updated weights for policy 0, policy_version 27810 (0.0008) +[2023-10-13 01:23:02,441][46662] Updated weights for policy 0, policy_version 27820 (0.0007) +[2023-10-13 01:23:02,806][46662] Updated weights for policy 0, policy_version 27830 (0.0007) +[2023-10-13 01:23:03,178][46662] Updated weights for policy 0, policy_version 27840 (0.0009) +[2023-10-13 01:23:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56983552. Throughput: 0: 1691.1, 1: 1660.5. Samples: 14249838. 
Policy #0 lag: (min: 10.0, avg: 10.8, max: 26.0) +[2023-10-13 01:23:03,608][45375] Avg episode reward: [(0, '34.130'), (1, '48.920')] +[2023-10-13 01:23:05,584][46663] Updated weights for policy 1, policy_version 27811 (0.0008) +[2023-10-13 01:23:05,958][46663] Updated weights for policy 1, policy_version 27821 (0.0010) +[2023-10-13 01:23:06,315][46663] Updated weights for policy 1, policy_version 27831 (0.0008) +[2023-10-13 01:23:07,231][46662] Updated weights for policy 0, policy_version 27850 (0.0008) +[2023-10-13 01:23:07,603][46662] Updated weights for policy 0, policy_version 27860 (0.0010) +[2023-10-13 01:23:07,978][46662] Updated weights for policy 0, policy_version 27870 (0.0010) +[2023-10-13 01:23:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57049088. Throughput: 0: 1671.7, 1: 1660.2. Samples: 14269616. Policy #0 lag: (min: 31.0, avg: 31.9, max: 48.0) +[2023-10-13 01:23:08,607][45375] Avg episode reward: [(0, '34.650'), (1, '50.530')] +[2023-10-13 01:23:10,490][46663] Updated weights for policy 1, policy_version 27841 (0.0010) +[2023-10-13 01:23:10,862][46663] Updated weights for policy 1, policy_version 27851 (0.0007) +[2023-10-13 01:23:11,231][46663] Updated weights for policy 1, policy_version 27861 (0.0008) +[2023-10-13 01:23:11,610][46663] Updated weights for policy 1, policy_version 27871 (0.0008) +[2023-10-13 01:23:12,060][46662] Updated weights for policy 0, policy_version 27880 (0.0008) +[2023-10-13 01:23:12,441][46662] Updated weights for policy 0, policy_version 27890 (0.0010) +[2023-10-13 01:23:12,818][46662] Updated weights for policy 0, policy_version 27900 (0.0008) +[2023-10-13 01:23:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57114624. Throughput: 0: 1695.6, 1: 1646.7. Samples: 14280176. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 48.0) +[2023-10-13 01:23:13,608][45375] Avg episode reward: [(0, '35.560'), (1, '49.970')] +[2023-10-13 01:23:15,650][46663] Updated weights for policy 1, policy_version 27881 (0.0011) +[2023-10-13 01:23:16,023][46663] Updated weights for policy 1, policy_version 27891 (0.0010) +[2023-10-13 01:23:16,388][46663] Updated weights for policy 1, policy_version 27901 (0.0008) +[2023-10-13 01:23:16,784][46662] Updated weights for policy 0, policy_version 27910 (0.0008) +[2023-10-13 01:23:17,156][46662] Updated weights for policy 0, policy_version 27920 (0.0009) +[2023-10-13 01:23:17,526][46662] Updated weights for policy 0, policy_version 27930 (0.0011) +[2023-10-13 01:23:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57180160. Throughput: 0: 1695.7, 1: 1666.7. Samples: 14300374. Policy #0 lag: (min: 31.0, avg: 31.9, max: 48.0) +[2023-10-13 01:23:18,608][45375] Avg episode reward: [(0, '36.380'), (1, '50.910')] +[2023-10-13 01:23:20,368][46663] Updated weights for policy 1, policy_version 27911 (0.0008) +[2023-10-13 01:23:20,760][46663] Updated weights for policy 1, policy_version 27921 (0.0008) +[2023-10-13 01:23:21,127][46663] Updated weights for policy 1, policy_version 27931 (0.0007) +[2023-10-13 01:23:21,672][46662] Updated weights for policy 0, policy_version 27940 (0.0009) +[2023-10-13 01:23:22,042][46662] Updated weights for policy 0, policy_version 27950 (0.0007) +[2023-10-13 01:23:22,419][46662] Updated weights for policy 0, policy_version 27960 (0.0008) +[2023-10-13 01:23:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57245696. Throughput: 0: 1675.0, 1: 1670.1. Samples: 14319920. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 48.0) +[2023-10-13 01:23:23,608][45375] Avg episode reward: [(0, '37.420'), (1, '50.200')] +[2023-10-13 01:23:25,146][46663] Updated weights for policy 1, policy_version 27941 (0.0007) +[2023-10-13 01:23:25,507][46663] Updated weights for policy 1, policy_version 27951 (0.0011) +[2023-10-13 01:23:25,878][46663] Updated weights for policy 1, policy_version 27961 (0.0010) +[2023-10-13 01:23:26,443][46662] Updated weights for policy 0, policy_version 27970 (0.0010) +[2023-10-13 01:23:26,824][46662] Updated weights for policy 0, policy_version 27980 (0.0011) +[2023-10-13 01:23:27,203][46662] Updated weights for policy 0, policy_version 27990 (0.0009) +[2023-10-13 01:23:27,568][46662] Updated weights for policy 0, policy_version 28000 (0.0008) +[2023-10-13 01:23:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57311232. Throughput: 0: 1699.4, 1: 1656.0. Samples: 14330270. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 01:23:28,607][45375] Avg episode reward: [(0, '37.440'), (1, '50.700')] +[2023-10-13 01:23:30,169][46663] Updated weights for policy 1, policy_version 27971 (0.0007) +[2023-10-13 01:23:30,538][46663] Updated weights for policy 1, policy_version 27981 (0.0008) +[2023-10-13 01:23:30,907][46663] Updated weights for policy 1, policy_version 27991 (0.0008) +[2023-10-13 01:23:31,436][46662] Updated weights for policy 0, policy_version 28010 (0.0009) +[2023-10-13 01:23:31,803][46662] Updated weights for policy 0, policy_version 28020 (0.0008) +[2023-10-13 01:23:32,180][46662] Updated weights for policy 0, policy_version 28030 (0.0008) +[2023-10-13 01:23:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57376768. Throughput: 0: 1674.6, 1: 1670.0. Samples: 14350032. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 01:23:33,608][45375] Avg episode reward: [(0, '34.950'), (1, '50.510')] +[2023-10-13 01:23:34,923][46663] Updated weights for policy 1, policy_version 28001 (0.0008) +[2023-10-13 01:23:35,289][46663] Updated weights for policy 1, policy_version 28011 (0.0010) +[2023-10-13 01:23:35,659][46663] Updated weights for policy 1, policy_version 28021 (0.0011) +[2023-10-13 01:23:36,031][46663] Updated weights for policy 1, policy_version 28031 (0.0010) +[2023-10-13 01:23:36,288][46662] Updated weights for policy 0, policy_version 28040 (0.0008) +[2023-10-13 01:23:36,655][46662] Updated weights for policy 0, policy_version 28050 (0.0008) +[2023-10-13 01:23:37,019][46662] Updated weights for policy 0, policy_version 28060 (0.0008) +[2023-10-13 01:23:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57442304. Throughput: 0: 1675.1, 1: 1677.2. Samples: 14370308. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 01:23:38,607][45375] Avg episode reward: [(0, '37.020'), (1, '49.690')] +[2023-10-13 01:23:40,195][46663] Updated weights for policy 1, policy_version 28041 (0.0009) +[2023-10-13 01:23:40,560][46663] Updated weights for policy 1, policy_version 28051 (0.0007) +[2023-10-13 01:23:40,922][46663] Updated weights for policy 1, policy_version 28061 (0.0008) +[2023-10-13 01:23:41,028][46662] Updated weights for policy 0, policy_version 28070 (0.0009) +[2023-10-13 01:23:41,400][46662] Updated weights for policy 0, policy_version 28080 (0.0008) +[2023-10-13 01:23:41,779][46662] Updated weights for policy 0, policy_version 28090 (0.0010) +[2023-10-13 01:23:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57507840. Throughput: 0: 1692.6, 1: 1666.2. Samples: 14380710. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-13 01:23:43,608][45375] Avg episode reward: [(0, '38.200'), (1, '48.710')] +[2023-10-13 01:23:45,090][46663] Updated weights for policy 1, policy_version 28071 (0.0008) +[2023-10-13 01:23:45,464][46663] Updated weights for policy 1, policy_version 28081 (0.0010) +[2023-10-13 01:23:45,832][46663] Updated weights for policy 1, policy_version 28091 (0.0008) +[2023-10-13 01:23:45,860][46662] Updated weights for policy 0, policy_version 28100 (0.0008) +[2023-10-13 01:23:46,244][46662] Updated weights for policy 0, policy_version 28110 (0.0010) +[2023-10-13 01:23:46,609][46662] Updated weights for policy 0, policy_version 28120 (0.0011) +[2023-10-13 01:23:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57573376. Throughput: 0: 1662.3, 1: 1678.0. Samples: 14400150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:23:48,607][45375] Avg episode reward: [(0, '38.770'), (1, '48.760')] +[2023-10-13 01:23:49,923][46663] Updated weights for policy 1, policy_version 28101 (0.0009) +[2023-10-13 01:23:50,295][46663] Updated weights for policy 1, policy_version 28111 (0.0009) +[2023-10-13 01:23:50,663][46663] Updated weights for policy 1, policy_version 28121 (0.0009) +[2023-10-13 01:23:50,708][46662] Updated weights for policy 0, policy_version 28130 (0.0008) +[2023-10-13 01:23:51,083][46662] Updated weights for policy 0, policy_version 28140 (0.0007) +[2023-10-13 01:23:51,450][46662] Updated weights for policy 0, policy_version 28150 (0.0008) +[2023-10-13 01:23:51,832][46662] Updated weights for policy 0, policy_version 28160 (0.0011) +[2023-10-13 01:23:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57638912. Throughput: 0: 1675.0, 1: 1681.1. Samples: 14420642. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:23:53,608][45375] Avg episode reward: [(0, '37.910'), (1, '48.980')] +[2023-10-13 01:23:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000028128_28803072.pth... +[2023-10-13 01:23:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000028160_28835840.pth... +[2023-10-13 01:23:53,659][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000026592_27230208.pth +[2023-10-13 01:23:53,659][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000026560_27197440.pth +[2023-10-13 01:23:54,717][46663] Updated weights for policy 1, policy_version 28131 (0.0008) +[2023-10-13 01:23:55,080][46663] Updated weights for policy 1, policy_version 28141 (0.0009) +[2023-10-13 01:23:55,446][46663] Updated weights for policy 1, policy_version 28151 (0.0010) +[2023-10-13 01:23:55,918][46662] Updated weights for policy 0, policy_version 28170 (0.0010) +[2023-10-13 01:23:56,279][46662] Updated weights for policy 0, policy_version 28180 (0.0009) +[2023-10-13 01:23:56,644][46662] Updated weights for policy 0, policy_version 28190 (0.0010) +[2023-10-13 01:23:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57704448. Throughput: 0: 1673.5, 1: 1670.7. Samples: 14430666. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:23:58,608][45375] Avg episode reward: [(0, '37.990'), (1, '48.620')] +[2023-10-13 01:23:59,617][46663] Updated weights for policy 1, policy_version 28161 (0.0009) +[2023-10-13 01:23:59,981][46663] Updated weights for policy 1, policy_version 28171 (0.0007) +[2023-10-13 01:24:00,341][46663] Updated weights for policy 1, policy_version 28181 (0.0007) +[2023-10-13 01:24:00,709][46663] Updated weights for policy 1, policy_version 28191 (0.0007) +[2023-10-13 01:24:00,851][46662] Updated weights for policy 0, policy_version 28200 (0.0010) +[2023-10-13 01:24:01,219][46662] Updated weights for policy 0, policy_version 28210 (0.0010) +[2023-10-13 01:24:01,601][46662] Updated weights for policy 0, policy_version 28220 (0.0007) +[2023-10-13 01:24:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57769984. Throughput: 0: 1653.6, 1: 1674.4. Samples: 14450136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:24:03,608][45375] Avg episode reward: [(0, '36.940'), (1, '48.460')] +[2023-10-13 01:24:04,868][46663] Updated weights for policy 1, policy_version 28201 (0.0009) +[2023-10-13 01:24:05,233][46663] Updated weights for policy 1, policy_version 28211 (0.0010) +[2023-10-13 01:24:05,601][46663] Updated weights for policy 1, policy_version 28221 (0.0007) +[2023-10-13 01:24:05,650][46662] Updated weights for policy 0, policy_version 28230 (0.0008) +[2023-10-13 01:24:06,017][46662] Updated weights for policy 0, policy_version 28240 (0.0008) +[2023-10-13 01:24:06,388][46662] Updated weights for policy 0, policy_version 28250 (0.0009) +[2023-10-13 01:24:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57835520. Throughput: 0: 1677.5, 1: 1673.2. Samples: 14470700. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 01:24:08,607][45375] Avg episode reward: [(0, '37.150'), (1, '48.460')] +[2023-10-13 01:24:09,646][46663] Updated weights for policy 1, policy_version 28231 (0.0007) +[2023-10-13 01:24:10,030][46663] Updated weights for policy 1, policy_version 28241 (0.0008) +[2023-10-13 01:24:10,398][46663] Updated weights for policy 1, policy_version 28251 (0.0008) +[2023-10-13 01:24:10,505][46662] Updated weights for policy 0, policy_version 28260 (0.0008) +[2023-10-13 01:24:10,894][46662] Updated weights for policy 0, policy_version 28270 (0.0008) +[2023-10-13 01:24:11,260][46662] Updated weights for policy 0, policy_version 28280 (0.0007) +[2023-10-13 01:24:13,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57901056. Throughput: 0: 1666.0, 1: 1670.8. Samples: 14480428. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 01:24:13,607][45375] Avg episode reward: [(0, '36.200'), (1, '48.490')] +[2023-10-13 01:24:14,341][46663] Updated weights for policy 1, policy_version 28261 (0.0007) +[2023-10-13 01:24:14,712][46663] Updated weights for policy 1, policy_version 28271 (0.0007) +[2023-10-13 01:24:15,077][46663] Updated weights for policy 1, policy_version 28281 (0.0008) +[2023-10-13 01:24:15,309][46662] Updated weights for policy 0, policy_version 28290 (0.0009) +[2023-10-13 01:24:15,677][46662] Updated weights for policy 0, policy_version 28300 (0.0009) +[2023-10-13 01:24:16,046][46662] Updated weights for policy 0, policy_version 28310 (0.0009) +[2023-10-13 01:24:16,413][46662] Updated weights for policy 0, policy_version 28320 (0.0008) +[2023-10-13 01:24:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57966592. Throughput: 0: 1666.5, 1: 1677.5. Samples: 14500512. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 01:24:18,608][45375] Avg episode reward: [(0, '36.120'), (1, '47.970')] +[2023-10-13 01:24:19,188][46663] Updated weights for policy 1, policy_version 28291 (0.0007) +[2023-10-13 01:24:19,565][46663] Updated weights for policy 1, policy_version 28301 (0.0009) +[2023-10-13 01:24:19,927][46663] Updated weights for policy 1, policy_version 28311 (0.0009) +[2023-10-13 01:24:20,333][46662] Updated weights for policy 0, policy_version 28330 (0.0009) +[2023-10-13 01:24:20,700][46662] Updated weights for policy 0, policy_version 28340 (0.0008) +[2023-10-13 01:24:21,073][46662] Updated weights for policy 0, policy_version 28350 (0.0008) +[2023-10-13 01:24:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58032128. Throughput: 0: 1676.8, 1: 1674.0. Samples: 14521094. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 01:24:23,607][45375] Avg episode reward: [(0, '37.690'), (1, '47.060')] +[2023-10-13 01:24:24,056][46663] Updated weights for policy 1, policy_version 28321 (0.0009) +[2023-10-13 01:24:24,423][46663] Updated weights for policy 1, policy_version 28331 (0.0009) +[2023-10-13 01:24:24,786][46663] Updated weights for policy 1, policy_version 28341 (0.0008) +[2023-10-13 01:24:25,148][46663] Updated weights for policy 1, policy_version 28351 (0.0008) +[2023-10-13 01:24:25,149][46662] Updated weights for policy 0, policy_version 28360 (0.0008) +[2023-10-13 01:24:25,524][46662] Updated weights for policy 0, policy_version 28370 (0.0010) +[2023-10-13 01:24:25,891][46662] Updated weights for policy 0, policy_version 28380 (0.0008) +[2023-10-13 01:24:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58097664. Throughput: 0: 1651.2, 1: 1680.5. Samples: 14530634. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 01:24:28,607][45375] Avg episode reward: [(0, '37.190'), (1, '47.210')] +[2023-10-13 01:24:29,123][46663] Updated weights for policy 1, policy_version 28361 (0.0007) +[2023-10-13 01:24:29,495][46663] Updated weights for policy 1, policy_version 28371 (0.0008) +[2023-10-13 01:24:29,862][46663] Updated weights for policy 1, policy_version 28381 (0.0008) +[2023-10-13 01:24:30,016][46662] Updated weights for policy 0, policy_version 28390 (0.0009) +[2023-10-13 01:24:30,391][46662] Updated weights for policy 0, policy_version 28400 (0.0009) +[2023-10-13 01:24:30,764][46662] Updated weights for policy 0, policy_version 28410 (0.0009) +[2023-10-13 01:24:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58163200. Throughput: 0: 1668.0, 1: 1686.4. Samples: 14551098. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 01:24:33,607][45375] Avg episode reward: [(0, '37.370'), (1, '47.770')] +[2023-10-13 01:24:33,698][46663] Updated weights for policy 1, policy_version 28391 (0.0008) +[2023-10-13 01:24:34,063][46663] Updated weights for policy 1, policy_version 28401 (0.0009) +[2023-10-13 01:24:34,433][46663] Updated weights for policy 1, policy_version 28411 (0.0009) +[2023-10-13 01:24:34,762][46662] Updated weights for policy 0, policy_version 28420 (0.0008) +[2023-10-13 01:24:35,140][46662] Updated weights for policy 0, policy_version 28430 (0.0007) +[2023-10-13 01:24:35,517][46662] Updated weights for policy 0, policy_version 28440 (0.0009) +[2023-10-13 01:24:38,472][46663] Updated weights for policy 1, policy_version 28421 (0.0008) +[2023-10-13 01:24:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58228736. Throughput: 0: 1676.2, 1: 1680.8. Samples: 14571708. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 01:24:38,607][45375] Avg episode reward: [(0, '38.550'), (1, '49.090')] +[2023-10-13 01:24:38,838][46663] Updated weights for policy 1, policy_version 28431 (0.0009) +[2023-10-13 01:24:39,198][46663] Updated weights for policy 1, policy_version 28441 (0.0008) +[2023-10-13 01:24:39,627][46662] Updated weights for policy 0, policy_version 28450 (0.0009) +[2023-10-13 01:24:40,002][46662] Updated weights for policy 0, policy_version 28460 (0.0007) +[2023-10-13 01:24:40,369][46662] Updated weights for policy 0, policy_version 28470 (0.0008) +[2023-10-13 01:24:40,732][46662] Updated weights for policy 0, policy_version 28480 (0.0007) +[2023-10-13 01:24:43,263][46663] Updated weights for policy 1, policy_version 28451 (0.0009) +[2023-10-13 01:24:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58294272. Throughput: 0: 1654.7, 1: 1683.4. Samples: 14580880. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 01:24:43,607][45375] Avg episode reward: [(0, '38.540'), (1, '49.780')] +[2023-10-13 01:24:43,622][46663] Updated weights for policy 1, policy_version 28461 (0.0010) +[2023-10-13 01:24:43,990][46663] Updated weights for policy 1, policy_version 28471 (0.0011) +[2023-10-13 01:24:44,761][46662] Updated weights for policy 0, policy_version 28490 (0.0010) +[2023-10-13 01:24:45,132][46662] Updated weights for policy 0, policy_version 28500 (0.0008) +[2023-10-13 01:24:45,509][46662] Updated weights for policy 0, policy_version 28510 (0.0007) +[2023-10-13 01:24:48,040][46663] Updated weights for policy 1, policy_version 28481 (0.0007) +[2023-10-13 01:24:48,403][46663] Updated weights for policy 1, policy_version 28491 (0.0010) +[2023-10-13 01:24:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58359808. Throughput: 0: 1681.3, 1: 1688.3. Samples: 14601766. 
Policy #0 lag: (min: 0.0, avg: 21.1, max: 32.0) +[2023-10-13 01:24:48,607][45375] Avg episode reward: [(0, '40.270'), (1, '50.080')] +[2023-10-13 01:24:48,784][46663] Updated weights for policy 1, policy_version 28501 (0.0009) +[2023-10-13 01:24:49,159][46663] Updated weights for policy 1, policy_version 28511 (0.0009) +[2023-10-13 01:24:49,495][46662] Updated weights for policy 0, policy_version 28520 (0.0009) +[2023-10-13 01:24:49,852][46662] Updated weights for policy 0, policy_version 28530 (0.0011) +[2023-10-13 01:24:50,219][46662] Updated weights for policy 0, policy_version 28540 (0.0010) +[2023-10-13 01:24:53,340][46663] Updated weights for policy 1, policy_version 28521 (0.0010) +[2023-10-13 01:24:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 58425344. Throughput: 0: 1680.7, 1: 1678.5. Samples: 14621868. Policy #0 lag: (min: 0.0, avg: 21.1, max: 32.0) +[2023-10-13 01:24:53,608][45375] Avg episode reward: [(0, '40.380'), (1, '50.690')] +[2023-10-13 01:24:53,722][46663] Updated weights for policy 1, policy_version 28531 (0.0010) +[2023-10-13 01:24:54,086][46663] Updated weights for policy 1, policy_version 28541 (0.0011) +[2023-10-13 01:24:54,517][46662] Updated weights for policy 0, policy_version 28550 (0.0008) +[2023-10-13 01:24:54,885][46662] Updated weights for policy 0, policy_version 28560 (0.0010) +[2023-10-13 01:24:55,252][46662] Updated weights for policy 0, policy_version 28570 (0.0009) +[2023-10-13 01:24:58,432][46663] Updated weights for policy 1, policy_version 28551 (0.0010) +[2023-10-13 01:24:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58490880. Throughput: 0: 1663.0, 1: 1686.9. Samples: 14631174. 
Policy #0 lag: (min: 0.0, avg: 21.1, max: 32.0) +[2023-10-13 01:24:58,607][45375] Avg episode reward: [(0, '40.750'), (1, '49.960')] +[2023-10-13 01:24:58,809][46663] Updated weights for policy 1, policy_version 28561 (0.0008) +[2023-10-13 01:24:59,185][46663] Updated weights for policy 1, policy_version 28571 (0.0007) +[2023-10-13 01:24:59,409][46662] Updated weights for policy 0, policy_version 28580 (0.0009) +[2023-10-13 01:24:59,770][46662] Updated weights for policy 0, policy_version 28590 (0.0011) +[2023-10-13 01:25:00,136][46662] Updated weights for policy 0, policy_version 28600 (0.0009) +[2023-10-13 01:25:03,096][46663] Updated weights for policy 1, policy_version 28581 (0.0007) +[2023-10-13 01:25:03,465][46663] Updated weights for policy 1, policy_version 28591 (0.0009) +[2023-10-13 01:25:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58556416. Throughput: 0: 1677.1, 1: 1682.2. Samples: 14651680. Policy #0 lag: (min: 0.0, avg: 21.1, max: 32.0) +[2023-10-13 01:25:03,607][45375] Avg episode reward: [(0, '40.480'), (1, '50.590')] +[2023-10-13 01:25:03,824][46663] Updated weights for policy 1, policy_version 28601 (0.0010) +[2023-10-13 01:25:04,462][46662] Updated weights for policy 0, policy_version 28610 (0.0010) +[2023-10-13 01:25:04,868][46662] Updated weights for policy 0, policy_version 28620 (0.0007) +[2023-10-13 01:25:05,237][46662] Updated weights for policy 0, policy_version 28630 (0.0009) +[2023-10-13 01:25:05,608][46662] Updated weights for policy 0, policy_version 28640 (0.0009) +[2023-10-13 01:25:07,801][46663] Updated weights for policy 1, policy_version 28611 (0.0009) +[2023-10-13 01:25:08,173][46663] Updated weights for policy 1, policy_version 28621 (0.0007) +[2023-10-13 01:25:08,535][46663] Updated weights for policy 1, policy_version 28631 (0.0008) +[2023-10-13 01:25:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58621952. 
Throughput: 0: 1674.9, 1: 1669.2. Samples: 14671580. Policy #0 lag: (min: 6.0, avg: 8.6, max: 38.0) +[2023-10-13 01:25:08,607][45375] Avg episode reward: [(0, '42.150'), (1, '52.920')] +[2023-10-13 01:25:09,735][46662] Updated weights for policy 0, policy_version 28650 (0.0010) +[2023-10-13 01:25:10,099][46662] Updated weights for policy 0, policy_version 28660 (0.0009) +[2023-10-13 01:25:10,472][46662] Updated weights for policy 0, policy_version 28670 (0.0008) +[2023-10-13 01:25:12,639][46663] Updated weights for policy 1, policy_version 28641 (0.0008) +[2023-10-13 01:25:13,004][46663] Updated weights for policy 1, policy_version 28651 (0.0009) +[2023-10-13 01:25:13,370][46663] Updated weights for policy 1, policy_version 28661 (0.0012) +[2023-10-13 01:25:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58687488. Throughput: 0: 1666.2, 1: 1681.5. Samples: 14681280. Policy #0 lag: (min: 6.0, avg: 8.6, max: 38.0) +[2023-10-13 01:25:13,607][45375] Avg episode reward: [(0, '43.200'), (1, '53.110')] +[2023-10-13 01:25:13,737][46663] Updated weights for policy 1, policy_version 28671 (0.0007) +[2023-10-13 01:25:14,311][46662] Updated weights for policy 0, policy_version 28680 (0.0007) +[2023-10-13 01:25:14,677][46662] Updated weights for policy 0, policy_version 28690 (0.0009) +[2023-10-13 01:25:15,049][46662] Updated weights for policy 0, policy_version 28700 (0.0010) +[2023-10-13 01:25:17,727][46663] Updated weights for policy 1, policy_version 28681 (0.0009) +[2023-10-13 01:25:18,099][46663] Updated weights for policy 1, policy_version 28691 (0.0008) +[2023-10-13 01:25:18,470][46663] Updated weights for policy 1, policy_version 28701 (0.0008) +[2023-10-13 01:25:18,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 58785792. Throughput: 0: 1674.8, 1: 1684.0. Samples: 14702246. 
Policy #0 lag: (min: 6.0, avg: 8.6, max: 38.0) +[2023-10-13 01:25:18,607][45375] Avg episode reward: [(0, '45.240'), (1, '52.250')] +[2023-10-13 01:25:19,218][46662] Updated weights for policy 0, policy_version 28710 (0.0008) +[2023-10-13 01:25:19,589][46662] Updated weights for policy 0, policy_version 28720 (0.0008) +[2023-10-13 01:25:19,956][46662] Updated weights for policy 0, policy_version 28730 (0.0009) +[2023-10-13 01:25:22,456][46663] Updated weights for policy 1, policy_version 28711 (0.0008) +[2023-10-13 01:25:22,828][46663] Updated weights for policy 1, policy_version 28721 (0.0009) +[2023-10-13 01:25:23,200][46663] Updated weights for policy 1, policy_version 28731 (0.0008) +[2023-10-13 01:25:23,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 58851328. Throughput: 0: 1669.2, 1: 1666.8. Samples: 14721826. Policy #0 lag: (min: 6.0, avg: 8.6, max: 38.0) +[2023-10-13 01:25:23,607][45375] Avg episode reward: [(0, '45.440'), (1, '53.620')] +[2023-10-13 01:25:24,038][46662] Updated weights for policy 0, policy_version 28740 (0.0008) +[2023-10-13 01:25:24,418][46662] Updated weights for policy 0, policy_version 28750 (0.0007) +[2023-10-13 01:25:24,792][46662] Updated weights for policy 0, policy_version 28760 (0.0007) +[2023-10-13 01:25:27,213][46663] Updated weights for policy 1, policy_version 28741 (0.0009) +[2023-10-13 01:25:27,581][46663] Updated weights for policy 1, policy_version 28751 (0.0007) +[2023-10-13 01:25:27,943][46663] Updated weights for policy 1, policy_version 28761 (0.0010) +[2023-10-13 01:25:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 58916864. Throughput: 0: 1672.4, 1: 1691.1. Samples: 14732236. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:25:28,607][45375] Avg episode reward: [(0, '45.210'), (1, '54.380')] +[2023-10-13 01:25:28,713][46662] Updated weights for policy 0, policy_version 28770 (0.0009) +[2023-10-13 01:25:29,082][46662] Updated weights for policy 0, policy_version 28780 (0.0009) +[2023-10-13 01:25:29,458][46662] Updated weights for policy 0, policy_version 28790 (0.0009) +[2023-10-13 01:25:29,825][46662] Updated weights for policy 0, policy_version 28800 (0.0008) +[2023-10-13 01:25:32,227][46663] Updated weights for policy 1, policy_version 28771 (0.0009) +[2023-10-13 01:25:32,593][46663] Updated weights for policy 1, policy_version 28781 (0.0010) +[2023-10-13 01:25:32,962][46663] Updated weights for policy 1, policy_version 28791 (0.0009) +[2023-10-13 01:25:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 58982400. Throughput: 0: 1672.6, 1: 1680.4. Samples: 14752650. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:25:33,607][45375] Avg episode reward: [(0, '44.260'), (1, '55.110')] +[2023-10-13 01:25:34,013][46662] Updated weights for policy 0, policy_version 28810 (0.0009) +[2023-10-13 01:25:34,373][46662] Updated weights for policy 0, policy_version 28820 (0.0011) +[2023-10-13 01:25:34,745][46662] Updated weights for policy 0, policy_version 28830 (0.0008) +[2023-10-13 01:25:36,986][46663] Updated weights for policy 1, policy_version 28801 (0.0007) +[2023-10-13 01:25:37,340][46663] Updated weights for policy 1, policy_version 28811 (0.0009) +[2023-10-13 01:25:37,705][46663] Updated weights for policy 1, policy_version 28821 (0.0010) +[2023-10-13 01:25:38,070][46663] Updated weights for policy 1, policy_version 28831 (0.0009) +[2023-10-13 01:25:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59047936. Throughput: 0: 1675.7, 1: 1667.1. Samples: 14772294. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:25:38,607][45375] Avg episode reward: [(0, '44.530'), (1, '54.700')] +[2023-10-13 01:25:38,796][46662] Updated weights for policy 0, policy_version 28840 (0.0010) +[2023-10-13 01:25:39,173][46662] Updated weights for policy 0, policy_version 28850 (0.0007) +[2023-10-13 01:25:39,528][46662] Updated weights for policy 0, policy_version 28860 (0.0010) +[2023-10-13 01:25:42,277][46663] Updated weights for policy 1, policy_version 28841 (0.0010) +[2023-10-13 01:25:42,652][46663] Updated weights for policy 1, policy_version 28851 (0.0010) +[2023-10-13 01:25:43,015][46663] Updated weights for policy 1, policy_version 28861 (0.0007) +[2023-10-13 01:25:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59113472. Throughput: 0: 1676.2, 1: 1688.2. Samples: 14782570. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:25:43,607][45375] Avg episode reward: [(0, '46.020'), (1, '53.800')] +[2023-10-13 01:25:43,613][46662] Updated weights for policy 0, policy_version 28870 (0.0009) +[2023-10-13 01:25:43,979][46662] Updated weights for policy 0, policy_version 28880 (0.0007) +[2023-10-13 01:25:44,347][46662] Updated weights for policy 0, policy_version 28890 (0.0008) +[2023-10-13 01:25:47,100][46663] Updated weights for policy 1, policy_version 28871 (0.0008) +[2023-10-13 01:25:47,460][46663] Updated weights for policy 1, policy_version 28881 (0.0007) +[2023-10-13 01:25:47,833][46663] Updated weights for policy 1, policy_version 28891 (0.0008) +[2023-10-13 01:25:48,407][46662] Updated weights for policy 0, policy_version 28900 (0.0008) +[2023-10-13 01:25:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59179008. Throughput: 0: 1677.0, 1: 1678.5. Samples: 14802678. 
Policy #0 lag: (min: 30.0, avg: 36.3, max: 62.0) +[2023-10-13 01:25:48,607][45375] Avg episode reward: [(0, '43.660'), (1, '53.440')] +[2023-10-13 01:25:48,773][46662] Updated weights for policy 0, policy_version 28910 (0.0009) +[2023-10-13 01:25:49,153][46662] Updated weights for policy 0, policy_version 28920 (0.0008) +[2023-10-13 01:25:51,843][46663] Updated weights for policy 1, policy_version 28901 (0.0008) +[2023-10-13 01:25:52,216][46663] Updated weights for policy 1, policy_version 28911 (0.0007) +[2023-10-13 01:25:52,584][46663] Updated weights for policy 1, policy_version 28921 (0.0009) +[2023-10-13 01:25:53,144][46662] Updated weights for policy 0, policy_version 28930 (0.0007) +[2023-10-13 01:25:53,548][46662] Updated weights for policy 0, policy_version 28940 (0.0010) +[2023-10-13 01:25:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59244544. Throughput: 0: 1684.1, 1: 1674.7. Samples: 14822724. Policy #0 lag: (min: 30.0, avg: 36.3, max: 62.0) +[2023-10-13 01:25:53,607][45375] Avg episode reward: [(0, '46.140'), (1, '53.220')] +[2023-10-13 01:25:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000028928_29622272.pth... +[2023-10-13 01:25:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000027360_28016640.pth +[2023-10-13 01:25:53,930][46662] Updated weights for policy 0, policy_version 28950 (0.0009) +[2023-10-13 01:25:54,299][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000028960_29655040.pth... 
+[2023-10-13 01:25:54,302][46662] Updated weights for policy 0, policy_version 28960 (0.0008) +[2023-10-13 01:25:54,339][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000027360_28016640.pth +[2023-10-13 01:25:56,788][46663] Updated weights for policy 1, policy_version 28931 (0.0009) +[2023-10-13 01:25:57,170][46663] Updated weights for policy 1, policy_version 28941 (0.0008) +[2023-10-13 01:25:57,535][46663] Updated weights for policy 1, policy_version 28951 (0.0009) +[2023-10-13 01:25:58,390][46662] Updated weights for policy 0, policy_version 28970 (0.0011) +[2023-10-13 01:25:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59310080. Throughput: 0: 1681.0, 1: 1688.1. Samples: 14832890. Policy #0 lag: (min: 30.0, avg: 36.3, max: 62.0) +[2023-10-13 01:25:58,607][45375] Avg episode reward: [(0, '46.870'), (1, '52.680')] +[2023-10-13 01:25:58,764][46662] Updated weights for policy 0, policy_version 28980 (0.0008) +[2023-10-13 01:25:59,128][46662] Updated weights for policy 0, policy_version 28990 (0.0007) +[2023-10-13 01:26:01,676][46663] Updated weights for policy 1, policy_version 28961 (0.0008) +[2023-10-13 01:26:02,052][46663] Updated weights for policy 1, policy_version 28971 (0.0008) +[2023-10-13 01:26:02,417][46663] Updated weights for policy 1, policy_version 28981 (0.0007) +[2023-10-13 01:26:02,793][46663] Updated weights for policy 1, policy_version 28991 (0.0009) +[2023-10-13 01:26:03,296][46662] Updated weights for policy 0, policy_version 29000 (0.0007) +[2023-10-13 01:26:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59375616. Throughput: 0: 1680.7, 1: 1663.2. Samples: 14852722. 
Policy #0 lag: (min: 30.0, avg: 36.3, max: 62.0) +[2023-10-13 01:26:03,608][45375] Avg episode reward: [(0, '46.360'), (1, '52.380')] +[2023-10-13 01:26:03,673][46662] Updated weights for policy 0, policy_version 29010 (0.0009) +[2023-10-13 01:26:04,049][46662] Updated weights for policy 0, policy_version 29020 (0.0010) +[2023-10-13 01:26:06,620][46663] Updated weights for policy 1, policy_version 29001 (0.0010) +[2023-10-13 01:26:06,993][46663] Updated weights for policy 1, policy_version 29011 (0.0009) +[2023-10-13 01:26:07,348][46663] Updated weights for policy 1, policy_version 29021 (0.0008) +[2023-10-13 01:26:08,182][46662] Updated weights for policy 0, policy_version 29030 (0.0009) +[2023-10-13 01:26:08,557][46662] Updated weights for policy 0, policy_version 29040 (0.0009) +[2023-10-13 01:26:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59441152. Throughput: 0: 1686.9, 1: 1674.5. Samples: 14873088. Policy #0 lag: (min: 30.0, avg: 36.3, max: 62.0) +[2023-10-13 01:26:08,607][45375] Avg episode reward: [(0, '45.090'), (1, '52.800')] +[2023-10-13 01:26:08,928][46662] Updated weights for policy 0, policy_version 29050 (0.0007) +[2023-10-13 01:26:11,513][46663] Updated weights for policy 1, policy_version 29031 (0.0007) +[2023-10-13 01:26:11,875][46663] Updated weights for policy 1, policy_version 29041 (0.0008) +[2023-10-13 01:26:12,249][46663] Updated weights for policy 1, policy_version 29051 (0.0007) +[2023-10-13 01:26:13,104][46662] Updated weights for policy 0, policy_version 29060 (0.0008) +[2023-10-13 01:26:13,465][46662] Updated weights for policy 0, policy_version 29070 (0.0008) +[2023-10-13 01:26:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59506688. Throughput: 0: 1680.3, 1: 1676.2. Samples: 14883280. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:13,607][45375] Avg episode reward: [(0, '43.270'), (1, '52.040')] +[2023-10-13 01:26:13,839][46662] Updated weights for policy 0, policy_version 29080 (0.0007) +[2023-10-13 01:26:16,238][46663] Updated weights for policy 1, policy_version 29061 (0.0008) +[2023-10-13 01:26:16,610][46663] Updated weights for policy 1, policy_version 29071 (0.0008) +[2023-10-13 01:26:16,981][46663] Updated weights for policy 1, policy_version 29081 (0.0010) +[2023-10-13 01:26:17,834][46662] Updated weights for policy 0, policy_version 29090 (0.0007) +[2023-10-13 01:26:18,200][46662] Updated weights for policy 0, policy_version 29100 (0.0011) +[2023-10-13 01:26:18,580][46662] Updated weights for policy 0, policy_version 29110 (0.0008) +[2023-10-13 01:26:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 59572224. Throughput: 0: 1676.8, 1: 1660.6. Samples: 14902834. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:18,607][45375] Avg episode reward: [(0, '42.520'), (1, '51.590')] +[2023-10-13 01:26:18,944][46662] Updated weights for policy 0, policy_version 29120 (0.0010) +[2023-10-13 01:26:21,130][46663] Updated weights for policy 1, policy_version 29091 (0.0010) +[2023-10-13 01:26:21,498][46663] Updated weights for policy 1, policy_version 29101 (0.0010) +[2023-10-13 01:26:21,865][46663] Updated weights for policy 1, policy_version 29111 (0.0007) +[2023-10-13 01:26:22,766][46662] Updated weights for policy 0, policy_version 29130 (0.0008) +[2023-10-13 01:26:23,142][46662] Updated weights for policy 0, policy_version 29140 (0.0007) +[2023-10-13 01:26:23,520][46662] Updated weights for policy 0, policy_version 29150 (0.0007) +[2023-10-13 01:26:23,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59670528. Throughput: 0: 1669.8, 1: 1682.1. Samples: 14923128. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:23,607][45375] Avg episode reward: [(0, '41.630'), (1, '51.090')] +[2023-10-13 01:26:25,991][46663] Updated weights for policy 1, policy_version 29121 (0.0009) +[2023-10-13 01:26:26,363][46663] Updated weights for policy 1, policy_version 29131 (0.0008) +[2023-10-13 01:26:26,722][46663] Updated weights for policy 1, policy_version 29141 (0.0009) +[2023-10-13 01:26:27,098][46663] Updated weights for policy 1, policy_version 29151 (0.0009) +[2023-10-13 01:26:27,636][46662] Updated weights for policy 0, policy_version 29160 (0.0009) +[2023-10-13 01:26:28,004][46662] Updated weights for policy 0, policy_version 29170 (0.0007) +[2023-10-13 01:26:28,381][46662] Updated weights for policy 0, policy_version 29180 (0.0007) +[2023-10-13 01:26:28,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59736064. Throughput: 0: 1678.8, 1: 1673.4. Samples: 14933416. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:28,607][45375] Avg episode reward: [(0, '44.390'), (1, '53.400')] +[2023-10-13 01:26:31,153][46663] Updated weights for policy 1, policy_version 29161 (0.0008) +[2023-10-13 01:26:31,530][46663] Updated weights for policy 1, policy_version 29171 (0.0008) +[2023-10-13 01:26:31,891][46663] Updated weights for policy 1, policy_version 29181 (0.0009) +[2023-10-13 01:26:32,503][46662] Updated weights for policy 0, policy_version 29190 (0.0008) +[2023-10-13 01:26:32,876][46662] Updated weights for policy 0, policy_version 29200 (0.0008) +[2023-10-13 01:26:33,253][46662] Updated weights for policy 0, policy_version 29210 (0.0009) +[2023-10-13 01:26:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59801600. Throughput: 0: 1683.4, 1: 1664.4. Samples: 14953330. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:33,607][45375] Avg episode reward: [(0, '43.370'), (1, '52.670')] +[2023-10-13 01:26:36,049][46663] Updated weights for policy 1, policy_version 29191 (0.0009) +[2023-10-13 01:26:36,414][46663] Updated weights for policy 1, policy_version 29201 (0.0010) +[2023-10-13 01:26:36,796][46663] Updated weights for policy 1, policy_version 29211 (0.0009) +[2023-10-13 01:26:37,259][46662] Updated weights for policy 0, policy_version 29220 (0.0010) +[2023-10-13 01:26:37,641][46662] Updated weights for policy 0, policy_version 29230 (0.0011) +[2023-10-13 01:26:38,011][46662] Updated weights for policy 0, policy_version 29240 (0.0010) +[2023-10-13 01:26:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59867136. Throughput: 0: 1667.8, 1: 1681.1. Samples: 14973424. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:38,607][45375] Avg episode reward: [(0, '42.720'), (1, '54.090')] +[2023-10-13 01:26:40,804][46663] Updated weights for policy 1, policy_version 29221 (0.0008) +[2023-10-13 01:26:41,180][46663] Updated weights for policy 1, policy_version 29231 (0.0011) +[2023-10-13 01:26:41,541][46663] Updated weights for policy 1, policy_version 29241 (0.0009) +[2023-10-13 01:26:42,107][46662] Updated weights for policy 0, policy_version 29250 (0.0011) +[2023-10-13 01:26:42,493][46662] Updated weights for policy 0, policy_version 29260 (0.0007) +[2023-10-13 01:26:42,862][46662] Updated weights for policy 0, policy_version 29270 (0.0008) +[2023-10-13 01:26:43,238][46662] Updated weights for policy 0, policy_version 29280 (0.0008) +[2023-10-13 01:26:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59932672. Throughput: 0: 1687.1, 1: 1657.2. Samples: 14983384. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:43,608][45375] Avg episode reward: [(0, '44.110'), (1, '53.640')] +[2023-10-13 01:26:45,635][46663] Updated weights for policy 1, policy_version 29251 (0.0007) +[2023-10-13 01:26:45,996][46663] Updated weights for policy 1, policy_version 29261 (0.0007) +[2023-10-13 01:26:46,367][46663] Updated weights for policy 1, policy_version 29271 (0.0008) +[2023-10-13 01:26:47,265][46662] Updated weights for policy 0, policy_version 29290 (0.0008) +[2023-10-13 01:26:47,628][46662] Updated weights for policy 0, policy_version 29300 (0.0007) +[2023-10-13 01:26:47,995][46662] Updated weights for policy 0, policy_version 29310 (0.0008) +[2023-10-13 01:26:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59998208. Throughput: 0: 1691.0, 1: 1665.4. Samples: 15003758. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:48,607][45375] Avg episode reward: [(0, '42.580'), (1, '54.420')] +[2023-10-13 01:26:50,452][46663] Updated weights for policy 1, policy_version 29281 (0.0008) +[2023-10-13 01:26:50,816][46663] Updated weights for policy 1, policy_version 29291 (0.0010) +[2023-10-13 01:26:51,199][46663] Updated weights for policy 1, policy_version 29301 (0.0008) +[2023-10-13 01:26:51,553][46663] Updated weights for policy 1, policy_version 29311 (0.0009) +[2023-10-13 01:26:52,077][46662] Updated weights for policy 0, policy_version 29320 (0.0010) +[2023-10-13 01:26:52,455][46662] Updated weights for policy 0, policy_version 29330 (0.0011) +[2023-10-13 01:26:52,826][46662] Updated weights for policy 0, policy_version 29340 (0.0009) +[2023-10-13 01:26:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60063744. Throughput: 0: 1667.7, 1: 1675.2. Samples: 15023518. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:26:53,607][45375] Avg episode reward: [(0, '43.040'), (1, '54.090')] +[2023-10-13 01:26:55,565][46663] Updated weights for policy 1, policy_version 29321 (0.0008) +[2023-10-13 01:26:55,924][46663] Updated weights for policy 1, policy_version 29331 (0.0008) +[2023-10-13 01:26:56,289][46663] Updated weights for policy 1, policy_version 29341 (0.0008) +[2023-10-13 01:26:56,896][46662] Updated weights for policy 0, policy_version 29350 (0.0007) +[2023-10-13 01:26:57,259][46662] Updated weights for policy 0, policy_version 29360 (0.0007) +[2023-10-13 01:26:57,633][46662] Updated weights for policy 0, policy_version 29370 (0.0007) +[2023-10-13 01:26:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60129280. Throughput: 0: 1693.5, 1: 1650.8. Samples: 15033770. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:26:58,607][45375] Avg episode reward: [(0, '40.680'), (1, '54.100')] +[2023-10-13 01:27:00,337][46663] Updated weights for policy 1, policy_version 29351 (0.0008) +[2023-10-13 01:27:00,703][46663] Updated weights for policy 1, policy_version 29361 (0.0007) +[2023-10-13 01:27:01,066][46663] Updated weights for policy 1, policy_version 29371 (0.0008) +[2023-10-13 01:27:01,558][46662] Updated weights for policy 0, policy_version 29380 (0.0007) +[2023-10-13 01:27:01,929][46662] Updated weights for policy 0, policy_version 29390 (0.0008) +[2023-10-13 01:27:02,304][46662] Updated weights for policy 0, policy_version 29400 (0.0008) +[2023-10-13 01:27:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 60194816. Throughput: 0: 1688.6, 1: 1680.6. Samples: 15054446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:03,607][45375] Avg episode reward: [(0, '40.560'), (1, '54.280')] +[2023-10-13 01:27:05,203][46663] Updated weights for policy 1, policy_version 29381 (0.0010) +[2023-10-13 01:27:05,575][46663] Updated weights for policy 1, policy_version 29391 (0.0008) +[2023-10-13 01:27:05,953][46663] Updated weights for policy 1, policy_version 29401 (0.0010) +[2023-10-13 01:27:06,245][46662] Updated weights for policy 0, policy_version 29410 (0.0009) +[2023-10-13 01:27:06,620][46662] Updated weights for policy 0, policy_version 29420 (0.0009) +[2023-10-13 01:27:06,992][46662] Updated weights for policy 0, policy_version 29430 (0.0008) +[2023-10-13 01:27:07,363][46662] Updated weights for policy 0, policy_version 29440 (0.0009) +[2023-10-13 01:27:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60260352. Throughput: 0: 1679.5, 1: 1687.3. Samples: 15074634. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:08,608][45375] Avg episode reward: [(0, '39.740'), (1, '54.110')] +[2023-10-13 01:27:10,170][46663] Updated weights for policy 1, policy_version 29411 (0.0009) +[2023-10-13 01:27:10,535][46663] Updated weights for policy 1, policy_version 29421 (0.0007) +[2023-10-13 01:27:10,891][46663] Updated weights for policy 1, policy_version 29431 (0.0008) +[2023-10-13 01:27:11,302][46662] Updated weights for policy 0, policy_version 29450 (0.0009) +[2023-10-13 01:27:11,659][46662] Updated weights for policy 0, policy_version 29460 (0.0010) +[2023-10-13 01:27:12,044][46662] Updated weights for policy 0, policy_version 29470 (0.0009) +[2023-10-13 01:27:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60325888. Throughput: 0: 1704.2, 1: 1667.5. Samples: 15085140. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:13,607][45375] Avg episode reward: [(0, '39.320'), (1, '54.430')] +[2023-10-13 01:27:14,836][46663] Updated weights for policy 1, policy_version 29441 (0.0008) +[2023-10-13 01:27:15,199][46663] Updated weights for policy 1, policy_version 29451 (0.0009) +[2023-10-13 01:27:15,570][46663] Updated weights for policy 1, policy_version 29461 (0.0009) +[2023-10-13 01:27:15,948][46663] Updated weights for policy 1, policy_version 29471 (0.0008) +[2023-10-13 01:27:16,148][46662] Updated weights for policy 0, policy_version 29480 (0.0010) +[2023-10-13 01:27:16,520][46662] Updated weights for policy 0, policy_version 29490 (0.0007) +[2023-10-13 01:27:16,890][46662] Updated weights for policy 0, policy_version 29500 (0.0008) +[2023-10-13 01:27:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60391424. Throughput: 0: 1677.0, 1: 1692.8. Samples: 15104972. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:18,607][45375] Avg episode reward: [(0, '40.330'), (1, '53.820')] +[2023-10-13 01:27:19,769][46663] Updated weights for policy 1, policy_version 29481 (0.0009) +[2023-10-13 01:27:20,139][46663] Updated weights for policy 1, policy_version 29491 (0.0011) +[2023-10-13 01:27:20,492][46663] Updated weights for policy 1, policy_version 29501 (0.0009) +[2023-10-13 01:27:21,046][46662] Updated weights for policy 0, policy_version 29510 (0.0010) +[2023-10-13 01:27:21,416][46662] Updated weights for policy 0, policy_version 29520 (0.0009) +[2023-10-13 01:27:21,780][46662] Updated weights for policy 0, policy_version 29530 (0.0011) +[2023-10-13 01:27:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60456960. Throughput: 0: 1680.8, 1: 1696.8. Samples: 15125418. 
Policy #0 lag: (min: 13.0, avg: 18.4, max: 45.0) +[2023-10-13 01:27:23,607][45375] Avg episode reward: [(0, '40.460'), (1, '53.890')] +[2023-10-13 01:27:24,378][46663] Updated weights for policy 1, policy_version 29511 (0.0008) +[2023-10-13 01:27:24,757][46663] Updated weights for policy 1, policy_version 29521 (0.0009) +[2023-10-13 01:27:25,125][46663] Updated weights for policy 1, policy_version 29531 (0.0010) +[2023-10-13 01:27:25,825][46662] Updated weights for policy 0, policy_version 29540 (0.0009) +[2023-10-13 01:27:26,190][46662] Updated weights for policy 0, policy_version 29550 (0.0010) +[2023-10-13 01:27:26,561][46662] Updated weights for policy 0, policy_version 29560 (0.0011) +[2023-10-13 01:27:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60522496. Throughput: 0: 1691.9, 1: 1687.4. Samples: 15135452. Policy #0 lag: (min: 13.0, avg: 18.4, max: 45.0) +[2023-10-13 01:27:28,607][45375] Avg episode reward: [(0, '39.680'), (1, '54.720')] +[2023-10-13 01:27:29,142][46663] Updated weights for policy 1, policy_version 29541 (0.0008) +[2023-10-13 01:27:29,516][46663] Updated weights for policy 1, policy_version 29551 (0.0010) +[2023-10-13 01:27:29,874][46663] Updated weights for policy 1, policy_version 29561 (0.0008) +[2023-10-13 01:27:30,675][46662] Updated weights for policy 0, policy_version 29570 (0.0010) +[2023-10-13 01:27:31,035][46662] Updated weights for policy 0, policy_version 29580 (0.0010) +[2023-10-13 01:27:31,409][46662] Updated weights for policy 0, policy_version 29590 (0.0010) +[2023-10-13 01:27:31,784][46662] Updated weights for policy 0, policy_version 29600 (0.0011) +[2023-10-13 01:27:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60588032. Throughput: 0: 1660.6, 1: 1701.3. Samples: 15155044. 
Policy #0 lag: (min: 13.0, avg: 18.4, max: 45.0) +[2023-10-13 01:27:33,607][45375] Avg episode reward: [(0, '38.680'), (1, '56.460')] +[2023-10-13 01:27:33,973][46663] Updated weights for policy 1, policy_version 29571 (0.0008) +[2023-10-13 01:27:34,343][46663] Updated weights for policy 1, policy_version 29581 (0.0009) +[2023-10-13 01:27:34,716][46663] Updated weights for policy 1, policy_version 29591 (0.0008) +[2023-10-13 01:27:36,003][46662] Updated weights for policy 0, policy_version 29610 (0.0010) +[2023-10-13 01:27:36,371][46662] Updated weights for policy 0, policy_version 29620 (0.0010) +[2023-10-13 01:27:36,734][46662] Updated weights for policy 0, policy_version 29630 (0.0010) +[2023-10-13 01:27:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60653568. Throughput: 0: 1678.9, 1: 1701.9. Samples: 15175652. Policy #0 lag: (min: 13.0, avg: 18.4, max: 45.0) +[2023-10-13 01:27:38,607][45375] Avg episode reward: [(0, '39.020'), (1, '55.280')] +[2023-10-13 01:27:39,024][46663] Updated weights for policy 1, policy_version 29601 (0.0007) +[2023-10-13 01:27:39,388][46663] Updated weights for policy 1, policy_version 29611 (0.0009) +[2023-10-13 01:27:39,753][46663] Updated weights for policy 1, policy_version 29621 (0.0010) +[2023-10-13 01:27:40,123][46663] Updated weights for policy 1, policy_version 29631 (0.0011) +[2023-10-13 01:27:40,808][46662] Updated weights for policy 0, policy_version 29640 (0.0009) +[2023-10-13 01:27:41,177][46662] Updated weights for policy 0, policy_version 29650 (0.0007) +[2023-10-13 01:27:41,550][46662] Updated weights for policy 0, policy_version 29660 (0.0008) +[2023-10-13 01:27:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60719104. Throughput: 0: 1680.1, 1: 1700.4. Samples: 15185890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:43,607][45375] Avg episode reward: [(0, '38.050'), (1, '56.130')] +[2023-10-13 01:27:43,953][46663] Updated weights for policy 1, policy_version 29641 (0.0008) +[2023-10-13 01:27:44,319][46663] Updated weights for policy 1, policy_version 29651 (0.0008) +[2023-10-13 01:27:44,697][46663] Updated weights for policy 1, policy_version 29661 (0.0008) +[2023-10-13 01:27:45,665][46662] Updated weights for policy 0, policy_version 29670 (0.0008) +[2023-10-13 01:27:46,043][46662] Updated weights for policy 0, policy_version 29680 (0.0007) +[2023-10-13 01:27:46,426][46662] Updated weights for policy 0, policy_version 29690 (0.0010) +[2023-10-13 01:27:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60784640. Throughput: 0: 1664.9, 1: 1697.9. Samples: 15205772. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:48,608][45375] Avg episode reward: [(0, '38.390'), (1, '52.780')] +[2023-10-13 01:27:48,808][46663] Updated weights for policy 1, policy_version 29671 (0.0007) +[2023-10-13 01:27:49,184][46663] Updated weights for policy 1, policy_version 29681 (0.0008) +[2023-10-13 01:27:49,546][46663] Updated weights for policy 1, policy_version 29691 (0.0009) +[2023-10-13 01:27:50,477][46662] Updated weights for policy 0, policy_version 29700 (0.0007) +[2023-10-13 01:27:50,854][46662] Updated weights for policy 0, policy_version 29710 (0.0009) +[2023-10-13 01:27:51,225][46662] Updated weights for policy 0, policy_version 29720 (0.0007) +[2023-10-13 01:27:53,604][46663] Updated weights for policy 1, policy_version 29701 (0.0008) +[2023-10-13 01:27:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60850176. Throughput: 0: 1678.8, 1: 1694.9. Samples: 15226454. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:53,607][45375] Avg episode reward: [(0, '39.670'), (1, '53.510')] +[2023-10-13 01:27:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000029728_30441472.pth... +[2023-10-13 01:27:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000028160_28835840.pth +[2023-10-13 01:27:53,973][46663] Updated weights for policy 1, policy_version 29711 (0.0008) +[2023-10-13 01:27:54,340][46663] Updated weights for policy 1, policy_version 29721 (0.0007) +[2023-10-13 01:27:54,593][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000029728_30441472.pth... +[2023-10-13 01:27:54,633][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000028128_28803072.pth +[2023-10-13 01:27:55,279][46662] Updated weights for policy 0, policy_version 29730 (0.0008) +[2023-10-13 01:27:55,642][46662] Updated weights for policy 0, policy_version 29740 (0.0008) +[2023-10-13 01:27:56,006][46662] Updated weights for policy 0, policy_version 29750 (0.0009) +[2023-10-13 01:27:56,387][46662] Updated weights for policy 0, policy_version 29760 (0.0007) +[2023-10-13 01:27:58,423][46663] Updated weights for policy 1, policy_version 29731 (0.0008) +[2023-10-13 01:27:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60915712. Throughput: 0: 1664.5, 1: 1697.2. Samples: 15236420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:27:58,607][45375] Avg episode reward: [(0, '42.120'), (1, '51.950')] +[2023-10-13 01:27:58,789][46663] Updated weights for policy 1, policy_version 29741 (0.0007) +[2023-10-13 01:27:59,163][46663] Updated weights for policy 1, policy_version 29751 (0.0008) +[2023-10-13 01:28:00,481][46662] Updated weights for policy 0, policy_version 29770 (0.0008) +[2023-10-13 01:28:00,841][46662] Updated weights for policy 0, policy_version 29780 (0.0009) +[2023-10-13 01:28:01,214][46662] Updated weights for policy 0, policy_version 29790 (0.0008) +[2023-10-13 01:28:03,070][46663] Updated weights for policy 1, policy_version 29761 (0.0008) +[2023-10-13 01:28:03,437][46663] Updated weights for policy 1, policy_version 29771 (0.0007) +[2023-10-13 01:28:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60981248. Throughput: 0: 1674.1, 1: 1696.0. Samples: 15256628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:28:03,607][45375] Avg episode reward: [(0, '42.820'), (1, '52.490')] +[2023-10-13 01:28:03,805][46663] Updated weights for policy 1, policy_version 29781 (0.0008) +[2023-10-13 01:28:04,171][46663] Updated weights for policy 1, policy_version 29791 (0.0007) +[2023-10-13 01:28:05,188][46662] Updated weights for policy 0, policy_version 29800 (0.0008) +[2023-10-13 01:28:05,556][46662] Updated weights for policy 0, policy_version 29810 (0.0007) +[2023-10-13 01:28:05,926][46662] Updated weights for policy 0, policy_version 29820 (0.0008) +[2023-10-13 01:28:08,377][46663] Updated weights for policy 1, policy_version 29801 (0.0010) +[2023-10-13 01:28:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61046784. Throughput: 0: 1682.6, 1: 1681.2. Samples: 15276790. 
Policy #0 lag: (min: 1.0, avg: 16.0, max: 33.0) +[2023-10-13 01:28:08,608][45375] Avg episode reward: [(0, '42.990'), (1, '50.500')] +[2023-10-13 01:28:08,743][46663] Updated weights for policy 1, policy_version 29811 (0.0009) +[2023-10-13 01:28:09,114][46663] Updated weights for policy 1, policy_version 29821 (0.0008) +[2023-10-13 01:28:10,080][46662] Updated weights for policy 0, policy_version 29830 (0.0007) +[2023-10-13 01:28:10,457][46662] Updated weights for policy 0, policy_version 29840 (0.0008) +[2023-10-13 01:28:10,836][46662] Updated weights for policy 0, policy_version 29850 (0.0009) +[2023-10-13 01:28:13,404][46663] Updated weights for policy 1, policy_version 29831 (0.0009) +[2023-10-13 01:28:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61112320. Throughput: 0: 1665.9, 1: 1693.7. Samples: 15286632. Policy #0 lag: (min: 1.0, avg: 16.0, max: 33.0) +[2023-10-13 01:28:13,607][45375] Avg episode reward: [(0, '42.730'), (1, '50.820')] +[2023-10-13 01:28:13,781][46663] Updated weights for policy 1, policy_version 29841 (0.0008) +[2023-10-13 01:28:14,157][46663] Updated weights for policy 1, policy_version 29851 (0.0008) +[2023-10-13 01:28:14,792][46662] Updated weights for policy 0, policy_version 29860 (0.0009) +[2023-10-13 01:28:15,169][46662] Updated weights for policy 0, policy_version 29870 (0.0009) +[2023-10-13 01:28:15,535][46662] Updated weights for policy 0, policy_version 29880 (0.0010) +[2023-10-13 01:28:18,145][46663] Updated weights for policy 1, policy_version 29861 (0.0007) +[2023-10-13 01:28:18,514][46663] Updated weights for policy 1, policy_version 29871 (0.0007) +[2023-10-13 01:28:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61177856. Throughput: 0: 1685.7, 1: 1685.4. Samples: 15306744. 
Policy #0 lag: (min: 1.0, avg: 16.0, max: 33.0) +[2023-10-13 01:28:18,607][45375] Avg episode reward: [(0, '42.970'), (1, '50.840')] +[2023-10-13 01:28:18,878][46663] Updated weights for policy 1, policy_version 29881 (0.0009) +[2023-10-13 01:28:19,610][46662] Updated weights for policy 0, policy_version 29890 (0.0008) +[2023-10-13 01:28:19,983][46662] Updated weights for policy 0, policy_version 29900 (0.0007) +[2023-10-13 01:28:20,347][46662] Updated weights for policy 0, policy_version 29910 (0.0008) +[2023-10-13 01:28:20,721][46662] Updated weights for policy 0, policy_version 29920 (0.0010) +[2023-10-13 01:28:22,888][46663] Updated weights for policy 1, policy_version 29891 (0.0008) +[2023-10-13 01:28:23,256][46663] Updated weights for policy 1, policy_version 29901 (0.0008) +[2023-10-13 01:28:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61243392. Throughput: 0: 1689.3, 1: 1675.5. Samples: 15327068. Policy #0 lag: (min: 1.0, avg: 16.0, max: 33.0) +[2023-10-13 01:28:23,607][45375] Avg episode reward: [(0, '44.550'), (1, '52.480')] +[2023-10-13 01:28:23,627][46663] Updated weights for policy 1, policy_version 29911 (0.0009) +[2023-10-13 01:28:24,823][46662] Updated weights for policy 0, policy_version 29930 (0.0007) +[2023-10-13 01:28:25,206][46662] Updated weights for policy 0, policy_version 29940 (0.0007) +[2023-10-13 01:28:25,565][46662] Updated weights for policy 0, policy_version 29950 (0.0008) +[2023-10-13 01:28:27,629][46663] Updated weights for policy 1, policy_version 29921 (0.0010) +[2023-10-13 01:28:27,989][46663] Updated weights for policy 1, policy_version 29931 (0.0010) +[2023-10-13 01:28:28,361][46663] Updated weights for policy 1, policy_version 29941 (0.0008) +[2023-10-13 01:28:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61308928. Throughput: 0: 1663.3, 1: 1686.8. Samples: 15336648. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:28:28,607][45375] Avg episode reward: [(0, '46.310'), (1, '53.770')] +[2023-10-13 01:28:28,729][46663] Updated weights for policy 1, policy_version 29951 (0.0009) +[2023-10-13 01:28:29,719][46662] Updated weights for policy 0, policy_version 29960 (0.0007) +[2023-10-13 01:28:30,083][46662] Updated weights for policy 0, policy_version 29970 (0.0007) +[2023-10-13 01:28:30,452][46662] Updated weights for policy 0, policy_version 29980 (0.0007) +[2023-10-13 01:28:32,939][46663] Updated weights for policy 1, policy_version 29961 (0.0009) +[2023-10-13 01:28:33,312][46663] Updated weights for policy 1, policy_version 29971 (0.0009) +[2023-10-13 01:28:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61374464. Throughput: 0: 1680.5, 1: 1685.7. Samples: 15357246. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:28:33,607][45375] Avg episode reward: [(0, '46.280'), (1, '53.740')] +[2023-10-13 01:28:33,679][46663] Updated weights for policy 1, policy_version 29981 (0.0010) +[2023-10-13 01:28:34,363][46662] Updated weights for policy 0, policy_version 29990 (0.0008) +[2023-10-13 01:28:34,738][46662] Updated weights for policy 0, policy_version 30000 (0.0008) +[2023-10-13 01:28:35,113][46662] Updated weights for policy 0, policy_version 30010 (0.0009) +[2023-10-13 01:28:37,548][46663] Updated weights for policy 1, policy_version 29991 (0.0007) +[2023-10-13 01:28:37,905][46663] Updated weights for policy 1, policy_version 30001 (0.0009) +[2023-10-13 01:28:38,277][46663] Updated weights for policy 1, policy_version 30011 (0.0008) +[2023-10-13 01:28:38,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61472768. Throughput: 0: 1685.4, 1: 1664.6. Samples: 15377202. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:28:38,607][45375] Avg episode reward: [(0, '45.830'), (1, '53.310')] +[2023-10-13 01:28:39,090][46662] Updated weights for policy 0, policy_version 30020 (0.0008) +[2023-10-13 01:28:39,455][46662] Updated weights for policy 0, policy_version 30030 (0.0008) +[2023-10-13 01:28:39,826][46662] Updated weights for policy 0, policy_version 30040 (0.0009) +[2023-10-13 01:28:42,358][46663] Updated weights for policy 1, policy_version 30021 (0.0007) +[2023-10-13 01:28:42,721][46663] Updated weights for policy 1, policy_version 30031 (0.0008) +[2023-10-13 01:28:43,086][46663] Updated weights for policy 1, policy_version 30041 (0.0011) +[2023-10-13 01:28:43,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61538304. Throughput: 0: 1666.6, 1: 1688.9. Samples: 15387416. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:28:43,607][45375] Avg episode reward: [(0, '46.030'), (1, '52.520')] +[2023-10-13 01:28:43,885][46662] Updated weights for policy 0, policy_version 30050 (0.0008) +[2023-10-13 01:28:44,248][46662] Updated weights for policy 0, policy_version 30060 (0.0010) +[2023-10-13 01:28:44,625][46662] Updated weights for policy 0, policy_version 30070 (0.0007) +[2023-10-13 01:28:44,990][46662] Updated weights for policy 0, policy_version 30080 (0.0008) +[2023-10-13 01:28:47,071][46663] Updated weights for policy 1, policy_version 30051 (0.0008) +[2023-10-13 01:28:47,434][46663] Updated weights for policy 1, policy_version 30061 (0.0009) +[2023-10-13 01:28:47,796][46663] Updated weights for policy 1, policy_version 30071 (0.0009) +[2023-10-13 01:28:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61603840. Throughput: 0: 1679.8, 1: 1673.1. Samples: 15407506. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:28:48,607][45375] Avg episode reward: [(0, '46.610'), (1, '51.350')] +[2023-10-13 01:28:49,162][46662] Updated weights for policy 0, policy_version 30090 (0.0008) +[2023-10-13 01:28:49,527][46662] Updated weights for policy 0, policy_version 30100 (0.0007) +[2023-10-13 01:28:49,902][46662] Updated weights for policy 0, policy_version 30110 (0.0008) +[2023-10-13 01:28:51,943][46663] Updated weights for policy 1, policy_version 30081 (0.0008) +[2023-10-13 01:28:52,309][46663] Updated weights for policy 1, policy_version 30091 (0.0008) +[2023-10-13 01:28:52,674][46663] Updated weights for policy 1, policy_version 30101 (0.0008) +[2023-10-13 01:28:53,045][46663] Updated weights for policy 1, policy_version 30111 (0.0008) +[2023-10-13 01:28:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61669376. Throughput: 0: 1676.8, 1: 1664.1. Samples: 15427130. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-13 01:28:53,608][45375] Avg episode reward: [(0, '47.580'), (1, '50.930')] +[2023-10-13 01:28:54,153][46662] Updated weights for policy 0, policy_version 30120 (0.0010) +[2023-10-13 01:28:54,519][46662] Updated weights for policy 0, policy_version 30130 (0.0011) +[2023-10-13 01:28:54,901][46662] Updated weights for policy 0, policy_version 30140 (0.0009) +[2023-10-13 01:28:57,174][46663] Updated weights for policy 1, policy_version 30121 (0.0010) +[2023-10-13 01:28:57,545][46663] Updated weights for policy 1, policy_version 30131 (0.0008) +[2023-10-13 01:28:57,914][46663] Updated weights for policy 1, policy_version 30141 (0.0009) +[2023-10-13 01:28:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61734912. Throughput: 0: 1669.6, 1: 1682.1. Samples: 15437456. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-13 01:28:58,607][45375] Avg episode reward: [(0, '45.540'), (1, '52.960')] +[2023-10-13 01:28:58,907][46662] Updated weights for policy 0, policy_version 30150 (0.0008) +[2023-10-13 01:28:59,274][46662] Updated weights for policy 0, policy_version 30160 (0.0007) +[2023-10-13 01:28:59,649][46662] Updated weights for policy 0, policy_version 30170 (0.0009) +[2023-10-13 01:29:02,013][46663] Updated weights for policy 1, policy_version 30151 (0.0010) +[2023-10-13 01:29:02,387][46663] Updated weights for policy 1, policy_version 30161 (0.0007) +[2023-10-13 01:29:02,754][46663] Updated weights for policy 1, policy_version 30171 (0.0010) +[2023-10-13 01:29:03,577][46662] Updated weights for policy 0, policy_version 30180 (0.0008) +[2023-10-13 01:29:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61800448. Throughput: 0: 1678.0, 1: 1672.0. Samples: 15457492. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-13 01:29:03,607][45375] Avg episode reward: [(0, '46.180'), (1, '52.790')] +[2023-10-13 01:29:03,945][46662] Updated weights for policy 0, policy_version 30190 (0.0008) +[2023-10-13 01:29:04,325][46662] Updated weights for policy 0, policy_version 30200 (0.0008) +[2023-10-13 01:29:06,877][46663] Updated weights for policy 1, policy_version 30181 (0.0007) +[2023-10-13 01:29:07,243][46663] Updated weights for policy 1, policy_version 30191 (0.0009) +[2023-10-13 01:29:07,620][46663] Updated weights for policy 1, policy_version 30201 (0.0009) +[2023-10-13 01:29:08,412][46662] Updated weights for policy 0, policy_version 30210 (0.0009) +[2023-10-13 01:29:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61865984. Throughput: 0: 1678.2, 1: 1668.6. Samples: 15477674. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-13 01:29:08,607][45375] Avg episode reward: [(0, '46.260'), (1, '53.450')] +[2023-10-13 01:29:08,770][46662] Updated weights for policy 0, policy_version 30220 (0.0009) +[2023-10-13 01:29:09,149][46662] Updated weights for policy 0, policy_version 30230 (0.0010) +[2023-10-13 01:29:09,517][46662] Updated weights for policy 0, policy_version 30240 (0.0007) +[2023-10-13 01:29:11,560][46663] Updated weights for policy 1, policy_version 30211 (0.0007) +[2023-10-13 01:29:11,933][46663] Updated weights for policy 1, policy_version 30221 (0.0009) +[2023-10-13 01:29:12,292][46663] Updated weights for policy 1, policy_version 30231 (0.0009) +[2023-10-13 01:29:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61931520. Throughput: 0: 1678.0, 1: 1684.4. Samples: 15487956. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-13 01:29:13,608][45375] Avg episode reward: [(0, '44.770'), (1, '54.220')] +[2023-10-13 01:29:13,665][46662] Updated weights for policy 0, policy_version 30250 (0.0007) +[2023-10-13 01:29:14,027][46662] Updated weights for policy 0, policy_version 30260 (0.0007) +[2023-10-13 01:29:14,405][46662] Updated weights for policy 0, policy_version 30270 (0.0009) +[2023-10-13 01:29:16,381][46663] Updated weights for policy 1, policy_version 30241 (0.0009) +[2023-10-13 01:29:16,752][46663] Updated weights for policy 1, policy_version 30251 (0.0007) +[2023-10-13 01:29:17,126][46663] Updated weights for policy 1, policy_version 30261 (0.0007) +[2023-10-13 01:29:17,486][46663] Updated weights for policy 1, policy_version 30271 (0.0010) +[2023-10-13 01:29:18,346][46662] Updated weights for policy 0, policy_version 30280 (0.0008) +[2023-10-13 01:29:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61997056. Throughput: 0: 1683.9, 1: 1661.8. Samples: 15507804. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:29:18,607][45375] Avg episode reward: [(0, '43.840'), (1, '54.540')] +[2023-10-13 01:29:18,718][46662] Updated weights for policy 0, policy_version 30290 (0.0008) +[2023-10-13 01:29:19,086][46662] Updated weights for policy 0, policy_version 30300 (0.0009) +[2023-10-13 01:29:21,433][46663] Updated weights for policy 1, policy_version 30281 (0.0009) +[2023-10-13 01:29:21,802][46663] Updated weights for policy 1, policy_version 30291 (0.0007) +[2023-10-13 01:29:22,176][46663] Updated weights for policy 1, policy_version 30301 (0.0007) +[2023-10-13 01:29:23,302][46662] Updated weights for policy 0, policy_version 30310 (0.0008) +[2023-10-13 01:29:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62062592. Throughput: 0: 1674.3, 1: 1680.0. Samples: 15528144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:29:23,607][45375] Avg episode reward: [(0, '43.960'), (1, '54.190')] +[2023-10-13 01:29:23,673][46662] Updated weights for policy 0, policy_version 30320 (0.0007) +[2023-10-13 01:29:24,039][46662] Updated weights for policy 0, policy_version 30330 (0.0007) +[2023-10-13 01:29:26,071][46663] Updated weights for policy 1, policy_version 30311 (0.0008) +[2023-10-13 01:29:26,438][46663] Updated weights for policy 1, policy_version 30321 (0.0007) +[2023-10-13 01:29:26,813][46663] Updated weights for policy 1, policy_version 30331 (0.0007) +[2023-10-13 01:29:28,055][46662] Updated weights for policy 0, policy_version 30340 (0.0007) +[2023-10-13 01:29:28,432][46662] Updated weights for policy 0, policy_version 30350 (0.0007) +[2023-10-13 01:29:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62128128. Throughput: 0: 1673.4, 1: 1674.8. Samples: 15538086. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:29:28,607][45375] Avg episode reward: [(0, '45.200'), (1, '53.380')] +[2023-10-13 01:29:28,806][46662] Updated weights for policy 0, policy_version 30360 (0.0007) +[2023-10-13 01:29:30,923][46663] Updated weights for policy 1, policy_version 30341 (0.0007) +[2023-10-13 01:29:31,280][46663] Updated weights for policy 1, policy_version 30351 (0.0008) +[2023-10-13 01:29:31,648][46663] Updated weights for policy 1, policy_version 30361 (0.0011) +[2023-10-13 01:29:32,988][46662] Updated weights for policy 0, policy_version 30370 (0.0008) +[2023-10-13 01:29:33,355][46662] Updated weights for policy 0, policy_version 30380 (0.0009) +[2023-10-13 01:29:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62193664. Throughput: 0: 1679.7, 1: 1670.6. Samples: 15558268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:29:33,607][45375] Avg episode reward: [(0, '45.300'), (1, '53.240')] +[2023-10-13 01:29:33,734][46662] Updated weights for policy 0, policy_version 30390 (0.0008) +[2023-10-13 01:29:34,104][46662] Updated weights for policy 0, policy_version 30400 (0.0007) +[2023-10-13 01:29:35,835][46663] Updated weights for policy 1, policy_version 30371 (0.0010) +[2023-10-13 01:29:36,203][46663] Updated weights for policy 1, policy_version 30381 (0.0009) +[2023-10-13 01:29:36,564][46663] Updated weights for policy 1, policy_version 30391 (0.0007) +[2023-10-13 01:29:38,067][46662] Updated weights for policy 0, policy_version 30410 (0.0008) +[2023-10-13 01:29:38,434][46662] Updated weights for policy 0, policy_version 30420 (0.0007) +[2023-10-13 01:29:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62259200. Throughput: 0: 1683.5, 1: 1694.3. Samples: 15579128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:29:38,607][45375] Avg episode reward: [(0, '44.200'), (1, '52.520')] +[2023-10-13 01:29:38,810][46662] Updated weights for policy 0, policy_version 30430 (0.0008) +[2023-10-13 01:29:40,530][46663] Updated weights for policy 1, policy_version 30401 (0.0008) +[2023-10-13 01:29:40,889][46663] Updated weights for policy 1, policy_version 30411 (0.0011) +[2023-10-13 01:29:41,255][46663] Updated weights for policy 1, policy_version 30421 (0.0010) +[2023-10-13 01:29:41,621][46663] Updated weights for policy 1, policy_version 30431 (0.0009) +[2023-10-13 01:29:42,739][46662] Updated weights for policy 0, policy_version 30440 (0.0009) +[2023-10-13 01:29:43,099][46662] Updated weights for policy 0, policy_version 30450 (0.0008) +[2023-10-13 01:29:43,482][46662] Updated weights for policy 0, policy_version 30460 (0.0010) +[2023-10-13 01:29:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62324736. Throughput: 0: 1687.0, 1: 1677.1. Samples: 15588840. Policy #0 lag: (min: 31.0, avg: 45.1, max: 63.0) +[2023-10-13 01:29:43,607][45375] Avg episode reward: [(0, '44.020'), (1, '54.100')] +[2023-10-13 01:29:45,717][46663] Updated weights for policy 1, policy_version 30441 (0.0008) +[2023-10-13 01:29:46,083][46663] Updated weights for policy 1, policy_version 30451 (0.0009) +[2023-10-13 01:29:46,448][46663] Updated weights for policy 1, policy_version 30461 (0.0009) +[2023-10-13 01:29:47,496][46662] Updated weights for policy 0, policy_version 30470 (0.0008) +[2023-10-13 01:29:47,863][46662] Updated weights for policy 0, policy_version 30480 (0.0009) +[2023-10-13 01:29:48,245][46662] Updated weights for policy 0, policy_version 30490 (0.0010) +[2023-10-13 01:29:48,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 62423040. Throughput: 0: 1689.2, 1: 1683.7. Samples: 15609272. 
Policy #0 lag: (min: 31.0, avg: 45.1, max: 63.0) +[2023-10-13 01:29:48,607][45375] Avg episode reward: [(0, '44.400'), (1, '53.550')] +[2023-10-13 01:29:50,671][46663] Updated weights for policy 1, policy_version 30471 (0.0009) +[2023-10-13 01:29:51,058][46663] Updated weights for policy 1, policy_version 30481 (0.0009) +[2023-10-13 01:29:51,421][46663] Updated weights for policy 1, policy_version 30491 (0.0009) +[2023-10-13 01:29:52,322][46662] Updated weights for policy 0, policy_version 30500 (0.0009) +[2023-10-13 01:29:52,692][46662] Updated weights for policy 0, policy_version 30510 (0.0010) +[2023-10-13 01:29:53,061][46662] Updated weights for policy 0, policy_version 30520 (0.0007) +[2023-10-13 01:29:53,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 62488576. Throughput: 0: 1673.3, 1: 1693.9. Samples: 15629198. Policy #0 lag: (min: 31.0, avg: 45.1, max: 63.0) +[2023-10-13 01:29:53,607][45375] Avg episode reward: [(0, '43.030'), (1, '53.860')] +[2023-10-13 01:29:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000030496_31227904.pth... +[2023-10-13 01:29:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000030528_31260672.pth... 
+[2023-10-13 01:29:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000028960_29655040.pth +[2023-10-13 01:29:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000028928_29622272.pth +[2023-10-13 01:29:55,422][46663] Updated weights for policy 1, policy_version 30501 (0.0008) +[2023-10-13 01:29:55,782][46663] Updated weights for policy 1, policy_version 30511 (0.0007) +[2023-10-13 01:29:56,152][46663] Updated weights for policy 1, policy_version 30521 (0.0009) +[2023-10-13 01:29:57,012][46662] Updated weights for policy 0, policy_version 30530 (0.0009) +[2023-10-13 01:29:57,382][46662] Updated weights for policy 0, policy_version 30540 (0.0009) +[2023-10-13 01:29:57,764][46662] Updated weights for policy 0, policy_version 30550 (0.0010) +[2023-10-13 01:29:58,142][46662] Updated weights for policy 0, policy_version 30560 (0.0008) +[2023-10-13 01:29:58,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 62554112. Throughput: 0: 1692.5, 1: 1667.0. Samples: 15639132. Policy #0 lag: (min: 31.0, avg: 45.1, max: 63.0) +[2023-10-13 01:29:58,608][45375] Avg episode reward: [(0, '43.520'), (1, '54.780')] +[2023-10-13 01:30:00,387][46663] Updated weights for policy 1, policy_version 30531 (0.0010) +[2023-10-13 01:30:00,753][46663] Updated weights for policy 1, policy_version 30541 (0.0009) +[2023-10-13 01:30:01,125][46663] Updated weights for policy 1, policy_version 30551 (0.0009) +[2023-10-13 01:30:02,315][46662] Updated weights for policy 0, policy_version 30570 (0.0009) +[2023-10-13 01:30:02,678][46662] Updated weights for policy 0, policy_version 30580 (0.0009) +[2023-10-13 01:30:03,046][46662] Updated weights for policy 0, policy_version 30590 (0.0007) +[2023-10-13 01:30:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 62619648. Throughput: 0: 1688.2, 1: 1682.7. Samples: 15659494. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:03,607][45375] Avg episode reward: [(0, '42.920'), (1, '53.500')] +[2023-10-13 01:30:05,227][46663] Updated weights for policy 1, policy_version 30561 (0.0009) +[2023-10-13 01:30:05,599][46663] Updated weights for policy 1, policy_version 30571 (0.0008) +[2023-10-13 01:30:05,960][46663] Updated weights for policy 1, policy_version 30581 (0.0008) +[2023-10-13 01:30:06,334][46663] Updated weights for policy 1, policy_version 30591 (0.0007) +[2023-10-13 01:30:07,177][46662] Updated weights for policy 0, policy_version 30600 (0.0009) +[2023-10-13 01:30:07,551][46662] Updated weights for policy 0, policy_version 30610 (0.0009) +[2023-10-13 01:30:07,922][46662] Updated weights for policy 0, policy_version 30620 (0.0010) +[2023-10-13 01:30:08,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 62685184. Throughput: 0: 1668.7, 1: 1683.2. Samples: 15678980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:08,607][45375] Avg episode reward: [(0, '43.980'), (1, '52.890')] +[2023-10-13 01:30:10,521][46663] Updated weights for policy 1, policy_version 30601 (0.0010) +[2023-10-13 01:30:10,890][46663] Updated weights for policy 1, policy_version 30611 (0.0010) +[2023-10-13 01:30:11,255][46663] Updated weights for policy 1, policy_version 30621 (0.0010) +[2023-10-13 01:30:12,040][46662] Updated weights for policy 0, policy_version 30630 (0.0010) +[2023-10-13 01:30:12,401][46662] Updated weights for policy 0, policy_version 30640 (0.0010) +[2023-10-13 01:30:12,771][46662] Updated weights for policy 0, policy_version 30650 (0.0011) +[2023-10-13 01:30:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 62750720. Throughput: 0: 1689.7, 1: 1663.6. Samples: 15688986. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:13,607][45375] Avg episode reward: [(0, '45.200'), (1, '52.720')] +[2023-10-13 01:30:15,241][46663] Updated weights for policy 1, policy_version 30631 (0.0010) +[2023-10-13 01:30:15,613][46663] Updated weights for policy 1, policy_version 30641 (0.0008) +[2023-10-13 01:30:15,978][46663] Updated weights for policy 1, policy_version 30651 (0.0011) +[2023-10-13 01:30:16,992][46662] Updated weights for policy 0, policy_version 30660 (0.0008) +[2023-10-13 01:30:17,372][46662] Updated weights for policy 0, policy_version 30670 (0.0007) +[2023-10-13 01:30:17,741][46662] Updated weights for policy 0, policy_version 30680 (0.0009) +[2023-10-13 01:30:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62816256. Throughput: 0: 1684.8, 1: 1678.0. Samples: 15709592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:18,607][45375] Avg episode reward: [(0, '44.720'), (1, '54.490')] +[2023-10-13 01:30:20,193][46663] Updated weights for policy 1, policy_version 30661 (0.0009) +[2023-10-13 01:30:20,570][46663] Updated weights for policy 1, policy_version 30671 (0.0008) +[2023-10-13 01:30:20,927][46663] Updated weights for policy 1, policy_version 30681 (0.0009) +[2023-10-13 01:30:21,818][46662] Updated weights for policy 0, policy_version 30690 (0.0007) +[2023-10-13 01:30:22,201][46662] Updated weights for policy 0, policy_version 30700 (0.0009) +[2023-10-13 01:30:22,567][46662] Updated weights for policy 0, policy_version 30710 (0.0009) +[2023-10-13 01:30:22,936][46662] Updated weights for policy 0, policy_version 30720 (0.0007) +[2023-10-13 01:30:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62881792. Throughput: 0: 1662.4, 1: 1670.3. Samples: 15729104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:23,608][45375] Avg episode reward: [(0, '44.120'), (1, '55.580')] +[2023-10-13 01:30:25,032][46663] Updated weights for policy 1, policy_version 30691 (0.0010) +[2023-10-13 01:30:25,397][46663] Updated weights for policy 1, policy_version 30701 (0.0009) +[2023-10-13 01:30:25,763][46663] Updated weights for policy 1, policy_version 30711 (0.0009) +[2023-10-13 01:30:27,096][46662] Updated weights for policy 0, policy_version 30730 (0.0010) +[2023-10-13 01:30:27,465][46662] Updated weights for policy 0, policy_version 30740 (0.0011) +[2023-10-13 01:30:27,838][46662] Updated weights for policy 0, policy_version 30750 (0.0010) +[2023-10-13 01:30:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62947328. Throughput: 0: 1681.3, 1: 1656.8. Samples: 15739052. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:28,607][45375] Avg episode reward: [(0, '44.250'), (1, '56.830')] +[2023-10-13 01:30:28,608][46384] Saving new best policy, reward=56.830! +[2023-10-13 01:30:29,987][46663] Updated weights for policy 1, policy_version 30721 (0.0009) +[2023-10-13 01:30:30,348][46663] Updated weights for policy 1, policy_version 30731 (0.0010) +[2023-10-13 01:30:30,726][46663] Updated weights for policy 1, policy_version 30741 (0.0010) +[2023-10-13 01:30:31,104][46663] Updated weights for policy 1, policy_version 30751 (0.0011) +[2023-10-13 01:30:31,954][46662] Updated weights for policy 0, policy_version 30760 (0.0008) +[2023-10-13 01:30:32,322][46662] Updated weights for policy 0, policy_version 30770 (0.0008) +[2023-10-13 01:30:32,700][46662] Updated weights for policy 0, policy_version 30780 (0.0010) +[2023-10-13 01:30:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63012864. Throughput: 0: 1672.7, 1: 1660.3. Samples: 15759256. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:33,607][45375] Avg episode reward: [(0, '43.830'), (1, '57.930')] +[2023-10-13 01:30:33,608][46384] Saving new best policy, reward=57.930! +[2023-10-13 01:30:35,053][46663] Updated weights for policy 1, policy_version 30761 (0.0008) +[2023-10-13 01:30:35,437][46663] Updated weights for policy 1, policy_version 30771 (0.0009) +[2023-10-13 01:30:35,809][46663] Updated weights for policy 1, policy_version 30781 (0.0008) +[2023-10-13 01:30:36,621][46662] Updated weights for policy 0, policy_version 30790 (0.0007) +[2023-10-13 01:30:36,994][46662] Updated weights for policy 0, policy_version 30800 (0.0007) +[2023-10-13 01:30:37,360][46662] Updated weights for policy 0, policy_version 30810 (0.0009) +[2023-10-13 01:30:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63078400. Throughput: 0: 1667.0, 1: 1660.5. Samples: 15778934. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:38,608][45375] Avg episode reward: [(0, '44.490'), (1, '58.480')] +[2023-10-13 01:30:38,617][46384] Saving new best policy, reward=58.480! +[2023-10-13 01:30:39,920][46663] Updated weights for policy 1, policy_version 30791 (0.0008) +[2023-10-13 01:30:40,276][46663] Updated weights for policy 1, policy_version 30801 (0.0008) +[2023-10-13 01:30:40,657][46663] Updated weights for policy 1, policy_version 30811 (0.0007) +[2023-10-13 01:30:41,299][46662] Updated weights for policy 0, policy_version 30820 (0.0009) +[2023-10-13 01:30:41,661][46662] Updated weights for policy 0, policy_version 30830 (0.0007) +[2023-10-13 01:30:42,028][46662] Updated weights for policy 0, policy_version 30840 (0.0008) +[2023-10-13 01:30:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63143936. Throughput: 0: 1681.0, 1: 1657.7. Samples: 15789374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:43,607][45375] Avg episode reward: [(0, '44.740'), (1, '58.900')] +[2023-10-13 01:30:43,608][46384] Saving new best policy, reward=58.900! +[2023-10-13 01:30:44,814][46663] Updated weights for policy 1, policy_version 30821 (0.0009) +[2023-10-13 01:30:45,181][46663] Updated weights for policy 1, policy_version 30831 (0.0008) +[2023-10-13 01:30:45,552][46663] Updated weights for policy 1, policy_version 30841 (0.0007) +[2023-10-13 01:30:46,254][46662] Updated weights for policy 0, policy_version 30850 (0.0009) +[2023-10-13 01:30:46,611][46662] Updated weights for policy 0, policy_version 30860 (0.0007) +[2023-10-13 01:30:46,993][46662] Updated weights for policy 0, policy_version 30870 (0.0010) +[2023-10-13 01:30:47,367][46662] Updated weights for policy 0, policy_version 30880 (0.0010) +[2023-10-13 01:30:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63209472. Throughput: 0: 1665.9, 1: 1665.8. Samples: 15809420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:30:48,608][45375] Avg episode reward: [(0, '43.450'), (1, '58.690')] +[2023-10-13 01:30:49,525][46663] Updated weights for policy 1, policy_version 30851 (0.0007) +[2023-10-13 01:30:49,896][46663] Updated weights for policy 1, policy_version 30861 (0.0008) +[2023-10-13 01:30:50,259][46663] Updated weights for policy 1, policy_version 30871 (0.0009) +[2023-10-13 01:30:51,531][46662] Updated weights for policy 0, policy_version 30890 (0.0010) +[2023-10-13 01:30:51,908][46662] Updated weights for policy 0, policy_version 30900 (0.0007) +[2023-10-13 01:30:52,276][46662] Updated weights for policy 0, policy_version 30910 (0.0008) +[2023-10-13 01:30:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 63275008. Throughput: 0: 1672.7, 1: 1665.5. Samples: 15829204. 
Policy #0 lag: (min: 26.0, avg: 29.5, max: 58.0) +[2023-10-13 01:30:53,608][45375] Avg episode reward: [(0, '44.070'), (1, '60.140')] +[2023-10-13 01:30:53,618][46384] Saving new best policy, reward=60.140! +[2023-10-13 01:30:54,401][46663] Updated weights for policy 1, policy_version 30881 (0.0010) +[2023-10-13 01:30:54,770][46663] Updated weights for policy 1, policy_version 30891 (0.0009) +[2023-10-13 01:30:55,148][46663] Updated weights for policy 1, policy_version 30901 (0.0008) +[2023-10-13 01:30:55,521][46663] Updated weights for policy 1, policy_version 30911 (0.0009) +[2023-10-13 01:30:56,237][46662] Updated weights for policy 0, policy_version 30920 (0.0009) +[2023-10-13 01:30:56,612][46662] Updated weights for policy 0, policy_version 30930 (0.0007) +[2023-10-13 01:30:56,976][46662] Updated weights for policy 0, policy_version 30940 (0.0008) +[2023-10-13 01:30:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 63340544. Throughput: 0: 1683.4, 1: 1665.4. Samples: 15839684. Policy #0 lag: (min: 26.0, avg: 29.5, max: 58.0) +[2023-10-13 01:30:58,607][45375] Avg episode reward: [(0, '42.110'), (1, '59.380')] +[2023-10-13 01:30:59,471][46663] Updated weights for policy 1, policy_version 30921 (0.0007) +[2023-10-13 01:30:59,835][46663] Updated weights for policy 1, policy_version 30931 (0.0007) +[2023-10-13 01:31:00,201][46663] Updated weights for policy 1, policy_version 30941 (0.0009) +[2023-10-13 01:31:01,017][46662] Updated weights for policy 0, policy_version 30950 (0.0009) +[2023-10-13 01:31:01,385][46662] Updated weights for policy 0, policy_version 30960 (0.0007) +[2023-10-13 01:31:01,753][46662] Updated weights for policy 0, policy_version 30970 (0.0009) +[2023-10-13 01:31:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63406080. Throughput: 0: 1661.0, 1: 1668.6. Samples: 15859426. 
Policy #0 lag: (min: 26.0, avg: 29.5, max: 58.0) +[2023-10-13 01:31:03,608][45375] Avg episode reward: [(0, '40.140'), (1, '58.950')] +[2023-10-13 01:31:04,196][46663] Updated weights for policy 1, policy_version 30951 (0.0009) +[2023-10-13 01:31:04,556][46663] Updated weights for policy 1, policy_version 30961 (0.0008) +[2023-10-13 01:31:04,929][46663] Updated weights for policy 1, policy_version 30971 (0.0008) +[2023-10-13 01:31:05,850][46662] Updated weights for policy 0, policy_version 30980 (0.0009) +[2023-10-13 01:31:06,221][46662] Updated weights for policy 0, policy_version 30990 (0.0007) +[2023-10-13 01:31:06,590][46662] Updated weights for policy 0, policy_version 31000 (0.0009) +[2023-10-13 01:31:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63471616. Throughput: 0: 1677.3, 1: 1670.6. Samples: 15879762. Policy #0 lag: (min: 26.0, avg: 29.5, max: 58.0) +[2023-10-13 01:31:08,607][45375] Avg episode reward: [(0, '39.260'), (1, '59.680')] +[2023-10-13 01:31:09,064][46663] Updated weights for policy 1, policy_version 30981 (0.0010) +[2023-10-13 01:31:09,435][46663] Updated weights for policy 1, policy_version 30991 (0.0008) +[2023-10-13 01:31:09,798][46663] Updated weights for policy 1, policy_version 31001 (0.0007) +[2023-10-13 01:31:10,716][46662] Updated weights for policy 0, policy_version 31010 (0.0008) +[2023-10-13 01:31:11,088][46662] Updated weights for policy 0, policy_version 31020 (0.0008) +[2023-10-13 01:31:11,454][46662] Updated weights for policy 0, policy_version 31030 (0.0009) +[2023-10-13 01:31:11,828][46662] Updated weights for policy 0, policy_version 31040 (0.0010) +[2023-10-13 01:31:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 63537152. Throughput: 0: 1676.8, 1: 1672.8. Samples: 15889782. 
Policy #0 lag: (min: 26.0, avg: 29.5, max: 58.0) +[2023-10-13 01:31:13,608][45375] Avg episode reward: [(0, '41.110'), (1, '60.360')] +[2023-10-13 01:31:13,609][46384] Saving new best policy, reward=60.360! +[2023-10-13 01:31:13,923][46663] Updated weights for policy 1, policy_version 31011 (0.0007) +[2023-10-13 01:31:14,285][46663] Updated weights for policy 1, policy_version 31021 (0.0007) +[2023-10-13 01:31:14,660][46663] Updated weights for policy 1, policy_version 31031 (0.0011) +[2023-10-13 01:31:15,961][46662] Updated weights for policy 0, policy_version 31050 (0.0008) +[2023-10-13 01:31:16,338][46662] Updated weights for policy 0, policy_version 31060 (0.0008) +[2023-10-13 01:31:16,715][46662] Updated weights for policy 0, policy_version 31070 (0.0009) +[2023-10-13 01:31:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63602688. Throughput: 0: 1661.2, 1: 1675.9. Samples: 15909426. Policy #0 lag: (min: 12.0, avg: 30.5, max: 32.0) +[2023-10-13 01:31:18,607][45375] Avg episode reward: [(0, '41.170'), (1, '61.790')] +[2023-10-13 01:31:18,608][46384] Saving new best policy, reward=61.790! +[2023-10-13 01:31:18,857][46663] Updated weights for policy 1, policy_version 31041 (0.0009) +[2023-10-13 01:31:19,228][46663] Updated weights for policy 1, policy_version 31051 (0.0007) +[2023-10-13 01:31:19,589][46663] Updated weights for policy 1, policy_version 31061 (0.0007) +[2023-10-13 01:31:19,956][46663] Updated weights for policy 1, policy_version 31071 (0.0009) +[2023-10-13 01:31:20,773][46662] Updated weights for policy 0, policy_version 31080 (0.0009) +[2023-10-13 01:31:21,142][46662] Updated weights for policy 0, policy_version 31090 (0.0009) +[2023-10-13 01:31:21,513][46662] Updated weights for policy 0, policy_version 31100 (0.0011) +[2023-10-13 01:31:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 63668224. Throughput: 0: 1679.5, 1: 1674.4. Samples: 15929860. 
Policy #0 lag: (min: 12.0, avg: 30.5, max: 32.0) +[2023-10-13 01:31:23,608][45375] Avg episode reward: [(0, '41.960'), (1, '62.090')] +[2023-10-13 01:31:23,619][46384] Saving new best policy, reward=62.090! +[2023-10-13 01:31:24,209][46663] Updated weights for policy 1, policy_version 31081 (0.0010) +[2023-10-13 01:31:24,579][46663] Updated weights for policy 1, policy_version 31091 (0.0007) +[2023-10-13 01:31:24,940][46663] Updated weights for policy 1, policy_version 31101 (0.0007) +[2023-10-13 01:31:25,654][46662] Updated weights for policy 0, policy_version 31110 (0.0008) +[2023-10-13 01:31:26,035][46662] Updated weights for policy 0, policy_version 31120 (0.0007) +[2023-10-13 01:31:26,407][46662] Updated weights for policy 0, policy_version 31130 (0.0008) +[2023-10-13 01:31:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63733760. Throughput: 0: 1667.5, 1: 1672.8. Samples: 15939686. Policy #0 lag: (min: 12.0, avg: 30.5, max: 32.0) +[2023-10-13 01:31:28,607][45375] Avg episode reward: [(0, '42.730'), (1, '61.450')] +[2023-10-13 01:31:29,201][46663] Updated weights for policy 1, policy_version 31111 (0.0009) +[2023-10-13 01:31:29,568][46663] Updated weights for policy 1, policy_version 31121 (0.0008) +[2023-10-13 01:31:29,942][46663] Updated weights for policy 1, policy_version 31131 (0.0007) +[2023-10-13 01:31:30,395][46662] Updated weights for policy 0, policy_version 31140 (0.0008) +[2023-10-13 01:31:30,764][46662] Updated weights for policy 0, policy_version 31150 (0.0009) +[2023-10-13 01:31:31,133][46662] Updated weights for policy 0, policy_version 31160 (0.0009) +[2023-10-13 01:31:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63799296. Throughput: 0: 1663.5, 1: 1667.5. Samples: 15959312. 
Policy #0 lag: (min: 12.0, avg: 30.5, max: 32.0) +[2023-10-13 01:31:33,607][45375] Avg episode reward: [(0, '41.800'), (1, '61.430')] +[2023-10-13 01:31:34,141][46663] Updated weights for policy 1, policy_version 31141 (0.0007) +[2023-10-13 01:31:34,507][46663] Updated weights for policy 1, policy_version 31151 (0.0008) +[2023-10-13 01:31:34,877][46663] Updated weights for policy 1, policy_version 31161 (0.0011) +[2023-10-13 01:31:35,208][46662] Updated weights for policy 0, policy_version 31170 (0.0009) +[2023-10-13 01:31:35,585][46662] Updated weights for policy 0, policy_version 31180 (0.0007) +[2023-10-13 01:31:35,958][46662] Updated weights for policy 0, policy_version 31190 (0.0009) +[2023-10-13 01:31:36,326][46662] Updated weights for policy 0, policy_version 31200 (0.0007) +[2023-10-13 01:31:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63864832. Throughput: 0: 1685.6, 1: 1672.2. Samples: 15980304. Policy #0 lag: (min: 12.0, avg: 30.5, max: 32.0) +[2023-10-13 01:31:38,607][45375] Avg episode reward: [(0, '41.170'), (1, '60.760')] +[2023-10-13 01:31:38,965][46663] Updated weights for policy 1, policy_version 31171 (0.0008) +[2023-10-13 01:31:39,346][46663] Updated weights for policy 1, policy_version 31181 (0.0008) +[2023-10-13 01:31:39,719][46663] Updated weights for policy 1, policy_version 31191 (0.0010) +[2023-10-13 01:31:40,366][46662] Updated weights for policy 0, policy_version 31210 (0.0008) +[2023-10-13 01:31:40,747][46662] Updated weights for policy 0, policy_version 31220 (0.0009) +[2023-10-13 01:31:41,110][46662] Updated weights for policy 0, policy_version 31230 (0.0009) +[2023-10-13 01:31:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 63930368. Throughput: 0: 1664.2, 1: 1672.5. Samples: 15989834. 
Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-13 01:31:43,607][45375] Avg episode reward: [(0, '41.520'), (1, '60.490')] +[2023-10-13 01:31:43,836][46663] Updated weights for policy 1, policy_version 31201 (0.0009) +[2023-10-13 01:31:44,209][46663] Updated weights for policy 1, policy_version 31211 (0.0009) +[2023-10-13 01:31:44,575][46663] Updated weights for policy 1, policy_version 31221 (0.0008) +[2023-10-13 01:31:44,947][46663] Updated weights for policy 1, policy_version 31231 (0.0008) +[2023-10-13 01:31:45,052][46662] Updated weights for policy 0, policy_version 31240 (0.0010) +[2023-10-13 01:31:45,419][46662] Updated weights for policy 0, policy_version 31250 (0.0010) +[2023-10-13 01:31:45,782][46662] Updated weights for policy 0, policy_version 31260 (0.0009) +[2023-10-13 01:31:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63995904. Throughput: 0: 1681.2, 1: 1666.2. Samples: 16010056. Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-13 01:31:48,607][45375] Avg episode reward: [(0, '41.380'), (1, '60.640')] +[2023-10-13 01:31:48,960][46663] Updated weights for policy 1, policy_version 31241 (0.0009) +[2023-10-13 01:31:49,322][46663] Updated weights for policy 1, policy_version 31251 (0.0009) +[2023-10-13 01:31:49,687][46663] Updated weights for policy 1, policy_version 31261 (0.0007) +[2023-10-13 01:31:49,762][46662] Updated weights for policy 0, policy_version 31270 (0.0010) +[2023-10-13 01:31:50,134][46662] Updated weights for policy 0, policy_version 31280 (0.0009) +[2023-10-13 01:31:50,507][46662] Updated weights for policy 0, policy_version 31290 (0.0009) +[2023-10-13 01:31:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 64061440. Throughput: 0: 1685.1, 1: 1670.0. Samples: 16030744. 
Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-13 01:31:53,608][45375] Avg episode reward: [(0, '41.500'), (1, '60.770')] +[2023-10-13 01:31:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000031296_32047104.pth... +[2023-10-13 01:31:53,650][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000029728_30441472.pth +[2023-10-13 01:31:53,654][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000031296_32047104.pth +[2023-10-13 01:31:53,723][46663] Updated weights for policy 1, policy_version 31271 (0.0008) +[2023-10-13 01:31:54,091][46663] Updated weights for policy 1, policy_version 31281 (0.0010) +[2023-10-13 01:31:54,466][46663] Updated weights for policy 1, policy_version 31291 (0.0010) +[2023-10-13 01:31:54,608][46662] Updated weights for policy 0, policy_version 31300 (0.0007) +[2023-10-13 01:31:54,649][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000031296_32047104.pth... +[2023-10-13 01:31:54,684][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000029728_30441472.pth +[2023-10-13 01:31:54,687][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000031296_32047104.pth +[2023-10-13 01:31:54,984][46662] Updated weights for policy 0, policy_version 31310 (0.0008) +[2023-10-13 01:31:55,344][46662] Updated weights for policy 0, policy_version 31320 (0.0008) +[2023-10-13 01:31:58,442][46663] Updated weights for policy 1, policy_version 31301 (0.0009) +[2023-10-13 01:31:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64126976. Throughput: 0: 1662.8, 1: 1669.3. Samples: 16039726. 
Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-13 01:31:58,607][45375] Avg episode reward: [(0, '42.500'), (1, '59.900')] +[2023-10-13 01:31:58,814][46663] Updated weights for policy 1, policy_version 31311 (0.0009) +[2023-10-13 01:31:59,182][46663] Updated weights for policy 1, policy_version 31321 (0.0008) +[2023-10-13 01:31:59,420][46662] Updated weights for policy 0, policy_version 31330 (0.0007) +[2023-10-13 01:31:59,795][46662] Updated weights for policy 0, policy_version 31340 (0.0009) +[2023-10-13 01:32:00,154][46662] Updated weights for policy 0, policy_version 31350 (0.0007) +[2023-10-13 01:32:00,525][46662] Updated weights for policy 0, policy_version 31360 (0.0010) +[2023-10-13 01:32:03,225][46663] Updated weights for policy 1, policy_version 31331 (0.0007) +[2023-10-13 01:32:03,596][46663] Updated weights for policy 1, policy_version 31341 (0.0007) +[2023-10-13 01:32:03,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64192512. Throughput: 0: 1684.0, 1: 1667.6. Samples: 16060250. Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-13 01:32:03,607][45375] Avg episode reward: [(0, '41.230'), (1, '59.480')] +[2023-10-13 01:32:03,958][46663] Updated weights for policy 1, policy_version 31351 (0.0008) +[2023-10-13 01:32:04,608][46662] Updated weights for policy 0, policy_version 31370 (0.0011) +[2023-10-13 01:32:04,984][46662] Updated weights for policy 0, policy_version 31380 (0.0009) +[2023-10-13 01:32:05,354][46662] Updated weights for policy 0, policy_version 31390 (0.0008) +[2023-10-13 01:32:07,841][46663] Updated weights for policy 1, policy_version 31361 (0.0007) +[2023-10-13 01:32:08,206][46663] Updated weights for policy 1, policy_version 31371 (0.0009) +[2023-10-13 01:32:08,580][46663] Updated weights for policy 1, policy_version 31381 (0.0008) +[2023-10-13 01:32:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64258048. 
Throughput: 0: 1686.2, 1: 1660.2. Samples: 16080450. Policy #0 lag: (min: 18.0, avg: 21.6, max: 50.0) +[2023-10-13 01:32:08,607][45375] Avg episode reward: [(0, '40.740'), (1, '59.190')] +[2023-10-13 01:32:08,940][46663] Updated weights for policy 1, policy_version 31391 (0.0011) +[2023-10-13 01:32:09,340][46662] Updated weights for policy 0, policy_version 31400 (0.0010) +[2023-10-13 01:32:09,712][46662] Updated weights for policy 0, policy_version 31410 (0.0009) +[2023-10-13 01:32:10,084][46662] Updated weights for policy 0, policy_version 31420 (0.0009) +[2023-10-13 01:32:13,089][46663] Updated weights for policy 1, policy_version 31401 (0.0008) +[2023-10-13 01:32:13,462][46663] Updated weights for policy 1, policy_version 31411 (0.0009) +[2023-10-13 01:32:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64323584. Throughput: 0: 1667.6, 1: 1678.5. Samples: 16090260. Policy #0 lag: (min: 18.0, avg: 21.6, max: 50.0) +[2023-10-13 01:32:13,607][45375] Avg episode reward: [(0, '41.160'), (1, '59.890')] +[2023-10-13 01:32:13,821][46663] Updated weights for policy 1, policy_version 31421 (0.0007) +[2023-10-13 01:32:14,148][46662] Updated weights for policy 0, policy_version 31430 (0.0008) +[2023-10-13 01:32:14,527][46662] Updated weights for policy 0, policy_version 31440 (0.0008) +[2023-10-13 01:32:14,898][46662] Updated weights for policy 0, policy_version 31450 (0.0008) +[2023-10-13 01:32:17,822][46663] Updated weights for policy 1, policy_version 31431 (0.0008) +[2023-10-13 01:32:18,190][46663] Updated weights for policy 1, policy_version 31441 (0.0009) +[2023-10-13 01:32:18,561][46663] Updated weights for policy 1, policy_version 31451 (0.0009) +[2023-10-13 01:32:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64389120. Throughput: 0: 1689.1, 1: 1682.3. Samples: 16111026. 
Policy #0 lag: (min: 18.0, avg: 21.6, max: 50.0) +[2023-10-13 01:32:18,607][45375] Avg episode reward: [(0, '41.350'), (1, '58.410')] +[2023-10-13 01:32:18,885][46662] Updated weights for policy 0, policy_version 31460 (0.0010) +[2023-10-13 01:32:19,263][46662] Updated weights for policy 0, policy_version 31470 (0.0008) +[2023-10-13 01:32:19,625][46662] Updated weights for policy 0, policy_version 31480 (0.0008) +[2023-10-13 01:32:22,605][46663] Updated weights for policy 1, policy_version 31461 (0.0008) +[2023-10-13 01:32:22,976][46663] Updated weights for policy 1, policy_version 31471 (0.0009) +[2023-10-13 01:32:23,334][46663] Updated weights for policy 1, policy_version 31481 (0.0009) +[2023-10-13 01:32:23,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64487424. Throughput: 0: 1688.1, 1: 1658.7. Samples: 16130910. Policy #0 lag: (min: 18.0, avg: 21.6, max: 50.0) +[2023-10-13 01:32:23,608][45375] Avg episode reward: [(0, '40.300'), (1, '57.830')] +[2023-10-13 01:32:23,788][46662] Updated weights for policy 0, policy_version 31490 (0.0008) +[2023-10-13 01:32:24,163][46662] Updated weights for policy 0, policy_version 31500 (0.0010) +[2023-10-13 01:32:24,525][46662] Updated weights for policy 0, policy_version 31510 (0.0008) +[2023-10-13 01:32:24,904][46662] Updated weights for policy 0, policy_version 31520 (0.0008) +[2023-10-13 01:32:27,417][46663] Updated weights for policy 1, policy_version 31491 (0.0009) +[2023-10-13 01:32:27,789][46663] Updated weights for policy 1, policy_version 31501 (0.0008) +[2023-10-13 01:32:28,154][46663] Updated weights for policy 1, policy_version 31511 (0.0010) +[2023-10-13 01:32:28,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64552960. Throughput: 0: 1678.3, 1: 1683.0. Samples: 16141094. 
Policy #0 lag: (min: 18.0, avg: 21.6, max: 50.0) +[2023-10-13 01:32:28,607][45375] Avg episode reward: [(0, '41.900'), (1, '57.290')] +[2023-10-13 01:32:28,914][46662] Updated weights for policy 0, policy_version 31530 (0.0010) +[2023-10-13 01:32:29,287][46662] Updated weights for policy 0, policy_version 31540 (0.0011) +[2023-10-13 01:32:29,664][46662] Updated weights for policy 0, policy_version 31550 (0.0011) +[2023-10-13 01:32:32,369][46663] Updated weights for policy 1, policy_version 31521 (0.0008) +[2023-10-13 01:32:32,740][46663] Updated weights for policy 1, policy_version 31531 (0.0010) +[2023-10-13 01:32:33,102][46663] Updated weights for policy 1, policy_version 31541 (0.0009) +[2023-10-13 01:32:33,467][46663] Updated weights for policy 1, policy_version 31551 (0.0007) +[2023-10-13 01:32:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64618496. Throughput: 0: 1680.3, 1: 1687.4. Samples: 16161602. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 01:32:33,607][45375] Avg episode reward: [(0, '43.180'), (1, '57.710')] +[2023-10-13 01:32:33,789][46662] Updated weights for policy 0, policy_version 31560 (0.0008) +[2023-10-13 01:32:34,165][46662] Updated weights for policy 0, policy_version 31570 (0.0007) +[2023-10-13 01:32:34,527][46662] Updated weights for policy 0, policy_version 31580 (0.0007) +[2023-10-13 01:32:37,378][46663] Updated weights for policy 1, policy_version 31561 (0.0008) +[2023-10-13 01:32:37,746][46663] Updated weights for policy 1, policy_version 31571 (0.0007) +[2023-10-13 01:32:38,112][46663] Updated weights for policy 1, policy_version 31581 (0.0009) +[2023-10-13 01:32:38,594][46662] Updated weights for policy 0, policy_version 31590 (0.0008) +[2023-10-13 01:32:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64684032. Throughput: 0: 1681.8, 1: 1659.4. Samples: 16181096. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 01:32:38,607][45375] Avg episode reward: [(0, '43.380'), (1, '56.990')] +[2023-10-13 01:32:38,961][46662] Updated weights for policy 0, policy_version 31600 (0.0008) +[2023-10-13 01:32:39,332][46662] Updated weights for policy 0, policy_version 31610 (0.0008) +[2023-10-13 01:32:42,111][46663] Updated weights for policy 1, policy_version 31591 (0.0007) +[2023-10-13 01:32:42,484][46663] Updated weights for policy 1, policy_version 31601 (0.0007) +[2023-10-13 01:32:42,839][46663] Updated weights for policy 1, policy_version 31611 (0.0009) +[2023-10-13 01:32:43,371][46662] Updated weights for policy 0, policy_version 31620 (0.0009) +[2023-10-13 01:32:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64749568. Throughput: 0: 1678.5, 1: 1694.0. Samples: 16191488. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 01:32:43,607][45375] Avg episode reward: [(0, '45.010'), (1, '56.810')] +[2023-10-13 01:32:43,740][46662] Updated weights for policy 0, policy_version 31630 (0.0009) +[2023-10-13 01:32:44,114][46662] Updated weights for policy 0, policy_version 31640 (0.0009) +[2023-10-13 01:32:47,017][46663] Updated weights for policy 1, policy_version 31621 (0.0009) +[2023-10-13 01:32:47,387][46663] Updated weights for policy 1, policy_version 31631 (0.0008) +[2023-10-13 01:32:47,748][46663] Updated weights for policy 1, policy_version 31641 (0.0009) +[2023-10-13 01:32:48,221][46662] Updated weights for policy 0, policy_version 31650 (0.0009) +[2023-10-13 01:32:48,582][46662] Updated weights for policy 0, policy_version 31660 (0.0007) +[2023-10-13 01:32:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64815104. Throughput: 0: 1681.8, 1: 1680.8. Samples: 16211566. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 01:32:48,607][45375] Avg episode reward: [(0, '45.370'), (1, '55.640')] +[2023-10-13 01:32:48,955][46662] Updated weights for policy 0, policy_version 31670 (0.0007) +[2023-10-13 01:32:49,326][46662] Updated weights for policy 0, policy_version 31680 (0.0008) +[2023-10-13 01:32:51,793][46663] Updated weights for policy 1, policy_version 31651 (0.0009) +[2023-10-13 01:32:52,166][46663] Updated weights for policy 1, policy_version 31661 (0.0010) +[2023-10-13 01:32:52,523][46663] Updated weights for policy 1, policy_version 31671 (0.0009) +[2023-10-13 01:32:53,545][46662] Updated weights for policy 0, policy_version 31690 (0.0007) +[2023-10-13 01:32:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64880640. Throughput: 0: 1685.5, 1: 1676.2. Samples: 16231726. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 01:32:53,607][45375] Avg episode reward: [(0, '46.330'), (1, '54.060')] +[2023-10-13 01:32:53,920][46662] Updated weights for policy 0, policy_version 31700 (0.0010) +[2023-10-13 01:32:54,282][46662] Updated weights for policy 0, policy_version 31710 (0.0010) +[2023-10-13 01:32:56,869][46663] Updated weights for policy 1, policy_version 31681 (0.0008) +[2023-10-13 01:32:57,224][46663] Updated weights for policy 1, policy_version 31691 (0.0010) +[2023-10-13 01:32:57,599][46663] Updated weights for policy 1, policy_version 31701 (0.0007) +[2023-10-13 01:32:57,976][46663] Updated weights for policy 1, policy_version 31711 (0.0007) +[2023-10-13 01:32:58,367][46662] Updated weights for policy 0, policy_version 31720 (0.0008) +[2023-10-13 01:32:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64946176. Throughput: 0: 1684.4, 1: 1692.7. Samples: 16242230. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:32:58,607][45375] Avg episode reward: [(0, '46.660'), (1, '53.270')] +[2023-10-13 01:32:58,739][46662] Updated weights for policy 0, policy_version 31730 (0.0009) +[2023-10-13 01:32:59,117][46662] Updated weights for policy 0, policy_version 31740 (0.0008) +[2023-10-13 01:33:02,081][46663] Updated weights for policy 1, policy_version 31721 (0.0008) +[2023-10-13 01:33:02,455][46663] Updated weights for policy 1, policy_version 31731 (0.0009) +[2023-10-13 01:33:02,820][46663] Updated weights for policy 1, policy_version 31741 (0.0008) +[2023-10-13 01:33:03,155][46662] Updated weights for policy 0, policy_version 31750 (0.0008) +[2023-10-13 01:33:03,526][46662] Updated weights for policy 0, policy_version 31760 (0.0008) +[2023-10-13 01:33:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65011712. Throughput: 0: 1682.5, 1: 1675.8. Samples: 16262148. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:33:03,608][45375] Avg episode reward: [(0, '47.460'), (1, '52.210')] +[2023-10-13 01:33:03,893][46662] Updated weights for policy 0, policy_version 31770 (0.0008) +[2023-10-13 01:33:06,826][46663] Updated weights for policy 1, policy_version 31751 (0.0008) +[2023-10-13 01:33:07,203][46663] Updated weights for policy 1, policy_version 31761 (0.0008) +[2023-10-13 01:33:07,570][46663] Updated weights for policy 1, policy_version 31771 (0.0007) +[2023-10-13 01:33:08,004][46662] Updated weights for policy 0, policy_version 31780 (0.0009) +[2023-10-13 01:33:08,373][46662] Updated weights for policy 0, policy_version 31790 (0.0008) +[2023-10-13 01:33:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65077248. Throughput: 0: 1678.8, 1: 1683.7. Samples: 16282224. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:33:08,607][45375] Avg episode reward: [(0, '48.790'), (1, '51.740')] +[2023-10-13 01:33:08,739][46662] Updated weights for policy 0, policy_version 31800 (0.0008) +[2023-10-13 01:33:11,749][46663] Updated weights for policy 1, policy_version 31781 (0.0009) +[2023-10-13 01:33:12,117][46663] Updated weights for policy 1, policy_version 31791 (0.0008) +[2023-10-13 01:33:12,481][46663] Updated weights for policy 1, policy_version 31801 (0.0009) +[2023-10-13 01:33:12,711][46662] Updated weights for policy 0, policy_version 31810 (0.0008) +[2023-10-13 01:33:13,089][46662] Updated weights for policy 0, policy_version 31820 (0.0008) +[2023-10-13 01:33:13,468][46662] Updated weights for policy 0, policy_version 31830 (0.0009) +[2023-10-13 01:33:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65142784. Throughput: 0: 1673.9, 1: 1685.3. Samples: 16292256. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:33:13,607][45375] Avg episode reward: [(0, '49.610'), (1, '51.250')] +[2023-10-13 01:33:13,829][46662] Updated weights for policy 0, policy_version 31840 (0.0009) +[2023-10-13 01:33:16,513][46663] Updated weights for policy 1, policy_version 31811 (0.0009) +[2023-10-13 01:33:16,877][46663] Updated weights for policy 1, policy_version 31821 (0.0008) +[2023-10-13 01:33:17,245][46663] Updated weights for policy 1, policy_version 31831 (0.0010) +[2023-10-13 01:33:17,865][46662] Updated weights for policy 0, policy_version 31850 (0.0007) +[2023-10-13 01:33:18,233][46662] Updated weights for policy 0, policy_version 31860 (0.0007) +[2023-10-13 01:33:18,604][46662] Updated weights for policy 0, policy_version 31870 (0.0007) +[2023-10-13 01:33:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 65208320. Throughput: 0: 1679.4, 1: 1661.2. Samples: 16311932. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:33:18,607][45375] Avg episode reward: [(0, '50.230'), (1, '51.380')] +[2023-10-13 01:33:21,319][46663] Updated weights for policy 1, policy_version 31841 (0.0009) +[2023-10-13 01:33:21,692][46663] Updated weights for policy 1, policy_version 31851 (0.0009) +[2023-10-13 01:33:22,066][46663] Updated weights for policy 1, policy_version 31861 (0.0011) +[2023-10-13 01:33:22,427][46663] Updated weights for policy 1, policy_version 31871 (0.0008) +[2023-10-13 01:33:22,701][46662] Updated weights for policy 0, policy_version 31880 (0.0008) +[2023-10-13 01:33:23,060][46662] Updated weights for policy 0, policy_version 31890 (0.0009) +[2023-10-13 01:33:23,438][46662] Updated weights for policy 0, policy_version 31900 (0.0007) +[2023-10-13 01:33:23,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65306624. Throughput: 0: 1668.2, 1: 1683.0. Samples: 16331902. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 01:33:23,608][45375] Avg episode reward: [(0, '49.130'), (1, '53.750')] +[2023-10-13 01:33:26,382][46663] Updated weights for policy 1, policy_version 31881 (0.0007) +[2023-10-13 01:33:26,752][46663] Updated weights for policy 1, policy_version 31891 (0.0007) +[2023-10-13 01:33:27,117][46663] Updated weights for policy 1, policy_version 31901 (0.0009) +[2023-10-13 01:33:27,448][46662] Updated weights for policy 0, policy_version 31910 (0.0008) +[2023-10-13 01:33:27,807][46662] Updated weights for policy 0, policy_version 31920 (0.0011) +[2023-10-13 01:33:28,177][46662] Updated weights for policy 0, policy_version 31930 (0.0009) +[2023-10-13 01:33:28,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65372160. Throughput: 0: 1686.0, 1: 1676.2. Samples: 16342784. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 01:33:28,607][45375] Avg episode reward: [(0, '48.990'), (1, '54.190')] +[2023-10-13 01:33:31,144][46663] Updated weights for policy 1, policy_version 31911 (0.0009) +[2023-10-13 01:33:31,518][46663] Updated weights for policy 1, policy_version 31921 (0.0008) +[2023-10-13 01:33:31,887][46663] Updated weights for policy 1, policy_version 31931 (0.0010) +[2023-10-13 01:33:32,380][46662] Updated weights for policy 0, policy_version 31940 (0.0008) +[2023-10-13 01:33:32,744][46662] Updated weights for policy 0, policy_version 31950 (0.0008) +[2023-10-13 01:33:33,119][46662] Updated weights for policy 0, policy_version 31960 (0.0007) +[2023-10-13 01:33:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65437696. Throughput: 0: 1680.6, 1: 1675.9. Samples: 16362608. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 01:33:33,608][45375] Avg episode reward: [(0, '48.530'), (1, '52.970')] +[2023-10-13 01:33:35,858][46663] Updated weights for policy 1, policy_version 31941 (0.0007) +[2023-10-13 01:33:36,232][46663] Updated weights for policy 1, policy_version 31951 (0.0007) +[2023-10-13 01:33:36,593][46663] Updated weights for policy 1, policy_version 31961 (0.0008) +[2023-10-13 01:33:36,923][46662] Updated weights for policy 0, policy_version 31970 (0.0008) +[2023-10-13 01:33:37,283][46662] Updated weights for policy 0, policy_version 31980 (0.0007) +[2023-10-13 01:33:37,664][46662] Updated weights for policy 0, policy_version 31990 (0.0008) +[2023-10-13 01:33:38,032][46662] Updated weights for policy 0, policy_version 32000 (0.0007) +[2023-10-13 01:33:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65503232. Throughput: 0: 1658.9, 1: 1695.8. Samples: 16382688. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 01:33:38,607][45375] Avg episode reward: [(0, '49.670'), (1, '53.880')] +[2023-10-13 01:33:40,549][46663] Updated weights for policy 1, policy_version 31971 (0.0008) +[2023-10-13 01:33:40,912][46663] Updated weights for policy 1, policy_version 31981 (0.0008) +[2023-10-13 01:33:41,283][46663] Updated weights for policy 1, policy_version 31991 (0.0009) +[2023-10-13 01:33:42,068][46662] Updated weights for policy 0, policy_version 32010 (0.0010) +[2023-10-13 01:33:42,445][46662] Updated weights for policy 0, policy_version 32020 (0.0009) +[2023-10-13 01:33:42,810][46662] Updated weights for policy 0, policy_version 32030 (0.0008) +[2023-10-13 01:33:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65568768. Throughput: 0: 1683.3, 1: 1671.5. Samples: 16393194. Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-13 01:33:43,608][45375] Avg episode reward: [(0, '47.840'), (1, '54.130')] +[2023-10-13 01:33:45,451][46663] Updated weights for policy 1, policy_version 32001 (0.0010) +[2023-10-13 01:33:45,830][46663] Updated weights for policy 1, policy_version 32011 (0.0010) +[2023-10-13 01:33:46,191][46663] Updated weights for policy 1, policy_version 32021 (0.0007) +[2023-10-13 01:33:46,554][46663] Updated weights for policy 1, policy_version 32031 (0.0009) +[2023-10-13 01:33:47,004][46662] Updated weights for policy 0, policy_version 32040 (0.0009) +[2023-10-13 01:33:47,375][46662] Updated weights for policy 0, policy_version 32050 (0.0009) +[2023-10-13 01:33:47,743][46662] Updated weights for policy 0, policy_version 32060 (0.0010) +[2023-10-13 01:33:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65634304. Throughput: 0: 1685.0, 1: 1674.9. Samples: 16413344. 
Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-13 01:33:48,607][45375] Avg episode reward: [(0, '47.800'), (1, '53.130')] +[2023-10-13 01:33:50,760][46663] Updated weights for policy 1, policy_version 32041 (0.0010) +[2023-10-13 01:33:51,125][46663] Updated weights for policy 1, policy_version 32051 (0.0010) +[2023-10-13 01:33:51,495][46663] Updated weights for policy 1, policy_version 32061 (0.0010) +[2023-10-13 01:33:51,746][46662] Updated weights for policy 0, policy_version 32070 (0.0009) +[2023-10-13 01:33:52,127][46662] Updated weights for policy 0, policy_version 32080 (0.0010) +[2023-10-13 01:33:52,489][46662] Updated weights for policy 0, policy_version 32090 (0.0009) +[2023-10-13 01:33:53,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 65699840. Throughput: 0: 1661.1, 1: 1681.6. Samples: 16432648. Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-13 01:33:53,607][45375] Avg episode reward: [(0, '47.440'), (1, '53.250')] +[2023-10-13 01:33:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000032096_32866304.pth... +[2023-10-13 01:33:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000032064_32833536.pth... 
+[2023-10-13 01:33:53,644][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000030528_31260672.pth +[2023-10-13 01:33:53,647][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000030496_31227904.pth +[2023-10-13 01:33:55,650][46663] Updated weights for policy 1, policy_version 32071 (0.0010) +[2023-10-13 01:33:56,024][46663] Updated weights for policy 1, policy_version 32081 (0.0009) +[2023-10-13 01:33:56,394][46663] Updated weights for policy 1, policy_version 32091 (0.0009) +[2023-10-13 01:33:56,581][46662] Updated weights for policy 0, policy_version 32100 (0.0010) +[2023-10-13 01:33:56,947][46662] Updated weights for policy 0, policy_version 32110 (0.0008) +[2023-10-13 01:33:57,319][46662] Updated weights for policy 0, policy_version 32120 (0.0009) +[2023-10-13 01:33:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65765376. Throughput: 0: 1696.4, 1: 1659.7. Samples: 16443280. Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-13 01:33:58,607][45375] Avg episode reward: [(0, '48.210'), (1, '53.990')] +[2023-10-13 01:34:00,437][46663] Updated weights for policy 1, policy_version 32101 (0.0008) +[2023-10-13 01:34:00,810][46663] Updated weights for policy 1, policy_version 32111 (0.0009) +[2023-10-13 01:34:01,182][46663] Updated weights for policy 1, policy_version 32121 (0.0008) +[2023-10-13 01:34:01,313][46662] Updated weights for policy 0, policy_version 32130 (0.0008) +[2023-10-13 01:34:01,682][46662] Updated weights for policy 0, policy_version 32140 (0.0008) +[2023-10-13 01:34:02,063][46662] Updated weights for policy 0, policy_version 32150 (0.0009) +[2023-10-13 01:34:02,427][46662] Updated weights for policy 0, policy_version 32160 (0.0008) +[2023-10-13 01:34:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65830912. Throughput: 0: 1680.0, 1: 1680.5. Samples: 16463156. 
Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-13 01:34:03,608][45375] Avg episode reward: [(0, '47.250'), (1, '54.610')] +[2023-10-13 01:34:05,125][46663] Updated weights for policy 1, policy_version 32131 (0.0008) +[2023-10-13 01:34:05,481][46663] Updated weights for policy 1, policy_version 32141 (0.0009) +[2023-10-13 01:34:05,850][46663] Updated weights for policy 1, policy_version 32151 (0.0010) +[2023-10-13 01:34:06,676][46662] Updated weights for policy 0, policy_version 32170 (0.0010) +[2023-10-13 01:34:07,047][46662] Updated weights for policy 0, policy_version 32180 (0.0010) +[2023-10-13 01:34:07,412][46662] Updated weights for policy 0, policy_version 32190 (0.0011) +[2023-10-13 01:34:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65896448. Throughput: 0: 1670.0, 1: 1688.3. Samples: 16483024. Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) +[2023-10-13 01:34:08,607][45375] Avg episode reward: [(0, '47.160'), (1, '54.350')] +[2023-10-13 01:34:10,010][46663] Updated weights for policy 1, policy_version 32161 (0.0009) +[2023-10-13 01:34:10,381][46663] Updated weights for policy 1, policy_version 32171 (0.0007) +[2023-10-13 01:34:10,757][46663] Updated weights for policy 1, policy_version 32181 (0.0007) +[2023-10-13 01:34:11,123][46663] Updated weights for policy 1, policy_version 32191 (0.0008) +[2023-10-13 01:34:11,350][46662] Updated weights for policy 0, policy_version 32200 (0.0008) +[2023-10-13 01:34:11,720][46662] Updated weights for policy 0, policy_version 32210 (0.0008) +[2023-10-13 01:34:12,089][46662] Updated weights for policy 0, policy_version 32220 (0.0009) +[2023-10-13 01:34:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65961984. Throughput: 0: 1685.7, 1: 1662.0. Samples: 16493430. 
Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) +[2023-10-13 01:34:13,608][45375] Avg episode reward: [(0, '46.040'), (1, '54.620')] +[2023-10-13 01:34:15,130][46663] Updated weights for policy 1, policy_version 32201 (0.0008) +[2023-10-13 01:34:15,491][46663] Updated weights for policy 1, policy_version 32211 (0.0008) +[2023-10-13 01:34:15,855][46663] Updated weights for policy 1, policy_version 32221 (0.0010) +[2023-10-13 01:34:16,192][46662] Updated weights for policy 0, policy_version 32230 (0.0008) +[2023-10-13 01:34:16,560][46662] Updated weights for policy 0, policy_version 32240 (0.0007) +[2023-10-13 01:34:16,936][46662] Updated weights for policy 0, policy_version 32250 (0.0008) +[2023-10-13 01:34:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66027520. Throughput: 0: 1669.1, 1: 1681.9. Samples: 16513400. Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) +[2023-10-13 01:34:18,607][45375] Avg episode reward: [(0, '44.310'), (1, '54.080')] +[2023-10-13 01:34:20,126][46663] Updated weights for policy 1, policy_version 32231 (0.0008) +[2023-10-13 01:34:20,494][46663] Updated weights for policy 1, policy_version 32241 (0.0008) +[2023-10-13 01:34:20,856][46663] Updated weights for policy 1, policy_version 32251 (0.0008) +[2023-10-13 01:34:20,995][46662] Updated weights for policy 0, policy_version 32260 (0.0008) +[2023-10-13 01:34:21,355][46662] Updated weights for policy 0, policy_version 32270 (0.0008) +[2023-10-13 01:34:21,727][46662] Updated weights for policy 0, policy_version 32280 (0.0009) +[2023-10-13 01:34:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66093056. Throughput: 0: 1674.4, 1: 1672.3. Samples: 16533290. 
Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) +[2023-10-13 01:34:23,608][45375] Avg episode reward: [(0, '43.180'), (1, '53.480')] +[2023-10-13 01:34:25,048][46663] Updated weights for policy 1, policy_version 32261 (0.0008) +[2023-10-13 01:34:25,415][46663] Updated weights for policy 1, policy_version 32271 (0.0007) +[2023-10-13 01:34:25,736][46662] Updated weights for policy 0, policy_version 32290 (0.0010) +[2023-10-13 01:34:25,778][46663] Updated weights for policy 1, policy_version 32281 (0.0007) +[2023-10-13 01:34:26,099][46662] Updated weights for policy 0, policy_version 32300 (0.0009) +[2023-10-13 01:34:26,473][46662] Updated weights for policy 0, policy_version 32310 (0.0008) +[2023-10-13 01:34:26,840][46662] Updated weights for policy 0, policy_version 32320 (0.0008) +[2023-10-13 01:34:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66158592. Throughput: 0: 1679.5, 1: 1664.0. Samples: 16543652. Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) +[2023-10-13 01:34:28,607][45375] Avg episode reward: [(0, '43.520'), (1, '52.960')] +[2023-10-13 01:34:29,796][46663] Updated weights for policy 1, policy_version 32291 (0.0009) +[2023-10-13 01:34:30,164][46663] Updated weights for policy 1, policy_version 32301 (0.0009) +[2023-10-13 01:34:30,530][46663] Updated weights for policy 1, policy_version 32311 (0.0008) +[2023-10-13 01:34:30,938][46662] Updated weights for policy 0, policy_version 32330 (0.0009) +[2023-10-13 01:34:31,310][46662] Updated weights for policy 0, policy_version 32340 (0.0007) +[2023-10-13 01:34:31,677][46662] Updated weights for policy 0, policy_version 32350 (0.0008) +[2023-10-13 01:34:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66224128. Throughput: 0: 1653.2, 1: 1676.3. Samples: 16563170. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:34:33,608][45375] Avg episode reward: [(0, '42.390'), (1, '51.640')] +[2023-10-13 01:34:34,494][46663] Updated weights for policy 1, policy_version 32321 (0.0008) +[2023-10-13 01:34:34,866][46663] Updated weights for policy 1, policy_version 32331 (0.0011) +[2023-10-13 01:34:35,231][46663] Updated weights for policy 1, policy_version 32341 (0.0011) +[2023-10-13 01:34:35,605][46663] Updated weights for policy 1, policy_version 32351 (0.0007) +[2023-10-13 01:34:35,752][46662] Updated weights for policy 0, policy_version 32360 (0.0007) +[2023-10-13 01:34:36,117][46662] Updated weights for policy 0, policy_version 32370 (0.0009) +[2023-10-13 01:34:36,487][46662] Updated weights for policy 0, policy_version 32380 (0.0008) +[2023-10-13 01:34:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66289664. Throughput: 0: 1678.7, 1: 1682.7. Samples: 16583910. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:34:38,607][45375] Avg episode reward: [(0, '42.180'), (1, '50.570')] +[2023-10-13 01:34:39,692][46663] Updated weights for policy 1, policy_version 32361 (0.0007) +[2023-10-13 01:34:40,062][46663] Updated weights for policy 1, policy_version 32371 (0.0010) +[2023-10-13 01:34:40,433][46663] Updated weights for policy 1, policy_version 32381 (0.0008) +[2023-10-13 01:34:40,602][46662] Updated weights for policy 0, policy_version 32390 (0.0009) +[2023-10-13 01:34:40,980][46662] Updated weights for policy 0, policy_version 32400 (0.0009) +[2023-10-13 01:34:41,343][46662] Updated weights for policy 0, policy_version 32410 (0.0010) +[2023-10-13 01:34:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66355200. Throughput: 0: 1669.2, 1: 1678.7. Samples: 16593936. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:34:43,607][45375] Avg episode reward: [(0, '43.000'), (1, '50.580')] +[2023-10-13 01:34:44,394][46663] Updated weights for policy 1, policy_version 32391 (0.0009) +[2023-10-13 01:34:44,760][46663] Updated weights for policy 1, policy_version 32401 (0.0008) +[2023-10-13 01:34:45,119][46663] Updated weights for policy 1, policy_version 32411 (0.0008) +[2023-10-13 01:34:45,420][46662] Updated weights for policy 0, policy_version 32420 (0.0009) +[2023-10-13 01:34:45,786][46662] Updated weights for policy 0, policy_version 32430 (0.0007) +[2023-10-13 01:34:46,153][46662] Updated weights for policy 0, policy_version 32440 (0.0009) +[2023-10-13 01:34:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66420736. Throughput: 0: 1667.0, 1: 1682.3. Samples: 16613872. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:34:48,607][45375] Avg episode reward: [(0, '44.300'), (1, '52.130')] +[2023-10-13 01:34:49,310][46663] Updated weights for policy 1, policy_version 32421 (0.0007) +[2023-10-13 01:34:49,679][46663] Updated weights for policy 1, policy_version 32431 (0.0009) +[2023-10-13 01:34:50,057][46663] Updated weights for policy 1, policy_version 32441 (0.0007) +[2023-10-13 01:34:50,273][46662] Updated weights for policy 0, policy_version 32450 (0.0009) +[2023-10-13 01:34:50,636][46662] Updated weights for policy 0, policy_version 32460 (0.0008) +[2023-10-13 01:34:51,007][46662] Updated weights for policy 0, policy_version 32470 (0.0007) +[2023-10-13 01:34:51,374][46662] Updated weights for policy 0, policy_version 32480 (0.0007) +[2023-10-13 01:34:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66486272. Throughput: 0: 1688.1, 1: 1679.1. Samples: 16634548. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:34:53,607][45375] Avg episode reward: [(0, '42.800'), (1, '52.050')] +[2023-10-13 01:34:54,118][46663] Updated weights for policy 1, policy_version 32451 (0.0008) +[2023-10-13 01:34:54,484][46663] Updated weights for policy 1, policy_version 32461 (0.0009) +[2023-10-13 01:34:54,850][46663] Updated weights for policy 1, policy_version 32471 (0.0008) +[2023-10-13 01:34:55,431][46662] Updated weights for policy 0, policy_version 32490 (0.0009) +[2023-10-13 01:34:55,797][46662] Updated weights for policy 0, policy_version 32500 (0.0010) +[2023-10-13 01:34:56,174][46662] Updated weights for policy 0, policy_version 32510 (0.0010) +[2023-10-13 01:34:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66551808. Throughput: 0: 1667.0, 1: 1680.5. Samples: 16644068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:34:58,607][45375] Avg episode reward: [(0, '43.060'), (1, '51.950')] +[2023-10-13 01:34:58,854][46663] Updated weights for policy 1, policy_version 32481 (0.0007) +[2023-10-13 01:34:59,224][46663] Updated weights for policy 1, policy_version 32491 (0.0007) +[2023-10-13 01:34:59,592][46663] Updated weights for policy 1, policy_version 32501 (0.0007) +[2023-10-13 01:34:59,959][46663] Updated weights for policy 1, policy_version 32511 (0.0008) +[2023-10-13 01:35:00,130][46662] Updated weights for policy 0, policy_version 32520 (0.0010) +[2023-10-13 01:35:00,509][46662] Updated weights for policy 0, policy_version 32530 (0.0010) +[2023-10-13 01:35:00,871][46662] Updated weights for policy 0, policy_version 32540 (0.0010) +[2023-10-13 01:35:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66617344. Throughput: 0: 1673.9, 1: 1681.8. Samples: 16664408. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:03,608][45375] Avg episode reward: [(0, '44.840'), (1, '51.950')] +[2023-10-13 01:35:03,909][46663] Updated weights for policy 1, policy_version 32521 (0.0009) +[2023-10-13 01:35:04,266][46663] Updated weights for policy 1, policy_version 32531 (0.0007) +[2023-10-13 01:35:04,644][46663] Updated weights for policy 1, policy_version 32541 (0.0008) +[2023-10-13 01:35:04,953][46662] Updated weights for policy 0, policy_version 32550 (0.0010) +[2023-10-13 01:35:05,323][46662] Updated weights for policy 0, policy_version 32560 (0.0008) +[2023-10-13 01:35:05,701][46662] Updated weights for policy 0, policy_version 32570 (0.0007) +[2023-10-13 01:35:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66682880. Throughput: 0: 1687.0, 1: 1684.5. Samples: 16685010. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:08,607][45375] Avg episode reward: [(0, '44.410'), (1, '51.120')] +[2023-10-13 01:35:08,702][46663] Updated weights for policy 1, policy_version 32551 (0.0008) +[2023-10-13 01:35:09,079][46663] Updated weights for policy 1, policy_version 32561 (0.0008) +[2023-10-13 01:35:09,444][46663] Updated weights for policy 1, policy_version 32571 (0.0007) +[2023-10-13 01:35:09,712][46662] Updated weights for policy 0, policy_version 32580 (0.0009) +[2023-10-13 01:35:10,084][46662] Updated weights for policy 0, policy_version 32590 (0.0007) +[2023-10-13 01:35:10,454][46662] Updated weights for policy 0, policy_version 32600 (0.0007) +[2023-10-13 01:35:13,315][46663] Updated weights for policy 1, policy_version 32581 (0.0008) +[2023-10-13 01:35:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66748416. Throughput: 0: 1658.3, 1: 1687.1. Samples: 16694194. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:13,607][45375] Avg episode reward: [(0, '43.760'), (1, '51.200')] +[2023-10-13 01:35:13,682][46663] Updated weights for policy 1, policy_version 32591 (0.0011) +[2023-10-13 01:35:14,063][46663] Updated weights for policy 1, policy_version 32601 (0.0011) +[2023-10-13 01:35:14,612][46662] Updated weights for policy 0, policy_version 32610 (0.0008) +[2023-10-13 01:35:14,973][46662] Updated weights for policy 0, policy_version 32620 (0.0009) +[2023-10-13 01:35:15,348][46662] Updated weights for policy 0, policy_version 32630 (0.0009) +[2023-10-13 01:35:15,713][46662] Updated weights for policy 0, policy_version 32640 (0.0009) +[2023-10-13 01:35:18,177][46663] Updated weights for policy 1, policy_version 32611 (0.0009) +[2023-10-13 01:35:18,551][46663] Updated weights for policy 1, policy_version 32621 (0.0009) +[2023-10-13 01:35:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66813952. Throughput: 0: 1682.5, 1: 1690.5. Samples: 16714952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:18,607][45375] Avg episode reward: [(0, '45.360'), (1, '52.280')] +[2023-10-13 01:35:18,914][46663] Updated weights for policy 1, policy_version 32631 (0.0008) +[2023-10-13 01:35:19,896][46662] Updated weights for policy 0, policy_version 32650 (0.0007) +[2023-10-13 01:35:20,269][46662] Updated weights for policy 0, policy_version 32660 (0.0008) +[2023-10-13 01:35:20,640][46662] Updated weights for policy 0, policy_version 32670 (0.0008) +[2023-10-13 01:35:22,894][46663] Updated weights for policy 1, policy_version 32641 (0.0009) +[2023-10-13 01:35:23,261][46663] Updated weights for policy 1, policy_version 32651 (0.0010) +[2023-10-13 01:35:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66879488. Throughput: 0: 1682.4, 1: 1678.3. Samples: 16735140. 
Policy #0 lag: (min: 6.0, avg: 7.3, max: 31.0) +[2023-10-13 01:35:23,607][45375] Avg episode reward: [(0, '45.880'), (1, '52.580')] +[2023-10-13 01:35:23,621][46663] Updated weights for policy 1, policy_version 32661 (0.0009) +[2023-10-13 01:35:23,992][46663] Updated weights for policy 1, policy_version 32671 (0.0010) +[2023-10-13 01:35:24,559][46662] Updated weights for policy 0, policy_version 32680 (0.0009) +[2023-10-13 01:35:24,933][46662] Updated weights for policy 0, policy_version 32690 (0.0008) +[2023-10-13 01:35:25,303][46662] Updated weights for policy 0, policy_version 32700 (0.0008) +[2023-10-13 01:35:28,285][46663] Updated weights for policy 1, policy_version 32681 (0.0008) +[2023-10-13 01:35:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 66945024. Throughput: 0: 1663.5, 1: 1688.5. Samples: 16744778. Policy #0 lag: (min: 6.0, avg: 7.3, max: 31.0) +[2023-10-13 01:35:28,607][45375] Avg episode reward: [(0, '46.710'), (1, '52.470')] +[2023-10-13 01:35:28,667][46663] Updated weights for policy 1, policy_version 32691 (0.0008) +[2023-10-13 01:35:29,037][46663] Updated weights for policy 1, policy_version 32701 (0.0009) +[2023-10-13 01:35:29,466][46662] Updated weights for policy 0, policy_version 32710 (0.0008) +[2023-10-13 01:35:29,831][46662] Updated weights for policy 0, policy_version 32720 (0.0008) +[2023-10-13 01:35:30,208][46662] Updated weights for policy 0, policy_version 32730 (0.0008) +[2023-10-13 01:35:33,206][46663] Updated weights for policy 1, policy_version 32711 (0.0008) +[2023-10-13 01:35:33,570][46663] Updated weights for policy 1, policy_version 32721 (0.0007) +[2023-10-13 01:35:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67010560. Throughput: 0: 1681.0, 1: 1684.1. Samples: 16765300. 
Policy #0 lag: (min: 6.0, avg: 7.3, max: 31.0) +[2023-10-13 01:35:33,608][45375] Avg episode reward: [(0, '46.820'), (1, '51.810')] +[2023-10-13 01:35:33,940][46663] Updated weights for policy 1, policy_version 32731 (0.0007) +[2023-10-13 01:35:34,282][46662] Updated weights for policy 0, policy_version 32740 (0.0007) +[2023-10-13 01:35:34,663][46662] Updated weights for policy 0, policy_version 32750 (0.0008) +[2023-10-13 01:35:35,039][46662] Updated weights for policy 0, policy_version 32760 (0.0008) +[2023-10-13 01:35:38,045][46663] Updated weights for policy 1, policy_version 32741 (0.0007) +[2023-10-13 01:35:38,406][46663] Updated weights for policy 1, policy_version 32751 (0.0010) +[2023-10-13 01:35:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67076096. Throughput: 0: 1683.1, 1: 1673.3. Samples: 16785588. Policy #0 lag: (min: 6.0, avg: 7.3, max: 31.0) +[2023-10-13 01:35:38,607][45375] Avg episode reward: [(0, '46.090'), (1, '52.540')] +[2023-10-13 01:35:38,779][46663] Updated weights for policy 1, policy_version 32761 (0.0009) +[2023-10-13 01:35:39,310][46662] Updated weights for policy 0, policy_version 32770 (0.0009) +[2023-10-13 01:35:39,688][46662] Updated weights for policy 0, policy_version 32780 (0.0008) +[2023-10-13 01:35:40,054][46662] Updated weights for policy 0, policy_version 32790 (0.0008) +[2023-10-13 01:35:40,424][46662] Updated weights for policy 0, policy_version 32800 (0.0008) +[2023-10-13 01:35:42,859][46663] Updated weights for policy 1, policy_version 32771 (0.0009) +[2023-10-13 01:35:43,219][46663] Updated weights for policy 1, policy_version 32781 (0.0010) +[2023-10-13 01:35:43,600][46663] Updated weights for policy 1, policy_version 32791 (0.0010) +[2023-10-13 01:35:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67141632. Throughput: 0: 1672.1, 1: 1685.9. Samples: 16795178. 
Policy #0 lag: (min: 6.0, avg: 7.3, max: 31.0) +[2023-10-13 01:35:43,607][45375] Avg episode reward: [(0, '46.880'), (1, '51.700')] +[2023-10-13 01:35:44,464][46662] Updated weights for policy 0, policy_version 32810 (0.0010) +[2023-10-13 01:35:44,843][46662] Updated weights for policy 0, policy_version 32820 (0.0011) +[2023-10-13 01:35:45,217][46662] Updated weights for policy 0, policy_version 32830 (0.0011) +[2023-10-13 01:35:47,743][46663] Updated weights for policy 1, policy_version 32801 (0.0009) +[2023-10-13 01:35:48,117][46663] Updated weights for policy 1, policy_version 32811 (0.0010) +[2023-10-13 01:35:48,483][46663] Updated weights for policy 1, policy_version 32821 (0.0010) +[2023-10-13 01:35:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67207168. Throughput: 0: 1684.2, 1: 1682.3. Samples: 16815902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:48,607][45375] Avg episode reward: [(0, '46.960'), (1, '50.010')] +[2023-10-13 01:35:48,866][46663] Updated weights for policy 1, policy_version 32831 (0.0007) +[2023-10-13 01:35:49,220][46662] Updated weights for policy 0, policy_version 32840 (0.0008) +[2023-10-13 01:35:49,593][46662] Updated weights for policy 0, policy_version 32850 (0.0009) +[2023-10-13 01:35:49,952][46662] Updated weights for policy 0, policy_version 32860 (0.0007) +[2023-10-13 01:35:52,976][46663] Updated weights for policy 1, policy_version 32841 (0.0009) +[2023-10-13 01:35:53,343][46663] Updated weights for policy 1, policy_version 32851 (0.0008) +[2023-10-13 01:35:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67272704. Throughput: 0: 1685.5, 1: 1664.9. Samples: 16835776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:53,607][45375] Avg episode reward: [(0, '47.590'), (1, '48.770')] +[2023-10-13 01:35:53,717][46663] Updated weights for policy 1, policy_version 32861 (0.0007) +[2023-10-13 01:35:53,820][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000032864_33652736.pth... +[2023-10-13 01:35:53,849][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000031296_32047104.pth +[2023-10-13 01:35:53,897][46662] Updated weights for policy 0, policy_version 32870 (0.0007) +[2023-10-13 01:35:54,269][46662] Updated weights for policy 0, policy_version 32880 (0.0008) +[2023-10-13 01:35:54,636][46662] Updated weights for policy 0, policy_version 32890 (0.0007) +[2023-10-13 01:35:54,856][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000032896_33685504.pth... +[2023-10-13 01:35:54,895][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000031296_32047104.pth +[2023-10-13 01:35:57,799][46663] Updated weights for policy 1, policy_version 32871 (0.0009) +[2023-10-13 01:35:58,167][46663] Updated weights for policy 1, policy_version 32881 (0.0009) +[2023-10-13 01:35:58,548][46663] Updated weights for policy 1, policy_version 32891 (0.0010) +[2023-10-13 01:35:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67338240. Throughput: 0: 1689.2, 1: 1678.4. Samples: 16845734. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:35:58,607][45375] Avg episode reward: [(0, '47.720'), (1, '49.370')] +[2023-10-13 01:35:58,720][46662] Updated weights for policy 0, policy_version 32900 (0.0010) +[2023-10-13 01:35:59,091][46662] Updated weights for policy 0, policy_version 32910 (0.0007) +[2023-10-13 01:35:59,465][46662] Updated weights for policy 0, policy_version 32920 (0.0009) +[2023-10-13 01:36:02,661][46663] Updated weights for policy 1, policy_version 32901 (0.0009) +[2023-10-13 01:36:03,027][46663] Updated weights for policy 1, policy_version 32911 (0.0007) +[2023-10-13 01:36:03,392][46663] Updated weights for policy 1, policy_version 32921 (0.0008) +[2023-10-13 01:36:03,410][46662] Updated weights for policy 0, policy_version 32930 (0.0009) +[2023-10-13 01:36:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 67403776. Throughput: 0: 1689.3, 1: 1674.4. Samples: 16866316. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:36:03,608][45375] Avg episode reward: [(0, '47.420'), (1, '50.190')] +[2023-10-13 01:36:03,766][46662] Updated weights for policy 0, policy_version 32940 (0.0009) +[2023-10-13 01:36:04,144][46662] Updated weights for policy 0, policy_version 32950 (0.0008) +[2023-10-13 01:36:04,506][46662] Updated weights for policy 0, policy_version 32960 (0.0007) +[2023-10-13 01:36:07,407][46663] Updated weights for policy 1, policy_version 32931 (0.0010) +[2023-10-13 01:36:07,761][46663] Updated weights for policy 1, policy_version 32941 (0.0010) +[2023-10-13 01:36:08,125][46663] Updated weights for policy 1, policy_version 32951 (0.0011) +[2023-10-13 01:36:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67502080. Throughput: 0: 1691.7, 1: 1659.5. Samples: 16885944. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:36:08,607][45375] Avg episode reward: [(0, '46.430'), (1, '50.330')] +[2023-10-13 01:36:08,673][46662] Updated weights for policy 0, policy_version 32970 (0.0009) +[2023-10-13 01:36:09,043][46662] Updated weights for policy 0, policy_version 32980 (0.0009) +[2023-10-13 01:36:09,416][46662] Updated weights for policy 0, policy_version 32990 (0.0007) +[2023-10-13 01:36:12,229][46663] Updated weights for policy 1, policy_version 32961 (0.0008) +[2023-10-13 01:36:12,593][46663] Updated weights for policy 1, policy_version 32971 (0.0009) +[2023-10-13 01:36:12,973][46663] Updated weights for policy 1, policy_version 32981 (0.0009) +[2023-10-13 01:36:13,338][46663] Updated weights for policy 1, policy_version 32991 (0.0007) +[2023-10-13 01:36:13,424][46662] Updated weights for policy 0, policy_version 33000 (0.0008) +[2023-10-13 01:36:13,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67567616. Throughput: 0: 1691.1, 1: 1672.3. Samples: 16896134. Policy #0 lag: (min: 14.0, avg: 14.4, max: 28.0) +[2023-10-13 01:36:13,608][45375] Avg episode reward: [(0, '46.190'), (1, '49.340')] +[2023-10-13 01:36:13,791][46662] Updated weights for policy 0, policy_version 33010 (0.0011) +[2023-10-13 01:36:14,174][46662] Updated weights for policy 0, policy_version 33020 (0.0011) +[2023-10-13 01:36:17,654][46663] Updated weights for policy 1, policy_version 33001 (0.0010) +[2023-10-13 01:36:18,027][46663] Updated weights for policy 1, policy_version 33011 (0.0007) +[2023-10-13 01:36:18,317][46662] Updated weights for policy 0, policy_version 33030 (0.0008) +[2023-10-13 01:36:18,392][46663] Updated weights for policy 1, policy_version 33021 (0.0009) +[2023-10-13 01:36:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67633152. Throughput: 0: 1688.0, 1: 1672.6. Samples: 16916524. 
Policy #0 lag: (min: 14.0, avg: 14.4, max: 28.0) +[2023-10-13 01:36:18,608][45375] Avg episode reward: [(0, '47.030'), (1, '51.050')] +[2023-10-13 01:36:18,690][46662] Updated weights for policy 0, policy_version 33040 (0.0008) +[2023-10-13 01:36:19,059][46662] Updated weights for policy 0, policy_version 33050 (0.0007) +[2023-10-13 01:36:22,356][46663] Updated weights for policy 1, policy_version 33031 (0.0008) +[2023-10-13 01:36:22,736][46663] Updated weights for policy 1, policy_version 33041 (0.0010) +[2023-10-13 01:36:23,063][46662] Updated weights for policy 0, policy_version 33060 (0.0007) +[2023-10-13 01:36:23,114][46663] Updated weights for policy 1, policy_version 33051 (0.0009) +[2023-10-13 01:36:23,432][46662] Updated weights for policy 0, policy_version 33070 (0.0009) +[2023-10-13 01:36:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67698688. Throughput: 0: 1687.1, 1: 1655.0. Samples: 16935984. Policy #0 lag: (min: 14.0, avg: 14.4, max: 28.0) +[2023-10-13 01:36:23,608][45375] Avg episode reward: [(0, '46.710'), (1, '49.910')] +[2023-10-13 01:36:23,814][46662] Updated weights for policy 0, policy_version 33080 (0.0010) +[2023-10-13 01:36:27,085][46663] Updated weights for policy 1, policy_version 33061 (0.0009) +[2023-10-13 01:36:27,450][46663] Updated weights for policy 1, policy_version 33071 (0.0007) +[2023-10-13 01:36:27,813][46662] Updated weights for policy 0, policy_version 33090 (0.0008) +[2023-10-13 01:36:27,816][46663] Updated weights for policy 1, policy_version 33081 (0.0010) +[2023-10-13 01:36:28,178][46662] Updated weights for policy 0, policy_version 33100 (0.0008) +[2023-10-13 01:36:28,553][46662] Updated weights for policy 0, policy_version 33110 (0.0007) +[2023-10-13 01:36:28,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67764224. Throughput: 0: 1687.2, 1: 1673.7. Samples: 16946418. 
Policy #0 lag: (min: 14.0, avg: 14.4, max: 28.0) +[2023-10-13 01:36:28,607][45375] Avg episode reward: [(0, '45.500'), (1, '48.200')] +[2023-10-13 01:36:28,918][46662] Updated weights for policy 0, policy_version 33120 (0.0008) +[2023-10-13 01:36:31,904][46663] Updated weights for policy 1, policy_version 33091 (0.0007) +[2023-10-13 01:36:32,272][46663] Updated weights for policy 1, policy_version 33101 (0.0009) +[2023-10-13 01:36:32,631][46663] Updated weights for policy 1, policy_version 33111 (0.0009) +[2023-10-13 01:36:33,183][46662] Updated weights for policy 0, policy_version 33130 (0.0008) +[2023-10-13 01:36:33,565][46662] Updated weights for policy 0, policy_version 33140 (0.0008) +[2023-10-13 01:36:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 67829760. Throughput: 0: 1686.8, 1: 1660.2. Samples: 16966518. Policy #0 lag: (min: 14.0, avg: 14.4, max: 28.0) +[2023-10-13 01:36:33,607][45375] Avg episode reward: [(0, '45.970'), (1, '48.510')] +[2023-10-13 01:36:33,940][46662] Updated weights for policy 0, policy_version 33150 (0.0008) +[2023-10-13 01:36:36,728][46663] Updated weights for policy 1, policy_version 33121 (0.0008) +[2023-10-13 01:36:37,097][46663] Updated weights for policy 1, policy_version 33131 (0.0007) +[2023-10-13 01:36:37,473][46663] Updated weights for policy 1, policy_version 33141 (0.0007) +[2023-10-13 01:36:37,838][46663] Updated weights for policy 1, policy_version 33151 (0.0007) +[2023-10-13 01:36:38,058][46662] Updated weights for policy 0, policy_version 33160 (0.0010) +[2023-10-13 01:36:38,429][46662] Updated weights for policy 0, policy_version 33170 (0.0007) +[2023-10-13 01:36:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 67895296. Throughput: 0: 1679.4, 1: 1668.4. Samples: 16986426. 
Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) +[2023-10-13 01:36:38,607][45375] Avg episode reward: [(0, '44.010'), (1, '47.790')] +[2023-10-13 01:36:38,811][46662] Updated weights for policy 0, policy_version 33180 (0.0009) +[2023-10-13 01:36:41,716][46663] Updated weights for policy 1, policy_version 33161 (0.0007) +[2023-10-13 01:36:42,082][46663] Updated weights for policy 1, policy_version 33171 (0.0007) +[2023-10-13 01:36:42,444][46663] Updated weights for policy 1, policy_version 33181 (0.0008) +[2023-10-13 01:36:42,627][46662] Updated weights for policy 0, policy_version 33190 (0.0007) +[2023-10-13 01:36:42,990][46662] Updated weights for policy 0, policy_version 33200 (0.0008) +[2023-10-13 01:36:43,355][46662] Updated weights for policy 0, policy_version 33210 (0.0009) +[2023-10-13 01:36:43,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 67993600. Throughput: 0: 1677.8, 1: 1686.2. Samples: 16997114. Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) +[2023-10-13 01:36:43,607][45375] Avg episode reward: [(0, '42.290'), (1, '48.880')] +[2023-10-13 01:36:46,344][46663] Updated weights for policy 1, policy_version 33191 (0.0008) +[2023-10-13 01:36:46,707][46663] Updated weights for policy 1, policy_version 33201 (0.0007) +[2023-10-13 01:36:47,083][46663] Updated weights for policy 1, policy_version 33211 (0.0008) +[2023-10-13 01:36:47,485][46662] Updated weights for policy 0, policy_version 33220 (0.0009) +[2023-10-13 01:36:47,859][46662] Updated weights for policy 0, policy_version 33230 (0.0008) +[2023-10-13 01:36:48,217][46662] Updated weights for policy 0, policy_version 33240 (0.0009) +[2023-10-13 01:36:48,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 68059136. Throughput: 0: 1681.6, 1: 1661.0. Samples: 17016732. 
Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) +[2023-10-13 01:36:48,607][45375] Avg episode reward: [(0, '41.350'), (1, '50.780')] +[2023-10-13 01:36:51,285][46663] Updated weights for policy 1, policy_version 33221 (0.0009) +[2023-10-13 01:36:51,659][46663] Updated weights for policy 1, policy_version 33231 (0.0007) +[2023-10-13 01:36:52,025][46663] Updated weights for policy 1, policy_version 33241 (0.0007) +[2023-10-13 01:36:52,177][46662] Updated weights for policy 0, policy_version 33250 (0.0011) +[2023-10-13 01:36:52,548][46662] Updated weights for policy 0, policy_version 33260 (0.0008) +[2023-10-13 01:36:52,920][46662] Updated weights for policy 0, policy_version 33270 (0.0007) +[2023-10-13 01:36:53,293][46662] Updated weights for policy 0, policy_version 33280 (0.0008) +[2023-10-13 01:36:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 68124672. Throughput: 0: 1669.6, 1: 1683.4. Samples: 17036828. Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) +[2023-10-13 01:36:53,607][45375] Avg episode reward: [(0, '40.280'), (1, '49.560')] +[2023-10-13 01:36:56,133][46663] Updated weights for policy 1, policy_version 33251 (0.0008) +[2023-10-13 01:36:56,500][46663] Updated weights for policy 1, policy_version 33261 (0.0007) +[2023-10-13 01:36:56,865][46663] Updated weights for policy 1, policy_version 33271 (0.0011) +[2023-10-13 01:36:57,293][46662] Updated weights for policy 0, policy_version 33290 (0.0008) +[2023-10-13 01:36:57,670][46662] Updated weights for policy 0, policy_version 33300 (0.0009) +[2023-10-13 01:36:58,035][46662] Updated weights for policy 0, policy_version 33310 (0.0008) +[2023-10-13 01:36:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 68190208. Throughput: 0: 1687.5, 1: 1679.0. Samples: 17047628. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:36:58,608][45375] Avg episode reward: [(0, '39.200'), (1, '50.240')] +[2023-10-13 01:37:01,038][46663] Updated weights for policy 1, policy_version 33281 (0.0008) +[2023-10-13 01:37:01,408][46663] Updated weights for policy 1, policy_version 33291 (0.0007) +[2023-10-13 01:37:01,774][46663] Updated weights for policy 1, policy_version 33301 (0.0008) +[2023-10-13 01:37:02,135][46663] Updated weights for policy 1, policy_version 33311 (0.0009) +[2023-10-13 01:37:02,342][46662] Updated weights for policy 0, policy_version 33320 (0.0009) +[2023-10-13 01:37:02,714][46662] Updated weights for policy 0, policy_version 33330 (0.0010) +[2023-10-13 01:37:03,097][46662] Updated weights for policy 0, policy_version 33340 (0.0009) +[2023-10-13 01:37:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 68255744. Throughput: 0: 1689.9, 1: 1664.2. Samples: 17067458. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:37:03,607][45375] Avg episode reward: [(0, '39.600'), (1, '49.510')] +[2023-10-13 01:37:06,334][46663] Updated weights for policy 1, policy_version 33321 (0.0010) +[2023-10-13 01:37:06,702][46663] Updated weights for policy 1, policy_version 33331 (0.0008) +[2023-10-13 01:37:07,039][46662] Updated weights for policy 0, policy_version 33350 (0.0008) +[2023-10-13 01:37:07,074][46663] Updated weights for policy 1, policy_version 33341 (0.0007) +[2023-10-13 01:37:07,411][46662] Updated weights for policy 0, policy_version 33360 (0.0008) +[2023-10-13 01:37:07,771][46662] Updated weights for policy 0, policy_version 33370 (0.0009) +[2023-10-13 01:37:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 68321280. Throughput: 0: 1666.6, 1: 1689.7. Samples: 17087018. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:37:08,607][45375] Avg episode reward: [(0, '40.280'), (1, '50.340')] +[2023-10-13 01:37:11,050][46663] Updated weights for policy 1, policy_version 33351 (0.0008) +[2023-10-13 01:37:11,413][46663] Updated weights for policy 1, policy_version 33361 (0.0009) +[2023-10-13 01:37:11,750][46662] Updated weights for policy 0, policy_version 33380 (0.0010) +[2023-10-13 01:37:11,787][46663] Updated weights for policy 1, policy_version 33371 (0.0009) +[2023-10-13 01:37:12,113][46662] Updated weights for policy 0, policy_version 33390 (0.0008) +[2023-10-13 01:37:12,485][46662] Updated weights for policy 0, policy_version 33400 (0.0009) +[2023-10-13 01:37:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 68386816. Throughput: 0: 1689.1, 1: 1672.3. Samples: 17097678. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:37:13,608][45375] Avg episode reward: [(0, '39.390'), (1, '50.490')] +[2023-10-13 01:37:15,922][46663] Updated weights for policy 1, policy_version 33381 (0.0010) +[2023-10-13 01:37:16,285][46663] Updated weights for policy 1, policy_version 33391 (0.0009) +[2023-10-13 01:37:16,472][46662] Updated weights for policy 0, policy_version 33410 (0.0009) +[2023-10-13 01:37:16,649][46663] Updated weights for policy 1, policy_version 33401 (0.0009) +[2023-10-13 01:37:16,846][46662] Updated weights for policy 0, policy_version 33420 (0.0007) +[2023-10-13 01:37:17,211][46662] Updated weights for policy 0, policy_version 33430 (0.0007) +[2023-10-13 01:37:17,590][46662] Updated weights for policy 0, policy_version 33440 (0.0009) +[2023-10-13 01:37:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68452352. Throughput: 0: 1680.8, 1: 1670.5. Samples: 17117326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:37:18,607][45375] Avg episode reward: [(0, '41.220'), (1, '49.250')] +[2023-10-13 01:37:20,898][46663] Updated weights for policy 1, policy_version 33411 (0.0010) +[2023-10-13 01:37:21,273][46663] Updated weights for policy 1, policy_version 33421 (0.0008) +[2023-10-13 01:37:21,640][46662] Updated weights for policy 0, policy_version 33450 (0.0008) +[2023-10-13 01:37:21,643][46663] Updated weights for policy 1, policy_version 33431 (0.0008) +[2023-10-13 01:37:22,020][46662] Updated weights for policy 0, policy_version 33460 (0.0008) +[2023-10-13 01:37:22,388][46662] Updated weights for policy 0, policy_version 33470 (0.0008) +[2023-10-13 01:37:23,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68517888. Throughput: 0: 1667.9, 1: 1684.5. Samples: 17137282. Policy #0 lag: (min: 14.0, avg: 16.9, max: 46.0) +[2023-10-13 01:37:23,607][45375] Avg episode reward: [(0, '41.690'), (1, '50.540')] +[2023-10-13 01:37:25,616][46663] Updated weights for policy 1, policy_version 33441 (0.0009) +[2023-10-13 01:37:25,976][46663] Updated weights for policy 1, policy_version 33451 (0.0009) +[2023-10-13 01:37:26,233][46662] Updated weights for policy 0, policy_version 33480 (0.0008) +[2023-10-13 01:37:26,342][46663] Updated weights for policy 1, policy_version 33461 (0.0008) +[2023-10-13 01:37:26,607][46662] Updated weights for policy 0, policy_version 33490 (0.0009) +[2023-10-13 01:37:26,712][46663] Updated weights for policy 1, policy_version 33471 (0.0007) +[2023-10-13 01:37:26,985][46662] Updated weights for policy 0, policy_version 33500 (0.0010) +[2023-10-13 01:37:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68583424. Throughput: 0: 1697.5, 1: 1661.5. Samples: 17148268. 
Policy #0 lag: (min: 14.0, avg: 16.9, max: 46.0) +[2023-10-13 01:37:28,607][45375] Avg episode reward: [(0, '42.120'), (1, '51.120')] +[2023-10-13 01:37:30,756][46663] Updated weights for policy 1, policy_version 33481 (0.0008) +[2023-10-13 01:37:31,093][46662] Updated weights for policy 0, policy_version 33510 (0.0009) +[2023-10-13 01:37:31,125][46663] Updated weights for policy 1, policy_version 33491 (0.0008) +[2023-10-13 01:37:31,461][46662] Updated weights for policy 0, policy_version 33520 (0.0008) +[2023-10-13 01:37:31,489][46663] Updated weights for policy 1, policy_version 33501 (0.0007) +[2023-10-13 01:37:31,826][46662] Updated weights for policy 0, policy_version 33530 (0.0009) +[2023-10-13 01:37:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68648960. Throughput: 0: 1669.9, 1: 1678.6. Samples: 17167412. Policy #0 lag: (min: 14.0, avg: 16.9, max: 46.0) +[2023-10-13 01:37:33,607][45375] Avg episode reward: [(0, '41.680'), (1, '51.310')] +[2023-10-13 01:37:35,421][46663] Updated weights for policy 1, policy_version 33511 (0.0009) +[2023-10-13 01:37:35,793][46663] Updated weights for policy 1, policy_version 33521 (0.0008) +[2023-10-13 01:37:35,980][46662] Updated weights for policy 0, policy_version 33540 (0.0008) +[2023-10-13 01:37:36,161][46663] Updated weights for policy 1, policy_version 33531 (0.0009) +[2023-10-13 01:37:36,344][46662] Updated weights for policy 0, policy_version 33550 (0.0007) +[2023-10-13 01:37:36,724][46662] Updated weights for policy 0, policy_version 33560 (0.0008) +[2023-10-13 01:37:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68714496. Throughput: 0: 1673.9, 1: 1682.5. Samples: 17187864. 
Policy #0 lag: (min: 14.0, avg: 16.9, max: 46.0) +[2023-10-13 01:37:38,607][45375] Avg episode reward: [(0, '43.390'), (1, '49.320')] +[2023-10-13 01:37:40,434][46663] Updated weights for policy 1, policy_version 33541 (0.0008) +[2023-10-13 01:37:40,798][46663] Updated weights for policy 1, policy_version 33551 (0.0008) +[2023-10-13 01:37:40,802][46662] Updated weights for policy 0, policy_version 33570 (0.0009) +[2023-10-13 01:37:41,166][46662] Updated weights for policy 0, policy_version 33580 (0.0007) +[2023-10-13 01:37:41,168][46663] Updated weights for policy 1, policy_version 33561 (0.0008) +[2023-10-13 01:37:41,539][46662] Updated weights for policy 0, policy_version 33590 (0.0008) +[2023-10-13 01:37:41,905][46662] Updated weights for policy 0, policy_version 33600 (0.0008) +[2023-10-13 01:37:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68780032. Throughput: 0: 1682.8, 1: 1664.9. Samples: 17198274. Policy #0 lag: (min: 14.0, avg: 16.9, max: 46.0) +[2023-10-13 01:37:43,607][45375] Avg episode reward: [(0, '43.340'), (1, '48.960')] +[2023-10-13 01:37:44,971][46663] Updated weights for policy 1, policy_version 33571 (0.0008) +[2023-10-13 01:37:45,341][46663] Updated weights for policy 1, policy_version 33581 (0.0009) +[2023-10-13 01:37:45,711][46663] Updated weights for policy 1, policy_version 33591 (0.0009) +[2023-10-13 01:37:45,927][46662] Updated weights for policy 0, policy_version 33610 (0.0008) +[2023-10-13 01:37:46,296][46662] Updated weights for policy 0, policy_version 33620 (0.0007) +[2023-10-13 01:37:46,667][46662] Updated weights for policy 0, policy_version 33630 (0.0007) +[2023-10-13 01:37:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68845568. Throughput: 0: 1659.5, 1: 1685.1. Samples: 17217966. 
Policy #0 lag: (min: 17.0, avg: 28.0, max: 49.0) +[2023-10-13 01:37:48,608][45375] Avg episode reward: [(0, '43.040'), (1, '47.680')] +[2023-10-13 01:37:49,625][46663] Updated weights for policy 1, policy_version 33601 (0.0009) +[2023-10-13 01:37:49,992][46663] Updated weights for policy 1, policy_version 33611 (0.0007) +[2023-10-13 01:37:50,358][46663] Updated weights for policy 1, policy_version 33621 (0.0007) +[2023-10-13 01:37:50,704][46662] Updated weights for policy 0, policy_version 33640 (0.0009) +[2023-10-13 01:37:50,724][46663] Updated weights for policy 1, policy_version 33631 (0.0007) +[2023-10-13 01:37:51,080][46662] Updated weights for policy 0, policy_version 33650 (0.0007) +[2023-10-13 01:37:51,441][46662] Updated weights for policy 0, policy_version 33660 (0.0007) +[2023-10-13 01:37:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68911104. Throughput: 0: 1678.7, 1: 1686.2. Samples: 17238438. Policy #0 lag: (min: 17.0, avg: 28.0, max: 49.0) +[2023-10-13 01:37:53,607][45375] Avg episode reward: [(0, '45.430'), (1, '49.020')] +[2023-10-13 01:37:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000033632_34439168.pth... +[2023-10-13 01:37:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000033664_34471936.pth... 
+[2023-10-13 01:37:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000032096_32866304.pth +[2023-10-13 01:37:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000032064_32833536.pth +[2023-10-13 01:37:55,117][46663] Updated weights for policy 1, policy_version 33641 (0.0007) +[2023-10-13 01:37:55,500][46663] Updated weights for policy 1, policy_version 33651 (0.0007) +[2023-10-13 01:37:55,647][46662] Updated weights for policy 0, policy_version 33670 (0.0007) +[2023-10-13 01:37:55,865][46663] Updated weights for policy 1, policy_version 33661 (0.0010) +[2023-10-13 01:37:56,023][46662] Updated weights for policy 0, policy_version 33680 (0.0008) +[2023-10-13 01:37:56,396][46662] Updated weights for policy 0, policy_version 33690 (0.0009) +[2023-10-13 01:37:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68976640. Throughput: 0: 1677.3, 1: 1670.1. Samples: 17248310. Policy #0 lag: (min: 17.0, avg: 28.0, max: 49.0) +[2023-10-13 01:37:58,607][45375] Avg episode reward: [(0, '45.570'), (1, '49.550')] +[2023-10-13 01:38:00,096][46663] Updated weights for policy 1, policy_version 33671 (0.0008) +[2023-10-13 01:38:00,451][46663] Updated weights for policy 1, policy_version 33681 (0.0008) +[2023-10-13 01:38:00,481][46662] Updated weights for policy 0, policy_version 33700 (0.0009) +[2023-10-13 01:38:00,823][46663] Updated weights for policy 1, policy_version 33691 (0.0008) +[2023-10-13 01:38:00,851][46662] Updated weights for policy 0, policy_version 33710 (0.0008) +[2023-10-13 01:38:01,212][46662] Updated weights for policy 0, policy_version 33720 (0.0009) +[2023-10-13 01:38:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69042176. Throughput: 0: 1666.0, 1: 1688.6. Samples: 17268286. 
Policy #0 lag: (min: 17.0, avg: 28.0, max: 49.0) +[2023-10-13 01:38:03,607][45375] Avg episode reward: [(0, '45.680'), (1, '50.560')] +[2023-10-13 01:38:04,780][46663] Updated weights for policy 1, policy_version 33701 (0.0008) +[2023-10-13 01:38:05,147][46663] Updated weights for policy 1, policy_version 33711 (0.0008) +[2023-10-13 01:38:05,260][46662] Updated weights for policy 0, policy_version 33730 (0.0009) +[2023-10-13 01:38:05,525][46663] Updated weights for policy 1, policy_version 33721 (0.0008) +[2023-10-13 01:38:05,621][46662] Updated weights for policy 0, policy_version 33740 (0.0008) +[2023-10-13 01:38:05,999][46662] Updated weights for policy 0, policy_version 33750 (0.0009) +[2023-10-13 01:38:06,358][46662] Updated weights for policy 0, policy_version 33760 (0.0008) +[2023-10-13 01:38:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69107712. Throughput: 0: 1683.2, 1: 1685.7. Samples: 17288886. Policy #0 lag: (min: 17.0, avg: 28.0, max: 49.0) +[2023-10-13 01:38:08,607][45375] Avg episode reward: [(0, '46.260'), (1, '51.320')] +[2023-10-13 01:38:09,747][46663] Updated weights for policy 1, policy_version 33731 (0.0008) +[2023-10-13 01:38:10,115][46663] Updated weights for policy 1, policy_version 33741 (0.0007) +[2023-10-13 01:38:10,479][46663] Updated weights for policy 1, policy_version 33751 (0.0007) +[2023-10-13 01:38:10,509][46662] Updated weights for policy 0, policy_version 33770 (0.0007) +[2023-10-13 01:38:10,881][46662] Updated weights for policy 0, policy_version 33780 (0.0007) +[2023-10-13 01:38:11,235][46662] Updated weights for policy 0, policy_version 33790 (0.0011) +[2023-10-13 01:38:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69173248. Throughput: 0: 1664.0, 1: 1674.5. Samples: 17298500. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:38:13,607][45375] Avg episode reward: [(0, '46.870'), (1, '50.030')] +[2023-10-13 01:38:14,455][46663] Updated weights for policy 1, policy_version 33761 (0.0007) +[2023-10-13 01:38:14,825][46663] Updated weights for policy 1, policy_version 33771 (0.0008) +[2023-10-13 01:38:15,186][46663] Updated weights for policy 1, policy_version 33781 (0.0010) +[2023-10-13 01:38:15,434][46662] Updated weights for policy 0, policy_version 33800 (0.0008) +[2023-10-13 01:38:15,558][46663] Updated weights for policy 1, policy_version 33791 (0.0008) +[2023-10-13 01:38:15,801][46662] Updated weights for policy 0, policy_version 33810 (0.0010) +[2023-10-13 01:38:16,178][46662] Updated weights for policy 0, policy_version 33820 (0.0008) +[2023-10-13 01:38:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69238784. Throughput: 0: 1673.4, 1: 1688.9. Samples: 17318714. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:38:18,607][45375] Avg episode reward: [(0, '47.610'), (1, '50.550')] +[2023-10-13 01:38:19,562][46663] Updated weights for policy 1, policy_version 33801 (0.0008) +[2023-10-13 01:38:19,929][46663] Updated weights for policy 1, policy_version 33811 (0.0008) +[2023-10-13 01:38:20,161][46662] Updated weights for policy 0, policy_version 33830 (0.0008) +[2023-10-13 01:38:20,291][46663] Updated weights for policy 1, policy_version 33821 (0.0007) +[2023-10-13 01:38:20,540][46662] Updated weights for policy 0, policy_version 33840 (0.0008) +[2023-10-13 01:38:20,916][46662] Updated weights for policy 0, policy_version 33850 (0.0009) +[2023-10-13 01:38:23,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 69304320. Throughput: 0: 1681.3, 1: 1688.1. Samples: 17339486. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:38:23,608][45375] Avg episode reward: [(0, '47.110'), (1, '49.840')] +[2023-10-13 01:38:24,218][46663] Updated weights for policy 1, policy_version 33831 (0.0008) +[2023-10-13 01:38:24,595][46663] Updated weights for policy 1, policy_version 33841 (0.0008) +[2023-10-13 01:38:24,863][46662] Updated weights for policy 0, policy_version 33860 (0.0007) +[2023-10-13 01:38:24,958][46663] Updated weights for policy 1, policy_version 33851 (0.0009) +[2023-10-13 01:38:25,232][46662] Updated weights for policy 0, policy_version 33870 (0.0008) +[2023-10-13 01:38:25,592][46662] Updated weights for policy 0, policy_version 33880 (0.0008) +[2023-10-13 01:38:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69369856. Throughput: 0: 1658.4, 1: 1686.1. Samples: 17348780. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:38:28,607][45375] Avg episode reward: [(0, '48.250'), (1, '49.920')] +[2023-10-13 01:38:28,924][46663] Updated weights for policy 1, policy_version 33861 (0.0009) +[2023-10-13 01:38:29,294][46663] Updated weights for policy 1, policy_version 33871 (0.0008) +[2023-10-13 01:38:29,661][46663] Updated weights for policy 1, policy_version 33881 (0.0007) +[2023-10-13 01:38:29,709][46662] Updated weights for policy 0, policy_version 33890 (0.0007) +[2023-10-13 01:38:30,076][46662] Updated weights for policy 0, policy_version 33900 (0.0009) +[2023-10-13 01:38:30,444][46662] Updated weights for policy 0, policy_version 33910 (0.0010) +[2023-10-13 01:38:30,816][46662] Updated weights for policy 0, policy_version 33920 (0.0011) +[2023-10-13 01:38:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69435392. Throughput: 0: 1680.4, 1: 1692.2. Samples: 17369734. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 01:38:33,608][45375] Avg episode reward: [(0, '49.300'), (1, '49.630')] +[2023-10-13 01:38:33,749][46663] Updated weights for policy 1, policy_version 33891 (0.0010) +[2023-10-13 01:38:34,121][46663] Updated weights for policy 1, policy_version 33901 (0.0009) +[2023-10-13 01:38:34,497][46663] Updated weights for policy 1, policy_version 33911 (0.0008) +[2023-10-13 01:38:34,825][46662] Updated weights for policy 0, policy_version 33930 (0.0009) +[2023-10-13 01:38:35,186][46662] Updated weights for policy 0, policy_version 33940 (0.0009) +[2023-10-13 01:38:35,553][46662] Updated weights for policy 0, policy_version 33950 (0.0007) +[2023-10-13 01:38:38,421][46663] Updated weights for policy 1, policy_version 33921 (0.0009) +[2023-10-13 01:38:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69500928. Throughput: 0: 1683.2, 1: 1695.4. Samples: 17390474. Policy #0 lag: (min: 0.0, avg: 26.6, max: 32.0) +[2023-10-13 01:38:38,607][45375] Avg episode reward: [(0, '48.660'), (1, '49.760')] +[2023-10-13 01:38:38,781][46663] Updated weights for policy 1, policy_version 33931 (0.0008) +[2023-10-13 01:38:39,150][46663] Updated weights for policy 1, policy_version 33941 (0.0010) +[2023-10-13 01:38:39,520][46663] Updated weights for policy 1, policy_version 33951 (0.0009) +[2023-10-13 01:38:39,667][46662] Updated weights for policy 0, policy_version 33960 (0.0009) +[2023-10-13 01:38:40,039][46662] Updated weights for policy 0, policy_version 33970 (0.0009) +[2023-10-13 01:38:40,423][46662] Updated weights for policy 0, policy_version 33980 (0.0007) +[2023-10-13 01:38:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 69566464. Throughput: 0: 1663.1, 1: 1695.9. Samples: 17399462. 
Policy #0 lag: (min: 0.0, avg: 26.6, max: 32.0) +[2023-10-13 01:38:43,608][45375] Avg episode reward: [(0, '49.650'), (1, '51.470')] +[2023-10-13 01:38:43,622][46663] Updated weights for policy 1, policy_version 33961 (0.0007) +[2023-10-13 01:38:43,987][46663] Updated weights for policy 1, policy_version 33971 (0.0007) +[2023-10-13 01:38:44,356][46663] Updated weights for policy 1, policy_version 33981 (0.0007) +[2023-10-13 01:38:44,437][46662] Updated weights for policy 0, policy_version 33990 (0.0008) +[2023-10-13 01:38:44,819][46662] Updated weights for policy 0, policy_version 34000 (0.0009) +[2023-10-13 01:38:45,192][46662] Updated weights for policy 0, policy_version 34010 (0.0009) +[2023-10-13 01:38:48,426][46663] Updated weights for policy 1, policy_version 33991 (0.0010) +[2023-10-13 01:38:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 69632000. Throughput: 0: 1690.3, 1: 1688.0. Samples: 17420308. Policy #0 lag: (min: 0.0, avg: 26.6, max: 32.0) +[2023-10-13 01:38:48,607][45375] Avg episode reward: [(0, '51.890'), (1, '52.370')] +[2023-10-13 01:38:48,778][46663] Updated weights for policy 1, policy_version 34001 (0.0008) +[2023-10-13 01:38:49,142][46663] Updated weights for policy 1, policy_version 34011 (0.0010) +[2023-10-13 01:38:49,318][46662] Updated weights for policy 0, policy_version 34020 (0.0009) +[2023-10-13 01:38:49,697][46662] Updated weights for policy 0, policy_version 34030 (0.0009) +[2023-10-13 01:38:50,064][46662] Updated weights for policy 0, policy_version 34040 (0.0010) +[2023-10-13 01:38:53,276][46663] Updated weights for policy 1, policy_version 34021 (0.0009) +[2023-10-13 01:38:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 69697536. Throughput: 0: 1693.6, 1: 1678.6. Samples: 17440638. 
Policy #0 lag: (min: 0.0, avg: 26.6, max: 32.0) +[2023-10-13 01:38:53,608][45375] Avg episode reward: [(0, '50.690'), (1, '52.290')] +[2023-10-13 01:38:53,645][46663] Updated weights for policy 1, policy_version 34031 (0.0008) +[2023-10-13 01:38:53,999][46662] Updated weights for policy 0, policy_version 34050 (0.0008) +[2023-10-13 01:38:54,013][46663] Updated weights for policy 1, policy_version 34041 (0.0008) +[2023-10-13 01:38:54,364][46662] Updated weights for policy 0, policy_version 34060 (0.0007) +[2023-10-13 01:38:54,732][46662] Updated weights for policy 0, policy_version 34070 (0.0008) +[2023-10-13 01:38:55,104][46662] Updated weights for policy 0, policy_version 34080 (0.0008) +[2023-10-13 01:38:58,210][46663] Updated weights for policy 1, policy_version 34051 (0.0010) +[2023-10-13 01:38:58,572][46663] Updated weights for policy 1, policy_version 34061 (0.0009) +[2023-10-13 01:38:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69763072. Throughput: 0: 1682.6, 1: 1683.5. Samples: 17449976. Policy #0 lag: (min: 0.0, avg: 26.6, max: 32.0) +[2023-10-13 01:38:58,607][45375] Avg episode reward: [(0, '50.740'), (1, '51.220')] +[2023-10-13 01:38:58,942][46663] Updated weights for policy 1, policy_version 34071 (0.0009) +[2023-10-13 01:38:59,166][46662] Updated weights for policy 0, policy_version 34090 (0.0008) +[2023-10-13 01:38:59,547][46662] Updated weights for policy 0, policy_version 34100 (0.0009) +[2023-10-13 01:38:59,916][46662] Updated weights for policy 0, policy_version 34110 (0.0010) +[2023-10-13 01:39:03,091][46663] Updated weights for policy 1, policy_version 34081 (0.0009) +[2023-10-13 01:39:03,461][46663] Updated weights for policy 1, policy_version 34091 (0.0008) +[2023-10-13 01:39:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69828608. Throughput: 0: 1693.6, 1: 1678.4. Samples: 17470450. 
Policy #0 lag: (min: 10.0, avg: 16.8, max: 42.0) +[2023-10-13 01:39:03,607][45375] Avg episode reward: [(0, '51.430'), (1, '50.770')] +[2023-10-13 01:39:03,840][46663] Updated weights for policy 1, policy_version 34101 (0.0009) +[2023-10-13 01:39:04,159][46662] Updated weights for policy 0, policy_version 34120 (0.0008) +[2023-10-13 01:39:04,201][46663] Updated weights for policy 1, policy_version 34111 (0.0008) +[2023-10-13 01:39:04,521][46662] Updated weights for policy 0, policy_version 34130 (0.0009) +[2023-10-13 01:39:04,892][46662] Updated weights for policy 0, policy_version 34140 (0.0010) +[2023-10-13 01:39:08,314][46663] Updated weights for policy 1, policy_version 34121 (0.0007) +[2023-10-13 01:39:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69894144. Throughput: 0: 1690.9, 1: 1662.9. Samples: 17490406. Policy #0 lag: (min: 10.0, avg: 16.8, max: 42.0) +[2023-10-13 01:39:08,607][45375] Avg episode reward: [(0, '52.370'), (1, '50.000')] +[2023-10-13 01:39:08,678][46663] Updated weights for policy 1, policy_version 34131 (0.0008) +[2023-10-13 01:39:08,996][46662] Updated weights for policy 0, policy_version 34150 (0.0008) +[2023-10-13 01:39:09,046][46663] Updated weights for policy 1, policy_version 34141 (0.0008) +[2023-10-13 01:39:09,366][46662] Updated weights for policy 0, policy_version 34160 (0.0008) +[2023-10-13 01:39:09,728][46662] Updated weights for policy 0, policy_version 34170 (0.0007) +[2023-10-13 01:39:13,107][46663] Updated weights for policy 1, policy_version 34151 (0.0009) +[2023-10-13 01:39:13,474][46663] Updated weights for policy 1, policy_version 34161 (0.0007) +[2023-10-13 01:39:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69959680. Throughput: 0: 1686.0, 1: 1671.5. Samples: 17499866. 
Policy #0 lag: (min: 10.0, avg: 16.8, max: 42.0) +[2023-10-13 01:39:13,607][45375] Avg episode reward: [(0, '49.710'), (1, '49.990')] +[2023-10-13 01:39:13,620][46662] Updated weights for policy 0, policy_version 34180 (0.0009) +[2023-10-13 01:39:13,841][46663] Updated weights for policy 1, policy_version 34171 (0.0007) +[2023-10-13 01:39:13,989][46662] Updated weights for policy 0, policy_version 34190 (0.0007) +[2023-10-13 01:39:14,364][46662] Updated weights for policy 0, policy_version 34200 (0.0008) +[2023-10-13 01:39:17,999][46663] Updated weights for policy 1, policy_version 34181 (0.0009) +[2023-10-13 01:39:18,367][46663] Updated weights for policy 1, policy_version 34191 (0.0008) +[2023-10-13 01:39:18,571][46662] Updated weights for policy 0, policy_version 34210 (0.0008) +[2023-10-13 01:39:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 70025216. Throughput: 0: 1689.3, 1: 1665.1. Samples: 17520680. Policy #0 lag: (min: 10.0, avg: 16.8, max: 42.0) +[2023-10-13 01:39:18,607][45375] Avg episode reward: [(0, '48.060'), (1, '50.170')] +[2023-10-13 01:39:18,737][46663] Updated weights for policy 1, policy_version 34201 (0.0009) +[2023-10-13 01:39:18,946][46662] Updated weights for policy 0, policy_version 34220 (0.0008) +[2023-10-13 01:39:19,319][46662] Updated weights for policy 0, policy_version 34230 (0.0007) +[2023-10-13 01:39:19,684][46662] Updated weights for policy 0, policy_version 34240 (0.0007) +[2023-10-13 01:39:22,801][46663] Updated weights for policy 1, policy_version 34211 (0.0009) +[2023-10-13 01:39:23,175][46663] Updated weights for policy 1, policy_version 34221 (0.0009) +[2023-10-13 01:39:23,539][46663] Updated weights for policy 1, policy_version 34231 (0.0008) +[2023-10-13 01:39:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 70090752. Throughput: 0: 1686.3, 1: 1651.5. Samples: 17540674. 
Policy #0 lag: (min: 10.0, avg: 16.8, max: 42.0) +[2023-10-13 01:39:23,608][45375] Avg episode reward: [(0, '47.830'), (1, '49.370')] +[2023-10-13 01:39:23,758][46662] Updated weights for policy 0, policy_version 34250 (0.0011) +[2023-10-13 01:39:24,126][46662] Updated weights for policy 0, policy_version 34260 (0.0007) +[2023-10-13 01:39:24,494][46662] Updated weights for policy 0, policy_version 34270 (0.0007) +[2023-10-13 01:39:27,751][46663] Updated weights for policy 1, policy_version 34241 (0.0009) +[2023-10-13 01:39:28,119][46663] Updated weights for policy 1, policy_version 34251 (0.0009) +[2023-10-13 01:39:28,480][46663] Updated weights for policy 1, policy_version 34261 (0.0007) +[2023-10-13 01:39:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 70156288. Throughput: 0: 1688.6, 1: 1666.2. Samples: 17550426. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:39:28,607][45375] Avg episode reward: [(0, '47.440'), (1, '50.920')] +[2023-10-13 01:39:28,622][46662] Updated weights for policy 0, policy_version 34280 (0.0008) +[2023-10-13 01:39:28,853][46663] Updated weights for policy 1, policy_version 34271 (0.0008) +[2023-10-13 01:39:28,993][46662] Updated weights for policy 0, policy_version 34290 (0.0008) +[2023-10-13 01:39:29,359][46662] Updated weights for policy 0, policy_version 34300 (0.0009) +[2023-10-13 01:39:32,779][46663] Updated weights for policy 1, policy_version 34281 (0.0009) +[2023-10-13 01:39:33,154][46663] Updated weights for policy 1, policy_version 34291 (0.0008) +[2023-10-13 01:39:33,446][46662] Updated weights for policy 0, policy_version 34310 (0.0010) +[2023-10-13 01:39:33,513][46663] Updated weights for policy 1, policy_version 34301 (0.0007) +[2023-10-13 01:39:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 70221824. Throughput: 0: 1678.7, 1: 1668.1. Samples: 17570914. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:39:33,607][45375] Avg episode reward: [(0, '47.470'), (1, '51.220')] +[2023-10-13 01:39:33,812][46662] Updated weights for policy 0, policy_version 34320 (0.0008) +[2023-10-13 01:39:34,189][46662] Updated weights for policy 0, policy_version 34330 (0.0007) +[2023-10-13 01:39:37,632][46663] Updated weights for policy 1, policy_version 34311 (0.0010) +[2023-10-13 01:39:38,000][46663] Updated weights for policy 1, policy_version 34321 (0.0009) +[2023-10-13 01:39:38,173][46662] Updated weights for policy 0, policy_version 34340 (0.0007) +[2023-10-13 01:39:38,364][46663] Updated weights for policy 1, policy_version 34331 (0.0007) +[2023-10-13 01:39:38,544][46662] Updated weights for policy 0, policy_version 34350 (0.0009) +[2023-10-13 01:39:38,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70320128. Throughput: 0: 1679.9, 1: 1656.1. Samples: 17590758. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:39:38,607][45375] Avg episode reward: [(0, '46.880'), (1, '51.260')] +[2023-10-13 01:39:38,916][46662] Updated weights for policy 0, policy_version 34360 (0.0010) +[2023-10-13 01:39:42,485][46663] Updated weights for policy 1, policy_version 34341 (0.0008) +[2023-10-13 01:39:42,850][46663] Updated weights for policy 1, policy_version 34351 (0.0008) +[2023-10-13 01:39:42,987][46662] Updated weights for policy 0, policy_version 34370 (0.0009) +[2023-10-13 01:39:43,216][46663] Updated weights for policy 1, policy_version 34361 (0.0008) +[2023-10-13 01:39:43,343][46662] Updated weights for policy 0, policy_version 34380 (0.0007) +[2023-10-13 01:39:43,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70385664. Throughput: 0: 1680.9, 1: 1676.3. Samples: 17601048. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:39:43,607][45375] Avg episode reward: [(0, '45.830'), (1, '50.930')] +[2023-10-13 01:39:43,716][46662] Updated weights for policy 0, policy_version 34390 (0.0010) +[2023-10-13 01:39:44,096][46662] Updated weights for policy 0, policy_version 34400 (0.0011) +[2023-10-13 01:39:47,286][46663] Updated weights for policy 1, policy_version 34371 (0.0008) +[2023-10-13 01:39:47,654][46663] Updated weights for policy 1, policy_version 34381 (0.0008) +[2023-10-13 01:39:48,025][46663] Updated weights for policy 1, policy_version 34391 (0.0009) +[2023-10-13 01:39:48,294][46662] Updated weights for policy 0, policy_version 34410 (0.0010) +[2023-10-13 01:39:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70451200. Throughput: 0: 1682.4, 1: 1670.8. Samples: 17621342. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 01:39:48,607][45375] Avg episode reward: [(0, '45.350'), (1, '50.000')] +[2023-10-13 01:39:48,665][46662] Updated weights for policy 0, policy_version 34420 (0.0007) +[2023-10-13 01:39:49,040][46662] Updated weights for policy 0, policy_version 34430 (0.0007) +[2023-10-13 01:39:52,310][46663] Updated weights for policy 1, policy_version 34401 (0.0010) +[2023-10-13 01:39:52,681][46663] Updated weights for policy 1, policy_version 34411 (0.0008) +[2023-10-13 01:39:52,996][46662] Updated weights for policy 0, policy_version 34440 (0.0007) +[2023-10-13 01:39:53,039][46663] Updated weights for policy 1, policy_version 34421 (0.0007) +[2023-10-13 01:39:53,362][46662] Updated weights for policy 0, policy_version 34450 (0.0008) +[2023-10-13 01:39:53,418][46663] Updated weights for policy 1, policy_version 34431 (0.0008) +[2023-10-13 01:39:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70516736. Throughput: 0: 1677.8, 1: 1665.6. Samples: 17640862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:39:53,607][45375] Avg episode reward: [(0, '45.670'), (1, '49.430')] +[2023-10-13 01:39:53,613][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000034432_35258368.pth... +[2023-10-13 01:39:53,648][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000032864_33652736.pth +[2023-10-13 01:39:53,735][46662] Updated weights for policy 0, policy_version 34460 (0.0010) +[2023-10-13 01:39:53,881][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000034464_35291136.pth... +[2023-10-13 01:39:53,910][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000032896_33685504.pth +[2023-10-13 01:39:57,475][46663] Updated weights for policy 1, policy_version 34441 (0.0008) +[2023-10-13 01:39:57,762][46662] Updated weights for policy 0, policy_version 34470 (0.0008) +[2023-10-13 01:39:57,837][46663] Updated weights for policy 1, policy_version 34451 (0.0008) +[2023-10-13 01:39:58,132][46662] Updated weights for policy 0, policy_version 34480 (0.0008) +[2023-10-13 01:39:58,204][46663] Updated weights for policy 1, policy_version 34461 (0.0008) +[2023-10-13 01:39:58,494][46662] Updated weights for policy 0, policy_version 34490 (0.0009) +[2023-10-13 01:39:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70582272. Throughput: 0: 1681.5, 1: 1680.7. Samples: 17651164. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:39:58,607][45375] Avg episode reward: [(0, '44.830'), (1, '48.690')] +[2023-10-13 01:40:02,202][46663] Updated weights for policy 1, policy_version 34471 (0.0007) +[2023-10-13 01:40:02,573][46663] Updated weights for policy 1, policy_version 34481 (0.0010) +[2023-10-13 01:40:02,733][46662] Updated weights for policy 0, policy_version 34500 (0.0009) +[2023-10-13 01:40:02,943][46663] Updated weights for policy 1, policy_version 34491 (0.0009) +[2023-10-13 01:40:03,098][46662] Updated weights for policy 0, policy_version 34510 (0.0009) +[2023-10-13 01:40:03,463][46662] Updated weights for policy 0, policy_version 34520 (0.0008) +[2023-10-13 01:40:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70647808. Throughput: 0: 1675.4, 1: 1670.5. Samples: 17671246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:03,607][45375] Avg episode reward: [(0, '45.410'), (1, '49.820')] +[2023-10-13 01:40:07,073][46663] Updated weights for policy 1, policy_version 34501 (0.0008) +[2023-10-13 01:40:07,433][46663] Updated weights for policy 1, policy_version 34511 (0.0009) +[2023-10-13 01:40:07,598][46662] Updated weights for policy 0, policy_version 34530 (0.0009) +[2023-10-13 01:40:07,797][46663] Updated weights for policy 1, policy_version 34521 (0.0009) +[2023-10-13 01:40:07,962][46662] Updated weights for policy 0, policy_version 34540 (0.0009) +[2023-10-13 01:40:08,338][46662] Updated weights for policy 0, policy_version 34550 (0.0009) +[2023-10-13 01:40:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70713344. Throughput: 0: 1670.6, 1: 1665.3. Samples: 17690786. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:08,607][45375] Avg episode reward: [(0, '44.460'), (1, '48.180')] +[2023-10-13 01:40:08,695][46662] Updated weights for policy 0, policy_version 34560 (0.0009) +[2023-10-13 01:40:11,935][46663] Updated weights for policy 1, policy_version 34531 (0.0009) +[2023-10-13 01:40:12,293][46663] Updated weights for policy 1, policy_version 34541 (0.0011) +[2023-10-13 01:40:12,601][46662] Updated weights for policy 0, policy_version 34570 (0.0009) +[2023-10-13 01:40:12,666][46663] Updated weights for policy 1, policy_version 34551 (0.0008) +[2023-10-13 01:40:12,977][46662] Updated weights for policy 0, policy_version 34580 (0.0008) +[2023-10-13 01:40:13,348][46662] Updated weights for policy 0, policy_version 34590 (0.0008) +[2023-10-13 01:40:13,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 70811648. Throughput: 0: 1676.7, 1: 1676.5. Samples: 17701320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:13,608][45375] Avg episode reward: [(0, '44.640'), (1, '48.180')] +[2023-10-13 01:40:16,658][46663] Updated weights for policy 1, policy_version 34561 (0.0008) +[2023-10-13 01:40:17,089][46663] Updated weights for policy 1, policy_version 34571 (0.0007) +[2023-10-13 01:40:17,458][46663] Updated weights for policy 1, policy_version 34581 (0.0009) +[2023-10-13 01:40:17,508][46662] Updated weights for policy 0, policy_version 34600 (0.0008) +[2023-10-13 01:40:17,819][46663] Updated weights for policy 1, policy_version 34591 (0.0009) +[2023-10-13 01:40:17,869][46662] Updated weights for policy 0, policy_version 34610 (0.0008) +[2023-10-13 01:40:18,246][46662] Updated weights for policy 0, policy_version 34620 (0.0008) +[2023-10-13 01:40:18,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 70877184. Throughput: 0: 1679.9, 1: 1659.0. Samples: 17721164. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:18,607][45375] Avg episode reward: [(0, '46.130'), (1, '47.200')] +[2023-10-13 01:40:21,713][46663] Updated weights for policy 1, policy_version 34601 (0.0009) +[2023-10-13 01:40:22,081][46663] Updated weights for policy 1, policy_version 34611 (0.0008) +[2023-10-13 01:40:22,328][46662] Updated weights for policy 0, policy_version 34630 (0.0008) +[2023-10-13 01:40:22,443][46663] Updated weights for policy 1, policy_version 34621 (0.0008) +[2023-10-13 01:40:22,701][46662] Updated weights for policy 0, policy_version 34640 (0.0009) +[2023-10-13 01:40:23,080][46662] Updated weights for policy 0, policy_version 34650 (0.0008) +[2023-10-13 01:40:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 70942720. Throughput: 0: 1662.1, 1: 1671.5. Samples: 17740770. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:23,607][45375] Avg episode reward: [(0, '46.150'), (1, '48.170')] +[2023-10-13 01:40:26,606][46663] Updated weights for policy 1, policy_version 34631 (0.0007) +[2023-10-13 01:40:26,977][46663] Updated weights for policy 1, policy_version 34641 (0.0009) +[2023-10-13 01:40:27,174][46662] Updated weights for policy 0, policy_version 34660 (0.0009) +[2023-10-13 01:40:27,349][46663] Updated weights for policy 1, policy_version 34651 (0.0008) +[2023-10-13 01:40:27,543][46662] Updated weights for policy 0, policy_version 34670 (0.0008) +[2023-10-13 01:40:27,915][46662] Updated weights for policy 0, policy_version 34680 (0.0010) +[2023-10-13 01:40:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 71008256. Throughput: 0: 1677.1, 1: 1671.8. Samples: 17751748. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:28,607][45375] Avg episode reward: [(0, '45.680'), (1, '47.890')] +[2023-10-13 01:40:31,457][46663] Updated weights for policy 1, policy_version 34661 (0.0009) +[2023-10-13 01:40:31,823][46663] Updated weights for policy 1, policy_version 34671 (0.0009) +[2023-10-13 01:40:32,030][46662] Updated weights for policy 0, policy_version 34690 (0.0007) +[2023-10-13 01:40:32,192][46663] Updated weights for policy 1, policy_version 34681 (0.0008) +[2023-10-13 01:40:32,394][46662] Updated weights for policy 0, policy_version 34700 (0.0009) +[2023-10-13 01:40:32,765][46662] Updated weights for policy 0, policy_version 34710 (0.0010) +[2023-10-13 01:40:33,136][46662] Updated weights for policy 0, policy_version 34720 (0.0011) +[2023-10-13 01:40:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 71073792. Throughput: 0: 1678.0, 1: 1650.7. Samples: 17771130. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:40:33,607][45375] Avg episode reward: [(0, '45.800'), (1, '46.490')] +[2023-10-13 01:40:36,381][46663] Updated weights for policy 1, policy_version 34691 (0.0008) +[2023-10-13 01:40:36,746][46663] Updated weights for policy 1, policy_version 34701 (0.0007) +[2023-10-13 01:40:37,117][46663] Updated weights for policy 1, policy_version 34711 (0.0008) +[2023-10-13 01:40:37,347][46662] Updated weights for policy 0, policy_version 34730 (0.0009) +[2023-10-13 01:40:37,708][46662] Updated weights for policy 0, policy_version 34740 (0.0011) +[2023-10-13 01:40:38,076][46662] Updated weights for policy 0, policy_version 34750 (0.0009) +[2023-10-13 01:40:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 71139328. Throughput: 0: 1659.2, 1: 1670.4. Samples: 17790692. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 01:40:38,607][45375] Avg episode reward: [(0, '45.830'), (1, '47.120')] +[2023-10-13 01:40:41,205][46663] Updated weights for policy 1, policy_version 34721 (0.0009) +[2023-10-13 01:40:41,565][46663] Updated weights for policy 1, policy_version 34731 (0.0009) +[2023-10-13 01:40:41,933][46663] Updated weights for policy 1, policy_version 34741 (0.0010) +[2023-10-13 01:40:42,125][46662] Updated weights for policy 0, policy_version 34760 (0.0008) +[2023-10-13 01:40:42,296][46663] Updated weights for policy 1, policy_version 34751 (0.0009) +[2023-10-13 01:40:42,494][46662] Updated weights for policy 0, policy_version 34770 (0.0008) +[2023-10-13 01:40:42,863][46662] Updated weights for policy 0, policy_version 34780 (0.0009) +[2023-10-13 01:40:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 71204864. Throughput: 0: 1674.2, 1: 1671.0. Samples: 17801698. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 01:40:43,607][45375] Avg episode reward: [(0, '47.590'), (1, '45.900')] +[2023-10-13 01:40:46,379][46663] Updated weights for policy 1, policy_version 34761 (0.0010) +[2023-10-13 01:40:46,743][46663] Updated weights for policy 1, policy_version 34771 (0.0010) +[2023-10-13 01:40:47,007][46662] Updated weights for policy 0, policy_version 34790 (0.0009) +[2023-10-13 01:40:47,116][46663] Updated weights for policy 1, policy_version 34781 (0.0007) +[2023-10-13 01:40:47,377][46662] Updated weights for policy 0, policy_version 34800 (0.0010) +[2023-10-13 01:40:47,746][46662] Updated weights for policy 0, policy_version 34810 (0.0011) +[2023-10-13 01:40:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 71270400. Throughput: 0: 1673.4, 1: 1657.1. Samples: 17821118. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 01:40:48,607][45375] Avg episode reward: [(0, '46.310'), (1, '46.770')] +[2023-10-13 01:40:51,168][46663] Updated weights for policy 1, policy_version 34791 (0.0009) +[2023-10-13 01:40:51,538][46663] Updated weights for policy 1, policy_version 34801 (0.0009) +[2023-10-13 01:40:51,799][46662] Updated weights for policy 0, policy_version 34820 (0.0009) +[2023-10-13 01:40:51,896][46663] Updated weights for policy 1, policy_version 34811 (0.0008) +[2023-10-13 01:40:52,166][46662] Updated weights for policy 0, policy_version 34830 (0.0007) +[2023-10-13 01:40:52,534][46662] Updated weights for policy 0, policy_version 34840 (0.0008) +[2023-10-13 01:40:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 71335936. Throughput: 0: 1654.8, 1: 1677.7. Samples: 17840748. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 01:40:53,607][45375] Avg episode reward: [(0, '45.690'), (1, '47.580')] +[2023-10-13 01:40:56,033][46663] Updated weights for policy 1, policy_version 34821 (0.0011) +[2023-10-13 01:40:56,402][46663] Updated weights for policy 1, policy_version 34831 (0.0010) +[2023-10-13 01:40:56,721][46662] Updated weights for policy 0, policy_version 34850 (0.0007) +[2023-10-13 01:40:56,772][46663] Updated weights for policy 1, policy_version 34841 (0.0010) +[2023-10-13 01:40:57,093][46662] Updated weights for policy 0, policy_version 34860 (0.0008) +[2023-10-13 01:40:57,463][46662] Updated weights for policy 0, policy_version 34870 (0.0011) +[2023-10-13 01:40:57,833][46662] Updated weights for policy 0, policy_version 34880 (0.0011) +[2023-10-13 01:40:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 71401472. Throughput: 0: 1673.0, 1: 1672.0. Samples: 17851844. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 01:40:58,607][45375] Avg episode reward: [(0, '45.720'), (1, '49.300')] +[2023-10-13 01:41:00,824][46663] Updated weights for policy 1, policy_version 34851 (0.0008) +[2023-10-13 01:41:01,189][46663] Updated weights for policy 1, policy_version 34861 (0.0010) +[2023-10-13 01:41:01,551][46663] Updated weights for policy 1, policy_version 34871 (0.0010) +[2023-10-13 01:41:01,967][46662] Updated weights for policy 0, policy_version 34890 (0.0008) +[2023-10-13 01:41:02,333][46662] Updated weights for policy 0, policy_version 34900 (0.0010) +[2023-10-13 01:41:02,700][46662] Updated weights for policy 0, policy_version 34910 (0.0009) +[2023-10-13 01:41:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71467008. Throughput: 0: 1666.7, 1: 1672.3. Samples: 17871420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:41:03,608][45375] Avg episode reward: [(0, '44.820'), (1, '49.020')] +[2023-10-13 01:41:05,612][46663] Updated weights for policy 1, policy_version 34881 (0.0008) +[2023-10-13 01:41:05,980][46663] Updated weights for policy 1, policy_version 34891 (0.0009) +[2023-10-13 01:41:06,348][46663] Updated weights for policy 1, policy_version 34901 (0.0008) +[2023-10-13 01:41:06,723][46663] Updated weights for policy 1, policy_version 34911 (0.0008) +[2023-10-13 01:41:06,724][46662] Updated weights for policy 0, policy_version 34920 (0.0009) +[2023-10-13 01:41:07,096][46662] Updated weights for policy 0, policy_version 34930 (0.0010) +[2023-10-13 01:41:07,467][46662] Updated weights for policy 0, policy_version 34940 (0.0010) +[2023-10-13 01:41:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71532544. Throughput: 0: 1658.4, 1: 1684.0. Samples: 17891176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:41:08,607][45375] Avg episode reward: [(0, '45.250'), (1, '49.490')] +[2023-10-13 01:41:10,844][46663] Updated weights for policy 1, policy_version 34921 (0.0008) +[2023-10-13 01:41:11,193][46663] Updated weights for policy 1, policy_version 34931 (0.0009) +[2023-10-13 01:41:11,479][46662] Updated weights for policy 0, policy_version 34950 (0.0008) +[2023-10-13 01:41:11,567][46663] Updated weights for policy 1, policy_version 34941 (0.0009) +[2023-10-13 01:41:11,847][46662] Updated weights for policy 0, policy_version 34960 (0.0008) +[2023-10-13 01:41:12,220][46662] Updated weights for policy 0, policy_version 34970 (0.0008) +[2023-10-13 01:41:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 71598080. Throughput: 0: 1671.3, 1: 1667.4. Samples: 17901988. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:41:13,607][45375] Avg episode reward: [(0, '45.480'), (1, '50.010')] +[2023-10-13 01:41:15,487][46663] Updated weights for policy 1, policy_version 34951 (0.0009) +[2023-10-13 01:41:15,867][46663] Updated weights for policy 1, policy_version 34961 (0.0008) +[2023-10-13 01:41:16,209][46662] Updated weights for policy 0, policy_version 34980 (0.0009) +[2023-10-13 01:41:16,234][46663] Updated weights for policy 1, policy_version 34971 (0.0008) +[2023-10-13 01:41:16,578][46662] Updated weights for policy 0, policy_version 34990 (0.0009) +[2023-10-13 01:41:16,949][46662] Updated weights for policy 0, policy_version 35000 (0.0009) +[2023-10-13 01:41:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71663616. Throughput: 0: 1656.7, 1: 1686.0. Samples: 17921552. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:41:18,607][45375] Avg episode reward: [(0, '45.740'), (1, '51.670')] +[2023-10-13 01:41:20,241][46663] Updated weights for policy 1, policy_version 34981 (0.0010) +[2023-10-13 01:41:20,601][46663] Updated weights for policy 1, policy_version 34991 (0.0009) +[2023-10-13 01:41:20,913][46662] Updated weights for policy 0, policy_version 35010 (0.0009) +[2023-10-13 01:41:20,962][46663] Updated weights for policy 1, policy_version 35001 (0.0009) +[2023-10-13 01:41:21,280][46662] Updated weights for policy 0, policy_version 35020 (0.0009) +[2023-10-13 01:41:21,645][46662] Updated weights for policy 0, policy_version 35030 (0.0009) +[2023-10-13 01:41:22,020][46662] Updated weights for policy 0, policy_version 35040 (0.0007) +[2023-10-13 01:41:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71729152. Throughput: 0: 1671.5, 1: 1685.3. Samples: 17941748. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:41:23,607][45375] Avg episode reward: [(0, '45.610'), (1, '52.520')] +[2023-10-13 01:41:24,987][46663] Updated weights for policy 1, policy_version 35011 (0.0009) +[2023-10-13 01:41:25,357][46663] Updated weights for policy 1, policy_version 35021 (0.0008) +[2023-10-13 01:41:25,712][46663] Updated weights for policy 1, policy_version 35031 (0.0008) +[2023-10-13 01:41:26,212][46662] Updated weights for policy 0, policy_version 35050 (0.0008) +[2023-10-13 01:41:26,592][46662] Updated weights for policy 0, policy_version 35060 (0.0007) +[2023-10-13 01:41:26,961][46662] Updated weights for policy 0, policy_version 35070 (0.0007) +[2023-10-13 01:41:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71794688. Throughput: 0: 1680.4, 1: 1660.0. Samples: 17952018. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:41:28,607][45375] Avg episode reward: [(0, '45.000'), (1, '52.780')] +[2023-10-13 01:41:29,743][46663] Updated weights for policy 1, policy_version 35041 (0.0009) +[2023-10-13 01:41:30,101][46663] Updated weights for policy 1, policy_version 35051 (0.0010) +[2023-10-13 01:41:30,473][46663] Updated weights for policy 1, policy_version 35061 (0.0007) +[2023-10-13 01:41:30,834][46663] Updated weights for policy 1, policy_version 35071 (0.0008) +[2023-10-13 01:41:31,125][46662] Updated weights for policy 0, policy_version 35080 (0.0008) +[2023-10-13 01:41:31,492][46662] Updated weights for policy 0, policy_version 35090 (0.0008) +[2023-10-13 01:41:31,865][46662] Updated weights for policy 0, policy_version 35100 (0.0009) +[2023-10-13 01:41:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 71860224. Throughput: 0: 1661.4, 1: 1686.7. Samples: 17971786. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:41:33,608][45375] Avg episode reward: [(0, '46.960'), (1, '52.100')] +[2023-10-13 01:41:34,955][46663] Updated weights for policy 1, policy_version 35081 (0.0007) +[2023-10-13 01:41:35,322][46663] Updated weights for policy 1, policy_version 35091 (0.0011) +[2023-10-13 01:41:35,682][46663] Updated weights for policy 1, policy_version 35101 (0.0008) +[2023-10-13 01:41:35,790][46662] Updated weights for policy 0, policy_version 35110 (0.0008) +[2023-10-13 01:41:36,165][46662] Updated weights for policy 0, policy_version 35120 (0.0009) +[2023-10-13 01:41:36,527][46662] Updated weights for policy 0, policy_version 35130 (0.0007) +[2023-10-13 01:41:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 71925760. Throughput: 0: 1683.2, 1: 1680.5. Samples: 17992114. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:41:38,607][45375] Avg episode reward: [(0, '46.600'), (1, '53.210')] +[2023-10-13 01:41:39,832][46663] Updated weights for policy 1, policy_version 35111 (0.0009) +[2023-10-13 01:41:40,193][46663] Updated weights for policy 1, policy_version 35121 (0.0007) +[2023-10-13 01:41:40,539][46662] Updated weights for policy 0, policy_version 35140 (0.0009) +[2023-10-13 01:41:40,568][46663] Updated weights for policy 1, policy_version 35131 (0.0009) +[2023-10-13 01:41:40,903][46662] Updated weights for policy 0, policy_version 35150 (0.0009) +[2023-10-13 01:41:41,279][46662] Updated weights for policy 0, policy_version 35160 (0.0011) +[2023-10-13 01:41:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 71991296. Throughput: 0: 1677.2, 1: 1661.1. Samples: 18002070. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:41:43,608][45375] Avg episode reward: [(0, '45.150'), (1, '52.550')] +[2023-10-13 01:41:44,724][46663] Updated weights for policy 1, policy_version 35141 (0.0008) +[2023-10-13 01:41:45,099][46663] Updated weights for policy 1, policy_version 35151 (0.0009) +[2023-10-13 01:41:45,397][46662] Updated weights for policy 0, policy_version 35170 (0.0010) +[2023-10-13 01:41:45,468][46663] Updated weights for policy 1, policy_version 35161 (0.0008) +[2023-10-13 01:41:45,773][46662] Updated weights for policy 0, policy_version 35180 (0.0008) +[2023-10-13 01:41:46,130][46662] Updated weights for policy 0, policy_version 35190 (0.0010) +[2023-10-13 01:41:46,509][46662] Updated weights for policy 0, policy_version 35200 (0.0008) +[2023-10-13 01:41:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72056832. Throughput: 0: 1664.7, 1: 1682.6. Samples: 18022048. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:41:48,607][45375] Avg episode reward: [(0, '46.180'), (1, '54.030')] +[2023-10-13 01:41:49,652][46663] Updated weights for policy 1, policy_version 35171 (0.0008) +[2023-10-13 01:41:50,024][46663] Updated weights for policy 1, policy_version 35181 (0.0007) +[2023-10-13 01:41:50,384][46663] Updated weights for policy 1, policy_version 35191 (0.0008) +[2023-10-13 01:41:50,619][46662] Updated weights for policy 0, policy_version 35210 (0.0008) +[2023-10-13 01:41:50,988][46662] Updated weights for policy 0, policy_version 35220 (0.0009) +[2023-10-13 01:41:51,354][46662] Updated weights for policy 0, policy_version 35230 (0.0007) +[2023-10-13 01:41:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 72122368. Throughput: 0: 1687.1, 1: 1680.8. Samples: 18042734. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-13 01:41:53,608][45375] Avg episode reward: [(0, '46.580'), (1, '55.150')] +[2023-10-13 01:41:53,621][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000035200_36044800.pth... +[2023-10-13 01:41:53,621][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000035232_36077568.pth... 
+[2023-10-13 01:41:53,660][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000033664_34471936.pth +[2023-10-13 01:41:53,661][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000033632_34439168.pth +[2023-10-13 01:41:54,590][46663] Updated weights for policy 1, policy_version 35201 (0.0008) +[2023-10-13 01:41:55,022][46663] Updated weights for policy 1, policy_version 35211 (0.0008) +[2023-10-13 01:41:55,334][46662] Updated weights for policy 0, policy_version 35240 (0.0009) +[2023-10-13 01:41:55,385][46663] Updated weights for policy 1, policy_version 35221 (0.0009) +[2023-10-13 01:41:55,707][46662] Updated weights for policy 0, policy_version 35250 (0.0008) +[2023-10-13 01:41:55,750][46663] Updated weights for policy 1, policy_version 35231 (0.0010) +[2023-10-13 01:41:56,079][46662] Updated weights for policy 0, policy_version 35260 (0.0009) +[2023-10-13 01:41:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72187904. Throughput: 0: 1670.7, 1: 1668.5. Samples: 18052252. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-13 01:41:58,607][45375] Avg episode reward: [(0, '48.550'), (1, '55.560')] +[2023-10-13 01:41:59,744][46663] Updated weights for policy 1, policy_version 35241 (0.0009) +[2023-10-13 01:42:00,100][46663] Updated weights for policy 1, policy_version 35251 (0.0009) +[2023-10-13 01:42:00,254][46662] Updated weights for policy 0, policy_version 35270 (0.0011) +[2023-10-13 01:42:00,468][46663] Updated weights for policy 1, policy_version 35261 (0.0008) +[2023-10-13 01:42:00,629][46662] Updated weights for policy 0, policy_version 35280 (0.0008) +[2023-10-13 01:42:00,997][46662] Updated weights for policy 0, policy_version 35290 (0.0008) +[2023-10-13 01:42:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72253440. Throughput: 0: 1671.3, 1: 1679.5. Samples: 18072338. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-13 01:42:03,607][45375] Avg episode reward: [(0, '51.180'), (1, '55.470')] +[2023-10-13 01:42:04,600][46663] Updated weights for policy 1, policy_version 35271 (0.0008) +[2023-10-13 01:42:04,936][46662] Updated weights for policy 0, policy_version 35300 (0.0009) +[2023-10-13 01:42:04,960][46663] Updated weights for policy 1, policy_version 35281 (0.0009) +[2023-10-13 01:42:05,302][46662] Updated weights for policy 0, policy_version 35310 (0.0008) +[2023-10-13 01:42:05,325][46663] Updated weights for policy 1, policy_version 35291 (0.0007) +[2023-10-13 01:42:05,680][46662] Updated weights for policy 0, policy_version 35320 (0.0008) +[2023-10-13 01:42:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72318976. Throughput: 0: 1680.6, 1: 1676.3. Samples: 18092808. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-13 01:42:08,607][45375] Avg episode reward: [(0, '51.290'), (1, '54.560')] +[2023-10-13 01:42:09,499][46663] Updated weights for policy 1, policy_version 35301 (0.0009) +[2023-10-13 01:42:09,828][46662] Updated weights for policy 0, policy_version 35330 (0.0009) +[2023-10-13 01:42:09,873][46663] Updated weights for policy 1, policy_version 35311 (0.0010) +[2023-10-13 01:42:10,189][46662] Updated weights for policy 0, policy_version 35340 (0.0007) +[2023-10-13 01:42:10,243][46663] Updated weights for policy 1, policy_version 35321 (0.0009) +[2023-10-13 01:42:10,567][46662] Updated weights for policy 0, policy_version 35350 (0.0007) +[2023-10-13 01:42:10,935][46662] Updated weights for policy 0, policy_version 35360 (0.0007) +[2023-10-13 01:42:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 72384512. Throughput: 0: 1658.7, 1: 1675.5. Samples: 18102058. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-13 01:42:13,608][45375] Avg episode reward: [(0, '52.530'), (1, '54.740')] +[2023-10-13 01:42:14,233][46663] Updated weights for policy 1, policy_version 35331 (0.0010) +[2023-10-13 01:42:14,598][46663] Updated weights for policy 1, policy_version 35341 (0.0008) +[2023-10-13 01:42:14,967][46663] Updated weights for policy 1, policy_version 35351 (0.0007) +[2023-10-13 01:42:14,998][46662] Updated weights for policy 0, policy_version 35370 (0.0009) +[2023-10-13 01:42:15,371][46662] Updated weights for policy 0, policy_version 35380 (0.0009) +[2023-10-13 01:42:15,747][46662] Updated weights for policy 0, policy_version 35390 (0.0008) +[2023-10-13 01:42:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72450048. Throughput: 0: 1678.3, 1: 1670.5. Samples: 18122482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:42:18,607][45375] Avg episode reward: [(0, '51.260'), (1, '54.830')] +[2023-10-13 01:42:19,045][46663] Updated weights for policy 1, policy_version 35361 (0.0010) +[2023-10-13 01:42:19,416][46663] Updated weights for policy 1, policy_version 35371 (0.0008) +[2023-10-13 01:42:19,779][46663] Updated weights for policy 1, policy_version 35381 (0.0009) +[2023-10-13 01:42:19,964][46662] Updated weights for policy 0, policy_version 35400 (0.0008) +[2023-10-13 01:42:20,140][46663] Updated weights for policy 1, policy_version 35391 (0.0008) +[2023-10-13 01:42:20,329][46662] Updated weights for policy 0, policy_version 35410 (0.0009) +[2023-10-13 01:42:20,702][46662] Updated weights for policy 0, policy_version 35420 (0.0009) +[2023-10-13 01:42:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 72515584. Throughput: 0: 1681.8, 1: 1670.5. Samples: 18142968. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:42:23,608][45375] Avg episode reward: [(0, '52.050'), (1, '55.310')] +[2023-10-13 01:42:24,360][46663] Updated weights for policy 1, policy_version 35401 (0.0009) +[2023-10-13 01:42:24,708][46662] Updated weights for policy 0, policy_version 35430 (0.0008) +[2023-10-13 01:42:24,727][46663] Updated weights for policy 1, policy_version 35411 (0.0009) +[2023-10-13 01:42:25,079][46662] Updated weights for policy 0, policy_version 35440 (0.0008) +[2023-10-13 01:42:25,089][46663] Updated weights for policy 1, policy_version 35421 (0.0008) +[2023-10-13 01:42:25,456][46662] Updated weights for policy 0, policy_version 35450 (0.0010) +[2023-10-13 01:42:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72581120. Throughput: 0: 1667.0, 1: 1670.2. Samples: 18152244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:42:28,607][45375] Avg episode reward: [(0, '52.980'), (1, '54.370')] +[2023-10-13 01:42:29,150][46663] Updated weights for policy 1, policy_version 35431 (0.0008) +[2023-10-13 01:42:29,522][46663] Updated weights for policy 1, policy_version 35441 (0.0009) +[2023-10-13 01:42:29,611][46662] Updated weights for policy 0, policy_version 35460 (0.0008) +[2023-10-13 01:42:29,884][46663] Updated weights for policy 1, policy_version 35451 (0.0009) +[2023-10-13 01:42:29,984][46662] Updated weights for policy 0, policy_version 35470 (0.0007) +[2023-10-13 01:42:30,345][46662] Updated weights for policy 0, policy_version 35480 (0.0009) +[2023-10-13 01:42:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72646656. Throughput: 0: 1683.7, 1: 1667.0. Samples: 18172828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:42:33,607][45375] Avg episode reward: [(0, '54.010'), (1, '54.230')] +[2023-10-13 01:42:34,040][46663] Updated weights for policy 1, policy_version 35461 (0.0007) +[2023-10-13 01:42:34,404][46663] Updated weights for policy 1, policy_version 35471 (0.0009) +[2023-10-13 01:42:34,438][46662] Updated weights for policy 0, policy_version 35490 (0.0008) +[2023-10-13 01:42:34,773][46663] Updated weights for policy 1, policy_version 35481 (0.0011) +[2023-10-13 01:42:34,801][46662] Updated weights for policy 0, policy_version 35500 (0.0008) +[2023-10-13 01:42:35,166][46662] Updated weights for policy 0, policy_version 35510 (0.0007) +[2023-10-13 01:42:35,548][46662] Updated weights for policy 0, policy_version 35520 (0.0010) +[2023-10-13 01:42:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72712192. Throughput: 0: 1680.6, 1: 1668.7. Samples: 18193452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:42:38,607][45375] Avg episode reward: [(0, '52.490'), (1, '54.300')] +[2023-10-13 01:42:38,824][46663] Updated weights for policy 1, policy_version 35491 (0.0010) +[2023-10-13 01:42:39,172][46663] Updated weights for policy 1, policy_version 35501 (0.0009) +[2023-10-13 01:42:39,510][46662] Updated weights for policy 0, policy_version 35530 (0.0008) +[2023-10-13 01:42:39,533][46663] Updated weights for policy 1, policy_version 35511 (0.0008) +[2023-10-13 01:42:39,869][46662] Updated weights for policy 0, policy_version 35540 (0.0009) +[2023-10-13 01:42:40,215][46662] Updated weights for policy 0, policy_version 35550 (0.0011) +[2023-10-13 01:42:43,331][46663] Updated weights for policy 1, policy_version 35521 (0.0008) +[2023-10-13 01:42:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72777728. Throughput: 0: 1671.5, 1: 1677.2. Samples: 18202944. 
Policy #0 lag: (min: 6.0, avg: 15.0, max: 38.0) +[2023-10-13 01:42:43,607][45375] Avg episode reward: [(0, '52.520'), (1, '53.170')] +[2023-10-13 01:42:43,753][46663] Updated weights for policy 1, policy_version 35531 (0.0007) +[2023-10-13 01:42:44,033][46662] Updated weights for policy 0, policy_version 35560 (0.0007) +[2023-10-13 01:42:44,106][46663] Updated weights for policy 1, policy_version 35541 (0.0008) +[2023-10-13 01:42:44,384][46662] Updated weights for policy 0, policy_version 35570 (0.0007) +[2023-10-13 01:42:44,472][46663] Updated weights for policy 1, policy_version 35551 (0.0008) +[2023-10-13 01:42:44,754][46662] Updated weights for policy 0, policy_version 35580 (0.0008) +[2023-10-13 01:42:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72843264. Throughput: 0: 1699.8, 1: 1680.0. Samples: 18224432. Policy #0 lag: (min: 6.0, avg: 15.0, max: 38.0) +[2023-10-13 01:42:48,607][45375] Avg episode reward: [(0, '53.470'), (1, '53.970')] +[2023-10-13 01:42:48,641][46663] Updated weights for policy 1, policy_version 35561 (0.0007) +[2023-10-13 01:42:48,730][46662] Updated weights for policy 0, policy_version 35590 (0.0007) +[2023-10-13 01:42:48,995][46663] Updated weights for policy 1, policy_version 35571 (0.0008) +[2023-10-13 01:42:49,103][46662] Updated weights for policy 0, policy_version 35600 (0.0007) +[2023-10-13 01:42:49,357][46663] Updated weights for policy 1, policy_version 35581 (0.0008) +[2023-10-13 01:42:49,464][46662] Updated weights for policy 0, policy_version 35610 (0.0009) +[2023-10-13 01:42:53,404][46662] Updated weights for policy 0, policy_version 35620 (0.0008) +[2023-10-13 01:42:53,408][46663] Updated weights for policy 1, policy_version 35591 (0.0009) +[2023-10-13 01:42:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 72908800. Throughput: 0: 1699.5, 1: 1681.2. Samples: 18244936. 
Policy #0 lag: (min: 6.0, avg: 15.0, max: 38.0) +[2023-10-13 01:42:53,607][45375] Avg episode reward: [(0, '53.580'), (1, '55.880')] +[2023-10-13 01:42:53,772][46663] Updated weights for policy 1, policy_version 35601 (0.0009) +[2023-10-13 01:42:53,783][46662] Updated weights for policy 0, policy_version 35630 (0.0008) +[2023-10-13 01:42:54,141][46663] Updated weights for policy 1, policy_version 35611 (0.0008) +[2023-10-13 01:42:54,152][46662] Updated weights for policy 0, policy_version 35640 (0.0009) +[2023-10-13 01:42:58,074][46662] Updated weights for policy 0, policy_version 35650 (0.0009) +[2023-10-13 01:42:58,232][46663] Updated weights for policy 1, policy_version 35621 (0.0010) +[2023-10-13 01:42:58,448][46662] Updated weights for policy 0, policy_version 35660 (0.0007) +[2023-10-13 01:42:58,594][46663] Updated weights for policy 1, policy_version 35631 (0.0007) +[2023-10-13 01:42:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72974336. Throughput: 0: 1695.5, 1: 1690.9. Samples: 18254444. Policy #0 lag: (min: 6.0, avg: 15.0, max: 38.0) +[2023-10-13 01:42:58,607][45375] Avg episode reward: [(0, '53.960'), (1, '54.980')] +[2023-10-13 01:42:58,807][46662] Updated weights for policy 0, policy_version 35670 (0.0008) +[2023-10-13 01:42:58,959][46663] Updated weights for policy 1, policy_version 35641 (0.0008) +[2023-10-13 01:42:59,181][46662] Updated weights for policy 0, policy_version 35680 (0.0008) +[2023-10-13 01:43:03,270][46663] Updated weights for policy 1, policy_version 35651 (0.0009) +[2023-10-13 01:43:03,429][46662] Updated weights for policy 0, policy_version 35690 (0.0008) +[2023-10-13 01:43:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 73039872. Throughput: 0: 1701.2, 1: 1686.5. Samples: 18274932. 
Policy #0 lag: (min: 6.0, avg: 15.0, max: 38.0) +[2023-10-13 01:43:03,608][45375] Avg episode reward: [(0, '51.160'), (1, '53.390')] +[2023-10-13 01:43:03,626][46663] Updated weights for policy 1, policy_version 35661 (0.0009) +[2023-10-13 01:43:03,803][46662] Updated weights for policy 0, policy_version 35700 (0.0008) +[2023-10-13 01:43:03,998][46663] Updated weights for policy 1, policy_version 35671 (0.0008) +[2023-10-13 01:43:04,177][46662] Updated weights for policy 0, policy_version 35710 (0.0009) +[2023-10-13 01:43:08,071][46663] Updated weights for policy 1, policy_version 35681 (0.0010) +[2023-10-13 01:43:08,202][46662] Updated weights for policy 0, policy_version 35720 (0.0008) +[2023-10-13 01:43:08,439][46663] Updated weights for policy 1, policy_version 35691 (0.0007) +[2023-10-13 01:43:08,566][46662] Updated weights for policy 0, policy_version 35730 (0.0009) +[2023-10-13 01:43:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 73105408. Throughput: 0: 1705.1, 1: 1681.1. Samples: 18295346. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 01:43:08,607][45375] Avg episode reward: [(0, '49.880'), (1, '54.250')] +[2023-10-13 01:43:08,813][46663] Updated weights for policy 1, policy_version 35701 (0.0008) +[2023-10-13 01:43:08,931][46662] Updated weights for policy 0, policy_version 35740 (0.0008) +[2023-10-13 01:43:09,176][46663] Updated weights for policy 1, policy_version 35711 (0.0007) +[2023-10-13 01:43:12,907][46662] Updated weights for policy 0, policy_version 35750 (0.0008) +[2023-10-13 01:43:13,190][46663] Updated weights for policy 1, policy_version 35721 (0.0008) +[2023-10-13 01:43:13,280][46662] Updated weights for policy 0, policy_version 35760 (0.0010) +[2023-10-13 01:43:13,563][46663] Updated weights for policy 1, policy_version 35731 (0.0008) +[2023-10-13 01:43:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 73170944. 
Throughput: 0: 1700.8, 1: 1691.4. Samples: 18304892. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 01:43:13,607][45375] Avg episode reward: [(0, '50.950'), (1, '53.020')] +[2023-10-13 01:43:13,646][46662] Updated weights for policy 0, policy_version 35770 (0.0009) +[2023-10-13 01:43:13,928][46663] Updated weights for policy 1, policy_version 35741 (0.0008) +[2023-10-13 01:43:17,754][46662] Updated weights for policy 0, policy_version 35780 (0.0009) +[2023-10-13 01:43:17,920][46663] Updated weights for policy 1, policy_version 35751 (0.0009) +[2023-10-13 01:43:18,121][46662] Updated weights for policy 0, policy_version 35790 (0.0008) +[2023-10-13 01:43:18,274][46663] Updated weights for policy 1, policy_version 35761 (0.0008) +[2023-10-13 01:43:18,496][46662] Updated weights for policy 0, policy_version 35800 (0.0007) +[2023-10-13 01:43:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 73236480. Throughput: 0: 1701.9, 1: 1696.7. Samples: 18325764. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 01:43:18,607][45375] Avg episode reward: [(0, '49.510'), (1, '52.970')] +[2023-10-13 01:43:18,635][46663] Updated weights for policy 1, policy_version 35771 (0.0008) +[2023-10-13 01:43:22,391][46662] Updated weights for policy 0, policy_version 35810 (0.0007) +[2023-10-13 01:43:22,733][46663] Updated weights for policy 1, policy_version 35781 (0.0008) +[2023-10-13 01:43:22,766][46662] Updated weights for policy 0, policy_version 35820 (0.0008) +[2023-10-13 01:43:23,098][46663] Updated weights for policy 1, policy_version 35791 (0.0009) +[2023-10-13 01:43:23,130][46662] Updated weights for policy 0, policy_version 35830 (0.0008) +[2023-10-13 01:43:23,457][46663] Updated weights for policy 1, policy_version 35801 (0.0009) +[2023-10-13 01:43:23,506][46662] Updated weights for policy 0, policy_version 35840 (0.0008) +[2023-10-13 01:43:23,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73334784. Throughput: 0: 1697.8, 1: 1676.5. Samples: 18345298. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 01:43:23,608][45375] Avg episode reward: [(0, '50.530'), (1, '53.330')] +[2023-10-13 01:43:27,533][46662] Updated weights for policy 0, policy_version 35850 (0.0008) +[2023-10-13 01:43:27,558][46663] Updated weights for policy 1, policy_version 35811 (0.0009) +[2023-10-13 01:43:27,907][46662] Updated weights for policy 0, policy_version 35860 (0.0008) +[2023-10-13 01:43:27,920][46663] Updated weights for policy 1, policy_version 35821 (0.0009) +[2023-10-13 01:43:28,275][46662] Updated weights for policy 0, policy_version 35870 (0.0009) +[2023-10-13 01:43:28,296][46663] Updated weights for policy 1, policy_version 35831 (0.0007) +[2023-10-13 01:43:28,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73400320. Throughput: 0: 1702.0, 1: 1688.4. Samples: 18355508. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:43:28,607][45375] Avg episode reward: [(0, '50.300'), (1, '52.480')] +[2023-10-13 01:43:32,420][46663] Updated weights for policy 1, policy_version 35841 (0.0008) +[2023-10-13 01:43:32,517][46662] Updated weights for policy 0, policy_version 35880 (0.0008) +[2023-10-13 01:43:32,814][46663] Updated weights for policy 1, policy_version 35851 (0.0007) +[2023-10-13 01:43:32,883][46662] Updated weights for policy 0, policy_version 35890 (0.0010) +[2023-10-13 01:43:33,182][46663] Updated weights for policy 1, policy_version 35861 (0.0009) +[2023-10-13 01:43:33,254][46662] Updated weights for policy 0, policy_version 35900 (0.0008) +[2023-10-13 01:43:33,546][46663] Updated weights for policy 1, policy_version 35871 (0.0009) +[2023-10-13 01:43:33,607][45375] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73498624. Throughput: 0: 1691.5, 1: 1682.3. Samples: 18376252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:43:33,607][45375] Avg episode reward: [(0, '49.610'), (1, '51.990')] +[2023-10-13 01:43:37,175][46662] Updated weights for policy 0, policy_version 35910 (0.0009) +[2023-10-13 01:43:37,539][46662] Updated weights for policy 0, policy_version 35920 (0.0007) +[2023-10-13 01:43:37,540][46663] Updated weights for policy 1, policy_version 35881 (0.0008) +[2023-10-13 01:43:37,906][46662] Updated weights for policy 0, policy_version 35930 (0.0009) +[2023-10-13 01:43:37,908][46663] Updated weights for policy 1, policy_version 35891 (0.0008) +[2023-10-13 01:43:38,270][46663] Updated weights for policy 1, policy_version 35901 (0.0007) +[2023-10-13 01:43:38,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 73564160. Throughput: 0: 1675.3, 1: 1656.7. Samples: 18394874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:43:38,607][45375] Avg episode reward: [(0, '49.800'), (1, '52.080')] +[2023-10-13 01:43:41,897][46662] Updated weights for policy 0, policy_version 35940 (0.0009) +[2023-10-13 01:43:42,267][46662] Updated weights for policy 0, policy_version 35950 (0.0008) +[2023-10-13 01:43:42,362][46663] Updated weights for policy 1, policy_version 35911 (0.0008) +[2023-10-13 01:43:42,636][46662] Updated weights for policy 0, policy_version 35960 (0.0009) +[2023-10-13 01:43:42,727][46663] Updated weights for policy 1, policy_version 35921 (0.0009) +[2023-10-13 01:43:43,091][46663] Updated weights for policy 1, policy_version 35931 (0.0008) +[2023-10-13 01:43:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 73629696. Throughput: 0: 1698.2, 1: 1677.5. Samples: 18406348. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:43:43,608][45375] Avg episode reward: [(0, '49.140'), (1, '53.060')] +[2023-10-13 01:43:46,761][46662] Updated weights for policy 0, policy_version 35970 (0.0008) +[2023-10-13 01:43:47,114][46663] Updated weights for policy 1, policy_version 35941 (0.0008) +[2023-10-13 01:43:47,135][46662] Updated weights for policy 0, policy_version 35980 (0.0009) +[2023-10-13 01:43:47,479][46663] Updated weights for policy 1, policy_version 35951 (0.0010) +[2023-10-13 01:43:47,508][46662] Updated weights for policy 0, policy_version 35990 (0.0008) +[2023-10-13 01:43:47,837][46663] Updated weights for policy 1, policy_version 35961 (0.0009) +[2023-10-13 01:43:47,866][46662] Updated weights for policy 0, policy_version 36000 (0.0009) +[2023-10-13 01:43:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73695232. Throughput: 0: 1694.6, 1: 1673.1. Samples: 18426476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:43:48,607][45375] Avg episode reward: [(0, '48.860'), (1, '53.100')] +[2023-10-13 01:43:51,933][46663] Updated weights for policy 1, policy_version 35971 (0.0010) +[2023-10-13 01:43:51,989][46662] Updated weights for policy 0, policy_version 36010 (0.0010) +[2023-10-13 01:43:52,302][46663] Updated weights for policy 1, policy_version 35981 (0.0008) +[2023-10-13 01:43:52,355][46662] Updated weights for policy 0, policy_version 36020 (0.0008) +[2023-10-13 01:43:52,662][46663] Updated weights for policy 1, policy_version 35991 (0.0008) +[2023-10-13 01:43:52,731][46662] Updated weights for policy 0, policy_version 36030 (0.0008) +[2023-10-13 01:43:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 73760768. Throughput: 0: 1665.5, 1: 1660.8. Samples: 18445030. Policy #0 lag: (min: 24.0, avg: 41.5, max: 56.0) +[2023-10-13 01:43:53,608][45375] Avg episode reward: [(0, '50.250'), (1, '51.350')] +[2023-10-13 01:43:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000036032_36896768.pth... +[2023-10-13 01:43:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000036000_36864000.pth... 
+[2023-10-13 01:43:53,648][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000034464_35291136.pth +[2023-10-13 01:43:53,659][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000034432_35258368.pth +[2023-10-13 01:43:56,769][46663] Updated weights for policy 1, policy_version 36001 (0.0008) +[2023-10-13 01:43:56,807][46662] Updated weights for policy 0, policy_version 36040 (0.0009) +[2023-10-13 01:43:57,135][46663] Updated weights for policy 1, policy_version 36011 (0.0007) +[2023-10-13 01:43:57,178][46662] Updated weights for policy 0, policy_version 36050 (0.0007) +[2023-10-13 01:43:57,493][46663] Updated weights for policy 1, policy_version 36021 (0.0009) +[2023-10-13 01:43:57,550][46662] Updated weights for policy 0, policy_version 36060 (0.0010) +[2023-10-13 01:43:57,855][46663] Updated weights for policy 1, policy_version 36031 (0.0009) +[2023-10-13 01:43:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73826304. Throughput: 0: 1688.8, 1: 1680.0. Samples: 18456490. Policy #0 lag: (min: 24.0, avg: 41.5, max: 56.0) +[2023-10-13 01:43:58,607][45375] Avg episode reward: [(0, '50.600'), (1, '50.790')] +[2023-10-13 01:44:01,708][46662] Updated weights for policy 0, policy_version 36070 (0.0009) +[2023-10-13 01:44:01,941][46663] Updated weights for policy 1, policy_version 36041 (0.0008) +[2023-10-13 01:44:02,079][46662] Updated weights for policy 0, policy_version 36080 (0.0008) +[2023-10-13 01:44:02,311][46663] Updated weights for policy 1, policy_version 36051 (0.0008) +[2023-10-13 01:44:02,446][46662] Updated weights for policy 0, policy_version 36090 (0.0007) +[2023-10-13 01:44:02,674][46663] Updated weights for policy 1, policy_version 36061 (0.0010) +[2023-10-13 01:44:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73891840. Throughput: 0: 1680.8, 1: 1654.3. Samples: 18475846. 
Policy #0 lag: (min: 24.0, avg: 41.5, max: 56.0) +[2023-10-13 01:44:03,607][45375] Avg episode reward: [(0, '50.240'), (1, '50.480')] +[2023-10-13 01:44:06,487][46662] Updated weights for policy 0, policy_version 36100 (0.0008) +[2023-10-13 01:44:06,642][46663] Updated weights for policy 1, policy_version 36071 (0.0010) +[2023-10-13 01:44:06,852][46662] Updated weights for policy 0, policy_version 36110 (0.0007) +[2023-10-13 01:44:07,005][46663] Updated weights for policy 1, policy_version 36081 (0.0008) +[2023-10-13 01:44:07,213][46662] Updated weights for policy 0, policy_version 36120 (0.0008) +[2023-10-13 01:44:07,368][46663] Updated weights for policy 1, policy_version 36091 (0.0008) +[2023-10-13 01:44:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73957376. Throughput: 0: 1666.6, 1: 1662.6. Samples: 18495114. Policy #0 lag: (min: 24.0, avg: 41.5, max: 56.0) +[2023-10-13 01:44:08,607][45375] Avg episode reward: [(0, '49.990'), (1, '51.330')] +[2023-10-13 01:44:11,318][46662] Updated weights for policy 0, policy_version 36130 (0.0007) +[2023-10-13 01:44:11,624][46663] Updated weights for policy 1, policy_version 36101 (0.0008) +[2023-10-13 01:44:11,700][46662] Updated weights for policy 0, policy_version 36140 (0.0009) +[2023-10-13 01:44:11,981][46663] Updated weights for policy 1, policy_version 36111 (0.0008) +[2023-10-13 01:44:12,070][46662] Updated weights for policy 0, policy_version 36150 (0.0007) +[2023-10-13 01:44:12,358][46663] Updated weights for policy 1, policy_version 36121 (0.0009) +[2023-10-13 01:44:12,432][46662] Updated weights for policy 0, policy_version 36160 (0.0008) +[2023-10-13 01:44:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 74022912. Throughput: 0: 1688.9, 1: 1679.6. Samples: 18507092. 
Policy #0 lag: (min: 24.0, avg: 41.5, max: 56.0) +[2023-10-13 01:44:13,608][45375] Avg episode reward: [(0, '52.200'), (1, '51.810')] +[2023-10-13 01:44:16,234][46662] Updated weights for policy 0, policy_version 36170 (0.0010) +[2023-10-13 01:44:16,498][46663] Updated weights for policy 1, policy_version 36131 (0.0010) +[2023-10-13 01:44:16,608][46662] Updated weights for policy 0, policy_version 36180 (0.0009) +[2023-10-13 01:44:16,869][46663] Updated weights for policy 1, policy_version 36141 (0.0007) +[2023-10-13 01:44:16,987][46662] Updated weights for policy 0, policy_version 36190 (0.0009) +[2023-10-13 01:44:17,241][46663] Updated weights for policy 1, policy_version 36151 (0.0010) +[2023-10-13 01:44:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 74088448. Throughput: 0: 1672.0, 1: 1653.8. Samples: 18525914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:44:18,607][45375] Avg episode reward: [(0, '51.510'), (1, '53.170')] +[2023-10-13 01:44:21,059][46662] Updated weights for policy 0, policy_version 36200 (0.0007) +[2023-10-13 01:44:21,286][46663] Updated weights for policy 1, policy_version 36161 (0.0009) +[2023-10-13 01:44:21,423][46662] Updated weights for policy 0, policy_version 36210 (0.0009) +[2023-10-13 01:44:21,677][46663] Updated weights for policy 1, policy_version 36171 (0.0009) +[2023-10-13 01:44:21,784][46662] Updated weights for policy 0, policy_version 36220 (0.0007) +[2023-10-13 01:44:22,036][46663] Updated weights for policy 1, policy_version 36181 (0.0008) +[2023-10-13 01:44:22,406][46663] Updated weights for policy 1, policy_version 36191 (0.0010) +[2023-10-13 01:44:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 74153984. Throughput: 0: 1680.8, 1: 1681.2. Samples: 18546166. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:44:23,607][45375] Avg episode reward: [(0, '52.440'), (1, '53.310')] +[2023-10-13 01:44:25,865][46662] Updated weights for policy 0, policy_version 36230 (0.0008) +[2023-10-13 01:44:26,238][46662] Updated weights for policy 0, policy_version 36240 (0.0007) +[2023-10-13 01:44:26,478][46663] Updated weights for policy 1, policy_version 36201 (0.0008) +[2023-10-13 01:44:26,597][46662] Updated weights for policy 0, policy_version 36250 (0.0008) +[2023-10-13 01:44:26,848][46663] Updated weights for policy 1, policy_version 36211 (0.0007) +[2023-10-13 01:44:27,223][46663] Updated weights for policy 1, policy_version 36221 (0.0009) +[2023-10-13 01:44:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 74219520. Throughput: 0: 1683.5, 1: 1675.9. Samples: 18557522. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:44:28,607][45375] Avg episode reward: [(0, '52.090'), (1, '55.220')] +[2023-10-13 01:44:30,717][46662] Updated weights for policy 0, policy_version 36260 (0.0009) +[2023-10-13 01:44:31,086][46662] Updated weights for policy 0, policy_version 36270 (0.0009) +[2023-10-13 01:44:31,264][46663] Updated weights for policy 1, policy_version 36231 (0.0007) +[2023-10-13 01:44:31,457][46662] Updated weights for policy 0, policy_version 36280 (0.0010) +[2023-10-13 01:44:31,624][46663] Updated weights for policy 1, policy_version 36241 (0.0008) +[2023-10-13 01:44:31,990][46663] Updated weights for policy 1, policy_version 36251 (0.0009) +[2023-10-13 01:44:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74285056. Throughput: 0: 1661.2, 1: 1662.9. Samples: 18576064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:44:33,607][45375] Avg episode reward: [(0, '51.340'), (1, '54.450')] +[2023-10-13 01:44:35,440][46662] Updated weights for policy 0, policy_version 36290 (0.0010) +[2023-10-13 01:44:35,806][46662] Updated weights for policy 0, policy_version 36300 (0.0008) +[2023-10-13 01:44:36,129][46663] Updated weights for policy 1, policy_version 36261 (0.0008) +[2023-10-13 01:44:36,180][46662] Updated weights for policy 0, policy_version 36310 (0.0009) +[2023-10-13 01:44:36,496][46663] Updated weights for policy 1, policy_version 36271 (0.0008) +[2023-10-13 01:44:36,548][46662] Updated weights for policy 0, policy_version 36320 (0.0007) +[2023-10-13 01:44:36,877][46663] Updated weights for policy 1, policy_version 36281 (0.0010) +[2023-10-13 01:44:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74350592. Throughput: 0: 1689.2, 1: 1680.1. Samples: 18596650. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:44:38,607][45375] Avg episode reward: [(0, '51.420'), (1, '53.140')] +[2023-10-13 01:44:40,768][46662] Updated weights for policy 0, policy_version 36330 (0.0007) +[2023-10-13 01:44:40,871][46663] Updated weights for policy 1, policy_version 36291 (0.0008) +[2023-10-13 01:44:41,134][46662] Updated weights for policy 0, policy_version 36340 (0.0008) +[2023-10-13 01:44:41,236][46663] Updated weights for policy 1, policy_version 36301 (0.0009) +[2023-10-13 01:44:41,501][46662] Updated weights for policy 0, policy_version 36350 (0.0008) +[2023-10-13 01:44:41,605][46663] Updated weights for policy 1, policy_version 36311 (0.0007) +[2023-10-13 01:44:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74416128. Throughput: 0: 1680.4, 1: 1666.1. Samples: 18607084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 01:44:43,608][45375] Avg episode reward: [(0, '51.290'), (1, '54.580')] +[2023-10-13 01:44:45,468][46662] Updated weights for policy 0, policy_version 36360 (0.0007) +[2023-10-13 01:44:45,720][46663] Updated weights for policy 1, policy_version 36321 (0.0009) +[2023-10-13 01:44:45,840][46662] Updated weights for policy 0, policy_version 36370 (0.0009) +[2023-10-13 01:44:46,085][46663] Updated weights for policy 1, policy_version 36331 (0.0008) +[2023-10-13 01:44:46,204][46662] Updated weights for policy 0, policy_version 36380 (0.0011) +[2023-10-13 01:44:46,449][46663] Updated weights for policy 1, policy_version 36341 (0.0007) +[2023-10-13 01:44:46,831][46663] Updated weights for policy 1, policy_version 36351 (0.0008) +[2023-10-13 01:44:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74481664. Throughput: 0: 1673.1, 1: 1672.2. Samples: 18626384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 01:44:48,607][45375] Avg episode reward: [(0, '50.590'), (1, '54.640')] +[2023-10-13 01:44:50,340][46662] Updated weights for policy 0, policy_version 36390 (0.0009) +[2023-10-13 01:44:50,711][46662] Updated weights for policy 0, policy_version 36400 (0.0009) +[2023-10-13 01:44:51,027][46663] Updated weights for policy 1, policy_version 36361 (0.0008) +[2023-10-13 01:44:51,084][46662] Updated weights for policy 0, policy_version 36410 (0.0007) +[2023-10-13 01:44:51,394][46663] Updated weights for policy 1, policy_version 36371 (0.0007) +[2023-10-13 01:44:51,760][46663] Updated weights for policy 1, policy_version 36381 (0.0007) +[2023-10-13 01:44:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74547200. Throughput: 0: 1689.0, 1: 1682.7. Samples: 18646840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 01:44:53,607][45375] Avg episode reward: [(0, '49.600'), (1, '54.150')] +[2023-10-13 01:44:55,381][46662] Updated weights for policy 0, policy_version 36420 (0.0009) +[2023-10-13 01:44:55,747][46662] Updated weights for policy 0, policy_version 36430 (0.0008) +[2023-10-13 01:44:55,814][46663] Updated weights for policy 1, policy_version 36391 (0.0008) +[2023-10-13 01:44:56,119][46662] Updated weights for policy 0, policy_version 36440 (0.0008) +[2023-10-13 01:44:56,177][46663] Updated weights for policy 1, policy_version 36401 (0.0009) +[2023-10-13 01:44:56,544][46663] Updated weights for policy 1, policy_version 36411 (0.0009) +[2023-10-13 01:44:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74612736. Throughput: 0: 1669.4, 1: 1660.2. Samples: 18656924. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 01:44:58,607][45375] Avg episode reward: [(0, '48.570'), (1, '53.420')] +[2023-10-13 01:45:00,234][46662] Updated weights for policy 0, policy_version 36450 (0.0010) +[2023-10-13 01:45:00,544][46663] Updated weights for policy 1, policy_version 36421 (0.0009) +[2023-10-13 01:45:00,616][46662] Updated weights for policy 0, policy_version 36460 (0.0008) +[2023-10-13 01:45:00,919][46663] Updated weights for policy 1, policy_version 36431 (0.0007) +[2023-10-13 01:45:00,978][46662] Updated weights for policy 0, policy_version 36470 (0.0008) +[2023-10-13 01:45:01,280][46663] Updated weights for policy 1, policy_version 36441 (0.0009) +[2023-10-13 01:45:01,353][46662] Updated weights for policy 0, policy_version 36480 (0.0009) +[2023-10-13 01:45:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74678272. Throughput: 0: 1668.6, 1: 1673.4. Samples: 18676306. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 01:45:03,607][45375] Avg episode reward: [(0, '48.190'), (1, '52.220')] +[2023-10-13 01:45:05,212][46662] Updated weights for policy 0, policy_version 36490 (0.0007) +[2023-10-13 01:45:05,449][46663] Updated weights for policy 1, policy_version 36451 (0.0008) +[2023-10-13 01:45:05,584][46662] Updated weights for policy 0, policy_version 36500 (0.0007) +[2023-10-13 01:45:05,806][46663] Updated weights for policy 1, policy_version 36461 (0.0008) +[2023-10-13 01:45:05,951][46662] Updated weights for policy 0, policy_version 36510 (0.0010) +[2023-10-13 01:45:06,170][46663] Updated weights for policy 1, policy_version 36471 (0.0008) +[2023-10-13 01:45:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74743808. Throughput: 0: 1680.9, 1: 1669.1. Samples: 18696914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:45:08,607][45375] Avg episode reward: [(0, '48.360'), (1, '51.260')] +[2023-10-13 01:45:09,896][46662] Updated weights for policy 0, policy_version 36520 (0.0007) +[2023-10-13 01:45:10,269][46662] Updated weights for policy 0, policy_version 36530 (0.0007) +[2023-10-13 01:45:10,339][46663] Updated weights for policy 1, policy_version 36481 (0.0009) +[2023-10-13 01:45:10,624][46662] Updated weights for policy 0, policy_version 36540 (0.0009) +[2023-10-13 01:45:10,760][46663] Updated weights for policy 1, policy_version 36491 (0.0007) +[2023-10-13 01:45:11,119][46663] Updated weights for policy 1, policy_version 36501 (0.0010) +[2023-10-13 01:45:11,482][46663] Updated weights for policy 1, policy_version 36511 (0.0009) +[2023-10-13 01:45:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74809344. Throughput: 0: 1654.5, 1: 1650.7. Samples: 18706258. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:45:13,607][45375] Avg episode reward: [(0, '47.760'), (1, '50.430')] +[2023-10-13 01:45:14,806][46662] Updated weights for policy 0, policy_version 36550 (0.0008) +[2023-10-13 01:45:15,167][46662] Updated weights for policy 0, policy_version 36560 (0.0010) +[2023-10-13 01:45:15,539][46662] Updated weights for policy 0, policy_version 36570 (0.0008) +[2023-10-13 01:45:15,596][46663] Updated weights for policy 1, policy_version 36521 (0.0008) +[2023-10-13 01:45:15,961][46663] Updated weights for policy 1, policy_version 36531 (0.0008) +[2023-10-13 01:45:16,323][46663] Updated weights for policy 1, policy_version 36541 (0.0010) +[2023-10-13 01:45:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74874880. Throughput: 0: 1677.0, 1: 1669.0. Samples: 18726632. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:45:18,607][45375] Avg episode reward: [(0, '45.530'), (1, '50.030')] +[2023-10-13 01:45:19,582][46662] Updated weights for policy 0, policy_version 36580 (0.0008) +[2023-10-13 01:45:19,944][46662] Updated weights for policy 0, policy_version 36590 (0.0007) +[2023-10-13 01:45:20,314][46662] Updated weights for policy 0, policy_version 36600 (0.0007) +[2023-10-13 01:45:20,379][46663] Updated weights for policy 1, policy_version 36551 (0.0008) +[2023-10-13 01:45:20,757][46663] Updated weights for policy 1, policy_version 36561 (0.0010) +[2023-10-13 01:45:21,121][46663] Updated weights for policy 1, policy_version 36571 (0.0010) +[2023-10-13 01:45:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74940416. Throughput: 0: 1672.9, 1: 1677.3. Samples: 18747410. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:45:23,607][45375] Avg episode reward: [(0, '44.350'), (1, '49.980')] +[2023-10-13 01:45:24,486][46662] Updated weights for policy 0, policy_version 36610 (0.0007) +[2023-10-13 01:45:24,858][46662] Updated weights for policy 0, policy_version 36620 (0.0007) +[2023-10-13 01:45:25,156][46663] Updated weights for policy 1, policy_version 36581 (0.0009) +[2023-10-13 01:45:25,232][46662] Updated weights for policy 0, policy_version 36630 (0.0008) +[2023-10-13 01:45:25,524][46663] Updated weights for policy 1, policy_version 36591 (0.0007) +[2023-10-13 01:45:25,593][46662] Updated weights for policy 0, policy_version 36640 (0.0008) +[2023-10-13 01:45:25,896][46663] Updated weights for policy 1, policy_version 36601 (0.0009) +[2023-10-13 01:45:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75005952. Throughput: 0: 1660.2, 1: 1661.6. Samples: 18756566. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:45:28,607][45375] Avg episode reward: [(0, '44.720'), (1, '49.180')] +[2023-10-13 01:45:29,814][46663] Updated weights for policy 1, policy_version 36611 (0.0008) +[2023-10-13 01:45:29,824][46662] Updated weights for policy 0, policy_version 36650 (0.0009) +[2023-10-13 01:45:30,182][46663] Updated weights for policy 1, policy_version 36621 (0.0007) +[2023-10-13 01:45:30,196][46662] Updated weights for policy 0, policy_version 36660 (0.0008) +[2023-10-13 01:45:30,542][46663] Updated weights for policy 1, policy_version 36631 (0.0007) +[2023-10-13 01:45:30,574][46662] Updated weights for policy 0, policy_version 36670 (0.0008) +[2023-10-13 01:45:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 75071488. Throughput: 0: 1668.8, 1: 1680.0. Samples: 18777080. 
Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) +[2023-10-13 01:45:33,608][45375] Avg episode reward: [(0, '46.170'), (1, '49.340')] +[2023-10-13 01:45:34,582][46663] Updated weights for policy 1, policy_version 36641 (0.0010) +[2023-10-13 01:45:34,768][46662] Updated weights for policy 0, policy_version 36680 (0.0007) +[2023-10-13 01:45:34,946][46663] Updated weights for policy 1, policy_version 36651 (0.0007) +[2023-10-13 01:45:35,137][46662] Updated weights for policy 0, policy_version 36690 (0.0009) +[2023-10-13 01:45:35,307][46663] Updated weights for policy 1, policy_version 36661 (0.0008) +[2023-10-13 01:45:35,507][46662] Updated weights for policy 0, policy_version 36700 (0.0008) +[2023-10-13 01:45:35,671][46663] Updated weights for policy 1, policy_version 36671 (0.0008) +[2023-10-13 01:45:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75137024. Throughput: 0: 1674.0, 1: 1676.8. Samples: 18797628. Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) +[2023-10-13 01:45:38,607][45375] Avg episode reward: [(0, '46.660'), (1, '50.110')] +[2023-10-13 01:45:39,571][46662] Updated weights for policy 0, policy_version 36710 (0.0008) +[2023-10-13 01:45:39,874][46663] Updated weights for policy 1, policy_version 36681 (0.0007) +[2023-10-13 01:45:39,946][46662] Updated weights for policy 0, policy_version 36720 (0.0008) +[2023-10-13 01:45:40,246][46663] Updated weights for policy 1, policy_version 36691 (0.0008) +[2023-10-13 01:45:40,319][46662] Updated weights for policy 0, policy_version 36730 (0.0009) +[2023-10-13 01:45:40,625][46663] Updated weights for policy 1, policy_version 36701 (0.0007) +[2023-10-13 01:45:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75202560. Throughput: 0: 1658.7, 1: 1663.8. Samples: 18806436. 
Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) +[2023-10-13 01:45:43,607][45375] Avg episode reward: [(0, '45.430'), (1, '51.090')] +[2023-10-13 01:45:44,461][46662] Updated weights for policy 0, policy_version 36740 (0.0007) +[2023-10-13 01:45:44,778][46663] Updated weights for policy 1, policy_version 36711 (0.0009) +[2023-10-13 01:45:44,836][46662] Updated weights for policy 0, policy_version 36750 (0.0008) +[2023-10-13 01:45:45,139][46663] Updated weights for policy 1, policy_version 36721 (0.0008) +[2023-10-13 01:45:45,203][46662] Updated weights for policy 0, policy_version 36760 (0.0008) +[2023-10-13 01:45:45,503][46663] Updated weights for policy 1, policy_version 36731 (0.0007) +[2023-10-13 01:45:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75268096. Throughput: 0: 1676.4, 1: 1676.9. Samples: 18827202. Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) +[2023-10-13 01:45:48,607][45375] Avg episode reward: [(0, '46.980'), (1, '49.850')] +[2023-10-13 01:45:49,316][46662] Updated weights for policy 0, policy_version 36770 (0.0008) +[2023-10-13 01:45:49,604][46663] Updated weights for policy 1, policy_version 36741 (0.0008) +[2023-10-13 01:45:49,688][46662] Updated weights for policy 0, policy_version 36780 (0.0009) +[2023-10-13 01:45:49,970][46663] Updated weights for policy 1, policy_version 36751 (0.0008) +[2023-10-13 01:45:50,054][46662] Updated weights for policy 0, policy_version 36790 (0.0008) +[2023-10-13 01:45:50,326][46663] Updated weights for policy 1, policy_version 36761 (0.0009) +[2023-10-13 01:45:50,422][46662] Updated weights for policy 0, policy_version 36800 (0.0009) +[2023-10-13 01:45:53,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 75333632. Throughput: 0: 1668.4, 1: 1683.6. Samples: 18847756. 
Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) +[2023-10-13 01:45:53,608][45375] Avg episode reward: [(0, '47.940'), (1, '49.680')] +[2023-10-13 01:45:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000036800_37683200.pth... +[2023-10-13 01:45:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000036768_37650432.pth... +[2023-10-13 01:45:53,650][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000035200_36044800.pth +[2023-10-13 01:45:53,658][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000035232_36077568.pth +[2023-10-13 01:45:54,293][46662] Updated weights for policy 0, policy_version 36810 (0.0011) +[2023-10-13 01:45:54,347][46663] Updated weights for policy 1, policy_version 36771 (0.0008) +[2023-10-13 01:45:54,658][46662] Updated weights for policy 0, policy_version 36820 (0.0009) +[2023-10-13 01:45:54,720][46663] Updated weights for policy 1, policy_version 36781 (0.0009) +[2023-10-13 01:45:55,035][46662] Updated weights for policy 0, policy_version 36830 (0.0009) +[2023-10-13 01:45:55,093][46663] Updated weights for policy 1, policy_version 36791 (0.0008) +[2023-10-13 01:45:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75399168. Throughput: 0: 1670.5, 1: 1680.1. Samples: 18857036. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:45:58,608][45375] Avg episode reward: [(0, '48.410'), (1, '50.900')] +[2023-10-13 01:45:59,030][46662] Updated weights for policy 0, policy_version 36840 (0.0008) +[2023-10-13 01:45:59,168][46663] Updated weights for policy 1, policy_version 36801 (0.0007) +[2023-10-13 01:45:59,398][46662] Updated weights for policy 0, policy_version 36850 (0.0008) +[2023-10-13 01:45:59,540][46663] Updated weights for policy 1, policy_version 36811 (0.0008) +[2023-10-13 01:45:59,766][46662] Updated weights for policy 0, policy_version 36860 (0.0007) +[2023-10-13 01:45:59,905][46663] Updated weights for policy 1, policy_version 36821 (0.0007) +[2023-10-13 01:46:00,274][46663] Updated weights for policy 1, policy_version 36831 (0.0007) +[2023-10-13 01:46:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 75464704. Throughput: 0: 1674.7, 1: 1682.3. Samples: 18877700. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:46:03,608][45375] Avg episode reward: [(0, '47.540'), (1, '51.330')] +[2023-10-13 01:46:03,636][46662] Updated weights for policy 0, policy_version 36870 (0.0007) +[2023-10-13 01:46:04,010][46662] Updated weights for policy 0, policy_version 36880 (0.0007) +[2023-10-13 01:46:04,365][46663] Updated weights for policy 1, policy_version 36841 (0.0008) +[2023-10-13 01:46:04,379][46662] Updated weights for policy 0, policy_version 36890 (0.0009) +[2023-10-13 01:46:04,727][46663] Updated weights for policy 1, policy_version 36851 (0.0009) +[2023-10-13 01:46:05,090][46663] Updated weights for policy 1, policy_version 36861 (0.0007) +[2023-10-13 01:46:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75530240. Throughput: 0: 1681.5, 1: 1676.8. Samples: 18898532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:46:08,607][45375] Avg episode reward: [(0, '47.350'), (1, '52.610')] +[2023-10-13 01:46:08,680][46662] Updated weights for policy 0, policy_version 36900 (0.0010) +[2023-10-13 01:46:09,043][46662] Updated weights for policy 0, policy_version 36910 (0.0007) +[2023-10-13 01:46:09,311][46663] Updated weights for policy 1, policy_version 36871 (0.0008) +[2023-10-13 01:46:09,415][46662] Updated weights for policy 0, policy_version 36920 (0.0008) +[2023-10-13 01:46:09,673][46663] Updated weights for policy 1, policy_version 36881 (0.0008) +[2023-10-13 01:46:10,040][46663] Updated weights for policy 1, policy_version 36891 (0.0009) +[2023-10-13 01:46:13,563][46662] Updated weights for policy 0, policy_version 36930 (0.0007) +[2023-10-13 01:46:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 75595776. Throughput: 0: 1676.5, 1: 1679.0. Samples: 18907566. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:46:13,608][45375] Avg episode reward: [(0, '46.070'), (1, '52.820')] +[2023-10-13 01:46:13,924][46662] Updated weights for policy 0, policy_version 36940 (0.0007) +[2023-10-13 01:46:14,032][46663] Updated weights for policy 1, policy_version 36901 (0.0008) +[2023-10-13 01:46:14,299][46662] Updated weights for policy 0, policy_version 36950 (0.0007) +[2023-10-13 01:46:14,408][46663] Updated weights for policy 1, policy_version 36911 (0.0009) +[2023-10-13 01:46:14,665][46662] Updated weights for policy 0, policy_version 36960 (0.0008) +[2023-10-13 01:46:14,781][46663] Updated weights for policy 1, policy_version 36921 (0.0008) +[2023-10-13 01:46:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75661312. Throughput: 0: 1687.4, 1: 1679.1. Samples: 18928570. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:46:18,607][45375] Avg episode reward: [(0, '46.260'), (1, '52.800')] +[2023-10-13 01:46:18,619][46662] Updated weights for policy 0, policy_version 36970 (0.0008) +[2023-10-13 01:46:18,730][46663] Updated weights for policy 1, policy_version 36931 (0.0008) +[2023-10-13 01:46:18,985][46662] Updated weights for policy 0, policy_version 36980 (0.0009) +[2023-10-13 01:46:19,103][46663] Updated weights for policy 1, policy_version 36941 (0.0008) +[2023-10-13 01:46:19,350][46662] Updated weights for policy 0, policy_version 36990 (0.0008) +[2023-10-13 01:46:19,467][46663] Updated weights for policy 1, policy_version 36951 (0.0007) +[2023-10-13 01:46:23,457][46662] Updated weights for policy 0, policy_version 37000 (0.0010) +[2023-10-13 01:46:23,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75726848. Throughput: 0: 1689.2, 1: 1679.4. Samples: 18949218. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:46:23,607][45375] Avg episode reward: [(0, '46.880'), (1, '53.850')] +[2023-10-13 01:46:23,639][46663] Updated weights for policy 1, policy_version 36961 (0.0008) +[2023-10-13 01:46:23,829][46662] Updated weights for policy 0, policy_version 37010 (0.0008) +[2023-10-13 01:46:23,999][46663] Updated weights for policy 1, policy_version 36971 (0.0007) +[2023-10-13 01:46:24,190][46662] Updated weights for policy 0, policy_version 37020 (0.0009) +[2023-10-13 01:46:24,357][46663] Updated weights for policy 1, policy_version 36981 (0.0007) +[2023-10-13 01:46:24,727][46663] Updated weights for policy 1, policy_version 36991 (0.0007) +[2023-10-13 01:46:28,214][46662] Updated weights for policy 0, policy_version 37030 (0.0010) +[2023-10-13 01:46:28,593][46662] Updated weights for policy 0, policy_version 37040 (0.0007) +[2023-10-13 01:46:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75792384. 
Throughput: 0: 1690.4, 1: 1685.4. Samples: 18958346. Policy #0 lag: (min: 2.0, avg: 4.6, max: 34.0) +[2023-10-13 01:46:28,607][45375] Avg episode reward: [(0, '46.560'), (1, '52.640')] +[2023-10-13 01:46:28,806][46663] Updated weights for policy 1, policy_version 37001 (0.0007) +[2023-10-13 01:46:28,960][46662] Updated weights for policy 0, policy_version 37050 (0.0010) +[2023-10-13 01:46:29,167][46663] Updated weights for policy 1, policy_version 37011 (0.0007) +[2023-10-13 01:46:29,539][46663] Updated weights for policy 1, policy_version 37021 (0.0007) +[2023-10-13 01:46:32,990][46662] Updated weights for policy 0, policy_version 37060 (0.0009) +[2023-10-13 01:46:33,366][46662] Updated weights for policy 0, policy_version 37070 (0.0008) +[2023-10-13 01:46:33,568][46663] Updated weights for policy 1, policy_version 37031 (0.0008) +[2023-10-13 01:46:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75857920. Throughput: 0: 1690.8, 1: 1681.3. Samples: 18978946. Policy #0 lag: (min: 2.0, avg: 4.6, max: 34.0) +[2023-10-13 01:46:33,607][45375] Avg episode reward: [(0, '46.470'), (1, '53.740')] +[2023-10-13 01:46:33,725][46662] Updated weights for policy 0, policy_version 37080 (0.0007) +[2023-10-13 01:46:33,936][46663] Updated weights for policy 1, policy_version 37041 (0.0009) +[2023-10-13 01:46:34,316][46663] Updated weights for policy 1, policy_version 37051 (0.0007) +[2023-10-13 01:46:37,752][46662] Updated weights for policy 0, policy_version 37090 (0.0010) +[2023-10-13 01:46:38,133][46662] Updated weights for policy 0, policy_version 37100 (0.0009) +[2023-10-13 01:46:38,273][46663] Updated weights for policy 1, policy_version 37061 (0.0008) +[2023-10-13 01:46:38,500][46662] Updated weights for policy 0, policy_version 37110 (0.0007) +[2023-10-13 01:46:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75923456. Throughput: 0: 1690.2, 1: 1679.2. Samples: 18999380. 
Policy #0 lag: (min: 2.0, avg: 4.6, max: 34.0) +[2023-10-13 01:46:38,607][45375] Avg episode reward: [(0, '45.750'), (1, '53.010')] +[2023-10-13 01:46:38,639][46663] Updated weights for policy 1, policy_version 37071 (0.0008) +[2023-10-13 01:46:38,858][46662] Updated weights for policy 0, policy_version 37120 (0.0008) +[2023-10-13 01:46:39,013][46663] Updated weights for policy 1, policy_version 37081 (0.0010) +[2023-10-13 01:46:43,009][46663] Updated weights for policy 1, policy_version 37091 (0.0007) +[2023-10-13 01:46:43,191][46662] Updated weights for policy 0, policy_version 37130 (0.0007) +[2023-10-13 01:46:43,365][46663] Updated weights for policy 1, policy_version 37101 (0.0009) +[2023-10-13 01:46:43,566][46662] Updated weights for policy 0, policy_version 37140 (0.0008) +[2023-10-13 01:46:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75988992. Throughput: 0: 1687.9, 1: 1688.6. Samples: 19008978. Policy #0 lag: (min: 2.0, avg: 4.6, max: 34.0) +[2023-10-13 01:46:43,607][45375] Avg episode reward: [(0, '45.260'), (1, '53.510')] +[2023-10-13 01:46:43,734][46663] Updated weights for policy 1, policy_version 37111 (0.0007) +[2023-10-13 01:46:43,932][46662] Updated weights for policy 0, policy_version 37150 (0.0008) +[2023-10-13 01:46:47,908][46663] Updated weights for policy 1, policy_version 37121 (0.0007) +[2023-10-13 01:46:47,984][46662] Updated weights for policy 0, policy_version 37160 (0.0011) +[2023-10-13 01:46:48,326][46663] Updated weights for policy 1, policy_version 37131 (0.0009) +[2023-10-13 01:46:48,349][46662] Updated weights for policy 0, policy_version 37170 (0.0008) +[2023-10-13 01:46:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76054528. Throughput: 0: 1686.6, 1: 1691.5. Samples: 19029714. 
Policy #0 lag: (min: 2.0, avg: 4.6, max: 34.0) +[2023-10-13 01:46:48,607][45375] Avg episode reward: [(0, '45.520'), (1, '54.340')] +[2023-10-13 01:46:48,698][46663] Updated weights for policy 1, policy_version 37141 (0.0007) +[2023-10-13 01:46:48,707][46662] Updated weights for policy 0, policy_version 37180 (0.0007) +[2023-10-13 01:46:49,057][46663] Updated weights for policy 1, policy_version 37151 (0.0007) +[2023-10-13 01:46:52,948][46662] Updated weights for policy 0, policy_version 37190 (0.0010) +[2023-10-13 01:46:53,177][46663] Updated weights for policy 1, policy_version 37161 (0.0008) +[2023-10-13 01:46:53,308][46662] Updated weights for policy 0, policy_version 37200 (0.0008) +[2023-10-13 01:46:53,546][46663] Updated weights for policy 1, policy_version 37171 (0.0009) +[2023-10-13 01:46:53,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 76120064. Throughput: 0: 1675.4, 1: 1673.7. Samples: 19049244. Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-13 01:46:53,607][45375] Avg episode reward: [(0, '44.430'), (1, '53.820')] +[2023-10-13 01:46:53,686][46662] Updated weights for policy 0, policy_version 37210 (0.0007) +[2023-10-13 01:46:53,915][46663] Updated weights for policy 1, policy_version 37181 (0.0009) +[2023-10-13 01:46:57,683][46662] Updated weights for policy 0, policy_version 37220 (0.0008) +[2023-10-13 01:46:57,979][46663] Updated weights for policy 1, policy_version 37191 (0.0010) +[2023-10-13 01:46:58,048][46662] Updated weights for policy 0, policy_version 37230 (0.0008) +[2023-10-13 01:46:58,349][46663] Updated weights for policy 1, policy_version 37201 (0.0009) +[2023-10-13 01:46:58,427][46662] Updated weights for policy 0, policy_version 37240 (0.0008) +[2023-10-13 01:46:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76185600. Throughput: 0: 1679.7, 1: 1685.3. Samples: 19058988. 
Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-13 01:46:58,607][45375] Avg episode reward: [(0, '43.810'), (1, '53.110')] +[2023-10-13 01:46:58,726][46663] Updated weights for policy 1, policy_version 37211 (0.0008) +[2023-10-13 01:47:02,553][46662] Updated weights for policy 0, policy_version 37250 (0.0009) +[2023-10-13 01:47:02,772][46663] Updated weights for policy 1, policy_version 37221 (0.0007) +[2023-10-13 01:47:02,918][46662] Updated weights for policy 0, policy_version 37260 (0.0007) +[2023-10-13 01:47:03,139][46663] Updated weights for policy 1, policy_version 37231 (0.0007) +[2023-10-13 01:47:03,286][46662] Updated weights for policy 0, policy_version 37270 (0.0008) +[2023-10-13 01:47:03,502][46663] Updated weights for policy 1, policy_version 37241 (0.0008) +[2023-10-13 01:47:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 76251136. Throughput: 0: 1675.0, 1: 1685.9. Samples: 19079810. Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-13 01:47:03,608][45375] Avg episode reward: [(0, '44.330'), (1, '52.840')] +[2023-10-13 01:47:03,658][46662] Updated weights for policy 0, policy_version 37280 (0.0007) +[2023-10-13 01:47:07,561][46663] Updated weights for policy 1, policy_version 37251 (0.0009) +[2023-10-13 01:47:07,654][46662] Updated weights for policy 0, policy_version 37290 (0.0009) +[2023-10-13 01:47:07,914][46663] Updated weights for policy 1, policy_version 37261 (0.0009) +[2023-10-13 01:47:08,023][46662] Updated weights for policy 0, policy_version 37300 (0.0007) +[2023-10-13 01:47:08,279][46663] Updated weights for policy 1, policy_version 37271 (0.0010) +[2023-10-13 01:47:08,398][46662] Updated weights for policy 0, policy_version 37310 (0.0008) +[2023-10-13 01:47:08,606][45375] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76382208. Throughput: 0: 1659.7, 1: 1666.1. Samples: 19098880. 
Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-13 01:47:08,607][45375] Avg episode reward: [(0, '46.820'), (1, '52.950')] +[2023-10-13 01:47:12,359][46663] Updated weights for policy 1, policy_version 37281 (0.0009) +[2023-10-13 01:47:12,565][46662] Updated weights for policy 0, policy_version 37320 (0.0007) +[2023-10-13 01:47:12,721][46663] Updated weights for policy 1, policy_version 37291 (0.0009) +[2023-10-13 01:47:12,938][46662] Updated weights for policy 0, policy_version 37330 (0.0009) +[2023-10-13 01:47:13,089][46663] Updated weights for policy 1, policy_version 37301 (0.0007) +[2023-10-13 01:47:13,310][46662] Updated weights for policy 0, policy_version 37340 (0.0009) +[2023-10-13 01:47:13,453][46663] Updated weights for policy 1, policy_version 37311 (0.0007) +[2023-10-13 01:47:13,606][45375] Fps is (10 sec: 19661.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76447744. Throughput: 0: 1671.7, 1: 1683.2. Samples: 19109314. Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-13 01:47:13,607][45375] Avg episode reward: [(0, '47.260'), (1, '52.110')] +[2023-10-13 01:47:17,386][46662] Updated weights for policy 0, policy_version 37350 (0.0009) +[2023-10-13 01:47:17,427][46663] Updated weights for policy 1, policy_version 37321 (0.0010) +[2023-10-13 01:47:17,761][46662] Updated weights for policy 0, policy_version 37360 (0.0007) +[2023-10-13 01:47:17,790][46663] Updated weights for policy 1, policy_version 37331 (0.0008) +[2023-10-13 01:47:18,137][46662] Updated weights for policy 0, policy_version 37370 (0.0010) +[2023-10-13 01:47:18,161][46663] Updated weights for policy 1, policy_version 37341 (0.0008) +[2023-10-13 01:47:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76513280. Throughput: 0: 1668.7, 1: 1685.7. Samples: 19129896. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:47:18,607][45375] Avg episode reward: [(0, '48.860'), (1, '51.650')] +[2023-10-13 01:47:22,034][46662] Updated weights for policy 0, policy_version 37380 (0.0007) +[2023-10-13 01:47:22,410][46662] Updated weights for policy 0, policy_version 37390 (0.0010) +[2023-10-13 01:47:22,420][46663] Updated weights for policy 1, policy_version 37351 (0.0009) +[2023-10-13 01:47:22,775][46662] Updated weights for policy 0, policy_version 37400 (0.0007) +[2023-10-13 01:47:22,791][46663] Updated weights for policy 1, policy_version 37361 (0.0008) +[2023-10-13 01:47:23,161][46663] Updated weights for policy 1, policy_version 37371 (0.0007) +[2023-10-13 01:47:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 76578816. Throughput: 0: 1657.6, 1: 1663.6. Samples: 19148832. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:47:23,608][45375] Avg episode reward: [(0, '47.790'), (1, '52.850')] +[2023-10-13 01:47:26,891][46662] Updated weights for policy 0, policy_version 37410 (0.0008) +[2023-10-13 01:47:27,268][46662] Updated weights for policy 0, policy_version 37420 (0.0008) +[2023-10-13 01:47:27,291][46663] Updated weights for policy 1, policy_version 37381 (0.0008) +[2023-10-13 01:47:27,642][46662] Updated weights for policy 0, policy_version 37430 (0.0007) +[2023-10-13 01:47:27,652][46663] Updated weights for policy 1, policy_version 37391 (0.0010) +[2023-10-13 01:47:28,010][46662] Updated weights for policy 0, policy_version 37440 (0.0009) +[2023-10-13 01:47:28,015][46663] Updated weights for policy 1, policy_version 37401 (0.0008) +[2023-10-13 01:47:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76644352. Throughput: 0: 1674.7, 1: 1677.6. Samples: 19159830. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:47:28,607][45375] Avg episode reward: [(0, '47.150'), (1, '51.170')] +[2023-10-13 01:47:32,081][46662] Updated weights for policy 0, policy_version 37450 (0.0008) +[2023-10-13 01:47:32,177][46663] Updated weights for policy 1, policy_version 37411 (0.0009) +[2023-10-13 01:47:32,448][46662] Updated weights for policy 0, policy_version 37460 (0.0007) +[2023-10-13 01:47:32,549][46663] Updated weights for policy 1, policy_version 37421 (0.0010) +[2023-10-13 01:47:32,819][46662] Updated weights for policy 0, policy_version 37470 (0.0007) +[2023-10-13 01:47:32,900][46663] Updated weights for policy 1, policy_version 37431 (0.0008) +[2023-10-13 01:47:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76709888. Throughput: 0: 1670.3, 1: 1673.2. Samples: 19180172. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:47:33,607][45375] Avg episode reward: [(0, '46.150'), (1, '51.410')] +[2023-10-13 01:47:36,854][46662] Updated weights for policy 0, policy_version 37480 (0.0007) +[2023-10-13 01:47:37,185][46663] Updated weights for policy 1, policy_version 37441 (0.0008) +[2023-10-13 01:47:37,223][46662] Updated weights for policy 0, policy_version 37490 (0.0008) +[2023-10-13 01:47:37,571][46663] Updated weights for policy 1, policy_version 37451 (0.0007) +[2023-10-13 01:47:37,600][46662] Updated weights for policy 0, policy_version 37500 (0.0008) +[2023-10-13 01:47:37,934][46663] Updated weights for policy 1, policy_version 37461 (0.0007) +[2023-10-13 01:47:38,297][46663] Updated weights for policy 1, policy_version 37471 (0.0007) +[2023-10-13 01:47:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76775424. Throughput: 0: 1648.8, 1: 1664.3. Samples: 19198332. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 01:47:38,607][45375] Avg episode reward: [(0, '47.550'), (1, '51.160')] +[2023-10-13 01:47:41,824][46662] Updated weights for policy 0, policy_version 37510 (0.0009) +[2023-10-13 01:47:42,196][46662] Updated weights for policy 0, policy_version 37520 (0.0008) +[2023-10-13 01:47:42,307][46663] Updated weights for policy 1, policy_version 37481 (0.0008) +[2023-10-13 01:47:42,565][46662] Updated weights for policy 0, policy_version 37530 (0.0008) +[2023-10-13 01:47:42,672][46663] Updated weights for policy 1, policy_version 37491 (0.0009) +[2023-10-13 01:47:43,056][46663] Updated weights for policy 1, policy_version 37501 (0.0011) +[2023-10-13 01:47:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76840960. Throughput: 0: 1671.5, 1: 1682.1. Samples: 19209902. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 01:47:43,607][45375] Avg episode reward: [(0, '47.900'), (1, '51.390')] +[2023-10-13 01:47:46,495][46662] Updated weights for policy 0, policy_version 37540 (0.0008) +[2023-10-13 01:47:46,862][46662] Updated weights for policy 0, policy_version 37550 (0.0007) +[2023-10-13 01:47:47,011][46663] Updated weights for policy 1, policy_version 37511 (0.0008) +[2023-10-13 01:47:47,233][46662] Updated weights for policy 0, policy_version 37560 (0.0007) +[2023-10-13 01:47:47,383][46663] Updated weights for policy 1, policy_version 37521 (0.0007) +[2023-10-13 01:47:47,741][46663] Updated weights for policy 1, policy_version 37531 (0.0009) +[2023-10-13 01:47:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76906496. Throughput: 0: 1667.2, 1: 1665.5. Samples: 19229782. 
Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 01:47:48,607][45375] Avg episode reward: [(0, '48.250'), (1, '50.560')] +[2023-10-13 01:47:51,406][46662] Updated weights for policy 0, policy_version 37570 (0.0008) +[2023-10-13 01:47:51,771][46662] Updated weights for policy 0, policy_version 37580 (0.0007) +[2023-10-13 01:47:51,840][46663] Updated weights for policy 1, policy_version 37541 (0.0009) +[2023-10-13 01:47:52,145][46662] Updated weights for policy 0, policy_version 37590 (0.0007) +[2023-10-13 01:47:52,213][46663] Updated weights for policy 1, policy_version 37551 (0.0007) +[2023-10-13 01:47:52,507][46662] Updated weights for policy 0, policy_version 37600 (0.0008) +[2023-10-13 01:47:52,573][46663] Updated weights for policy 1, policy_version 37561 (0.0008) +[2023-10-13 01:47:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76972032. Throughput: 0: 1660.8, 1: 1674.0. Samples: 19248950. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 01:47:53,607][45375] Avg episode reward: [(0, '48.840'), (1, '48.690')] +[2023-10-13 01:47:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000037600_38502400.pth... +[2023-10-13 01:47:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000037568_38469632.pth... 
+[2023-10-13 01:47:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000036032_36896768.pth +[2023-10-13 01:47:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000036000_36864000.pth +[2023-10-13 01:47:56,611][46663] Updated weights for policy 1, policy_version 37571 (0.0009) +[2023-10-13 01:47:56,664][46662] Updated weights for policy 0, policy_version 37610 (0.0009) +[2023-10-13 01:47:56,972][46663] Updated weights for policy 1, policy_version 37581 (0.0008) +[2023-10-13 01:47:57,043][46662] Updated weights for policy 0, policy_version 37620 (0.0009) +[2023-10-13 01:47:57,345][46663] Updated weights for policy 1, policy_version 37591 (0.0007) +[2023-10-13 01:47:57,407][46662] Updated weights for policy 0, policy_version 37630 (0.0008) +[2023-10-13 01:47:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 77037568. Throughput: 0: 1682.4, 1: 1678.8. Samples: 19260572. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 01:47:58,607][45375] Avg episode reward: [(0, '48.130'), (1, '49.420')] +[2023-10-13 01:48:01,438][46663] Updated weights for policy 1, policy_version 37601 (0.0010) +[2023-10-13 01:48:01,487][46662] Updated weights for policy 0, policy_version 37640 (0.0009) +[2023-10-13 01:48:01,802][46663] Updated weights for policy 1, policy_version 37611 (0.0008) +[2023-10-13 01:48:01,862][46662] Updated weights for policy 0, policy_version 37650 (0.0007) +[2023-10-13 01:48:02,171][46663] Updated weights for policy 1, policy_version 37621 (0.0007) +[2023-10-13 01:48:02,221][46662] Updated weights for policy 0, policy_version 37660 (0.0007) +[2023-10-13 01:48:02,534][46663] Updated weights for policy 1, policy_version 37631 (0.0008) +[2023-10-13 01:48:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 77103104. Throughput: 0: 1665.8, 1: 1653.3. Samples: 19279258. 
Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 01:48:03,607][45375] Avg episode reward: [(0, '48.920'), (1, '49.730')] +[2023-10-13 01:48:06,412][46662] Updated weights for policy 0, policy_version 37670 (0.0008) +[2023-10-13 01:48:06,555][46663] Updated weights for policy 1, policy_version 37641 (0.0008) +[2023-10-13 01:48:06,793][46662] Updated weights for policy 0, policy_version 37680 (0.0008) +[2023-10-13 01:48:06,925][46663] Updated weights for policy 1, policy_version 37651 (0.0007) +[2023-10-13 01:48:07,165][46662] Updated weights for policy 0, policy_version 37690 (0.0009) +[2023-10-13 01:48:07,296][46663] Updated weights for policy 1, policy_version 37661 (0.0008) +[2023-10-13 01:48:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 77168640. Throughput: 0: 1664.6, 1: 1673.0. Samples: 19299022. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) +[2023-10-13 01:48:08,607][45375] Avg episode reward: [(0, '47.460'), (1, '50.400')] +[2023-10-13 01:48:11,269][46662] Updated weights for policy 0, policy_version 37700 (0.0007) +[2023-10-13 01:48:11,504][46663] Updated weights for policy 1, policy_version 37671 (0.0007) +[2023-10-13 01:48:11,649][46662] Updated weights for policy 0, policy_version 37710 (0.0009) +[2023-10-13 01:48:11,873][46663] Updated weights for policy 1, policy_version 37681 (0.0008) +[2023-10-13 01:48:12,020][46662] Updated weights for policy 0, policy_version 37720 (0.0009) +[2023-10-13 01:48:12,237][46663] Updated weights for policy 1, policy_version 37691 (0.0008) +[2023-10-13 01:48:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 77234176. Throughput: 0: 1673.1, 1: 1675.1. Samples: 19310500. 
Policy #0 lag: (min: 7.0, avg: 12.9, max: 39.0) +[2023-10-13 01:48:13,607][45375] Avg episode reward: [(0, '46.410'), (1, '49.330')] +[2023-10-13 01:48:16,065][46662] Updated weights for policy 0, policy_version 37730 (0.0008) +[2023-10-13 01:48:16,107][46663] Updated weights for policy 1, policy_version 37701 (0.0007) +[2023-10-13 01:48:16,439][46662] Updated weights for policy 0, policy_version 37740 (0.0008) +[2023-10-13 01:48:16,479][46663] Updated weights for policy 1, policy_version 37711 (0.0007) +[2023-10-13 01:48:16,818][46662] Updated weights for policy 0, policy_version 37750 (0.0008) +[2023-10-13 01:48:16,845][46663] Updated weights for policy 1, policy_version 37721 (0.0007) +[2023-10-13 01:48:17,187][46662] Updated weights for policy 0, policy_version 37760 (0.0007) +[2023-10-13 01:48:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77299712. Throughput: 0: 1657.6, 1: 1656.0. Samples: 19329286. Policy #0 lag: (min: 7.0, avg: 12.9, max: 39.0) +[2023-10-13 01:48:18,607][45375] Avg episode reward: [(0, '45.910'), (1, '50.160')] +[2023-10-13 01:48:21,081][46662] Updated weights for policy 0, policy_version 37770 (0.0010) +[2023-10-13 01:48:21,106][46663] Updated weights for policy 1, policy_version 37731 (0.0009) +[2023-10-13 01:48:21,445][46662] Updated weights for policy 0, policy_version 37780 (0.0008) +[2023-10-13 01:48:21,466][46663] Updated weights for policy 1, policy_version 37741 (0.0007) +[2023-10-13 01:48:21,819][46662] Updated weights for policy 0, policy_version 37790 (0.0007) +[2023-10-13 01:48:21,830][46663] Updated weights for policy 1, policy_version 37751 (0.0009) +[2023-10-13 01:48:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77365248. Throughput: 0: 1675.4, 1: 1679.2. Samples: 19349286. 
Policy #0 lag: (min: 7.0, avg: 12.9, max: 39.0) +[2023-10-13 01:48:23,608][45375] Avg episode reward: [(0, '46.700'), (1, '49.610')] +[2023-10-13 01:48:25,953][46662] Updated weights for policy 0, policy_version 37800 (0.0007) +[2023-10-13 01:48:25,977][46663] Updated weights for policy 1, policy_version 37761 (0.0010) +[2023-10-13 01:48:26,328][46662] Updated weights for policy 0, policy_version 37810 (0.0009) +[2023-10-13 01:48:26,397][46663] Updated weights for policy 1, policy_version 37771 (0.0008) +[2023-10-13 01:48:26,703][46662] Updated weights for policy 0, policy_version 37820 (0.0009) +[2023-10-13 01:48:26,765][46663] Updated weights for policy 1, policy_version 37781 (0.0007) +[2023-10-13 01:48:27,137][46663] Updated weights for policy 1, policy_version 37791 (0.0008) +[2023-10-13 01:48:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77430784. Throughput: 0: 1674.7, 1: 1668.4. Samples: 19360342. Policy #0 lag: (min: 7.0, avg: 12.9, max: 39.0) +[2023-10-13 01:48:28,607][45375] Avg episode reward: [(0, '45.720'), (1, '48.600')] +[2023-10-13 01:48:30,812][46662] Updated weights for policy 0, policy_version 37830 (0.0008) +[2023-10-13 01:48:31,149][46663] Updated weights for policy 1, policy_version 37801 (0.0009) +[2023-10-13 01:48:31,181][46662] Updated weights for policy 0, policy_version 37840 (0.0008) +[2023-10-13 01:48:31,514][46663] Updated weights for policy 1, policy_version 37811 (0.0009) +[2023-10-13 01:48:31,556][46662] Updated weights for policy 0, policy_version 37850 (0.0008) +[2023-10-13 01:48:31,889][46663] Updated weights for policy 1, policy_version 37821 (0.0010) +[2023-10-13 01:48:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77496320. Throughput: 0: 1657.4, 1: 1661.9. Samples: 19379152. 
Policy #0 lag: (min: 7.0, avg: 12.9, max: 39.0) +[2023-10-13 01:48:33,608][45375] Avg episode reward: [(0, '46.040'), (1, '49.870')] +[2023-10-13 01:48:35,697][46662] Updated weights for policy 0, policy_version 37860 (0.0008) +[2023-10-13 01:48:35,904][46663] Updated weights for policy 1, policy_version 37831 (0.0008) +[2023-10-13 01:48:36,063][46662] Updated weights for policy 0, policy_version 37870 (0.0009) +[2023-10-13 01:48:36,278][46663] Updated weights for policy 1, policy_version 37841 (0.0009) +[2023-10-13 01:48:36,438][46662] Updated weights for policy 0, policy_version 37880 (0.0009) +[2023-10-13 01:48:36,647][46663] Updated weights for policy 1, policy_version 37851 (0.0009) +[2023-10-13 01:48:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77561856. Throughput: 0: 1684.0, 1: 1675.7. Samples: 19400134. Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) +[2023-10-13 01:48:38,607][45375] Avg episode reward: [(0, '44.990'), (1, '48.800')] +[2023-10-13 01:48:40,352][46662] Updated weights for policy 0, policy_version 37890 (0.0008) +[2023-10-13 01:48:40,696][46663] Updated weights for policy 1, policy_version 37861 (0.0008) +[2023-10-13 01:48:40,728][46662] Updated weights for policy 0, policy_version 37900 (0.0008) +[2023-10-13 01:48:41,060][46663] Updated weights for policy 1, policy_version 37871 (0.0009) +[2023-10-13 01:48:41,105][46662] Updated weights for policy 0, policy_version 37910 (0.0008) +[2023-10-13 01:48:41,426][46663] Updated weights for policy 1, policy_version 37881 (0.0007) +[2023-10-13 01:48:41,470][46662] Updated weights for policy 0, policy_version 37920 (0.0009) +[2023-10-13 01:48:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77627392. Throughput: 0: 1671.8, 1: 1658.2. Samples: 19410424. 
Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) +[2023-10-13 01:48:43,607][45375] Avg episode reward: [(0, '45.750'), (1, '48.930')] +[2023-10-13 01:48:45,408][46663] Updated weights for policy 1, policy_version 37891 (0.0007) +[2023-10-13 01:48:45,689][46662] Updated weights for policy 0, policy_version 37930 (0.0008) +[2023-10-13 01:48:45,777][46663] Updated weights for policy 1, policy_version 37901 (0.0008) +[2023-10-13 01:48:46,056][46662] Updated weights for policy 0, policy_version 37940 (0.0007) +[2023-10-13 01:48:46,141][46663] Updated weights for policy 1, policy_version 37911 (0.0009) +[2023-10-13 01:48:46,428][46662] Updated weights for policy 0, policy_version 37950 (0.0008) +[2023-10-13 01:48:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77692928. Throughput: 0: 1667.3, 1: 1680.3. Samples: 19429900. Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) +[2023-10-13 01:48:48,607][45375] Avg episode reward: [(0, '46.880'), (1, '50.100')] +[2023-10-13 01:48:50,220][46663] Updated weights for policy 1, policy_version 37921 (0.0007) +[2023-10-13 01:48:50,586][46663] Updated weights for policy 1, policy_version 37931 (0.0008) +[2023-10-13 01:48:50,601][46662] Updated weights for policy 0, policy_version 37960 (0.0009) +[2023-10-13 01:48:50,953][46663] Updated weights for policy 1, policy_version 37941 (0.0010) +[2023-10-13 01:48:50,961][46662] Updated weights for policy 0, policy_version 37970 (0.0007) +[2023-10-13 01:48:51,323][46663] Updated weights for policy 1, policy_version 37951 (0.0008) +[2023-10-13 01:48:51,328][46662] Updated weights for policy 0, policy_version 37980 (0.0007) +[2023-10-13 01:48:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 77758464. Throughput: 0: 1684.8, 1: 1682.9. Samples: 19450568. 
Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) +[2023-10-13 01:48:53,608][45375] Avg episode reward: [(0, '47.200'), (1, '49.710')] +[2023-10-13 01:48:55,398][46663] Updated weights for policy 1, policy_version 37961 (0.0009) +[2023-10-13 01:48:55,450][46662] Updated weights for policy 0, policy_version 37990 (0.0008) +[2023-10-13 01:48:55,760][46663] Updated weights for policy 1, policy_version 37971 (0.0008) +[2023-10-13 01:48:55,814][46662] Updated weights for policy 0, policy_version 38000 (0.0008) +[2023-10-13 01:48:56,131][46663] Updated weights for policy 1, policy_version 37981 (0.0009) +[2023-10-13 01:48:56,174][46662] Updated weights for policy 0, policy_version 38010 (0.0007) +[2023-10-13 01:48:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77824000. Throughput: 0: 1672.6, 1: 1656.0. Samples: 19460286. Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) +[2023-10-13 01:48:58,607][45375] Avg episode reward: [(0, '47.560'), (1, '51.220')] +[2023-10-13 01:49:00,213][46663] Updated weights for policy 1, policy_version 37991 (0.0008) +[2023-10-13 01:49:00,328][46662] Updated weights for policy 0, policy_version 38020 (0.0009) +[2023-10-13 01:49:00,586][46663] Updated weights for policy 1, policy_version 38001 (0.0008) +[2023-10-13 01:49:00,697][46662] Updated weights for policy 0, policy_version 38030 (0.0009) +[2023-10-13 01:49:00,956][46663] Updated weights for policy 1, policy_version 38011 (0.0008) +[2023-10-13 01:49:01,066][46662] Updated weights for policy 0, policy_version 38040 (0.0007) +[2023-10-13 01:49:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77889536. Throughput: 0: 1673.9, 1: 1679.8. Samples: 19480202. 
Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) +[2023-10-13 01:49:03,607][45375] Avg episode reward: [(0, '48.250'), (1, '50.340')] +[2023-10-13 01:49:04,970][46663] Updated weights for policy 1, policy_version 38021 (0.0007) +[2023-10-13 01:49:04,978][46662] Updated weights for policy 0, policy_version 38050 (0.0009) +[2023-10-13 01:49:05,329][46663] Updated weights for policy 1, policy_version 38031 (0.0008) +[2023-10-13 01:49:05,345][46662] Updated weights for policy 0, policy_version 38060 (0.0009) +[2023-10-13 01:49:05,697][46663] Updated weights for policy 1, policy_version 38041 (0.0007) +[2023-10-13 01:49:05,708][46662] Updated weights for policy 0, policy_version 38070 (0.0008) +[2023-10-13 01:49:06,076][46662] Updated weights for policy 0, policy_version 38080 (0.0008) +[2023-10-13 01:49:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77955072. Throughput: 0: 1684.2, 1: 1689.1. Samples: 19501082. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 01:49:08,607][45375] Avg episode reward: [(0, '48.690'), (1, '49.120')] +[2023-10-13 01:49:09,921][46662] Updated weights for policy 0, policy_version 38090 (0.0008) +[2023-10-13 01:49:09,993][46663] Updated weights for policy 1, policy_version 38051 (0.0008) +[2023-10-13 01:49:10,285][46662] Updated weights for policy 0, policy_version 38100 (0.0008) +[2023-10-13 01:49:10,357][46663] Updated weights for policy 1, policy_version 38061 (0.0008) +[2023-10-13 01:49:10,656][46662] Updated weights for policy 0, policy_version 38110 (0.0008) +[2023-10-13 01:49:10,723][46663] Updated weights for policy 1, policy_version 38071 (0.0008) +[2023-10-13 01:49:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78020608. Throughput: 0: 1660.2, 1: 1668.4. Samples: 19510128. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 01:49:13,608][45375] Avg episode reward: [(0, '49.170'), (1, '48.860')] +[2023-10-13 01:49:14,675][46663] Updated weights for policy 1, policy_version 38081 (0.0010) +[2023-10-13 01:49:14,861][46662] Updated weights for policy 0, policy_version 38120 (0.0008) +[2023-10-13 01:49:15,110][46663] Updated weights for policy 1, policy_version 38091 (0.0010) +[2023-10-13 01:49:15,228][46662] Updated weights for policy 0, policy_version 38130 (0.0009) +[2023-10-13 01:49:15,479][46663] Updated weights for policy 1, policy_version 38101 (0.0007) +[2023-10-13 01:49:15,600][46662] Updated weights for policy 0, policy_version 38140 (0.0008) +[2023-10-13 01:49:15,850][46663] Updated weights for policy 1, policy_version 38111 (0.0008) +[2023-10-13 01:49:18,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78086144. Throughput: 0: 1678.0, 1: 1690.2. Samples: 19530720. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 01:49:18,608][45375] Avg episode reward: [(0, '49.870'), (1, '49.890')] +[2023-10-13 01:49:19,727][46662] Updated weights for policy 0, policy_version 38150 (0.0008) +[2023-10-13 01:49:19,821][46663] Updated weights for policy 1, policy_version 38121 (0.0007) +[2023-10-13 01:49:20,099][46662] Updated weights for policy 0, policy_version 38160 (0.0009) +[2023-10-13 01:49:20,187][46663] Updated weights for policy 1, policy_version 38131 (0.0009) +[2023-10-13 01:49:20,483][46662] Updated weights for policy 0, policy_version 38170 (0.0009) +[2023-10-13 01:49:20,553][46663] Updated weights for policy 1, policy_version 38141 (0.0009) +[2023-10-13 01:49:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78151680. Throughput: 0: 1673.6, 1: 1687.4. Samples: 19551380. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 01:49:23,607][45375] Avg episode reward: [(0, '49.480'), (1, '49.390')] +[2023-10-13 01:49:24,459][46662] Updated weights for policy 0, policy_version 38180 (0.0009) +[2023-10-13 01:49:24,596][46663] Updated weights for policy 1, policy_version 38151 (0.0008) +[2023-10-13 01:49:24,822][46662] Updated weights for policy 0, policy_version 38190 (0.0008) +[2023-10-13 01:49:24,973][46663] Updated weights for policy 1, policy_version 38161 (0.0008) +[2023-10-13 01:49:25,202][46662] Updated weights for policy 0, policy_version 38200 (0.0010) +[2023-10-13 01:49:25,337][46663] Updated weights for policy 1, policy_version 38171 (0.0008) +[2023-10-13 01:49:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78217216. Throughput: 0: 1654.1, 1: 1681.2. Samples: 19560512. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 01:49:28,607][45375] Avg episode reward: [(0, '50.520'), (1, '49.850')] +[2023-10-13 01:49:29,269][46663] Updated weights for policy 1, policy_version 38181 (0.0008) +[2023-10-13 01:49:29,323][46662] Updated weights for policy 0, policy_version 38210 (0.0010) +[2023-10-13 01:49:29,632][46663] Updated weights for policy 1, policy_version 38191 (0.0007) +[2023-10-13 01:49:29,681][46662] Updated weights for policy 0, policy_version 38220 (0.0008) +[2023-10-13 01:49:30,008][46663] Updated weights for policy 1, policy_version 38201 (0.0007) +[2023-10-13 01:49:30,054][46662] Updated weights for policy 0, policy_version 38230 (0.0007) +[2023-10-13 01:49:30,430][46662] Updated weights for policy 0, policy_version 38240 (0.0007) +[2023-10-13 01:49:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78282752. Throughput: 0: 1672.9, 1: 1684.1. Samples: 19580966. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-13 01:49:33,607][45375] Avg episode reward: [(0, '52.820'), (1, '51.530')] +[2023-10-13 01:49:34,163][46663] Updated weights for policy 1, policy_version 38211 (0.0007) +[2023-10-13 01:49:34,529][46663] Updated weights for policy 1, policy_version 38221 (0.0008) +[2023-10-13 01:49:34,763][46662] Updated weights for policy 0, policy_version 38250 (0.0007) +[2023-10-13 01:49:34,903][46663] Updated weights for policy 1, policy_version 38231 (0.0007) +[2023-10-13 01:49:35,126][46662] Updated weights for policy 0, policy_version 38260 (0.0007) +[2023-10-13 01:49:35,495][46662] Updated weights for policy 0, policy_version 38270 (0.0009) +[2023-10-13 01:49:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78348288. Throughput: 0: 1666.8, 1: 1678.9. Samples: 19601120. Policy #0 lag: (min: 10.0, avg: 10.7, max: 28.0) +[2023-10-13 01:49:38,607][45375] Avg episode reward: [(0, '53.030'), (1, '52.550')] +[2023-10-13 01:49:39,047][46663] Updated weights for policy 1, policy_version 38241 (0.0008) +[2023-10-13 01:49:39,410][46663] Updated weights for policy 1, policy_version 38251 (0.0009) +[2023-10-13 01:49:39,588][46662] Updated weights for policy 0, policy_version 38280 (0.0008) +[2023-10-13 01:49:39,782][46663] Updated weights for policy 1, policy_version 38261 (0.0007) +[2023-10-13 01:49:39,962][46662] Updated weights for policy 0, policy_version 38290 (0.0007) +[2023-10-13 01:49:40,140][46663] Updated weights for policy 1, policy_version 38271 (0.0009) +[2023-10-13 01:49:40,332][46662] Updated weights for policy 0, policy_version 38300 (0.0008) +[2023-10-13 01:49:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78413824. Throughput: 0: 1653.3, 1: 1678.8. Samples: 19610228. 
Policy #0 lag: (min: 10.0, avg: 10.7, max: 28.0) +[2023-10-13 01:49:43,607][45375] Avg episode reward: [(0, '51.970'), (1, '52.630')] +[2023-10-13 01:49:44,154][46663] Updated weights for policy 1, policy_version 38281 (0.0009) +[2023-10-13 01:49:44,348][46662] Updated weights for policy 0, policy_version 38310 (0.0007) +[2023-10-13 01:49:44,527][46663] Updated weights for policy 1, policy_version 38291 (0.0008) +[2023-10-13 01:49:44,713][46662] Updated weights for policy 0, policy_version 38320 (0.0009) +[2023-10-13 01:49:44,888][46663] Updated weights for policy 1, policy_version 38301 (0.0007) +[2023-10-13 01:49:45,085][46662] Updated weights for policy 0, policy_version 38330 (0.0007) +[2023-10-13 01:49:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78479360. Throughput: 0: 1667.0, 1: 1677.5. Samples: 19630706. Policy #0 lag: (min: 10.0, avg: 10.7, max: 28.0) +[2023-10-13 01:49:48,607][45375] Avg episode reward: [(0, '51.750'), (1, '52.770')] +[2023-10-13 01:49:49,176][46662] Updated weights for policy 0, policy_version 38340 (0.0007) +[2023-10-13 01:49:49,262][46663] Updated weights for policy 1, policy_version 38311 (0.0007) +[2023-10-13 01:49:49,539][46662] Updated weights for policy 0, policy_version 38350 (0.0008) +[2023-10-13 01:49:49,629][46663] Updated weights for policy 1, policy_version 38321 (0.0008) +[2023-10-13 01:49:49,906][46662] Updated weights for policy 0, policy_version 38360 (0.0007) +[2023-10-13 01:49:49,990][46663] Updated weights for policy 1, policy_version 38331 (0.0008) +[2023-10-13 01:49:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 78544896. Throughput: 0: 1668.7, 1: 1669.8. Samples: 19651314. 
Policy #0 lag: (min: 10.0, avg: 10.7, max: 28.0) +[2023-10-13 01:49:53,608][45375] Avg episode reward: [(0, '49.350'), (1, '52.410')] +[2023-10-13 01:49:53,621][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000038336_39256064.pth... +[2023-10-13 01:49:53,621][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000038368_39288832.pth... +[2023-10-13 01:49:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000036768_37650432.pth +[2023-10-13 01:49:53,661][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000036800_37683200.pth +[2023-10-13 01:49:54,035][46662] Updated weights for policy 0, policy_version 38370 (0.0007) +[2023-10-13 01:49:54,081][46663] Updated weights for policy 1, policy_version 38341 (0.0008) +[2023-10-13 01:49:54,407][46662] Updated weights for policy 0, policy_version 38380 (0.0007) +[2023-10-13 01:49:54,453][46663] Updated weights for policy 1, policy_version 38351 (0.0009) +[2023-10-13 01:49:54,773][46662] Updated weights for policy 0, policy_version 38390 (0.0007) +[2023-10-13 01:49:54,820][46663] Updated weights for policy 1, policy_version 38361 (0.0009) +[2023-10-13 01:49:55,134][46662] Updated weights for policy 0, policy_version 38400 (0.0008) +[2023-10-13 01:49:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78610432. Throughput: 0: 1670.0, 1: 1670.5. Samples: 19660450. 
Policy #0 lag: (min: 10.0, avg: 10.7, max: 28.0) +[2023-10-13 01:49:58,607][45375] Avg episode reward: [(0, '50.570'), (1, '52.070')] +[2023-10-13 01:49:58,908][46663] Updated weights for policy 1, policy_version 38371 (0.0008) +[2023-10-13 01:49:59,031][46662] Updated weights for policy 0, policy_version 38410 (0.0009) +[2023-10-13 01:49:59,272][46663] Updated weights for policy 1, policy_version 38381 (0.0007) +[2023-10-13 01:49:59,400][46662] Updated weights for policy 0, policy_version 38420 (0.0007) +[2023-10-13 01:49:59,648][46663] Updated weights for policy 1, policy_version 38391 (0.0010) +[2023-10-13 01:49:59,766][46662] Updated weights for policy 0, policy_version 38430 (0.0007) +[2023-10-13 01:50:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78675968. Throughput: 0: 1679.7, 1: 1671.5. Samples: 19681524. Policy #0 lag: (min: 10.0, avg: 10.7, max: 28.0) +[2023-10-13 01:50:03,608][45375] Avg episode reward: [(0, '48.960'), (1, '51.060')] +[2023-10-13 01:50:03,854][46663] Updated weights for policy 1, policy_version 38401 (0.0009) +[2023-10-13 01:50:03,855][46662] Updated weights for policy 0, policy_version 38440 (0.0010) +[2023-10-13 01:50:04,221][46662] Updated weights for policy 0, policy_version 38450 (0.0008) +[2023-10-13 01:50:04,288][46663] Updated weights for policy 1, policy_version 38411 (0.0009) +[2023-10-13 01:50:04,589][46662] Updated weights for policy 0, policy_version 38460 (0.0008) +[2023-10-13 01:50:04,662][46663] Updated weights for policy 1, policy_version 38421 (0.0009) +[2023-10-13 01:50:05,036][46663] Updated weights for policy 1, policy_version 38431 (0.0008) +[2023-10-13 01:50:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78741504. Throughput: 0: 1681.3, 1: 1668.3. Samples: 19702112. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:50:08,607][45375] Avg episode reward: [(0, '47.250'), (1, '51.270')] +[2023-10-13 01:50:08,614][46662] Updated weights for policy 0, policy_version 38470 (0.0009) +[2023-10-13 01:50:08,922][46663] Updated weights for policy 1, policy_version 38441 (0.0010) +[2023-10-13 01:50:08,976][46662] Updated weights for policy 0, policy_version 38480 (0.0007) +[2023-10-13 01:50:09,283][46663] Updated weights for policy 1, policy_version 38451 (0.0009) +[2023-10-13 01:50:09,345][46662] Updated weights for policy 0, policy_version 38490 (0.0007) +[2023-10-13 01:50:09,655][46663] Updated weights for policy 1, policy_version 38461 (0.0009) +[2023-10-13 01:50:13,221][46662] Updated weights for policy 0, policy_version 38500 (0.0007) +[2023-10-13 01:50:13,587][46662] Updated weights for policy 0, policy_version 38510 (0.0009) +[2023-10-13 01:50:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78807040. Throughput: 0: 1686.3, 1: 1664.3. Samples: 19711290. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:50:13,607][45375] Avg episode reward: [(0, '47.540'), (1, '52.720')] +[2023-10-13 01:50:13,785][46663] Updated weights for policy 1, policy_version 38471 (0.0008) +[2023-10-13 01:50:13,955][46662] Updated weights for policy 0, policy_version 38520 (0.0008) +[2023-10-13 01:50:14,141][46663] Updated weights for policy 1, policy_version 38481 (0.0009) +[2023-10-13 01:50:14,519][46663] Updated weights for policy 1, policy_version 38491 (0.0007) +[2023-10-13 01:50:18,040][46662] Updated weights for policy 0, policy_version 38530 (0.0008) +[2023-10-13 01:50:18,409][46662] Updated weights for policy 0, policy_version 38540 (0.0007) +[2023-10-13 01:50:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78872576. Throughput: 0: 1690.3, 1: 1662.3. Samples: 19731830. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:50:18,607][45375] Avg episode reward: [(0, '47.560'), (1, '52.480')] +[2023-10-13 01:50:18,644][46663] Updated weights for policy 1, policy_version 38501 (0.0007) +[2023-10-13 01:50:18,783][46662] Updated weights for policy 0, policy_version 38550 (0.0007) +[2023-10-13 01:50:19,003][46663] Updated weights for policy 1, policy_version 38511 (0.0009) +[2023-10-13 01:50:19,152][46662] Updated weights for policy 0, policy_version 38560 (0.0007) +[2023-10-13 01:50:19,368][46663] Updated weights for policy 1, policy_version 38521 (0.0008) +[2023-10-13 01:50:23,386][46663] Updated weights for policy 1, policy_version 38531 (0.0010) +[2023-10-13 01:50:23,393][46662] Updated weights for policy 0, policy_version 38570 (0.0007) +[2023-10-13 01:50:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 78938112. Throughput: 0: 1694.9, 1: 1669.0. Samples: 19752496. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:50:23,608][45375] Avg episode reward: [(0, '48.060'), (1, '51.160')] +[2023-10-13 01:50:23,750][46663] Updated weights for policy 1, policy_version 38541 (0.0007) +[2023-10-13 01:50:23,764][46662] Updated weights for policy 0, policy_version 38580 (0.0008) +[2023-10-13 01:50:24,119][46663] Updated weights for policy 1, policy_version 38551 (0.0007) +[2023-10-13 01:50:24,142][46662] Updated weights for policy 0, policy_version 38590 (0.0008) +[2023-10-13 01:50:28,131][46662] Updated weights for policy 0, policy_version 38600 (0.0008) +[2023-10-13 01:50:28,307][46663] Updated weights for policy 1, policy_version 38561 (0.0008) +[2023-10-13 01:50:28,503][46662] Updated weights for policy 0, policy_version 38610 (0.0007) +[2023-10-13 01:50:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79003648. Throughput: 0: 1691.8, 1: 1671.2. Samples: 19761562. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:50:28,607][45375] Avg episode reward: [(0, '46.980'), (1, '49.650')] +[2023-10-13 01:50:28,668][46663] Updated weights for policy 1, policy_version 38571 (0.0008) +[2023-10-13 01:50:28,875][46662] Updated weights for policy 0, policy_version 38620 (0.0008) +[2023-10-13 01:50:29,023][46663] Updated weights for policy 1, policy_version 38581 (0.0007) +[2023-10-13 01:50:29,392][46663] Updated weights for policy 1, policy_version 38591 (0.0007) +[2023-10-13 01:50:32,763][46662] Updated weights for policy 0, policy_version 38630 (0.0010) +[2023-10-13 01:50:33,125][46662] Updated weights for policy 0, policy_version 38640 (0.0009) +[2023-10-13 01:50:33,488][46662] Updated weights for policy 0, policy_version 38650 (0.0008) +[2023-10-13 01:50:33,584][46663] Updated weights for policy 1, policy_version 38601 (0.0009) +[2023-10-13 01:50:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79069184. Throughput: 0: 1698.3, 1: 1668.6. Samples: 19782216. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:50:33,608][45375] Avg episode reward: [(0, '46.270'), (1, '48.890')] +[2023-10-13 01:50:33,948][46663] Updated weights for policy 1, policy_version 38611 (0.0009) +[2023-10-13 01:50:34,325][46663] Updated weights for policy 1, policy_version 38621 (0.0009) +[2023-10-13 01:50:37,538][46662] Updated weights for policy 0, policy_version 38660 (0.0008) +[2023-10-13 01:50:37,909][46662] Updated weights for policy 0, policy_version 38670 (0.0008) +[2023-10-13 01:50:38,277][46662] Updated weights for policy 0, policy_version 38680 (0.0008) +[2023-10-13 01:50:38,424][46663] Updated weights for policy 1, policy_version 38631 (0.0009) +[2023-10-13 01:50:38,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79167488. Throughput: 0: 1686.2, 1: 1666.2. Samples: 19802174. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-10-13 01:50:38,607][45375] Avg episode reward: [(0, '46.190'), (1, '49.110')] +[2023-10-13 01:50:38,789][46663] Updated weights for policy 1, policy_version 38641 (0.0010) +[2023-10-13 01:50:39,159][46663] Updated weights for policy 1, policy_version 38651 (0.0009) +[2023-10-13 01:50:42,279][46662] Updated weights for policy 0, policy_version 38690 (0.0009) +[2023-10-13 01:50:42,647][46662] Updated weights for policy 0, policy_version 38700 (0.0008) +[2023-10-13 01:50:43,017][46662] Updated weights for policy 0, policy_version 38710 (0.0008) +[2023-10-13 01:50:43,247][46663] Updated weights for policy 1, policy_version 38661 (0.0008) +[2023-10-13 01:50:43,395][46662] Updated weights for policy 0, policy_version 38720 (0.0008) +[2023-10-13 01:50:43,607][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79233024. Throughput: 0: 1697.1, 1: 1667.0. Samples: 19811836. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-10-13 01:50:43,607][45375] Avg episode reward: [(0, '46.440'), (1, '47.440')] +[2023-10-13 01:50:43,620][46663] Updated weights for policy 1, policy_version 38671 (0.0007) +[2023-10-13 01:50:43,989][46663] Updated weights for policy 1, policy_version 38681 (0.0008) +[2023-10-13 01:50:47,767][46662] Updated weights for policy 0, policy_version 38730 (0.0008) +[2023-10-13 01:50:48,132][46663] Updated weights for policy 1, policy_version 38691 (0.0011) +[2023-10-13 01:50:48,136][46662] Updated weights for policy 0, policy_version 38740 (0.0010) +[2023-10-13 01:50:48,495][46663] Updated weights for policy 1, policy_version 38701 (0.0008) +[2023-10-13 01:50:48,499][46662] Updated weights for policy 0, policy_version 38750 (0.0008) +[2023-10-13 01:50:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 79298560. Throughput: 0: 1686.3, 1: 1665.7. Samples: 19832360. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-10-13 01:50:48,607][45375] Avg episode reward: [(0, '45.360'), (1, '47.100')] +[2023-10-13 01:50:48,858][46663] Updated weights for policy 1, policy_version 38711 (0.0010) +[2023-10-13 01:50:52,620][46662] Updated weights for policy 0, policy_version 38760 (0.0008) +[2023-10-13 01:50:52,977][46663] Updated weights for policy 1, policy_version 38721 (0.0009) +[2023-10-13 01:50:52,998][46662] Updated weights for policy 0, policy_version 38770 (0.0008) +[2023-10-13 01:50:53,362][46662] Updated weights for policy 0, policy_version 38780 (0.0007) +[2023-10-13 01:50:53,397][46663] Updated weights for policy 1, policy_version 38731 (0.0007) +[2023-10-13 01:50:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79364096. Throughput: 0: 1670.2, 1: 1659.5. Samples: 19851946. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-10-13 01:50:53,608][45375] Avg episode reward: [(0, '47.170'), (1, '47.460')] +[2023-10-13 01:50:53,765][46663] Updated weights for policy 1, policy_version 38741 (0.0007) +[2023-10-13 01:50:54,144][46663] Updated weights for policy 1, policy_version 38751 (0.0007) +[2023-10-13 01:50:57,497][46662] Updated weights for policy 0, policy_version 38790 (0.0008) +[2023-10-13 01:50:57,863][46662] Updated weights for policy 0, policy_version 38800 (0.0008) +[2023-10-13 01:50:58,120][46663] Updated weights for policy 1, policy_version 38761 (0.0007) +[2023-10-13 01:50:58,230][46662] Updated weights for policy 0, policy_version 38810 (0.0008) +[2023-10-13 01:50:58,491][46663] Updated weights for policy 1, policy_version 38771 (0.0008) +[2023-10-13 01:50:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79429632. Throughput: 0: 1677.2, 1: 1672.6. Samples: 19862030. 
Policy #0 lag: (min: 3.0, avg: 4.7, max: 31.0) +[2023-10-13 01:50:58,607][45375] Avg episode reward: [(0, '46.500'), (1, '45.920')] +[2023-10-13 01:50:58,852][46663] Updated weights for policy 1, policy_version 38781 (0.0009) +[2023-10-13 01:51:02,358][46662] Updated weights for policy 0, policy_version 38820 (0.0010) +[2023-10-13 01:51:02,731][46662] Updated weights for policy 0, policy_version 38830 (0.0009) +[2023-10-13 01:51:02,897][46663] Updated weights for policy 1, policy_version 38791 (0.0009) +[2023-10-13 01:51:03,107][46662] Updated weights for policy 0, policy_version 38840 (0.0009) +[2023-10-13 01:51:03,261][46663] Updated weights for policy 1, policy_version 38801 (0.0008) +[2023-10-13 01:51:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 79495168. Throughput: 0: 1674.5, 1: 1679.7. Samples: 19882770. Policy #0 lag: (min: 3.0, avg: 4.7, max: 31.0) +[2023-10-13 01:51:03,607][45375] Avg episode reward: [(0, '46.460'), (1, '48.150')] +[2023-10-13 01:51:03,633][46663] Updated weights for policy 1, policy_version 38811 (0.0008) +[2023-10-13 01:51:07,201][46662] Updated weights for policy 0, policy_version 38850 (0.0009) +[2023-10-13 01:51:07,579][46662] Updated weights for policy 0, policy_version 38860 (0.0009) +[2023-10-13 01:51:07,852][46663] Updated weights for policy 1, policy_version 38821 (0.0007) +[2023-10-13 01:51:07,944][46662] Updated weights for policy 0, policy_version 38870 (0.0010) +[2023-10-13 01:51:08,221][46663] Updated weights for policy 1, policy_version 38831 (0.0009) +[2023-10-13 01:51:08,309][46662] Updated weights for policy 0, policy_version 38880 (0.0009) +[2023-10-13 01:51:08,587][46663] Updated weights for policy 1, policy_version 38841 (0.0011) +[2023-10-13 01:51:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79560704. Throughput: 0: 1655.8, 1: 1658.8. Samples: 19901654. 
Policy #0 lag: (min: 3.0, avg: 4.7, max: 31.0) +[2023-10-13 01:51:08,607][45375] Avg episode reward: [(0, '45.520'), (1, '47.120')] +[2023-10-13 01:51:12,534][46662] Updated weights for policy 0, policy_version 38890 (0.0010) +[2023-10-13 01:51:12,613][46663] Updated weights for policy 1, policy_version 38851 (0.0009) +[2023-10-13 01:51:12,900][46662] Updated weights for policy 0, policy_version 38900 (0.0009) +[2023-10-13 01:51:12,989][46663] Updated weights for policy 1, policy_version 38861 (0.0010) +[2023-10-13 01:51:13,281][46662] Updated weights for policy 0, policy_version 38910 (0.0008) +[2023-10-13 01:51:13,354][46663] Updated weights for policy 1, policy_version 38871 (0.0009) +[2023-10-13 01:51:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79626240. Throughput: 0: 1673.9, 1: 1674.4. Samples: 19912236. Policy #0 lag: (min: 3.0, avg: 4.7, max: 31.0) +[2023-10-13 01:51:13,608][45375] Avg episode reward: [(0, '45.070'), (1, '47.470')] +[2023-10-13 01:51:17,336][46663] Updated weights for policy 1, policy_version 38881 (0.0008) +[2023-10-13 01:51:17,465][46662] Updated weights for policy 0, policy_version 38920 (0.0010) +[2023-10-13 01:51:17,702][46663] Updated weights for policy 1, policy_version 38891 (0.0008) +[2023-10-13 01:51:17,834][46662] Updated weights for policy 0, policy_version 38930 (0.0008) +[2023-10-13 01:51:18,067][46663] Updated weights for policy 1, policy_version 38901 (0.0011) +[2023-10-13 01:51:18,203][46662] Updated weights for policy 0, policy_version 38940 (0.0010) +[2023-10-13 01:51:18,437][46663] Updated weights for policy 1, policy_version 38911 (0.0009) +[2023-10-13 01:51:18,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 79724544. Throughput: 0: 1664.4, 1: 1679.3. Samples: 19932680. 
Policy #0 lag: (min: 3.0, avg: 4.7, max: 31.0) +[2023-10-13 01:51:18,607][45375] Avg episode reward: [(0, '45.910'), (1, '47.710')] +[2023-10-13 01:51:22,181][46662] Updated weights for policy 0, policy_version 38950 (0.0009) +[2023-10-13 01:51:22,508][46663] Updated weights for policy 1, policy_version 38921 (0.0007) +[2023-10-13 01:51:22,546][46662] Updated weights for policy 0, policy_version 38960 (0.0008) +[2023-10-13 01:51:22,872][46663] Updated weights for policy 1, policy_version 38931 (0.0009) +[2023-10-13 01:51:22,910][46662] Updated weights for policy 0, policy_version 38970 (0.0009) +[2023-10-13 01:51:23,234][46663] Updated weights for policy 1, policy_version 38941 (0.0009) +[2023-10-13 01:51:23,606][45375] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 79790080. Throughput: 0: 1655.8, 1: 1658.4. Samples: 19951314. Policy #0 lag: (min: 3.0, avg: 4.7, max: 31.0) +[2023-10-13 01:51:23,607][45375] Avg episode reward: [(0, '47.680'), (1, '48.790')] +[2023-10-13 01:51:27,090][46662] Updated weights for policy 0, policy_version 38980 (0.0008) +[2023-10-13 01:51:27,214][46663] Updated weights for policy 1, policy_version 38951 (0.0010) +[2023-10-13 01:51:27,460][46662] Updated weights for policy 0, policy_version 38990 (0.0007) +[2023-10-13 01:51:27,570][46663] Updated weights for policy 1, policy_version 38961 (0.0008) +[2023-10-13 01:51:27,824][46662] Updated weights for policy 0, policy_version 39000 (0.0008) +[2023-10-13 01:51:27,936][46663] Updated weights for policy 1, policy_version 38971 (0.0009) +[2023-10-13 01:51:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 79855616. Throughput: 0: 1664.4, 1: 1683.2. Samples: 19962476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:28,607][45375] Avg episode reward: [(0, '48.110'), (1, '50.160')] +[2023-10-13 01:51:32,011][46662] Updated weights for policy 0, policy_version 39010 (0.0007) +[2023-10-13 01:51:32,106][46663] Updated weights for policy 1, policy_version 38981 (0.0009) +[2023-10-13 01:51:32,376][46662] Updated weights for policy 0, policy_version 39020 (0.0007) +[2023-10-13 01:51:32,473][46663] Updated weights for policy 1, policy_version 38991 (0.0009) +[2023-10-13 01:51:32,746][46662] Updated weights for policy 0, policy_version 39030 (0.0008) +[2023-10-13 01:51:32,834][46663] Updated weights for policy 1, policy_version 39001 (0.0010) +[2023-10-13 01:51:33,116][46662] Updated weights for policy 0, policy_version 39040 (0.0009) +[2023-10-13 01:51:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 79921152. Throughput: 0: 1664.5, 1: 1671.5. Samples: 19982484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:33,608][45375] Avg episode reward: [(0, '48.110'), (1, '50.880')] +[2023-10-13 01:51:36,805][46663] Updated weights for policy 1, policy_version 39011 (0.0008) +[2023-10-13 01:51:37,160][46663] Updated weights for policy 1, policy_version 39021 (0.0008) +[2023-10-13 01:51:37,205][46662] Updated weights for policy 0, policy_version 39050 (0.0008) +[2023-10-13 01:51:37,539][46663] Updated weights for policy 1, policy_version 39031 (0.0009) +[2023-10-13 01:51:37,567][46662] Updated weights for policy 0, policy_version 39060 (0.0007) +[2023-10-13 01:51:37,931][46662] Updated weights for policy 0, policy_version 39070 (0.0009) +[2023-10-13 01:51:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79986688. Throughput: 0: 1656.2, 1: 1668.5. Samples: 20001558. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:38,607][45375] Avg episode reward: [(0, '47.360'), (1, '51.940')] +[2023-10-13 01:51:41,754][46663] Updated weights for policy 1, policy_version 39041 (0.0009) +[2023-10-13 01:51:42,007][46662] Updated weights for policy 0, policy_version 39080 (0.0008) +[2023-10-13 01:51:42,168][46663] Updated weights for policy 1, policy_version 39051 (0.0007) +[2023-10-13 01:51:42,385][46662] Updated weights for policy 0, policy_version 39090 (0.0009) +[2023-10-13 01:51:42,538][46663] Updated weights for policy 1, policy_version 39061 (0.0008) +[2023-10-13 01:51:42,749][46662] Updated weights for policy 0, policy_version 39100 (0.0008) +[2023-10-13 01:51:42,907][46663] Updated weights for policy 1, policy_version 39071 (0.0008) +[2023-10-13 01:51:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 80052224. Throughput: 0: 1668.6, 1: 1686.4. Samples: 20013006. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:43,607][45375] Avg episode reward: [(0, '46.470'), (1, '51.230')] +[2023-10-13 01:51:46,758][46662] Updated weights for policy 0, policy_version 39110 (0.0009) +[2023-10-13 01:51:46,840][46663] Updated weights for policy 1, policy_version 39081 (0.0008) +[2023-10-13 01:51:47,133][46662] Updated weights for policy 0, policy_version 39120 (0.0009) +[2023-10-13 01:51:47,208][46663] Updated weights for policy 1, policy_version 39091 (0.0007) +[2023-10-13 01:51:47,505][46662] Updated weights for policy 0, policy_version 39130 (0.0009) +[2023-10-13 01:51:47,574][46663] Updated weights for policy 1, policy_version 39101 (0.0007) +[2023-10-13 01:51:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 80117760. Throughput: 0: 1664.4, 1: 1664.4. Samples: 20032562. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:48,607][45375] Avg episode reward: [(0, '47.680'), (1, '50.720')] +[2023-10-13 01:51:51,537][46662] Updated weights for policy 0, policy_version 39140 (0.0009) +[2023-10-13 01:51:51,775][46663] Updated weights for policy 1, policy_version 39111 (0.0010) +[2023-10-13 01:51:51,909][46662] Updated weights for policy 0, policy_version 39150 (0.0007) +[2023-10-13 01:51:52,137][46663] Updated weights for policy 1, policy_version 39121 (0.0008) +[2023-10-13 01:51:52,269][46662] Updated weights for policy 0, policy_version 39160 (0.0009) +[2023-10-13 01:51:52,502][46663] Updated weights for policy 1, policy_version 39131 (0.0009) +[2023-10-13 01:51:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 80183296. Throughput: 0: 1661.0, 1: 1674.5. Samples: 20051752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:53,607][45375] Avg episode reward: [(0, '47.460'), (1, '52.450')] +[2023-10-13 01:51:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000039136_40075264.pth... +[2023-10-13 01:51:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000039168_40108032.pth... 
+[2023-10-13 01:51:53,650][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000037600_38502400.pth +[2023-10-13 01:51:53,656][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000039168_40108032.pth +[2023-10-13 01:51:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000037568_38469632.pth +[2023-10-13 01:51:53,664][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000039136_40075264.pth +[2023-10-13 01:51:56,391][46662] Updated weights for policy 0, policy_version 39170 (0.0009) +[2023-10-13 01:51:56,641][46663] Updated weights for policy 1, policy_version 39141 (0.0008) +[2023-10-13 01:51:56,807][46662] Updated weights for policy 0, policy_version 39180 (0.0008) +[2023-10-13 01:51:57,007][46663] Updated weights for policy 1, policy_version 39151 (0.0011) +[2023-10-13 01:51:57,184][46662] Updated weights for policy 0, policy_version 39190 (0.0008) +[2023-10-13 01:51:57,372][46663] Updated weights for policy 1, policy_version 39161 (0.0009) +[2023-10-13 01:51:57,550][46662] Updated weights for policy 0, policy_version 39200 (0.0008) +[2023-10-13 01:51:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 80248832. Throughput: 0: 1674.8, 1: 1683.9. Samples: 20063378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:51:58,607][45375] Avg episode reward: [(0, '46.460'), (1, '50.490')] +[2023-10-13 01:52:01,514][46663] Updated weights for policy 1, policy_version 39171 (0.0008) +[2023-10-13 01:52:01,578][46662] Updated weights for policy 0, policy_version 39210 (0.0009) +[2023-10-13 01:52:01,895][46663] Updated weights for policy 1, policy_version 39181 (0.0008) +[2023-10-13 01:52:01,944][46662] Updated weights for policy 0, policy_version 39220 (0.0007) +[2023-10-13 01:52:02,254][46663] Updated weights for policy 1, policy_version 39191 (0.0009) +[2023-10-13 01:52:02,315][46662] Updated weights for policy 0, policy_version 39230 (0.0008) +[2023-10-13 01:52:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 80314368. Throughput: 0: 1665.0, 1: 1656.9. Samples: 20082166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:52:03,608][45375] Avg episode reward: [(0, '47.280'), (1, '49.820')] +[2023-10-13 01:52:06,317][46663] Updated weights for policy 1, policy_version 39201 (0.0009) +[2023-10-13 01:52:06,340][46662] Updated weights for policy 0, policy_version 39240 (0.0007) +[2023-10-13 01:52:06,683][46663] Updated weights for policy 1, policy_version 39211 (0.0007) +[2023-10-13 01:52:06,709][46662] Updated weights for policy 0, policy_version 39250 (0.0007) +[2023-10-13 01:52:07,041][46663] Updated weights for policy 1, policy_version 39221 (0.0007) +[2023-10-13 01:52:07,079][46662] Updated weights for policy 0, policy_version 39260 (0.0009) +[2023-10-13 01:52:07,403][46663] Updated weights for policy 1, policy_version 39231 (0.0010) +[2023-10-13 01:52:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 80379904. Throughput: 0: 1670.4, 1: 1677.2. Samples: 20101960. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:52:08,607][45375] Avg episode reward: [(0, '48.630'), (1, '50.620')] +[2023-10-13 01:52:11,137][46662] Updated weights for policy 0, policy_version 39270 (0.0009) +[2023-10-13 01:52:11,451][46663] Updated weights for policy 1, policy_version 39241 (0.0009) +[2023-10-13 01:52:11,507][46662] Updated weights for policy 0, policy_version 39280 (0.0009) +[2023-10-13 01:52:11,814][46663] Updated weights for policy 1, policy_version 39251 (0.0009) +[2023-10-13 01:52:11,881][46662] Updated weights for policy 0, policy_version 39290 (0.0007) +[2023-10-13 01:52:12,184][46663] Updated weights for policy 1, policy_version 39261 (0.0008) +[2023-10-13 01:52:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 80445440. Throughput: 0: 1678.4, 1: 1674.9. Samples: 20113376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:52:13,607][45375] Avg episode reward: [(0, '48.960'), (1, '49.350')] +[2023-10-13 01:52:15,894][46662] Updated weights for policy 0, policy_version 39300 (0.0007) +[2023-10-13 01:52:16,109][46663] Updated weights for policy 1, policy_version 39271 (0.0008) +[2023-10-13 01:52:16,263][46662] Updated weights for policy 0, policy_version 39310 (0.0007) +[2023-10-13 01:52:16,470][46663] Updated weights for policy 1, policy_version 39281 (0.0009) +[2023-10-13 01:52:16,632][46662] Updated weights for policy 0, policy_version 39320 (0.0008) +[2023-10-13 01:52:16,838][46663] Updated weights for policy 1, policy_version 39291 (0.0008) +[2023-10-13 01:52:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80510976. Throughput: 0: 1659.2, 1: 1664.1. Samples: 20132034. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:52:18,607][45375] Avg episode reward: [(0, '47.290'), (1, '47.590')] +[2023-10-13 01:52:20,546][46662] Updated weights for policy 0, policy_version 39330 (0.0008) +[2023-10-13 01:52:20,913][46662] Updated weights for policy 0, policy_version 39340 (0.0008) +[2023-10-13 01:52:21,088][46663] Updated weights for policy 1, policy_version 39301 (0.0009) +[2023-10-13 01:52:21,281][46662] Updated weights for policy 0, policy_version 39350 (0.0007) +[2023-10-13 01:52:21,456][46663] Updated weights for policy 1, policy_version 39311 (0.0009) +[2023-10-13 01:52:21,651][46662] Updated weights for policy 0, policy_version 39360 (0.0008) +[2023-10-13 01:52:21,821][46663] Updated weights for policy 1, policy_version 39321 (0.0009) +[2023-10-13 01:52:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80576512. Throughput: 0: 1676.6, 1: 1680.6. Samples: 20152632. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:52:23,607][45375] Avg episode reward: [(0, '44.170'), (1, '46.090')] +[2023-10-13 01:52:25,858][46662] Updated weights for policy 0, policy_version 39370 (0.0008) +[2023-10-13 01:52:25,923][46663] Updated weights for policy 1, policy_version 39331 (0.0008) +[2023-10-13 01:52:26,228][46662] Updated weights for policy 0, policy_version 39380 (0.0007) +[2023-10-13 01:52:26,282][46663] Updated weights for policy 1, policy_version 39341 (0.0008) +[2023-10-13 01:52:26,611][46662] Updated weights for policy 0, policy_version 39390 (0.0007) +[2023-10-13 01:52:26,652][46663] Updated weights for policy 1, policy_version 39351 (0.0009) +[2023-10-13 01:52:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80642048. Throughput: 0: 1674.3, 1: 1665.9. Samples: 20163316. 
Policy #0 lag: (min: 3.0, avg: 6.9, max: 35.0) +[2023-10-13 01:52:28,607][45375] Avg episode reward: [(0, '43.540'), (1, '44.800')] +[2023-10-13 01:52:30,699][46662] Updated weights for policy 0, policy_version 39400 (0.0009) +[2023-10-13 01:52:30,835][46663] Updated weights for policy 1, policy_version 39361 (0.0008) +[2023-10-13 01:52:31,065][46662] Updated weights for policy 0, policy_version 39410 (0.0008) +[2023-10-13 01:52:31,196][46663] Updated weights for policy 1, policy_version 39371 (0.0009) +[2023-10-13 01:52:31,429][46662] Updated weights for policy 0, policy_version 39420 (0.0007) +[2023-10-13 01:52:31,561][46663] Updated weights for policy 1, policy_version 39381 (0.0008) +[2023-10-13 01:52:31,928][46663] Updated weights for policy 1, policy_version 39391 (0.0010) +[2023-10-13 01:52:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80707584. Throughput: 0: 1658.5, 1: 1668.9. Samples: 20182294. Policy #0 lag: (min: 3.0, avg: 6.9, max: 35.0) +[2023-10-13 01:52:33,608][45375] Avg episode reward: [(0, '44.530'), (1, '45.200')] +[2023-10-13 01:52:35,538][46662] Updated weights for policy 0, policy_version 39430 (0.0008) +[2023-10-13 01:52:35,911][46662] Updated weights for policy 0, policy_version 39440 (0.0008) +[2023-10-13 01:52:36,123][46663] Updated weights for policy 1, policy_version 39401 (0.0009) +[2023-10-13 01:52:36,285][46662] Updated weights for policy 0, policy_version 39450 (0.0010) +[2023-10-13 01:52:36,482][46663] Updated weights for policy 1, policy_version 39411 (0.0009) +[2023-10-13 01:52:36,846][46663] Updated weights for policy 1, policy_version 39421 (0.0010) +[2023-10-13 01:52:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 80773120. Throughput: 0: 1681.4, 1: 1677.3. Samples: 20202894. 
Policy #0 lag: (min: 3.0, avg: 6.9, max: 35.0) +[2023-10-13 01:52:38,607][45375] Avg episode reward: [(0, '46.920'), (1, '45.280')] +[2023-10-13 01:52:40,415][46662] Updated weights for policy 0, policy_version 39460 (0.0008) +[2023-10-13 01:52:40,781][46662] Updated weights for policy 0, policy_version 39470 (0.0008) +[2023-10-13 01:52:40,794][46663] Updated weights for policy 1, policy_version 39431 (0.0008) +[2023-10-13 01:52:41,149][46662] Updated weights for policy 0, policy_version 39480 (0.0008) +[2023-10-13 01:52:41,155][46663] Updated weights for policy 1, policy_version 39441 (0.0009) +[2023-10-13 01:52:41,519][46663] Updated weights for policy 1, policy_version 39451 (0.0010) +[2023-10-13 01:52:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80838656. Throughput: 0: 1667.0, 1: 1661.4. Samples: 20213154. Policy #0 lag: (min: 3.0, avg: 6.9, max: 35.0) +[2023-10-13 01:52:43,607][45375] Avg episode reward: [(0, '47.430'), (1, '44.760')] +[2023-10-13 01:52:45,131][46662] Updated weights for policy 0, policy_version 39490 (0.0009) +[2023-10-13 01:52:45,543][46662] Updated weights for policy 0, policy_version 39500 (0.0007) +[2023-10-13 01:52:45,607][46663] Updated weights for policy 1, policy_version 39461 (0.0009) +[2023-10-13 01:52:45,906][46662] Updated weights for policy 0, policy_version 39510 (0.0009) +[2023-10-13 01:52:45,973][46663] Updated weights for policy 1, policy_version 39471 (0.0009) +[2023-10-13 01:52:46,284][46662] Updated weights for policy 0, policy_version 39520 (0.0007) +[2023-10-13 01:52:46,328][46663] Updated weights for policy 1, policy_version 39481 (0.0008) +[2023-10-13 01:52:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80904192. Throughput: 0: 1668.0, 1: 1678.8. Samples: 20232770. 
Policy #0 lag: (min: 3.0, avg: 6.9, max: 35.0) +[2023-10-13 01:52:48,607][45375] Avg episode reward: [(0, '46.650'), (1, '45.530')] +[2023-10-13 01:52:50,409][46662] Updated weights for policy 0, policy_version 39530 (0.0007) +[2023-10-13 01:52:50,543][46663] Updated weights for policy 1, policy_version 39491 (0.0008) +[2023-10-13 01:52:50,779][46662] Updated weights for policy 0, policy_version 39540 (0.0008) +[2023-10-13 01:52:50,901][46663] Updated weights for policy 1, policy_version 39501 (0.0007) +[2023-10-13 01:52:51,137][46662] Updated weights for policy 0, policy_version 39550 (0.0007) +[2023-10-13 01:52:51,266][46663] Updated weights for policy 1, policy_version 39511 (0.0008) +[2023-10-13 01:52:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80969728. Throughput: 0: 1681.6, 1: 1681.2. Samples: 20253290. Policy #0 lag: (min: 3.0, avg: 6.9, max: 35.0) +[2023-10-13 01:52:53,607][45375] Avg episode reward: [(0, '46.700'), (1, '45.710')] +[2023-10-13 01:52:55,221][46662] Updated weights for policy 0, policy_version 39560 (0.0010) +[2023-10-13 01:52:55,233][46663] Updated weights for policy 1, policy_version 39521 (0.0009) +[2023-10-13 01:52:55,592][46663] Updated weights for policy 1, policy_version 39531 (0.0008) +[2023-10-13 01:52:55,594][46662] Updated weights for policy 0, policy_version 39570 (0.0007) +[2023-10-13 01:52:55,958][46662] Updated weights for policy 0, policy_version 39580 (0.0008) +[2023-10-13 01:52:55,959][46663] Updated weights for policy 1, policy_version 39541 (0.0007) +[2023-10-13 01:52:56,332][46663] Updated weights for policy 1, policy_version 39551 (0.0007) +[2023-10-13 01:52:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81035264. Throughput: 0: 1661.7, 1: 1660.5. Samples: 20262876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:52:58,607][45375] Avg episode reward: [(0, '48.070'), (1, '44.840')] +[2023-10-13 01:52:59,986][46662] Updated weights for policy 0, policy_version 39590 (0.0008) +[2023-10-13 01:53:00,346][46662] Updated weights for policy 0, policy_version 39600 (0.0009) +[2023-10-13 01:53:00,408][46663] Updated weights for policy 1, policy_version 39561 (0.0008) +[2023-10-13 01:53:00,720][46662] Updated weights for policy 0, policy_version 39610 (0.0008) +[2023-10-13 01:53:00,786][46663] Updated weights for policy 1, policy_version 39571 (0.0009) +[2023-10-13 01:53:01,147][46663] Updated weights for policy 1, policy_version 39581 (0.0009) +[2023-10-13 01:53:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 81100800. Throughput: 0: 1675.4, 1: 1674.9. Samples: 20282796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:53:03,608][45375] Avg episode reward: [(0, '47.920'), (1, '45.810')] +[2023-10-13 01:53:04,830][46662] Updated weights for policy 0, policy_version 39620 (0.0008) +[2023-10-13 01:53:05,205][46662] Updated weights for policy 0, policy_version 39630 (0.0008) +[2023-10-13 01:53:05,250][46663] Updated weights for policy 1, policy_version 39591 (0.0008) +[2023-10-13 01:53:05,569][46662] Updated weights for policy 0, policy_version 39640 (0.0008) +[2023-10-13 01:53:05,612][46663] Updated weights for policy 1, policy_version 39601 (0.0008) +[2023-10-13 01:53:05,982][46663] Updated weights for policy 1, policy_version 39611 (0.0009) +[2023-10-13 01:53:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81166336. Throughput: 0: 1679.3, 1: 1673.6. Samples: 20303516. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:53:08,607][45375] Avg episode reward: [(0, '48.010'), (1, '47.080')] +[2023-10-13 01:53:09,657][46662] Updated weights for policy 0, policy_version 39650 (0.0009) +[2023-10-13 01:53:10,008][46663] Updated weights for policy 1, policy_version 39621 (0.0008) +[2023-10-13 01:53:10,022][46662] Updated weights for policy 0, policy_version 39660 (0.0009) +[2023-10-13 01:53:10,386][46663] Updated weights for policy 1, policy_version 39631 (0.0009) +[2023-10-13 01:53:10,392][46662] Updated weights for policy 0, policy_version 39670 (0.0009) +[2023-10-13 01:53:10,749][46663] Updated weights for policy 1, policy_version 39641 (0.0007) +[2023-10-13 01:53:10,764][46662] Updated weights for policy 0, policy_version 39680 (0.0010) +[2023-10-13 01:53:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81231872. Throughput: 0: 1657.4, 1: 1659.4. Samples: 20312570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:53:13,608][45375] Avg episode reward: [(0, '47.890'), (1, '46.830')] +[2023-10-13 01:53:14,781][46663] Updated weights for policy 1, policy_version 39651 (0.0009) +[2023-10-13 01:53:15,052][46662] Updated weights for policy 0, policy_version 39690 (0.0009) +[2023-10-13 01:53:15,143][46663] Updated weights for policy 1, policy_version 39661 (0.0008) +[2023-10-13 01:53:15,423][46662] Updated weights for policy 0, policy_version 39700 (0.0009) +[2023-10-13 01:53:15,504][46663] Updated weights for policy 1, policy_version 39671 (0.0009) +[2023-10-13 01:53:15,784][46662] Updated weights for policy 0, policy_version 39710 (0.0007) +[2023-10-13 01:53:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81297408. Throughput: 0: 1677.8, 1: 1678.7. Samples: 20333334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:53:18,607][45375] Avg episode reward: [(0, '47.440'), (1, '48.380')] +[2023-10-13 01:53:19,438][46663] Updated weights for policy 1, policy_version 39681 (0.0009) +[2023-10-13 01:53:19,801][46663] Updated weights for policy 1, policy_version 39691 (0.0009) +[2023-10-13 01:53:19,840][46662] Updated weights for policy 0, policy_version 39720 (0.0009) +[2023-10-13 01:53:20,167][46663] Updated weights for policy 1, policy_version 39701 (0.0008) +[2023-10-13 01:53:20,213][46662] Updated weights for policy 0, policy_version 39730 (0.0009) +[2023-10-13 01:53:20,532][46663] Updated weights for policy 1, policy_version 39711 (0.0008) +[2023-10-13 01:53:20,582][46662] Updated weights for policy 0, policy_version 39740 (0.0009) +[2023-10-13 01:53:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 81362944. Throughput: 0: 1668.7, 1: 1682.5. Samples: 20353696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:53:23,608][45375] Avg episode reward: [(0, '47.830'), (1, '48.650')] +[2023-10-13 01:53:24,748][46663] Updated weights for policy 1, policy_version 39721 (0.0008) +[2023-10-13 01:53:24,822][46662] Updated weights for policy 0, policy_version 39750 (0.0009) +[2023-10-13 01:53:25,113][46663] Updated weights for policy 1, policy_version 39731 (0.0007) +[2023-10-13 01:53:25,195][46662] Updated weights for policy 0, policy_version 39760 (0.0008) +[2023-10-13 01:53:25,474][46663] Updated weights for policy 1, policy_version 39741 (0.0007) +[2023-10-13 01:53:25,560][46662] Updated weights for policy 0, policy_version 39770 (0.0007) +[2023-10-13 01:53:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81428480. Throughput: 0: 1654.6, 1: 1670.7. Samples: 20362790. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-13 01:53:28,607][45375] Avg episode reward: [(0, '48.280'), (1, '48.890')] +[2023-10-13 01:53:29,490][46663] Updated weights for policy 1, policy_version 39751 (0.0009) +[2023-10-13 01:53:29,555][46662] Updated weights for policy 0, policy_version 39780 (0.0009) +[2023-10-13 01:53:29,855][46663] Updated weights for policy 1, policy_version 39761 (0.0008) +[2023-10-13 01:53:29,917][46662] Updated weights for policy 0, policy_version 39790 (0.0008) +[2023-10-13 01:53:30,228][46663] Updated weights for policy 1, policy_version 39771 (0.0007) +[2023-10-13 01:53:30,285][46662] Updated weights for policy 0, policy_version 39800 (0.0007) +[2023-10-13 01:53:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81494016. Throughput: 0: 1664.9, 1: 1681.1. Samples: 20383338. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-13 01:53:33,607][45375] Avg episode reward: [(0, '47.700'), (1, '49.760')] +[2023-10-13 01:53:34,211][46663] Updated weights for policy 1, policy_version 39781 (0.0008) +[2023-10-13 01:53:34,582][46663] Updated weights for policy 1, policy_version 39791 (0.0007) +[2023-10-13 01:53:34,641][46662] Updated weights for policy 0, policy_version 39810 (0.0007) +[2023-10-13 01:53:34,941][46663] Updated weights for policy 1, policy_version 39801 (0.0007) +[2023-10-13 01:53:35,056][46662] Updated weights for policy 0, policy_version 39820 (0.0007) +[2023-10-13 01:53:35,430][46662] Updated weights for policy 0, policy_version 39830 (0.0007) +[2023-10-13 01:53:35,795][46662] Updated weights for policy 0, policy_version 39840 (0.0007) +[2023-10-13 01:53:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81559552. Throughput: 0: 1659.1, 1: 1689.4. Samples: 20403972. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-13 01:53:38,607][45375] Avg episode reward: [(0, '47.160'), (1, '49.190')] +[2023-10-13 01:53:38,988][46663] Updated weights for policy 1, policy_version 39811 (0.0008) +[2023-10-13 01:53:39,347][46663] Updated weights for policy 1, policy_version 39821 (0.0010) +[2023-10-13 01:53:39,712][46663] Updated weights for policy 1, policy_version 39831 (0.0009) +[2023-10-13 01:53:39,838][46662] Updated weights for policy 0, policy_version 39850 (0.0007) +[2023-10-13 01:53:40,199][46662] Updated weights for policy 0, policy_version 39860 (0.0009) +[2023-10-13 01:53:40,583][46662] Updated weights for policy 0, policy_version 39870 (0.0008) +[2023-10-13 01:53:43,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 81625088. Throughput: 0: 1648.8, 1: 1682.8. Samples: 20412802. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-13 01:53:43,608][45375] Avg episode reward: [(0, '45.340'), (1, '48.860')] +[2023-10-13 01:53:43,817][46663] Updated weights for policy 1, policy_version 39841 (0.0010) +[2023-10-13 01:53:44,174][46663] Updated weights for policy 1, policy_version 39851 (0.0009) +[2023-10-13 01:53:44,551][46663] Updated weights for policy 1, policy_version 39861 (0.0008) +[2023-10-13 01:53:44,652][46662] Updated weights for policy 0, policy_version 39880 (0.0007) +[2023-10-13 01:53:44,914][46663] Updated weights for policy 1, policy_version 39871 (0.0009) +[2023-10-13 01:53:45,029][46662] Updated weights for policy 0, policy_version 39890 (0.0009) +[2023-10-13 01:53:45,388][46662] Updated weights for policy 0, policy_version 39900 (0.0009) +[2023-10-13 01:53:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81690624. Throughput: 0: 1659.8, 1: 1687.6. Samples: 20433430. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-13 01:53:48,607][45375] Avg episode reward: [(0, '45.680'), (1, '48.990')] +[2023-10-13 01:53:49,067][46663] Updated weights for policy 1, policy_version 39881 (0.0008) +[2023-10-13 01:53:49,430][46663] Updated weights for policy 1, policy_version 39891 (0.0008) +[2023-10-13 01:53:49,490][46662] Updated weights for policy 0, policy_version 39910 (0.0008) +[2023-10-13 01:53:49,799][46663] Updated weights for policy 1, policy_version 39901 (0.0007) +[2023-10-13 01:53:49,858][46662] Updated weights for policy 0, policy_version 39920 (0.0009) +[2023-10-13 01:53:50,217][46662] Updated weights for policy 0, policy_version 39930 (0.0009) +[2023-10-13 01:53:53,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81756160. Throughput: 0: 1658.4, 1: 1685.9. Samples: 20454012. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-13 01:53:53,607][45375] Avg episode reward: [(0, '44.680'), (1, '49.100')] +[2023-10-13 01:53:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000039904_40861696.pth... +[2023-10-13 01:53:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000039936_40894464.pth... 
+[2023-10-13 01:53:53,660][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000038336_39256064.pth +[2023-10-13 01:53:53,661][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000038368_39288832.pth +[2023-10-13 01:53:54,070][46663] Updated weights for policy 1, policy_version 39911 (0.0009) +[2023-10-13 01:53:54,345][46662] Updated weights for policy 0, policy_version 39940 (0.0008) +[2023-10-13 01:53:54,439][46663] Updated weights for policy 1, policy_version 39921 (0.0007) +[2023-10-13 01:53:54,708][46662] Updated weights for policy 0, policy_version 39950 (0.0008) +[2023-10-13 01:53:54,800][46663] Updated weights for policy 1, policy_version 39931 (0.0007) +[2023-10-13 01:53:55,077][46662] Updated weights for policy 0, policy_version 39960 (0.0008) +[2023-10-13 01:53:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81821696. Throughput: 0: 1656.6, 1: 1688.5. Samples: 20463102. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:53:58,607][45375] Avg episode reward: [(0, '44.050'), (1, '49.280')] +[2023-10-13 01:53:58,751][46663] Updated weights for policy 1, policy_version 39941 (0.0008) +[2023-10-13 01:53:59,119][46663] Updated weights for policy 1, policy_version 39951 (0.0007) +[2023-10-13 01:53:59,213][46662] Updated weights for policy 0, policy_version 39970 (0.0010) +[2023-10-13 01:53:59,493][46663] Updated weights for policy 1, policy_version 39961 (0.0007) +[2023-10-13 01:53:59,570][46662] Updated weights for policy 0, policy_version 39980 (0.0008) +[2023-10-13 01:53:59,951][46662] Updated weights for policy 0, policy_version 39990 (0.0010) +[2023-10-13 01:54:00,311][46662] Updated weights for policy 0, policy_version 40000 (0.0011) +[2023-10-13 01:54:03,512][46663] Updated weights for policy 1, policy_version 39971 (0.0009) +[2023-10-13 01:54:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 81887232. Throughput: 0: 1658.5, 1: 1687.3. Samples: 20483896. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:54:03,607][45375] Avg episode reward: [(0, '43.780'), (1, '49.890')] +[2023-10-13 01:54:03,883][46663] Updated weights for policy 1, policy_version 39981 (0.0008) +[2023-10-13 01:54:04,248][46662] Updated weights for policy 0, policy_version 40010 (0.0007) +[2023-10-13 01:54:04,251][46663] Updated weights for policy 1, policy_version 39991 (0.0007) +[2023-10-13 01:54:04,614][46662] Updated weights for policy 0, policy_version 40020 (0.0008) +[2023-10-13 01:54:04,981][46662] Updated weights for policy 0, policy_version 40030 (0.0007) +[2023-10-13 01:54:08,372][46663] Updated weights for policy 1, policy_version 40001 (0.0008) +[2023-10-13 01:54:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81952768. Throughput: 0: 1665.8, 1: 1680.8. Samples: 20504294. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:54:08,607][45375] Avg episode reward: [(0, '43.400'), (1, '50.750')] +[2023-10-13 01:54:08,735][46663] Updated weights for policy 1, policy_version 40011 (0.0008) +[2023-10-13 01:54:08,988][46662] Updated weights for policy 0, policy_version 40040 (0.0008) +[2023-10-13 01:54:09,096][46663] Updated weights for policy 1, policy_version 40021 (0.0008) +[2023-10-13 01:54:09,355][46662] Updated weights for policy 0, policy_version 40050 (0.0009) +[2023-10-13 01:54:09,468][46663] Updated weights for policy 1, policy_version 40031 (0.0007) +[2023-10-13 01:54:09,728][46662] Updated weights for policy 0, policy_version 40060 (0.0010) +[2023-10-13 01:54:13,593][46663] Updated weights for policy 1, policy_version 40041 (0.0008) +[2023-10-13 01:54:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82018304. Throughput: 0: 1662.8, 1: 1686.7. Samples: 20513516. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:54:13,608][45375] Avg episode reward: [(0, '44.640'), (1, '50.380')] +[2023-10-13 01:54:13,804][46662] Updated weights for policy 0, policy_version 40070 (0.0008) +[2023-10-13 01:54:13,957][46663] Updated weights for policy 1, policy_version 40051 (0.0009) +[2023-10-13 01:54:14,179][46662] Updated weights for policy 0, policy_version 40080 (0.0007) +[2023-10-13 01:54:14,331][46663] Updated weights for policy 1, policy_version 40061 (0.0008) +[2023-10-13 01:54:14,542][46662] Updated weights for policy 0, policy_version 40090 (0.0009) +[2023-10-13 01:54:18,309][46663] Updated weights for policy 1, policy_version 40071 (0.0008) +[2023-10-13 01:54:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82083840. Throughput: 0: 1671.3, 1: 1680.7. Samples: 20534180. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:54:18,607][45375] Avg episode reward: [(0, '45.260'), (1, '49.030')] +[2023-10-13 01:54:18,677][46663] Updated weights for policy 1, policy_version 40081 (0.0008) +[2023-10-13 01:54:18,690][46662] Updated weights for policy 0, policy_version 40100 (0.0008) +[2023-10-13 01:54:19,051][46663] Updated weights for policy 1, policy_version 40091 (0.0008) +[2023-10-13 01:54:19,055][46662] Updated weights for policy 0, policy_version 40110 (0.0007) +[2023-10-13 01:54:19,426][46662] Updated weights for policy 0, policy_version 40120 (0.0007) +[2023-10-13 01:54:23,200][46663] Updated weights for policy 1, policy_version 40101 (0.0010) +[2023-10-13 01:54:23,572][46662] Updated weights for policy 0, policy_version 40130 (0.0008) +[2023-10-13 01:54:23,579][46663] Updated weights for policy 1, policy_version 40111 (0.0008) +[2023-10-13 01:54:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 82149376. Throughput: 0: 1674.3, 1: 1665.4. Samples: 20554256. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-13 01:54:23,608][45375] Avg episode reward: [(0, '44.340'), (1, '49.540')] +[2023-10-13 01:54:23,944][46663] Updated weights for policy 1, policy_version 40121 (0.0008) +[2023-10-13 01:54:23,968][46662] Updated weights for policy 0, policy_version 40140 (0.0008) +[2023-10-13 01:54:24,332][46662] Updated weights for policy 0, policy_version 40150 (0.0009) +[2023-10-13 01:54:24,706][46662] Updated weights for policy 0, policy_version 40160 (0.0010) +[2023-10-13 01:54:28,046][46663] Updated weights for policy 1, policy_version 40131 (0.0008) +[2023-10-13 01:54:28,410][46663] Updated weights for policy 1, policy_version 40141 (0.0008) +[2023-10-13 01:54:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82214912. Throughput: 0: 1671.8, 1: 1678.3. Samples: 20563558. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:54:28,607][45375] Avg episode reward: [(0, '44.630'), (1, '49.810')] +[2023-10-13 01:54:28,771][46663] Updated weights for policy 1, policy_version 40151 (0.0008) +[2023-10-13 01:54:28,802][46662] Updated weights for policy 0, policy_version 40170 (0.0009) +[2023-10-13 01:54:29,177][46662] Updated weights for policy 0, policy_version 40180 (0.0010) +[2023-10-13 01:54:29,531][46662] Updated weights for policy 0, policy_version 40190 (0.0009) +[2023-10-13 01:54:32,788][46663] Updated weights for policy 1, policy_version 40161 (0.0010) +[2023-10-13 01:54:33,156][46663] Updated weights for policy 1, policy_version 40171 (0.0008) +[2023-10-13 01:54:33,525][46663] Updated weights for policy 1, policy_version 40181 (0.0009) +[2023-10-13 01:54:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82280448. Throughput: 0: 1669.0, 1: 1684.2. Samples: 20584324. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:54:33,607][45375] Avg episode reward: [(0, '45.340'), (1, '50.130')] +[2023-10-13 01:54:33,639][46662] Updated weights for policy 0, policy_version 40200 (0.0007) +[2023-10-13 01:54:33,895][46663] Updated weights for policy 1, policy_version 40191 (0.0007) +[2023-10-13 01:54:34,015][46662] Updated weights for policy 0, policy_version 40210 (0.0010) +[2023-10-13 01:54:34,393][46662] Updated weights for policy 0, policy_version 40220 (0.0010) +[2023-10-13 01:54:38,118][46663] Updated weights for policy 1, policy_version 40201 (0.0009) +[2023-10-13 01:54:38,352][46662] Updated weights for policy 0, policy_version 40230 (0.0010) +[2023-10-13 01:54:38,480][46663] Updated weights for policy 1, policy_version 40211 (0.0008) +[2023-10-13 01:54:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82345984. Throughput: 0: 1674.8, 1: 1665.8. Samples: 20604340. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:54:38,607][45375] Avg episode reward: [(0, '43.700'), (1, '50.460')] +[2023-10-13 01:54:38,723][46662] Updated weights for policy 0, policy_version 40240 (0.0009) +[2023-10-13 01:54:38,848][46663] Updated weights for policy 1, policy_version 40221 (0.0007) +[2023-10-13 01:54:39,102][46662] Updated weights for policy 0, policy_version 40250 (0.0008) +[2023-10-13 01:54:43,027][46663] Updated weights for policy 1, policy_version 40231 (0.0010) +[2023-10-13 01:54:43,352][46662] Updated weights for policy 0, policy_version 40260 (0.0009) +[2023-10-13 01:54:43,401][46663] Updated weights for policy 1, policy_version 40241 (0.0009) +[2023-10-13 01:54:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 82411520. Throughput: 0: 1676.8, 1: 1678.5. Samples: 20614090. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:54:43,607][45375] Avg episode reward: [(0, '46.070'), (1, '50.730')] +[2023-10-13 01:54:43,719][46662] Updated weights for policy 0, policy_version 40270 (0.0009) +[2023-10-13 01:54:43,767][46663] Updated weights for policy 1, policy_version 40251 (0.0009) +[2023-10-13 01:54:44,087][46662] Updated weights for policy 0, policy_version 40280 (0.0008) +[2023-10-13 01:54:47,777][46663] Updated weights for policy 1, policy_version 40261 (0.0010) +[2023-10-13 01:54:48,156][46663] Updated weights for policy 1, policy_version 40271 (0.0010) +[2023-10-13 01:54:48,289][46662] Updated weights for policy 0, policy_version 40290 (0.0009) +[2023-10-13 01:54:48,519][46663] Updated weights for policy 1, policy_version 40281 (0.0007) +[2023-10-13 01:54:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82477056. Throughput: 0: 1676.8, 1: 1675.7. Samples: 20634758. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:54:48,607][45375] Avg episode reward: [(0, '46.700'), (1, '51.260')] +[2023-10-13 01:54:48,656][46662] Updated weights for policy 0, policy_version 40300 (0.0008) +[2023-10-13 01:54:49,021][46662] Updated weights for policy 0, policy_version 40310 (0.0010) +[2023-10-13 01:54:49,396][46662] Updated weights for policy 0, policy_version 40320 (0.0009) +[2023-10-13 01:54:52,664][46663] Updated weights for policy 1, policy_version 40291 (0.0007) +[2023-10-13 01:54:53,037][46663] Updated weights for policy 1, policy_version 40301 (0.0008) +[2023-10-13 01:54:53,318][46662] Updated weights for policy 0, policy_version 40330 (0.0008) +[2023-10-13 01:54:53,394][46663] Updated weights for policy 1, policy_version 40311 (0.0009) +[2023-10-13 01:54:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82542592. Throughput: 0: 1674.8, 1: 1660.0. Samples: 20654358. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-13 01:54:53,607][45375] Avg episode reward: [(0, '46.870'), (1, '51.760')] +[2023-10-13 01:54:53,692][46662] Updated weights for policy 0, policy_version 40340 (0.0009) +[2023-10-13 01:54:54,063][46662] Updated weights for policy 0, policy_version 40350 (0.0008) +[2023-10-13 01:54:57,515][46663] Updated weights for policy 1, policy_version 40321 (0.0008) +[2023-10-13 01:54:57,883][46663] Updated weights for policy 1, policy_version 40331 (0.0008) +[2023-10-13 01:54:58,230][46662] Updated weights for policy 0, policy_version 40360 (0.0009) +[2023-10-13 01:54:58,249][46663] Updated weights for policy 1, policy_version 40341 (0.0009) +[2023-10-13 01:54:58,587][46662] Updated weights for policy 0, policy_version 40370 (0.0007) +[2023-10-13 01:54:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82608128. Throughput: 0: 1675.7, 1: 1672.0. Samples: 20664162. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:54:58,607][45375] Avg episode reward: [(0, '46.370'), (1, '50.790')] +[2023-10-13 01:54:58,612][46663] Updated weights for policy 1, policy_version 40351 (0.0008) +[2023-10-13 01:54:58,964][46662] Updated weights for policy 0, policy_version 40380 (0.0009) +[2023-10-13 01:55:02,619][46663] Updated weights for policy 1, policy_version 40361 (0.0009) +[2023-10-13 01:55:02,977][46662] Updated weights for policy 0, policy_version 40390 (0.0008) +[2023-10-13 01:55:02,995][46663] Updated weights for policy 1, policy_version 40371 (0.0009) +[2023-10-13 01:55:03,350][46662] Updated weights for policy 0, policy_version 40400 (0.0008) +[2023-10-13 01:55:03,359][46663] Updated weights for policy 1, policy_version 40381 (0.0008) +[2023-10-13 01:55:03,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82706432. Throughput: 0: 1672.3, 1: 1673.1. Samples: 20684722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:03,608][45375] Avg episode reward: [(0, '47.120'), (1, '50.350')] +[2023-10-13 01:55:03,721][46662] Updated weights for policy 0, policy_version 40410 (0.0008) +[2023-10-13 01:55:07,467][46663] Updated weights for policy 1, policy_version 40391 (0.0010) +[2023-10-13 01:55:07,827][46663] Updated weights for policy 1, policy_version 40401 (0.0007) +[2023-10-13 01:55:07,949][46662] Updated weights for policy 0, policy_version 40420 (0.0008) +[2023-10-13 01:55:08,193][46663] Updated weights for policy 1, policy_version 40411 (0.0008) +[2023-10-13 01:55:08,321][46662] Updated weights for policy 0, policy_version 40430 (0.0008) +[2023-10-13 01:55:08,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82771968. Throughput: 0: 1672.4, 1: 1656.1. Samples: 20704040. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:08,607][45375] Avg episode reward: [(0, '47.450'), (1, '49.550')] +[2023-10-13 01:55:08,683][46662] Updated weights for policy 0, policy_version 40440 (0.0008) +[2023-10-13 01:55:12,335][46663] Updated weights for policy 1, policy_version 40421 (0.0007) +[2023-10-13 01:55:12,700][46663] Updated weights for policy 1, policy_version 40431 (0.0009) +[2023-10-13 01:55:12,729][46662] Updated weights for policy 0, policy_version 40450 (0.0010) +[2023-10-13 01:55:13,065][46663] Updated weights for policy 1, policy_version 40441 (0.0008) +[2023-10-13 01:55:13,157][46662] Updated weights for policy 0, policy_version 40460 (0.0008) +[2023-10-13 01:55:13,532][46662] Updated weights for policy 0, policy_version 40470 (0.0007) +[2023-10-13 01:55:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82837504. Throughput: 0: 1675.7, 1: 1672.7. Samples: 20714234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:13,608][45375] Avg episode reward: [(0, '46.800'), (1, '48.610')] +[2023-10-13 01:55:13,896][46662] Updated weights for policy 0, policy_version 40480 (0.0007) +[2023-10-13 01:55:17,314][46663] Updated weights for policy 1, policy_version 40451 (0.0008) +[2023-10-13 01:55:17,675][46663] Updated weights for policy 1, policy_version 40461 (0.0008) +[2023-10-13 01:55:17,912][46662] Updated weights for policy 0, policy_version 40490 (0.0008) +[2023-10-13 01:55:18,054][46663] Updated weights for policy 1, policy_version 40471 (0.0007) +[2023-10-13 01:55:18,270][46662] Updated weights for policy 0, policy_version 40500 (0.0009) +[2023-10-13 01:55:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82903040. Throughput: 0: 1677.8, 1: 1658.7. Samples: 20734466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:18,607][45375] Avg episode reward: [(0, '47.010'), (1, '48.000')] +[2023-10-13 01:55:18,641][46662] Updated weights for policy 0, policy_version 40510 (0.0008) +[2023-10-13 01:55:22,225][46663] Updated weights for policy 1, policy_version 40481 (0.0007) +[2023-10-13 01:55:22,582][46663] Updated weights for policy 1, policy_version 40491 (0.0009) +[2023-10-13 01:55:22,759][46662] Updated weights for policy 0, policy_version 40520 (0.0008) +[2023-10-13 01:55:22,950][46663] Updated weights for policy 1, policy_version 40501 (0.0009) +[2023-10-13 01:55:23,130][46662] Updated weights for policy 0, policy_version 40530 (0.0007) +[2023-10-13 01:55:23,316][46663] Updated weights for policy 1, policy_version 40511 (0.0009) +[2023-10-13 01:55:23,515][46662] Updated weights for policy 0, policy_version 40540 (0.0010) +[2023-10-13 01:55:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 82968576. Throughput: 0: 1664.9, 1: 1651.6. Samples: 20753580. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:23,607][45375] Avg episode reward: [(0, '48.700'), (1, '47.220')] +[2023-10-13 01:55:27,525][46663] Updated weights for policy 1, policy_version 40521 (0.0007) +[2023-10-13 01:55:27,707][46662] Updated weights for policy 0, policy_version 40550 (0.0008) +[2023-10-13 01:55:27,886][46663] Updated weights for policy 1, policy_version 40531 (0.0008) +[2023-10-13 01:55:28,073][46662] Updated weights for policy 0, policy_version 40560 (0.0008) +[2023-10-13 01:55:28,251][46663] Updated weights for policy 1, policy_version 40541 (0.0010) +[2023-10-13 01:55:28,444][46662] Updated weights for policy 0, policy_version 40570 (0.0007) +[2023-10-13 01:55:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 83034112. Throughput: 0: 1670.3, 1: 1661.0. Samples: 20763996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:28,608][45375] Avg episode reward: [(0, '50.690'), (1, '48.530')] +[2023-10-13 01:55:32,391][46663] Updated weights for policy 1, policy_version 40551 (0.0008) +[2023-10-13 01:55:32,651][46662] Updated weights for policy 0, policy_version 40580 (0.0008) +[2023-10-13 01:55:32,761][46663] Updated weights for policy 1, policy_version 40561 (0.0009) +[2023-10-13 01:55:33,008][46662] Updated weights for policy 0, policy_version 40590 (0.0008) +[2023-10-13 01:55:33,136][46663] Updated weights for policy 1, policy_version 40571 (0.0008) +[2023-10-13 01:55:33,377][46662] Updated weights for policy 0, policy_version 40600 (0.0007) +[2023-10-13 01:55:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83099648. Throughput: 0: 1667.1, 1: 1660.9. Samples: 20784520. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:33,607][45375] Avg episode reward: [(0, '50.750'), (1, '48.400')] +[2023-10-13 01:55:37,110][46663] Updated weights for policy 1, policy_version 40581 (0.0007) +[2023-10-13 01:55:37,345][46662] Updated weights for policy 0, policy_version 40610 (0.0008) +[2023-10-13 01:55:37,474][46663] Updated weights for policy 1, policy_version 40591 (0.0009) +[2023-10-13 01:55:37,721][46662] Updated weights for policy 0, policy_version 40620 (0.0009) +[2023-10-13 01:55:37,849][46663] Updated weights for policy 1, policy_version 40601 (0.0009) +[2023-10-13 01:55:38,090][46662] Updated weights for policy 0, policy_version 40630 (0.0009) +[2023-10-13 01:55:38,458][46662] Updated weights for policy 0, policy_version 40640 (0.0008) +[2023-10-13 01:55:38,607][45375] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 83197952. Throughput: 0: 1658.5, 1: 1661.6. Samples: 20803764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:38,607][45375] Avg episode reward: [(0, '50.700'), (1, '47.560')] +[2023-10-13 01:55:41,783][46663] Updated weights for policy 1, policy_version 40611 (0.0009) +[2023-10-13 01:55:42,163][46663] Updated weights for policy 1, policy_version 40621 (0.0011) +[2023-10-13 01:55:42,527][46663] Updated weights for policy 1, policy_version 40631 (0.0008) +[2023-10-13 01:55:42,564][46662] Updated weights for policy 0, policy_version 40650 (0.0008) +[2023-10-13 01:55:42,932][46662] Updated weights for policy 0, policy_version 40660 (0.0008) +[2023-10-13 01:55:43,300][46662] Updated weights for policy 0, policy_version 40670 (0.0010) +[2023-10-13 01:55:43,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 83263488. Throughput: 0: 1671.0, 1: 1678.7. Samples: 20814900. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:43,608][45375] Avg episode reward: [(0, '51.060'), (1, '48.330')] +[2023-10-13 01:55:46,670][46663] Updated weights for policy 1, policy_version 40641 (0.0008) +[2023-10-13 01:55:47,082][46663] Updated weights for policy 1, policy_version 40651 (0.0009) +[2023-10-13 01:55:47,453][46663] Updated weights for policy 1, policy_version 40661 (0.0009) +[2023-10-13 01:55:47,475][46662] Updated weights for policy 0, policy_version 40680 (0.0009) +[2023-10-13 01:55:47,811][46663] Updated weights for policy 1, policy_version 40671 (0.0010) +[2023-10-13 01:55:47,851][46662] Updated weights for policy 0, policy_version 40690 (0.0007) +[2023-10-13 01:55:48,220][46662] Updated weights for policy 0, policy_version 40700 (0.0007) +[2023-10-13 01:55:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 83329024. Throughput: 0: 1669.8, 1: 1661.0. Samples: 20834610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:48,607][45375] Avg episode reward: [(0, '51.270'), (1, '48.170')] +[2023-10-13 01:55:51,772][46663] Updated weights for policy 1, policy_version 40681 (0.0007) +[2023-10-13 01:55:52,129][46663] Updated weights for policy 1, policy_version 40691 (0.0007) +[2023-10-13 01:55:52,324][46662] Updated weights for policy 0, policy_version 40710 (0.0008) +[2023-10-13 01:55:52,493][46663] Updated weights for policy 1, policy_version 40701 (0.0008) +[2023-10-13 01:55:52,687][46662] Updated weights for policy 0, policy_version 40720 (0.0008) +[2023-10-13 01:55:53,050][46662] Updated weights for policy 0, policy_version 40730 (0.0007) +[2023-10-13 01:55:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 83394560. Throughput: 0: 1653.7, 1: 1679.2. Samples: 20854022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:53,607][45375] Avg episode reward: [(0, '50.770'), (1, '47.950')] +[2023-10-13 01:55:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000040736_41713664.pth... +[2023-10-13 01:55:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000040704_41680896.pth... +[2023-10-13 01:55:53,667][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000039136_40075264.pth +[2023-10-13 01:55:53,668][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000039168_40108032.pth +[2023-10-13 01:55:56,586][46663] Updated weights for policy 1, policy_version 40711 (0.0007) +[2023-10-13 01:55:56,959][46663] Updated weights for policy 1, policy_version 40721 (0.0008) +[2023-10-13 01:55:57,225][46662] Updated weights for policy 0, policy_version 40740 (0.0010) +[2023-10-13 01:55:57,325][46663] Updated weights for policy 1, policy_version 40731 (0.0007) +[2023-10-13 01:55:57,600][46662] Updated weights for policy 0, policy_version 40750 (0.0010) +[2023-10-13 01:55:57,966][46662] Updated weights for policy 0, policy_version 40760 (0.0010) +[2023-10-13 01:55:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 83460096. Throughput: 0: 1668.4, 1: 1679.4. Samples: 20864882. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:55:58,607][45375] Avg episode reward: [(0, '49.470'), (1, '49.770')] +[2023-10-13 01:56:01,327][46663] Updated weights for policy 1, policy_version 40741 (0.0008) +[2023-10-13 01:56:01,689][46663] Updated weights for policy 1, policy_version 40751 (0.0010) +[2023-10-13 01:56:02,065][46663] Updated weights for policy 1, policy_version 40761 (0.0007) +[2023-10-13 01:56:02,229][46662] Updated weights for policy 0, policy_version 40770 (0.0009) +[2023-10-13 01:56:02,617][46662] Updated weights for policy 0, policy_version 40780 (0.0010) +[2023-10-13 01:56:02,985][46662] Updated weights for policy 0, policy_version 40790 (0.0011) +[2023-10-13 01:56:03,354][46662] Updated weights for policy 0, policy_version 40800 (0.0009) +[2023-10-13 01:56:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 83525632. Throughput: 0: 1667.3, 1: 1661.4. Samples: 20884256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:56:03,607][45375] Avg episode reward: [(0, '50.490'), (1, '49.200')] +[2023-10-13 01:56:06,200][46663] Updated weights for policy 1, policy_version 40771 (0.0008) +[2023-10-13 01:56:06,559][46663] Updated weights for policy 1, policy_version 40781 (0.0007) +[2023-10-13 01:56:06,933][46663] Updated weights for policy 1, policy_version 40791 (0.0007) +[2023-10-13 01:56:07,271][46662] Updated weights for policy 0, policy_version 40810 (0.0009) +[2023-10-13 01:56:07,634][46662] Updated weights for policy 0, policy_version 40820 (0.0010) +[2023-10-13 01:56:07,999][46662] Updated weights for policy 0, policy_version 40830 (0.0011) +[2023-10-13 01:56:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 83591168. Throughput: 0: 1649.7, 1: 1688.6. Samples: 20903804. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:56:08,607][45375] Avg episode reward: [(0, '47.490'), (1, '50.190')] +[2023-10-13 01:56:10,975][46663] Updated weights for policy 1, policy_version 40801 (0.0007) +[2023-10-13 01:56:11,337][46663] Updated weights for policy 1, policy_version 40811 (0.0008) +[2023-10-13 01:56:11,711][46663] Updated weights for policy 1, policy_version 40821 (0.0009) +[2023-10-13 01:56:12,050][46662] Updated weights for policy 0, policy_version 40840 (0.0008) +[2023-10-13 01:56:12,075][46663] Updated weights for policy 1, policy_version 40831 (0.0008) +[2023-10-13 01:56:12,421][46662] Updated weights for policy 0, policy_version 40850 (0.0009) +[2023-10-13 01:56:12,800][46662] Updated weights for policy 0, policy_version 40860 (0.0010) +[2023-10-13 01:56:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83656704. Throughput: 0: 1665.4, 1: 1681.7. Samples: 20914614. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:56:13,608][45375] Avg episode reward: [(0, '48.620'), (1, '49.960')] +[2023-10-13 01:56:16,210][46663] Updated weights for policy 1, policy_version 40841 (0.0009) +[2023-10-13 01:56:16,584][46663] Updated weights for policy 1, policy_version 40851 (0.0008) +[2023-10-13 01:56:16,924][46662] Updated weights for policy 0, policy_version 40870 (0.0008) +[2023-10-13 01:56:16,956][46663] Updated weights for policy 1, policy_version 40861 (0.0008) +[2023-10-13 01:56:17,286][46662] Updated weights for policy 0, policy_version 40880 (0.0009) +[2023-10-13 01:56:17,663][46662] Updated weights for policy 0, policy_version 40890 (0.0008) +[2023-10-13 01:56:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83722240. Throughput: 0: 1668.2, 1: 1663.9. Samples: 20934464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:56:18,607][45375] Avg episode reward: [(0, '48.210'), (1, '49.800')] +[2023-10-13 01:56:20,965][46663] Updated weights for policy 1, policy_version 40871 (0.0010) +[2023-10-13 01:56:21,334][46663] Updated weights for policy 1, policy_version 40881 (0.0011) +[2023-10-13 01:56:21,613][46662] Updated weights for policy 0, policy_version 40900 (0.0007) +[2023-10-13 01:56:21,700][46663] Updated weights for policy 1, policy_version 40891 (0.0009) +[2023-10-13 01:56:21,996][46662] Updated weights for policy 0, policy_version 40910 (0.0008) +[2023-10-13 01:56:22,367][46662] Updated weights for policy 0, policy_version 40920 (0.0010) +[2023-10-13 01:56:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83787776. Throughput: 0: 1651.2, 1: 1685.9. Samples: 20953930. Policy #0 lag: (min: 11.0, avg: 13.8, max: 43.0) +[2023-10-13 01:56:23,607][45375] Avg episode reward: [(0, '47.830'), (1, '51.290')] +[2023-10-13 01:56:25,562][46663] Updated weights for policy 1, policy_version 40901 (0.0008) +[2023-10-13 01:56:25,931][46663] Updated weights for policy 1, policy_version 40911 (0.0009) +[2023-10-13 01:56:26,305][46663] Updated weights for policy 1, policy_version 40921 (0.0007) +[2023-10-13 01:56:26,401][46662] Updated weights for policy 0, policy_version 40930 (0.0009) +[2023-10-13 01:56:26,760][46662] Updated weights for policy 0, policy_version 40940 (0.0007) +[2023-10-13 01:56:27,140][46662] Updated weights for policy 0, policy_version 40950 (0.0010) +[2023-10-13 01:56:27,516][46662] Updated weights for policy 0, policy_version 40960 (0.0009) +[2023-10-13 01:56:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83853312. Throughput: 0: 1670.2, 1: 1656.6. Samples: 20964606. 
Policy #0 lag: (min: 11.0, avg: 13.8, max: 43.0) +[2023-10-13 01:56:28,608][45375] Avg episode reward: [(0, '47.300'), (1, '48.810')] +[2023-10-13 01:56:30,554][46663] Updated weights for policy 1, policy_version 40931 (0.0008) +[2023-10-13 01:56:30,915][46663] Updated weights for policy 1, policy_version 40941 (0.0008) +[2023-10-13 01:56:31,284][46663] Updated weights for policy 1, policy_version 40951 (0.0008) +[2023-10-13 01:56:31,581][46662] Updated weights for policy 0, policy_version 40970 (0.0008) +[2023-10-13 01:56:31,949][46662] Updated weights for policy 0, policy_version 40980 (0.0008) +[2023-10-13 01:56:32,323][46662] Updated weights for policy 0, policy_version 40990 (0.0010) +[2023-10-13 01:56:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83918848. Throughput: 0: 1659.9, 1: 1667.4. Samples: 20984336. Policy #0 lag: (min: 11.0, avg: 13.8, max: 43.0) +[2023-10-13 01:56:33,608][45375] Avg episode reward: [(0, '46.970'), (1, '49.460')] +[2023-10-13 01:56:35,432][46663] Updated weights for policy 1, policy_version 40961 (0.0009) +[2023-10-13 01:56:35,840][46663] Updated weights for policy 1, policy_version 40971 (0.0009) +[2023-10-13 01:56:36,212][46663] Updated weights for policy 1, policy_version 40981 (0.0010) +[2023-10-13 01:56:36,357][46662] Updated weights for policy 0, policy_version 41000 (0.0007) +[2023-10-13 01:56:36,578][46663] Updated weights for policy 1, policy_version 40991 (0.0007) +[2023-10-13 01:56:36,726][46662] Updated weights for policy 0, policy_version 41010 (0.0009) +[2023-10-13 01:56:37,104][46662] Updated weights for policy 0, policy_version 41020 (0.0009) +[2023-10-13 01:56:38,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83984384. Throughput: 0: 1666.0, 1: 1674.5. Samples: 21004344. 
Policy #0 lag: (min: 11.0, avg: 13.8, max: 43.0) +[2023-10-13 01:56:38,607][45375] Avg episode reward: [(0, '47.080'), (1, '48.170')] +[2023-10-13 01:56:40,573][46663] Updated weights for policy 1, policy_version 41001 (0.0007) +[2023-10-13 01:56:40,943][46663] Updated weights for policy 1, policy_version 41011 (0.0008) +[2023-10-13 01:56:41,099][46662] Updated weights for policy 0, policy_version 41030 (0.0009) +[2023-10-13 01:56:41,305][46663] Updated weights for policy 1, policy_version 41021 (0.0008) +[2023-10-13 01:56:41,472][46662] Updated weights for policy 0, policy_version 41040 (0.0008) +[2023-10-13 01:56:41,837][46662] Updated weights for policy 0, policy_version 41050 (0.0008) +[2023-10-13 01:56:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84049920. Throughput: 0: 1681.2, 1: 1653.1. Samples: 21014924. Policy #0 lag: (min: 11.0, avg: 13.8, max: 43.0) +[2023-10-13 01:56:43,607][45375] Avg episode reward: [(0, '46.520'), (1, '48.450')] +[2023-10-13 01:56:45,442][46663] Updated weights for policy 1, policy_version 41031 (0.0009) +[2023-10-13 01:56:45,801][46663] Updated weights for policy 1, policy_version 41041 (0.0008) +[2023-10-13 01:56:45,848][46662] Updated weights for policy 0, policy_version 41060 (0.0008) +[2023-10-13 01:56:46,172][46663] Updated weights for policy 1, policy_version 41051 (0.0009) +[2023-10-13 01:56:46,205][46662] Updated weights for policy 0, policy_version 41070 (0.0008) +[2023-10-13 01:56:46,574][46662] Updated weights for policy 0, policy_version 41080 (0.0008) +[2023-10-13 01:56:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84115456. Throughput: 0: 1657.7, 1: 1673.7. Samples: 21034172. 
Policy #0 lag: (min: 11.0, avg: 13.8, max: 43.0) +[2023-10-13 01:56:48,607][45375] Avg episode reward: [(0, '45.810'), (1, '48.920')] +[2023-10-13 01:56:50,458][46663] Updated weights for policy 1, policy_version 41061 (0.0008) +[2023-10-13 01:56:50,713][46662] Updated weights for policy 0, policy_version 41090 (0.0007) +[2023-10-13 01:56:50,818][46663] Updated weights for policy 1, policy_version 41071 (0.0009) +[2023-10-13 01:56:51,127][46662] Updated weights for policy 0, policy_version 41100 (0.0009) +[2023-10-13 01:56:51,188][46663] Updated weights for policy 1, policy_version 41081 (0.0008) +[2023-10-13 01:56:51,495][46662] Updated weights for policy 0, policy_version 41110 (0.0008) +[2023-10-13 01:56:51,855][46662] Updated weights for policy 0, policy_version 41120 (0.0007) +[2023-10-13 01:56:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 84180992. Throughput: 0: 1678.6, 1: 1666.2. Samples: 21054322. Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-13 01:56:53,608][45375] Avg episode reward: [(0, '45.440'), (1, '48.090')] +[2023-10-13 01:56:55,594][46663] Updated weights for policy 1, policy_version 41091 (0.0008) +[2023-10-13 01:56:55,917][46662] Updated weights for policy 0, policy_version 41130 (0.0008) +[2023-10-13 01:56:55,956][46663] Updated weights for policy 1, policy_version 41101 (0.0008) +[2023-10-13 01:56:56,294][46662] Updated weights for policy 0, policy_version 41140 (0.0007) +[2023-10-13 01:56:56,334][46663] Updated weights for policy 1, policy_version 41111 (0.0009) +[2023-10-13 01:56:56,664][46662] Updated weights for policy 0, policy_version 41150 (0.0007) +[2023-10-13 01:56:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84246528. Throughput: 0: 1680.4, 1: 1655.8. Samples: 21064742. 
Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-13 01:56:58,608][45375] Avg episode reward: [(0, '47.830'), (1, '48.710')] +[2023-10-13 01:57:00,332][46663] Updated weights for policy 1, policy_version 41121 (0.0008) +[2023-10-13 01:57:00,708][46663] Updated weights for policy 1, policy_version 41131 (0.0010) +[2023-10-13 01:57:00,803][46662] Updated weights for policy 0, policy_version 41160 (0.0009) +[2023-10-13 01:57:01,068][46663] Updated weights for policy 1, policy_version 41141 (0.0007) +[2023-10-13 01:57:01,171][46662] Updated weights for policy 0, policy_version 41170 (0.0009) +[2023-10-13 01:57:01,423][46663] Updated weights for policy 1, policy_version 41151 (0.0010) +[2023-10-13 01:57:01,541][46662] Updated weights for policy 0, policy_version 41180 (0.0008) +[2023-10-13 01:57:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84312064. Throughput: 0: 1655.8, 1: 1668.2. Samples: 21084044. Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-13 01:57:03,607][45375] Avg episode reward: [(0, '48.750'), (1, '48.040')] +[2023-10-13 01:57:05,301][46663] Updated weights for policy 1, policy_version 41161 (0.0011) +[2023-10-13 01:57:05,667][46663] Updated weights for policy 1, policy_version 41171 (0.0008) +[2023-10-13 01:57:05,732][46662] Updated weights for policy 0, policy_version 41190 (0.0008) +[2023-10-13 01:57:06,024][46663] Updated weights for policy 1, policy_version 41181 (0.0008) +[2023-10-13 01:57:06,108][46662] Updated weights for policy 0, policy_version 41200 (0.0009) +[2023-10-13 01:57:06,478][46662] Updated weights for policy 0, policy_version 41210 (0.0007) +[2023-10-13 01:57:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84377600. Throughput: 0: 1683.2, 1: 1673.5. Samples: 21104978. 
Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-13 01:57:08,607][45375] Avg episode reward: [(0, '49.660'), (1, '48.010')] +[2023-10-13 01:57:10,033][46663] Updated weights for policy 1, policy_version 41191 (0.0009) +[2023-10-13 01:57:10,399][46663] Updated weights for policy 1, policy_version 41201 (0.0009) +[2023-10-13 01:57:10,735][46662] Updated weights for policy 0, policy_version 41220 (0.0007) +[2023-10-13 01:57:10,773][46663] Updated weights for policy 1, policy_version 41211 (0.0009) +[2023-10-13 01:57:11,102][46662] Updated weights for policy 0, policy_version 41230 (0.0007) +[2023-10-13 01:57:11,464][46662] Updated weights for policy 0, policy_version 41240 (0.0009) +[2023-10-13 01:57:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 84443136. Throughput: 0: 1675.9, 1: 1667.9. Samples: 21115076. Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-13 01:57:13,608][45375] Avg episode reward: [(0, '49.340'), (1, '46.910')] +[2023-10-13 01:57:14,696][46663] Updated weights for policy 1, policy_version 41221 (0.0009) +[2023-10-13 01:57:15,062][46663] Updated weights for policy 1, policy_version 41231 (0.0009) +[2023-10-13 01:57:15,386][46662] Updated weights for policy 0, policy_version 41250 (0.0008) +[2023-10-13 01:57:15,434][46663] Updated weights for policy 1, policy_version 41241 (0.0010) +[2023-10-13 01:57:15,755][46662] Updated weights for policy 0, policy_version 41260 (0.0008) +[2023-10-13 01:57:16,130][46662] Updated weights for policy 0, policy_version 41270 (0.0009) +[2023-10-13 01:57:16,494][46662] Updated weights for policy 0, policy_version 41280 (0.0008) +[2023-10-13 01:57:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84508672. Throughput: 0: 1669.8, 1: 1682.4. Samples: 21135182. 
Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-13 01:57:18,607][45375] Avg episode reward: [(0, '47.900'), (1, '47.700')] +[2023-10-13 01:57:19,353][46663] Updated weights for policy 1, policy_version 41251 (0.0007) +[2023-10-13 01:57:19,709][46663] Updated weights for policy 1, policy_version 41261 (0.0007) +[2023-10-13 01:57:20,080][46663] Updated weights for policy 1, policy_version 41271 (0.0009) +[2023-10-13 01:57:20,491][46662] Updated weights for policy 0, policy_version 41290 (0.0010) +[2023-10-13 01:57:20,862][46662] Updated weights for policy 0, policy_version 41300 (0.0009) +[2023-10-13 01:57:21,229][46662] Updated weights for policy 0, policy_version 41310 (0.0008) +[2023-10-13 01:57:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84574208. Throughput: 0: 1681.2, 1: 1684.5. Samples: 21155804. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-13 01:57:23,607][45375] Avg episode reward: [(0, '46.260'), (1, '47.980')] +[2023-10-13 01:57:24,298][46663] Updated weights for policy 1, policy_version 41281 (0.0009) +[2023-10-13 01:57:24,708][46663] Updated weights for policy 1, policy_version 41291 (0.0010) +[2023-10-13 01:57:25,080][46663] Updated weights for policy 1, policy_version 41301 (0.0009) +[2023-10-13 01:57:25,203][46662] Updated weights for policy 0, policy_version 41320 (0.0008) +[2023-10-13 01:57:25,446][46663] Updated weights for policy 1, policy_version 41311 (0.0007) +[2023-10-13 01:57:25,573][46662] Updated weights for policy 0, policy_version 41330 (0.0008) +[2023-10-13 01:57:25,936][46662] Updated weights for policy 0, policy_version 41340 (0.0008) +[2023-10-13 01:57:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84639744. Throughput: 0: 1661.6, 1: 1678.1. Samples: 21165212. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-13 01:57:28,607][45375] Avg episode reward: [(0, '45.950'), (1, '46.480')] +[2023-10-13 01:57:29,617][46663] Updated weights for policy 1, policy_version 41321 (0.0007) +[2023-10-13 01:57:29,977][46663] Updated weights for policy 1, policy_version 41331 (0.0007) +[2023-10-13 01:57:30,075][46662] Updated weights for policy 0, policy_version 41350 (0.0008) +[2023-10-13 01:57:30,343][46663] Updated weights for policy 1, policy_version 41341 (0.0007) +[2023-10-13 01:57:30,436][46662] Updated weights for policy 0, policy_version 41360 (0.0008) +[2023-10-13 01:57:30,818][46662] Updated weights for policy 0, policy_version 41370 (0.0008) +[2023-10-13 01:57:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84705280. Throughput: 0: 1679.3, 1: 1684.4. Samples: 21185542. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-13 01:57:33,607][45375] Avg episode reward: [(0, '46.870'), (1, '48.810')] +[2023-10-13 01:57:34,352][46663] Updated weights for policy 1, policy_version 41351 (0.0010) +[2023-10-13 01:57:34,716][46663] Updated weights for policy 1, policy_version 41361 (0.0010) +[2023-10-13 01:57:34,855][46662] Updated weights for policy 0, policy_version 41380 (0.0008) +[2023-10-13 01:57:35,087][46663] Updated weights for policy 1, policy_version 41371 (0.0008) +[2023-10-13 01:57:35,226][46662] Updated weights for policy 0, policy_version 41390 (0.0008) +[2023-10-13 01:57:35,597][46662] Updated weights for policy 0, policy_version 41400 (0.0008) +[2023-10-13 01:57:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 84770816. Throughput: 0: 1683.9, 1: 1695.3. Samples: 21206384. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-13 01:57:38,607][45375] Avg episode reward: [(0, '45.980'), (1, '49.490')] +[2023-10-13 01:57:39,175][46663] Updated weights for policy 1, policy_version 41381 (0.0008) +[2023-10-13 01:57:39,537][46663] Updated weights for policy 1, policy_version 41391 (0.0009) +[2023-10-13 01:57:39,902][46663] Updated weights for policy 1, policy_version 41401 (0.0009) +[2023-10-13 01:57:39,913][46662] Updated weights for policy 0, policy_version 41410 (0.0009) +[2023-10-13 01:57:40,286][46662] Updated weights for policy 0, policy_version 41420 (0.0010) +[2023-10-13 01:57:40,645][46662] Updated weights for policy 0, policy_version 41430 (0.0010) +[2023-10-13 01:57:41,024][46662] Updated weights for policy 0, policy_version 41440 (0.0010) +[2023-10-13 01:57:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 84836352. Throughput: 0: 1665.2, 1: 1688.7. Samples: 21215670. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-13 01:57:43,608][45375] Avg episode reward: [(0, '45.390'), (1, '49.940')] +[2023-10-13 01:57:43,955][46663] Updated weights for policy 1, policy_version 41411 (0.0008) +[2023-10-13 01:57:44,315][46663] Updated weights for policy 1, policy_version 41421 (0.0009) +[2023-10-13 01:57:44,680][46663] Updated weights for policy 1, policy_version 41431 (0.0009) +[2023-10-13 01:57:45,103][46662] Updated weights for policy 0, policy_version 41450 (0.0010) +[2023-10-13 01:57:45,478][46662] Updated weights for policy 0, policy_version 41460 (0.0007) +[2023-10-13 01:57:45,851][46662] Updated weights for policy 0, policy_version 41470 (0.0008) +[2023-10-13 01:57:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84901888. Throughput: 0: 1678.8, 1: 1699.7. Samples: 21236078. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-13 01:57:48,607][45375] Avg episode reward: [(0, '45.590'), (1, '51.150')] +[2023-10-13 01:57:48,662][46663] Updated weights for policy 1, policy_version 41441 (0.0009) +[2023-10-13 01:57:49,035][46663] Updated weights for policy 1, policy_version 41451 (0.0007) +[2023-10-13 01:57:49,407][46663] Updated weights for policy 1, policy_version 41461 (0.0007) +[2023-10-13 01:57:49,779][46663] Updated weights for policy 1, policy_version 41471 (0.0007) +[2023-10-13 01:57:50,159][46662] Updated weights for policy 0, policy_version 41480 (0.0009) +[2023-10-13 01:57:50,534][46662] Updated weights for policy 0, policy_version 41490 (0.0010) +[2023-10-13 01:57:50,901][46662] Updated weights for policy 0, policy_version 41500 (0.0009) +[2023-10-13 01:57:53,544][46663] Updated weights for policy 1, policy_version 41481 (0.0010) +[2023-10-13 01:57:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84967424. Throughput: 0: 1673.1, 1: 1699.6. Samples: 21256752. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 01:57:53,607][45375] Avg episode reward: [(0, '45.360'), (1, '50.870')] +[2023-10-13 01:57:53,614][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000041504_42500096.pth... +[2023-10-13 01:57:53,649][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000039936_40894464.pth +[2023-10-13 01:57:53,903][46663] Updated weights for policy 1, policy_version 41491 (0.0011) +[2023-10-13 01:57:54,274][46663] Updated weights for policy 1, policy_version 41501 (0.0009) +[2023-10-13 01:57:54,386][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000041504_42500096.pth... 
+[2023-10-13 01:57:54,414][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000039904_40861696.pth +[2023-10-13 01:57:54,718][46662] Updated weights for policy 0, policy_version 41510 (0.0007) +[2023-10-13 01:57:55,085][46662] Updated weights for policy 0, policy_version 41520 (0.0007) +[2023-10-13 01:57:55,455][46662] Updated weights for policy 0, policy_version 41530 (0.0008) +[2023-10-13 01:57:58,521][46663] Updated weights for policy 1, policy_version 41511 (0.0008) +[2023-10-13 01:57:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85032960. Throughput: 0: 1654.6, 1: 1704.4. Samples: 21266232. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 01:57:58,607][45375] Avg episode reward: [(0, '46.020'), (1, '49.040')] +[2023-10-13 01:57:58,891][46663] Updated weights for policy 1, policy_version 41521 (0.0009) +[2023-10-13 01:57:59,260][46663] Updated weights for policy 1, policy_version 41531 (0.0009) +[2023-10-13 01:57:59,393][46662] Updated weights for policy 0, policy_version 41540 (0.0008) +[2023-10-13 01:57:59,759][46662] Updated weights for policy 0, policy_version 41550 (0.0008) +[2023-10-13 01:58:00,135][46662] Updated weights for policy 0, policy_version 41560 (0.0009) +[2023-10-13 01:58:03,423][46663] Updated weights for policy 1, policy_version 41541 (0.0010) +[2023-10-13 01:58:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85098496. Throughput: 0: 1676.0, 1: 1697.5. Samples: 21286990. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 01:58:03,607][45375] Avg episode reward: [(0, '45.400'), (1, '49.330')] +[2023-10-13 01:58:03,778][46663] Updated weights for policy 1, policy_version 41551 (0.0009) +[2023-10-13 01:58:04,103][46662] Updated weights for policy 0, policy_version 41570 (0.0008) +[2023-10-13 01:58:04,151][46663] Updated weights for policy 1, policy_version 41561 (0.0007) +[2023-10-13 01:58:04,482][46662] Updated weights for policy 0, policy_version 41580 (0.0007) +[2023-10-13 01:58:04,847][46662] Updated weights for policy 0, policy_version 41590 (0.0007) +[2023-10-13 01:58:05,220][46662] Updated weights for policy 0, policy_version 41600 (0.0008) +[2023-10-13 01:58:08,242][46663] Updated weights for policy 1, policy_version 41571 (0.0007) +[2023-10-13 01:58:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85164032. Throughput: 0: 1686.6, 1: 1691.9. Samples: 21307840. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 01:58:08,608][45375] Avg episode reward: [(0, '46.040'), (1, '49.820')] +[2023-10-13 01:58:08,614][46663] Updated weights for policy 1, policy_version 41581 (0.0007) +[2023-10-13 01:58:08,981][46663] Updated weights for policy 1, policy_version 41591 (0.0009) +[2023-10-13 01:58:09,224][46662] Updated weights for policy 0, policy_version 41610 (0.0008) +[2023-10-13 01:58:09,584][46662] Updated weights for policy 0, policy_version 41620 (0.0010) +[2023-10-13 01:58:09,956][46662] Updated weights for policy 0, policy_version 41630 (0.0011) +[2023-10-13 01:58:13,026][46663] Updated weights for policy 1, policy_version 41601 (0.0008) +[2023-10-13 01:58:13,447][46663] Updated weights for policy 1, policy_version 41611 (0.0008) +[2023-10-13 01:58:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85229568. Throughput: 0: 1674.6, 1: 1701.7. Samples: 21317148. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 01:58:13,608][45375] Avg episode reward: [(0, '46.080'), (1, '50.140')] +[2023-10-13 01:58:13,813][46663] Updated weights for policy 1, policy_version 41621 (0.0007) +[2023-10-13 01:58:14,149][46662] Updated weights for policy 0, policy_version 41640 (0.0007) +[2023-10-13 01:58:14,178][46663] Updated weights for policy 1, policy_version 41631 (0.0007) +[2023-10-13 01:58:14,525][46662] Updated weights for policy 0, policy_version 41650 (0.0007) +[2023-10-13 01:58:14,893][46662] Updated weights for policy 0, policy_version 41660 (0.0009) +[2023-10-13 01:58:18,143][46663] Updated weights for policy 1, policy_version 41641 (0.0009) +[2023-10-13 01:58:18,521][46663] Updated weights for policy 1, policy_version 41651 (0.0008) +[2023-10-13 01:58:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85295104. Throughput: 0: 1680.8, 1: 1700.5. Samples: 21337698. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 01:58:18,607][45375] Avg episode reward: [(0, '46.680'), (1, '49.240')] +[2023-10-13 01:58:18,890][46663] Updated weights for policy 1, policy_version 41661 (0.0007) +[2023-10-13 01:58:19,027][46662] Updated weights for policy 0, policy_version 41670 (0.0010) +[2023-10-13 01:58:19,408][46662] Updated weights for policy 0, policy_version 41680 (0.0009) +[2023-10-13 01:58:19,782][46662] Updated weights for policy 0, policy_version 41690 (0.0009) +[2023-10-13 01:58:23,016][46663] Updated weights for policy 1, policy_version 41671 (0.0008) +[2023-10-13 01:58:23,385][46663] Updated weights for policy 1, policy_version 41681 (0.0007) +[2023-10-13 01:58:23,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85360640. Throughput: 0: 1679.6, 1: 1680.8. Samples: 21357598. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:23,607][45375] Avg episode reward: [(0, '48.990'), (1, '48.740')] +[2023-10-13 01:58:23,752][46663] Updated weights for policy 1, policy_version 41691 (0.0007) +[2023-10-13 01:58:23,827][46662] Updated weights for policy 0, policy_version 41700 (0.0008) +[2023-10-13 01:58:24,204][46662] Updated weights for policy 0, policy_version 41710 (0.0010) +[2023-10-13 01:58:24,568][46662] Updated weights for policy 0, policy_version 41720 (0.0011) +[2023-10-13 01:58:27,615][46663] Updated weights for policy 1, policy_version 41701 (0.0007) +[2023-10-13 01:58:27,988][46663] Updated weights for policy 1, policy_version 41711 (0.0008) +[2023-10-13 01:58:28,351][46663] Updated weights for policy 1, policy_version 41721 (0.0009) +[2023-10-13 01:58:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85426176. Throughput: 0: 1676.9, 1: 1695.4. Samples: 21367424. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:28,607][45375] Avg episode reward: [(0, '48.280'), (1, '48.560')] +[2023-10-13 01:58:28,728][46662] Updated weights for policy 0, policy_version 41730 (0.0009) +[2023-10-13 01:58:29,090][46662] Updated weights for policy 0, policy_version 41740 (0.0010) +[2023-10-13 01:58:29,460][46662] Updated weights for policy 0, policy_version 41750 (0.0011) +[2023-10-13 01:58:29,837][46662] Updated weights for policy 0, policy_version 41760 (0.0009) +[2023-10-13 01:58:32,404][46663] Updated weights for policy 1, policy_version 41731 (0.0009) +[2023-10-13 01:58:32,773][46663] Updated weights for policy 1, policy_version 41741 (0.0008) +[2023-10-13 01:58:33,147][46663] Updated weights for policy 1, policy_version 41751 (0.0008) +[2023-10-13 01:58:33,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85524480. Throughput: 0: 1683.7, 1: 1688.3. Samples: 21387816. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:33,608][45375] Avg episode reward: [(0, '46.900'), (1, '47.700')] +[2023-10-13 01:58:33,958][46662] Updated weights for policy 0, policy_version 41770 (0.0009) +[2023-10-13 01:58:34,323][46662] Updated weights for policy 0, policy_version 41780 (0.0010) +[2023-10-13 01:58:34,695][46662] Updated weights for policy 0, policy_version 41790 (0.0010) +[2023-10-13 01:58:37,073][46663] Updated weights for policy 1, policy_version 41761 (0.0009) +[2023-10-13 01:58:37,437][46663] Updated weights for policy 1, policy_version 41771 (0.0010) +[2023-10-13 01:58:37,807][46663] Updated weights for policy 1, policy_version 41781 (0.0008) +[2023-10-13 01:58:38,171][46663] Updated weights for policy 1, policy_version 41791 (0.0007) +[2023-10-13 01:58:38,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 85590016. Throughput: 0: 1688.7, 1: 1657.3. Samples: 21407324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:38,607][45375] Avg episode reward: [(0, '46.820'), (1, '47.770')] +[2023-10-13 01:58:38,850][46662] Updated weights for policy 0, policy_version 41800 (0.0009) +[2023-10-13 01:58:39,224][46662] Updated weights for policy 0, policy_version 41810 (0.0008) +[2023-10-13 01:58:39,592][46662] Updated weights for policy 0, policy_version 41820 (0.0008) +[2023-10-13 01:58:42,197][46663] Updated weights for policy 1, policy_version 41801 (0.0010) +[2023-10-13 01:58:42,566][46663] Updated weights for policy 1, policy_version 41811 (0.0010) +[2023-10-13 01:58:42,923][46663] Updated weights for policy 1, policy_version 41821 (0.0011) +[2023-10-13 01:58:43,584][46662] Updated weights for policy 0, policy_version 41830 (0.0008) +[2023-10-13 01:58:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 85655552. Throughput: 0: 1682.8, 1: 1683.9. Samples: 21417730. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:43,607][45375] Avg episode reward: [(0, '48.920'), (1, '46.910')] +[2023-10-13 01:58:43,953][46662] Updated weights for policy 0, policy_version 41840 (0.0007) +[2023-10-13 01:58:44,321][46662] Updated weights for policy 0, policy_version 41850 (0.0009) +[2023-10-13 01:58:47,071][46663] Updated weights for policy 1, policy_version 41831 (0.0011) +[2023-10-13 01:58:47,437][46663] Updated weights for policy 1, policy_version 41841 (0.0008) +[2023-10-13 01:58:47,815][46663] Updated weights for policy 1, policy_version 41851 (0.0009) +[2023-10-13 01:58:48,476][46662] Updated weights for policy 0, policy_version 41860 (0.0009) +[2023-10-13 01:58:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85721088. Throughput: 0: 1680.0, 1: 1671.8. Samples: 21437822. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:48,607][45375] Avg episode reward: [(0, '47.440'), (1, '46.740')] +[2023-10-13 01:58:48,852][46662] Updated weights for policy 0, policy_version 41870 (0.0007) +[2023-10-13 01:58:49,215][46662] Updated weights for policy 0, policy_version 41880 (0.0008) +[2023-10-13 01:58:52,002][46663] Updated weights for policy 1, policy_version 41861 (0.0007) +[2023-10-13 01:58:52,373][46663] Updated weights for policy 1, policy_version 41871 (0.0008) +[2023-10-13 01:58:52,743][46663] Updated weights for policy 1, policy_version 41881 (0.0008) +[2023-10-13 01:58:53,198][46662] Updated weights for policy 0, policy_version 41890 (0.0007) +[2023-10-13 01:58:53,580][46662] Updated weights for policy 0, policy_version 41900 (0.0009) +[2023-10-13 01:58:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85786624. Throughput: 0: 1675.2, 1: 1662.7. Samples: 21458048. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:53,608][45375] Avg episode reward: [(0, '47.500'), (1, '45.540')] +[2023-10-13 01:58:53,948][46662] Updated weights for policy 0, policy_version 41910 (0.0008) +[2023-10-13 01:58:54,310][46662] Updated weights for policy 0, policy_version 41920 (0.0008) +[2023-10-13 01:58:56,866][46663] Updated weights for policy 1, policy_version 41891 (0.0008) +[2023-10-13 01:58:57,240][46663] Updated weights for policy 1, policy_version 41901 (0.0008) +[2023-10-13 01:58:57,619][46663] Updated weights for policy 1, policy_version 41911 (0.0009) +[2023-10-13 01:58:58,364][46662] Updated weights for policy 0, policy_version 41930 (0.0008) +[2023-10-13 01:58:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85852160. Throughput: 0: 1674.8, 1: 1683.8. Samples: 21468284. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:58:58,607][45375] Avg episode reward: [(0, '44.940'), (1, '45.640')] +[2023-10-13 01:58:58,729][46662] Updated weights for policy 0, policy_version 41940 (0.0007) +[2023-10-13 01:58:59,095][46662] Updated weights for policy 0, policy_version 41950 (0.0008) +[2023-10-13 01:59:01,730][46663] Updated weights for policy 1, policy_version 41921 (0.0010) +[2023-10-13 01:59:02,145][46663] Updated weights for policy 1, policy_version 41931 (0.0008) +[2023-10-13 01:59:02,511][46663] Updated weights for policy 1, policy_version 41941 (0.0010) +[2023-10-13 01:59:02,884][46663] Updated weights for policy 1, policy_version 41951 (0.0010) +[2023-10-13 01:59:03,164][46662] Updated weights for policy 0, policy_version 41960 (0.0009) +[2023-10-13 01:59:03,540][46662] Updated weights for policy 0, policy_version 41970 (0.0010) +[2023-10-13 01:59:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 85917696. Throughput: 0: 1677.6, 1: 1665.5. Samples: 21488138. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:03,607][45375] Avg episode reward: [(0, '45.220'), (1, '47.020')] +[2023-10-13 01:59:03,909][46662] Updated weights for policy 0, policy_version 41980 (0.0007) +[2023-10-13 01:59:06,784][46663] Updated weights for policy 1, policy_version 41961 (0.0011) +[2023-10-13 01:59:07,156][46663] Updated weights for policy 1, policy_version 41971 (0.0010) +[2023-10-13 01:59:07,514][46663] Updated weights for policy 1, policy_version 41981 (0.0010) +[2023-10-13 01:59:08,134][46662] Updated weights for policy 0, policy_version 41990 (0.0009) +[2023-10-13 01:59:08,498][46662] Updated weights for policy 0, policy_version 42000 (0.0009) +[2023-10-13 01:59:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85983232. Throughput: 0: 1676.4, 1: 1675.6. Samples: 21508440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:08,608][45375] Avg episode reward: [(0, '45.860'), (1, '46.200')] +[2023-10-13 01:59:08,865][46662] Updated weights for policy 0, policy_version 42010 (0.0008) +[2023-10-13 01:59:11,354][46663] Updated weights for policy 1, policy_version 41991 (0.0007) +[2023-10-13 01:59:11,720][46663] Updated weights for policy 1, policy_version 42001 (0.0007) +[2023-10-13 01:59:12,084][46663] Updated weights for policy 1, policy_version 42011 (0.0010) +[2023-10-13 01:59:12,991][46662] Updated weights for policy 0, policy_version 42020 (0.0008) +[2023-10-13 01:59:13,360][46662] Updated weights for policy 0, policy_version 42030 (0.0009) +[2023-10-13 01:59:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86048768. Throughput: 0: 1674.1, 1: 1683.6. Samples: 21518524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:13,607][45375] Avg episode reward: [(0, '47.190'), (1, '45.020')] +[2023-10-13 01:59:13,728][46662] Updated weights for policy 0, policy_version 42040 (0.0007) +[2023-10-13 01:59:16,236][46663] Updated weights for policy 1, policy_version 42021 (0.0009) +[2023-10-13 01:59:16,608][46663] Updated weights for policy 1, policy_version 42031 (0.0010) +[2023-10-13 01:59:16,985][46663] Updated weights for policy 1, policy_version 42041 (0.0007) +[2023-10-13 01:59:17,909][46662] Updated weights for policy 0, policy_version 42050 (0.0007) +[2023-10-13 01:59:18,321][46662] Updated weights for policy 0, policy_version 42060 (0.0007) +[2023-10-13 01:59:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86114304. Throughput: 0: 1678.1, 1: 1664.0. Samples: 21538206. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:18,607][45375] Avg episode reward: [(0, '48.380'), (1, '45.480')] +[2023-10-13 01:59:18,699][46662] Updated weights for policy 0, policy_version 42070 (0.0009) +[2023-10-13 01:59:19,069][46662] Updated weights for policy 0, policy_version 42080 (0.0007) +[2023-10-13 01:59:21,093][46663] Updated weights for policy 1, policy_version 42051 (0.0009) +[2023-10-13 01:59:21,458][46663] Updated weights for policy 1, policy_version 42061 (0.0011) +[2023-10-13 01:59:21,829][46663] Updated weights for policy 1, policy_version 42071 (0.0010) +[2023-10-13 01:59:23,115][46662] Updated weights for policy 0, policy_version 42090 (0.0007) +[2023-10-13 01:59:23,480][46662] Updated weights for policy 0, policy_version 42100 (0.0008) +[2023-10-13 01:59:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86179840. Throughput: 0: 1673.8, 1: 1688.9. Samples: 21558646. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 01:59:23,608][45375] Avg episode reward: [(0, '47.670'), (1, '45.390')] +[2023-10-13 01:59:23,846][46662] Updated weights for policy 0, policy_version 42110 (0.0010) +[2023-10-13 01:59:25,801][46663] Updated weights for policy 1, policy_version 42081 (0.0007) +[2023-10-13 01:59:26,166][46663] Updated weights for policy 1, policy_version 42091 (0.0008) +[2023-10-13 01:59:26,529][46663] Updated weights for policy 1, policy_version 42101 (0.0011) +[2023-10-13 01:59:26,893][46663] Updated weights for policy 1, policy_version 42111 (0.0009) +[2023-10-13 01:59:27,885][46662] Updated weights for policy 0, policy_version 42120 (0.0010) +[2023-10-13 01:59:28,260][46662] Updated weights for policy 0, policy_version 42130 (0.0009) +[2023-10-13 01:59:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86245376. Throughput: 0: 1676.4, 1: 1675.4. Samples: 21568562. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 01:59:28,607][45375] Avg episode reward: [(0, '47.650'), (1, '45.300')] +[2023-10-13 01:59:28,633][46662] Updated weights for policy 0, policy_version 42140 (0.0007) +[2023-10-13 01:59:31,079][46663] Updated weights for policy 1, policy_version 42121 (0.0009) +[2023-10-13 01:59:31,446][46663] Updated weights for policy 1, policy_version 42131 (0.0010) +[2023-10-13 01:59:31,811][46663] Updated weights for policy 1, policy_version 42141 (0.0007) +[2023-10-13 01:59:32,887][46662] Updated weights for policy 0, policy_version 42150 (0.0009) +[2023-10-13 01:59:33,264][46662] Updated weights for policy 0, policy_version 42160 (0.0010) +[2023-10-13 01:59:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 86310912. Throughput: 0: 1672.1, 1: 1671.2. Samples: 21588272. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 01:59:33,607][45375] Avg episode reward: [(0, '48.000'), (1, '46.770')] +[2023-10-13 01:59:33,625][46662] Updated weights for policy 0, policy_version 42170 (0.0009) +[2023-10-13 01:59:35,714][46663] Updated weights for policy 1, policy_version 42151 (0.0007) +[2023-10-13 01:59:36,075][46663] Updated weights for policy 1, policy_version 42161 (0.0009) +[2023-10-13 01:59:36,450][46663] Updated weights for policy 1, policy_version 42171 (0.0008) +[2023-10-13 01:59:37,559][46662] Updated weights for policy 0, policy_version 42180 (0.0008) +[2023-10-13 01:59:37,926][46662] Updated weights for policy 0, policy_version 42190 (0.0008) +[2023-10-13 01:59:38,286][46662] Updated weights for policy 0, policy_version 42200 (0.0009) +[2023-10-13 01:59:38,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 86409216. Throughput: 0: 1659.7, 1: 1690.3. Samples: 21608796. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 01:59:38,607][45375] Avg episode reward: [(0, '49.270'), (1, '48.810')] +[2023-10-13 01:59:40,654][46663] Updated weights for policy 1, policy_version 42181 (0.0010) +[2023-10-13 01:59:41,016][46663] Updated weights for policy 1, policy_version 42191 (0.0008) +[2023-10-13 01:59:41,381][46663] Updated weights for policy 1, policy_version 42201 (0.0007) +[2023-10-13 01:59:42,335][46662] Updated weights for policy 0, policy_version 42210 (0.0010) +[2023-10-13 01:59:42,702][46662] Updated weights for policy 0, policy_version 42220 (0.0008) +[2023-10-13 01:59:43,076][46662] Updated weights for policy 0, policy_version 42230 (0.0008) +[2023-10-13 01:59:43,448][46662] Updated weights for policy 0, policy_version 42240 (0.0008) +[2023-10-13 01:59:43,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 86474752. Throughput: 0: 1672.6, 1: 1667.4. Samples: 21618584. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 01:59:43,608][45375] Avg episode reward: [(0, '49.410'), (1, '47.330')] +[2023-10-13 01:59:45,613][46663] Updated weights for policy 1, policy_version 42211 (0.0007) +[2023-10-13 01:59:45,981][46663] Updated weights for policy 1, policy_version 42221 (0.0009) +[2023-10-13 01:59:46,349][46663] Updated weights for policy 1, policy_version 42231 (0.0010) +[2023-10-13 01:59:47,567][46662] Updated weights for policy 0, policy_version 42250 (0.0011) +[2023-10-13 01:59:47,939][46662] Updated weights for policy 0, policy_version 42260 (0.0008) +[2023-10-13 01:59:48,309][46662] Updated weights for policy 0, policy_version 42270 (0.0010) +[2023-10-13 01:59:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 86540288. Throughput: 0: 1673.8, 1: 1679.6. Samples: 21639038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:48,607][45375] Avg episode reward: [(0, '49.940'), (1, '48.590')] +[2023-10-13 01:59:50,351][46663] Updated weights for policy 1, policy_version 42241 (0.0008) +[2023-10-13 01:59:50,795][46663] Updated weights for policy 1, policy_version 42251 (0.0007) +[2023-10-13 01:59:51,172][46663] Updated weights for policy 1, policy_version 42261 (0.0007) +[2023-10-13 01:59:51,548][46663] Updated weights for policy 1, policy_version 42271 (0.0008) +[2023-10-13 01:59:52,275][46662] Updated weights for policy 0, policy_version 42280 (0.0008) +[2023-10-13 01:59:52,644][46662] Updated weights for policy 0, policy_version 42290 (0.0008) +[2023-10-13 01:59:53,022][46662] Updated weights for policy 0, policy_version 42300 (0.0009) +[2023-10-13 01:59:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 86605824. Throughput: 0: 1657.8, 1: 1686.8. Samples: 21658946. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:53,608][45375] Avg episode reward: [(0, '51.130'), (1, '48.750')] +[2023-10-13 01:59:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000042272_43286528.pth... +[2023-10-13 01:59:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000042304_43319296.pth... +[2023-10-13 01:59:53,647][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000040704_41680896.pth +[2023-10-13 01:59:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000040736_41713664.pth +[2023-10-13 01:59:55,481][46663] Updated weights for policy 1, policy_version 42281 (0.0010) +[2023-10-13 01:59:55,847][46663] Updated weights for policy 1, policy_version 42291 (0.0008) +[2023-10-13 01:59:56,217][46663] Updated weights for policy 1, policy_version 42301 (0.0011) +[2023-10-13 01:59:57,045][46662] Updated weights for policy 0, policy_version 42310 (0.0008) +[2023-10-13 01:59:57,408][46662] Updated weights for policy 0, policy_version 42320 (0.0009) +[2023-10-13 01:59:57,789][46662] Updated weights for policy 0, policy_version 42330 (0.0008) +[2023-10-13 01:59:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86671360. Throughput: 0: 1678.5, 1: 1662.8. Samples: 21668884. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 01:59:58,607][45375] Avg episode reward: [(0, '53.790'), (1, '49.230')] +[2023-10-13 02:00:00,208][46663] Updated weights for policy 1, policy_version 42311 (0.0008) +[2023-10-13 02:00:00,584][46663] Updated weights for policy 1, policy_version 42321 (0.0007) +[2023-10-13 02:00:00,943][46663] Updated weights for policy 1, policy_version 42331 (0.0007) +[2023-10-13 02:00:01,761][46662] Updated weights for policy 0, policy_version 42340 (0.0009) +[2023-10-13 02:00:02,132][46662] Updated weights for policy 0, policy_version 42350 (0.0008) +[2023-10-13 02:00:02,491][46662] Updated weights for policy 0, policy_version 42360 (0.0010) +[2023-10-13 02:00:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86736896. Throughput: 0: 1676.9, 1: 1686.5. Samples: 21689560. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:00:03,607][45375] Avg episode reward: [(0, '53.900'), (1, '50.760')] +[2023-10-13 02:00:04,941][46663] Updated weights for policy 1, policy_version 42341 (0.0008) +[2023-10-13 02:00:05,301][46663] Updated weights for policy 1, policy_version 42351 (0.0009) +[2023-10-13 02:00:05,673][46663] Updated weights for policy 1, policy_version 42361 (0.0007) +[2023-10-13 02:00:06,492][46662] Updated weights for policy 0, policy_version 42370 (0.0008) +[2023-10-13 02:00:06,862][46662] Updated weights for policy 0, policy_version 42380 (0.0011) +[2023-10-13 02:00:07,231][46662] Updated weights for policy 0, policy_version 42390 (0.0010) +[2023-10-13 02:00:07,599][46662] Updated weights for policy 0, policy_version 42400 (0.0010) +[2023-10-13 02:00:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86802432. Throughput: 0: 1659.0, 1: 1687.7. Samples: 21709246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:00:08,607][45375] Avg episode reward: [(0, '54.820'), (1, '50.090')] +[2023-10-13 02:00:09,840][46663] Updated weights for policy 1, policy_version 42371 (0.0008) +[2023-10-13 02:00:10,207][46663] Updated weights for policy 1, policy_version 42381 (0.0011) +[2023-10-13 02:00:10,581][46663] Updated weights for policy 1, policy_version 42391 (0.0009) +[2023-10-13 02:00:11,644][46662] Updated weights for policy 0, policy_version 42410 (0.0007) +[2023-10-13 02:00:12,006][46662] Updated weights for policy 0, policy_version 42420 (0.0007) +[2023-10-13 02:00:12,382][46662] Updated weights for policy 0, policy_version 42430 (0.0008) +[2023-10-13 02:00:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86867968. Throughput: 0: 1688.9, 1: 1667.8. Samples: 21719616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:00:13,607][45375] Avg episode reward: [(0, '55.770'), (1, '50.840')] +[2023-10-13 02:00:13,608][46091] Saving new best policy, reward=55.770! +[2023-10-13 02:00:14,665][46663] Updated weights for policy 1, policy_version 42401 (0.0009) +[2023-10-13 02:00:15,030][46663] Updated weights for policy 1, policy_version 42411 (0.0009) +[2023-10-13 02:00:15,397][46663] Updated weights for policy 1, policy_version 42421 (0.0011) +[2023-10-13 02:00:15,772][46663] Updated weights for policy 1, policy_version 42431 (0.0008) +[2023-10-13 02:00:16,377][46662] Updated weights for policy 0, policy_version 42440 (0.0007) +[2023-10-13 02:00:16,749][46662] Updated weights for policy 0, policy_version 42450 (0.0009) +[2023-10-13 02:00:17,121][46662] Updated weights for policy 0, policy_version 42460 (0.0007) +[2023-10-13 02:00:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86933504. Throughput: 0: 1677.6, 1: 1686.8. Samples: 21739670. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 02:00:18,607][45375] Avg episode reward: [(0, '56.170'), (1, '50.730')] +[2023-10-13 02:00:18,608][46091] Saving new best policy, reward=56.170! +[2023-10-13 02:00:19,874][46663] Updated weights for policy 1, policy_version 42441 (0.0009) +[2023-10-13 02:00:20,251][46663] Updated weights for policy 1, policy_version 42451 (0.0009) +[2023-10-13 02:00:20,624][46663] Updated weights for policy 1, policy_version 42461 (0.0009) +[2023-10-13 02:00:21,346][46662] Updated weights for policy 0, policy_version 42470 (0.0008) +[2023-10-13 02:00:21,716][46662] Updated weights for policy 0, policy_version 42480 (0.0007) +[2023-10-13 02:00:22,090][46662] Updated weights for policy 0, policy_version 42490 (0.0009) +[2023-10-13 02:00:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86999040. Throughput: 0: 1672.2, 1: 1679.2. Samples: 21759606. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 02:00:23,607][45375] Avg episode reward: [(0, '56.120'), (1, '50.010')] +[2023-10-13 02:00:24,690][46663] Updated weights for policy 1, policy_version 42471 (0.0008) +[2023-10-13 02:00:25,051][46663] Updated weights for policy 1, policy_version 42481 (0.0009) +[2023-10-13 02:00:25,413][46663] Updated weights for policy 1, policy_version 42491 (0.0009) +[2023-10-13 02:00:26,138][46662] Updated weights for policy 0, policy_version 42500 (0.0009) +[2023-10-13 02:00:26,512][46662] Updated weights for policy 0, policy_version 42510 (0.0010) +[2023-10-13 02:00:26,885][46662] Updated weights for policy 0, policy_version 42520 (0.0009) +[2023-10-13 02:00:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87064576. Throughput: 0: 1693.1, 1: 1674.5. Samples: 21770126. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 02:00:28,607][45375] Avg episode reward: [(0, '56.280'), (1, '51.170')] +[2023-10-13 02:00:28,608][46091] Saving new best policy, reward=56.280! +[2023-10-13 02:00:29,481][46663] Updated weights for policy 1, policy_version 42501 (0.0010) +[2023-10-13 02:00:29,851][46663] Updated weights for policy 1, policy_version 42511 (0.0008) +[2023-10-13 02:00:30,221][46663] Updated weights for policy 1, policy_version 42521 (0.0008) +[2023-10-13 02:00:31,000][46662] Updated weights for policy 0, policy_version 42530 (0.0010) +[2023-10-13 02:00:31,378][46662] Updated weights for policy 0, policy_version 42540 (0.0008) +[2023-10-13 02:00:31,740][46662] Updated weights for policy 0, policy_version 42550 (0.0009) +[2023-10-13 02:00:32,113][46662] Updated weights for policy 0, policy_version 42560 (0.0009) +[2023-10-13 02:00:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 87130112. Throughput: 0: 1666.3, 1: 1679.6. Samples: 21789602. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 02:00:33,607][45375] Avg episode reward: [(0, '55.590'), (1, '51.960')] +[2023-10-13 02:00:34,436][46663] Updated weights for policy 1, policy_version 42531 (0.0007) +[2023-10-13 02:00:34,805][46663] Updated weights for policy 1, policy_version 42541 (0.0007) +[2023-10-13 02:00:35,173][46663] Updated weights for policy 1, policy_version 42551 (0.0008) +[2023-10-13 02:00:36,065][46662] Updated weights for policy 0, policy_version 42570 (0.0007) +[2023-10-13 02:00:36,440][46662] Updated weights for policy 0, policy_version 42580 (0.0007) +[2023-10-13 02:00:36,804][46662] Updated weights for policy 0, policy_version 42590 (0.0008) +[2023-10-13 02:00:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87195648. Throughput: 0: 1677.1, 1: 1683.7. Samples: 21810178. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 02:00:38,607][45375] Avg episode reward: [(0, '55.830'), (1, '52.330')] +[2023-10-13 02:00:39,283][46663] Updated weights for policy 1, policy_version 42561 (0.0008) +[2023-10-13 02:00:39,713][46663] Updated weights for policy 1, policy_version 42571 (0.0009) +[2023-10-13 02:00:40,093][46663] Updated weights for policy 1, policy_version 42581 (0.0007) +[2023-10-13 02:00:40,462][46663] Updated weights for policy 1, policy_version 42591 (0.0008) +[2023-10-13 02:00:40,874][46662] Updated weights for policy 0, policy_version 42600 (0.0009) +[2023-10-13 02:00:41,242][46662] Updated weights for policy 0, policy_version 42610 (0.0008) +[2023-10-13 02:00:41,612][46662] Updated weights for policy 0, policy_version 42620 (0.0008) +[2023-10-13 02:00:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87261184. Throughput: 0: 1680.2, 1: 1681.2. Samples: 21820146. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-13 02:00:43,607][45375] Avg episode reward: [(0, '53.000'), (1, '52.300')] +[2023-10-13 02:00:44,355][46663] Updated weights for policy 1, policy_version 42601 (0.0007) +[2023-10-13 02:00:44,717][46663] Updated weights for policy 1, policy_version 42611 (0.0007) +[2023-10-13 02:00:45,079][46663] Updated weights for policy 1, policy_version 42621 (0.0007) +[2023-10-13 02:00:45,725][46662] Updated weights for policy 0, policy_version 42630 (0.0009) +[2023-10-13 02:00:46,096][46662] Updated weights for policy 0, policy_version 42640 (0.0008) +[2023-10-13 02:00:46,465][46662] Updated weights for policy 0, policy_version 42650 (0.0007) +[2023-10-13 02:00:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87326720. Throughput: 0: 1660.0, 1: 1684.6. Samples: 21840064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:00:48,607][45375] Avg episode reward: [(0, '51.910'), (1, '52.110')] +[2023-10-13 02:00:49,107][46663] Updated weights for policy 1, policy_version 42631 (0.0008) +[2023-10-13 02:00:49,471][46663] Updated weights for policy 1, policy_version 42641 (0.0009) +[2023-10-13 02:00:49,834][46663] Updated weights for policy 1, policy_version 42651 (0.0009) +[2023-10-13 02:00:50,621][46662] Updated weights for policy 0, policy_version 42660 (0.0007) +[2023-10-13 02:00:50,989][46662] Updated weights for policy 0, policy_version 42670 (0.0009) +[2023-10-13 02:00:51,357][46662] Updated weights for policy 0, policy_version 42680 (0.0009) +[2023-10-13 02:00:53,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 87392256. Throughput: 0: 1686.2, 1: 1685.4. Samples: 21860968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:00:53,607][45375] Avg episode reward: [(0, '50.270'), (1, '52.020')] +[2023-10-13 02:00:53,944][46663] Updated weights for policy 1, policy_version 42661 (0.0008) +[2023-10-13 02:00:54,318][46663] Updated weights for policy 1, policy_version 42671 (0.0009) +[2023-10-13 02:00:54,688][46663] Updated weights for policy 1, policy_version 42681 (0.0007) +[2023-10-13 02:00:55,405][46662] Updated weights for policy 0, policy_version 42690 (0.0007) +[2023-10-13 02:00:55,801][46662] Updated weights for policy 0, policy_version 42700 (0.0007) +[2023-10-13 02:00:56,178][46662] Updated weights for policy 0, policy_version 42710 (0.0008) +[2023-10-13 02:00:56,545][46662] Updated weights for policy 0, policy_version 42720 (0.0007) +[2023-10-13 02:00:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87457792. Throughput: 0: 1671.6, 1: 1688.9. Samples: 21870840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:00:58,607][45375] Avg episode reward: [(0, '49.500'), (1, '51.230')] +[2023-10-13 02:00:58,816][46663] Updated weights for policy 1, policy_version 42691 (0.0008) +[2023-10-13 02:00:59,187][46663] Updated weights for policy 1, policy_version 42701 (0.0007) +[2023-10-13 02:00:59,541][46663] Updated weights for policy 1, policy_version 42711 (0.0009) +[2023-10-13 02:01:00,601][46662] Updated weights for policy 0, policy_version 42730 (0.0007) +[2023-10-13 02:01:00,968][46662] Updated weights for policy 0, policy_version 42740 (0.0007) +[2023-10-13 02:01:01,329][46662] Updated weights for policy 0, policy_version 42750 (0.0007) +[2023-10-13 02:01:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87523328. Throughput: 0: 1672.1, 1: 1686.1. Samples: 21890790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:01:03,607][45375] Avg episode reward: [(0, '49.710'), (1, '50.560')] +[2023-10-13 02:01:03,640][46663] Updated weights for policy 1, policy_version 42721 (0.0007) +[2023-10-13 02:01:04,008][46663] Updated weights for policy 1, policy_version 42731 (0.0009) +[2023-10-13 02:01:04,392][46663] Updated weights for policy 1, policy_version 42741 (0.0009) +[2023-10-13 02:01:04,763][46663] Updated weights for policy 1, policy_version 42751 (0.0010) +[2023-10-13 02:01:05,421][46662] Updated weights for policy 0, policy_version 42760 (0.0008) +[2023-10-13 02:01:05,788][46662] Updated weights for policy 0, policy_version 42770 (0.0009) +[2023-10-13 02:01:06,153][46662] Updated weights for policy 0, policy_version 42780 (0.0009) +[2023-10-13 02:01:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87588864. Throughput: 0: 1685.9, 1: 1690.4. Samples: 21911536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:01:08,607][45375] Avg episode reward: [(0, '48.720'), (1, '48.750')] +[2023-10-13 02:01:08,712][46663] Updated weights for policy 1, policy_version 42761 (0.0008) +[2023-10-13 02:01:09,074][46663] Updated weights for policy 1, policy_version 42771 (0.0008) +[2023-10-13 02:01:09,448][46663] Updated weights for policy 1, policy_version 42781 (0.0008) +[2023-10-13 02:01:10,149][46662] Updated weights for policy 0, policy_version 42790 (0.0007) +[2023-10-13 02:01:10,512][46662] Updated weights for policy 0, policy_version 42800 (0.0009) +[2023-10-13 02:01:10,885][46662] Updated weights for policy 0, policy_version 42810 (0.0008) +[2023-10-13 02:01:13,437][46663] Updated weights for policy 1, policy_version 42791 (0.0009) +[2023-10-13 02:01:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 87654400. Throughput: 0: 1662.2, 1: 1693.1. Samples: 21921114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:01:13,608][45375] Avg episode reward: [(0, '47.150'), (1, '50.170')] +[2023-10-13 02:01:13,812][46663] Updated weights for policy 1, policy_version 42801 (0.0009) +[2023-10-13 02:01:14,181][46663] Updated weights for policy 1, policy_version 42811 (0.0010) +[2023-10-13 02:01:14,868][46662] Updated weights for policy 0, policy_version 42820 (0.0008) +[2023-10-13 02:01:15,238][46662] Updated weights for policy 0, policy_version 42830 (0.0008) +[2023-10-13 02:01:15,615][46662] Updated weights for policy 0, policy_version 42840 (0.0007) +[2023-10-13 02:01:18,334][46663] Updated weights for policy 1, policy_version 42821 (0.0010) +[2023-10-13 02:01:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87719936. Throughput: 0: 1681.8, 1: 1692.4. Samples: 21941442. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 02:01:18,607][45375] Avg episode reward: [(0, '45.680'), (1, '49.400')] +[2023-10-13 02:01:18,717][46663] Updated weights for policy 1, policy_version 42831 (0.0009) +[2023-10-13 02:01:19,082][46663] Updated weights for policy 1, policy_version 42841 (0.0009) +[2023-10-13 02:01:19,701][46662] Updated weights for policy 0, policy_version 42850 (0.0009) +[2023-10-13 02:01:20,068][46662] Updated weights for policy 0, policy_version 42860 (0.0007) +[2023-10-13 02:01:20,435][46662] Updated weights for policy 0, policy_version 42870 (0.0007) +[2023-10-13 02:01:20,806][46662] Updated weights for policy 0, policy_version 42880 (0.0010) +[2023-10-13 02:01:23,059][46663] Updated weights for policy 1, policy_version 42851 (0.0008) +[2023-10-13 02:01:23,429][46663] Updated weights for policy 1, policy_version 42861 (0.0008) +[2023-10-13 02:01:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87785472. Throughput: 0: 1689.7, 1: 1682.3. Samples: 21961916. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 02:01:23,607][45375] Avg episode reward: [(0, '44.610'), (1, '50.060')] +[2023-10-13 02:01:23,805][46663] Updated weights for policy 1, policy_version 42871 (0.0009) +[2023-10-13 02:01:24,808][46662] Updated weights for policy 0, policy_version 42890 (0.0011) +[2023-10-13 02:01:25,182][46662] Updated weights for policy 0, policy_version 42900 (0.0009) +[2023-10-13 02:01:25,561][46662] Updated weights for policy 0, policy_version 42910 (0.0008) +[2023-10-13 02:01:27,889][46663] Updated weights for policy 1, policy_version 42881 (0.0007) +[2023-10-13 02:01:28,294][46663] Updated weights for policy 1, policy_version 42891 (0.0008) +[2023-10-13 02:01:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87851008. Throughput: 0: 1669.7, 1: 1697.2. Samples: 21971656. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 02:01:28,607][45375] Avg episode reward: [(0, '45.110'), (1, '50.510')] +[2023-10-13 02:01:28,652][46663] Updated weights for policy 1, policy_version 42901 (0.0008) +[2023-10-13 02:01:29,023][46663] Updated weights for policy 1, policy_version 42911 (0.0008) +[2023-10-13 02:01:29,510][46662] Updated weights for policy 0, policy_version 42920 (0.0009) +[2023-10-13 02:01:29,886][46662] Updated weights for policy 0, policy_version 42930 (0.0008) +[2023-10-13 02:01:30,258][46662] Updated weights for policy 0, policy_version 42940 (0.0008) +[2023-10-13 02:01:33,022][46663] Updated weights for policy 1, policy_version 42921 (0.0009) +[2023-10-13 02:01:33,398][46663] Updated weights for policy 1, policy_version 42931 (0.0010) +[2023-10-13 02:01:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87916544. Throughput: 0: 1693.3, 1: 1693.4. Samples: 21992468. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 02:01:33,607][45375] Avg episode reward: [(0, '46.670'), (1, '50.930')] +[2023-10-13 02:01:33,760][46663] Updated weights for policy 1, policy_version 42941 (0.0009) +[2023-10-13 02:01:34,387][46662] Updated weights for policy 0, policy_version 42950 (0.0008) +[2023-10-13 02:01:34,754][46662] Updated weights for policy 0, policy_version 42960 (0.0010) +[2023-10-13 02:01:35,123][46662] Updated weights for policy 0, policy_version 42970 (0.0007) +[2023-10-13 02:01:37,793][46663] Updated weights for policy 1, policy_version 42951 (0.0009) +[2023-10-13 02:01:38,165][46663] Updated weights for policy 1, policy_version 42961 (0.0011) +[2023-10-13 02:01:38,532][46663] Updated weights for policy 1, policy_version 42971 (0.0011) +[2023-10-13 02:01:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87982080. Throughput: 0: 1691.6, 1: 1668.5. Samples: 22012170. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 02:01:38,607][45375] Avg episode reward: [(0, '46.910'), (1, '49.010')] +[2023-10-13 02:01:39,057][46662] Updated weights for policy 0, policy_version 42980 (0.0008) +[2023-10-13 02:01:39,424][46662] Updated weights for policy 0, policy_version 42990 (0.0010) +[2023-10-13 02:01:39,799][46662] Updated weights for policy 0, policy_version 43000 (0.0011) +[2023-10-13 02:01:42,429][46663] Updated weights for policy 1, policy_version 42981 (0.0009) +[2023-10-13 02:01:42,794][46663] Updated weights for policy 1, policy_version 42991 (0.0010) +[2023-10-13 02:01:43,169][46663] Updated weights for policy 1, policy_version 43001 (0.0009) +[2023-10-13 02:01:43,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88080384. Throughput: 0: 1677.4, 1: 1692.0. Samples: 22022462. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 02:01:43,607][45375] Avg episode reward: [(0, '47.440'), (1, '47.840')] +[2023-10-13 02:01:43,921][46662] Updated weights for policy 0, policy_version 43010 (0.0010) +[2023-10-13 02:01:44,326][46662] Updated weights for policy 0, policy_version 43020 (0.0008) +[2023-10-13 02:01:44,699][46662] Updated weights for policy 0, policy_version 43030 (0.0009) +[2023-10-13 02:01:45,066][46662] Updated weights for policy 0, policy_version 43040 (0.0008) +[2023-10-13 02:01:47,337][46663] Updated weights for policy 1, policy_version 43011 (0.0010) +[2023-10-13 02:01:47,705][46663] Updated weights for policy 1, policy_version 43021 (0.0010) +[2023-10-13 02:01:48,076][46663] Updated weights for policy 1, policy_version 43031 (0.0008) +[2023-10-13 02:01:48,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 88145920. Throughput: 0: 1686.7, 1: 1692.8. Samples: 22042866. 
Policy #0 lag: (min: 12.0, avg: 15.9, max: 44.0) +[2023-10-13 02:01:48,607][45375] Avg episode reward: [(0, '47.940'), (1, '47.210')] +[2023-10-13 02:01:49,117][46662] Updated weights for policy 0, policy_version 43050 (0.0007) +[2023-10-13 02:01:49,482][46662] Updated weights for policy 0, policy_version 43060 (0.0007) +[2023-10-13 02:01:49,847][46662] Updated weights for policy 0, policy_version 43070 (0.0007) +[2023-10-13 02:01:52,181][46663] Updated weights for policy 1, policy_version 43041 (0.0009) +[2023-10-13 02:01:52,556][46663] Updated weights for policy 1, policy_version 43051 (0.0009) +[2023-10-13 02:01:52,922][46663] Updated weights for policy 1, policy_version 43061 (0.0009) +[2023-10-13 02:01:53,289][46663] Updated weights for policy 1, policy_version 43071 (0.0011) +[2023-10-13 02:01:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88211456. Throughput: 0: 1692.2, 1: 1664.3. Samples: 22062576. Policy #0 lag: (min: 12.0, avg: 15.9, max: 44.0) +[2023-10-13 02:01:53,607][45375] Avg episode reward: [(0, '48.200'), (1, '45.870')] +[2023-10-13 02:01:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000043072_44105728.pth... +[2023-10-13 02:01:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000041504_42500096.pth +[2023-10-13 02:01:53,795][46662] Updated weights for policy 0, policy_version 43080 (0.0010) +[2023-10-13 02:01:54,163][46662] Updated weights for policy 0, policy_version 43090 (0.0012) +[2023-10-13 02:01:54,525][46662] Updated weights for policy 0, policy_version 43100 (0.0009) +[2023-10-13 02:01:54,672][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000043104_44138496.pth... 
+[2023-10-13 02:01:54,709][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000041504_42500096.pth +[2023-10-13 02:01:57,348][46663] Updated weights for policy 1, policy_version 43081 (0.0010) +[2023-10-13 02:01:57,718][46663] Updated weights for policy 1, policy_version 43091 (0.0007) +[2023-10-13 02:01:58,074][46663] Updated weights for policy 1, policy_version 43101 (0.0007) +[2023-10-13 02:01:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88276992. Throughput: 0: 1683.7, 1: 1689.4. Samples: 22072904. Policy #0 lag: (min: 12.0, avg: 15.9, max: 44.0) +[2023-10-13 02:01:58,607][45375] Avg episode reward: [(0, '48.830'), (1, '45.590')] +[2023-10-13 02:01:58,703][46662] Updated weights for policy 0, policy_version 43110 (0.0009) +[2023-10-13 02:01:59,072][46662] Updated weights for policy 0, policy_version 43120 (0.0008) +[2023-10-13 02:01:59,438][46662] Updated weights for policy 0, policy_version 43130 (0.0010) +[2023-10-13 02:02:02,156][46663] Updated weights for policy 1, policy_version 43111 (0.0008) +[2023-10-13 02:02:02,526][46663] Updated weights for policy 1, policy_version 43121 (0.0008) +[2023-10-13 02:02:02,897][46663] Updated weights for policy 1, policy_version 43131 (0.0007) +[2023-10-13 02:02:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88342528. Throughput: 0: 1685.9, 1: 1682.6. Samples: 22093026. 
Policy #0 lag: (min: 12.0, avg: 15.9, max: 44.0) +[2023-10-13 02:02:03,608][45375] Avg episode reward: [(0, '49.190'), (1, '46.320')] +[2023-10-13 02:02:03,630][46662] Updated weights for policy 0, policy_version 43140 (0.0010) +[2023-10-13 02:02:03,997][46662] Updated weights for policy 0, policy_version 43150 (0.0009) +[2023-10-13 02:02:04,377][46662] Updated weights for policy 0, policy_version 43160 (0.0009) +[2023-10-13 02:02:06,987][46663] Updated weights for policy 1, policy_version 43141 (0.0010) +[2023-10-13 02:02:07,360][46663] Updated weights for policy 1, policy_version 43151 (0.0010) +[2023-10-13 02:02:07,725][46663] Updated weights for policy 1, policy_version 43161 (0.0010) +[2023-10-13 02:02:08,393][46662] Updated weights for policy 0, policy_version 43170 (0.0008) +[2023-10-13 02:02:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88408064. Throughput: 0: 1687.8, 1: 1667.7. Samples: 22112916. Policy #0 lag: (min: 12.0, avg: 15.9, max: 44.0) +[2023-10-13 02:02:08,607][45375] Avg episode reward: [(0, '48.730'), (1, '46.920')] +[2023-10-13 02:02:08,761][46662] Updated weights for policy 0, policy_version 43180 (0.0008) +[2023-10-13 02:02:09,137][46662] Updated weights for policy 0, policy_version 43190 (0.0011) +[2023-10-13 02:02:09,507][46662] Updated weights for policy 0, policy_version 43200 (0.0007) +[2023-10-13 02:02:11,753][46663] Updated weights for policy 1, policy_version 43171 (0.0009) +[2023-10-13 02:02:12,116][46663] Updated weights for policy 1, policy_version 43181 (0.0009) +[2023-10-13 02:02:12,483][46663] Updated weights for policy 1, policy_version 43191 (0.0008) +[2023-10-13 02:02:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 88473600. Throughput: 0: 1686.8, 1: 1688.9. Samples: 22123562. 
Policy #0 lag: (min: 12.0, avg: 15.9, max: 44.0) +[2023-10-13 02:02:13,607][45375] Avg episode reward: [(0, '49.630'), (1, '46.910')] +[2023-10-13 02:02:13,670][46662] Updated weights for policy 0, policy_version 43210 (0.0008) +[2023-10-13 02:02:14,048][46662] Updated weights for policy 0, policy_version 43220 (0.0009) +[2023-10-13 02:02:14,424][46662] Updated weights for policy 0, policy_version 43230 (0.0008) +[2023-10-13 02:02:16,570][46663] Updated weights for policy 1, policy_version 43201 (0.0012) +[2023-10-13 02:02:16,991][46663] Updated weights for policy 1, policy_version 43211 (0.0008) +[2023-10-13 02:02:17,359][46663] Updated weights for policy 1, policy_version 43221 (0.0009) +[2023-10-13 02:02:17,730][46663] Updated weights for policy 1, policy_version 43231 (0.0010) +[2023-10-13 02:02:18,550][46662] Updated weights for policy 0, policy_version 43240 (0.0007) +[2023-10-13 02:02:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88539136. Throughput: 0: 1682.1, 1: 1669.6. Samples: 22143294. Policy #0 lag: (min: 28.0, avg: 35.8, max: 60.0) +[2023-10-13 02:02:18,607][45375] Avg episode reward: [(0, '50.550'), (1, '47.520')] +[2023-10-13 02:02:18,916][46662] Updated weights for policy 0, policy_version 43250 (0.0009) +[2023-10-13 02:02:19,300][46662] Updated weights for policy 0, policy_version 43260 (0.0011) +[2023-10-13 02:02:21,648][46663] Updated weights for policy 1, policy_version 43241 (0.0007) +[2023-10-13 02:02:22,025][46663] Updated weights for policy 1, policy_version 43251 (0.0007) +[2023-10-13 02:02:22,398][46663] Updated weights for policy 1, policy_version 43261 (0.0008) +[2023-10-13 02:02:23,334][46662] Updated weights for policy 0, policy_version 43270 (0.0009) +[2023-10-13 02:02:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88604672. Throughput: 0: 1683.3, 1: 1686.0. Samples: 22163790. 
Policy #0 lag: (min: 28.0, avg: 35.8, max: 60.0) +[2023-10-13 02:02:23,607][45375] Avg episode reward: [(0, '50.050'), (1, '48.140')] +[2023-10-13 02:02:23,690][46662] Updated weights for policy 0, policy_version 43280 (0.0009) +[2023-10-13 02:02:24,071][46662] Updated weights for policy 0, policy_version 43290 (0.0009) +[2023-10-13 02:02:26,455][46663] Updated weights for policy 1, policy_version 43271 (0.0009) +[2023-10-13 02:02:26,821][46663] Updated weights for policy 1, policy_version 43281 (0.0007) +[2023-10-13 02:02:27,199][46663] Updated weights for policy 1, policy_version 43291 (0.0008) +[2023-10-13 02:02:28,033][46662] Updated weights for policy 0, policy_version 43300 (0.0009) +[2023-10-13 02:02:28,398][46662] Updated weights for policy 0, policy_version 43310 (0.0010) +[2023-10-13 02:02:28,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88670208. Throughput: 0: 1680.6, 1: 1687.0. Samples: 22174004. Policy #0 lag: (min: 28.0, avg: 35.8, max: 60.0) +[2023-10-13 02:02:28,608][45375] Avg episode reward: [(0, '50.380'), (1, '48.300')] +[2023-10-13 02:02:28,762][46662] Updated weights for policy 0, policy_version 43320 (0.0008) +[2023-10-13 02:02:31,187][46663] Updated weights for policy 1, policy_version 43301 (0.0008) +[2023-10-13 02:02:31,551][46663] Updated weights for policy 1, policy_version 43311 (0.0008) +[2023-10-13 02:02:31,914][46663] Updated weights for policy 1, policy_version 43321 (0.0009) +[2023-10-13 02:02:32,890][46662] Updated weights for policy 0, policy_version 43330 (0.0008) +[2023-10-13 02:02:33,278][46662] Updated weights for policy 0, policy_version 43340 (0.0008) +[2023-10-13 02:02:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88735744. Throughput: 0: 1687.9, 1: 1662.7. Samples: 22193648. 
Policy #0 lag: (min: 28.0, avg: 35.8, max: 60.0) +[2023-10-13 02:02:33,608][45375] Avg episode reward: [(0, '49.880'), (1, '47.420')] +[2023-10-13 02:02:33,649][46662] Updated weights for policy 0, policy_version 43350 (0.0007) +[2023-10-13 02:02:34,012][46662] Updated weights for policy 0, policy_version 43360 (0.0008) +[2023-10-13 02:02:35,972][46663] Updated weights for policy 1, policy_version 43331 (0.0008) +[2023-10-13 02:02:36,338][46663] Updated weights for policy 1, policy_version 43341 (0.0008) +[2023-10-13 02:02:36,713][46663] Updated weights for policy 1, policy_version 43351 (0.0008) +[2023-10-13 02:02:38,085][46662] Updated weights for policy 0, policy_version 43370 (0.0007) +[2023-10-13 02:02:38,459][46662] Updated weights for policy 0, policy_version 43380 (0.0010) +[2023-10-13 02:02:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88801280. Throughput: 0: 1676.3, 1: 1691.3. Samples: 22214116. Policy #0 lag: (min: 28.0, avg: 35.8, max: 60.0) +[2023-10-13 02:02:38,607][45375] Avg episode reward: [(0, '49.910'), (1, '47.100')] +[2023-10-13 02:02:38,827][46662] Updated weights for policy 0, policy_version 43390 (0.0007) +[2023-10-13 02:02:40,677][46663] Updated weights for policy 1, policy_version 43361 (0.0008) +[2023-10-13 02:02:41,047][46663] Updated weights for policy 1, policy_version 43371 (0.0009) +[2023-10-13 02:02:41,415][46663] Updated weights for policy 1, policy_version 43381 (0.0009) +[2023-10-13 02:02:41,787][46663] Updated weights for policy 1, policy_version 43391 (0.0008) +[2023-10-13 02:02:42,823][46662] Updated weights for policy 0, policy_version 43400 (0.0007) +[2023-10-13 02:02:43,201][46662] Updated weights for policy 0, policy_version 43410 (0.0007) +[2023-10-13 02:02:43,559][46662] Updated weights for policy 0, policy_version 43420 (0.0007) +[2023-10-13 02:02:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88866816. 
Throughput: 0: 1682.4, 1: 1678.0. Samples: 22224124. Policy #0 lag: (min: 28.0, avg: 35.8, max: 60.0) +[2023-10-13 02:02:43,608][45375] Avg episode reward: [(0, '50.170'), (1, '46.690')] +[2023-10-13 02:02:46,025][46663] Updated weights for policy 1, policy_version 43401 (0.0008) +[2023-10-13 02:02:46,383][46663] Updated weights for policy 1, policy_version 43411 (0.0008) +[2023-10-13 02:02:46,752][46663] Updated weights for policy 1, policy_version 43421 (0.0008) +[2023-10-13 02:02:47,565][46662] Updated weights for policy 0, policy_version 43430 (0.0010) +[2023-10-13 02:02:47,934][46662] Updated weights for policy 0, policy_version 43440 (0.0010) +[2023-10-13 02:02:48,302][46662] Updated weights for policy 0, policy_version 43450 (0.0011) +[2023-10-13 02:02:48,607][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 88965120. Throughput: 0: 1692.0, 1: 1674.5. Samples: 22244518. Policy #0 lag: (min: 11.0, avg: 18.7, max: 43.0) +[2023-10-13 02:02:48,607][45375] Avg episode reward: [(0, '50.260'), (1, '47.120')] +[2023-10-13 02:02:50,882][46663] Updated weights for policy 1, policy_version 43431 (0.0008) +[2023-10-13 02:02:51,249][46663] Updated weights for policy 1, policy_version 43441 (0.0009) +[2023-10-13 02:02:51,615][46663] Updated weights for policy 1, policy_version 43451 (0.0008) +[2023-10-13 02:02:52,199][46662] Updated weights for policy 0, policy_version 43460 (0.0011) +[2023-10-13 02:02:52,559][46662] Updated weights for policy 0, policy_version 43470 (0.0008) +[2023-10-13 02:02:52,930][46662] Updated weights for policy 0, policy_version 43480 (0.0008) +[2023-10-13 02:02:53,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 89030656. Throughput: 0: 1675.4, 1: 1690.4. Samples: 22264376. 
Policy #0 lag: (min: 11.0, avg: 18.7, max: 43.0) +[2023-10-13 02:02:53,607][45375] Avg episode reward: [(0, '51.620'), (1, '47.220')] +[2023-10-13 02:02:55,538][46663] Updated weights for policy 1, policy_version 43461 (0.0009) +[2023-10-13 02:02:55,913][46663] Updated weights for policy 1, policy_version 43471 (0.0009) +[2023-10-13 02:02:56,291][46663] Updated weights for policy 1, policy_version 43481 (0.0009) +[2023-10-13 02:02:56,923][46662] Updated weights for policy 0, policy_version 43490 (0.0008) +[2023-10-13 02:02:57,294][46662] Updated weights for policy 0, policy_version 43500 (0.0009) +[2023-10-13 02:02:57,657][46662] Updated weights for policy 0, policy_version 43510 (0.0009) +[2023-10-13 02:02:58,028][46662] Updated weights for policy 0, policy_version 43520 (0.0009) +[2023-10-13 02:02:58,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 89096192. Throughput: 0: 1693.5, 1: 1664.8. Samples: 22274682. Policy #0 lag: (min: 11.0, avg: 18.7, max: 43.0) +[2023-10-13 02:02:58,607][45375] Avg episode reward: [(0, '50.910'), (1, '47.500')] +[2023-10-13 02:03:00,361][46663] Updated weights for policy 1, policy_version 43491 (0.0009) +[2023-10-13 02:03:00,723][46663] Updated weights for policy 1, policy_version 43501 (0.0008) +[2023-10-13 02:03:01,089][46663] Updated weights for policy 1, policy_version 43511 (0.0007) +[2023-10-13 02:03:02,119][46662] Updated weights for policy 0, policy_version 43530 (0.0008) +[2023-10-13 02:03:02,479][46662] Updated weights for policy 0, policy_version 43540 (0.0009) +[2023-10-13 02:03:02,850][46662] Updated weights for policy 0, policy_version 43550 (0.0008) +[2023-10-13 02:03:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 89161728. Throughput: 0: 1696.8, 1: 1677.6. Samples: 22295144. 
Policy #0 lag: (min: 11.0, avg: 18.7, max: 43.0) +[2023-10-13 02:03:03,607][45375] Avg episode reward: [(0, '49.380'), (1, '47.720')] +[2023-10-13 02:03:05,177][46663] Updated weights for policy 1, policy_version 43521 (0.0008) +[2023-10-13 02:03:05,549][46663] Updated weights for policy 1, policy_version 43531 (0.0011) +[2023-10-13 02:03:05,920][46663] Updated weights for policy 1, policy_version 43541 (0.0011) +[2023-10-13 02:03:06,291][46663] Updated weights for policy 1, policy_version 43551 (0.0011) +[2023-10-13 02:03:06,875][46662] Updated weights for policy 0, policy_version 43560 (0.0008) +[2023-10-13 02:03:07,256][46662] Updated weights for policy 0, policy_version 43570 (0.0008) +[2023-10-13 02:03:07,620][46662] Updated weights for policy 0, policy_version 43580 (0.0008) +[2023-10-13 02:03:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 89227264. Throughput: 0: 1670.3, 1: 1688.5. Samples: 22314938. Policy #0 lag: (min: 11.0, avg: 18.7, max: 43.0) +[2023-10-13 02:03:08,607][45375] Avg episode reward: [(0, '50.080'), (1, '48.600')] +[2023-10-13 02:03:10,240][46663] Updated weights for policy 1, policy_version 43561 (0.0010) +[2023-10-13 02:03:10,609][46663] Updated weights for policy 1, policy_version 43571 (0.0008) +[2023-10-13 02:03:10,973][46663] Updated weights for policy 1, policy_version 43581 (0.0008) +[2023-10-13 02:03:11,721][46662] Updated weights for policy 0, policy_version 43590 (0.0010) +[2023-10-13 02:03:12,095][46662] Updated weights for policy 0, policy_version 43600 (0.0008) +[2023-10-13 02:03:12,465][46662] Updated weights for policy 0, policy_version 43610 (0.0011) +[2023-10-13 02:03:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 89292800. Throughput: 0: 1702.2, 1: 1662.1. Samples: 22325398. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 02:03:13,608][45375] Avg episode reward: [(0, '49.260'), (1, '47.370')] +[2023-10-13 02:03:15,064][46663] Updated weights for policy 1, policy_version 43591 (0.0009) +[2023-10-13 02:03:15,437][46663] Updated weights for policy 1, policy_version 43601 (0.0009) +[2023-10-13 02:03:15,799][46663] Updated weights for policy 1, policy_version 43611 (0.0009) +[2023-10-13 02:03:16,471][46662] Updated weights for policy 0, policy_version 43620 (0.0008) +[2023-10-13 02:03:16,839][46662] Updated weights for policy 0, policy_version 43630 (0.0010) +[2023-10-13 02:03:17,211][46662] Updated weights for policy 0, policy_version 43640 (0.0011) +[2023-10-13 02:03:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 89358336. Throughput: 0: 1688.5, 1: 1685.0. Samples: 22345458. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 02:03:18,607][45375] Avg episode reward: [(0, '47.740'), (1, '48.020')] +[2023-10-13 02:03:19,902][46663] Updated weights for policy 1, policy_version 43621 (0.0010) +[2023-10-13 02:03:20,276][46663] Updated weights for policy 1, policy_version 43631 (0.0008) +[2023-10-13 02:03:20,636][46663] Updated weights for policy 1, policy_version 43641 (0.0009) +[2023-10-13 02:03:21,229][46662] Updated weights for policy 0, policy_version 43650 (0.0010) +[2023-10-13 02:03:21,615][46662] Updated weights for policy 0, policy_version 43660 (0.0009) +[2023-10-13 02:03:21,979][46662] Updated weights for policy 0, policy_version 43670 (0.0011) +[2023-10-13 02:03:22,353][46662] Updated weights for policy 0, policy_version 43680 (0.0010) +[2023-10-13 02:03:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 89423872. Throughput: 0: 1677.1, 1: 1684.6. Samples: 22365392. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 02:03:23,607][45375] Avg episode reward: [(0, '47.580'), (1, '48.040')] +[2023-10-13 02:03:24,761][46663] Updated weights for policy 1, policy_version 43651 (0.0007) +[2023-10-13 02:03:25,123][46663] Updated weights for policy 1, policy_version 43661 (0.0010) +[2023-10-13 02:03:25,487][46663] Updated weights for policy 1, policy_version 43671 (0.0007) +[2023-10-13 02:03:26,400][46662] Updated weights for policy 0, policy_version 43690 (0.0010) +[2023-10-13 02:03:26,782][46662] Updated weights for policy 0, policy_version 43700 (0.0008) +[2023-10-13 02:03:27,146][46662] Updated weights for policy 0, policy_version 43710 (0.0009) +[2023-10-13 02:03:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89489408. Throughput: 0: 1706.2, 1: 1670.3. Samples: 22376068. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 02:03:28,607][45375] Avg episode reward: [(0, '46.750'), (1, '48.190')] +[2023-10-13 02:03:29,611][46663] Updated weights for policy 1, policy_version 43681 (0.0008) +[2023-10-13 02:03:29,968][46663] Updated weights for policy 1, policy_version 43691 (0.0008) +[2023-10-13 02:03:30,335][46663] Updated weights for policy 1, policy_version 43701 (0.0009) +[2023-10-13 02:03:30,696][46663] Updated weights for policy 1, policy_version 43711 (0.0007) +[2023-10-13 02:03:31,161][46662] Updated weights for policy 0, policy_version 43720 (0.0009) +[2023-10-13 02:03:31,528][46662] Updated weights for policy 0, policy_version 43730 (0.0010) +[2023-10-13 02:03:31,905][46662] Updated weights for policy 0, policy_version 43740 (0.0008) +[2023-10-13 02:03:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89554944. Throughput: 0: 1675.1, 1: 1690.8. Samples: 22395984. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 02:03:33,607][45375] Avg episode reward: [(0, '45.590'), (1, '46.690')] +[2023-10-13 02:03:34,664][46663] Updated weights for policy 1, policy_version 43721 (0.0010) +[2023-10-13 02:03:35,024][46663] Updated weights for policy 1, policy_version 43731 (0.0008) +[2023-10-13 02:03:35,390][46663] Updated weights for policy 1, policy_version 43741 (0.0008) +[2023-10-13 02:03:35,983][46662] Updated weights for policy 0, policy_version 43750 (0.0010) +[2023-10-13 02:03:36,351][46662] Updated weights for policy 0, policy_version 43760 (0.0010) +[2023-10-13 02:03:36,722][46662] Updated weights for policy 0, policy_version 43770 (0.0008) +[2023-10-13 02:03:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 89620480. Throughput: 0: 1684.3, 1: 1695.1. Samples: 22416454. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-13 02:03:38,607][45375] Avg episode reward: [(0, '44.850'), (1, '46.750')] +[2023-10-13 02:03:39,432][46663] Updated weights for policy 1, policy_version 43751 (0.0008) +[2023-10-13 02:03:39,792][46663] Updated weights for policy 1, policy_version 43761 (0.0009) +[2023-10-13 02:03:40,165][46663] Updated weights for policy 1, policy_version 43771 (0.0008) +[2023-10-13 02:03:40,876][46662] Updated weights for policy 0, policy_version 43780 (0.0008) +[2023-10-13 02:03:41,245][46662] Updated weights for policy 0, policy_version 43790 (0.0011) +[2023-10-13 02:03:41,619][46662] Updated weights for policy 0, policy_version 43800 (0.0007) +[2023-10-13 02:03:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89686016. Throughput: 0: 1690.9, 1: 1687.7. Samples: 22426722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:03:43,607][45375] Avg episode reward: [(0, '45.910'), (1, '46.830')] +[2023-10-13 02:03:44,185][46663] Updated weights for policy 1, policy_version 43781 (0.0010) +[2023-10-13 02:03:44,540][46663] Updated weights for policy 1, policy_version 43791 (0.0010) +[2023-10-13 02:03:44,911][46663] Updated weights for policy 1, policy_version 43801 (0.0011) +[2023-10-13 02:03:45,586][46662] Updated weights for policy 0, policy_version 43810 (0.0009) +[2023-10-13 02:03:45,954][46662] Updated weights for policy 0, policy_version 43820 (0.0010) +[2023-10-13 02:03:46,328][46662] Updated weights for policy 0, policy_version 43830 (0.0010) +[2023-10-13 02:03:46,695][46662] Updated weights for policy 0, policy_version 43840 (0.0007) +[2023-10-13 02:03:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89751552. Throughput: 0: 1668.7, 1: 1689.6. Samples: 22446264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:03:48,607][45375] Avg episode reward: [(0, '44.990'), (1, '46.880')] +[2023-10-13 02:03:49,089][46663] Updated weights for policy 1, policy_version 43811 (0.0009) +[2023-10-13 02:03:49,452][46663] Updated weights for policy 1, policy_version 43821 (0.0007) +[2023-10-13 02:03:49,824][46663] Updated weights for policy 1, policy_version 43831 (0.0009) +[2023-10-13 02:03:50,755][46662] Updated weights for policy 0, policy_version 43850 (0.0008) +[2023-10-13 02:03:51,127][46662] Updated weights for policy 0, policy_version 43860 (0.0009) +[2023-10-13 02:03:51,498][46662] Updated weights for policy 0, policy_version 43870 (0.0009) +[2023-10-13 02:03:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89817088. Throughput: 0: 1693.6, 1: 1685.0. Samples: 22466978. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:03:53,607][45375] Avg episode reward: [(0, '44.010'), (1, '47.320')] +[2023-10-13 02:03:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000043872_44924928.pth... +[2023-10-13 02:03:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000043840_44892160.pth... +[2023-10-13 02:03:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000042304_43319296.pth +[2023-10-13 02:03:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000042272_43286528.pth +[2023-10-13 02:03:53,951][46663] Updated weights for policy 1, policy_version 43841 (0.0009) +[2023-10-13 02:03:54,335][46663] Updated weights for policy 1, policy_version 43851 (0.0009) +[2023-10-13 02:03:54,705][46663] Updated weights for policy 1, policy_version 43861 (0.0008) +[2023-10-13 02:03:55,083][46663] Updated weights for policy 1, policy_version 43871 (0.0009) +[2023-10-13 02:03:55,497][46662] Updated weights for policy 0, policy_version 43880 (0.0008) +[2023-10-13 02:03:55,867][46662] Updated weights for policy 0, policy_version 43890 (0.0008) +[2023-10-13 02:03:56,236][46662] Updated weights for policy 0, policy_version 43900 (0.0010) +[2023-10-13 02:03:58,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 89882624. Throughput: 0: 1680.2, 1: 1685.2. Samples: 22476838. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:03:58,607][45375] Avg episode reward: [(0, '43.530'), (1, '48.630')] +[2023-10-13 02:03:59,101][46663] Updated weights for policy 1, policy_version 43881 (0.0007) +[2023-10-13 02:03:59,473][46663] Updated weights for policy 1, policy_version 43891 (0.0007) +[2023-10-13 02:03:59,834][46663] Updated weights for policy 1, policy_version 43901 (0.0007) +[2023-10-13 02:04:00,262][46662] Updated weights for policy 0, policy_version 43910 (0.0010) +[2023-10-13 02:04:00,635][46662] Updated weights for policy 0, policy_version 43920 (0.0010) +[2023-10-13 02:04:01,005][46662] Updated weights for policy 0, policy_version 43930 (0.0009) +[2023-10-13 02:04:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89948160. Throughput: 0: 1679.9, 1: 1687.8. Samples: 22497004. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:04:03,607][45375] Avg episode reward: [(0, '43.770'), (1, '49.450')] +[2023-10-13 02:04:03,781][46663] Updated weights for policy 1, policy_version 43911 (0.0008) +[2023-10-13 02:04:04,149][46663] Updated weights for policy 1, policy_version 43921 (0.0008) +[2023-10-13 02:04:04,510][46663] Updated weights for policy 1, policy_version 43931 (0.0008) +[2023-10-13 02:04:05,018][46662] Updated weights for policy 0, policy_version 43940 (0.0010) +[2023-10-13 02:04:05,378][46662] Updated weights for policy 0, policy_version 43950 (0.0009) +[2023-10-13 02:04:05,750][46662] Updated weights for policy 0, policy_version 43960 (0.0008) +[2023-10-13 02:04:08,563][46663] Updated weights for policy 1, policy_version 43941 (0.0008) +[2023-10-13 02:04:08,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 90013696. Throughput: 0: 1700.9, 1: 1686.3. Samples: 22517818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:04:08,607][45375] Avg episode reward: [(0, '44.850'), (1, '49.460')] +[2023-10-13 02:04:08,931][46663] Updated weights for policy 1, policy_version 43951 (0.0009) +[2023-10-13 02:04:09,293][46663] Updated weights for policy 1, policy_version 43961 (0.0008) +[2023-10-13 02:04:09,914][46662] Updated weights for policy 0, policy_version 43970 (0.0009) +[2023-10-13 02:04:10,312][46662] Updated weights for policy 0, policy_version 43980 (0.0009) +[2023-10-13 02:04:10,684][46662] Updated weights for policy 0, policy_version 43990 (0.0008) +[2023-10-13 02:04:11,055][46662] Updated weights for policy 0, policy_version 44000 (0.0011) +[2023-10-13 02:04:13,448][46663] Updated weights for policy 1, policy_version 43971 (0.0007) +[2023-10-13 02:04:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 90079232. Throughput: 0: 1670.6, 1: 1686.4. Samples: 22527134. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 02:04:13,607][45375] Avg episode reward: [(0, '45.050'), (1, '50.350')] +[2023-10-13 02:04:13,809][46663] Updated weights for policy 1, policy_version 43981 (0.0007) +[2023-10-13 02:04:14,184][46663] Updated weights for policy 1, policy_version 43991 (0.0009) +[2023-10-13 02:04:15,010][46662] Updated weights for policy 0, policy_version 44010 (0.0010) +[2023-10-13 02:04:15,376][46662] Updated weights for policy 0, policy_version 44020 (0.0008) +[2023-10-13 02:04:15,750][46662] Updated weights for policy 0, policy_version 44030 (0.0007) +[2023-10-13 02:04:18,445][46663] Updated weights for policy 1, policy_version 44001 (0.0008) +[2023-10-13 02:04:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 90144768. Throughput: 0: 1691.4, 1: 1672.8. Samples: 22547376. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 02:04:18,607][45375] Avg episode reward: [(0, '44.710'), (1, '50.160')] +[2023-10-13 02:04:18,802][46663] Updated weights for policy 1, policy_version 44011 (0.0009) +[2023-10-13 02:04:19,171][46663] Updated weights for policy 1, policy_version 44021 (0.0011) +[2023-10-13 02:04:19,540][46663] Updated weights for policy 1, policy_version 44031 (0.0008) +[2023-10-13 02:04:19,732][46662] Updated weights for policy 0, policy_version 44040 (0.0008) +[2023-10-13 02:04:20,108][46662] Updated weights for policy 0, policy_version 44050 (0.0008) +[2023-10-13 02:04:20,474][46662] Updated weights for policy 0, policy_version 44060 (0.0008) +[2023-10-13 02:04:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 90210304. Throughput: 0: 1696.1, 1: 1674.3. Samples: 22568124. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 02:04:23,608][45375] Avg episode reward: [(0, '45.980'), (1, '50.230')] +[2023-10-13 02:04:23,694][46663] Updated weights for policy 1, policy_version 44041 (0.0007) +[2023-10-13 02:04:24,054][46663] Updated weights for policy 1, policy_version 44051 (0.0007) +[2023-10-13 02:04:24,420][46663] Updated weights for policy 1, policy_version 44061 (0.0009) +[2023-10-13 02:04:24,494][46662] Updated weights for policy 0, policy_version 44070 (0.0008) +[2023-10-13 02:04:24,865][46662] Updated weights for policy 0, policy_version 44080 (0.0010) +[2023-10-13 02:04:25,233][46662] Updated weights for policy 0, policy_version 44090 (0.0010) +[2023-10-13 02:04:28,514][46663] Updated weights for policy 1, policy_version 44071 (0.0009) +[2023-10-13 02:04:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 90275840. Throughput: 0: 1667.0, 1: 1676.7. Samples: 22577190. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 02:04:28,607][45375] Avg episode reward: [(0, '45.720'), (1, '49.210')] +[2023-10-13 02:04:28,889][46663] Updated weights for policy 1, policy_version 44081 (0.0007) +[2023-10-13 02:04:29,266][46663] Updated weights for policy 1, policy_version 44091 (0.0008) +[2023-10-13 02:04:29,476][46662] Updated weights for policy 0, policy_version 44100 (0.0009) +[2023-10-13 02:04:29,854][46662] Updated weights for policy 0, policy_version 44110 (0.0010) +[2023-10-13 02:04:30,215][46662] Updated weights for policy 0, policy_version 44120 (0.0010) +[2023-10-13 02:04:33,359][46663] Updated weights for policy 1, policy_version 44101 (0.0008) +[2023-10-13 02:04:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90341376. Throughput: 0: 1689.7, 1: 1677.1. Samples: 22597770. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 02:04:33,607][45375] Avg episode reward: [(0, '46.840'), (1, '49.730')] +[2023-10-13 02:04:33,729][46663] Updated weights for policy 1, policy_version 44111 (0.0007) +[2023-10-13 02:04:34,103][46663] Updated weights for policy 1, policy_version 44121 (0.0007) +[2023-10-13 02:04:34,182][46662] Updated weights for policy 0, policy_version 44130 (0.0011) +[2023-10-13 02:04:34,547][46662] Updated weights for policy 0, policy_version 44140 (0.0010) +[2023-10-13 02:04:34,935][46662] Updated weights for policy 0, policy_version 44150 (0.0011) +[2023-10-13 02:04:35,300][46662] Updated weights for policy 0, policy_version 44160 (0.0008) +[2023-10-13 02:04:38,224][46663] Updated weights for policy 1, policy_version 44131 (0.0009) +[2023-10-13 02:04:38,587][46663] Updated weights for policy 1, policy_version 44141 (0.0009) +[2023-10-13 02:04:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90406912. Throughput: 0: 1688.9, 1: 1667.0. Samples: 22617994. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-13 02:04:38,607][45375] Avg episode reward: [(0, '47.680'), (1, '50.040')] +[2023-10-13 02:04:38,956][46663] Updated weights for policy 1, policy_version 44151 (0.0008) +[2023-10-13 02:04:39,402][46662] Updated weights for policy 0, policy_version 44170 (0.0009) +[2023-10-13 02:04:39,784][46662] Updated weights for policy 0, policy_version 44180 (0.0009) +[2023-10-13 02:04:40,156][46662] Updated weights for policy 0, policy_version 44190 (0.0009) +[2023-10-13 02:04:43,088][46663] Updated weights for policy 1, policy_version 44161 (0.0007) +[2023-10-13 02:04:43,513][46663] Updated weights for policy 1, policy_version 44171 (0.0009) +[2023-10-13 02:04:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 90472448. Throughput: 0: 1671.4, 1: 1675.2. Samples: 22627436. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:04:43,608][45375] Avg episode reward: [(0, '47.560'), (1, '49.950')] +[2023-10-13 02:04:43,875][46663] Updated weights for policy 1, policy_version 44181 (0.0010) +[2023-10-13 02:04:44,177][46662] Updated weights for policy 0, policy_version 44200 (0.0008) +[2023-10-13 02:04:44,238][46663] Updated weights for policy 1, policy_version 44191 (0.0007) +[2023-10-13 02:04:44,539][46662] Updated weights for policy 0, policy_version 44210 (0.0009) +[2023-10-13 02:04:44,917][46662] Updated weights for policy 0, policy_version 44220 (0.0008) +[2023-10-13 02:04:48,210][46663] Updated weights for policy 1, policy_version 44201 (0.0008) +[2023-10-13 02:04:48,579][46663] Updated weights for policy 1, policy_version 44211 (0.0007) +[2023-10-13 02:04:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90537984. Throughput: 0: 1687.2, 1: 1667.5. Samples: 22647968. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:04:48,607][45375] Avg episode reward: [(0, '47.670'), (1, '51.530')] +[2023-10-13 02:04:48,956][46663] Updated weights for policy 1, policy_version 44221 (0.0009) +[2023-10-13 02:04:49,028][46662] Updated weights for policy 0, policy_version 44230 (0.0008) +[2023-10-13 02:04:49,393][46662] Updated weights for policy 0, policy_version 44240 (0.0010) +[2023-10-13 02:04:49,760][46662] Updated weights for policy 0, policy_version 44250 (0.0007) +[2023-10-13 02:04:53,094][46663] Updated weights for policy 1, policy_version 44231 (0.0008) +[2023-10-13 02:04:53,467][46663] Updated weights for policy 1, policy_version 44241 (0.0008) +[2023-10-13 02:04:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90603520. Throughput: 0: 1680.4, 1: 1649.0. Samples: 22667644. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:04:53,607][45375] Avg episode reward: [(0, '48.640'), (1, '51.690')] +[2023-10-13 02:04:53,777][46662] Updated weights for policy 0, policy_version 44260 (0.0007) +[2023-10-13 02:04:53,823][46663] Updated weights for policy 1, policy_version 44251 (0.0009) +[2023-10-13 02:04:54,149][46662] Updated weights for policy 0, policy_version 44270 (0.0009) +[2023-10-13 02:04:54,516][46662] Updated weights for policy 0, policy_version 44280 (0.0009) +[2023-10-13 02:04:57,925][46663] Updated weights for policy 1, policy_version 44261 (0.0008) +[2023-10-13 02:04:58,304][46663] Updated weights for policy 1, policy_version 44271 (0.0007) +[2023-10-13 02:04:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 90669056. Throughput: 0: 1676.6, 1: 1660.6. Samples: 22677308. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:04:58,607][45375] Avg episode reward: [(0, '48.240'), (1, '51.300')] +[2023-10-13 02:04:58,641][46662] Updated weights for policy 0, policy_version 44290 (0.0009) +[2023-10-13 02:04:58,669][46663] Updated weights for policy 1, policy_version 44281 (0.0009) +[2023-10-13 02:04:59,024][46662] Updated weights for policy 0, policy_version 44300 (0.0009) +[2023-10-13 02:04:59,397][46662] Updated weights for policy 0, policy_version 44310 (0.0009) +[2023-10-13 02:04:59,775][46662] Updated weights for policy 0, policy_version 44320 (0.0008) +[2023-10-13 02:05:02,689][46663] Updated weights for policy 1, policy_version 44291 (0.0008) +[2023-10-13 02:05:03,055][46663] Updated weights for policy 1, policy_version 44301 (0.0011) +[2023-10-13 02:05:03,427][46663] Updated weights for policy 1, policy_version 44311 (0.0007) +[2023-10-13 02:05:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90734592. Throughput: 0: 1677.5, 1: 1670.1. Samples: 22698018. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:05:03,607][45375] Avg episode reward: [(0, '48.110'), (1, '52.720')] +[2023-10-13 02:05:03,898][46662] Updated weights for policy 0, policy_version 44330 (0.0008) +[2023-10-13 02:05:04,281][46662] Updated weights for policy 0, policy_version 44340 (0.0008) +[2023-10-13 02:05:04,658][46662] Updated weights for policy 0, policy_version 44350 (0.0009) +[2023-10-13 02:05:07,521][46663] Updated weights for policy 1, policy_version 44321 (0.0008) +[2023-10-13 02:05:07,884][46663] Updated weights for policy 1, policy_version 44331 (0.0011) +[2023-10-13 02:05:08,247][46663] Updated weights for policy 1, policy_version 44341 (0.0009) +[2023-10-13 02:05:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90800128. Throughput: 0: 1679.7, 1: 1646.2. Samples: 22717790. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:05:08,607][45375] Avg episode reward: [(0, '47.370'), (1, '52.080')] +[2023-10-13 02:05:08,613][46663] Updated weights for policy 1, policy_version 44351 (0.0008) +[2023-10-13 02:05:08,655][46662] Updated weights for policy 0, policy_version 44360 (0.0009) +[2023-10-13 02:05:09,025][46662] Updated weights for policy 0, policy_version 44370 (0.0009) +[2023-10-13 02:05:09,395][46662] Updated weights for policy 0, policy_version 44380 (0.0007) +[2023-10-13 02:05:12,652][46663] Updated weights for policy 1, policy_version 44361 (0.0011) +[2023-10-13 02:05:13,023][46663] Updated weights for policy 1, policy_version 44371 (0.0007) +[2023-10-13 02:05:13,386][46663] Updated weights for policy 1, policy_version 44381 (0.0009) +[2023-10-13 02:05:13,522][46662] Updated weights for policy 0, policy_version 44390 (0.0007) +[2023-10-13 02:05:13,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90898432. Throughput: 0: 1682.3, 1: 1664.3. Samples: 22727786. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 02:05:13,607][45375] Avg episode reward: [(0, '47.460'), (1, '51.010')] +[2023-10-13 02:05:13,899][46662] Updated weights for policy 0, policy_version 44400 (0.0008) +[2023-10-13 02:05:14,278][46662] Updated weights for policy 0, policy_version 44410 (0.0007) +[2023-10-13 02:05:17,515][46663] Updated weights for policy 1, policy_version 44391 (0.0010) +[2023-10-13 02:05:17,874][46663] Updated weights for policy 1, policy_version 44401 (0.0010) +[2023-10-13 02:05:18,235][46663] Updated weights for policy 1, policy_version 44411 (0.0008) +[2023-10-13 02:05:18,348][46662] Updated weights for policy 0, policy_version 44420 (0.0008) +[2023-10-13 02:05:18,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90963968. Throughput: 0: 1684.6, 1: 1661.9. Samples: 22748362. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 02:05:18,607][45375] Avg episode reward: [(0, '49.770'), (1, '51.380')] +[2023-10-13 02:05:18,715][46662] Updated weights for policy 0, policy_version 44430 (0.0009) +[2023-10-13 02:05:19,089][46662] Updated weights for policy 0, policy_version 44440 (0.0007) +[2023-10-13 02:05:22,366][46663] Updated weights for policy 1, policy_version 44421 (0.0008) +[2023-10-13 02:05:22,733][46663] Updated weights for policy 1, policy_version 44431 (0.0009) +[2023-10-13 02:05:23,105][46663] Updated weights for policy 1, policy_version 44441 (0.0007) +[2023-10-13 02:05:23,224][46662] Updated weights for policy 0, policy_version 44450 (0.0008) +[2023-10-13 02:05:23,592][46662] Updated weights for policy 0, policy_version 44460 (0.0007) +[2023-10-13 02:05:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 91029504. Throughput: 0: 1681.3, 1: 1648.9. Samples: 22767854. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 02:05:23,607][45375] Avg episode reward: [(0, '49.670'), (1, '51.770')] +[2023-10-13 02:05:23,960][46662] Updated weights for policy 0, policy_version 44470 (0.0009) +[2023-10-13 02:05:24,336][46662] Updated weights for policy 0, policy_version 44480 (0.0010) +[2023-10-13 02:05:27,277][46663] Updated weights for policy 1, policy_version 44451 (0.0009) +[2023-10-13 02:05:27,649][46663] Updated weights for policy 1, policy_version 44461 (0.0010) +[2023-10-13 02:05:28,023][46663] Updated weights for policy 1, policy_version 44471 (0.0011) +[2023-10-13 02:05:28,431][46662] Updated weights for policy 0, policy_version 44490 (0.0009) +[2023-10-13 02:05:28,606][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 91095040. Throughput: 0: 1678.0, 1: 1667.8. Samples: 22777998. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 02:05:28,607][45375] Avg episode reward: [(0, '48.970'), (1, '52.260')] +[2023-10-13 02:05:28,797][46662] Updated weights for policy 0, policy_version 44500 (0.0007) +[2023-10-13 02:05:29,169][46662] Updated weights for policy 0, policy_version 44510 (0.0007) +[2023-10-13 02:05:32,200][46663] Updated weights for policy 1, policy_version 44481 (0.0008) +[2023-10-13 02:05:32,577][46663] Updated weights for policy 1, policy_version 44491 (0.0011) +[2023-10-13 02:05:32,947][46663] Updated weights for policy 1, policy_version 44501 (0.0008) +[2023-10-13 02:05:33,264][46662] Updated weights for policy 0, policy_version 44520 (0.0007) +[2023-10-13 02:05:33,319][46663] Updated weights for policy 1, policy_version 44511 (0.0007) +[2023-10-13 02:05:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91160576. Throughput: 0: 1676.1, 1: 1667.7. Samples: 22798442. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 02:05:33,607][45375] Avg episode reward: [(0, '50.690'), (1, '53.120')] +[2023-10-13 02:05:33,637][46662] Updated weights for policy 0, policy_version 44530 (0.0008) +[2023-10-13 02:05:34,009][46662] Updated weights for policy 0, policy_version 44540 (0.0007) +[2023-10-13 02:05:37,528][46663] Updated weights for policy 1, policy_version 44521 (0.0007) +[2023-10-13 02:05:37,895][46663] Updated weights for policy 1, policy_version 44531 (0.0008) +[2023-10-13 02:05:37,948][46662] Updated weights for policy 0, policy_version 44550 (0.0010) +[2023-10-13 02:05:38,261][46663] Updated weights for policy 1, policy_version 44541 (0.0009) +[2023-10-13 02:05:38,318][46662] Updated weights for policy 0, policy_version 44560 (0.0009) +[2023-10-13 02:05:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91226112. Throughput: 0: 1679.3, 1: 1661.6. Samples: 22817984. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 02:05:38,607][45375] Avg episode reward: [(0, '49.650'), (1, '52.700')] +[2023-10-13 02:05:38,688][46662] Updated weights for policy 0, policy_version 44570 (0.0009) +[2023-10-13 02:05:42,344][46663] Updated weights for policy 1, policy_version 44551 (0.0008) +[2023-10-13 02:05:42,628][46662] Updated weights for policy 0, policy_version 44580 (0.0009) +[2023-10-13 02:05:42,709][46663] Updated weights for policy 1, policy_version 44561 (0.0008) +[2023-10-13 02:05:43,004][46662] Updated weights for policy 0, policy_version 44590 (0.0008) +[2023-10-13 02:05:43,084][46663] Updated weights for policy 1, policy_version 44571 (0.0009) +[2023-10-13 02:05:43,379][46662] Updated weights for policy 0, policy_version 44600 (0.0010) +[2023-10-13 02:05:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91291648. Throughput: 0: 1682.9, 1: 1671.5. Samples: 22828260. Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-13 02:05:43,608][45375] Avg episode reward: [(0, '49.300'), (1, '52.120')] +[2023-10-13 02:05:47,097][46663] Updated weights for policy 1, policy_version 44581 (0.0008) +[2023-10-13 02:05:47,465][46663] Updated weights for policy 1, policy_version 44591 (0.0009) +[2023-10-13 02:05:47,631][46662] Updated weights for policy 0, policy_version 44610 (0.0010) +[2023-10-13 02:05:47,826][46663] Updated weights for policy 1, policy_version 44601 (0.0009) +[2023-10-13 02:05:48,043][46662] Updated weights for policy 0, policy_version 44620 (0.0008) +[2023-10-13 02:05:48,417][46662] Updated weights for policy 0, policy_version 44630 (0.0011) +[2023-10-13 02:05:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91357184. Throughput: 0: 1681.6, 1: 1658.8. Samples: 22848336. 
Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-13 02:05:48,607][45375] Avg episode reward: [(0, '49.510'), (1, '52.070')] +[2023-10-13 02:05:48,779][46662] Updated weights for policy 0, policy_version 44640 (0.0009) +[2023-10-13 02:05:51,886][46663] Updated weights for policy 1, policy_version 44611 (0.0009) +[2023-10-13 02:05:52,245][46663] Updated weights for policy 1, policy_version 44621 (0.0007) +[2023-10-13 02:05:52,617][46663] Updated weights for policy 1, policy_version 44631 (0.0009) +[2023-10-13 02:05:52,981][46662] Updated weights for policy 0, policy_version 44650 (0.0009) +[2023-10-13 02:05:53,341][46662] Updated weights for policy 0, policy_version 44660 (0.0008) +[2023-10-13 02:05:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91422720. Throughput: 0: 1667.8, 1: 1665.8. Samples: 22867804. Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-13 02:05:53,607][45375] Avg episode reward: [(0, '49.170'), (1, '52.740')] +[2023-10-13 02:05:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000044640_45711360.pth... +[2023-10-13 02:05:53,644][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000043072_44105728.pth +[2023-10-13 02:05:53,710][46662] Updated weights for policy 0, policy_version 44670 (0.0008) +[2023-10-13 02:05:53,784][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000044672_45744128.pth... 
+[2023-10-13 02:05:53,815][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000043104_44138496.pth +[2023-10-13 02:05:56,686][46663] Updated weights for policy 1, policy_version 44641 (0.0009) +[2023-10-13 02:05:57,063][46663] Updated weights for policy 1, policy_version 44651 (0.0007) +[2023-10-13 02:05:57,424][46663] Updated weights for policy 1, policy_version 44661 (0.0009) +[2023-10-13 02:05:57,648][46662] Updated weights for policy 0, policy_version 44680 (0.0008) +[2023-10-13 02:05:57,790][46663] Updated weights for policy 1, policy_version 44671 (0.0009) +[2023-10-13 02:05:58,014][46662] Updated weights for policy 0, policy_version 44690 (0.0009) +[2023-10-13 02:05:58,386][46662] Updated weights for policy 0, policy_version 44700 (0.0009) +[2023-10-13 02:05:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 91521024. Throughput: 0: 1673.1, 1: 1673.2. Samples: 22878370. Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-13 02:05:58,607][45375] Avg episode reward: [(0, '49.620'), (1, '52.150')] +[2023-10-13 02:06:01,930][46663] Updated weights for policy 1, policy_version 44681 (0.0008) +[2023-10-13 02:06:02,305][46663] Updated weights for policy 1, policy_version 44691 (0.0008) +[2023-10-13 02:06:02,440][46662] Updated weights for policy 0, policy_version 44710 (0.0009) +[2023-10-13 02:06:02,675][46663] Updated weights for policy 1, policy_version 44701 (0.0008) +[2023-10-13 02:06:02,812][46662] Updated weights for policy 0, policy_version 44720 (0.0007) +[2023-10-13 02:06:03,191][46662] Updated weights for policy 0, policy_version 44730 (0.0007) +[2023-10-13 02:06:03,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 91586560. Throughput: 0: 1673.8, 1: 1659.7. Samples: 22898368. 
Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-13 02:06:03,608][45375] Avg episode reward: [(0, '51.420'), (1, '51.500')] +[2023-10-13 02:06:06,821][46663] Updated weights for policy 1, policy_version 44711 (0.0007) +[2023-10-13 02:06:07,187][46663] Updated weights for policy 1, policy_version 44721 (0.0008) +[2023-10-13 02:06:07,321][46662] Updated weights for policy 0, policy_version 44740 (0.0008) +[2023-10-13 02:06:07,559][46663] Updated weights for policy 1, policy_version 44731 (0.0007) +[2023-10-13 02:06:07,696][46662] Updated weights for policy 0, policy_version 44750 (0.0007) +[2023-10-13 02:06:08,065][46662] Updated weights for policy 0, policy_version 44760 (0.0007) +[2023-10-13 02:06:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 91652096. Throughput: 0: 1662.6, 1: 1670.8. Samples: 22917856. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:08,607][45375] Avg episode reward: [(0, '52.890'), (1, '50.680')] +[2023-10-13 02:06:11,644][46663] Updated weights for policy 1, policy_version 44741 (0.0008) +[2023-10-13 02:06:12,009][46663] Updated weights for policy 1, policy_version 44751 (0.0010) +[2023-10-13 02:06:12,198][46662] Updated weights for policy 0, policy_version 44770 (0.0009) +[2023-10-13 02:06:12,370][46663] Updated weights for policy 1, policy_version 44761 (0.0008) +[2023-10-13 02:06:12,564][46662] Updated weights for policy 0, policy_version 44780 (0.0009) +[2023-10-13 02:06:12,946][46662] Updated weights for policy 0, policy_version 44790 (0.0010) +[2023-10-13 02:06:13,324][46662] Updated weights for policy 0, policy_version 44800 (0.0009) +[2023-10-13 02:06:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 91717632. Throughput: 0: 1676.0, 1: 1671.1. Samples: 22928616. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:13,607][45375] Avg episode reward: [(0, '53.610'), (1, '50.730')] +[2023-10-13 02:06:16,575][46663] Updated weights for policy 1, policy_version 44771 (0.0008) +[2023-10-13 02:06:16,930][46663] Updated weights for policy 1, policy_version 44781 (0.0010) +[2023-10-13 02:06:17,297][46663] Updated weights for policy 1, policy_version 44791 (0.0009) +[2023-10-13 02:06:17,338][46662] Updated weights for policy 0, policy_version 44810 (0.0008) +[2023-10-13 02:06:17,710][46662] Updated weights for policy 0, policy_version 44820 (0.0007) +[2023-10-13 02:06:18,065][46662] Updated weights for policy 0, policy_version 44830 (0.0009) +[2023-10-13 02:06:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 91783168. Throughput: 0: 1674.8, 1: 1652.2. Samples: 22948160. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:18,607][45375] Avg episode reward: [(0, '52.490'), (1, '50.820')] +[2023-10-13 02:06:21,380][46663] Updated weights for policy 1, policy_version 44801 (0.0008) +[2023-10-13 02:06:21,772][46663] Updated weights for policy 1, policy_version 44811 (0.0010) +[2023-10-13 02:06:22,059][46662] Updated weights for policy 0, policy_version 44840 (0.0007) +[2023-10-13 02:06:22,138][46663] Updated weights for policy 1, policy_version 44821 (0.0009) +[2023-10-13 02:06:22,424][46662] Updated weights for policy 0, policy_version 44850 (0.0008) +[2023-10-13 02:06:22,504][46663] Updated weights for policy 1, policy_version 44831 (0.0008) +[2023-10-13 02:06:22,788][46662] Updated weights for policy 0, policy_version 44860 (0.0008) +[2023-10-13 02:06:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 91848704. Throughput: 0: 1655.7, 1: 1669.7. Samples: 22967626. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:23,608][45375] Avg episode reward: [(0, '50.370'), (1, '51.920')] +[2023-10-13 02:06:26,635][46663] Updated weights for policy 1, policy_version 44841 (0.0010) +[2023-10-13 02:06:26,864][46662] Updated weights for policy 0, policy_version 44870 (0.0009) +[2023-10-13 02:06:27,005][46663] Updated weights for policy 1, policy_version 44851 (0.0007) +[2023-10-13 02:06:27,232][46662] Updated weights for policy 0, policy_version 44880 (0.0008) +[2023-10-13 02:06:27,368][46663] Updated weights for policy 1, policy_version 44861 (0.0009) +[2023-10-13 02:06:27,599][46662] Updated weights for policy 0, policy_version 44890 (0.0009) +[2023-10-13 02:06:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 91914240. Throughput: 0: 1677.6, 1: 1673.3. Samples: 22979050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:28,607][45375] Avg episode reward: [(0, '49.420'), (1, '51.450')] +[2023-10-13 02:06:31,504][46663] Updated weights for policy 1, policy_version 44871 (0.0009) +[2023-10-13 02:06:31,516][46662] Updated weights for policy 0, policy_version 44900 (0.0010) +[2023-10-13 02:06:31,863][46663] Updated weights for policy 1, policy_version 44881 (0.0007) +[2023-10-13 02:06:31,885][46662] Updated weights for policy 0, policy_version 44910 (0.0007) +[2023-10-13 02:06:32,229][46663] Updated weights for policy 1, policy_version 44891 (0.0008) +[2023-10-13 02:06:32,255][46662] Updated weights for policy 0, policy_version 44920 (0.0007) +[2023-10-13 02:06:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 91979776. Throughput: 0: 1673.0, 1: 1658.8. Samples: 22998266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:33,607][45375] Avg episode reward: [(0, '48.030'), (1, '51.590')] +[2023-10-13 02:06:36,416][46663] Updated weights for policy 1, policy_version 44901 (0.0008) +[2023-10-13 02:06:36,435][46662] Updated weights for policy 0, policy_version 44930 (0.0010) +[2023-10-13 02:06:36,787][46663] Updated weights for policy 1, policy_version 44911 (0.0009) +[2023-10-13 02:06:36,828][46662] Updated weights for policy 0, policy_version 44940 (0.0007) +[2023-10-13 02:06:37,156][46663] Updated weights for policy 1, policy_version 44921 (0.0009) +[2023-10-13 02:06:37,199][46662] Updated weights for policy 0, policy_version 44950 (0.0008) +[2023-10-13 02:06:37,566][46662] Updated weights for policy 0, policy_version 44960 (0.0007) +[2023-10-13 02:06:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92045312. Throughput: 0: 1666.5, 1: 1670.9. Samples: 23017988. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:38,607][45375] Avg episode reward: [(0, '48.270'), (1, '50.130')] +[2023-10-13 02:06:41,343][46663] Updated weights for policy 1, policy_version 44931 (0.0007) +[2023-10-13 02:06:41,463][46662] Updated weights for policy 0, policy_version 44970 (0.0009) +[2023-10-13 02:06:41,706][46663] Updated weights for policy 1, policy_version 44941 (0.0007) +[2023-10-13 02:06:41,834][46662] Updated weights for policy 0, policy_version 44980 (0.0008) +[2023-10-13 02:06:42,063][46663] Updated weights for policy 1, policy_version 44951 (0.0010) +[2023-10-13 02:06:42,205][46662] Updated weights for policy 0, policy_version 44990 (0.0007) +[2023-10-13 02:06:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92110848. Throughput: 0: 1699.6, 1: 1665.1. Samples: 23029784. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:43,607][45375] Avg episode reward: [(0, '49.630'), (1, '50.790')] +[2023-10-13 02:06:46,101][46663] Updated weights for policy 1, policy_version 44961 (0.0009) +[2023-10-13 02:06:46,244][46662] Updated weights for policy 0, policy_version 45000 (0.0008) +[2023-10-13 02:06:46,462][46663] Updated weights for policy 1, policy_version 44971 (0.0009) +[2023-10-13 02:06:46,614][46662] Updated weights for policy 0, policy_version 45010 (0.0009) +[2023-10-13 02:06:46,819][46663] Updated weights for policy 1, policy_version 44981 (0.0008) +[2023-10-13 02:06:46,985][46662] Updated weights for policy 0, policy_version 45020 (0.0007) +[2023-10-13 02:06:47,193][46663] Updated weights for policy 1, policy_version 44991 (0.0009) +[2023-10-13 02:06:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92176384. Throughput: 0: 1674.4, 1: 1662.7. Samples: 23048534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:48,607][45375] Avg episode reward: [(0, '49.650'), (1, '49.120')] +[2023-10-13 02:06:51,138][46662] Updated weights for policy 0, policy_version 45030 (0.0008) +[2023-10-13 02:06:51,237][46663] Updated weights for policy 1, policy_version 45001 (0.0008) +[2023-10-13 02:06:51,500][46662] Updated weights for policy 0, policy_version 45040 (0.0009) +[2023-10-13 02:06:51,598][46663] Updated weights for policy 1, policy_version 45011 (0.0007) +[2023-10-13 02:06:51,871][46662] Updated weights for policy 0, policy_version 45050 (0.0009) +[2023-10-13 02:06:51,972][46663] Updated weights for policy 1, policy_version 45021 (0.0008) +[2023-10-13 02:06:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92241920. Throughput: 0: 1677.0, 1: 1676.8. Samples: 23068778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:53,608][45375] Avg episode reward: [(0, '49.470'), (1, '49.410')] +[2023-10-13 02:06:55,912][46662] Updated weights for policy 0, policy_version 45060 (0.0008) +[2023-10-13 02:06:55,947][46663] Updated weights for policy 1, policy_version 45031 (0.0008) +[2023-10-13 02:06:56,275][46662] Updated weights for policy 0, policy_version 45070 (0.0009) +[2023-10-13 02:06:56,316][46663] Updated weights for policy 1, policy_version 45041 (0.0007) +[2023-10-13 02:06:56,651][46662] Updated weights for policy 0, policy_version 45080 (0.0009) +[2023-10-13 02:06:56,682][46663] Updated weights for policy 1, policy_version 45051 (0.0007) +[2023-10-13 02:06:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92307456. Throughput: 0: 1695.6, 1: 1663.6. Samples: 23079778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:06:58,607][45375] Avg episode reward: [(0, '48.520'), (1, '48.260')] +[2023-10-13 02:07:00,777][46662] Updated weights for policy 0, policy_version 45090 (0.0008) +[2023-10-13 02:07:00,842][46663] Updated weights for policy 1, policy_version 45061 (0.0008) +[2023-10-13 02:07:01,145][46662] Updated weights for policy 0, policy_version 45100 (0.0007) +[2023-10-13 02:07:01,218][46663] Updated weights for policy 1, policy_version 45071 (0.0010) +[2023-10-13 02:07:01,519][46662] Updated weights for policy 0, policy_version 45110 (0.0007) +[2023-10-13 02:07:01,582][46663] Updated weights for policy 1, policy_version 45081 (0.0007) +[2023-10-13 02:07:01,881][46662] Updated weights for policy 0, policy_version 45120 (0.0008) +[2023-10-13 02:07:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92372992. Throughput: 0: 1665.4, 1: 1675.0. Samples: 23098476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:07:03,608][45375] Avg episode reward: [(0, '47.050'), (1, '49.170')] +[2023-10-13 02:07:05,527][46663] Updated weights for policy 1, policy_version 45091 (0.0008) +[2023-10-13 02:07:05,896][46663] Updated weights for policy 1, policy_version 45101 (0.0009) +[2023-10-13 02:07:06,244][46662] Updated weights for policy 0, policy_version 45130 (0.0010) +[2023-10-13 02:07:06,277][46663] Updated weights for policy 1, policy_version 45111 (0.0008) +[2023-10-13 02:07:06,608][46662] Updated weights for policy 0, policy_version 45140 (0.0007) +[2023-10-13 02:07:06,982][46662] Updated weights for policy 0, policy_version 45150 (0.0007) +[2023-10-13 02:07:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92438528. Throughput: 0: 1676.3, 1: 1680.4. Samples: 23118676. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:07:08,607][45375] Avg episode reward: [(0, '45.990'), (1, '48.210')] +[2023-10-13 02:07:10,281][46663] Updated weights for policy 1, policy_version 45121 (0.0009) +[2023-10-13 02:07:10,697][46663] Updated weights for policy 1, policy_version 45131 (0.0010) +[2023-10-13 02:07:11,057][46662] Updated weights for policy 0, policy_version 45160 (0.0010) +[2023-10-13 02:07:11,058][46663] Updated weights for policy 1, policy_version 45141 (0.0008) +[2023-10-13 02:07:11,423][46662] Updated weights for policy 0, policy_version 45170 (0.0009) +[2023-10-13 02:07:11,434][46663] Updated weights for policy 1, policy_version 45151 (0.0008) +[2023-10-13 02:07:11,802][46662] Updated weights for policy 0, policy_version 45180 (0.0009) +[2023-10-13 02:07:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92504064. Throughput: 0: 1677.9, 1: 1654.8. Samples: 23129020. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:13,607][45375] Avg episode reward: [(0, '46.560'), (1, '47.920')] +[2023-10-13 02:07:15,504][46663] Updated weights for policy 1, policy_version 45161 (0.0007) +[2023-10-13 02:07:15,860][46662] Updated weights for policy 0, policy_version 45190 (0.0008) +[2023-10-13 02:07:15,871][46663] Updated weights for policy 1, policy_version 45171 (0.0009) +[2023-10-13 02:07:16,223][46662] Updated weights for policy 0, policy_version 45200 (0.0007) +[2023-10-13 02:07:16,240][46663] Updated weights for policy 1, policy_version 45181 (0.0009) +[2023-10-13 02:07:16,597][46662] Updated weights for policy 0, policy_version 45210 (0.0008) +[2023-10-13 02:07:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92569600. Throughput: 0: 1659.4, 1: 1675.0. Samples: 23148314. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:18,607][45375] Avg episode reward: [(0, '46.460'), (1, '47.560')] +[2023-10-13 02:07:20,262][46663] Updated weights for policy 1, policy_version 45191 (0.0008) +[2023-10-13 02:07:20,618][46663] Updated weights for policy 1, policy_version 45201 (0.0008) +[2023-10-13 02:07:20,744][46662] Updated weights for policy 0, policy_version 45220 (0.0010) +[2023-10-13 02:07:20,994][46663] Updated weights for policy 1, policy_version 45211 (0.0008) +[2023-10-13 02:07:21,137][46662] Updated weights for policy 0, policy_version 45230 (0.0008) +[2023-10-13 02:07:21,498][46662] Updated weights for policy 0, policy_version 45240 (0.0008) +[2023-10-13 02:07:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92635136. Throughput: 0: 1677.4, 1: 1674.4. Samples: 23168816. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:23,608][45375] Avg episode reward: [(0, '47.010'), (1, '48.950')] +[2023-10-13 02:07:25,290][46663] Updated weights for policy 1, policy_version 45221 (0.0008) +[2023-10-13 02:07:25,661][46663] Updated weights for policy 1, policy_version 45231 (0.0009) +[2023-10-13 02:07:25,700][46662] Updated weights for policy 0, policy_version 45250 (0.0007) +[2023-10-13 02:07:26,033][46663] Updated weights for policy 1, policy_version 45241 (0.0010) +[2023-10-13 02:07:26,063][46662] Updated weights for policy 0, policy_version 45260 (0.0008) +[2023-10-13 02:07:26,431][46662] Updated weights for policy 0, policy_version 45270 (0.0009) +[2023-10-13 02:07:26,801][46662] Updated weights for policy 0, policy_version 45280 (0.0007) +[2023-10-13 02:07:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92700672. Throughput: 0: 1661.3, 1: 1648.3. Samples: 23178716. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:28,607][45375] Avg episode reward: [(0, '46.900'), (1, '48.990')] +[2023-10-13 02:07:30,339][46663] Updated weights for policy 1, policy_version 45251 (0.0008) +[2023-10-13 02:07:30,607][46662] Updated weights for policy 0, policy_version 45290 (0.0007) +[2023-10-13 02:07:30,711][46663] Updated weights for policy 1, policy_version 45261 (0.0009) +[2023-10-13 02:07:30,976][46662] Updated weights for policy 0, policy_version 45300 (0.0008) +[2023-10-13 02:07:31,077][46663] Updated weights for policy 1, policy_version 45271 (0.0007) +[2023-10-13 02:07:31,348][46662] Updated weights for policy 0, policy_version 45310 (0.0008) +[2023-10-13 02:07:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 92766208. Throughput: 0: 1659.5, 1: 1662.8. Samples: 23198038. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:33,608][45375] Avg episode reward: [(0, '48.120'), (1, '47.740')] +[2023-10-13 02:07:35,131][46663] Updated weights for policy 1, policy_version 45281 (0.0009) +[2023-10-13 02:07:35,494][46663] Updated weights for policy 1, policy_version 45291 (0.0008) +[2023-10-13 02:07:35,531][46662] Updated weights for policy 0, policy_version 45320 (0.0009) +[2023-10-13 02:07:35,865][46663] Updated weights for policy 1, policy_version 45301 (0.0009) +[2023-10-13 02:07:35,900][46662] Updated weights for policy 0, policy_version 45330 (0.0010) +[2023-10-13 02:07:36,223][46663] Updated weights for policy 1, policy_version 45311 (0.0009) +[2023-10-13 02:07:36,264][46662] Updated weights for policy 0, policy_version 45340 (0.0007) +[2023-10-13 02:07:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 92831744. Throughput: 0: 1669.1, 1: 1654.3. Samples: 23218332. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:38,607][45375] Avg episode reward: [(0, '47.540'), (1, '46.630')] +[2023-10-13 02:07:40,302][46663] Updated weights for policy 1, policy_version 45321 (0.0007) +[2023-10-13 02:07:40,330][46662] Updated weights for policy 0, policy_version 45350 (0.0009) +[2023-10-13 02:07:40,661][46663] Updated weights for policy 1, policy_version 45331 (0.0007) +[2023-10-13 02:07:40,700][46662] Updated weights for policy 0, policy_version 45360 (0.0009) +[2023-10-13 02:07:41,034][46663] Updated weights for policy 1, policy_version 45341 (0.0008) +[2023-10-13 02:07:41,080][46662] Updated weights for policy 0, policy_version 45370 (0.0008) +[2023-10-13 02:07:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92897280. Throughput: 0: 1650.6, 1: 1642.1. Samples: 23227950. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-13 02:07:43,608][45375] Avg episode reward: [(0, '46.270'), (1, '47.270')] +[2023-10-13 02:07:45,271][46662] Updated weights for policy 0, policy_version 45380 (0.0008) +[2023-10-13 02:07:45,309][46663] Updated weights for policy 1, policy_version 45351 (0.0008) +[2023-10-13 02:07:45,642][46662] Updated weights for policy 0, policy_version 45390 (0.0009) +[2023-10-13 02:07:45,676][46663] Updated weights for policy 1, policy_version 45361 (0.0007) +[2023-10-13 02:07:46,006][46662] Updated weights for policy 0, policy_version 45400 (0.0008) +[2023-10-13 02:07:46,038][46663] Updated weights for policy 1, policy_version 45371 (0.0007) +[2023-10-13 02:07:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92962816. Throughput: 0: 1664.4, 1: 1657.8. Samples: 23247974. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:07:48,607][45375] Avg episode reward: [(0, '47.230'), (1, '46.180')] +[2023-10-13 02:07:50,114][46663] Updated weights for policy 1, policy_version 45381 (0.0008) +[2023-10-13 02:07:50,207][46662] Updated weights for policy 0, policy_version 45410 (0.0009) +[2023-10-13 02:07:50,474][46663] Updated weights for policy 1, policy_version 45391 (0.0007) +[2023-10-13 02:07:50,572][46662] Updated weights for policy 0, policy_version 45420 (0.0007) +[2023-10-13 02:07:50,839][46663] Updated weights for policy 1, policy_version 45401 (0.0009) +[2023-10-13 02:07:50,937][46662] Updated weights for policy 0, policy_version 45430 (0.0007) +[2023-10-13 02:07:51,310][46662] Updated weights for policy 0, policy_version 45440 (0.0007) +[2023-10-13 02:07:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 93028352. Throughput: 0: 1667.5, 1: 1654.1. Samples: 23268148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:07:53,608][45375] Avg episode reward: [(0, '48.550'), (1, '45.800')] +[2023-10-13 02:07:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000045440_46530560.pth... +[2023-10-13 02:07:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000045408_46497792.pth... +[2023-10-13 02:07:53,651][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000043872_44924928.pth +[2023-10-13 02:07:53,659][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000043840_44892160.pth +[2023-10-13 02:07:55,085][46663] Updated weights for policy 1, policy_version 45411 (0.0009) +[2023-10-13 02:07:55,365][46662] Updated weights for policy 0, policy_version 45450 (0.0009) +[2023-10-13 02:07:55,471][46663] Updated weights for policy 1, policy_version 45421 (0.0007) +[2023-10-13 02:07:55,729][46662] Updated weights for policy 0, policy_version 45460 (0.0009) +[2023-10-13 02:07:55,836][46663] Updated weights for policy 1, policy_version 45431 (0.0008) +[2023-10-13 02:07:56,098][46662] Updated weights for policy 0, policy_version 45470 (0.0009) +[2023-10-13 02:07:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93093888. Throughput: 0: 1649.1, 1: 1650.7. Samples: 23277512. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:07:58,607][45375] Avg episode reward: [(0, '49.160'), (1, '45.760')] +[2023-10-13 02:07:59,926][46663] Updated weights for policy 1, policy_version 45441 (0.0008) +[2023-10-13 02:08:00,200][46662] Updated weights for policy 0, policy_version 45480 (0.0008) +[2023-10-13 02:08:00,290][46663] Updated weights for policy 1, policy_version 45451 (0.0008) +[2023-10-13 02:08:00,566][46662] Updated weights for policy 0, policy_version 45490 (0.0009) +[2023-10-13 02:08:00,650][46663] Updated weights for policy 1, policy_version 45461 (0.0008) +[2023-10-13 02:08:00,935][46662] Updated weights for policy 0, policy_version 45500 (0.0009) +[2023-10-13 02:08:01,026][46663] Updated weights for policy 1, policy_version 45471 (0.0009) +[2023-10-13 02:08:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93159424. Throughput: 0: 1668.2, 1: 1656.0. Samples: 23297906. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:08:03,608][45375] Avg episode reward: [(0, '49.640'), (1, '46.670')] +[2023-10-13 02:08:05,010][46663] Updated weights for policy 1, policy_version 45481 (0.0009) +[2023-10-13 02:08:05,028][46662] Updated weights for policy 0, policy_version 45510 (0.0008) +[2023-10-13 02:08:05,383][46663] Updated weights for policy 1, policy_version 45491 (0.0007) +[2023-10-13 02:08:05,388][46662] Updated weights for policy 0, policy_version 45520 (0.0009) +[2023-10-13 02:08:05,740][46663] Updated weights for policy 1, policy_version 45501 (0.0008) +[2023-10-13 02:08:05,755][46662] Updated weights for policy 0, policy_version 45530 (0.0009) +[2023-10-13 02:08:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93224960. Throughput: 0: 1669.1, 1: 1662.5. Samples: 23318736. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:08:08,607][45375] Avg episode reward: [(0, '49.640'), (1, '46.040')] +[2023-10-13 02:08:09,697][46663] Updated weights for policy 1, policy_version 45511 (0.0008) +[2023-10-13 02:08:09,946][46662] Updated weights for policy 0, policy_version 45540 (0.0009) +[2023-10-13 02:08:10,067][46663] Updated weights for policy 1, policy_version 45521 (0.0008) +[2023-10-13 02:08:10,341][46662] Updated weights for policy 0, policy_version 45550 (0.0009) +[2023-10-13 02:08:10,441][46663] Updated weights for policy 1, policy_version 45531 (0.0008) +[2023-10-13 02:08:10,719][46662] Updated weights for policy 0, policy_version 45560 (0.0008) +[2023-10-13 02:08:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93290496. Throughput: 0: 1651.2, 1: 1668.6. Samples: 23328108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:08:13,607][45375] Avg episode reward: [(0, '49.380'), (1, '46.590')] +[2023-10-13 02:08:14,591][46663] Updated weights for policy 1, policy_version 45541 (0.0007) +[2023-10-13 02:08:14,799][46662] Updated weights for policy 0, policy_version 45570 (0.0011) +[2023-10-13 02:08:14,959][46663] Updated weights for policy 1, policy_version 45551 (0.0008) +[2023-10-13 02:08:15,167][46662] Updated weights for policy 0, policy_version 45580 (0.0009) +[2023-10-13 02:08:15,318][46663] Updated weights for policy 1, policy_version 45561 (0.0009) +[2023-10-13 02:08:15,530][46662] Updated weights for policy 0, policy_version 45590 (0.0009) +[2023-10-13 02:08:15,901][46662] Updated weights for policy 0, policy_version 45600 (0.0009) +[2023-10-13 02:08:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93356032. Throughput: 0: 1666.6, 1: 1672.1. Samples: 23348278. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:18,607][45375] Avg episode reward: [(0, '49.960'), (1, '47.820')] +[2023-10-13 02:08:19,332][46663] Updated weights for policy 1, policy_version 45571 (0.0010) +[2023-10-13 02:08:19,700][46663] Updated weights for policy 1, policy_version 45581 (0.0010) +[2023-10-13 02:08:19,988][46662] Updated weights for policy 0, policy_version 45610 (0.0008) +[2023-10-13 02:08:20,058][46663] Updated weights for policy 1, policy_version 45591 (0.0008) +[2023-10-13 02:08:20,363][46662] Updated weights for policy 0, policy_version 45620 (0.0008) +[2023-10-13 02:08:20,729][46662] Updated weights for policy 0, policy_version 45630 (0.0009) +[2023-10-13 02:08:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93421568. Throughput: 0: 1673.1, 1: 1676.9. Samples: 23369082. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:23,607][45375] Avg episode reward: [(0, '50.060'), (1, '47.770')] +[2023-10-13 02:08:24,174][46663] Updated weights for policy 1, policy_version 45601 (0.0007) +[2023-10-13 02:08:24,546][46663] Updated weights for policy 1, policy_version 45611 (0.0007) +[2023-10-13 02:08:24,565][46662] Updated weights for policy 0, policy_version 45640 (0.0008) +[2023-10-13 02:08:24,915][46663] Updated weights for policy 1, policy_version 45621 (0.0007) +[2023-10-13 02:08:24,926][46662] Updated weights for policy 0, policy_version 45650 (0.0010) +[2023-10-13 02:08:25,284][46663] Updated weights for policy 1, policy_version 45631 (0.0007) +[2023-10-13 02:08:25,290][46662] Updated weights for policy 0, policy_version 45660 (0.0010) +[2023-10-13 02:08:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93487104. Throughput: 0: 1666.0, 1: 1678.4. Samples: 23378444. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:28,607][45375] Avg episode reward: [(0, '49.990'), (1, '47.750')] +[2023-10-13 02:08:29,153][46662] Updated weights for policy 0, policy_version 45670 (0.0008) +[2023-10-13 02:08:29,370][46663] Updated weights for policy 1, policy_version 45641 (0.0007) +[2023-10-13 02:08:29,538][46662] Updated weights for policy 0, policy_version 45680 (0.0007) +[2023-10-13 02:08:29,744][46663] Updated weights for policy 1, policy_version 45651 (0.0007) +[2023-10-13 02:08:29,902][46662] Updated weights for policy 0, policy_version 45690 (0.0007) +[2023-10-13 02:08:30,111][46663] Updated weights for policy 1, policy_version 45661 (0.0009) +[2023-10-13 02:08:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 93552640. Throughput: 0: 1689.5, 1: 1677.3. Samples: 23399478. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:33,607][45375] Avg episode reward: [(0, '50.870'), (1, '45.910')] +[2023-10-13 02:08:33,890][46662] Updated weights for policy 0, policy_version 45700 (0.0007) +[2023-10-13 02:08:34,222][46663] Updated weights for policy 1, policy_version 45671 (0.0007) +[2023-10-13 02:08:34,262][46662] Updated weights for policy 0, policy_version 45710 (0.0007) +[2023-10-13 02:08:34,591][46663] Updated weights for policy 1, policy_version 45681 (0.0007) +[2023-10-13 02:08:34,626][46662] Updated weights for policy 0, policy_version 45720 (0.0009) +[2023-10-13 02:08:34,960][46663] Updated weights for policy 1, policy_version 45691 (0.0008) +[2023-10-13 02:08:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93618176. Throughput: 0: 1692.7, 1: 1684.7. Samples: 23420130. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:38,607][45375] Avg episode reward: [(0, '49.890'), (1, '47.530')] +[2023-10-13 02:08:38,721][46662] Updated weights for policy 0, policy_version 45730 (0.0008) +[2023-10-13 02:08:38,833][46663] Updated weights for policy 1, policy_version 45701 (0.0008) +[2023-10-13 02:08:39,104][46662] Updated weights for policy 0, policy_version 45740 (0.0008) +[2023-10-13 02:08:39,204][46663] Updated weights for policy 1, policy_version 45711 (0.0009) +[2023-10-13 02:08:39,468][46662] Updated weights for policy 0, policy_version 45750 (0.0007) +[2023-10-13 02:08:39,571][46663] Updated weights for policy 1, policy_version 45721 (0.0009) +[2023-10-13 02:08:39,833][46662] Updated weights for policy 0, policy_version 45760 (0.0009) +[2023-10-13 02:08:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 93683712. Throughput: 0: 1683.2, 1: 1688.0. Samples: 23429220. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:43,608][45375] Avg episode reward: [(0, '50.510'), (1, '48.940')] +[2023-10-13 02:08:43,892][46663] Updated weights for policy 1, policy_version 45731 (0.0009) +[2023-10-13 02:08:43,952][46662] Updated weights for policy 0, policy_version 45770 (0.0009) +[2023-10-13 02:08:44,290][46663] Updated weights for policy 1, policy_version 45741 (0.0008) +[2023-10-13 02:08:44,314][46662] Updated weights for policy 0, policy_version 45780 (0.0007) +[2023-10-13 02:08:44,659][46663] Updated weights for policy 1, policy_version 45751 (0.0007) +[2023-10-13 02:08:44,688][46662] Updated weights for policy 0, policy_version 45790 (0.0009) +[2023-10-13 02:08:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93749248. Throughput: 0: 1689.8, 1: 1677.3. Samples: 23449424. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:08:48,607][45375] Avg episode reward: [(0, '50.940'), (1, '48.280')] +[2023-10-13 02:08:48,812][46663] Updated weights for policy 1, policy_version 45761 (0.0010) +[2023-10-13 02:08:48,845][46662] Updated weights for policy 0, policy_version 45800 (0.0009) +[2023-10-13 02:08:49,168][46663] Updated weights for policy 1, policy_version 45771 (0.0008) +[2023-10-13 02:08:49,216][46662] Updated weights for policy 0, policy_version 45810 (0.0009) +[2023-10-13 02:08:49,533][46663] Updated weights for policy 1, policy_version 45781 (0.0008) +[2023-10-13 02:08:49,587][46662] Updated weights for policy 0, policy_version 45820 (0.0008) +[2023-10-13 02:08:49,912][46663] Updated weights for policy 1, policy_version 45791 (0.0007) +[2023-10-13 02:08:53,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 93814784. Throughput: 0: 1687.2, 1: 1671.6. Samples: 23469880. Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:08:53,607][45375] Avg episode reward: [(0, '49.950'), (1, '48.870')] +[2023-10-13 02:08:53,629][46662] Updated weights for policy 0, policy_version 45830 (0.0007) +[2023-10-13 02:08:53,977][46663] Updated weights for policy 1, policy_version 45801 (0.0008) +[2023-10-13 02:08:53,996][46662] Updated weights for policy 0, policy_version 45840 (0.0008) +[2023-10-13 02:08:54,344][46663] Updated weights for policy 1, policy_version 45811 (0.0008) +[2023-10-13 02:08:54,363][46662] Updated weights for policy 0, policy_version 45850 (0.0007) +[2023-10-13 02:08:54,712][46663] Updated weights for policy 1, policy_version 45821 (0.0008) +[2023-10-13 02:08:58,336][46662] Updated weights for policy 0, policy_version 45860 (0.0009) +[2023-10-13 02:08:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93880320. Throughput: 0: 1685.9, 1: 1669.4. Samples: 23479096. 
Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:08:58,607][45375] Avg episode reward: [(0, '48.830'), (1, '49.420')] +[2023-10-13 02:08:58,729][46662] Updated weights for policy 0, policy_version 45870 (0.0007) +[2023-10-13 02:08:58,903][46663] Updated weights for policy 1, policy_version 45831 (0.0008) +[2023-10-13 02:08:59,097][46662] Updated weights for policy 0, policy_version 45880 (0.0007) +[2023-10-13 02:08:59,263][46663] Updated weights for policy 1, policy_version 45841 (0.0009) +[2023-10-13 02:08:59,629][46663] Updated weights for policy 1, policy_version 45851 (0.0009) +[2023-10-13 02:09:03,143][46662] Updated weights for policy 0, policy_version 45890 (0.0008) +[2023-10-13 02:09:03,507][46662] Updated weights for policy 0, policy_version 45900 (0.0009) +[2023-10-13 02:09:03,582][46663] Updated weights for policy 1, policy_version 45861 (0.0010) +[2023-10-13 02:09:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 93945856. Throughput: 0: 1690.0, 1: 1676.6. Samples: 23499774. Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:09:03,608][45375] Avg episode reward: [(0, '49.600'), (1, '48.390')] +[2023-10-13 02:09:03,883][46662] Updated weights for policy 0, policy_version 45910 (0.0007) +[2023-10-13 02:09:03,939][46663] Updated weights for policy 1, policy_version 45871 (0.0009) +[2023-10-13 02:09:04,248][46662] Updated weights for policy 0, policy_version 45920 (0.0007) +[2023-10-13 02:09:04,303][46663] Updated weights for policy 1, policy_version 45881 (0.0009) +[2023-10-13 02:09:08,346][46663] Updated weights for policy 1, policy_version 45891 (0.0009) +[2023-10-13 02:09:08,463][46662] Updated weights for policy 0, policy_version 45930 (0.0007) +[2023-10-13 02:09:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94011392. Throughput: 0: 1685.8, 1: 1672.7. Samples: 23520212. 
Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:09:08,607][45375] Avg episode reward: [(0, '50.890'), (1, '48.120')] +[2023-10-13 02:09:08,715][46663] Updated weights for policy 1, policy_version 45901 (0.0007) +[2023-10-13 02:09:08,829][46662] Updated weights for policy 0, policy_version 45940 (0.0008) +[2023-10-13 02:09:09,089][46663] Updated weights for policy 1, policy_version 45911 (0.0010) +[2023-10-13 02:09:09,205][46662] Updated weights for policy 0, policy_version 45950 (0.0008) +[2023-10-13 02:09:13,047][46663] Updated weights for policy 1, policy_version 45921 (0.0009) +[2023-10-13 02:09:13,392][46662] Updated weights for policy 0, policy_version 45960 (0.0008) +[2023-10-13 02:09:13,420][46663] Updated weights for policy 1, policy_version 45931 (0.0008) +[2023-10-13 02:09:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94076928. Throughput: 0: 1678.8, 1: 1674.8. Samples: 23529356. Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:09:13,607][45375] Avg episode reward: [(0, '51.690'), (1, '48.780')] +[2023-10-13 02:09:13,757][46662] Updated weights for policy 0, policy_version 45970 (0.0008) +[2023-10-13 02:09:13,795][46663] Updated weights for policy 1, policy_version 45941 (0.0010) +[2023-10-13 02:09:14,133][46662] Updated weights for policy 0, policy_version 45980 (0.0009) +[2023-10-13 02:09:14,162][46663] Updated weights for policy 1, policy_version 45951 (0.0009) +[2023-10-13 02:09:18,290][46662] Updated weights for policy 0, policy_version 45990 (0.0010) +[2023-10-13 02:09:18,313][46663] Updated weights for policy 1, policy_version 45961 (0.0010) +[2023-10-13 02:09:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94142464. Throughput: 0: 1669.5, 1: 1675.2. Samples: 23549988. 
Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:09:18,607][45375] Avg episode reward: [(0, '52.700'), (1, '50.370')] +[2023-10-13 02:09:18,662][46662] Updated weights for policy 0, policy_version 46000 (0.0009) +[2023-10-13 02:09:18,686][46663] Updated weights for policy 1, policy_version 45971 (0.0009) +[2023-10-13 02:09:19,022][46662] Updated weights for policy 0, policy_version 46010 (0.0008) +[2023-10-13 02:09:19,045][46663] Updated weights for policy 1, policy_version 45981 (0.0007) +[2023-10-13 02:09:23,213][46662] Updated weights for policy 0, policy_version 46020 (0.0008) +[2023-10-13 02:09:23,225][46663] Updated weights for policy 1, policy_version 45991 (0.0007) +[2023-10-13 02:09:23,584][46662] Updated weights for policy 0, policy_version 46030 (0.0008) +[2023-10-13 02:09:23,598][46663] Updated weights for policy 1, policy_version 46001 (0.0009) +[2023-10-13 02:09:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94208000. Throughput: 0: 1669.7, 1: 1664.8. Samples: 23570182. Policy #0 lag: (min: 23.0, avg: 26.5, max: 55.0) +[2023-10-13 02:09:23,607][45375] Avg episode reward: [(0, '51.840'), (1, '49.460')] +[2023-10-13 02:09:23,949][46662] Updated weights for policy 0, policy_version 46040 (0.0007) +[2023-10-13 02:09:23,966][46663] Updated weights for policy 1, policy_version 46011 (0.0009) +[2023-10-13 02:09:28,041][46663] Updated weights for policy 1, policy_version 46021 (0.0008) +[2023-10-13 02:09:28,051][46662] Updated weights for policy 0, policy_version 46050 (0.0008) +[2023-10-13 02:09:28,425][46662] Updated weights for policy 0, policy_version 46060 (0.0008) +[2023-10-13 02:09:28,436][46663] Updated weights for policy 1, policy_version 46031 (0.0009) +[2023-10-13 02:09:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94273536. Throughput: 0: 1666.6, 1: 1675.6. Samples: 23579620. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:28,607][45375] Avg episode reward: [(0, '51.290'), (1, '49.320')] +[2023-10-13 02:09:28,792][46663] Updated weights for policy 1, policy_version 46041 (0.0008) +[2023-10-13 02:09:28,794][46662] Updated weights for policy 0, policy_version 46070 (0.0008) +[2023-10-13 02:09:29,157][46662] Updated weights for policy 0, policy_version 46080 (0.0008) +[2023-10-13 02:09:32,863][46663] Updated weights for policy 1, policy_version 46051 (0.0007) +[2023-10-13 02:09:33,236][46663] Updated weights for policy 1, policy_version 46061 (0.0008) +[2023-10-13 02:09:33,381][46662] Updated weights for policy 0, policy_version 46090 (0.0008) +[2023-10-13 02:09:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94339072. Throughput: 0: 1664.1, 1: 1679.5. Samples: 23599888. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:33,607][45375] Avg episode reward: [(0, '52.070'), (1, '49.610')] +[2023-10-13 02:09:33,611][46663] Updated weights for policy 1, policy_version 46071 (0.0007) +[2023-10-13 02:09:33,751][46662] Updated weights for policy 0, policy_version 46100 (0.0007) +[2023-10-13 02:09:34,133][46662] Updated weights for policy 0, policy_version 46110 (0.0008) +[2023-10-13 02:09:37,699][46663] Updated weights for policy 1, policy_version 46081 (0.0007) +[2023-10-13 02:09:38,063][46663] Updated weights for policy 1, policy_version 46091 (0.0008) +[2023-10-13 02:09:38,149][46662] Updated weights for policy 0, policy_version 46120 (0.0008) +[2023-10-13 02:09:38,431][46663] Updated weights for policy 1, policy_version 46101 (0.0009) +[2023-10-13 02:09:38,518][46662] Updated weights for policy 0, policy_version 46130 (0.0007) +[2023-10-13 02:09:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94404608. Throughput: 0: 1665.3, 1: 1664.9. Samples: 23619742. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:38,607][45375] Avg episode reward: [(0, '51.660'), (1, '50.380')] +[2023-10-13 02:09:38,789][46663] Updated weights for policy 1, policy_version 46111 (0.0008) +[2023-10-13 02:09:38,880][46662] Updated weights for policy 0, policy_version 46140 (0.0008) +[2023-10-13 02:09:43,111][46663] Updated weights for policy 1, policy_version 46121 (0.0008) +[2023-10-13 02:09:43,146][46662] Updated weights for policy 0, policy_version 46150 (0.0008) +[2023-10-13 02:09:43,480][46663] Updated weights for policy 1, policy_version 46131 (0.0007) +[2023-10-13 02:09:43,519][46662] Updated weights for policy 0, policy_version 46160 (0.0009) +[2023-10-13 02:09:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 94470144. Throughput: 0: 1662.6, 1: 1677.3. Samples: 23629390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:43,607][45375] Avg episode reward: [(0, '51.920'), (1, '48.270')] +[2023-10-13 02:09:43,855][46663] Updated weights for policy 1, policy_version 46141 (0.0007) +[2023-10-13 02:09:43,883][46662] Updated weights for policy 0, policy_version 46170 (0.0007) +[2023-10-13 02:09:47,989][46662] Updated weights for policy 0, policy_version 46180 (0.0009) +[2023-10-13 02:09:48,047][46663] Updated weights for policy 1, policy_version 46151 (0.0007) +[2023-10-13 02:09:48,354][46662] Updated weights for policy 0, policy_version 46190 (0.0009) +[2023-10-13 02:09:48,407][46663] Updated weights for policy 1, policy_version 46161 (0.0011) +[2023-10-13 02:09:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94535680. Throughput: 0: 1662.8, 1: 1670.9. Samples: 23649792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:48,607][45375] Avg episode reward: [(0, '52.260'), (1, '47.320')] +[2023-10-13 02:09:48,733][46662] Updated weights for policy 0, policy_version 46200 (0.0009) +[2023-10-13 02:09:48,776][46663] Updated weights for policy 1, policy_version 46171 (0.0007) +[2023-10-13 02:09:52,592][46663] Updated weights for policy 1, policy_version 46181 (0.0009) +[2023-10-13 02:09:52,596][46662] Updated weights for policy 0, policy_version 46210 (0.0008) +[2023-10-13 02:09:52,955][46663] Updated weights for policy 1, policy_version 46191 (0.0008) +[2023-10-13 02:09:52,968][46662] Updated weights for policy 0, policy_version 46220 (0.0008) +[2023-10-13 02:09:53,325][46663] Updated weights for policy 1, policy_version 46201 (0.0008) +[2023-10-13 02:09:53,330][46662] Updated weights for policy 0, policy_version 46230 (0.0007) +[2023-10-13 02:09:53,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94633984. Throughput: 0: 1659.2, 1: 1657.6. Samples: 23669464. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:53,607][45375] Avg episode reward: [(0, '52.030'), (1, '48.090')] +[2023-10-13 02:09:53,613][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000046208_47316992.pth... +[2023-10-13 02:09:53,644][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000044640_45711360.pth +[2023-10-13 02:09:53,701][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000046240_47349760.pth... 
+[2023-10-13 02:09:53,705][46662] Updated weights for policy 0, policy_version 46240 (0.0009) +[2023-10-13 02:09:53,737][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000044672_45744128.pth +[2023-10-13 02:09:57,346][46663] Updated weights for policy 1, policy_version 46211 (0.0009) +[2023-10-13 02:09:57,703][46663] Updated weights for policy 1, policy_version 46221 (0.0009) +[2023-10-13 02:09:57,919][46662] Updated weights for policy 0, policy_version 46250 (0.0009) +[2023-10-13 02:09:58,074][46663] Updated weights for policy 1, policy_version 46231 (0.0008) +[2023-10-13 02:09:58,294][46662] Updated weights for policy 0, policy_version 46260 (0.0009) +[2023-10-13 02:09:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94699520. Throughput: 0: 1666.8, 1: 1675.2. Samples: 23679744. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:09:58,607][45375] Avg episode reward: [(0, '53.010'), (1, '49.130')] +[2023-10-13 02:09:58,669][46662] Updated weights for policy 0, policy_version 46270 (0.0009) +[2023-10-13 02:10:02,296][46663] Updated weights for policy 1, policy_version 46241 (0.0008) +[2023-10-13 02:10:02,662][46663] Updated weights for policy 1, policy_version 46251 (0.0007) +[2023-10-13 02:10:02,671][46662] Updated weights for policy 0, policy_version 46280 (0.0008) +[2023-10-13 02:10:03,029][46663] Updated weights for policy 1, policy_version 46261 (0.0009) +[2023-10-13 02:10:03,044][46662] Updated weights for policy 0, policy_version 46290 (0.0007) +[2023-10-13 02:10:03,396][46663] Updated weights for policy 1, policy_version 46271 (0.0010) +[2023-10-13 02:10:03,416][46662] Updated weights for policy 0, policy_version 46300 (0.0007) +[2023-10-13 02:10:03,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 94797824. Throughput: 0: 1668.4, 1: 1673.0. Samples: 23700350. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 02:10:03,608][45375] Avg episode reward: [(0, '53.400'), (1, '50.300')] +[2023-10-13 02:10:07,266][46663] Updated weights for policy 1, policy_version 46281 (0.0007) +[2023-10-13 02:10:07,396][46662] Updated weights for policy 0, policy_version 46310 (0.0007) +[2023-10-13 02:10:07,625][46663] Updated weights for policy 1, policy_version 46291 (0.0007) +[2023-10-13 02:10:07,764][46662] Updated weights for policy 0, policy_version 46320 (0.0007) +[2023-10-13 02:10:07,989][46663] Updated weights for policy 1, policy_version 46301 (0.0010) +[2023-10-13 02:10:08,135][46662] Updated weights for policy 0, policy_version 46330 (0.0009) +[2023-10-13 02:10:08,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 94863360. Throughput: 0: 1659.7, 1: 1660.9. Samples: 23719612. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 02:10:08,608][45375] Avg episode reward: [(0, '53.290'), (1, '51.460')] +[2023-10-13 02:10:12,137][46663] Updated weights for policy 1, policy_version 46311 (0.0007) +[2023-10-13 02:10:12,210][46662] Updated weights for policy 0, policy_version 46340 (0.0010) +[2023-10-13 02:10:12,499][46663] Updated weights for policy 1, policy_version 46321 (0.0007) +[2023-10-13 02:10:12,572][46662] Updated weights for policy 0, policy_version 46350 (0.0009) +[2023-10-13 02:10:12,866][46663] Updated weights for policy 1, policy_version 46331 (0.0010) +[2023-10-13 02:10:12,940][46662] Updated weights for policy 0, policy_version 46360 (0.0011) +[2023-10-13 02:10:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 94928896. Throughput: 0: 1678.3, 1: 1677.2. Samples: 23730618. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 02:10:13,607][45375] Avg episode reward: [(0, '51.290'), (1, '51.970')] +[2023-10-13 02:10:16,918][46662] Updated weights for policy 0, policy_version 46370 (0.0009) +[2023-10-13 02:10:17,040][46663] Updated weights for policy 1, policy_version 46341 (0.0007) +[2023-10-13 02:10:17,297][46662] Updated weights for policy 0, policy_version 46380 (0.0009) +[2023-10-13 02:10:17,443][46663] Updated weights for policy 1, policy_version 46351 (0.0008) +[2023-10-13 02:10:17,655][46662] Updated weights for policy 0, policy_version 46390 (0.0008) +[2023-10-13 02:10:17,799][46663] Updated weights for policy 1, policy_version 46361 (0.0007) +[2023-10-13 02:10:18,036][46662] Updated weights for policy 0, policy_version 46400 (0.0008) +[2023-10-13 02:10:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 94994432. Throughput: 0: 1683.7, 1: 1669.5. Samples: 23750780. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 02:10:18,607][45375] Avg episode reward: [(0, '50.610'), (1, '52.690')] +[2023-10-13 02:10:21,937][46663] Updated weights for policy 1, policy_version 46371 (0.0008) +[2023-10-13 02:10:22,053][46662] Updated weights for policy 0, policy_version 46410 (0.0008) +[2023-10-13 02:10:22,304][46663] Updated weights for policy 1, policy_version 46381 (0.0007) +[2023-10-13 02:10:22,422][46662] Updated weights for policy 0, policy_version 46420 (0.0008) +[2023-10-13 02:10:22,669][46663] Updated weights for policy 1, policy_version 46391 (0.0008) +[2023-10-13 02:10:22,790][46662] Updated weights for policy 0, policy_version 46430 (0.0007) +[2023-10-13 02:10:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 95059968. Throughput: 0: 1658.6, 1: 1668.7. Samples: 23769468. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 02:10:23,608][45375] Avg episode reward: [(0, '50.100'), (1, '52.400')] +[2023-10-13 02:10:26,708][46663] Updated weights for policy 1, policy_version 46401 (0.0009) +[2023-10-13 02:10:27,064][46663] Updated weights for policy 1, policy_version 46411 (0.0009) +[2023-10-13 02:10:27,102][46662] Updated weights for policy 0, policy_version 46440 (0.0008) +[2023-10-13 02:10:27,435][46663] Updated weights for policy 1, policy_version 46421 (0.0009) +[2023-10-13 02:10:27,471][46662] Updated weights for policy 0, policy_version 46450 (0.0009) +[2023-10-13 02:10:27,795][46663] Updated weights for policy 1, policy_version 46431 (0.0007) +[2023-10-13 02:10:27,835][46662] Updated weights for policy 0, policy_version 46460 (0.0008) +[2023-10-13 02:10:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 95125504. Throughput: 0: 1682.3, 1: 1684.6. Samples: 23780902. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-13 02:10:28,607][45375] Avg episode reward: [(0, '51.100'), (1, '51.420')] +[2023-10-13 02:10:31,768][46662] Updated weights for policy 0, policy_version 46470 (0.0010) +[2023-10-13 02:10:32,044][46663] Updated weights for policy 1, policy_version 46441 (0.0008) +[2023-10-13 02:10:32,143][46662] Updated weights for policy 0, policy_version 46480 (0.0007) +[2023-10-13 02:10:32,413][46663] Updated weights for policy 1, policy_version 46451 (0.0009) +[2023-10-13 02:10:32,507][46662] Updated weights for policy 0, policy_version 46490 (0.0008) +[2023-10-13 02:10:32,780][46663] Updated weights for policy 1, policy_version 46461 (0.0008) +[2023-10-13 02:10:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 95191040. Throughput: 0: 1680.3, 1: 1670.1. Samples: 23800562. 
Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:10:33,608][45375] Avg episode reward: [(0, '51.550'), (1, '51.330')] +[2023-10-13 02:10:36,497][46662] Updated weights for policy 0, policy_version 46500 (0.0008) +[2023-10-13 02:10:36,864][46662] Updated weights for policy 0, policy_version 46510 (0.0008) +[2023-10-13 02:10:36,877][46663] Updated weights for policy 1, policy_version 46471 (0.0007) +[2023-10-13 02:10:37,231][46662] Updated weights for policy 0, policy_version 46520 (0.0007) +[2023-10-13 02:10:37,241][46663] Updated weights for policy 1, policy_version 46481 (0.0007) +[2023-10-13 02:10:37,618][46663] Updated weights for policy 1, policy_version 46491 (0.0009) +[2023-10-13 02:10:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 95256576. Throughput: 0: 1663.6, 1: 1673.2. Samples: 23819624. Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:10:38,607][45375] Avg episode reward: [(0, '51.820'), (1, '51.740')] +[2023-10-13 02:10:41,254][46662] Updated weights for policy 0, policy_version 46530 (0.0008) +[2023-10-13 02:10:41,481][46663] Updated weights for policy 1, policy_version 46501 (0.0009) +[2023-10-13 02:10:41,619][46662] Updated weights for policy 0, policy_version 46540 (0.0009) +[2023-10-13 02:10:41,855][46663] Updated weights for policy 1, policy_version 46511 (0.0008) +[2023-10-13 02:10:41,995][46662] Updated weights for policy 0, policy_version 46550 (0.0009) +[2023-10-13 02:10:42,219][46663] Updated weights for policy 1, policy_version 46521 (0.0009) +[2023-10-13 02:10:42,353][46662] Updated weights for policy 0, policy_version 46560 (0.0008) +[2023-10-13 02:10:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 95322112. Throughput: 0: 1687.9, 1: 1680.6. Samples: 23831324. 
Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:10:43,607][45375] Avg episode reward: [(0, '51.620'), (1, '51.320')] +[2023-10-13 02:10:46,367][46663] Updated weights for policy 1, policy_version 46531 (0.0008) +[2023-10-13 02:10:46,522][46662] Updated weights for policy 0, policy_version 46570 (0.0009) +[2023-10-13 02:10:46,730][46663] Updated weights for policy 1, policy_version 46541 (0.0008) +[2023-10-13 02:10:46,894][46662] Updated weights for policy 0, policy_version 46580 (0.0009) +[2023-10-13 02:10:47,093][46663] Updated weights for policy 1, policy_version 46551 (0.0008) +[2023-10-13 02:10:47,255][46662] Updated weights for policy 0, policy_version 46590 (0.0009) +[2023-10-13 02:10:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 95387648. Throughput: 0: 1673.5, 1: 1657.4. Samples: 23850240. Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:10:48,607][45375] Avg episode reward: [(0, '51.300'), (1, '50.790')] +[2023-10-13 02:10:51,418][46663] Updated weights for policy 1, policy_version 46561 (0.0010) +[2023-10-13 02:10:51,496][46662] Updated weights for policy 0, policy_version 46600 (0.0008) +[2023-10-13 02:10:51,787][46663] Updated weights for policy 1, policy_version 46571 (0.0008) +[2023-10-13 02:10:51,874][46662] Updated weights for policy 0, policy_version 46610 (0.0008) +[2023-10-13 02:10:52,149][46663] Updated weights for policy 1, policy_version 46581 (0.0008) +[2023-10-13 02:10:52,240][46662] Updated weights for policy 0, policy_version 46620 (0.0008) +[2023-10-13 02:10:52,521][46663] Updated weights for policy 1, policy_version 46591 (0.0010) +[2023-10-13 02:10:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 95453184. Throughput: 0: 1666.7, 1: 1671.9. Samples: 23869850. 
Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:10:53,607][45375] Avg episode reward: [(0, '51.570'), (1, '52.820')] +[2023-10-13 02:10:56,307][46662] Updated weights for policy 0, policy_version 46630 (0.0008) +[2023-10-13 02:10:56,510][46663] Updated weights for policy 1, policy_version 46601 (0.0009) +[2023-10-13 02:10:56,676][46662] Updated weights for policy 0, policy_version 46640 (0.0007) +[2023-10-13 02:10:56,879][46663] Updated weights for policy 1, policy_version 46611 (0.0008) +[2023-10-13 02:10:57,048][46662] Updated weights for policy 0, policy_version 46650 (0.0007) +[2023-10-13 02:10:57,244][46663] Updated weights for policy 1, policy_version 46621 (0.0007) +[2023-10-13 02:10:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 95518720. Throughput: 0: 1680.0, 1: 1669.7. Samples: 23881358. Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:10:58,607][45375] Avg episode reward: [(0, '49.840'), (1, '52.960')] +[2023-10-13 02:11:01,121][46662] Updated weights for policy 0, policy_version 46660 (0.0007) +[2023-10-13 02:11:01,399][46663] Updated weights for policy 1, policy_version 46631 (0.0008) +[2023-10-13 02:11:01,485][46662] Updated weights for policy 0, policy_version 46670 (0.0008) +[2023-10-13 02:11:01,764][46663] Updated weights for policy 1, policy_version 46641 (0.0008) +[2023-10-13 02:11:01,854][46662] Updated weights for policy 0, policy_version 46680 (0.0010) +[2023-10-13 02:11:02,122][46663] Updated weights for policy 1, policy_version 46651 (0.0010) +[2023-10-13 02:11:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 95584256. Throughput: 0: 1663.3, 1: 1653.7. Samples: 23900046. 
Policy #0 lag: (min: 21.0, avg: 27.2, max: 53.0) +[2023-10-13 02:11:03,607][45375] Avg episode reward: [(0, '49.570'), (1, '53.660')] +[2023-10-13 02:11:05,776][46662] Updated weights for policy 0, policy_version 46690 (0.0009) +[2023-10-13 02:11:06,151][46662] Updated weights for policy 0, policy_version 46700 (0.0009) +[2023-10-13 02:11:06,341][46663] Updated weights for policy 1, policy_version 46661 (0.0010) +[2023-10-13 02:11:06,518][46662] Updated weights for policy 0, policy_version 46710 (0.0008) +[2023-10-13 02:11:06,712][46663] Updated weights for policy 1, policy_version 46671 (0.0007) +[2023-10-13 02:11:06,890][46662] Updated weights for policy 0, policy_version 46720 (0.0007) +[2023-10-13 02:11:07,075][46663] Updated weights for policy 1, policy_version 46681 (0.0009) +[2023-10-13 02:11:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 95649792. Throughput: 0: 1686.4, 1: 1669.5. Samples: 23920484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:08,607][45375] Avg episode reward: [(0, '50.090'), (1, '53.840')] +[2023-10-13 02:11:11,004][46662] Updated weights for policy 0, policy_version 46730 (0.0010) +[2023-10-13 02:11:11,178][46663] Updated weights for policy 1, policy_version 46691 (0.0010) +[2023-10-13 02:11:11,365][46662] Updated weights for policy 0, policy_version 46740 (0.0008) +[2023-10-13 02:11:11,552][46663] Updated weights for policy 1, policy_version 46701 (0.0009) +[2023-10-13 02:11:11,745][46662] Updated weights for policy 0, policy_version 46750 (0.0007) +[2023-10-13 02:11:11,919][46663] Updated weights for policy 1, policy_version 46711 (0.0009) +[2023-10-13 02:11:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 95715328. Throughput: 0: 1686.2, 1: 1662.5. Samples: 23931594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:13,607][45375] Avg episode reward: [(0, '49.280'), (1, '52.190')] +[2023-10-13 02:11:15,772][46663] Updated weights for policy 1, policy_version 46721 (0.0008) +[2023-10-13 02:11:15,798][46662] Updated weights for policy 0, policy_version 46760 (0.0007) +[2023-10-13 02:11:16,143][46663] Updated weights for policy 1, policy_version 46731 (0.0009) +[2023-10-13 02:11:16,168][46662] Updated weights for policy 0, policy_version 46770 (0.0007) +[2023-10-13 02:11:16,503][46663] Updated weights for policy 1, policy_version 46741 (0.0008) +[2023-10-13 02:11:16,544][46662] Updated weights for policy 0, policy_version 46780 (0.0009) +[2023-10-13 02:11:16,872][46663] Updated weights for policy 1, policy_version 46751 (0.0011) +[2023-10-13 02:11:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 95780864. Throughput: 0: 1669.7, 1: 1663.2. Samples: 23950542. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:18,607][45375] Avg episode reward: [(0, '49.160'), (1, '52.990')] +[2023-10-13 02:11:20,763][46662] Updated weights for policy 0, policy_version 46790 (0.0009) +[2023-10-13 02:11:20,959][46663] Updated weights for policy 1, policy_version 46761 (0.0008) +[2023-10-13 02:11:21,151][46662] Updated weights for policy 0, policy_version 46800 (0.0007) +[2023-10-13 02:11:21,328][46663] Updated weights for policy 1, policy_version 46771 (0.0008) +[2023-10-13 02:11:21,513][46662] Updated weights for policy 0, policy_version 46810 (0.0008) +[2023-10-13 02:11:21,694][46663] Updated weights for policy 1, policy_version 46781 (0.0008) +[2023-10-13 02:11:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 95846400. Throughput: 0: 1690.5, 1: 1682.4. Samples: 23971408. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:23,607][45375] Avg episode reward: [(0, '49.390'), (1, '51.340')] +[2023-10-13 02:11:25,559][46662] Updated weights for policy 0, policy_version 46820 (0.0008) +[2023-10-13 02:11:25,775][46663] Updated weights for policy 1, policy_version 46791 (0.0008) +[2023-10-13 02:11:25,927][46662] Updated weights for policy 0, policy_version 46830 (0.0009) +[2023-10-13 02:11:26,148][46663] Updated weights for policy 1, policy_version 46801 (0.0009) +[2023-10-13 02:11:26,300][46662] Updated weights for policy 0, policy_version 46840 (0.0008) +[2023-10-13 02:11:26,514][46663] Updated weights for policy 1, policy_version 46811 (0.0008) +[2023-10-13 02:11:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 95911936. Throughput: 0: 1681.4, 1: 1661.5. Samples: 23981752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:28,608][45375] Avg episode reward: [(0, '48.660'), (1, '50.120')] +[2023-10-13 02:11:30,486][46662] Updated weights for policy 0, policy_version 46850 (0.0007) +[2023-10-13 02:11:30,655][46663] Updated weights for policy 1, policy_version 46821 (0.0008) +[2023-10-13 02:11:30,856][46662] Updated weights for policy 0, policy_version 46860 (0.0009) +[2023-10-13 02:11:31,031][46663] Updated weights for policy 1, policy_version 46831 (0.0010) +[2023-10-13 02:11:31,232][46662] Updated weights for policy 0, policy_version 46870 (0.0009) +[2023-10-13 02:11:31,401][46663] Updated weights for policy 1, policy_version 46841 (0.0009) +[2023-10-13 02:11:31,595][46662] Updated weights for policy 0, policy_version 46880 (0.0009) +[2023-10-13 02:11:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 95977472. Throughput: 0: 1671.7, 1: 1677.0. Samples: 24000930. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:33,607][45375] Avg episode reward: [(0, '47.920'), (1, '49.780')] +[2023-10-13 02:11:35,555][46662] Updated weights for policy 0, policy_version 46890 (0.0008) +[2023-10-13 02:11:35,617][46663] Updated weights for policy 1, policy_version 46851 (0.0007) +[2023-10-13 02:11:35,925][46662] Updated weights for policy 0, policy_version 46900 (0.0008) +[2023-10-13 02:11:35,992][46663] Updated weights for policy 1, policy_version 46861 (0.0007) +[2023-10-13 02:11:36,303][46662] Updated weights for policy 0, policy_version 46910 (0.0007) +[2023-10-13 02:11:36,367][46663] Updated weights for policy 1, policy_version 46871 (0.0008) +[2023-10-13 02:11:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96043008. Throughput: 0: 1692.7, 1: 1677.0. Samples: 24021486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:11:38,607][45375] Avg episode reward: [(0, '47.910'), (1, '48.430')] +[2023-10-13 02:11:40,354][46663] Updated weights for policy 1, policy_version 46881 (0.0009) +[2023-10-13 02:11:40,401][46662] Updated weights for policy 0, policy_version 46920 (0.0009) +[2023-10-13 02:11:40,725][46663] Updated weights for policy 1, policy_version 46891 (0.0007) +[2023-10-13 02:11:40,775][46662] Updated weights for policy 0, policy_version 46930 (0.0008) +[2023-10-13 02:11:41,093][46663] Updated weights for policy 1, policy_version 46901 (0.0008) +[2023-10-13 02:11:41,148][46662] Updated weights for policy 0, policy_version 46940 (0.0007) +[2023-10-13 02:11:41,459][46663] Updated weights for policy 1, policy_version 46911 (0.0009) +[2023-10-13 02:11:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96108544. Throughput: 0: 1676.2, 1: 1654.9. Samples: 24031256. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:11:43,608][45375] Avg episode reward: [(0, '48.010'), (1, '48.010')] +[2023-10-13 02:11:45,212][46662] Updated weights for policy 0, policy_version 46950 (0.0008) +[2023-10-13 02:11:45,400][46663] Updated weights for policy 1, policy_version 46921 (0.0009) +[2023-10-13 02:11:45,568][46662] Updated weights for policy 0, policy_version 46960 (0.0007) +[2023-10-13 02:11:45,765][46663] Updated weights for policy 1, policy_version 46931 (0.0007) +[2023-10-13 02:11:45,939][46662] Updated weights for policy 0, policy_version 46970 (0.0007) +[2023-10-13 02:11:46,126][46663] Updated weights for policy 1, policy_version 46941 (0.0008) +[2023-10-13 02:11:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96174080. Throughput: 0: 1679.1, 1: 1685.5. Samples: 24051450. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:11:48,607][45375] Avg episode reward: [(0, '49.480'), (1, '49.310')] +[2023-10-13 02:11:50,141][46662] Updated weights for policy 0, policy_version 46980 (0.0009) +[2023-10-13 02:11:50,390][46663] Updated weights for policy 1, policy_version 46951 (0.0008) +[2023-10-13 02:11:50,510][46662] Updated weights for policy 0, policy_version 46990 (0.0007) +[2023-10-13 02:11:50,779][46663] Updated weights for policy 1, policy_version 46961 (0.0008) +[2023-10-13 02:11:50,879][46662] Updated weights for policy 0, policy_version 47000 (0.0008) +[2023-10-13 02:11:51,151][46663] Updated weights for policy 1, policy_version 46971 (0.0008) +[2023-10-13 02:11:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96239616. Throughput: 0: 1680.8, 1: 1686.5. Samples: 24072010. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:11:53,607][45375] Avg episode reward: [(0, '50.820'), (1, '47.110')] +[2023-10-13 02:11:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000046976_48103424.pth... +[2023-10-13 02:11:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000047008_48136192.pth... +[2023-10-13 02:11:53,648][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000045408_46497792.pth +[2023-10-13 02:11:53,652][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000046976_48103424.pth +[2023-10-13 02:11:53,659][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000045440_46530560.pth +[2023-10-13 02:11:53,664][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000047008_48136192.pth +[2023-10-13 02:11:54,872][46662] Updated weights for policy 0, policy_version 47010 (0.0009) +[2023-10-13 02:11:55,116][46663] Updated weights for policy 1, policy_version 46981 (0.0008) +[2023-10-13 02:11:55,241][46662] Updated weights for policy 0, policy_version 47020 (0.0008) +[2023-10-13 02:11:55,487][46663] Updated weights for policy 1, policy_version 46991 (0.0008) +[2023-10-13 02:11:55,615][46662] Updated weights for policy 0, policy_version 47030 (0.0009) +[2023-10-13 02:11:55,846][46663] Updated weights for policy 1, policy_version 47001 (0.0008) +[2023-10-13 02:11:55,977][46662] Updated weights for policy 0, policy_version 47040 (0.0010) +[2023-10-13 02:11:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96305152. Throughput: 0: 1660.2, 1: 1663.4. Samples: 24081156. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:11:58,607][45375] Avg episode reward: [(0, '50.470'), (1, '47.680')] +[2023-10-13 02:11:59,976][46663] Updated weights for policy 1, policy_version 47011 (0.0010) +[2023-10-13 02:11:59,991][46662] Updated weights for policy 0, policy_version 47050 (0.0009) +[2023-10-13 02:12:00,330][46663] Updated weights for policy 1, policy_version 47021 (0.0009) +[2023-10-13 02:12:00,366][46662] Updated weights for policy 0, policy_version 47060 (0.0008) +[2023-10-13 02:12:00,696][46663] Updated weights for policy 1, policy_version 47031 (0.0009) +[2023-10-13 02:12:00,738][46662] Updated weights for policy 0, policy_version 47070 (0.0010) +[2023-10-13 02:12:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96370688. Throughput: 0: 1674.7, 1: 1675.7. Samples: 24101312. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:12:03,607][45375] Avg episode reward: [(0, '50.420'), (1, '45.270')] +[2023-10-13 02:12:04,614][46662] Updated weights for policy 0, policy_version 47080 (0.0010) +[2023-10-13 02:12:04,745][46663] Updated weights for policy 1, policy_version 47041 (0.0009) +[2023-10-13 02:12:04,986][46662] Updated weights for policy 0, policy_version 47090 (0.0007) +[2023-10-13 02:12:05,112][46663] Updated weights for policy 1, policy_version 47051 (0.0008) +[2023-10-13 02:12:05,356][46662] Updated weights for policy 0, policy_version 47100 (0.0007) +[2023-10-13 02:12:05,479][46663] Updated weights for policy 1, policy_version 47061 (0.0008) +[2023-10-13 02:12:05,856][46663] Updated weights for policy 1, policy_version 47071 (0.0008) +[2023-10-13 02:12:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96436224. Throughput: 0: 1680.5, 1: 1674.5. Samples: 24122384. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:12:08,607][45375] Avg episode reward: [(0, '50.380'), (1, '44.430')] +[2023-10-13 02:12:09,652][46662] Updated weights for policy 0, policy_version 47110 (0.0008) +[2023-10-13 02:12:09,893][46663] Updated weights for policy 1, policy_version 47081 (0.0007) +[2023-10-13 02:12:10,037][46662] Updated weights for policy 0, policy_version 47120 (0.0009) +[2023-10-13 02:12:10,256][46663] Updated weights for policy 1, policy_version 47091 (0.0009) +[2023-10-13 02:12:10,416][46662] Updated weights for policy 0, policy_version 47130 (0.0007) +[2023-10-13 02:12:10,615][46663] Updated weights for policy 1, policy_version 47101 (0.0009) +[2023-10-13 02:12:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96501760. Throughput: 0: 1655.0, 1: 1664.8. Samples: 24131146. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-13 02:12:13,608][45375] Avg episode reward: [(0, '50.130'), (1, '45.750')] +[2023-10-13 02:12:14,552][46662] Updated weights for policy 0, policy_version 47140 (0.0008) +[2023-10-13 02:12:14,876][46663] Updated weights for policy 1, policy_version 47111 (0.0008) +[2023-10-13 02:12:14,923][46662] Updated weights for policy 0, policy_version 47150 (0.0010) +[2023-10-13 02:12:15,235][46663] Updated weights for policy 1, policy_version 47121 (0.0008) +[2023-10-13 02:12:15,297][46662] Updated weights for policy 0, policy_version 47160 (0.0008) +[2023-10-13 02:12:15,596][46663] Updated weights for policy 1, policy_version 47131 (0.0008) +[2023-10-13 02:12:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96567296. Throughput: 0: 1676.6, 1: 1676.6. Samples: 24151824. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:18,607][45375] Avg episode reward: [(0, '51.170'), (1, '45.390')] +[2023-10-13 02:12:19,515][46662] Updated weights for policy 0, policy_version 47170 (0.0008) +[2023-10-13 02:12:19,610][46663] Updated weights for policy 1, policy_version 47141 (0.0008) +[2023-10-13 02:12:19,887][46662] Updated weights for policy 0, policy_version 47180 (0.0008) +[2023-10-13 02:12:19,983][46663] Updated weights for policy 1, policy_version 47151 (0.0008) +[2023-10-13 02:12:20,243][46662] Updated weights for policy 0, policy_version 47190 (0.0008) +[2023-10-13 02:12:20,349][46663] Updated weights for policy 1, policy_version 47161 (0.0007) +[2023-10-13 02:12:20,608][46662] Updated weights for policy 0, policy_version 47200 (0.0007) +[2023-10-13 02:12:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96632832. Throughput: 0: 1672.4, 1: 1683.9. Samples: 24172518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:23,607][45375] Avg episode reward: [(0, '50.310'), (1, '45.660')] +[2023-10-13 02:12:24,414][46663] Updated weights for policy 1, policy_version 47171 (0.0008) +[2023-10-13 02:12:24,701][46662] Updated weights for policy 0, policy_version 47210 (0.0007) +[2023-10-13 02:12:24,782][46663] Updated weights for policy 1, policy_version 47181 (0.0008) +[2023-10-13 02:12:25,069][46662] Updated weights for policy 0, policy_version 47220 (0.0007) +[2023-10-13 02:12:25,140][46663] Updated weights for policy 1, policy_version 47191 (0.0009) +[2023-10-13 02:12:25,435][46662] Updated weights for policy 0, policy_version 47230 (0.0009) +[2023-10-13 02:12:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96698368. Throughput: 0: 1660.1, 1: 1675.2. Samples: 24181342. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:28,607][45375] Avg episode reward: [(0, '52.230'), (1, '45.350')] +[2023-10-13 02:12:29,447][46663] Updated weights for policy 1, policy_version 47201 (0.0010) +[2023-10-13 02:12:29,511][46662] Updated weights for policy 0, policy_version 47240 (0.0008) +[2023-10-13 02:12:29,818][46663] Updated weights for policy 1, policy_version 47211 (0.0009) +[2023-10-13 02:12:29,879][46662] Updated weights for policy 0, policy_version 47250 (0.0007) +[2023-10-13 02:12:30,192][46663] Updated weights for policy 1, policy_version 47221 (0.0008) +[2023-10-13 02:12:30,252][46662] Updated weights for policy 0, policy_version 47260 (0.0008) +[2023-10-13 02:12:30,555][46663] Updated weights for policy 1, policy_version 47231 (0.0007) +[2023-10-13 02:12:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96763904. Throughput: 0: 1670.4, 1: 1667.6. Samples: 24201662. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:33,608][45375] Avg episode reward: [(0, '51.520'), (1, '46.010')] +[2023-10-13 02:12:34,119][46662] Updated weights for policy 0, policy_version 47270 (0.0011) +[2023-10-13 02:12:34,489][46662] Updated weights for policy 0, policy_version 47280 (0.0009) +[2023-10-13 02:12:34,659][46663] Updated weights for policy 1, policy_version 47241 (0.0008) +[2023-10-13 02:12:34,861][46662] Updated weights for policy 0, policy_version 47290 (0.0009) +[2023-10-13 02:12:35,024][46663] Updated weights for policy 1, policy_version 47251 (0.0008) +[2023-10-13 02:12:35,397][46663] Updated weights for policy 1, policy_version 47261 (0.0009) +[2023-10-13 02:12:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96829440. Throughput: 0: 1673.9, 1: 1667.8. Samples: 24222388. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:38,607][45375] Avg episode reward: [(0, '50.550'), (1, '46.880')] +[2023-10-13 02:12:38,905][46662] Updated weights for policy 0, policy_version 47300 (0.0010) +[2023-10-13 02:12:39,283][46662] Updated weights for policy 0, policy_version 47310 (0.0009) +[2023-10-13 02:12:39,629][46663] Updated weights for policy 1, policy_version 47271 (0.0007) +[2023-10-13 02:12:39,648][46662] Updated weights for policy 0, policy_version 47320 (0.0008) +[2023-10-13 02:12:40,008][46663] Updated weights for policy 1, policy_version 47281 (0.0008) +[2023-10-13 02:12:40,375][46663] Updated weights for policy 1, policy_version 47291 (0.0007) +[2023-10-13 02:12:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96894976. Throughput: 0: 1674.6, 1: 1667.2. Samples: 24231538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:43,608][45375] Avg episode reward: [(0, '49.490'), (1, '47.610')] +[2023-10-13 02:12:43,696][46662] Updated weights for policy 0, policy_version 47330 (0.0008) +[2023-10-13 02:12:44,072][46662] Updated weights for policy 0, policy_version 47340 (0.0008) +[2023-10-13 02:12:44,432][46662] Updated weights for policy 0, policy_version 47350 (0.0009) +[2023-10-13 02:12:44,490][46663] Updated weights for policy 1, policy_version 47301 (0.0009) +[2023-10-13 02:12:44,795][46662] Updated weights for policy 0, policy_version 47360 (0.0009) +[2023-10-13 02:12:44,847][46663] Updated weights for policy 1, policy_version 47311 (0.0008) +[2023-10-13 02:12:45,207][46663] Updated weights for policy 1, policy_version 47321 (0.0010) +[2023-10-13 02:12:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96960512. Throughput: 0: 1686.8, 1: 1668.5. Samples: 24252300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:12:48,607][45375] Avg episode reward: [(0, '48.470'), (1, '47.830')] +[2023-10-13 02:12:48,729][46662] Updated weights for policy 0, policy_version 47370 (0.0009) +[2023-10-13 02:12:49,100][46662] Updated weights for policy 0, policy_version 47380 (0.0008) +[2023-10-13 02:12:49,297][46663] Updated weights for policy 1, policy_version 47331 (0.0009) +[2023-10-13 02:12:49,471][46662] Updated weights for policy 0, policy_version 47390 (0.0007) +[2023-10-13 02:12:49,665][46663] Updated weights for policy 1, policy_version 47341 (0.0009) +[2023-10-13 02:12:50,029][46663] Updated weights for policy 1, policy_version 47351 (0.0010) +[2023-10-13 02:12:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 97026048. Throughput: 0: 1679.1, 1: 1662.3. Samples: 24272746. Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:12:53,608][45375] Avg episode reward: [(0, '49.000'), (1, '48.260')] +[2023-10-13 02:12:53,614][46662] Updated weights for policy 0, policy_version 47400 (0.0008) +[2023-10-13 02:12:53,986][46663] Updated weights for policy 1, policy_version 47361 (0.0010) +[2023-10-13 02:12:53,995][46662] Updated weights for policy 0, policy_version 47410 (0.0008) +[2023-10-13 02:12:54,354][46662] Updated weights for policy 0, policy_version 47420 (0.0008) +[2023-10-13 02:12:54,357][46663] Updated weights for policy 1, policy_version 47371 (0.0009) +[2023-10-13 02:12:54,717][46663] Updated weights for policy 1, policy_version 47381 (0.0010) +[2023-10-13 02:12:55,077][46663] Updated weights for policy 1, policy_version 47391 (0.0011) +[2023-10-13 02:12:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97091584. Throughput: 0: 1684.1, 1: 1664.3. Samples: 24281820. 
Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:12:58,607][45375] Avg episode reward: [(0, '49.500'), (1, '48.050')] +[2023-10-13 02:12:58,649][46662] Updated weights for policy 0, policy_version 47430 (0.0008) +[2023-10-13 02:12:59,025][46662] Updated weights for policy 0, policy_version 47440 (0.0008) +[2023-10-13 02:12:59,044][46663] Updated weights for policy 1, policy_version 47401 (0.0008) +[2023-10-13 02:12:59,388][46662] Updated weights for policy 0, policy_version 47450 (0.0010) +[2023-10-13 02:12:59,420][46663] Updated weights for policy 1, policy_version 47411 (0.0008) +[2023-10-13 02:12:59,785][46663] Updated weights for policy 1, policy_version 47421 (0.0009) +[2023-10-13 02:13:03,410][46662] Updated weights for policy 0, policy_version 47460 (0.0009) +[2023-10-13 02:13:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97157120. Throughput: 0: 1683.4, 1: 1663.0. Samples: 24302410. Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:13:03,607][45375] Avg episode reward: [(0, '50.810'), (1, '48.150')] +[2023-10-13 02:13:03,775][46662] Updated weights for policy 0, policy_version 47470 (0.0007) +[2023-10-13 02:13:03,809][46663] Updated weights for policy 1, policy_version 47431 (0.0007) +[2023-10-13 02:13:04,149][46662] Updated weights for policy 0, policy_version 47480 (0.0007) +[2023-10-13 02:13:04,168][46663] Updated weights for policy 1, policy_version 47441 (0.0007) +[2023-10-13 02:13:04,528][46663] Updated weights for policy 1, policy_version 47451 (0.0009) +[2023-10-13 02:13:08,263][46662] Updated weights for policy 0, policy_version 47490 (0.0007) +[2023-10-13 02:13:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97222656. Throughput: 0: 1680.6, 1: 1661.6. Samples: 24322916. 
Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:13:08,607][45375] Avg episode reward: [(0, '50.990'), (1, '48.120')] +[2023-10-13 02:13:08,638][46662] Updated weights for policy 0, policy_version 47500 (0.0007) +[2023-10-13 02:13:08,656][46663] Updated weights for policy 1, policy_version 47461 (0.0008) +[2023-10-13 02:13:09,005][46662] Updated weights for policy 0, policy_version 47510 (0.0008) +[2023-10-13 02:13:09,026][46663] Updated weights for policy 1, policy_version 47471 (0.0009) +[2023-10-13 02:13:09,366][46662] Updated weights for policy 0, policy_version 47520 (0.0010) +[2023-10-13 02:13:09,397][46663] Updated weights for policy 1, policy_version 47481 (0.0007) +[2023-10-13 02:13:13,518][46662] Updated weights for policy 0, policy_version 47530 (0.0009) +[2023-10-13 02:13:13,528][46663] Updated weights for policy 1, policy_version 47491 (0.0007) +[2023-10-13 02:13:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97288192. Throughput: 0: 1680.3, 1: 1664.3. Samples: 24331848. Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:13:13,607][45375] Avg episode reward: [(0, '50.010'), (1, '48.680')] +[2023-10-13 02:13:13,897][46662] Updated weights for policy 0, policy_version 47540 (0.0008) +[2023-10-13 02:13:13,898][46663] Updated weights for policy 1, policy_version 47501 (0.0008) +[2023-10-13 02:13:14,269][46662] Updated weights for policy 0, policy_version 47550 (0.0009) +[2023-10-13 02:13:14,270][46663] Updated weights for policy 1, policy_version 47511 (0.0009) +[2023-10-13 02:13:18,275][46662] Updated weights for policy 0, policy_version 47560 (0.0008) +[2023-10-13 02:13:18,409][46663] Updated weights for policy 1, policy_version 47521 (0.0008) +[2023-10-13 02:13:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97353728. Throughput: 0: 1681.4, 1: 1676.3. Samples: 24352760. 
Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:13:18,607][45375] Avg episode reward: [(0, '47.590'), (1, '47.030')] +[2023-10-13 02:13:18,644][46662] Updated weights for policy 0, policy_version 47570 (0.0007) +[2023-10-13 02:13:18,785][46663] Updated weights for policy 1, policy_version 47531 (0.0008) +[2023-10-13 02:13:19,005][46662] Updated weights for policy 0, policy_version 47580 (0.0008) +[2023-10-13 02:13:19,157][46663] Updated weights for policy 1, policy_version 47541 (0.0009) +[2023-10-13 02:13:19,527][46663] Updated weights for policy 1, policy_version 47551 (0.0007) +[2023-10-13 02:13:23,079][46662] Updated weights for policy 0, policy_version 47590 (0.0007) +[2023-10-13 02:13:23,448][46662] Updated weights for policy 0, policy_version 47600 (0.0007) +[2023-10-13 02:13:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97419264. Throughput: 0: 1681.5, 1: 1673.0. Samples: 24373340. Policy #0 lag: (min: 20.0, avg: 28.7, max: 52.0) +[2023-10-13 02:13:23,607][45375] Avg episode reward: [(0, '47.640'), (1, '47.390')] +[2023-10-13 02:13:23,643][46663] Updated weights for policy 1, policy_version 47561 (0.0009) +[2023-10-13 02:13:23,816][46662] Updated weights for policy 0, policy_version 47610 (0.0009) +[2023-10-13 02:13:24,019][46663] Updated weights for policy 1, policy_version 47571 (0.0009) +[2023-10-13 02:13:24,391][46663] Updated weights for policy 1, policy_version 47581 (0.0008) +[2023-10-13 02:13:27,660][46662] Updated weights for policy 0, policy_version 47620 (0.0008) +[2023-10-13 02:13:28,032][46662] Updated weights for policy 0, policy_version 47630 (0.0011) +[2023-10-13 02:13:28,396][46662] Updated weights for policy 0, policy_version 47640 (0.0009) +[2023-10-13 02:13:28,529][46663] Updated weights for policy 1, policy_version 47591 (0.0009) +[2023-10-13 02:13:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97484800. 
Throughput: 0: 1681.7, 1: 1675.7. Samples: 24382618. Policy #0 lag: (min: 19.0, avg: 20.3, max: 43.0) +[2023-10-13 02:13:28,607][45375] Avg episode reward: [(0, '46.120'), (1, '48.380')] +[2023-10-13 02:13:28,914][46663] Updated weights for policy 1, policy_version 47601 (0.0008) +[2023-10-13 02:13:29,279][46663] Updated weights for policy 1, policy_version 47611 (0.0008) +[2023-10-13 02:13:32,455][46662] Updated weights for policy 0, policy_version 47650 (0.0008) +[2023-10-13 02:13:32,824][46662] Updated weights for policy 0, policy_version 47660 (0.0008) +[2023-10-13 02:13:33,194][46662] Updated weights for policy 0, policy_version 47670 (0.0008) +[2023-10-13 02:13:33,362][46663] Updated weights for policy 1, policy_version 47621 (0.0009) +[2023-10-13 02:13:33,562][46662] Updated weights for policy 0, policy_version 47680 (0.0007) +[2023-10-13 02:13:33,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97583104. Throughput: 0: 1682.1, 1: 1675.7. Samples: 24403400. Policy #0 lag: (min: 19.0, avg: 20.3, max: 43.0) +[2023-10-13 02:13:33,607][45375] Avg episode reward: [(0, '45.780'), (1, '49.260')] +[2023-10-13 02:13:33,732][46663] Updated weights for policy 1, policy_version 47631 (0.0009) +[2023-10-13 02:13:34,093][46663] Updated weights for policy 1, policy_version 47641 (0.0008) +[2023-10-13 02:13:37,519][46662] Updated weights for policy 0, policy_version 47690 (0.0012) +[2023-10-13 02:13:37,893][46662] Updated weights for policy 0, policy_version 47700 (0.0011) +[2023-10-13 02:13:38,187][46663] Updated weights for policy 1, policy_version 47651 (0.0008) +[2023-10-13 02:13:38,257][46662] Updated weights for policy 0, policy_version 47710 (0.0010) +[2023-10-13 02:13:38,554][46663] Updated weights for policy 1, policy_version 47661 (0.0009) +[2023-10-13 02:13:38,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97648640. Throughput: 0: 1672.8, 1: 1673.7. Samples: 24423338. 
Policy #0 lag: (min: 19.0, avg: 20.3, max: 43.0) +[2023-10-13 02:13:38,607][45375] Avg episode reward: [(0, '46.020'), (1, '50.130')] +[2023-10-13 02:13:38,928][46663] Updated weights for policy 1, policy_version 47671 (0.0007) +[2023-10-13 02:13:42,482][46662] Updated weights for policy 0, policy_version 47720 (0.0008) +[2023-10-13 02:13:42,854][46662] Updated weights for policy 0, policy_version 47730 (0.0008) +[2023-10-13 02:13:42,901][46663] Updated weights for policy 1, policy_version 47681 (0.0007) +[2023-10-13 02:13:43,219][46662] Updated weights for policy 0, policy_version 47740 (0.0008) +[2023-10-13 02:13:43,268][46663] Updated weights for policy 1, policy_version 47691 (0.0008) +[2023-10-13 02:13:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97714176. Throughput: 0: 1688.9, 1: 1680.5. Samples: 24433444. Policy #0 lag: (min: 19.0, avg: 20.3, max: 43.0) +[2023-10-13 02:13:43,608][45375] Avg episode reward: [(0, '46.360'), (1, '50.010')] +[2023-10-13 02:13:43,639][46663] Updated weights for policy 1, policy_version 47701 (0.0009) +[2023-10-13 02:13:44,003][46663] Updated weights for policy 1, policy_version 47711 (0.0007) +[2023-10-13 02:13:47,253][46662] Updated weights for policy 0, policy_version 47750 (0.0008) +[2023-10-13 02:13:47,619][46662] Updated weights for policy 0, policy_version 47760 (0.0010) +[2023-10-13 02:13:47,979][46662] Updated weights for policy 0, policy_version 47770 (0.0007) +[2023-10-13 02:13:47,987][46663] Updated weights for policy 1, policy_version 47721 (0.0008) +[2023-10-13 02:13:48,350][46663] Updated weights for policy 1, policy_version 47731 (0.0009) +[2023-10-13 02:13:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97779712. Throughput: 0: 1694.4, 1: 1687.4. Samples: 24454592. 
Policy #0 lag: (min: 19.0, avg: 20.3, max: 43.0) +[2023-10-13 02:13:48,607][45375] Avg episode reward: [(0, '46.620'), (1, '49.220')] +[2023-10-13 02:13:48,720][46663] Updated weights for policy 1, policy_version 47741 (0.0008) +[2023-10-13 02:13:51,966][46662] Updated weights for policy 0, policy_version 47780 (0.0008) +[2023-10-13 02:13:52,333][46662] Updated weights for policy 0, policy_version 47790 (0.0007) +[2023-10-13 02:13:52,700][46662] Updated weights for policy 0, policy_version 47800 (0.0009) +[2023-10-13 02:13:52,725][46663] Updated weights for policy 1, policy_version 47751 (0.0008) +[2023-10-13 02:13:53,095][46663] Updated weights for policy 1, policy_version 47761 (0.0009) +[2023-10-13 02:13:53,466][46663] Updated weights for policy 1, policy_version 47771 (0.0009) +[2023-10-13 02:13:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97845248. Throughput: 0: 1672.3, 1: 1665.7. Samples: 24473130. Policy #0 lag: (min: 19.0, avg: 20.3, max: 43.0) +[2023-10-13 02:13:53,608][45375] Avg episode reward: [(0, '44.720'), (1, '49.610')] +[2023-10-13 02:13:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000047808_48955392.pth... +[2023-10-13 02:13:53,643][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000047776_48922624.pth... 
+[2023-10-13 02:13:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000046240_47349760.pth +[2023-10-13 02:13:53,672][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000046208_47316992.pth +[2023-10-13 02:13:56,764][46662] Updated weights for policy 0, policy_version 47810 (0.0008) +[2023-10-13 02:13:57,143][46662] Updated weights for policy 0, policy_version 47820 (0.0008) +[2023-10-13 02:13:57,521][46662] Updated weights for policy 0, policy_version 47830 (0.0010) +[2023-10-13 02:13:57,561][46663] Updated weights for policy 1, policy_version 47781 (0.0007) +[2023-10-13 02:13:57,886][46662] Updated weights for policy 0, policy_version 47840 (0.0008) +[2023-10-13 02:13:57,924][46663] Updated weights for policy 1, policy_version 47791 (0.0008) +[2023-10-13 02:13:58,291][46663] Updated weights for policy 1, policy_version 47801 (0.0009) +[2023-10-13 02:13:58,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 97943552. Throughput: 0: 1696.3, 1: 1689.5. Samples: 24484208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:13:58,607][45375] Avg episode reward: [(0, '44.410'), (1, '50.870')] +[2023-10-13 02:14:01,945][46662] Updated weights for policy 0, policy_version 47850 (0.0008) +[2023-10-13 02:14:02,309][46663] Updated weights for policy 1, policy_version 47811 (0.0010) +[2023-10-13 02:14:02,312][46662] Updated weights for policy 0, policy_version 47860 (0.0009) +[2023-10-13 02:14:02,677][46662] Updated weights for policy 0, policy_version 47870 (0.0007) +[2023-10-13 02:14:02,679][46663] Updated weights for policy 1, policy_version 47821 (0.0009) +[2023-10-13 02:14:03,048][46663] Updated weights for policy 1, policy_version 47831 (0.0010) +[2023-10-13 02:14:03,607][45375] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98009088. Throughput: 0: 1692.9, 1: 1679.2. Samples: 24504508. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:03,607][45375] Avg episode reward: [(0, '43.410'), (1, '51.620')] +[2023-10-13 02:14:06,850][46662] Updated weights for policy 0, policy_version 47880 (0.0009) +[2023-10-13 02:14:07,021][46663] Updated weights for policy 1, policy_version 47841 (0.0010) +[2023-10-13 02:14:07,214][46662] Updated weights for policy 0, policy_version 47890 (0.0009) +[2023-10-13 02:14:07,390][46663] Updated weights for policy 1, policy_version 47851 (0.0010) +[2023-10-13 02:14:07,584][46662] Updated weights for policy 0, policy_version 47900 (0.0009) +[2023-10-13 02:14:07,752][46663] Updated weights for policy 1, policy_version 47861 (0.0010) +[2023-10-13 02:14:08,121][46663] Updated weights for policy 1, policy_version 47871 (0.0007) +[2023-10-13 02:14:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98074624. Throughput: 0: 1667.5, 1: 1664.6. Samples: 24523284. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:08,607][45375] Avg episode reward: [(0, '44.780'), (1, '52.250')] +[2023-10-13 02:14:11,607][46662] Updated weights for policy 0, policy_version 47910 (0.0008) +[2023-10-13 02:14:11,986][46662] Updated weights for policy 0, policy_version 47920 (0.0009) +[2023-10-13 02:14:12,199][46663] Updated weights for policy 1, policy_version 47881 (0.0008) +[2023-10-13 02:14:12,348][46662] Updated weights for policy 0, policy_version 47930 (0.0009) +[2023-10-13 02:14:12,556][46663] Updated weights for policy 1, policy_version 47891 (0.0008) +[2023-10-13 02:14:12,936][46663] Updated weights for policy 1, policy_version 47901 (0.0009) +[2023-10-13 02:14:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 98140160. Throughput: 0: 1693.4, 1: 1693.4. Samples: 24535024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:13,608][45375] Avg episode reward: [(0, '45.650'), (1, '53.090')] +[2023-10-13 02:14:16,367][46662] Updated weights for policy 0, policy_version 47940 (0.0009) +[2023-10-13 02:14:16,737][46662] Updated weights for policy 0, policy_version 47950 (0.0009) +[2023-10-13 02:14:17,105][46662] Updated weights for policy 0, policy_version 47960 (0.0009) +[2023-10-13 02:14:17,202][46663] Updated weights for policy 1, policy_version 47911 (0.0008) +[2023-10-13 02:14:17,586][46663] Updated weights for policy 1, policy_version 47921 (0.0010) +[2023-10-13 02:14:17,953][46663] Updated weights for policy 1, policy_version 47931 (0.0009) +[2023-10-13 02:14:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98205696. Throughput: 0: 1677.5, 1: 1683.7. Samples: 24554654. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:18,607][45375] Avg episode reward: [(0, '46.460'), (1, '52.140')] +[2023-10-13 02:14:21,177][46662] Updated weights for policy 0, policy_version 47970 (0.0008) +[2023-10-13 02:14:21,541][46662] Updated weights for policy 0, policy_version 47980 (0.0010) +[2023-10-13 02:14:21,914][46662] Updated weights for policy 0, policy_version 47990 (0.0007) +[2023-10-13 02:14:21,951][46663] Updated weights for policy 1, policy_version 47941 (0.0007) +[2023-10-13 02:14:22,281][46662] Updated weights for policy 0, policy_version 48000 (0.0009) +[2023-10-13 02:14:22,322][46663] Updated weights for policy 1, policy_version 47951 (0.0009) +[2023-10-13 02:14:22,681][46663] Updated weights for policy 1, policy_version 47961 (0.0009) +[2023-10-13 02:14:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 98271232. Throughput: 0: 1674.2, 1: 1669.6. Samples: 24573810. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:23,607][45375] Avg episode reward: [(0, '44.930'), (1, '52.000')] +[2023-10-13 02:14:26,362][46662] Updated weights for policy 0, policy_version 48010 (0.0010) +[2023-10-13 02:14:26,728][46662] Updated weights for policy 0, policy_version 48020 (0.0007) +[2023-10-13 02:14:26,891][46663] Updated weights for policy 1, policy_version 47971 (0.0010) +[2023-10-13 02:14:27,104][46662] Updated weights for policy 0, policy_version 48030 (0.0007) +[2023-10-13 02:14:27,245][46663] Updated weights for policy 1, policy_version 47981 (0.0008) +[2023-10-13 02:14:27,612][46663] Updated weights for policy 1, policy_version 47991 (0.0009) +[2023-10-13 02:14:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98336768. Throughput: 0: 1686.8, 1: 1686.9. Samples: 24585258. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:28,607][45375] Avg episode reward: [(0, '46.020'), (1, '52.390')] +[2023-10-13 02:14:31,140][46662] Updated weights for policy 0, policy_version 48040 (0.0009) +[2023-10-13 02:14:31,518][46662] Updated weights for policy 0, policy_version 48050 (0.0007) +[2023-10-13 02:14:31,742][46663] Updated weights for policy 1, policy_version 48001 (0.0009) +[2023-10-13 02:14:31,891][46662] Updated weights for policy 0, policy_version 48060 (0.0008) +[2023-10-13 02:14:32,103][46663] Updated weights for policy 1, policy_version 48011 (0.0007) +[2023-10-13 02:14:32,460][46663] Updated weights for policy 1, policy_version 48021 (0.0010) +[2023-10-13 02:14:32,823][46663] Updated weights for policy 1, policy_version 48031 (0.0007) +[2023-10-13 02:14:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98402304. Throughput: 0: 1659.8, 1: 1663.2. Samples: 24604124. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:33,607][45375] Avg episode reward: [(0, '46.900'), (1, '52.620')] +[2023-10-13 02:14:36,039][46662] Updated weights for policy 0, policy_version 48070 (0.0009) +[2023-10-13 02:14:36,423][46662] Updated weights for policy 0, policy_version 48080 (0.0010) +[2023-10-13 02:14:36,791][46662] Updated weights for policy 0, policy_version 48090 (0.0009) +[2023-10-13 02:14:36,986][46663] Updated weights for policy 1, policy_version 48041 (0.0009) +[2023-10-13 02:14:37,351][46663] Updated weights for policy 1, policy_version 48051 (0.0010) +[2023-10-13 02:14:37,723][46663] Updated weights for policy 1, policy_version 48061 (0.0011) +[2023-10-13 02:14:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98467840. Throughput: 0: 1674.8, 1: 1671.3. Samples: 24623704. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:38,607][45375] Avg episode reward: [(0, '48.730'), (1, '51.610')] +[2023-10-13 02:14:40,677][46662] Updated weights for policy 0, policy_version 48100 (0.0009) +[2023-10-13 02:14:41,046][46662] Updated weights for policy 0, policy_version 48110 (0.0010) +[2023-10-13 02:14:41,406][46662] Updated weights for policy 0, policy_version 48120 (0.0010) +[2023-10-13 02:14:41,772][46663] Updated weights for policy 1, policy_version 48071 (0.0009) +[2023-10-13 02:14:42,138][46663] Updated weights for policy 1, policy_version 48081 (0.0007) +[2023-10-13 02:14:42,505][46663] Updated weights for policy 1, policy_version 48091 (0.0009) +[2023-10-13 02:14:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98533376. Throughput: 0: 1675.5, 1: 1679.8. Samples: 24635198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:43,608][45375] Avg episode reward: [(0, '49.340'), (1, '50.390')] +[2023-10-13 02:14:45,487][46662] Updated weights for policy 0, policy_version 48130 (0.0009) +[2023-10-13 02:14:45,862][46662] Updated weights for policy 0, policy_version 48140 (0.0009) +[2023-10-13 02:14:46,230][46662] Updated weights for policy 0, policy_version 48150 (0.0007) +[2023-10-13 02:14:46,605][46662] Updated weights for policy 0, policy_version 48160 (0.0008) +[2023-10-13 02:14:46,605][46663] Updated weights for policy 1, policy_version 48101 (0.0009) +[2023-10-13 02:14:46,976][46663] Updated weights for policy 1, policy_version 48111 (0.0009) +[2023-10-13 02:14:47,335][46663] Updated weights for policy 1, policy_version 48121 (0.0010) +[2023-10-13 02:14:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98598912. Throughput: 0: 1657.0, 1: 1666.4. Samples: 24654062. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:48,607][45375] Avg episode reward: [(0, '49.060'), (1, '49.020')] +[2023-10-13 02:14:50,565][46662] Updated weights for policy 0, policy_version 48170 (0.0009) +[2023-10-13 02:14:50,943][46662] Updated weights for policy 0, policy_version 48180 (0.0007) +[2023-10-13 02:14:51,306][46662] Updated weights for policy 0, policy_version 48190 (0.0008) +[2023-10-13 02:14:51,549][46663] Updated weights for policy 1, policy_version 48131 (0.0008) +[2023-10-13 02:14:51,918][46663] Updated weights for policy 1, policy_version 48141 (0.0007) +[2023-10-13 02:14:52,298][46663] Updated weights for policy 1, policy_version 48151 (0.0008) +[2023-10-13 02:14:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 98664448. Throughput: 0: 1683.3, 1: 1674.0. Samples: 24674362. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:53,607][45375] Avg episode reward: [(0, '48.340'), (1, '49.600')] +[2023-10-13 02:14:55,551][46662] Updated weights for policy 0, policy_version 48200 (0.0009) +[2023-10-13 02:14:55,919][46662] Updated weights for policy 0, policy_version 48210 (0.0009) +[2023-10-13 02:14:56,293][46662] Updated weights for policy 0, policy_version 48220 (0.0009) +[2023-10-13 02:14:56,299][46663] Updated weights for policy 1, policy_version 48161 (0.0009) +[2023-10-13 02:14:56,666][46663] Updated weights for policy 1, policy_version 48171 (0.0009) +[2023-10-13 02:14:57,024][46663] Updated weights for policy 1, policy_version 48181 (0.0007) +[2023-10-13 02:14:57,399][46663] Updated weights for policy 1, policy_version 48191 (0.0009) +[2023-10-13 02:14:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98729984. Throughput: 0: 1665.8, 1: 1672.3. Samples: 24685236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:14:58,607][45375] Avg episode reward: [(0, '47.040'), (1, '47.900')] +[2023-10-13 02:15:00,412][46662] Updated weights for policy 0, policy_version 48230 (0.0009) +[2023-10-13 02:15:00,792][46662] Updated weights for policy 0, policy_version 48240 (0.0009) +[2023-10-13 02:15:01,170][46662] Updated weights for policy 0, policy_version 48250 (0.0009) +[2023-10-13 02:15:01,558][46663] Updated weights for policy 1, policy_version 48201 (0.0008) +[2023-10-13 02:15:01,929][46663] Updated weights for policy 1, policy_version 48211 (0.0008) +[2023-10-13 02:15:02,288][46663] Updated weights for policy 1, policy_version 48221 (0.0012) +[2023-10-13 02:15:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98795520. Throughput: 0: 1658.9, 1: 1656.7. Samples: 24703858. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:15:03,608][45375] Avg episode reward: [(0, '46.980'), (1, '48.430')] +[2023-10-13 02:15:05,328][46662] Updated weights for policy 0, policy_version 48260 (0.0008) +[2023-10-13 02:15:05,688][46662] Updated weights for policy 0, policy_version 48270 (0.0010) +[2023-10-13 02:15:06,069][46662] Updated weights for policy 0, policy_version 48280 (0.0009) +[2023-10-13 02:15:06,290][46663] Updated weights for policy 1, policy_version 48231 (0.0009) +[2023-10-13 02:15:06,655][46663] Updated weights for policy 1, policy_version 48241 (0.0008) +[2023-10-13 02:15:07,022][46663] Updated weights for policy 1, policy_version 48251 (0.0009) +[2023-10-13 02:15:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98861056. Throughput: 0: 1672.1, 1: 1673.0. Samples: 24724342. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:08,607][45375] Avg episode reward: [(0, '46.810'), (1, '47.380')] +[2023-10-13 02:15:10,036][46662] Updated weights for policy 0, policy_version 48290 (0.0010) +[2023-10-13 02:15:10,404][46662] Updated weights for policy 0, policy_version 48300 (0.0010) +[2023-10-13 02:15:10,773][46662] Updated weights for policy 0, policy_version 48310 (0.0009) +[2023-10-13 02:15:11,144][46662] Updated weights for policy 0, policy_version 48320 (0.0009) +[2023-10-13 02:15:11,208][46663] Updated weights for policy 1, policy_version 48261 (0.0010) +[2023-10-13 02:15:11,577][46663] Updated weights for policy 1, policy_version 48271 (0.0010) +[2023-10-13 02:15:11,937][46663] Updated weights for policy 1, policy_version 48281 (0.0007) +[2023-10-13 02:15:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 98926592. Throughput: 0: 1653.3, 1: 1666.8. Samples: 24734664. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:13,608][45375] Avg episode reward: [(0, '47.210'), (1, '47.490')] +[2023-10-13 02:15:15,240][46662] Updated weights for policy 0, policy_version 48330 (0.0011) +[2023-10-13 02:15:15,599][46662] Updated weights for policy 0, policy_version 48340 (0.0009) +[2023-10-13 02:15:15,968][46662] Updated weights for policy 0, policy_version 48350 (0.0010) +[2023-10-13 02:15:16,090][46663] Updated weights for policy 1, policy_version 48291 (0.0008) +[2023-10-13 02:15:16,455][46663] Updated weights for policy 1, policy_version 48301 (0.0011) +[2023-10-13 02:15:16,820][46663] Updated weights for policy 1, policy_version 48311 (0.0009) +[2023-10-13 02:15:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98992128. Throughput: 0: 1671.7, 1: 1660.8. Samples: 24754088. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:18,607][45375] Avg episode reward: [(0, '48.220'), (1, '49.280')] +[2023-10-13 02:15:20,050][46662] Updated weights for policy 0, policy_version 48360 (0.0009) +[2023-10-13 02:15:20,420][46662] Updated weights for policy 0, policy_version 48370 (0.0008) +[2023-10-13 02:15:20,714][46663] Updated weights for policy 1, policy_version 48321 (0.0011) +[2023-10-13 02:15:20,797][46662] Updated weights for policy 0, policy_version 48380 (0.0007) +[2023-10-13 02:15:21,087][46663] Updated weights for policy 1, policy_version 48331 (0.0009) +[2023-10-13 02:15:21,464][46663] Updated weights for policy 1, policy_version 48341 (0.0009) +[2023-10-13 02:15:21,840][46663] Updated weights for policy 1, policy_version 48351 (0.0008) +[2023-10-13 02:15:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99057664. Throughput: 0: 1684.5, 1: 1674.8. Samples: 24774872. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:23,607][45375] Avg episode reward: [(0, '48.360'), (1, '49.120')] +[2023-10-13 02:15:24,905][46662] Updated weights for policy 0, policy_version 48390 (0.0008) +[2023-10-13 02:15:25,287][46662] Updated weights for policy 0, policy_version 48400 (0.0008) +[2023-10-13 02:15:25,662][46662] Updated weights for policy 0, policy_version 48410 (0.0007) +[2023-10-13 02:15:25,934][46663] Updated weights for policy 1, policy_version 48361 (0.0009) +[2023-10-13 02:15:26,314][46663] Updated weights for policy 1, policy_version 48371 (0.0009) +[2023-10-13 02:15:26,672][46663] Updated weights for policy 1, policy_version 48381 (0.0009) +[2023-10-13 02:15:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99123200. Throughput: 0: 1656.8, 1: 1657.3. Samples: 24784330. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:28,607][45375] Avg episode reward: [(0, '48.330'), (1, '49.090')] +[2023-10-13 02:15:29,715][46662] Updated weights for policy 0, policy_version 48420 (0.0008) +[2023-10-13 02:15:30,085][46662] Updated weights for policy 0, policy_version 48430 (0.0008) +[2023-10-13 02:15:30,458][46662] Updated weights for policy 0, policy_version 48440 (0.0008) +[2023-10-13 02:15:30,712][46663] Updated weights for policy 1, policy_version 48391 (0.0009) +[2023-10-13 02:15:31,079][46663] Updated weights for policy 1, policy_version 48401 (0.0008) +[2023-10-13 02:15:31,445][46663] Updated weights for policy 1, policy_version 48411 (0.0009) +[2023-10-13 02:15:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 99188736. Throughput: 0: 1676.8, 1: 1665.6. Samples: 24804468. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:33,608][45375] Avg episode reward: [(0, '46.530'), (1, '48.800')] +[2023-10-13 02:15:34,540][46662] Updated weights for policy 0, policy_version 48450 (0.0008) +[2023-10-13 02:15:34,906][46662] Updated weights for policy 0, policy_version 48460 (0.0010) +[2023-10-13 02:15:35,281][46662] Updated weights for policy 0, policy_version 48470 (0.0008) +[2023-10-13 02:15:35,507][46663] Updated weights for policy 1, policy_version 48421 (0.0008) +[2023-10-13 02:15:35,640][46662] Updated weights for policy 0, policy_version 48480 (0.0008) +[2023-10-13 02:15:35,873][46663] Updated weights for policy 1, policy_version 48431 (0.0007) +[2023-10-13 02:15:36,248][46663] Updated weights for policy 1, policy_version 48441 (0.0009) +[2023-10-13 02:15:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99254272. Throughput: 0: 1675.9, 1: 1681.0. Samples: 24825424. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-13 02:15:38,607][45375] Avg episode reward: [(0, '46.250'), (1, '48.360')] +[2023-10-13 02:15:39,708][46662] Updated weights for policy 0, policy_version 48490 (0.0009) +[2023-10-13 02:15:40,071][46662] Updated weights for policy 0, policy_version 48500 (0.0008) +[2023-10-13 02:15:40,411][46663] Updated weights for policy 1, policy_version 48451 (0.0008) +[2023-10-13 02:15:40,439][46662] Updated weights for policy 0, policy_version 48510 (0.0008) +[2023-10-13 02:15:40,785][46663] Updated weights for policy 1, policy_version 48461 (0.0011) +[2023-10-13 02:15:41,145][46663] Updated weights for policy 1, policy_version 48471 (0.0008) +[2023-10-13 02:15:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99319808. Throughput: 0: 1660.7, 1: 1658.6. Samples: 24834604. 
Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:15:43,607][45375] Avg episode reward: [(0, '46.090'), (1, '48.770')] +[2023-10-13 02:15:44,653][46662] Updated weights for policy 0, policy_version 48520 (0.0009) +[2023-10-13 02:15:45,021][46662] Updated weights for policy 0, policy_version 48530 (0.0008) +[2023-10-13 02:15:45,173][46663] Updated weights for policy 1, policy_version 48481 (0.0009) +[2023-10-13 02:15:45,403][46662] Updated weights for policy 0, policy_version 48540 (0.0010) +[2023-10-13 02:15:45,536][46663] Updated weights for policy 1, policy_version 48491 (0.0007) +[2023-10-13 02:15:45,896][46663] Updated weights for policy 1, policy_version 48501 (0.0008) +[2023-10-13 02:15:46,267][46663] Updated weights for policy 1, policy_version 48511 (0.0008) +[2023-10-13 02:15:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99385344. Throughput: 0: 1676.3, 1: 1686.0. Samples: 24855162. Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:15:48,607][45375] Avg episode reward: [(0, '44.980'), (1, '48.320')] +[2023-10-13 02:15:49,668][46662] Updated weights for policy 0, policy_version 48550 (0.0009) +[2023-10-13 02:15:50,052][46662] Updated weights for policy 0, policy_version 48560 (0.0007) +[2023-10-13 02:15:50,423][46662] Updated weights for policy 0, policy_version 48570 (0.0007) +[2023-10-13 02:15:50,476][46663] Updated weights for policy 1, policy_version 48521 (0.0010) +[2023-10-13 02:15:50,851][46663] Updated weights for policy 1, policy_version 48531 (0.0010) +[2023-10-13 02:15:51,211][46663] Updated weights for policy 1, policy_version 48541 (0.0010) +[2023-10-13 02:15:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99450880. Throughput: 0: 1677.1, 1: 1684.9. Samples: 24875634. 
Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:15:53,607][45375] Avg episode reward: [(0, '45.680'), (1, '50.130')] +[2023-10-13 02:15:53,614][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000048576_49741824.pth... +[2023-10-13 02:15:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000048544_49709056.pth... +[2023-10-13 02:15:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000046976_48103424.pth +[2023-10-13 02:15:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000047008_48136192.pth +[2023-10-13 02:15:54,390][46662] Updated weights for policy 0, policy_version 48580 (0.0009) +[2023-10-13 02:15:54,769][46662] Updated weights for policy 0, policy_version 48590 (0.0009) +[2023-10-13 02:15:55,128][46662] Updated weights for policy 0, policy_version 48600 (0.0008) +[2023-10-13 02:15:55,181][46663] Updated weights for policy 1, policy_version 48551 (0.0008) +[2023-10-13 02:15:55,557][46663] Updated weights for policy 1, policy_version 48561 (0.0008) +[2023-10-13 02:15:55,919][46663] Updated weights for policy 1, policy_version 48571 (0.0010) +[2023-10-13 02:15:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99516416. Throughput: 0: 1667.9, 1: 1664.5. Samples: 24884620. 
Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:15:58,607][45375] Avg episode reward: [(0, '44.820'), (1, '49.530')] +[2023-10-13 02:15:59,129][46662] Updated weights for policy 0, policy_version 48610 (0.0008) +[2023-10-13 02:15:59,496][46662] Updated weights for policy 0, policy_version 48620 (0.0008) +[2023-10-13 02:15:59,874][46662] Updated weights for policy 0, policy_version 48630 (0.0008) +[2023-10-13 02:16:00,015][46663] Updated weights for policy 1, policy_version 48581 (0.0009) +[2023-10-13 02:16:00,240][46662] Updated weights for policy 0, policy_version 48640 (0.0008) +[2023-10-13 02:16:00,376][46663] Updated weights for policy 1, policy_version 48591 (0.0009) +[2023-10-13 02:16:00,731][46663] Updated weights for policy 1, policy_version 48601 (0.0009) +[2023-10-13 02:16:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99581952. Throughput: 0: 1682.3, 1: 1685.3. Samples: 24905630. Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:16:03,607][45375] Avg episode reward: [(0, '45.390'), (1, '50.910')] +[2023-10-13 02:16:04,150][46662] Updated weights for policy 0, policy_version 48650 (0.0009) +[2023-10-13 02:16:04,514][46662] Updated weights for policy 0, policy_version 48660 (0.0008) +[2023-10-13 02:16:04,885][46662] Updated weights for policy 0, policy_version 48670 (0.0008) +[2023-10-13 02:16:04,975][46663] Updated weights for policy 1, policy_version 48611 (0.0011) +[2023-10-13 02:16:05,348][46663] Updated weights for policy 1, policy_version 48621 (0.0009) +[2023-10-13 02:16:05,707][46663] Updated weights for policy 1, policy_version 48631 (0.0007) +[2023-10-13 02:16:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99647488. Throughput: 0: 1681.8, 1: 1683.8. Samples: 24926322. 
Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:16:08,607][45375] Avg episode reward: [(0, '44.140'), (1, '49.830')] +[2023-10-13 02:16:09,008][46662] Updated weights for policy 0, policy_version 48680 (0.0008) +[2023-10-13 02:16:09,376][46662] Updated weights for policy 0, policy_version 48690 (0.0007) +[2023-10-13 02:16:09,741][46662] Updated weights for policy 0, policy_version 48700 (0.0007) +[2023-10-13 02:16:09,838][46663] Updated weights for policy 1, policy_version 48641 (0.0009) +[2023-10-13 02:16:10,195][46663] Updated weights for policy 1, policy_version 48651 (0.0010) +[2023-10-13 02:16:10,559][46663] Updated weights for policy 1, policy_version 48661 (0.0009) +[2023-10-13 02:16:10,925][46663] Updated weights for policy 1, policy_version 48671 (0.0010) +[2023-10-13 02:16:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99713024. Throughput: 0: 1687.3, 1: 1668.9. Samples: 24935360. Policy #0 lag: (min: 21.0, avg: 26.0, max: 53.0) +[2023-10-13 02:16:13,607][45375] Avg episode reward: [(0, '43.490'), (1, '49.590')] +[2023-10-13 02:16:13,667][46662] Updated weights for policy 0, policy_version 48710 (0.0007) +[2023-10-13 02:16:14,053][46662] Updated weights for policy 0, policy_version 48720 (0.0007) +[2023-10-13 02:16:14,422][46662] Updated weights for policy 0, policy_version 48730 (0.0011) +[2023-10-13 02:16:14,978][46663] Updated weights for policy 1, policy_version 48681 (0.0008) +[2023-10-13 02:16:15,348][46663] Updated weights for policy 1, policy_version 48691 (0.0008) +[2023-10-13 02:16:15,707][46663] Updated weights for policy 1, policy_version 48701 (0.0008) +[2023-10-13 02:16:18,367][46662] Updated weights for policy 0, policy_version 48740 (0.0009) +[2023-10-13 02:16:18,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99778560. Throughput: 0: 1697.0, 1: 1680.3. Samples: 24956448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:16:18,608][45375] Avg episode reward: [(0, '42.830'), (1, '50.440')] +[2023-10-13 02:16:18,735][46662] Updated weights for policy 0, policy_version 48750 (0.0008) +[2023-10-13 02:16:19,113][46662] Updated weights for policy 0, policy_version 48760 (0.0008) +[2023-10-13 02:16:19,774][46663] Updated weights for policy 1, policy_version 48711 (0.0008) +[2023-10-13 02:16:20,136][46663] Updated weights for policy 1, policy_version 48721 (0.0010) +[2023-10-13 02:16:20,503][46663] Updated weights for policy 1, policy_version 48731 (0.0010) +[2023-10-13 02:16:23,010][46662] Updated weights for policy 0, policy_version 48770 (0.0007) +[2023-10-13 02:16:23,378][46662] Updated weights for policy 0, policy_version 48780 (0.0007) +[2023-10-13 02:16:23,607][45375] Fps is (10 sec: 13106.0, 60 sec: 13107.0, 300 sec: 13329.3). Total num frames: 99844096. Throughput: 0: 1696.7, 1: 1670.9. Samples: 24976968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:16:23,608][45375] Avg episode reward: [(0, '44.220'), (1, '50.140')] +[2023-10-13 02:16:23,751][46662] Updated weights for policy 0, policy_version 48790 (0.0008) +[2023-10-13 02:16:24,114][46662] Updated weights for policy 0, policy_version 48800 (0.0009) +[2023-10-13 02:16:24,559][46663] Updated weights for policy 1, policy_version 48741 (0.0010) +[2023-10-13 02:16:24,915][46663] Updated weights for policy 1, policy_version 48751 (0.0010) +[2023-10-13 02:16:25,280][46663] Updated weights for policy 1, policy_version 48761 (0.0008) +[2023-10-13 02:16:28,160][46662] Updated weights for policy 0, policy_version 48810 (0.0009) +[2023-10-13 02:16:28,534][46662] Updated weights for policy 0, policy_version 48820 (0.0011) +[2023-10-13 02:16:28,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99909632. Throughput: 0: 1702.6, 1: 1664.0. Samples: 24986100. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:16:28,607][45375] Avg episode reward: [(0, '45.250'), (1, '51.080')] +[2023-10-13 02:16:28,913][46662] Updated weights for policy 0, policy_version 48830 (0.0007) +[2023-10-13 02:16:29,356][46663] Updated weights for policy 1, policy_version 48771 (0.0009) +[2023-10-13 02:16:29,738][46663] Updated weights for policy 1, policy_version 48781 (0.0008) +[2023-10-13 02:16:30,095][46663] Updated weights for policy 1, policy_version 48791 (0.0010) +[2023-10-13 02:16:33,191][46662] Updated weights for policy 0, policy_version 48840 (0.0010) +[2023-10-13 02:16:33,556][46662] Updated weights for policy 0, policy_version 48850 (0.0011) +[2023-10-13 02:16:33,607][45375] Fps is (10 sec: 13108.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99975168. Throughput: 0: 1700.2, 1: 1667.4. Samples: 25006704. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:16:33,608][45375] Avg episode reward: [(0, '45.290'), (1, '52.760')] +[2023-10-13 02:16:33,922][46662] Updated weights for policy 0, policy_version 48860 (0.0010) +[2023-10-13 02:16:34,154][46663] Updated weights for policy 1, policy_version 48801 (0.0007) +[2023-10-13 02:16:34,518][46663] Updated weights for policy 1, policy_version 48811 (0.0008) +[2023-10-13 02:16:34,878][46663] Updated weights for policy 1, policy_version 48821 (0.0009) +[2023-10-13 02:16:35,240][46663] Updated weights for policy 1, policy_version 48831 (0.0009) +[2023-10-13 02:16:38,063][46662] Updated weights for policy 0, policy_version 48870 (0.0009) +[2023-10-13 02:16:38,435][46662] Updated weights for policy 0, policy_version 48880 (0.0009) +[2023-10-13 02:16:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100040704. Throughput: 0: 1699.7, 1: 1668.0. Samples: 25027182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:16:38,607][45375] Avg episode reward: [(0, '45.740'), (1, '52.130')] +[2023-10-13 02:16:38,819][46662] Updated weights for policy 0, policy_version 48890 (0.0008) +[2023-10-13 02:16:39,401][46663] Updated weights for policy 1, policy_version 48841 (0.0010) +[2023-10-13 02:16:39,772][46663] Updated weights for policy 1, policy_version 48851 (0.0010) +[2023-10-13 02:16:40,137][46663] Updated weights for policy 1, policy_version 48861 (0.0009) +[2023-10-13 02:16:42,664][46662] Updated weights for policy 0, policy_version 48900 (0.0011) +[2023-10-13 02:16:43,026][46662] Updated weights for policy 0, policy_version 48910 (0.0009) +[2023-10-13 02:16:43,404][46662] Updated weights for policy 0, policy_version 48920 (0.0008) +[2023-10-13 02:16:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 100106240. Throughput: 0: 1702.2, 1: 1666.7. Samples: 25036220. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:16:43,608][45375] Avg episode reward: [(0, '46.020'), (1, '51.770')] +[2023-10-13 02:16:44,209][46663] Updated weights for policy 1, policy_version 48871 (0.0008) +[2023-10-13 02:16:44,585][46663] Updated weights for policy 1, policy_version 48881 (0.0009) +[2023-10-13 02:16:44,948][46663] Updated weights for policy 1, policy_version 48891 (0.0008) +[2023-10-13 02:16:47,345][46662] Updated weights for policy 0, policy_version 48930 (0.0008) +[2023-10-13 02:16:47,718][46662] Updated weights for policy 0, policy_version 48940 (0.0008) +[2023-10-13 02:16:48,089][46662] Updated weights for policy 0, policy_version 48950 (0.0008) +[2023-10-13 02:16:48,453][46662] Updated weights for policy 0, policy_version 48960 (0.0008) +[2023-10-13 02:16:48,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100204544. Throughput: 0: 1701.7, 1: 1671.8. Samples: 25057438. 
Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:16:48,607][45375] Avg episode reward: [(0, '46.110'), (1, '52.630')] +[2023-10-13 02:16:49,038][46663] Updated weights for policy 1, policy_version 48901 (0.0008) +[2023-10-13 02:16:49,400][46663] Updated weights for policy 1, policy_version 48911 (0.0010) +[2023-10-13 02:16:49,759][46663] Updated weights for policy 1, policy_version 48921 (0.0008) +[2023-10-13 02:16:52,582][46662] Updated weights for policy 0, policy_version 48970 (0.0008) +[2023-10-13 02:16:52,952][46662] Updated weights for policy 0, policy_version 48980 (0.0008) +[2023-10-13 02:16:53,329][46662] Updated weights for policy 0, policy_version 48990 (0.0007) +[2023-10-13 02:16:53,607][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100270080. Throughput: 0: 1686.0, 1: 1672.1. Samples: 25077436. Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:16:53,607][45375] Avg episode reward: [(0, '46.760'), (1, '53.030')] +[2023-10-13 02:16:53,843][46663] Updated weights for policy 1, policy_version 48931 (0.0010) +[2023-10-13 02:16:54,216][46663] Updated weights for policy 1, policy_version 48941 (0.0008) +[2023-10-13 02:16:54,576][46663] Updated weights for policy 1, policy_version 48951 (0.0009) +[2023-10-13 02:16:57,200][46662] Updated weights for policy 0, policy_version 49000 (0.0009) +[2023-10-13 02:16:57,575][46662] Updated weights for policy 0, policy_version 49010 (0.0008) +[2023-10-13 02:16:57,943][46662] Updated weights for policy 0, policy_version 49020 (0.0009) +[2023-10-13 02:16:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100335616. Throughput: 0: 1701.7, 1: 1671.5. Samples: 25087154. 
Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:16:58,607][45375] Avg episode reward: [(0, '46.290'), (1, '55.320')] +[2023-10-13 02:16:58,803][46663] Updated weights for policy 1, policy_version 48961 (0.0009) +[2023-10-13 02:16:59,165][46663] Updated weights for policy 1, policy_version 48971 (0.0007) +[2023-10-13 02:16:59,535][46663] Updated weights for policy 1, policy_version 48981 (0.0008) +[2023-10-13 02:16:59,907][46663] Updated weights for policy 1, policy_version 48991 (0.0008) +[2023-10-13 02:17:02,047][46662] Updated weights for policy 0, policy_version 49030 (0.0008) +[2023-10-13 02:17:02,425][46662] Updated weights for policy 0, policy_version 49040 (0.0007) +[2023-10-13 02:17:02,800][46662] Updated weights for policy 0, policy_version 49050 (0.0009) +[2023-10-13 02:17:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100401152. Throughput: 0: 1694.7, 1: 1671.2. Samples: 25107914. Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:17:03,608][45375] Avg episode reward: [(0, '45.800'), (1, '55.930')] +[2023-10-13 02:17:03,900][46663] Updated weights for policy 1, policy_version 49001 (0.0009) +[2023-10-13 02:17:04,267][46663] Updated weights for policy 1, policy_version 49011 (0.0007) +[2023-10-13 02:17:04,648][46663] Updated weights for policy 1, policy_version 49021 (0.0009) +[2023-10-13 02:17:06,931][46662] Updated weights for policy 0, policy_version 49060 (0.0010) +[2023-10-13 02:17:07,314][46662] Updated weights for policy 0, policy_version 49070 (0.0007) +[2023-10-13 02:17:07,677][46662] Updated weights for policy 0, policy_version 49080 (0.0008) +[2023-10-13 02:17:08,601][46663] Updated weights for policy 1, policy_version 49031 (0.0008) +[2023-10-13 02:17:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100466688. Throughput: 0: 1664.5, 1: 1682.5. Samples: 25127582. 
Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:17:08,607][45375] Avg episode reward: [(0, '46.140'), (1, '56.890')] +[2023-10-13 02:17:08,967][46663] Updated weights for policy 1, policy_version 49041 (0.0007) +[2023-10-13 02:17:09,324][46663] Updated weights for policy 1, policy_version 49051 (0.0007) +[2023-10-13 02:17:11,707][46662] Updated weights for policy 0, policy_version 49090 (0.0008) +[2023-10-13 02:17:12,069][46662] Updated weights for policy 0, policy_version 49100 (0.0009) +[2023-10-13 02:17:12,440][46662] Updated weights for policy 0, policy_version 49110 (0.0009) +[2023-10-13 02:17:12,807][46662] Updated weights for policy 0, policy_version 49120 (0.0009) +[2023-10-13 02:17:13,539][46663] Updated weights for policy 1, policy_version 49061 (0.0008) +[2023-10-13 02:17:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100532224. Throughput: 0: 1685.1, 1: 1682.9. Samples: 25137662. Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:17:13,607][45375] Avg episode reward: [(0, '46.210'), (1, '57.290')] +[2023-10-13 02:17:13,917][46663] Updated weights for policy 1, policy_version 49071 (0.0008) +[2023-10-13 02:17:14,286][46663] Updated weights for policy 1, policy_version 49081 (0.0008) +[2023-10-13 02:17:16,823][46662] Updated weights for policy 0, policy_version 49130 (0.0007) +[2023-10-13 02:17:17,194][46662] Updated weights for policy 0, policy_version 49140 (0.0007) +[2023-10-13 02:17:17,562][46662] Updated weights for policy 0, policy_version 49150 (0.0008) +[2023-10-13 02:17:18,352][46663] Updated weights for policy 1, policy_version 49091 (0.0009) +[2023-10-13 02:17:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100597760. Throughput: 0: 1686.2, 1: 1679.6. Samples: 25158166. 
Policy #0 lag: (min: 21.0, avg: 24.3, max: 53.0) +[2023-10-13 02:17:18,607][45375] Avg episode reward: [(0, '46.990'), (1, '57.250')] +[2023-10-13 02:17:18,728][46663] Updated weights for policy 1, policy_version 49101 (0.0007) +[2023-10-13 02:17:19,087][46663] Updated weights for policy 1, policy_version 49111 (0.0010) +[2023-10-13 02:17:21,638][46662] Updated weights for policy 0, policy_version 49160 (0.0008) +[2023-10-13 02:17:22,007][46662] Updated weights for policy 0, policy_version 49170 (0.0007) +[2023-10-13 02:17:22,381][46662] Updated weights for policy 0, policy_version 49180 (0.0008) +[2023-10-13 02:17:23,265][46663] Updated weights for policy 1, policy_version 49121 (0.0008) +[2023-10-13 02:17:23,607][45375] Fps is (10 sec: 13106.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100663296. Throughput: 0: 1669.3, 1: 1677.3. Samples: 25177780. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:23,609][45375] Avg episode reward: [(0, '46.880'), (1, '57.750')] +[2023-10-13 02:17:23,624][46663] Updated weights for policy 1, policy_version 49131 (0.0010) +[2023-10-13 02:17:23,989][46663] Updated weights for policy 1, policy_version 49141 (0.0010) +[2023-10-13 02:17:24,365][46663] Updated weights for policy 1, policy_version 49151 (0.0007) +[2023-10-13 02:17:26,426][46662] Updated weights for policy 0, policy_version 49190 (0.0011) +[2023-10-13 02:17:26,793][46662] Updated weights for policy 0, policy_version 49200 (0.0011) +[2023-10-13 02:17:27,157][46662] Updated weights for policy 0, policy_version 49210 (0.0009) +[2023-10-13 02:17:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100728832. Throughput: 0: 1697.3, 1: 1683.3. Samples: 25188346. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:28,607][45375] Avg episode reward: [(0, '46.440'), (1, '58.540')] +[2023-10-13 02:17:28,652][46663] Updated weights for policy 1, policy_version 49161 (0.0008) +[2023-10-13 02:17:29,015][46663] Updated weights for policy 1, policy_version 49171 (0.0008) +[2023-10-13 02:17:29,390][46663] Updated weights for policy 1, policy_version 49181 (0.0007) +[2023-10-13 02:17:31,284][46662] Updated weights for policy 0, policy_version 49220 (0.0010) +[2023-10-13 02:17:31,656][46662] Updated weights for policy 0, policy_version 49230 (0.0011) +[2023-10-13 02:17:32,036][46662] Updated weights for policy 0, policy_version 49240 (0.0011) +[2023-10-13 02:17:33,327][46663] Updated weights for policy 1, policy_version 49191 (0.0008) +[2023-10-13 02:17:33,606][45375] Fps is (10 sec: 13108.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100794368. Throughput: 0: 1670.0, 1: 1682.3. Samples: 25208288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:33,607][45375] Avg episode reward: [(0, '46.710'), (1, '58.620')] +[2023-10-13 02:17:33,695][46663] Updated weights for policy 1, policy_version 49201 (0.0009) +[2023-10-13 02:17:34,064][46663] Updated weights for policy 1, policy_version 49211 (0.0009) +[2023-10-13 02:17:36,124][46662] Updated weights for policy 0, policy_version 49250 (0.0010) +[2023-10-13 02:17:36,495][46662] Updated weights for policy 0, policy_version 49260 (0.0010) +[2023-10-13 02:17:36,863][46662] Updated weights for policy 0, policy_version 49270 (0.0010) +[2023-10-13 02:17:37,239][46662] Updated weights for policy 0, policy_version 49280 (0.0010) +[2023-10-13 02:17:38,165][46663] Updated weights for policy 1, policy_version 49221 (0.0009) +[2023-10-13 02:17:38,524][46663] Updated weights for policy 1, policy_version 49231 (0.0008) +[2023-10-13 02:17:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100859904. 
Throughput: 0: 1669.0, 1: 1676.4. Samples: 25227976. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:38,607][45375] Avg episode reward: [(0, '45.680'), (1, '58.970')] +[2023-10-13 02:17:38,893][46663] Updated weights for policy 1, policy_version 49241 (0.0008) +[2023-10-13 02:17:41,347][46662] Updated weights for policy 0, policy_version 49290 (0.0007) +[2023-10-13 02:17:41,712][46662] Updated weights for policy 0, policy_version 49300 (0.0007) +[2023-10-13 02:17:42,084][46662] Updated weights for policy 0, policy_version 49310 (0.0009) +[2023-10-13 02:17:42,850][46663] Updated weights for policy 1, policy_version 49251 (0.0009) +[2023-10-13 02:17:43,222][46663] Updated weights for policy 1, policy_version 49261 (0.0009) +[2023-10-13 02:17:43,588][46663] Updated weights for policy 1, policy_version 49271 (0.0008) +[2023-10-13 02:17:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100925440. Throughput: 0: 1683.8, 1: 1690.6. Samples: 25239004. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:43,607][45375] Avg episode reward: [(0, '45.210'), (1, '58.460')] +[2023-10-13 02:17:46,126][46662] Updated weights for policy 0, policy_version 49320 (0.0010) +[2023-10-13 02:17:46,499][46662] Updated weights for policy 0, policy_version 49330 (0.0008) +[2023-10-13 02:17:46,864][46662] Updated weights for policy 0, policy_version 49340 (0.0009) +[2023-10-13 02:17:47,675][46663] Updated weights for policy 1, policy_version 49281 (0.0007) +[2023-10-13 02:17:48,037][46663] Updated weights for policy 1, policy_version 49291 (0.0007) +[2023-10-13 02:17:48,403][46663] Updated weights for policy 1, policy_version 49301 (0.0010) +[2023-10-13 02:17:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100990976. Throughput: 0: 1662.7, 1: 1691.4. Samples: 25258846. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:48,607][45375] Avg episode reward: [(0, '46.440'), (1, '58.070')] +[2023-10-13 02:17:48,773][46663] Updated weights for policy 1, policy_version 49311 (0.0009) +[2023-10-13 02:17:50,955][46662] Updated weights for policy 0, policy_version 49350 (0.0008) +[2023-10-13 02:17:51,338][46662] Updated weights for policy 0, policy_version 49360 (0.0009) +[2023-10-13 02:17:51,718][46662] Updated weights for policy 0, policy_version 49370 (0.0010) +[2023-10-13 02:17:52,827][46663] Updated weights for policy 1, policy_version 49321 (0.0009) +[2023-10-13 02:17:53,192][46663] Updated weights for policy 1, policy_version 49331 (0.0008) +[2023-10-13 02:17:53,565][46663] Updated weights for policy 1, policy_version 49341 (0.0008) +[2023-10-13 02:17:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 101056512. Throughput: 0: 1680.6, 1: 1667.1. Samples: 25278226. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:53,607][45375] Avg episode reward: [(0, '45.950'), (1, '58.020')] +[2023-10-13 02:17:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000049376_50561024.pth... +[2023-10-13 02:17:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000047808_48955392.pth +[2023-10-13 02:17:53,667][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000049344_50528256.pth... 
+[2023-10-13 02:17:53,705][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000047776_48922624.pth +[2023-10-13 02:17:55,757][46662] Updated weights for policy 0, policy_version 49380 (0.0008) +[2023-10-13 02:17:56,128][46662] Updated weights for policy 0, policy_version 49390 (0.0007) +[2023-10-13 02:17:56,496][46662] Updated weights for policy 0, policy_version 49400 (0.0007) +[2023-10-13 02:17:57,576][46663] Updated weights for policy 1, policy_version 49351 (0.0008) +[2023-10-13 02:17:57,951][46663] Updated weights for policy 1, policy_version 49361 (0.0009) +[2023-10-13 02:17:58,310][46663] Updated weights for policy 1, policy_version 49371 (0.0009) +[2023-10-13 02:17:58,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101154816. Throughput: 0: 1679.2, 1: 1681.8. Samples: 25288908. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:17:58,607][45375] Avg episode reward: [(0, '47.360'), (1, '57.680')] +[2023-10-13 02:18:00,635][46662] Updated weights for policy 0, policy_version 49410 (0.0009) +[2023-10-13 02:18:01,003][46662] Updated weights for policy 0, policy_version 49420 (0.0009) +[2023-10-13 02:18:01,380][46662] Updated weights for policy 0, policy_version 49430 (0.0009) +[2023-10-13 02:18:01,740][46662] Updated weights for policy 0, policy_version 49440 (0.0008) +[2023-10-13 02:18:02,281][46663] Updated weights for policy 1, policy_version 49381 (0.0009) +[2023-10-13 02:18:02,646][46663] Updated weights for policy 1, policy_version 49391 (0.0010) +[2023-10-13 02:18:03,023][46663] Updated weights for policy 1, policy_version 49401 (0.0009) +[2023-10-13 02:18:03,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101220352. Throughput: 0: 1658.2, 1: 1679.0. Samples: 25308342. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:03,608][45375] Avg episode reward: [(0, '45.830'), (1, '56.140')] +[2023-10-13 02:18:05,718][46662] Updated weights for policy 0, policy_version 49450 (0.0008) +[2023-10-13 02:18:06,080][46662] Updated weights for policy 0, policy_version 49460 (0.0010) +[2023-10-13 02:18:06,452][46662] Updated weights for policy 0, policy_version 49470 (0.0008) +[2023-10-13 02:18:06,894][46663] Updated weights for policy 1, policy_version 49411 (0.0009) +[2023-10-13 02:18:07,266][46663] Updated weights for policy 1, policy_version 49421 (0.0011) +[2023-10-13 02:18:07,628][46663] Updated weights for policy 1, policy_version 49431 (0.0008) +[2023-10-13 02:18:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101285888. Throughput: 0: 1682.8, 1: 1664.1. Samples: 25328390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:08,607][45375] Avg episode reward: [(0, '46.060'), (1, '54.410')] +[2023-10-13 02:18:10,432][46662] Updated weights for policy 0, policy_version 49480 (0.0007) +[2023-10-13 02:18:10,806][46662] Updated weights for policy 0, policy_version 49490 (0.0009) +[2023-10-13 02:18:11,189][46662] Updated weights for policy 0, policy_version 49500 (0.0011) +[2023-10-13 02:18:11,666][46663] Updated weights for policy 1, policy_version 49441 (0.0010) +[2023-10-13 02:18:12,045][46663] Updated weights for policy 1, policy_version 49451 (0.0007) +[2023-10-13 02:18:12,402][46663] Updated weights for policy 1, policy_version 49461 (0.0008) +[2023-10-13 02:18:12,773][46663] Updated weights for policy 1, policy_version 49471 (0.0008) +[2023-10-13 02:18:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101351424. Throughput: 0: 1666.2, 1: 1688.6. Samples: 25339314. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:13,607][45375] Avg episode reward: [(0, '46.930'), (1, '53.120')] +[2023-10-13 02:18:15,312][46662] Updated weights for policy 0, policy_version 49510 (0.0010) +[2023-10-13 02:18:15,683][46662] Updated weights for policy 0, policy_version 49520 (0.0010) +[2023-10-13 02:18:16,048][46662] Updated weights for policy 0, policy_version 49530 (0.0010) +[2023-10-13 02:18:16,923][46663] Updated weights for policy 1, policy_version 49481 (0.0008) +[2023-10-13 02:18:17,301][46663] Updated weights for policy 1, policy_version 49491 (0.0008) +[2023-10-13 02:18:17,673][46663] Updated weights for policy 1, policy_version 49501 (0.0008) +[2023-10-13 02:18:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101416960. Throughput: 0: 1674.0, 1: 1667.7. Samples: 25358662. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:18,607][45375] Avg episode reward: [(0, '46.480'), (1, '52.030')] +[2023-10-13 02:18:20,052][46662] Updated weights for policy 0, policy_version 49540 (0.0009) +[2023-10-13 02:18:20,421][46662] Updated weights for policy 0, policy_version 49550 (0.0008) +[2023-10-13 02:18:20,795][46662] Updated weights for policy 0, policy_version 49560 (0.0009) +[2023-10-13 02:18:22,054][46663] Updated weights for policy 1, policy_version 49511 (0.0008) +[2023-10-13 02:18:22,422][46663] Updated weights for policy 1, policy_version 49521 (0.0010) +[2023-10-13 02:18:22,791][46663] Updated weights for policy 1, policy_version 49531 (0.0010) +[2023-10-13 02:18:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 101482496. Throughput: 0: 1692.9, 1: 1660.2. Samples: 25378866. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:23,608][45375] Avg episode reward: [(0, '46.030'), (1, '50.900')] +[2023-10-13 02:18:24,872][46662] Updated weights for policy 0, policy_version 49570 (0.0009) +[2023-10-13 02:18:25,237][46662] Updated weights for policy 0, policy_version 49580 (0.0009) +[2023-10-13 02:18:25,608][46662] Updated weights for policy 0, policy_version 49590 (0.0009) +[2023-10-13 02:18:25,974][46662] Updated weights for policy 0, policy_version 49600 (0.0011) +[2023-10-13 02:18:26,725][46663] Updated weights for policy 1, policy_version 49541 (0.0007) +[2023-10-13 02:18:27,089][46663] Updated weights for policy 1, policy_version 49551 (0.0007) +[2023-10-13 02:18:27,451][46663] Updated weights for policy 1, policy_version 49561 (0.0010) +[2023-10-13 02:18:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101548032. Throughput: 0: 1665.0, 1: 1678.2. Samples: 25389446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:28,607][45375] Avg episode reward: [(0, '47.220'), (1, '48.920')] +[2023-10-13 02:18:29,946][46662] Updated weights for policy 0, policy_version 49610 (0.0007) +[2023-10-13 02:18:30,317][46662] Updated weights for policy 0, policy_version 49620 (0.0007) +[2023-10-13 02:18:30,678][46662] Updated weights for policy 0, policy_version 49630 (0.0008) +[2023-10-13 02:18:31,566][46663] Updated weights for policy 1, policy_version 49571 (0.0008) +[2023-10-13 02:18:31,930][46663] Updated weights for policy 1, policy_version 49581 (0.0007) +[2023-10-13 02:18:32,299][46663] Updated weights for policy 1, policy_version 49591 (0.0007) +[2023-10-13 02:18:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101613568. Throughput: 0: 1680.9, 1: 1652.4. Samples: 25408846. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:33,607][45375] Avg episode reward: [(0, '48.140'), (1, '48.440')] +[2023-10-13 02:18:34,786][46662] Updated weights for policy 0, policy_version 49640 (0.0009) +[2023-10-13 02:18:35,162][46662] Updated weights for policy 0, policy_version 49650 (0.0008) +[2023-10-13 02:18:35,522][46662] Updated weights for policy 0, policy_version 49660 (0.0009) +[2023-10-13 02:18:36,497][46663] Updated weights for policy 1, policy_version 49601 (0.0011) +[2023-10-13 02:18:36,859][46663] Updated weights for policy 1, policy_version 49611 (0.0008) +[2023-10-13 02:18:37,232][46663] Updated weights for policy 1, policy_version 49621 (0.0010) +[2023-10-13 02:18:37,600][46663] Updated weights for policy 1, policy_version 49631 (0.0008) +[2023-10-13 02:18:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101679104. Throughput: 0: 1693.6, 1: 1659.6. Samples: 25429116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:38,607][45375] Avg episode reward: [(0, '48.140'), (1, '48.120')] +[2023-10-13 02:18:39,765][46662] Updated weights for policy 0, policy_version 49670 (0.0007) +[2023-10-13 02:18:40,156][46662] Updated weights for policy 0, policy_version 49680 (0.0007) +[2023-10-13 02:18:40,522][46662] Updated weights for policy 0, policy_version 49690 (0.0007) +[2023-10-13 02:18:41,734][46663] Updated weights for policy 1, policy_version 49641 (0.0009) +[2023-10-13 02:18:42,104][46663] Updated weights for policy 1, policy_version 49651 (0.0008) +[2023-10-13 02:18:42,473][46663] Updated weights for policy 1, policy_version 49661 (0.0007) +[2023-10-13 02:18:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101744640. Throughput: 0: 1666.2, 1: 1678.0. Samples: 25439394. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:43,607][45375] Avg episode reward: [(0, '49.240'), (1, '48.620')] +[2023-10-13 02:18:44,468][46662] Updated weights for policy 0, policy_version 49700 (0.0007) +[2023-10-13 02:18:44,842][46662] Updated weights for policy 0, policy_version 49710 (0.0007) +[2023-10-13 02:18:45,211][46662] Updated weights for policy 0, policy_version 49720 (0.0010) +[2023-10-13 02:18:46,603][46663] Updated weights for policy 1, policy_version 49671 (0.0010) +[2023-10-13 02:18:46,972][46663] Updated weights for policy 1, policy_version 49681 (0.0011) +[2023-10-13 02:18:47,341][46663] Updated weights for policy 1, policy_version 49691 (0.0011) +[2023-10-13 02:18:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101810176. Throughput: 0: 1693.7, 1: 1656.1. Samples: 25459086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:48,607][45375] Avg episode reward: [(0, '50.280'), (1, '49.230')] +[2023-10-13 02:18:49,228][46662] Updated weights for policy 0, policy_version 49730 (0.0010) +[2023-10-13 02:18:49,588][46662] Updated weights for policy 0, policy_version 49740 (0.0007) +[2023-10-13 02:18:49,960][46662] Updated weights for policy 0, policy_version 49750 (0.0008) +[2023-10-13 02:18:50,326][46662] Updated weights for policy 0, policy_version 49760 (0.0008) +[2023-10-13 02:18:51,535][46663] Updated weights for policy 1, policy_version 49701 (0.0009) +[2023-10-13 02:18:51,904][46663] Updated weights for policy 1, policy_version 49711 (0.0007) +[2023-10-13 02:18:52,270][46663] Updated weights for policy 1, policy_version 49721 (0.0008) +[2023-10-13 02:18:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 101875712. Throughput: 0: 1689.5, 1: 1674.8. Samples: 25479782. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:53,607][45375] Avg episode reward: [(0, '49.510'), (1, '49.260')] +[2023-10-13 02:18:54,330][46662] Updated weights for policy 0, policy_version 49770 (0.0007) +[2023-10-13 02:18:54,697][46662] Updated weights for policy 0, policy_version 49780 (0.0008) +[2023-10-13 02:18:55,065][46662] Updated weights for policy 0, policy_version 49790 (0.0008) +[2023-10-13 02:18:56,040][46663] Updated weights for policy 1, policy_version 49731 (0.0008) +[2023-10-13 02:18:56,411][46663] Updated weights for policy 1, policy_version 49741 (0.0009) +[2023-10-13 02:18:56,782][46663] Updated weights for policy 1, policy_version 49751 (0.0009) +[2023-10-13 02:18:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 101941248. Throughput: 0: 1676.7, 1: 1670.1. Samples: 25489922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:18:58,607][45375] Avg episode reward: [(0, '48.220'), (1, '49.520')] +[2023-10-13 02:18:59,242][46662] Updated weights for policy 0, policy_version 49800 (0.0009) +[2023-10-13 02:18:59,614][46662] Updated weights for policy 0, policy_version 49810 (0.0008) +[2023-10-13 02:18:59,981][46662] Updated weights for policy 0, policy_version 49820 (0.0008) +[2023-10-13 02:19:00,846][46663] Updated weights for policy 1, policy_version 49761 (0.0009) +[2023-10-13 02:19:01,206][46663] Updated weights for policy 1, policy_version 49771 (0.0007) +[2023-10-13 02:19:01,581][46663] Updated weights for policy 1, policy_version 49781 (0.0009) +[2023-10-13 02:19:01,954][46663] Updated weights for policy 1, policy_version 49791 (0.0007) +[2023-10-13 02:19:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102006784. Throughput: 0: 1687.0, 1: 1669.1. Samples: 25509690. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:03,608][45375] Avg episode reward: [(0, '47.770'), (1, '49.710')] +[2023-10-13 02:19:04,030][46662] Updated weights for policy 0, policy_version 49830 (0.0007) +[2023-10-13 02:19:04,400][46662] Updated weights for policy 0, policy_version 49840 (0.0007) +[2023-10-13 02:19:04,784][46662] Updated weights for policy 0, policy_version 49850 (0.0007) +[2023-10-13 02:19:06,136][46663] Updated weights for policy 1, policy_version 49801 (0.0008) +[2023-10-13 02:19:06,511][46663] Updated weights for policy 1, policy_version 49811 (0.0007) +[2023-10-13 02:19:06,881][46663] Updated weights for policy 1, policy_version 49821 (0.0008) +[2023-10-13 02:19:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102072320. Throughput: 0: 1687.8, 1: 1678.2. Samples: 25530338. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:08,607][45375] Avg episode reward: [(0, '47.250'), (1, '48.180')] +[2023-10-13 02:19:08,778][46662] Updated weights for policy 0, policy_version 49860 (0.0010) +[2023-10-13 02:19:09,150][46662] Updated weights for policy 0, policy_version 49870 (0.0008) +[2023-10-13 02:19:09,526][46662] Updated weights for policy 0, policy_version 49880 (0.0007) +[2023-10-13 02:19:10,794][46663] Updated weights for policy 1, policy_version 49831 (0.0009) +[2023-10-13 02:19:11,152][46663] Updated weights for policy 1, policy_version 49841 (0.0009) +[2023-10-13 02:19:11,532][46663] Updated weights for policy 1, policy_version 49851 (0.0009) +[2023-10-13 02:19:13,555][46662] Updated weights for policy 0, policy_version 49890 (0.0008) +[2023-10-13 02:19:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102137856. Throughput: 0: 1683.5, 1: 1657.6. Samples: 25539792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:13,607][45375] Avg episode reward: [(0, '47.580'), (1, '47.030')] +[2023-10-13 02:19:13,919][46662] Updated weights for policy 0, policy_version 49900 (0.0008) +[2023-10-13 02:19:14,297][46662] Updated weights for policy 0, policy_version 49910 (0.0008) +[2023-10-13 02:19:14,673][46662] Updated weights for policy 0, policy_version 49920 (0.0008) +[2023-10-13 02:19:15,641][46663] Updated weights for policy 1, policy_version 49861 (0.0008) +[2023-10-13 02:19:16,002][46663] Updated weights for policy 1, policy_version 49871 (0.0007) +[2023-10-13 02:19:16,368][46663] Updated weights for policy 1, policy_version 49881 (0.0009) +[2023-10-13 02:19:18,555][46662] Updated weights for policy 0, policy_version 49930 (0.0008) +[2023-10-13 02:19:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102203392. Throughput: 0: 1693.0, 1: 1668.2. Samples: 25560098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:18,607][45375] Avg episode reward: [(0, '47.880'), (1, '47.080')] +[2023-10-13 02:19:18,925][46662] Updated weights for policy 0, policy_version 49940 (0.0008) +[2023-10-13 02:19:19,292][46662] Updated weights for policy 0, policy_version 49950 (0.0009) +[2023-10-13 02:19:20,459][46663] Updated weights for policy 1, policy_version 49891 (0.0010) +[2023-10-13 02:19:20,826][46663] Updated weights for policy 1, policy_version 49901 (0.0009) +[2023-10-13 02:19:21,192][46663] Updated weights for policy 1, policy_version 49911 (0.0008) +[2023-10-13 02:19:23,314][46662] Updated weights for policy 0, policy_version 49960 (0.0007) +[2023-10-13 02:19:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 102268928. Throughput: 0: 1694.2, 1: 1680.9. Samples: 25580994. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:23,607][45375] Avg episode reward: [(0, '47.110'), (1, '48.190')] +[2023-10-13 02:19:23,682][46662] Updated weights for policy 0, policy_version 49970 (0.0009) +[2023-10-13 02:19:24,061][46662] Updated weights for policy 0, policy_version 49980 (0.0008) +[2023-10-13 02:19:25,251][46663] Updated weights for policy 1, policy_version 49921 (0.0007) +[2023-10-13 02:19:25,615][46663] Updated weights for policy 1, policy_version 49931 (0.0007) +[2023-10-13 02:19:25,979][46663] Updated weights for policy 1, policy_version 49941 (0.0010) +[2023-10-13 02:19:26,359][46663] Updated weights for policy 1, policy_version 49951 (0.0008) +[2023-10-13 02:19:28,237][46662] Updated weights for policy 0, policy_version 49990 (0.0009) +[2023-10-13 02:19:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102334464. Throughput: 0: 1698.2, 1: 1651.5. Samples: 25590130. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:28,607][45375] Avg episode reward: [(0, '47.800'), (1, '48.770')] +[2023-10-13 02:19:28,615][46662] Updated weights for policy 0, policy_version 50000 (0.0009) +[2023-10-13 02:19:28,989][46662] Updated weights for policy 0, policy_version 50010 (0.0009) +[2023-10-13 02:19:30,398][46663] Updated weights for policy 1, policy_version 49961 (0.0008) +[2023-10-13 02:19:30,768][46663] Updated weights for policy 1, policy_version 49971 (0.0009) +[2023-10-13 02:19:31,142][46663] Updated weights for policy 1, policy_version 49981 (0.0008) +[2023-10-13 02:19:33,248][46662] Updated weights for policy 0, policy_version 50020 (0.0007) +[2023-10-13 02:19:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102400000. Throughput: 0: 1690.5, 1: 1675.4. Samples: 25610554. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:33,607][45375] Avg episode reward: [(0, '46.090'), (1, '50.130')] +[2023-10-13 02:19:33,637][46662] Updated weights for policy 0, policy_version 50030 (0.0010) +[2023-10-13 02:19:33,992][46662] Updated weights for policy 0, policy_version 50040 (0.0008) +[2023-10-13 02:19:35,236][46663] Updated weights for policy 1, policy_version 49991 (0.0009) +[2023-10-13 02:19:35,590][46663] Updated weights for policy 1, policy_version 50001 (0.0008) +[2023-10-13 02:19:35,958][46663] Updated weights for policy 1, policy_version 50011 (0.0008) +[2023-10-13 02:19:37,992][46662] Updated weights for policy 0, policy_version 50050 (0.0008) +[2023-10-13 02:19:38,357][46662] Updated weights for policy 0, policy_version 50060 (0.0009) +[2023-10-13 02:19:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102465536. Throughput: 0: 1688.3, 1: 1682.2. Samples: 25631454. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:38,607][45375] Avg episode reward: [(0, '46.550'), (1, '50.260')] +[2023-10-13 02:19:38,734][46662] Updated weights for policy 0, policy_version 50070 (0.0007) +[2023-10-13 02:19:39,107][46662] Updated weights for policy 0, policy_version 50080 (0.0010) +[2023-10-13 02:19:40,237][46663] Updated weights for policy 1, policy_version 50021 (0.0010) +[2023-10-13 02:19:40,603][46663] Updated weights for policy 1, policy_version 50031 (0.0009) +[2023-10-13 02:19:40,965][46663] Updated weights for policy 1, policy_version 50041 (0.0012) +[2023-10-13 02:19:43,022][46662] Updated weights for policy 0, policy_version 50090 (0.0009) +[2023-10-13 02:19:43,389][46662] Updated weights for policy 0, policy_version 50100 (0.0009) +[2023-10-13 02:19:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102531072. Throughput: 0: 1687.2, 1: 1654.1. Samples: 25640280. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:43,607][45375] Avg episode reward: [(0, '46.730'), (1, '50.200')] +[2023-10-13 02:19:43,755][46662] Updated weights for policy 0, policy_version 50110 (0.0010) +[2023-10-13 02:19:45,062][46663] Updated weights for policy 1, policy_version 50051 (0.0008) +[2023-10-13 02:19:45,431][46663] Updated weights for policy 1, policy_version 50061 (0.0008) +[2023-10-13 02:19:45,784][46663] Updated weights for policy 1, policy_version 50071 (0.0009) +[2023-10-13 02:19:47,892][46662] Updated weights for policy 0, policy_version 50120 (0.0010) +[2023-10-13 02:19:48,263][46662] Updated weights for policy 0, policy_version 50130 (0.0010) +[2023-10-13 02:19:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102596608. Throughput: 0: 1692.0, 1: 1671.5. Samples: 25661048. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:48,607][45375] Avg episode reward: [(0, '44.330'), (1, '48.380')] +[2023-10-13 02:19:48,629][46662] Updated weights for policy 0, policy_version 50140 (0.0008) +[2023-10-13 02:19:49,923][46663] Updated weights for policy 1, policy_version 50081 (0.0009) +[2023-10-13 02:19:50,287][46663] Updated weights for policy 1, policy_version 50091 (0.0009) +[2023-10-13 02:19:50,651][46663] Updated weights for policy 1, policy_version 50101 (0.0009) +[2023-10-13 02:19:51,017][46663] Updated weights for policy 1, policy_version 50111 (0.0010) +[2023-10-13 02:19:52,800][46662] Updated weights for policy 0, policy_version 50150 (0.0008) +[2023-10-13 02:19:53,162][46662] Updated weights for policy 0, policy_version 50160 (0.0008) +[2023-10-13 02:19:53,534][46662] Updated weights for policy 0, policy_version 50170 (0.0008) +[2023-10-13 02:19:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 102662144. Throughput: 0: 1676.0, 1: 1673.5. Samples: 25681066. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:53,608][45375] Avg episode reward: [(0, '46.800'), (1, '47.720')] +[2023-10-13 02:19:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000050112_51314688.pth... +[2023-10-13 02:19:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000048544_49709056.pth +[2023-10-13 02:19:53,759][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000050176_51380224.pth... +[2023-10-13 02:19:53,787][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000048576_49741824.pth +[2023-10-13 02:19:55,282][46663] Updated weights for policy 1, policy_version 50121 (0.0009) +[2023-10-13 02:19:55,661][46663] Updated weights for policy 1, policy_version 50131 (0.0007) +[2023-10-13 02:19:56,040][46663] Updated weights for policy 1, policy_version 50141 (0.0009) +[2023-10-13 02:19:57,707][46662] Updated weights for policy 0, policy_version 50180 (0.0007) +[2023-10-13 02:19:58,083][46662] Updated weights for policy 0, policy_version 50190 (0.0007) +[2023-10-13 02:19:58,463][46662] Updated weights for policy 0, policy_version 50200 (0.0009) +[2023-10-13 02:19:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102727680. Throughput: 0: 1681.0, 1: 1664.1. Samples: 25690320. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:19:58,607][45375] Avg episode reward: [(0, '48.250'), (1, '47.640')] +[2023-10-13 02:20:00,001][46663] Updated weights for policy 1, policy_version 50151 (0.0008) +[2023-10-13 02:20:00,378][46663] Updated weights for policy 1, policy_version 50161 (0.0008) +[2023-10-13 02:20:00,748][46663] Updated weights for policy 1, policy_version 50171 (0.0008) +[2023-10-13 02:20:02,498][46662] Updated weights for policy 0, policy_version 50210 (0.0008) +[2023-10-13 02:20:02,867][46662] Updated weights for policy 0, policy_version 50220 (0.0008) +[2023-10-13 02:20:03,235][46662] Updated weights for policy 0, policy_version 50230 (0.0007) +[2023-10-13 02:20:03,604][46662] Updated weights for policy 0, policy_version 50240 (0.0007) +[2023-10-13 02:20:03,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102825984. Throughput: 0: 1674.5, 1: 1675.7. Samples: 25710860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:20:03,608][45375] Avg episode reward: [(0, '49.130'), (1, '49.890')] +[2023-10-13 02:20:04,864][46663] Updated weights for policy 1, policy_version 50181 (0.0009) +[2023-10-13 02:20:05,237][46663] Updated weights for policy 1, policy_version 50191 (0.0008) +[2023-10-13 02:20:05,594][46663] Updated weights for policy 1, policy_version 50201 (0.0010) +[2023-10-13 02:20:07,595][46662] Updated weights for policy 0, policy_version 50250 (0.0008) +[2023-10-13 02:20:07,960][46662] Updated weights for policy 0, policy_version 50260 (0.0007) +[2023-10-13 02:20:08,326][46662] Updated weights for policy 0, policy_version 50270 (0.0010) +[2023-10-13 02:20:08,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102891520. Throughput: 0: 1657.1, 1: 1674.7. Samples: 25730924. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:20:08,607][45375] Avg episode reward: [(0, '50.320'), (1, '50.850')] +[2023-10-13 02:20:09,719][46663] Updated weights for policy 1, policy_version 50211 (0.0011) +[2023-10-13 02:20:10,075][46663] Updated weights for policy 1, policy_version 50221 (0.0010) +[2023-10-13 02:20:10,449][46663] Updated weights for policy 1, policy_version 50231 (0.0008) +[2023-10-13 02:20:12,569][46662] Updated weights for policy 0, policy_version 50280 (0.0007) +[2023-10-13 02:20:12,945][46662] Updated weights for policy 0, policy_version 50290 (0.0008) +[2023-10-13 02:20:13,310][46662] Updated weights for policy 0, policy_version 50300 (0.0008) +[2023-10-13 02:20:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102957056. Throughput: 0: 1674.7, 1: 1670.8. Samples: 25740680. Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:13,607][45375] Avg episode reward: [(0, '52.260'), (1, '50.630')] +[2023-10-13 02:20:14,452][46663] Updated weights for policy 1, policy_version 50241 (0.0009) +[2023-10-13 02:20:14,824][46663] Updated weights for policy 1, policy_version 50251 (0.0007) +[2023-10-13 02:20:15,184][46663] Updated weights for policy 1, policy_version 50261 (0.0008) +[2023-10-13 02:20:15,549][46663] Updated weights for policy 1, policy_version 50271 (0.0008) +[2023-10-13 02:20:17,399][46662] Updated weights for policy 0, policy_version 50310 (0.0008) +[2023-10-13 02:20:17,788][46662] Updated weights for policy 0, policy_version 50320 (0.0009) +[2023-10-13 02:20:18,152][46662] Updated weights for policy 0, policy_version 50330 (0.0008) +[2023-10-13 02:20:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103022592. Throughput: 0: 1681.3, 1: 1673.6. Samples: 25761524. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:18,607][45375] Avg episode reward: [(0, '51.650'), (1, '52.070')] +[2023-10-13 02:20:19,673][46663] Updated weights for policy 1, policy_version 50281 (0.0008) +[2023-10-13 02:20:20,034][46663] Updated weights for policy 1, policy_version 50291 (0.0010) +[2023-10-13 02:20:20,414][46663] Updated weights for policy 1, policy_version 50301 (0.0008) +[2023-10-13 02:20:22,163][46662] Updated weights for policy 0, policy_version 50340 (0.0007) +[2023-10-13 02:20:22,533][46662] Updated weights for policy 0, policy_version 50350 (0.0009) +[2023-10-13 02:20:22,895][46662] Updated weights for policy 0, policy_version 50360 (0.0009) +[2023-10-13 02:20:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103088128. Throughput: 0: 1659.5, 1: 1671.3. Samples: 25781344. Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:23,608][45375] Avg episode reward: [(0, '50.050'), (1, '53.510')] +[2023-10-13 02:20:24,375][46663] Updated weights for policy 1, policy_version 50311 (0.0008) +[2023-10-13 02:20:24,736][46663] Updated weights for policy 1, policy_version 50321 (0.0007) +[2023-10-13 02:20:25,105][46663] Updated weights for policy 1, policy_version 50331 (0.0008) +[2023-10-13 02:20:27,028][46662] Updated weights for policy 0, policy_version 50370 (0.0010) +[2023-10-13 02:20:27,393][46662] Updated weights for policy 0, policy_version 50380 (0.0008) +[2023-10-13 02:20:27,754][46662] Updated weights for policy 0, policy_version 50390 (0.0009) +[2023-10-13 02:20:28,128][46662] Updated weights for policy 0, policy_version 50400 (0.0009) +[2023-10-13 02:20:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103153664. Throughput: 0: 1681.3, 1: 1679.2. Samples: 25791502. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:28,607][45375] Avg episode reward: [(0, '49.770'), (1, '54.140')] +[2023-10-13 02:20:29,096][46663] Updated weights for policy 1, policy_version 50341 (0.0011) +[2023-10-13 02:20:29,470][46663] Updated weights for policy 1, policy_version 50351 (0.0010) +[2023-10-13 02:20:29,843][46663] Updated weights for policy 1, policy_version 50361 (0.0008) +[2023-10-13 02:20:32,093][46662] Updated weights for policy 0, policy_version 50410 (0.0007) +[2023-10-13 02:20:32,467][46662] Updated weights for policy 0, policy_version 50420 (0.0009) +[2023-10-13 02:20:32,824][46662] Updated weights for policy 0, policy_version 50430 (0.0011) +[2023-10-13 02:20:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103219200. Throughput: 0: 1675.1, 1: 1682.4. Samples: 25812136. Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:33,607][45375] Avg episode reward: [(0, '48.800'), (1, '53.940')] +[2023-10-13 02:20:33,980][46663] Updated weights for policy 1, policy_version 50371 (0.0008) +[2023-10-13 02:20:34,364][46663] Updated weights for policy 1, policy_version 50381 (0.0009) +[2023-10-13 02:20:34,732][46663] Updated weights for policy 1, policy_version 50391 (0.0008) +[2023-10-13 02:20:36,816][46662] Updated weights for policy 0, policy_version 50440 (0.0010) +[2023-10-13 02:20:37,186][46662] Updated weights for policy 0, policy_version 50450 (0.0009) +[2023-10-13 02:20:37,552][46662] Updated weights for policy 0, policy_version 50460 (0.0010) +[2023-10-13 02:20:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103284736. Throughput: 0: 1658.1, 1: 1688.2. Samples: 25831650. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:38,607][45375] Avg episode reward: [(0, '49.830'), (1, '54.250')] +[2023-10-13 02:20:38,715][46663] Updated weights for policy 1, policy_version 50401 (0.0008) +[2023-10-13 02:20:39,084][46663] Updated weights for policy 1, policy_version 50411 (0.0007) +[2023-10-13 02:20:39,455][46663] Updated weights for policy 1, policy_version 50421 (0.0007) +[2023-10-13 02:20:39,828][46663] Updated weights for policy 1, policy_version 50431 (0.0007) +[2023-10-13 02:20:41,629][46662] Updated weights for policy 0, policy_version 50470 (0.0008) +[2023-10-13 02:20:41,997][46662] Updated weights for policy 0, policy_version 50480 (0.0010) +[2023-10-13 02:20:42,376][46662] Updated weights for policy 0, policy_version 50490 (0.0008) +[2023-10-13 02:20:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103350272. Throughput: 0: 1683.5, 1: 1689.1. Samples: 25842086. Policy #0 lag: (min: 22.0, avg: 22.0, max: 26.0) +[2023-10-13 02:20:43,608][45375] Avg episode reward: [(0, '49.950'), (1, '54.030')] +[2023-10-13 02:20:44,087][46663] Updated weights for policy 1, policy_version 50441 (0.0007) +[2023-10-13 02:20:44,456][46663] Updated weights for policy 1, policy_version 50451 (0.0007) +[2023-10-13 02:20:44,839][46663] Updated weights for policy 1, policy_version 50461 (0.0008) +[2023-10-13 02:20:46,630][46662] Updated weights for policy 0, policy_version 50500 (0.0011) +[2023-10-13 02:20:47,010][46662] Updated weights for policy 0, policy_version 50510 (0.0009) +[2023-10-13 02:20:47,381][46662] Updated weights for policy 0, policy_version 50520 (0.0008) +[2023-10-13 02:20:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103415808. Throughput: 0: 1677.4, 1: 1686.5. Samples: 25862234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:20:48,607][45375] Avg episode reward: [(0, '49.660'), (1, '54.950')] +[2023-10-13 02:20:48,920][46663] Updated weights for policy 1, policy_version 50471 (0.0009) +[2023-10-13 02:20:49,290][46663] Updated weights for policy 1, policy_version 50481 (0.0008) +[2023-10-13 02:20:49,656][46663] Updated weights for policy 1, policy_version 50491 (0.0008) +[2023-10-13 02:20:51,545][46662] Updated weights for policy 0, policy_version 50530 (0.0008) +[2023-10-13 02:20:51,922][46662] Updated weights for policy 0, policy_version 50540 (0.0010) +[2023-10-13 02:20:52,283][46662] Updated weights for policy 0, policy_version 50550 (0.0011) +[2023-10-13 02:20:52,654][46662] Updated weights for policy 0, policy_version 50560 (0.0011) +[2023-10-13 02:20:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 103481344. Throughput: 0: 1664.4, 1: 1681.6. Samples: 25881490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:20:53,608][45375] Avg episode reward: [(0, '49.700'), (1, '56.090')] +[2023-10-13 02:20:53,760][46663] Updated weights for policy 1, policy_version 50501 (0.0007) +[2023-10-13 02:20:54,125][46663] Updated weights for policy 1, policy_version 50511 (0.0007) +[2023-10-13 02:20:54,495][46663] Updated weights for policy 1, policy_version 50521 (0.0007) +[2023-10-13 02:20:56,791][46662] Updated weights for policy 0, policy_version 50570 (0.0008) +[2023-10-13 02:20:57,162][46662] Updated weights for policy 0, policy_version 50580 (0.0008) +[2023-10-13 02:20:57,535][46662] Updated weights for policy 0, policy_version 50590 (0.0008) +[2023-10-13 02:20:58,491][46663] Updated weights for policy 1, policy_version 50531 (0.0007) +[2023-10-13 02:20:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103546880. Throughput: 0: 1678.7, 1: 1684.0. Samples: 25892000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:20:58,607][45375] Avg episode reward: [(0, '49.120'), (1, '54.750')] +[2023-10-13 02:20:58,870][46663] Updated weights for policy 1, policy_version 50541 (0.0009) +[2023-10-13 02:20:59,240][46663] Updated weights for policy 1, policy_version 50551 (0.0010) +[2023-10-13 02:21:01,726][46662] Updated weights for policy 0, policy_version 50600 (0.0007) +[2023-10-13 02:21:02,111][46662] Updated weights for policy 0, policy_version 50610 (0.0007) +[2023-10-13 02:21:02,485][46662] Updated weights for policy 0, policy_version 50620 (0.0008) +[2023-10-13 02:21:03,396][46663] Updated weights for policy 1, policy_version 50561 (0.0008) +[2023-10-13 02:21:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103612416. Throughput: 0: 1662.8, 1: 1681.8. Samples: 25912028. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:03,607][45375] Avg episode reward: [(0, '48.980'), (1, '56.330')] +[2023-10-13 02:21:03,762][46663] Updated weights for policy 1, policy_version 50571 (0.0008) +[2023-10-13 02:21:04,145][46663] Updated weights for policy 1, policy_version 50581 (0.0007) +[2023-10-13 02:21:04,500][46663] Updated weights for policy 1, policy_version 50591 (0.0011) +[2023-10-13 02:21:06,532][46662] Updated weights for policy 0, policy_version 50630 (0.0008) +[2023-10-13 02:21:06,907][46662] Updated weights for policy 0, policy_version 50640 (0.0008) +[2023-10-13 02:21:07,269][46662] Updated weights for policy 0, policy_version 50650 (0.0010) +[2023-10-13 02:21:08,528][46663] Updated weights for policy 1, policy_version 50601 (0.0008) +[2023-10-13 02:21:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103677952. Throughput: 0: 1667.2, 1: 1672.1. Samples: 25931612. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:08,607][45375] Avg episode reward: [(0, '49.450'), (1, '56.010')] +[2023-10-13 02:21:08,894][46663] Updated weights for policy 1, policy_version 50611 (0.0009) +[2023-10-13 02:21:09,261][46663] Updated weights for policy 1, policy_version 50621 (0.0011) +[2023-10-13 02:21:11,147][46662] Updated weights for policy 0, policy_version 50660 (0.0008) +[2023-10-13 02:21:11,521][46662] Updated weights for policy 0, policy_version 50670 (0.0007) +[2023-10-13 02:21:11,894][46662] Updated weights for policy 0, policy_version 50680 (0.0007) +[2023-10-13 02:21:13,526][46663] Updated weights for policy 1, policy_version 50631 (0.0008) +[2023-10-13 02:21:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103743488. Throughput: 0: 1679.1, 1: 1670.8. Samples: 25942248. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:13,607][45375] Avg episode reward: [(0, '47.560'), (1, '56.490')] +[2023-10-13 02:21:13,885][46663] Updated weights for policy 1, policy_version 50641 (0.0007) +[2023-10-13 02:21:14,258][46663] Updated weights for policy 1, policy_version 50651 (0.0009) +[2023-10-13 02:21:15,943][46662] Updated weights for policy 0, policy_version 50690 (0.0007) +[2023-10-13 02:21:16,317][46662] Updated weights for policy 0, policy_version 50700 (0.0007) +[2023-10-13 02:21:16,688][46662] Updated weights for policy 0, policy_version 50710 (0.0007) +[2023-10-13 02:21:17,050][46662] Updated weights for policy 0, policy_version 50720 (0.0007) +[2023-10-13 02:21:18,313][46663] Updated weights for policy 1, policy_version 50661 (0.0009) +[2023-10-13 02:21:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 103809024. Throughput: 0: 1659.6, 1: 1674.3. Samples: 25962160. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:18,607][45375] Avg episode reward: [(0, '46.840'), (1, '55.580')] +[2023-10-13 02:21:18,679][46663] Updated weights for policy 1, policy_version 50671 (0.0009) +[2023-10-13 02:21:19,060][46663] Updated weights for policy 1, policy_version 50681 (0.0007) +[2023-10-13 02:21:21,133][46662] Updated weights for policy 0, policy_version 50730 (0.0009) +[2023-10-13 02:21:21,506][46662] Updated weights for policy 0, policy_version 50740 (0.0009) +[2023-10-13 02:21:21,882][46662] Updated weights for policy 0, policy_version 50750 (0.0008) +[2023-10-13 02:21:23,176][46663] Updated weights for policy 1, policy_version 50691 (0.0010) +[2023-10-13 02:21:23,561][46663] Updated weights for policy 1, policy_version 50701 (0.0007) +[2023-10-13 02:21:23,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103874560. Throughput: 0: 1681.5, 1: 1661.1. Samples: 25982068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:23,608][45375] Avg episode reward: [(0, '47.440'), (1, '54.740')] +[2023-10-13 02:21:23,926][46663] Updated weights for policy 1, policy_version 50711 (0.0009) +[2023-10-13 02:21:25,924][46662] Updated weights for policy 0, policy_version 50760 (0.0009) +[2023-10-13 02:21:26,292][46662] Updated weights for policy 0, policy_version 50770 (0.0007) +[2023-10-13 02:21:26,663][46662] Updated weights for policy 0, policy_version 50780 (0.0009) +[2023-10-13 02:21:28,143][46663] Updated weights for policy 1, policy_version 50721 (0.0009) +[2023-10-13 02:21:28,564][46663] Updated weights for policy 1, policy_version 50731 (0.0008) +[2023-10-13 02:21:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103940096. Throughput: 0: 1676.9, 1: 1670.4. Samples: 25992714. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:28,607][45375] Avg episode reward: [(0, '47.140'), (1, '54.330')] +[2023-10-13 02:21:28,943][46663] Updated weights for policy 1, policy_version 50741 (0.0009) +[2023-10-13 02:21:29,308][46663] Updated weights for policy 1, policy_version 50751 (0.0008) +[2023-10-13 02:21:30,733][46662] Updated weights for policy 0, policy_version 50790 (0.0009) +[2023-10-13 02:21:31,100][46662] Updated weights for policy 0, policy_version 50800 (0.0010) +[2023-10-13 02:21:31,473][46662] Updated weights for policy 0, policy_version 50810 (0.0010) +[2023-10-13 02:21:33,307][46663] Updated weights for policy 1, policy_version 50761 (0.0008) +[2023-10-13 02:21:33,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104005632. Throughput: 0: 1658.1, 1: 1670.9. Samples: 26012038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:33,607][45375] Avg episode reward: [(0, '46.880'), (1, '54.460')] +[2023-10-13 02:21:33,671][46663] Updated weights for policy 1, policy_version 50771 (0.0009) +[2023-10-13 02:21:34,044][46663] Updated weights for policy 1, policy_version 50781 (0.0009) +[2023-10-13 02:21:35,413][46662] Updated weights for policy 0, policy_version 50820 (0.0009) +[2023-10-13 02:21:35,783][46662] Updated weights for policy 0, policy_version 50830 (0.0007) +[2023-10-13 02:21:36,145][46662] Updated weights for policy 0, policy_version 50840 (0.0007) +[2023-10-13 02:21:38,072][46663] Updated weights for policy 1, policy_version 50791 (0.0009) +[2023-10-13 02:21:38,446][46663] Updated weights for policy 1, policy_version 50801 (0.0011) +[2023-10-13 02:21:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104071168. Throughput: 0: 1686.4, 1: 1661.1. Samples: 26032126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:38,607][45375] Avg episode reward: [(0, '44.410'), (1, '53.500')] +[2023-10-13 02:21:38,809][46663] Updated weights for policy 1, policy_version 50811 (0.0007) +[2023-10-13 02:21:40,326][46662] Updated weights for policy 0, policy_version 50850 (0.0007) +[2023-10-13 02:21:40,696][46662] Updated weights for policy 0, policy_version 50860 (0.0010) +[2023-10-13 02:21:41,064][46662] Updated weights for policy 0, policy_version 50870 (0.0010) +[2023-10-13 02:21:41,435][46662] Updated weights for policy 0, policy_version 50880 (0.0010) +[2023-10-13 02:21:42,921][46663] Updated weights for policy 1, policy_version 50821 (0.0008) +[2023-10-13 02:21:43,290][46663] Updated weights for policy 1, policy_version 50831 (0.0008) +[2023-10-13 02:21:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 104136704. Throughput: 0: 1670.9, 1: 1675.4. Samples: 26042586. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:43,607][45375] Avg episode reward: [(0, '44.410'), (1, '52.870')] +[2023-10-13 02:21:43,664][46663] Updated weights for policy 1, policy_version 50841 (0.0009) +[2023-10-13 02:21:45,357][46662] Updated weights for policy 0, policy_version 50890 (0.0009) +[2023-10-13 02:21:45,726][46662] Updated weights for policy 0, policy_version 50900 (0.0009) +[2023-10-13 02:21:46,080][46662] Updated weights for policy 0, policy_version 50910 (0.0009) +[2023-10-13 02:21:47,764][46663] Updated weights for policy 1, policy_version 50851 (0.0007) +[2023-10-13 02:21:48,127][46663] Updated weights for policy 1, policy_version 50861 (0.0008) +[2023-10-13 02:21:48,488][46663] Updated weights for policy 1, policy_version 50871 (0.0008) +[2023-10-13 02:21:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 104202240. Throughput: 0: 1675.0, 1: 1671.0. Samples: 26062598. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:48,607][45375] Avg episode reward: [(0, '43.950'), (1, '52.140')] +[2023-10-13 02:21:50,181][46662] Updated weights for policy 0, policy_version 50920 (0.0008) +[2023-10-13 02:21:50,548][46662] Updated weights for policy 0, policy_version 50930 (0.0007) +[2023-10-13 02:21:50,912][46662] Updated weights for policy 0, policy_version 50940 (0.0009) +[2023-10-13 02:21:52,544][46663] Updated weights for policy 1, policy_version 50881 (0.0008) +[2023-10-13 02:21:52,916][46663] Updated weights for policy 1, policy_version 50891 (0.0007) +[2023-10-13 02:21:53,284][46663] Updated weights for policy 1, policy_version 50901 (0.0009) +[2023-10-13 02:21:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 104267776. Throughput: 0: 1686.4, 1: 1659.5. Samples: 26082182. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:53,608][45375] Avg episode reward: [(0, '44.900'), (1, '51.230')] +[2023-10-13 02:21:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000050944_52166656.pth... +[2023-10-13 02:21:53,641][46663] Updated weights for policy 1, policy_version 50911 (0.0009) +[2023-10-13 02:21:53,660][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000049376_50561024.pth +[2023-10-13 02:21:53,675][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000050912_52133888.pth... 
+[2023-10-13 02:21:53,713][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000049344_50528256.pth +[2023-10-13 02:21:54,917][46662] Updated weights for policy 0, policy_version 50950 (0.0007) +[2023-10-13 02:21:55,287][46662] Updated weights for policy 0, policy_version 50960 (0.0009) +[2023-10-13 02:21:55,658][46662] Updated weights for policy 0, policy_version 50970 (0.0010) +[2023-10-13 02:21:57,699][46663] Updated weights for policy 1, policy_version 50921 (0.0007) +[2023-10-13 02:21:58,058][46663] Updated weights for policy 1, policy_version 50931 (0.0007) +[2023-10-13 02:21:58,429][46663] Updated weights for policy 1, policy_version 50941 (0.0007) +[2023-10-13 02:21:58,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104366080. Throughput: 0: 1657.5, 1: 1677.3. Samples: 26092314. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:21:58,607][45375] Avg episode reward: [(0, '45.360'), (1, '49.790')] +[2023-10-13 02:21:59,775][46662] Updated weights for policy 0, policy_version 50980 (0.0010) +[2023-10-13 02:22:00,141][46662] Updated weights for policy 0, policy_version 50990 (0.0010) +[2023-10-13 02:22:00,522][46662] Updated weights for policy 0, policy_version 51000 (0.0010) +[2023-10-13 02:22:02,589][46663] Updated weights for policy 1, policy_version 50951 (0.0007) +[2023-10-13 02:22:02,955][46663] Updated weights for policy 1, policy_version 50961 (0.0008) +[2023-10-13 02:22:03,314][46663] Updated weights for policy 1, policy_version 50971 (0.0009) +[2023-10-13 02:22:03,607][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104431616. Throughput: 0: 1679.7, 1: 1670.4. Samples: 26112914. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:22:03,607][45375] Avg episode reward: [(0, '45.120'), (1, '49.730')] +[2023-10-13 02:22:04,492][46662] Updated weights for policy 0, policy_version 51010 (0.0008) +[2023-10-13 02:22:04,868][46662] Updated weights for policy 0, policy_version 51020 (0.0009) +[2023-10-13 02:22:05,232][46662] Updated weights for policy 0, policy_version 51030 (0.0009) +[2023-10-13 02:22:05,607][46662] Updated weights for policy 0, policy_version 51040 (0.0009) +[2023-10-13 02:22:07,491][46663] Updated weights for policy 1, policy_version 50981 (0.0009) +[2023-10-13 02:22:07,857][46663] Updated weights for policy 1, policy_version 50991 (0.0008) +[2023-10-13 02:22:08,226][46663] Updated weights for policy 1, policy_version 51001 (0.0008) +[2023-10-13 02:22:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 104497152. Throughput: 0: 1692.1, 1: 1656.0. Samples: 26132730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:22:08,607][45375] Avg episode reward: [(0, '44.120'), (1, '48.920')] +[2023-10-13 02:22:09,508][46662] Updated weights for policy 0, policy_version 51050 (0.0007) +[2023-10-13 02:22:09,879][46662] Updated weights for policy 0, policy_version 51060 (0.0009) +[2023-10-13 02:22:10,250][46662] Updated weights for policy 0, policy_version 51070 (0.0009) +[2023-10-13 02:22:12,297][46663] Updated weights for policy 1, policy_version 51011 (0.0009) +[2023-10-13 02:22:12,673][46663] Updated weights for policy 1, policy_version 51021 (0.0009) +[2023-10-13 02:22:13,052][46663] Updated weights for policy 1, policy_version 51031 (0.0007) +[2023-10-13 02:22:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104562688. Throughput: 0: 1668.8, 1: 1671.3. Samples: 26143020. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:22:13,608][45375] Avg episode reward: [(0, '43.610'), (1, '49.430')] +[2023-10-13 02:22:14,235][46662] Updated weights for policy 0, policy_version 51080 (0.0009) +[2023-10-13 02:22:14,616][46662] Updated weights for policy 0, policy_version 51090 (0.0009) +[2023-10-13 02:22:14,981][46662] Updated weights for policy 0, policy_version 51100 (0.0008) +[2023-10-13 02:22:17,184][46663] Updated weights for policy 1, policy_version 51041 (0.0008) +[2023-10-13 02:22:17,611][46663] Updated weights for policy 1, policy_version 51051 (0.0010) +[2023-10-13 02:22:17,980][46663] Updated weights for policy 1, policy_version 51061 (0.0008) +[2023-10-13 02:22:18,348][46663] Updated weights for policy 1, policy_version 51071 (0.0007) +[2023-10-13 02:22:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 104628224. Throughput: 0: 1695.4, 1: 1663.9. Samples: 26163204. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:22:18,607][45375] Avg episode reward: [(0, '44.230'), (1, '48.900')] +[2023-10-13 02:22:19,147][46662] Updated weights for policy 0, policy_version 51110 (0.0011) +[2023-10-13 02:22:19,528][46662] Updated weights for policy 0, policy_version 51120 (0.0010) +[2023-10-13 02:22:19,897][46662] Updated weights for policy 0, policy_version 51130 (0.0009) +[2023-10-13 02:22:22,318][46663] Updated weights for policy 1, policy_version 51081 (0.0007) +[2023-10-13 02:22:22,690][46663] Updated weights for policy 1, policy_version 51091 (0.0010) +[2023-10-13 02:22:23,050][46663] Updated weights for policy 1, policy_version 51101 (0.0010) +[2023-10-13 02:22:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 104693760. Throughput: 0: 1693.0, 1: 1656.9. Samples: 26182870. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:22:23,607][45375] Avg episode reward: [(0, '44.130'), (1, '47.210')] +[2023-10-13 02:22:24,053][46662] Updated weights for policy 0, policy_version 51140 (0.0007) +[2023-10-13 02:22:24,428][46662] Updated weights for policy 0, policy_version 51150 (0.0007) +[2023-10-13 02:22:24,799][46662] Updated weights for policy 0, policy_version 51160 (0.0008) +[2023-10-13 02:22:27,283][46663] Updated weights for policy 1, policy_version 51111 (0.0008) +[2023-10-13 02:22:27,645][46663] Updated weights for policy 1, policy_version 51121 (0.0009) +[2023-10-13 02:22:28,023][46663] Updated weights for policy 1, policy_version 51131 (0.0009) +[2023-10-13 02:22:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104759296. Throughput: 0: 1683.8, 1: 1668.0. Samples: 26193420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:22:28,607][45375] Avg episode reward: [(0, '45.420'), (1, '48.350')] +[2023-10-13 02:22:28,849][46662] Updated weights for policy 0, policy_version 51170 (0.0010) +[2023-10-13 02:22:29,215][46662] Updated weights for policy 0, policy_version 51180 (0.0009) +[2023-10-13 02:22:29,588][46662] Updated weights for policy 0, policy_version 51190 (0.0008) +[2023-10-13 02:22:29,963][46662] Updated weights for policy 0, policy_version 51200 (0.0011) +[2023-10-13 02:22:32,053][46663] Updated weights for policy 1, policy_version 51141 (0.0009) +[2023-10-13 02:22:32,429][46663] Updated weights for policy 1, policy_version 51151 (0.0008) +[2023-10-13 02:22:32,792][46663] Updated weights for policy 1, policy_version 51161 (0.0007) +[2023-10-13 02:22:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104824832. Throughput: 0: 1693.1, 1: 1662.8. Samples: 26213610. 
Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:22:33,607][45375] Avg episode reward: [(0, '46.320'), (1, '48.380')] +[2023-10-13 02:22:34,146][46662] Updated weights for policy 0, policy_version 51210 (0.0007) +[2023-10-13 02:22:34,509][46662] Updated weights for policy 0, policy_version 51220 (0.0007) +[2023-10-13 02:22:34,879][46662] Updated weights for policy 0, policy_version 51230 (0.0009) +[2023-10-13 02:22:36,823][46663] Updated weights for policy 1, policy_version 51171 (0.0007) +[2023-10-13 02:22:37,177][46663] Updated weights for policy 1, policy_version 51181 (0.0009) +[2023-10-13 02:22:37,548][46663] Updated weights for policy 1, policy_version 51191 (0.0009) +[2023-10-13 02:22:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104890368. Throughput: 0: 1696.3, 1: 1669.0. Samples: 26233620. Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:22:38,608][45375] Avg episode reward: [(0, '47.130'), (1, '49.500')] +[2023-10-13 02:22:39,154][46662] Updated weights for policy 0, policy_version 51240 (0.0011) +[2023-10-13 02:22:39,518][46662] Updated weights for policy 0, policy_version 51250 (0.0007) +[2023-10-13 02:22:39,893][46662] Updated weights for policy 0, policy_version 51260 (0.0008) +[2023-10-13 02:22:41,590][46663] Updated weights for policy 1, policy_version 51201 (0.0007) +[2023-10-13 02:22:41,947][46663] Updated weights for policy 1, policy_version 51211 (0.0011) +[2023-10-13 02:22:42,313][46663] Updated weights for policy 1, policy_version 51221 (0.0010) +[2023-10-13 02:22:42,679][46663] Updated weights for policy 1, policy_version 51231 (0.0008) +[2023-10-13 02:22:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104955904. Throughput: 0: 1687.3, 1: 1682.8. Samples: 26243972. 
Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:22:43,607][45375] Avg episode reward: [(0, '47.540'), (1, '48.190')] +[2023-10-13 02:22:43,776][46662] Updated weights for policy 0, policy_version 51270 (0.0008) +[2023-10-13 02:22:44,140][46662] Updated weights for policy 0, policy_version 51280 (0.0007) +[2023-10-13 02:22:44,513][46662] Updated weights for policy 0, policy_version 51290 (0.0008) +[2023-10-13 02:22:46,742][46663] Updated weights for policy 1, policy_version 51241 (0.0008) +[2023-10-13 02:22:47,123][46663] Updated weights for policy 1, policy_version 51251 (0.0009) +[2023-10-13 02:22:47,484][46663] Updated weights for policy 1, policy_version 51261 (0.0009) +[2023-10-13 02:22:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 105021440. Throughput: 0: 1690.5, 1: 1664.9. Samples: 26263908. Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:22:48,607][45375] Avg episode reward: [(0, '47.610'), (1, '47.240')] +[2023-10-13 02:22:48,657][46662] Updated weights for policy 0, policy_version 51300 (0.0008) +[2023-10-13 02:22:49,023][46662] Updated weights for policy 0, policy_version 51310 (0.0007) +[2023-10-13 02:22:49,400][46662] Updated weights for policy 0, policy_version 51320 (0.0008) +[2023-10-13 02:22:51,460][46663] Updated weights for policy 1, policy_version 51271 (0.0008) +[2023-10-13 02:22:51,821][46663] Updated weights for policy 1, policy_version 51281 (0.0007) +[2023-10-13 02:22:52,183][46663] Updated weights for policy 1, policy_version 51291 (0.0008) +[2023-10-13 02:22:53,233][46662] Updated weights for policy 0, policy_version 51330 (0.0009) +[2023-10-13 02:22:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 105086976. Throughput: 0: 1688.1, 1: 1687.2. Samples: 26284618. 
Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:22:53,608][45375] Avg episode reward: [(0, '47.370'), (1, '47.190')] +[2023-10-13 02:22:53,609][46662] Updated weights for policy 0, policy_version 51340 (0.0007) +[2023-10-13 02:22:53,981][46662] Updated weights for policy 0, policy_version 51350 (0.0009) +[2023-10-13 02:22:54,357][46662] Updated weights for policy 0, policy_version 51360 (0.0008) +[2023-10-13 02:22:56,129][46663] Updated weights for policy 1, policy_version 51301 (0.0011) +[2023-10-13 02:22:56,505][46663] Updated weights for policy 1, policy_version 51311 (0.0011) +[2023-10-13 02:22:56,877][46663] Updated weights for policy 1, policy_version 51321 (0.0010) +[2023-10-13 02:22:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105152512. Throughput: 0: 1684.8, 1: 1681.7. Samples: 26294510. Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:22:58,607][45375] Avg episode reward: [(0, '47.140'), (1, '48.070')] +[2023-10-13 02:22:58,622][46662] Updated weights for policy 0, policy_version 51370 (0.0010) +[2023-10-13 02:22:58,996][46662] Updated weights for policy 0, policy_version 51380 (0.0007) +[2023-10-13 02:22:59,371][46662] Updated weights for policy 0, policy_version 51390 (0.0007) +[2023-10-13 02:23:00,776][46663] Updated weights for policy 1, policy_version 51331 (0.0009) +[2023-10-13 02:23:01,136][46663] Updated weights for policy 1, policy_version 51341 (0.0007) +[2023-10-13 02:23:01,504][46663] Updated weights for policy 1, policy_version 51351 (0.0007) +[2023-10-13 02:23:03,314][46662] Updated weights for policy 0, policy_version 51400 (0.0008) +[2023-10-13 02:23:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105218048. Throughput: 0: 1684.3, 1: 1678.0. Samples: 26314506. 
Policy #0 lag: (min: 28.0, avg: 35.9, max: 60.0) +[2023-10-13 02:23:03,607][45375] Avg episode reward: [(0, '46.620'), (1, '49.060')] +[2023-10-13 02:23:03,685][46662] Updated weights for policy 0, policy_version 51410 (0.0009) +[2023-10-13 02:23:04,056][46662] Updated weights for policy 0, policy_version 51420 (0.0010) +[2023-10-13 02:23:05,615][46663] Updated weights for policy 1, policy_version 51361 (0.0010) +[2023-10-13 02:23:06,036][46663] Updated weights for policy 1, policy_version 51371 (0.0008) +[2023-10-13 02:23:06,412][46663] Updated weights for policy 1, policy_version 51381 (0.0009) +[2023-10-13 02:23:06,784][46663] Updated weights for policy 1, policy_version 51391 (0.0009) +[2023-10-13 02:23:08,180][46662] Updated weights for policy 0, policy_version 51430 (0.0008) +[2023-10-13 02:23:08,550][46662] Updated weights for policy 0, policy_version 51440 (0.0008) +[2023-10-13 02:23:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105283584. Throughput: 0: 1685.2, 1: 1694.4. Samples: 26334952. Policy #0 lag: (min: 31.0, avg: 32.2, max: 56.0) +[2023-10-13 02:23:08,607][45375] Avg episode reward: [(0, '48.260'), (1, '50.530')] +[2023-10-13 02:23:08,929][46662] Updated weights for policy 0, policy_version 51450 (0.0009) +[2023-10-13 02:23:10,801][46663] Updated weights for policy 1, policy_version 51401 (0.0007) +[2023-10-13 02:23:11,163][46663] Updated weights for policy 1, policy_version 51411 (0.0010) +[2023-10-13 02:23:11,530][46663] Updated weights for policy 1, policy_version 51421 (0.0011) +[2023-10-13 02:23:12,981][46662] Updated weights for policy 0, policy_version 51460 (0.0008) +[2023-10-13 02:23:13,353][46662] Updated weights for policy 0, policy_version 51470 (0.0008) +[2023-10-13 02:23:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105349120. Throughput: 0: 1675.8, 1: 1678.1. Samples: 26344346. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 56.0) +[2023-10-13 02:23:13,607][45375] Avg episode reward: [(0, '49.040'), (1, '51.050')] +[2023-10-13 02:23:13,725][46662] Updated weights for policy 0, policy_version 51480 (0.0008) +[2023-10-13 02:23:15,596][46663] Updated weights for policy 1, policy_version 51431 (0.0009) +[2023-10-13 02:23:15,960][46663] Updated weights for policy 1, policy_version 51441 (0.0008) +[2023-10-13 02:23:16,335][46663] Updated weights for policy 1, policy_version 51451 (0.0010) +[2023-10-13 02:23:17,857][46662] Updated weights for policy 0, policy_version 51490 (0.0011) +[2023-10-13 02:23:18,226][46662] Updated weights for policy 0, policy_version 51500 (0.0008) +[2023-10-13 02:23:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105414656. Throughput: 0: 1676.8, 1: 1684.6. Samples: 26364876. Policy #0 lag: (min: 31.0, avg: 32.2, max: 56.0) +[2023-10-13 02:23:18,607][45375] Avg episode reward: [(0, '50.540'), (1, '49.820')] +[2023-10-13 02:23:18,611][46662] Updated weights for policy 0, policy_version 51510 (0.0009) +[2023-10-13 02:23:18,991][46662] Updated weights for policy 0, policy_version 51520 (0.0010) +[2023-10-13 02:23:20,409][46663] Updated weights for policy 1, policy_version 51461 (0.0007) +[2023-10-13 02:23:20,778][46663] Updated weights for policy 1, policy_version 51471 (0.0008) +[2023-10-13 02:23:21,141][46663] Updated weights for policy 1, policy_version 51481 (0.0007) +[2023-10-13 02:23:23,140][46662] Updated weights for policy 0, policy_version 51530 (0.0008) +[2023-10-13 02:23:23,517][46662] Updated weights for policy 0, policy_version 51540 (0.0007) +[2023-10-13 02:23:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105480192. Throughput: 0: 1674.1, 1: 1701.8. Samples: 26385536. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 56.0) +[2023-10-13 02:23:23,607][45375] Avg episode reward: [(0, '50.420'), (1, '51.380')] +[2023-10-13 02:23:23,890][46662] Updated weights for policy 0, policy_version 51550 (0.0008) +[2023-10-13 02:23:25,177][46663] Updated weights for policy 1, policy_version 51491 (0.0007) +[2023-10-13 02:23:25,559][46663] Updated weights for policy 1, policy_version 51501 (0.0009) +[2023-10-13 02:23:25,926][46663] Updated weights for policy 1, policy_version 51511 (0.0011) +[2023-10-13 02:23:27,856][46662] Updated weights for policy 0, policy_version 51560 (0.0007) +[2023-10-13 02:23:28,237][46662] Updated weights for policy 0, policy_version 51570 (0.0010) +[2023-10-13 02:23:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105545728. Throughput: 0: 1684.7, 1: 1667.6. Samples: 26394824. Policy #0 lag: (min: 31.0, avg: 32.2, max: 56.0) +[2023-10-13 02:23:28,607][45375] Avg episode reward: [(0, '49.700'), (1, '51.640')] +[2023-10-13 02:23:28,612][46662] Updated weights for policy 0, policy_version 51580 (0.0011) +[2023-10-13 02:23:29,930][46663] Updated weights for policy 1, policy_version 51521 (0.0011) +[2023-10-13 02:23:30,307][46663] Updated weights for policy 1, policy_version 51531 (0.0010) +[2023-10-13 02:23:30,667][46663] Updated weights for policy 1, policy_version 51541 (0.0012) +[2023-10-13 02:23:31,032][46663] Updated weights for policy 1, policy_version 51551 (0.0010) +[2023-10-13 02:23:32,583][46662] Updated weights for policy 0, policy_version 51590 (0.0009) +[2023-10-13 02:23:32,951][46662] Updated weights for policy 0, policy_version 51600 (0.0009) +[2023-10-13 02:23:33,324][46662] Updated weights for policy 0, policy_version 51610 (0.0007) +[2023-10-13 02:23:33,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105644032. Throughput: 0: 1680.7, 1: 1692.4. Samples: 26415702. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 56.0) +[2023-10-13 02:23:33,608][45375] Avg episode reward: [(0, '49.260'), (1, '50.600')] +[2023-10-13 02:23:35,028][46663] Updated weights for policy 1, policy_version 51561 (0.0009) +[2023-10-13 02:23:35,392][46663] Updated weights for policy 1, policy_version 51571 (0.0008) +[2023-10-13 02:23:35,757][46663] Updated weights for policy 1, policy_version 51581 (0.0009) +[2023-10-13 02:23:37,481][46662] Updated weights for policy 0, policy_version 51620 (0.0008) +[2023-10-13 02:23:37,853][46662] Updated weights for policy 0, policy_version 51630 (0.0007) +[2023-10-13 02:23:38,229][46662] Updated weights for policy 0, policy_version 51640 (0.0008) +[2023-10-13 02:23:38,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105709568. Throughput: 0: 1664.5, 1: 1696.1. Samples: 26435848. Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:23:38,608][45375] Avg episode reward: [(0, '48.600'), (1, '48.650')] +[2023-10-13 02:23:39,863][46663] Updated weights for policy 1, policy_version 51591 (0.0009) +[2023-10-13 02:23:40,232][46663] Updated weights for policy 1, policy_version 51601 (0.0007) +[2023-10-13 02:23:40,596][46663] Updated weights for policy 1, policy_version 51611 (0.0007) +[2023-10-13 02:23:42,246][46662] Updated weights for policy 0, policy_version 51650 (0.0010) +[2023-10-13 02:23:42,614][46662] Updated weights for policy 0, policy_version 51660 (0.0009) +[2023-10-13 02:23:42,974][46662] Updated weights for policy 0, policy_version 51670 (0.0008) +[2023-10-13 02:23:43,346][46662] Updated weights for policy 0, policy_version 51680 (0.0008) +[2023-10-13 02:23:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105775104. Throughput: 0: 1679.8, 1: 1678.8. Samples: 26445648. 
Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:23:43,608][45375] Avg episode reward: [(0, '49.500'), (1, '49.320')] +[2023-10-13 02:23:44,485][46663] Updated weights for policy 1, policy_version 51621 (0.0007) +[2023-10-13 02:23:44,853][46663] Updated weights for policy 1, policy_version 51631 (0.0008) +[2023-10-13 02:23:45,219][46663] Updated weights for policy 1, policy_version 51641 (0.0010) +[2023-10-13 02:23:47,303][46662] Updated weights for policy 0, policy_version 51690 (0.0008) +[2023-10-13 02:23:47,671][46662] Updated weights for policy 0, policy_version 51700 (0.0009) +[2023-10-13 02:23:48,044][46662] Updated weights for policy 0, policy_version 51710 (0.0008) +[2023-10-13 02:23:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105840640. Throughput: 0: 1680.8, 1: 1696.2. Samples: 26466474. Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:23:48,607][45375] Avg episode reward: [(0, '48.990'), (1, '50.150')] +[2023-10-13 02:23:49,272][46663] Updated weights for policy 1, policy_version 51651 (0.0007) +[2023-10-13 02:23:49,633][46663] Updated weights for policy 1, policy_version 51661 (0.0008) +[2023-10-13 02:23:50,004][46663] Updated weights for policy 1, policy_version 51671 (0.0007) +[2023-10-13 02:23:51,998][46662] Updated weights for policy 0, policy_version 51720 (0.0009) +[2023-10-13 02:23:52,365][46662] Updated weights for policy 0, policy_version 51730 (0.0008) +[2023-10-13 02:23:52,743][46662] Updated weights for policy 0, policy_version 51740 (0.0009) +[2023-10-13 02:23:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105906176. Throughput: 0: 1660.4, 1: 1704.4. Samples: 26486370. 
Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:23:53,608][45375] Avg episode reward: [(0, '50.750'), (1, '51.270')] +[2023-10-13 02:23:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000051680_52920320.pth... +[2023-10-13 02:23:53,619][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000051744_52985856.pth... +[2023-10-13 02:23:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000050176_51380224.pth +[2023-10-13 02:23:53,658][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000050112_51314688.pth +[2023-10-13 02:23:54,156][46663] Updated weights for policy 1, policy_version 51681 (0.0008) +[2023-10-13 02:23:54,573][46663] Updated weights for policy 1, policy_version 51691 (0.0008) +[2023-10-13 02:23:54,934][46663] Updated weights for policy 1, policy_version 51701 (0.0008) +[2023-10-13 02:23:55,299][46663] Updated weights for policy 1, policy_version 51711 (0.0009) +[2023-10-13 02:23:56,929][46662] Updated weights for policy 0, policy_version 51750 (0.0008) +[2023-10-13 02:23:57,304][46662] Updated weights for policy 0, policy_version 51760 (0.0007) +[2023-10-13 02:23:57,685][46662] Updated weights for policy 0, policy_version 51770 (0.0010) +[2023-10-13 02:23:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105971712. Throughput: 0: 1689.1, 1: 1689.8. Samples: 26496394. 
Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:23:58,607][45375] Avg episode reward: [(0, '50.800'), (1, '51.440')] +[2023-10-13 02:23:59,477][46663] Updated weights for policy 1, policy_version 51721 (0.0008) +[2023-10-13 02:23:59,841][46663] Updated weights for policy 1, policy_version 51731 (0.0007) +[2023-10-13 02:24:00,201][46663] Updated weights for policy 1, policy_version 51741 (0.0008) +[2023-10-13 02:24:01,617][46662] Updated weights for policy 0, policy_version 51780 (0.0009) +[2023-10-13 02:24:01,987][46662] Updated weights for policy 0, policy_version 51790 (0.0007) +[2023-10-13 02:24:02,358][46662] Updated weights for policy 0, policy_version 51800 (0.0009) +[2023-10-13 02:24:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106037248. Throughput: 0: 1681.4, 1: 1695.3. Samples: 26516828. Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:24:03,608][45375] Avg episode reward: [(0, '51.420'), (1, '51.040')] +[2023-10-13 02:24:04,123][46663] Updated weights for policy 1, policy_version 51751 (0.0009) +[2023-10-13 02:24:04,503][46663] Updated weights for policy 1, policy_version 51761 (0.0010) +[2023-10-13 02:24:04,869][46663] Updated weights for policy 1, policy_version 51771 (0.0011) +[2023-10-13 02:24:06,487][46662] Updated weights for policy 0, policy_version 51810 (0.0010) +[2023-10-13 02:24:06,856][46662] Updated weights for policy 0, policy_version 51820 (0.0007) +[2023-10-13 02:24:07,221][46662] Updated weights for policy 0, policy_version 51830 (0.0008) +[2023-10-13 02:24:07,599][46662] Updated weights for policy 0, policy_version 51840 (0.0011) +[2023-10-13 02:24:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106102784. Throughput: 0: 1670.0, 1: 1690.6. Samples: 26536762. 
Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-13 02:24:08,607][45375] Avg episode reward: [(0, '50.190'), (1, '51.160')] +[2023-10-13 02:24:08,905][46663] Updated weights for policy 1, policy_version 51781 (0.0009) +[2023-10-13 02:24:09,276][46663] Updated weights for policy 1, policy_version 51791 (0.0007) +[2023-10-13 02:24:09,643][46663] Updated weights for policy 1, policy_version 51801 (0.0008) +[2023-10-13 02:24:11,601][46662] Updated weights for policy 0, policy_version 51850 (0.0007) +[2023-10-13 02:24:11,959][46662] Updated weights for policy 0, policy_version 51860 (0.0007) +[2023-10-13 02:24:12,335][46662] Updated weights for policy 0, policy_version 51870 (0.0008) +[2023-10-13 02:24:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106168320. Throughput: 0: 1695.4, 1: 1692.0. Samples: 26547256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:13,608][45375] Avg episode reward: [(0, '48.800'), (1, '50.430')] +[2023-10-13 02:24:13,678][46663] Updated weights for policy 1, policy_version 51811 (0.0008) +[2023-10-13 02:24:14,048][46663] Updated weights for policy 1, policy_version 51821 (0.0008) +[2023-10-13 02:24:14,409][46663] Updated weights for policy 1, policy_version 51831 (0.0007) +[2023-10-13 02:24:16,456][46662] Updated weights for policy 0, policy_version 51880 (0.0009) +[2023-10-13 02:24:16,831][46662] Updated weights for policy 0, policy_version 51890 (0.0008) +[2023-10-13 02:24:17,200][46662] Updated weights for policy 0, policy_version 51900 (0.0009) +[2023-10-13 02:24:18,452][46663] Updated weights for policy 1, policy_version 51841 (0.0008) +[2023-10-13 02:24:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106233856. Throughput: 0: 1677.8, 1: 1691.7. Samples: 26567330. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:18,607][45375] Avg episode reward: [(0, '48.830'), (1, '50.010')] +[2023-10-13 02:24:18,814][46663] Updated weights for policy 1, policy_version 51851 (0.0009) +[2023-10-13 02:24:19,180][46663] Updated weights for policy 1, policy_version 51861 (0.0011) +[2023-10-13 02:24:19,545][46663] Updated weights for policy 1, policy_version 51871 (0.0009) +[2023-10-13 02:24:21,209][46662] Updated weights for policy 0, policy_version 51910 (0.0010) +[2023-10-13 02:24:21,578][46662] Updated weights for policy 0, policy_version 51920 (0.0008) +[2023-10-13 02:24:21,954][46662] Updated weights for policy 0, policy_version 51930 (0.0007) +[2023-10-13 02:24:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106299392. Throughput: 0: 1674.1, 1: 1687.7. Samples: 26587128. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:23,607][45375] Avg episode reward: [(0, '49.160'), (1, '52.000')] +[2023-10-13 02:24:23,628][46663] Updated weights for policy 1, policy_version 51881 (0.0008) +[2023-10-13 02:24:24,001][46663] Updated weights for policy 1, policy_version 51891 (0.0007) +[2023-10-13 02:24:24,366][46663] Updated weights for policy 1, policy_version 51901 (0.0007) +[2023-10-13 02:24:26,073][46662] Updated weights for policy 0, policy_version 51940 (0.0010) +[2023-10-13 02:24:26,440][46662] Updated weights for policy 0, policy_version 51950 (0.0010) +[2023-10-13 02:24:26,824][46662] Updated weights for policy 0, policy_version 51960 (0.0009) +[2023-10-13 02:24:28,510][46663] Updated weights for policy 1, policy_version 51911 (0.0009) +[2023-10-13 02:24:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106364928. Throughput: 0: 1686.3, 1: 1687.1. Samples: 26597448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:28,608][45375] Avg episode reward: [(0, '48.690'), (1, '51.410')] +[2023-10-13 02:24:28,875][46663] Updated weights for policy 1, policy_version 51921 (0.0009) +[2023-10-13 02:24:29,240][46663] Updated weights for policy 1, policy_version 51931 (0.0009) +[2023-10-13 02:24:31,022][46662] Updated weights for policy 0, policy_version 51970 (0.0010) +[2023-10-13 02:24:31,397][46662] Updated weights for policy 0, policy_version 51980 (0.0007) +[2023-10-13 02:24:31,762][46662] Updated weights for policy 0, policy_version 51990 (0.0007) +[2023-10-13 02:24:32,132][46662] Updated weights for policy 0, policy_version 52000 (0.0009) +[2023-10-13 02:24:33,277][46663] Updated weights for policy 1, policy_version 51941 (0.0007) +[2023-10-13 02:24:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106430464. Throughput: 0: 1658.3, 1: 1687.0. Samples: 26617014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:33,608][45375] Avg episode reward: [(0, '49.760'), (1, '50.750')] +[2023-10-13 02:24:33,653][46663] Updated weights for policy 1, policy_version 51951 (0.0007) +[2023-10-13 02:24:34,015][46663] Updated weights for policy 1, policy_version 51961 (0.0008) +[2023-10-13 02:24:36,238][46662] Updated weights for policy 0, policy_version 52010 (0.0010) +[2023-10-13 02:24:36,610][46662] Updated weights for policy 0, policy_version 52020 (0.0008) +[2023-10-13 02:24:36,990][46662] Updated weights for policy 0, policy_version 52030 (0.0008) +[2023-10-13 02:24:38,145][46663] Updated weights for policy 1, policy_version 51971 (0.0008) +[2023-10-13 02:24:38,512][46663] Updated weights for policy 1, policy_version 51981 (0.0012) +[2023-10-13 02:24:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106496000. Throughput: 0: 1667.3, 1: 1672.0. Samples: 26636638. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:38,607][45375] Avg episode reward: [(0, '50.020'), (1, '51.780')] +[2023-10-13 02:24:38,881][46663] Updated weights for policy 1, policy_version 51991 (0.0010) +[2023-10-13 02:24:41,175][46662] Updated weights for policy 0, policy_version 52040 (0.0008) +[2023-10-13 02:24:41,544][46662] Updated weights for policy 0, policy_version 52050 (0.0010) +[2023-10-13 02:24:41,917][46662] Updated weights for policy 0, policy_version 52060 (0.0009) +[2023-10-13 02:24:43,194][46663] Updated weights for policy 1, policy_version 52001 (0.0009) +[2023-10-13 02:24:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106561536. Throughput: 0: 1671.4, 1: 1685.8. Samples: 26647470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:24:43,608][45375] Avg episode reward: [(0, '51.230'), (1, '51.370')] +[2023-10-13 02:24:43,624][46663] Updated weights for policy 1, policy_version 52011 (0.0007) +[2023-10-13 02:24:43,997][46663] Updated weights for policy 1, policy_version 52021 (0.0008) +[2023-10-13 02:24:44,368][46663] Updated weights for policy 1, policy_version 52031 (0.0008) +[2023-10-13 02:24:45,943][46662] Updated weights for policy 0, policy_version 52070 (0.0009) +[2023-10-13 02:24:46,320][46662] Updated weights for policy 0, policy_version 52080 (0.0007) +[2023-10-13 02:24:46,682][46662] Updated weights for policy 0, policy_version 52090 (0.0007) +[2023-10-13 02:24:48,471][46663] Updated weights for policy 1, policy_version 52041 (0.0009) +[2023-10-13 02:24:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106627072. Throughput: 0: 1655.5, 1: 1677.4. Samples: 26666810. 
Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:24:48,607][45375] Avg episode reward: [(0, '50.040'), (1, '52.250')] +[2023-10-13 02:24:48,840][46663] Updated weights for policy 1, policy_version 52051 (0.0009) +[2023-10-13 02:24:49,204][46663] Updated weights for policy 1, policy_version 52061 (0.0009) +[2023-10-13 02:24:50,857][46662] Updated weights for policy 0, policy_version 52100 (0.0008) +[2023-10-13 02:24:51,228][46662] Updated weights for policy 0, policy_version 52110 (0.0009) +[2023-10-13 02:24:51,598][46662] Updated weights for policy 0, policy_version 52120 (0.0009) +[2023-10-13 02:24:53,251][46663] Updated weights for policy 1, policy_version 52071 (0.0008) +[2023-10-13 02:24:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 106692608. Throughput: 0: 1668.6, 1: 1665.0. Samples: 26686772. Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:24:53,607][45375] Avg episode reward: [(0, '51.830'), (1, '51.060')] +[2023-10-13 02:24:53,623][46663] Updated weights for policy 1, policy_version 52081 (0.0008) +[2023-10-13 02:24:53,989][46663] Updated weights for policy 1, policy_version 52091 (0.0009) +[2023-10-13 02:24:55,704][46662] Updated weights for policy 0, policy_version 52130 (0.0008) +[2023-10-13 02:24:56,076][46662] Updated weights for policy 0, policy_version 52140 (0.0008) +[2023-10-13 02:24:56,452][46662] Updated weights for policy 0, policy_version 52150 (0.0008) +[2023-10-13 02:24:56,818][46662] Updated weights for policy 0, policy_version 52160 (0.0009) +[2023-10-13 02:24:58,096][46663] Updated weights for policy 1, policy_version 52101 (0.0009) +[2023-10-13 02:24:58,462][46663] Updated weights for policy 1, policy_version 52111 (0.0010) +[2023-10-13 02:24:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106758144. Throughput: 0: 1661.5, 1: 1673.9. Samples: 26697348. 
Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:24:58,607][45375] Avg episode reward: [(0, '50.580'), (1, '51.290')] +[2023-10-13 02:24:58,837][46663] Updated weights for policy 1, policy_version 52121 (0.0011) +[2023-10-13 02:25:00,827][46662] Updated weights for policy 0, policy_version 52170 (0.0010) +[2023-10-13 02:25:01,192][46662] Updated weights for policy 0, policy_version 52180 (0.0008) +[2023-10-13 02:25:01,562][46662] Updated weights for policy 0, policy_version 52190 (0.0007) +[2023-10-13 02:25:02,909][46663] Updated weights for policy 1, policy_version 52131 (0.0009) +[2023-10-13 02:25:03,283][46663] Updated weights for policy 1, policy_version 52141 (0.0008) +[2023-10-13 02:25:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106823680. Throughput: 0: 1653.6, 1: 1674.0. Samples: 26717070. Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:25:03,607][45375] Avg episode reward: [(0, '51.310'), (1, '51.100')] +[2023-10-13 02:25:03,644][46663] Updated weights for policy 1, policy_version 52151 (0.0010) +[2023-10-13 02:25:05,554][46662] Updated weights for policy 0, policy_version 52200 (0.0008) +[2023-10-13 02:25:05,914][46662] Updated weights for policy 0, policy_version 52210 (0.0009) +[2023-10-13 02:25:06,294][46662] Updated weights for policy 0, policy_version 52220 (0.0009) +[2023-10-13 02:25:07,611][46663] Updated weights for policy 1, policy_version 52161 (0.0010) +[2023-10-13 02:25:07,971][46663] Updated weights for policy 1, policy_version 52171 (0.0009) +[2023-10-13 02:25:08,350][46663] Updated weights for policy 1, policy_version 52181 (0.0010) +[2023-10-13 02:25:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106889216. Throughput: 0: 1676.3, 1: 1657.2. Samples: 26737136. 
Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:25:08,607][45375] Avg episode reward: [(0, '51.880'), (1, '52.020')] +[2023-10-13 02:25:08,718][46663] Updated weights for policy 1, policy_version 52191 (0.0009) +[2023-10-13 02:25:10,312][46662] Updated weights for policy 0, policy_version 52230 (0.0008) +[2023-10-13 02:25:10,676][46662] Updated weights for policy 0, policy_version 52240 (0.0010) +[2023-10-13 02:25:11,050][46662] Updated weights for policy 0, policy_version 52250 (0.0010) +[2023-10-13 02:25:12,881][46663] Updated weights for policy 1, policy_version 52201 (0.0009) +[2023-10-13 02:25:13,255][46663] Updated weights for policy 1, policy_version 52211 (0.0008) +[2023-10-13 02:25:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106954752. Throughput: 0: 1661.1, 1: 1669.6. Samples: 26747332. Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:25:13,608][45375] Avg episode reward: [(0, '51.670'), (1, '51.060')] +[2023-10-13 02:25:13,612][46663] Updated weights for policy 1, policy_version 52221 (0.0008) +[2023-10-13 02:25:14,956][46662] Updated weights for policy 0, policy_version 52260 (0.0010) +[2023-10-13 02:25:15,330][46662] Updated weights for policy 0, policy_version 52270 (0.0009) +[2023-10-13 02:25:15,697][46662] Updated weights for policy 0, policy_version 52280 (0.0008) +[2023-10-13 02:25:17,726][46663] Updated weights for policy 1, policy_version 52231 (0.0009) +[2023-10-13 02:25:18,093][46663] Updated weights for policy 1, policy_version 52241 (0.0008) +[2023-10-13 02:25:18,459][46663] Updated weights for policy 1, policy_version 52251 (0.0008) +[2023-10-13 02:25:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 107020288. Throughput: 0: 1682.2, 1: 1666.2. Samples: 26767694. 
Policy #0 lag: (min: 16.0, avg: 40.8, max: 48.0) +[2023-10-13 02:25:18,607][45375] Avg episode reward: [(0, '50.610'), (1, '51.600')] +[2023-10-13 02:25:19,658][46662] Updated weights for policy 0, policy_version 52290 (0.0009) +[2023-10-13 02:25:20,026][46662] Updated weights for policy 0, policy_version 52300 (0.0009) +[2023-10-13 02:25:20,396][46662] Updated weights for policy 0, policy_version 52310 (0.0008) +[2023-10-13 02:25:20,767][46662] Updated weights for policy 0, policy_version 52320 (0.0009) +[2023-10-13 02:25:22,584][46663] Updated weights for policy 1, policy_version 52261 (0.0008) +[2023-10-13 02:25:22,939][46663] Updated weights for policy 1, policy_version 52271 (0.0009) +[2023-10-13 02:25:23,305][46663] Updated weights for policy 1, policy_version 52281 (0.0009) +[2023-10-13 02:25:23,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107118592. Throughput: 0: 1700.5, 1: 1657.3. Samples: 26787742. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:23,607][45375] Avg episode reward: [(0, '52.640'), (1, '50.910')] +[2023-10-13 02:25:24,772][46662] Updated weights for policy 0, policy_version 52330 (0.0008) +[2023-10-13 02:25:25,145][46662] Updated weights for policy 0, policy_version 52340 (0.0007) +[2023-10-13 02:25:25,513][46662] Updated weights for policy 0, policy_version 52350 (0.0009) +[2023-10-13 02:25:27,369][46663] Updated weights for policy 1, policy_version 52291 (0.0009) +[2023-10-13 02:25:27,732][46663] Updated weights for policy 1, policy_version 52301 (0.0009) +[2023-10-13 02:25:28,111][46663] Updated weights for policy 1, policy_version 52311 (0.0008) +[2023-10-13 02:25:28,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 107184128. Throughput: 0: 1670.9, 1: 1668.2. Samples: 26797730. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:28,607][45375] Avg episode reward: [(0, '53.290'), (1, '51.290')] +[2023-10-13 02:25:29,511][46662] Updated weights for policy 0, policy_version 52360 (0.0009) +[2023-10-13 02:25:29,871][46662] Updated weights for policy 0, policy_version 52370 (0.0007) +[2023-10-13 02:25:30,254][46662] Updated weights for policy 0, policy_version 52380 (0.0008) +[2023-10-13 02:25:32,291][46663] Updated weights for policy 1, policy_version 52321 (0.0011) +[2023-10-13 02:25:32,698][46663] Updated weights for policy 1, policy_version 52331 (0.0010) +[2023-10-13 02:25:33,070][46663] Updated weights for policy 1, policy_version 52341 (0.0008) +[2023-10-13 02:25:33,425][46663] Updated weights for policy 1, policy_version 52351 (0.0007) +[2023-10-13 02:25:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 107249664. Throughput: 0: 1695.4, 1: 1672.4. Samples: 26818360. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:33,607][45375] Avg episode reward: [(0, '53.630'), (1, '49.590')] +[2023-10-13 02:25:34,381][46662] Updated weights for policy 0, policy_version 52390 (0.0008) +[2023-10-13 02:25:34,747][46662] Updated weights for policy 0, policy_version 52400 (0.0011) +[2023-10-13 02:25:35,117][46662] Updated weights for policy 0, policy_version 52410 (0.0009) +[2023-10-13 02:25:37,510][46663] Updated weights for policy 1, policy_version 52361 (0.0010) +[2023-10-13 02:25:37,873][46663] Updated weights for policy 1, policy_version 52371 (0.0009) +[2023-10-13 02:25:38,247][46663] Updated weights for policy 1, policy_version 52381 (0.0007) +[2023-10-13 02:25:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107315200. Throughput: 0: 1700.4, 1: 1661.5. Samples: 26838058. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:38,608][45375] Avg episode reward: [(0, '52.900'), (1, '49.540')] +[2023-10-13 02:25:39,221][46662] Updated weights for policy 0, policy_version 52420 (0.0009) +[2023-10-13 02:25:39,601][46662] Updated weights for policy 0, policy_version 52430 (0.0007) +[2023-10-13 02:25:39,967][46662] Updated weights for policy 0, policy_version 52440 (0.0008) +[2023-10-13 02:25:42,236][46663] Updated weights for policy 1, policy_version 52391 (0.0008) +[2023-10-13 02:25:42,600][46663] Updated weights for policy 1, policy_version 52401 (0.0009) +[2023-10-13 02:25:42,961][46663] Updated weights for policy 1, policy_version 52411 (0.0008) +[2023-10-13 02:25:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107380736. Throughput: 0: 1670.8, 1: 1683.2. Samples: 26848278. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:43,608][45375] Avg episode reward: [(0, '53.110'), (1, '48.820')] +[2023-10-13 02:25:44,092][46662] Updated weights for policy 0, policy_version 52450 (0.0009) +[2023-10-13 02:25:44,454][46662] Updated weights for policy 0, policy_version 52460 (0.0010) +[2023-10-13 02:25:44,826][46662] Updated weights for policy 0, policy_version 52470 (0.0008) +[2023-10-13 02:25:45,193][46662] Updated weights for policy 0, policy_version 52480 (0.0010) +[2023-10-13 02:25:47,005][46663] Updated weights for policy 1, policy_version 52421 (0.0008) +[2023-10-13 02:25:47,382][46663] Updated weights for policy 1, policy_version 52431 (0.0009) +[2023-10-13 02:25:47,744][46663] Updated weights for policy 1, policy_version 52441 (0.0010) +[2023-10-13 02:25:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107446272. Throughput: 0: 1695.9, 1: 1667.2. Samples: 26868410. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:48,607][45375] Avg episode reward: [(0, '53.360'), (1, '47.640')] +[2023-10-13 02:25:49,235][46662] Updated weights for policy 0, policy_version 52490 (0.0007) +[2023-10-13 02:25:49,610][46662] Updated weights for policy 0, policy_version 52500 (0.0007) +[2023-10-13 02:25:49,972][46662] Updated weights for policy 0, policy_version 52510 (0.0007) +[2023-10-13 02:25:51,775][46663] Updated weights for policy 1, policy_version 52451 (0.0009) +[2023-10-13 02:25:52,149][46663] Updated weights for policy 1, policy_version 52461 (0.0008) +[2023-10-13 02:25:52,513][46663] Updated weights for policy 1, policy_version 52471 (0.0008) +[2023-10-13 02:25:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107511808. Throughput: 0: 1691.4, 1: 1672.4. Samples: 26888508. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:25:53,607][45375] Avg episode reward: [(0, '52.080'), (1, '47.740')] +[2023-10-13 02:25:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000052512_53772288.pth... +[2023-10-13 02:25:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000052480_53739520.pth... 
+[2023-10-13 02:25:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000050944_52166656.pth +[2023-10-13 02:25:53,658][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000050912_52133888.pth +[2023-10-13 02:25:54,190][46662] Updated weights for policy 0, policy_version 52520 (0.0009) +[2023-10-13 02:25:54,558][46662] Updated weights for policy 0, policy_version 52530 (0.0007) +[2023-10-13 02:25:54,934][46662] Updated weights for policy 0, policy_version 52540 (0.0009) +[2023-10-13 02:25:56,499][46663] Updated weights for policy 1, policy_version 52481 (0.0008) +[2023-10-13 02:25:56,869][46663] Updated weights for policy 1, policy_version 52491 (0.0007) +[2023-10-13 02:25:57,237][46663] Updated weights for policy 1, policy_version 52501 (0.0007) +[2023-10-13 02:25:57,615][46663] Updated weights for policy 1, policy_version 52511 (0.0009) +[2023-10-13 02:25:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107577344. Throughput: 0: 1678.8, 1: 1688.5. Samples: 26898862. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:25:58,607][45375] Avg episode reward: [(0, '51.610'), (1, '47.430')] +[2023-10-13 02:25:59,022][46662] Updated weights for policy 0, policy_version 52550 (0.0008) +[2023-10-13 02:25:59,388][46662] Updated weights for policy 0, policy_version 52560 (0.0008) +[2023-10-13 02:25:59,761][46662] Updated weights for policy 0, policy_version 52570 (0.0009) +[2023-10-13 02:26:01,640][46663] Updated weights for policy 1, policy_version 52521 (0.0008) +[2023-10-13 02:26:02,010][46663] Updated weights for policy 1, policy_version 52531 (0.0010) +[2023-10-13 02:26:02,376][46663] Updated weights for policy 1, policy_version 52541 (0.0009) +[2023-10-13 02:26:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107642880. Throughput: 0: 1691.2, 1: 1666.2. Samples: 26918780. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:26:03,608][45375] Avg episode reward: [(0, '49.470'), (1, '45.880')] +[2023-10-13 02:26:03,737][46662] Updated weights for policy 0, policy_version 52580 (0.0008) +[2023-10-13 02:26:04,110][46662] Updated weights for policy 0, policy_version 52590 (0.0009) +[2023-10-13 02:26:04,491][46662] Updated weights for policy 0, policy_version 52600 (0.0008) +[2023-10-13 02:26:06,381][46663] Updated weights for policy 1, policy_version 52551 (0.0008) +[2023-10-13 02:26:06,755][46663] Updated weights for policy 1, policy_version 52561 (0.0007) +[2023-10-13 02:26:07,123][46663] Updated weights for policy 1, policy_version 52571 (0.0010) +[2023-10-13 02:26:08,501][46662] Updated weights for policy 0, policy_version 52610 (0.0009) +[2023-10-13 02:26:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107708416. Throughput: 0: 1686.2, 1: 1678.0. Samples: 26939134. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:26:08,607][45375] Avg episode reward: [(0, '47.460'), (1, '46.180')] +[2023-10-13 02:26:08,867][46662] Updated weights for policy 0, policy_version 52620 (0.0008) +[2023-10-13 02:26:09,247][46662] Updated weights for policy 0, policy_version 52630 (0.0007) +[2023-10-13 02:26:09,610][46662] Updated weights for policy 0, policy_version 52640 (0.0009) +[2023-10-13 02:26:11,379][46663] Updated weights for policy 1, policy_version 52581 (0.0009) +[2023-10-13 02:26:11,745][46663] Updated weights for policy 1, policy_version 52591 (0.0007) +[2023-10-13 02:26:12,104][46663] Updated weights for policy 1, policy_version 52601 (0.0008) +[2023-10-13 02:26:13,561][46662] Updated weights for policy 0, policy_version 52650 (0.0008) +[2023-10-13 02:26:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 107773952. Throughput: 0: 1686.4, 1: 1683.8. Samples: 26949388. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:26:13,607][45375] Avg episode reward: [(0, '48.190'), (1, '45.400')] +[2023-10-13 02:26:13,923][46662] Updated weights for policy 0, policy_version 52660 (0.0009) +[2023-10-13 02:26:14,299][46662] Updated weights for policy 0, policy_version 52670 (0.0008) +[2023-10-13 02:26:16,186][46663] Updated weights for policy 1, policy_version 52611 (0.0008) +[2023-10-13 02:26:16,550][46663] Updated weights for policy 1, policy_version 52621 (0.0007) +[2023-10-13 02:26:16,912][46663] Updated weights for policy 1, policy_version 52631 (0.0008) +[2023-10-13 02:26:18,383][46662] Updated weights for policy 0, policy_version 52680 (0.0009) +[2023-10-13 02:26:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107839488. Throughput: 0: 1686.5, 1: 1666.1. Samples: 26969228. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:26:18,607][45375] Avg episode reward: [(0, '48.790'), (1, '45.830')] +[2023-10-13 02:26:18,749][46662] Updated weights for policy 0, policy_version 52690 (0.0007) +[2023-10-13 02:26:19,122][46662] Updated weights for policy 0, policy_version 52700 (0.0007) +[2023-10-13 02:26:20,969][46663] Updated weights for policy 1, policy_version 52641 (0.0010) +[2023-10-13 02:26:21,365][46663] Updated weights for policy 1, policy_version 52651 (0.0009) +[2023-10-13 02:26:21,747][46663] Updated weights for policy 1, policy_version 52661 (0.0008) +[2023-10-13 02:26:22,122][46663] Updated weights for policy 1, policy_version 52671 (0.0007) +[2023-10-13 02:26:23,215][46662] Updated weights for policy 0, policy_version 52710 (0.0009) +[2023-10-13 02:26:23,584][46662] Updated weights for policy 0, policy_version 52720 (0.0009) +[2023-10-13 02:26:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107905024. Throughput: 0: 1684.2, 1: 1684.1. Samples: 26989632. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:26:23,607][45375] Avg episode reward: [(0, '46.990'), (1, '45.750')] +[2023-10-13 02:26:23,957][46662] Updated weights for policy 0, policy_version 52730 (0.0008) +[2023-10-13 02:26:26,085][46663] Updated weights for policy 1, policy_version 52681 (0.0010) +[2023-10-13 02:26:26,455][46663] Updated weights for policy 1, policy_version 52691 (0.0007) +[2023-10-13 02:26:26,826][46663] Updated weights for policy 1, policy_version 52701 (0.0007) +[2023-10-13 02:26:27,942][46662] Updated weights for policy 0, policy_version 52740 (0.0009) +[2023-10-13 02:26:28,312][46662] Updated weights for policy 0, policy_version 52750 (0.0008) +[2023-10-13 02:26:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107970560. Throughput: 0: 1690.4, 1: 1670.0. Samples: 26999494. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-13 02:26:28,607][45375] Avg episode reward: [(0, '46.200'), (1, '46.900')] +[2023-10-13 02:26:28,679][46662] Updated weights for policy 0, policy_version 52760 (0.0009) +[2023-10-13 02:26:30,925][46663] Updated weights for policy 1, policy_version 52711 (0.0009) +[2023-10-13 02:26:31,290][46663] Updated weights for policy 1, policy_version 52721 (0.0009) +[2023-10-13 02:26:31,666][46663] Updated weights for policy 1, policy_version 52731 (0.0009) +[2023-10-13 02:26:32,634][46662] Updated weights for policy 0, policy_version 52770 (0.0008) +[2023-10-13 02:26:33,009][46662] Updated weights for policy 0, policy_version 52780 (0.0007) +[2023-10-13 02:26:33,376][46662] Updated weights for policy 0, policy_version 52790 (0.0008) +[2023-10-13 02:26:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 108036096. Throughput: 0: 1694.3, 1: 1663.8. Samples: 27019526. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-13 02:26:33,608][45375] Avg episode reward: [(0, '45.820'), (1, '46.150')] +[2023-10-13 02:26:33,754][46662] Updated weights for policy 0, policy_version 52800 (0.0008) +[2023-10-13 02:26:35,765][46663] Updated weights for policy 1, policy_version 52741 (0.0008) +[2023-10-13 02:26:36,121][46663] Updated weights for policy 1, policy_version 52751 (0.0011) +[2023-10-13 02:26:36,497][46663] Updated weights for policy 1, policy_version 52761 (0.0009) +[2023-10-13 02:26:37,802][46662] Updated weights for policy 0, policy_version 52810 (0.0009) +[2023-10-13 02:26:38,174][46662] Updated weights for policy 0, policy_version 52820 (0.0009) +[2023-10-13 02:26:38,541][46662] Updated weights for policy 0, policy_version 52830 (0.0010) +[2023-10-13 02:26:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108101632. Throughput: 0: 1684.2, 1: 1675.9. Samples: 27039714. Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-13 02:26:38,607][45375] Avg episode reward: [(0, '44.930'), (1, '45.610')] +[2023-10-13 02:26:40,745][46663] Updated weights for policy 1, policy_version 52771 (0.0009) +[2023-10-13 02:26:41,111][46663] Updated weights for policy 1, policy_version 52781 (0.0007) +[2023-10-13 02:26:41,473][46663] Updated weights for policy 1, policy_version 52791 (0.0008) +[2023-10-13 02:26:42,622][46662] Updated weights for policy 0, policy_version 52840 (0.0010) +[2023-10-13 02:26:42,999][46662] Updated weights for policy 0, policy_version 52850 (0.0008) +[2023-10-13 02:26:43,378][46662] Updated weights for policy 0, policy_version 52860 (0.0008) +[2023-10-13 02:26:43,607][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 108199936. Throughput: 0: 1696.5, 1: 1654.3. Samples: 27049648. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-13 02:26:43,607][45375] Avg episode reward: [(0, '46.110'), (1, '46.300')] +[2023-10-13 02:26:45,423][46663] Updated weights for policy 1, policy_version 52801 (0.0008) +[2023-10-13 02:26:45,782][46663] Updated weights for policy 1, policy_version 52811 (0.0008) +[2023-10-13 02:26:46,150][46663] Updated weights for policy 1, policy_version 52821 (0.0007) +[2023-10-13 02:26:46,514][46663] Updated weights for policy 1, policy_version 52831 (0.0008) +[2023-10-13 02:26:47,481][46662] Updated weights for policy 0, policy_version 52870 (0.0009) +[2023-10-13 02:26:47,845][46662] Updated weights for policy 0, policy_version 52880 (0.0008) +[2023-10-13 02:26:48,222][46662] Updated weights for policy 0, policy_version 52890 (0.0009) +[2023-10-13 02:26:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 108265472. Throughput: 0: 1690.6, 1: 1667.3. Samples: 27069886. Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-13 02:26:48,607][45375] Avg episode reward: [(0, '44.750'), (1, '45.610')] +[2023-10-13 02:26:50,810][46663] Updated weights for policy 1, policy_version 52841 (0.0008) +[2023-10-13 02:26:51,180][46663] Updated weights for policy 1, policy_version 52851 (0.0009) +[2023-10-13 02:26:51,548][46663] Updated weights for policy 1, policy_version 52861 (0.0008) +[2023-10-13 02:26:52,331][46662] Updated weights for policy 0, policy_version 52900 (0.0009) +[2023-10-13 02:26:52,694][46662] Updated weights for policy 0, policy_version 52910 (0.0009) +[2023-10-13 02:26:53,070][46662] Updated weights for policy 0, policy_version 52920 (0.0010) +[2023-10-13 02:26:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108331008. Throughput: 0: 1674.2, 1: 1675.1. Samples: 27089852. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-13 02:26:53,607][45375] Avg episode reward: [(0, '46.020'), (1, '45.510')] +[2023-10-13 02:26:55,620][46663] Updated weights for policy 1, policy_version 52871 (0.0008) +[2023-10-13 02:26:55,987][46663] Updated weights for policy 1, policy_version 52881 (0.0009) +[2023-10-13 02:26:56,368][46663] Updated weights for policy 1, policy_version 52891 (0.0008) +[2023-10-13 02:26:57,137][46662] Updated weights for policy 0, policy_version 52930 (0.0009) +[2023-10-13 02:26:57,504][46662] Updated weights for policy 0, policy_version 52940 (0.0009) +[2023-10-13 02:26:57,890][46662] Updated weights for policy 0, policy_version 52950 (0.0009) +[2023-10-13 02:26:58,253][46662] Updated weights for policy 0, policy_version 52960 (0.0007) +[2023-10-13 02:26:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108396544. Throughput: 0: 1686.9, 1: 1653.6. Samples: 27099710. Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-13 02:26:58,607][45375] Avg episode reward: [(0, '46.260'), (1, '45.380')] +[2023-10-13 02:27:00,494][46663] Updated weights for policy 1, policy_version 52901 (0.0007) +[2023-10-13 02:27:00,853][46663] Updated weights for policy 1, policy_version 52911 (0.0007) +[2023-10-13 02:27:01,222][46663] Updated weights for policy 1, policy_version 52921 (0.0007) +[2023-10-13 02:27:02,378][46662] Updated weights for policy 0, policy_version 52970 (0.0008) +[2023-10-13 02:27:02,743][46662] Updated weights for policy 0, policy_version 52980 (0.0010) +[2023-10-13 02:27:03,123][46662] Updated weights for policy 0, policy_version 52990 (0.0010) +[2023-10-13 02:27:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 108462080. Throughput: 0: 1689.2, 1: 1669.0. Samples: 27120346. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:03,607][45375] Avg episode reward: [(0, '46.910'), (1, '46.130')] +[2023-10-13 02:27:05,281][46663] Updated weights for policy 1, policy_version 52931 (0.0008) +[2023-10-13 02:27:05,681][46663] Updated weights for policy 1, policy_version 52941 (0.0007) +[2023-10-13 02:27:06,045][46663] Updated weights for policy 1, policy_version 52951 (0.0009) +[2023-10-13 02:27:07,032][46662] Updated weights for policy 0, policy_version 53000 (0.0009) +[2023-10-13 02:27:07,405][46662] Updated weights for policy 0, policy_version 53010 (0.0009) +[2023-10-13 02:27:07,771][46662] Updated weights for policy 0, policy_version 53020 (0.0008) +[2023-10-13 02:27:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108527616. Throughput: 0: 1671.5, 1: 1675.6. Samples: 27140250. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:08,607][45375] Avg episode reward: [(0, '47.270'), (1, '46.610')] +[2023-10-13 02:27:10,191][46663] Updated weights for policy 1, policy_version 52961 (0.0010) +[2023-10-13 02:27:10,560][46663] Updated weights for policy 1, policy_version 52971 (0.0007) +[2023-10-13 02:27:10,927][46663] Updated weights for policy 1, policy_version 52981 (0.0008) +[2023-10-13 02:27:11,291][46663] Updated weights for policy 1, policy_version 52991 (0.0008) +[2023-10-13 02:27:11,812][46662] Updated weights for policy 0, policy_version 53030 (0.0008) +[2023-10-13 02:27:12,177][46662] Updated weights for policy 0, policy_version 53040 (0.0007) +[2023-10-13 02:27:12,552][46662] Updated weights for policy 0, policy_version 53050 (0.0009) +[2023-10-13 02:27:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108593152. Throughput: 0: 1693.5, 1: 1656.8. Samples: 27150258. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:13,608][45375] Avg episode reward: [(0, '48.040'), (1, '47.680')] +[2023-10-13 02:27:15,099][46663] Updated weights for policy 1, policy_version 53001 (0.0007) +[2023-10-13 02:27:15,471][46663] Updated weights for policy 1, policy_version 53011 (0.0008) +[2023-10-13 02:27:15,843][46663] Updated weights for policy 1, policy_version 53021 (0.0008) +[2023-10-13 02:27:16,573][46662] Updated weights for policy 0, policy_version 53060 (0.0008) +[2023-10-13 02:27:16,945][46662] Updated weights for policy 0, policy_version 53070 (0.0007) +[2023-10-13 02:27:17,304][46662] Updated weights for policy 0, policy_version 53080 (0.0008) +[2023-10-13 02:27:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108658688. Throughput: 0: 1685.0, 1: 1677.0. Samples: 27170818. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:18,607][45375] Avg episode reward: [(0, '48.080'), (1, '47.150')] +[2023-10-13 02:27:19,845][46663] Updated weights for policy 1, policy_version 53031 (0.0008) +[2023-10-13 02:27:20,215][46663] Updated weights for policy 1, policy_version 53041 (0.0008) +[2023-10-13 02:27:20,578][46663] Updated weights for policy 1, policy_version 53051 (0.0008) +[2023-10-13 02:27:21,444][46662] Updated weights for policy 0, policy_version 53090 (0.0009) +[2023-10-13 02:27:21,808][46662] Updated weights for policy 0, policy_version 53100 (0.0010) +[2023-10-13 02:27:22,187][46662] Updated weights for policy 0, policy_version 53110 (0.0009) +[2023-10-13 02:27:22,548][46662] Updated weights for policy 0, policy_version 53120 (0.0008) +[2023-10-13 02:27:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108724224. Throughput: 0: 1671.4, 1: 1681.5. Samples: 27190592. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:23,608][45375] Avg episode reward: [(0, '48.120'), (1, '46.800')] +[2023-10-13 02:27:24,643][46663] Updated weights for policy 1, policy_version 53061 (0.0009) +[2023-10-13 02:27:25,012][46663] Updated weights for policy 1, policy_version 53071 (0.0008) +[2023-10-13 02:27:25,382][46663] Updated weights for policy 1, policy_version 53081 (0.0011) +[2023-10-13 02:27:26,484][46662] Updated weights for policy 0, policy_version 53130 (0.0009) +[2023-10-13 02:27:26,848][46662] Updated weights for policy 0, policy_version 53140 (0.0011) +[2023-10-13 02:27:27,228][46662] Updated weights for policy 0, policy_version 53150 (0.0010) +[2023-10-13 02:27:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108789760. Throughput: 0: 1691.7, 1: 1669.9. Samples: 27200920. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:28,607][45375] Avg episode reward: [(0, '48.190'), (1, '45.710')] +[2023-10-13 02:27:29,494][46663] Updated weights for policy 1, policy_version 53091 (0.0009) +[2023-10-13 02:27:29,864][46663] Updated weights for policy 1, policy_version 53101 (0.0010) +[2023-10-13 02:27:30,237][46663] Updated weights for policy 1, policy_version 53111 (0.0011) +[2023-10-13 02:27:31,484][46662] Updated weights for policy 0, policy_version 53160 (0.0009) +[2023-10-13 02:27:31,862][46662] Updated weights for policy 0, policy_version 53170 (0.0008) +[2023-10-13 02:27:32,246][46662] Updated weights for policy 0, policy_version 53180 (0.0008) +[2023-10-13 02:27:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108855296. Throughput: 0: 1670.9, 1: 1679.6. Samples: 27220656. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-13 02:27:33,608][45375] Avg episode reward: [(0, '48.120'), (1, '45.860')] +[2023-10-13 02:27:34,349][46663] Updated weights for policy 1, policy_version 53121 (0.0008) +[2023-10-13 02:27:34,720][46663] Updated weights for policy 1, policy_version 53131 (0.0008) +[2023-10-13 02:27:35,077][46663] Updated weights for policy 1, policy_version 53141 (0.0009) +[2023-10-13 02:27:35,448][46663] Updated weights for policy 1, policy_version 53151 (0.0009) +[2023-10-13 02:27:36,312][46662] Updated weights for policy 0, policy_version 53190 (0.0010) +[2023-10-13 02:27:36,677][46662] Updated weights for policy 0, policy_version 53200 (0.0010) +[2023-10-13 02:27:37,056][46662] Updated weights for policy 0, policy_version 53210 (0.0010) +[2023-10-13 02:27:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108920832. Throughput: 0: 1668.0, 1: 1683.6. Samples: 27240670. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:27:38,607][45375] Avg episode reward: [(0, '49.080'), (1, '46.800')] +[2023-10-13 02:27:39,543][46663] Updated weights for policy 1, policy_version 53161 (0.0009) +[2023-10-13 02:27:39,913][46663] Updated weights for policy 1, policy_version 53171 (0.0010) +[2023-10-13 02:27:40,291][46663] Updated weights for policy 1, policy_version 53181 (0.0010) +[2023-10-13 02:27:41,287][46662] Updated weights for policy 0, policy_version 53220 (0.0009) +[2023-10-13 02:27:41,662][46662] Updated weights for policy 0, policy_version 53230 (0.0009) +[2023-10-13 02:27:42,032][46662] Updated weights for policy 0, policy_version 53240 (0.0008) +[2023-10-13 02:27:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108986368. Throughput: 0: 1684.4, 1: 1677.5. Samples: 27250996. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:27:43,607][45375] Avg episode reward: [(0, '47.850'), (1, '47.210')] +[2023-10-13 02:27:44,333][46663] Updated weights for policy 1, policy_version 53191 (0.0008) +[2023-10-13 02:27:44,701][46663] Updated weights for policy 1, policy_version 53201 (0.0007) +[2023-10-13 02:27:45,065][46663] Updated weights for policy 1, policy_version 53211 (0.0007) +[2023-10-13 02:27:46,047][46662] Updated weights for policy 0, policy_version 53250 (0.0008) +[2023-10-13 02:27:46,415][46662] Updated weights for policy 0, policy_version 53260 (0.0008) +[2023-10-13 02:27:46,791][46662] Updated weights for policy 0, policy_version 53270 (0.0009) +[2023-10-13 02:27:47,156][46662] Updated weights for policy 0, policy_version 53280 (0.0008) +[2023-10-13 02:27:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109051904. Throughput: 0: 1660.0, 1: 1682.4. Samples: 27270754. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:27:48,607][45375] Avg episode reward: [(0, '49.160'), (1, '46.280')] +[2023-10-13 02:27:49,241][46663] Updated weights for policy 1, policy_version 53221 (0.0011) +[2023-10-13 02:27:49,617][46663] Updated weights for policy 1, policy_version 53231 (0.0009) +[2023-10-13 02:27:49,983][46663] Updated weights for policy 1, policy_version 53241 (0.0007) +[2023-10-13 02:27:51,344][46662] Updated weights for policy 0, policy_version 53290 (0.0009) +[2023-10-13 02:27:51,725][46662] Updated weights for policy 0, policy_version 53300 (0.0009) +[2023-10-13 02:27:52,100][46662] Updated weights for policy 0, policy_version 53310 (0.0008) +[2023-10-13 02:27:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 109117440. Throughput: 0: 1667.7, 1: 1677.1. Samples: 27290764. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:27:53,608][45375] Avg episode reward: [(0, '47.800'), (1, '46.510')] +[2023-10-13 02:27:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000053248_54525952.pth... +[2023-10-13 02:27:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000053312_54591488.pth... +[2023-10-13 02:27:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000051744_52985856.pth +[2023-10-13 02:27:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000051680_52920320.pth +[2023-10-13 02:27:54,164][46663] Updated weights for policy 1, policy_version 53251 (0.0007) +[2023-10-13 02:27:54,561][46663] Updated weights for policy 1, policy_version 53261 (0.0008) +[2023-10-13 02:27:54,922][46663] Updated weights for policy 1, policy_version 53271 (0.0007) +[2023-10-13 02:27:56,082][46662] Updated weights for policy 0, policy_version 53320 (0.0010) +[2023-10-13 02:27:56,460][46662] Updated weights for policy 0, policy_version 53330 (0.0007) +[2023-10-13 02:27:56,828][46662] Updated weights for policy 0, policy_version 53340 (0.0008) +[2023-10-13 02:27:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109182976. Throughput: 0: 1673.5, 1: 1673.8. Samples: 27300888. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:27:58,607][45375] Avg episode reward: [(0, '47.770'), (1, '47.900')] +[2023-10-13 02:27:59,112][46663] Updated weights for policy 1, policy_version 53281 (0.0010) +[2023-10-13 02:27:59,479][46663] Updated weights for policy 1, policy_version 53291 (0.0009) +[2023-10-13 02:27:59,834][46663] Updated weights for policy 1, policy_version 53301 (0.0008) +[2023-10-13 02:28:00,200][46663] Updated weights for policy 1, policy_version 53311 (0.0010) +[2023-10-13 02:28:00,895][46662] Updated weights for policy 0, policy_version 53350 (0.0009) +[2023-10-13 02:28:01,273][46662] Updated weights for policy 0, policy_version 53360 (0.0008) +[2023-10-13 02:28:01,640][46662] Updated weights for policy 0, policy_version 53370 (0.0009) +[2023-10-13 02:28:03,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109248512. Throughput: 0: 1652.4, 1: 1672.0. Samples: 27320416. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:28:03,607][45375] Avg episode reward: [(0, '48.990'), (1, '47.500')] +[2023-10-13 02:28:04,439][46663] Updated weights for policy 1, policy_version 53321 (0.0008) +[2023-10-13 02:28:04,807][46663] Updated weights for policy 1, policy_version 53331 (0.0008) +[2023-10-13 02:28:05,177][46663] Updated weights for policy 1, policy_version 53341 (0.0008) +[2023-10-13 02:28:05,583][46662] Updated weights for policy 0, policy_version 53380 (0.0010) +[2023-10-13 02:28:05,955][46662] Updated weights for policy 0, policy_version 53390 (0.0009) +[2023-10-13 02:28:06,329][46662] Updated weights for policy 0, policy_version 53400 (0.0009) +[2023-10-13 02:28:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109314048. Throughput: 0: 1672.3, 1: 1669.1. Samples: 27340954. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-13 02:28:08,607][45375] Avg episode reward: [(0, '47.980'), (1, '47.140')] +[2023-10-13 02:28:09,104][46663] Updated weights for policy 1, policy_version 53351 (0.0010) +[2023-10-13 02:28:09,476][46663] Updated weights for policy 1, policy_version 53361 (0.0010) +[2023-10-13 02:28:09,841][46663] Updated weights for policy 1, policy_version 53371 (0.0007) +[2023-10-13 02:28:10,544][46662] Updated weights for policy 0, policy_version 53410 (0.0007) +[2023-10-13 02:28:10,915][46662] Updated weights for policy 0, policy_version 53420 (0.0009) +[2023-10-13 02:28:11,279][46662] Updated weights for policy 0, policy_version 53430 (0.0009) +[2023-10-13 02:28:11,658][46662] Updated weights for policy 0, policy_version 53440 (0.0007) +[2023-10-13 02:28:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109379584. Throughput: 0: 1659.7, 1: 1673.9. Samples: 27350932. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:13,608][45375] Avg episode reward: [(0, '48.750'), (1, '48.460')] +[2023-10-13 02:28:14,040][46663] Updated weights for policy 1, policy_version 53381 (0.0008) +[2023-10-13 02:28:14,401][46663] Updated weights for policy 1, policy_version 53391 (0.0008) +[2023-10-13 02:28:14,773][46663] Updated weights for policy 1, policy_version 53401 (0.0010) +[2023-10-13 02:28:15,525][46662] Updated weights for policy 0, policy_version 53450 (0.0011) +[2023-10-13 02:28:15,896][46662] Updated weights for policy 0, policy_version 53460 (0.0008) +[2023-10-13 02:28:16,263][46662] Updated weights for policy 0, policy_version 53470 (0.0009) +[2023-10-13 02:28:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109445120. Throughput: 0: 1661.7, 1: 1673.8. Samples: 27370756. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:18,607][45375] Avg episode reward: [(0, '47.280'), (1, '49.350')] +[2023-10-13 02:28:18,785][46663] Updated weights for policy 1, policy_version 53411 (0.0008) +[2023-10-13 02:28:19,154][46663] Updated weights for policy 1, policy_version 53421 (0.0007) +[2023-10-13 02:28:19,522][46663] Updated weights for policy 1, policy_version 53431 (0.0008) +[2023-10-13 02:28:20,383][46662] Updated weights for policy 0, policy_version 53480 (0.0010) +[2023-10-13 02:28:20,757][46662] Updated weights for policy 0, policy_version 53490 (0.0011) +[2023-10-13 02:28:21,119][46662] Updated weights for policy 0, policy_version 53500 (0.0008) +[2023-10-13 02:28:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109510656. Throughput: 0: 1682.7, 1: 1671.3. Samples: 27391598. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:23,607][45375] Avg episode reward: [(0, '45.770'), (1, '49.210')] +[2023-10-13 02:28:23,628][46663] Updated weights for policy 1, policy_version 53441 (0.0007) +[2023-10-13 02:28:23,994][46663] Updated weights for policy 1, policy_version 53451 (0.0010) +[2023-10-13 02:28:24,359][46663] Updated weights for policy 1, policy_version 53461 (0.0009) +[2023-10-13 02:28:24,722][46663] Updated weights for policy 1, policy_version 53471 (0.0008) +[2023-10-13 02:28:24,993][46662] Updated weights for policy 0, policy_version 53510 (0.0007) +[2023-10-13 02:28:25,361][46662] Updated weights for policy 0, policy_version 53520 (0.0008) +[2023-10-13 02:28:25,734][46662] Updated weights for policy 0, policy_version 53530 (0.0008) +[2023-10-13 02:28:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 109576192. Throughput: 0: 1658.3, 1: 1671.4. Samples: 27400834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:28,607][45375] Avg episode reward: [(0, '46.570'), (1, '49.390')] +[2023-10-13 02:28:28,796][46663] Updated weights for policy 1, policy_version 53481 (0.0008) +[2023-10-13 02:28:29,171][46663] Updated weights for policy 1, policy_version 53491 (0.0008) +[2023-10-13 02:28:29,533][46663] Updated weights for policy 1, policy_version 53501 (0.0009) +[2023-10-13 02:28:29,741][46662] Updated weights for policy 0, policy_version 53540 (0.0008) +[2023-10-13 02:28:30,115][46662] Updated weights for policy 0, policy_version 53550 (0.0008) +[2023-10-13 02:28:30,487][46662] Updated weights for policy 0, policy_version 53560 (0.0008) +[2023-10-13 02:28:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 109641728. Throughput: 0: 1681.6, 1: 1666.9. Samples: 27421440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:33,607][45375] Avg episode reward: [(0, '45.790'), (1, '50.440')] +[2023-10-13 02:28:33,776][46663] Updated weights for policy 1, policy_version 53511 (0.0008) +[2023-10-13 02:28:34,155][46663] Updated weights for policy 1, policy_version 53521 (0.0009) +[2023-10-13 02:28:34,523][46663] Updated weights for policy 1, policy_version 53531 (0.0010) +[2023-10-13 02:28:34,704][46662] Updated weights for policy 0, policy_version 53570 (0.0009) +[2023-10-13 02:28:35,073][46662] Updated weights for policy 0, policy_version 53580 (0.0008) +[2023-10-13 02:28:35,438][46662] Updated weights for policy 0, policy_version 53590 (0.0007) +[2023-10-13 02:28:35,808][46662] Updated weights for policy 0, policy_version 53600 (0.0008) +[2023-10-13 02:28:38,490][46663] Updated weights for policy 1, policy_version 53541 (0.0010) +[2023-10-13 02:28:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 109707264. Throughput: 0: 1692.3, 1: 1665.9. Samples: 27441884. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:38,607][45375] Avg episode reward: [(0, '45.310'), (1, '52.100')] +[2023-10-13 02:28:38,862][46663] Updated weights for policy 1, policy_version 53551 (0.0010) +[2023-10-13 02:28:39,222][46663] Updated weights for policy 1, policy_version 53561 (0.0010) +[2023-10-13 02:28:39,888][46662] Updated weights for policy 0, policy_version 53610 (0.0007) +[2023-10-13 02:28:40,262][46662] Updated weights for policy 0, policy_version 53620 (0.0007) +[2023-10-13 02:28:40,634][46662] Updated weights for policy 0, policy_version 53630 (0.0008) +[2023-10-13 02:28:43,380][46663] Updated weights for policy 1, policy_version 53571 (0.0010) +[2023-10-13 02:28:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 109772800. Throughput: 0: 1664.0, 1: 1672.6. Samples: 27451032. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:43,607][45375] Avg episode reward: [(0, '46.090'), (1, '53.220')] +[2023-10-13 02:28:43,782][46663] Updated weights for policy 1, policy_version 53581 (0.0010) +[2023-10-13 02:28:44,161][46663] Updated weights for policy 1, policy_version 53591 (0.0009) +[2023-10-13 02:28:44,595][46662] Updated weights for policy 0, policy_version 53640 (0.0008) +[2023-10-13 02:28:44,968][46662] Updated weights for policy 0, policy_version 53650 (0.0008) +[2023-10-13 02:28:45,343][46662] Updated weights for policy 0, policy_version 53660 (0.0011) +[2023-10-13 02:28:48,247][46663] Updated weights for policy 1, policy_version 53601 (0.0007) +[2023-10-13 02:28:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 109838336. Throughput: 0: 1688.4, 1: 1668.3. Samples: 27471466. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-13 02:28:48,607][45375] Avg episode reward: [(0, '45.390'), (1, '53.060')] +[2023-10-13 02:28:48,613][46663] Updated weights for policy 1, policy_version 53611 (0.0009) +[2023-10-13 02:28:48,977][46663] Updated weights for policy 1, policy_version 53621 (0.0011) +[2023-10-13 02:28:49,342][46663] Updated weights for policy 1, policy_version 53631 (0.0011) +[2023-10-13 02:28:49,688][46662] Updated weights for policy 0, policy_version 53670 (0.0008) +[2023-10-13 02:28:50,059][46662] Updated weights for policy 0, policy_version 53680 (0.0007) +[2023-10-13 02:28:50,425][46662] Updated weights for policy 0, policy_version 53690 (0.0007) +[2023-10-13 02:28:53,455][46663] Updated weights for policy 1, policy_version 53641 (0.0008) +[2023-10-13 02:28:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 109903872. Throughput: 0: 1687.7, 1: 1661.4. Samples: 27491664. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:28:53,607][45375] Avg episode reward: [(0, '46.270'), (1, '54.070')] +[2023-10-13 02:28:53,824][46663] Updated weights for policy 1, policy_version 53651 (0.0008) +[2023-10-13 02:28:54,190][46663] Updated weights for policy 1, policy_version 53661 (0.0007) +[2023-10-13 02:28:54,568][46662] Updated weights for policy 0, policy_version 53700 (0.0008) +[2023-10-13 02:28:54,941][46662] Updated weights for policy 0, policy_version 53710 (0.0011) +[2023-10-13 02:28:55,317][46662] Updated weights for policy 0, policy_version 53720 (0.0009) +[2023-10-13 02:28:58,248][46663] Updated weights for policy 1, policy_version 53671 (0.0008) +[2023-10-13 02:28:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 109969408. Throughput: 0: 1670.1, 1: 1668.4. Samples: 27501162. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:28:58,608][45375] Avg episode reward: [(0, '45.660'), (1, '55.390')] +[2023-10-13 02:28:58,621][46663] Updated weights for policy 1, policy_version 53681 (0.0008) +[2023-10-13 02:28:59,004][46663] Updated weights for policy 1, policy_version 53691 (0.0011) +[2023-10-13 02:28:59,476][46662] Updated weights for policy 0, policy_version 53730 (0.0011) +[2023-10-13 02:28:59,835][46662] Updated weights for policy 0, policy_version 53740 (0.0008) +[2023-10-13 02:29:00,204][46662] Updated weights for policy 0, policy_version 53750 (0.0008) +[2023-10-13 02:29:00,567][46662] Updated weights for policy 0, policy_version 53760 (0.0009) +[2023-10-13 02:29:03,084][46663] Updated weights for policy 1, policy_version 53701 (0.0009) +[2023-10-13 02:29:03,452][46663] Updated weights for policy 1, policy_version 53711 (0.0007) +[2023-10-13 02:29:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 110034944. Throughput: 0: 1690.1, 1: 1665.7. Samples: 27521770. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:29:03,607][45375] Avg episode reward: [(0, '45.510'), (1, '55.600')] +[2023-10-13 02:29:03,823][46663] Updated weights for policy 1, policy_version 53721 (0.0007) +[2023-10-13 02:29:04,587][46662] Updated weights for policy 0, policy_version 53770 (0.0007) +[2023-10-13 02:29:04,943][46662] Updated weights for policy 0, policy_version 53780 (0.0009) +[2023-10-13 02:29:05,322][46662] Updated weights for policy 0, policy_version 53790 (0.0009) +[2023-10-13 02:29:07,878][46663] Updated weights for policy 1, policy_version 53731 (0.0008) +[2023-10-13 02:29:08,252][46663] Updated weights for policy 1, policy_version 53741 (0.0007) +[2023-10-13 02:29:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 110100480. Throughput: 0: 1687.5, 1: 1653.3. Samples: 27541934. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:29:08,607][45375] Avg episode reward: [(0, '45.710'), (1, '55.190')] +[2023-10-13 02:29:08,617][46663] Updated weights for policy 1, policy_version 53751 (0.0008) +[2023-10-13 02:29:09,247][46662] Updated weights for policy 0, policy_version 53800 (0.0009) +[2023-10-13 02:29:09,633][46662] Updated weights for policy 0, policy_version 53810 (0.0008) +[2023-10-13 02:29:09,995][46662] Updated weights for policy 0, policy_version 53820 (0.0008) +[2023-10-13 02:29:12,687][46663] Updated weights for policy 1, policy_version 53761 (0.0008) +[2023-10-13 02:29:13,057][46663] Updated weights for policy 1, policy_version 53771 (0.0007) +[2023-10-13 02:29:13,428][46663] Updated weights for policy 1, policy_version 53781 (0.0009) +[2023-10-13 02:29:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 110166016. Throughput: 0: 1683.5, 1: 1672.7. Samples: 27551862. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:29:13,607][45375] Avg episode reward: [(0, '45.670'), (1, '55.410')] +[2023-10-13 02:29:13,801][46663] Updated weights for policy 1, policy_version 53791 (0.0008) +[2023-10-13 02:29:13,949][46662] Updated weights for policy 0, policy_version 53830 (0.0007) +[2023-10-13 02:29:14,321][46662] Updated weights for policy 0, policy_version 53840 (0.0009) +[2023-10-13 02:29:14,689][46662] Updated weights for policy 0, policy_version 53850 (0.0009) +[2023-10-13 02:29:17,832][46663] Updated weights for policy 1, policy_version 53801 (0.0010) +[2023-10-13 02:29:18,208][46663] Updated weights for policy 1, policy_version 53811 (0.0008) +[2023-10-13 02:29:18,571][46663] Updated weights for policy 1, policy_version 53821 (0.0007) +[2023-10-13 02:29:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 110231552. Throughput: 0: 1680.0, 1: 1676.2. Samples: 27572468. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:29:18,607][45375] Avg episode reward: [(0, '46.020'), (1, '53.700')] +[2023-10-13 02:29:18,623][46662] Updated weights for policy 0, policy_version 53860 (0.0008) +[2023-10-13 02:29:18,997][46662] Updated weights for policy 0, policy_version 53870 (0.0011) +[2023-10-13 02:29:19,373][46662] Updated weights for policy 0, policy_version 53880 (0.0010) +[2023-10-13 02:29:22,667][46663] Updated weights for policy 1, policy_version 53831 (0.0009) +[2023-10-13 02:29:23,042][46663] Updated weights for policy 1, policy_version 53841 (0.0009) +[2023-10-13 02:29:23,406][46663] Updated weights for policy 1, policy_version 53851 (0.0007) +[2023-10-13 02:29:23,455][46662] Updated weights for policy 0, policy_version 53890 (0.0009) +[2023-10-13 02:29:23,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110329856. Throughput: 0: 1680.2, 1: 1658.1. Samples: 27592108. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:29:23,607][45375] Avg episode reward: [(0, '47.450'), (1, '53.050')] +[2023-10-13 02:29:23,825][46662] Updated weights for policy 0, policy_version 53900 (0.0007) +[2023-10-13 02:29:24,193][46662] Updated weights for policy 0, policy_version 53910 (0.0007) +[2023-10-13 02:29:24,556][46662] Updated weights for policy 0, policy_version 53920 (0.0009) +[2023-10-13 02:29:27,426][46663] Updated weights for policy 1, policy_version 53861 (0.0010) +[2023-10-13 02:29:27,788][46663] Updated weights for policy 1, policy_version 53871 (0.0009) +[2023-10-13 02:29:28,146][46663] Updated weights for policy 1, policy_version 53881 (0.0007) +[2023-10-13 02:29:28,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110395392. Throughput: 0: 1680.7, 1: 1682.8. Samples: 27602390. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-13 02:29:28,607][45375] Avg episode reward: [(0, '48.680'), (1, '52.670')] +[2023-10-13 02:29:28,649][46662] Updated weights for policy 0, policy_version 53930 (0.0009) +[2023-10-13 02:29:29,017][46662] Updated weights for policy 0, policy_version 53940 (0.0008) +[2023-10-13 02:29:29,394][46662] Updated weights for policy 0, policy_version 53950 (0.0008) +[2023-10-13 02:29:32,445][46663] Updated weights for policy 1, policy_version 53891 (0.0008) +[2023-10-13 02:29:32,842][46663] Updated weights for policy 1, policy_version 53901 (0.0009) +[2023-10-13 02:29:33,215][46663] Updated weights for policy 1, policy_version 53911 (0.0007) +[2023-10-13 02:29:33,425][46662] Updated weights for policy 0, policy_version 53960 (0.0008) +[2023-10-13 02:29:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110460928. Throughput: 0: 1686.1, 1: 1681.2. Samples: 27622994. Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) +[2023-10-13 02:29:33,608][45375] Avg episode reward: [(0, '48.200'), (1, '54.090')] +[2023-10-13 02:29:33,791][46662] Updated weights for policy 0, policy_version 53970 (0.0009) +[2023-10-13 02:29:34,158][46662] Updated weights for policy 0, policy_version 53980 (0.0010) +[2023-10-13 02:29:37,218][46663] Updated weights for policy 1, policy_version 53921 (0.0007) +[2023-10-13 02:29:37,591][46663] Updated weights for policy 1, policy_version 53931 (0.0007) +[2023-10-13 02:29:37,946][46663] Updated weights for policy 1, policy_version 53941 (0.0009) +[2023-10-13 02:29:38,068][46662] Updated weights for policy 0, policy_version 53990 (0.0009) +[2023-10-13 02:29:38,317][46663] Updated weights for policy 1, policy_version 53951 (0.0008) +[2023-10-13 02:29:38,427][46662] Updated weights for policy 0, policy_version 54000 (0.0009) +[2023-10-13 02:29:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110526464. 
Throughput: 0: 1697.6, 1: 1661.8. Samples: 27642834. Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) +[2023-10-13 02:29:38,607][45375] Avg episode reward: [(0, '48.120'), (1, '53.450')] +[2023-10-13 02:29:38,795][46662] Updated weights for policy 0, policy_version 54010 (0.0009) +[2023-10-13 02:29:42,535][46663] Updated weights for policy 1, policy_version 53961 (0.0009) +[2023-10-13 02:29:42,832][46662] Updated weights for policy 0, policy_version 54020 (0.0009) +[2023-10-13 02:29:42,899][46663] Updated weights for policy 1, policy_version 53971 (0.0008) +[2023-10-13 02:29:43,207][46662] Updated weights for policy 0, policy_version 54030 (0.0008) +[2023-10-13 02:29:43,265][46663] Updated weights for policy 1, policy_version 53981 (0.0007) +[2023-10-13 02:29:43,568][46662] Updated weights for policy 0, policy_version 54040 (0.0007) +[2023-10-13 02:29:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110592000. Throughput: 0: 1697.6, 1: 1677.7. Samples: 27653054. Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) +[2023-10-13 02:29:43,607][45375] Avg episode reward: [(0, '47.230'), (1, '49.240')] +[2023-10-13 02:29:47,382][46663] Updated weights for policy 1, policy_version 53991 (0.0009) +[2023-10-13 02:29:47,559][46662] Updated weights for policy 0, policy_version 54050 (0.0007) +[2023-10-13 02:29:47,757][46663] Updated weights for policy 1, policy_version 54001 (0.0009) +[2023-10-13 02:29:47,925][46662] Updated weights for policy 0, policy_version 54060 (0.0008) +[2023-10-13 02:29:48,116][46663] Updated weights for policy 1, policy_version 54011 (0.0009) +[2023-10-13 02:29:48,293][46662] Updated weights for policy 0, policy_version 54070 (0.0009) +[2023-10-13 02:29:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110657536. Throughput: 0: 1703.2, 1: 1674.0. Samples: 27673744. 
Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) +[2023-10-13 02:29:48,607][45375] Avg episode reward: [(0, '46.200'), (1, '49.670')] +[2023-10-13 02:29:48,665][46662] Updated weights for policy 0, policy_version 54080 (0.0007) +[2023-10-13 02:29:52,129][46663] Updated weights for policy 1, policy_version 54021 (0.0008) +[2023-10-13 02:29:52,497][46663] Updated weights for policy 1, policy_version 54031 (0.0007) +[2023-10-13 02:29:52,862][46663] Updated weights for policy 1, policy_version 54041 (0.0009) +[2023-10-13 02:29:52,862][46662] Updated weights for policy 0, policy_version 54090 (0.0009) +[2023-10-13 02:29:53,226][46662] Updated weights for policy 0, policy_version 54100 (0.0010) +[2023-10-13 02:29:53,606][46662] Updated weights for policy 0, policy_version 54110 (0.0007) +[2023-10-13 02:29:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110723072. Throughput: 0: 1694.1, 1: 1667.3. Samples: 27693196. Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) +[2023-10-13 02:29:53,607][45375] Avg episode reward: [(0, '46.770'), (1, '50.590')] +[2023-10-13 02:29:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000054048_55345152.pth... +[2023-10-13 02:29:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000052480_53739520.pth +[2023-10-13 02:29:53,679][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000054112_55410688.pth... 
+[2023-10-13 02:29:53,716][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000052512_53772288.pth +[2023-10-13 02:29:56,894][46663] Updated weights for policy 1, policy_version 54051 (0.0009) +[2023-10-13 02:29:57,259][46663] Updated weights for policy 1, policy_version 54061 (0.0009) +[2023-10-13 02:29:57,627][46663] Updated weights for policy 1, policy_version 54071 (0.0007) +[2023-10-13 02:29:57,819][46662] Updated weights for policy 0, policy_version 54120 (0.0007) +[2023-10-13 02:29:58,199][46662] Updated weights for policy 0, policy_version 54130 (0.0008) +[2023-10-13 02:29:58,563][46662] Updated weights for policy 0, policy_version 54140 (0.0007) +[2023-10-13 02:29:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 110788608. Throughput: 0: 1699.6, 1: 1678.2. Samples: 27703862. Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) +[2023-10-13 02:29:58,607][45375] Avg episode reward: [(0, '46.760'), (1, '50.640')] +[2023-10-13 02:30:01,835][46663] Updated weights for policy 1, policy_version 54081 (0.0009) +[2023-10-13 02:30:02,203][46663] Updated weights for policy 1, policy_version 54091 (0.0009) +[2023-10-13 02:30:02,503][46662] Updated weights for policy 0, policy_version 54150 (0.0008) +[2023-10-13 02:30:02,573][46663] Updated weights for policy 1, policy_version 54101 (0.0009) +[2023-10-13 02:30:02,870][46662] Updated weights for policy 0, policy_version 54160 (0.0008) +[2023-10-13 02:30:02,949][46663] Updated weights for policy 1, policy_version 54111 (0.0008) +[2023-10-13 02:30:03,229][46662] Updated weights for policy 0, policy_version 54170 (0.0007) +[2023-10-13 02:30:03,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 110886912. Throughput: 0: 1700.4, 1: 1661.9. Samples: 27723774. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:03,608][45375] Avg episode reward: [(0, '46.340'), (1, '49.530')] +[2023-10-13 02:30:07,024][46663] Updated weights for policy 1, policy_version 54121 (0.0008) +[2023-10-13 02:30:07,280][46662] Updated weights for policy 0, policy_version 54180 (0.0008) +[2023-10-13 02:30:07,390][46663] Updated weights for policy 1, policy_version 54131 (0.0009) +[2023-10-13 02:30:07,644][46662] Updated weights for policy 0, policy_version 54190 (0.0008) +[2023-10-13 02:30:07,747][46663] Updated weights for policy 1, policy_version 54141 (0.0008) +[2023-10-13 02:30:08,015][46662] Updated weights for policy 0, policy_version 54200 (0.0008) +[2023-10-13 02:30:08,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 110952448. Throughput: 0: 1684.7, 1: 1667.9. Samples: 27742976. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:08,608][45375] Avg episode reward: [(0, '46.640'), (1, '49.300')] +[2023-10-13 02:30:11,766][46663] Updated weights for policy 1, policy_version 54151 (0.0008) +[2023-10-13 02:30:12,136][46662] Updated weights for policy 0, policy_version 54210 (0.0010) +[2023-10-13 02:30:12,141][46663] Updated weights for policy 1, policy_version 54161 (0.0008) +[2023-10-13 02:30:12,508][46662] Updated weights for policy 0, policy_version 54220 (0.0009) +[2023-10-13 02:30:12,509][46663] Updated weights for policy 1, policy_version 54171 (0.0008) +[2023-10-13 02:30:12,872][46662] Updated weights for policy 0, policy_version 54230 (0.0007) +[2023-10-13 02:30:13,245][46662] Updated weights for policy 0, policy_version 54240 (0.0007) +[2023-10-13 02:30:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 111017984. Throughput: 0: 1699.6, 1: 1671.2. Samples: 27754076. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:13,607][45375] Avg episode reward: [(0, '45.390'), (1, '49.280')] +[2023-10-13 02:30:16,530][46663] Updated weights for policy 1, policy_version 54181 (0.0010) +[2023-10-13 02:30:16,901][46663] Updated weights for policy 1, policy_version 54191 (0.0010) +[2023-10-13 02:30:17,274][46663] Updated weights for policy 1, policy_version 54201 (0.0009) +[2023-10-13 02:30:17,368][46662] Updated weights for policy 0, policy_version 54250 (0.0008) +[2023-10-13 02:30:17,738][46662] Updated weights for policy 0, policy_version 54260 (0.0007) +[2023-10-13 02:30:18,125][46662] Updated weights for policy 0, policy_version 54270 (0.0010) +[2023-10-13 02:30:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 111083520. Throughput: 0: 1695.3, 1: 1654.4. Samples: 27773728. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:18,607][45375] Avg episode reward: [(0, '47.820'), (1, '49.020')] +[2023-10-13 02:30:21,377][46663] Updated weights for policy 1, policy_version 54211 (0.0010) +[2023-10-13 02:30:21,791][46663] Updated weights for policy 1, policy_version 54221 (0.0010) +[2023-10-13 02:30:22,095][46662] Updated weights for policy 0, policy_version 54280 (0.0009) +[2023-10-13 02:30:22,160][46663] Updated weights for policy 1, policy_version 54231 (0.0009) +[2023-10-13 02:30:22,468][46662] Updated weights for policy 0, policy_version 54290 (0.0009) +[2023-10-13 02:30:22,835][46662] Updated weights for policy 0, policy_version 54300 (0.0007) +[2023-10-13 02:30:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111149056. Throughput: 0: 1664.8, 1: 1672.9. Samples: 27793030. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:23,607][45375] Avg episode reward: [(0, '49.390'), (1, '50.210')] +[2023-10-13 02:30:26,262][46663] Updated weights for policy 1, policy_version 54241 (0.0007) +[2023-10-13 02:30:26,626][46663] Updated weights for policy 1, policy_version 54251 (0.0009) +[2023-10-13 02:30:26,887][46662] Updated weights for policy 0, policy_version 54310 (0.0009) +[2023-10-13 02:30:26,991][46663] Updated weights for policy 1, policy_version 54261 (0.0009) +[2023-10-13 02:30:27,254][46662] Updated weights for policy 0, policy_version 54320 (0.0010) +[2023-10-13 02:30:27,363][46663] Updated weights for policy 1, policy_version 54271 (0.0008) +[2023-10-13 02:30:27,631][46662] Updated weights for policy 0, policy_version 54330 (0.0008) +[2023-10-13 02:30:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111214592. Throughput: 0: 1687.4, 1: 1678.0. Samples: 27804494. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:28,607][45375] Avg episode reward: [(0, '48.430'), (1, '50.480')] +[2023-10-13 02:30:31,288][46663] Updated weights for policy 1, policy_version 54281 (0.0008) +[2023-10-13 02:30:31,656][46663] Updated weights for policy 1, policy_version 54291 (0.0007) +[2023-10-13 02:30:31,863][46662] Updated weights for policy 0, policy_version 54340 (0.0007) +[2023-10-13 02:30:32,022][46663] Updated weights for policy 1, policy_version 54301 (0.0007) +[2023-10-13 02:30:32,236][46662] Updated weights for policy 0, policy_version 54350 (0.0008) +[2023-10-13 02:30:32,600][46662] Updated weights for policy 0, policy_version 54360 (0.0011) +[2023-10-13 02:30:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111280128. Throughput: 0: 1673.4, 1: 1660.4. Samples: 27823766. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:33,608][45375] Avg episode reward: [(0, '48.140'), (1, '49.040')] +[2023-10-13 02:30:36,200][46663] Updated weights for policy 1, policy_version 54311 (0.0008) +[2023-10-13 02:30:36,566][46663] Updated weights for policy 1, policy_version 54321 (0.0009) +[2023-10-13 02:30:36,717][46662] Updated weights for policy 0, policy_version 54370 (0.0010) +[2023-10-13 02:30:36,939][46663] Updated weights for policy 1, policy_version 54331 (0.0009) +[2023-10-13 02:30:37,088][46662] Updated weights for policy 0, policy_version 54380 (0.0008) +[2023-10-13 02:30:37,451][46662] Updated weights for policy 0, policy_version 54390 (0.0009) +[2023-10-13 02:30:37,822][46662] Updated weights for policy 0, policy_version 54400 (0.0008) +[2023-10-13 02:30:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 111345664. Throughput: 0: 1655.1, 1: 1675.5. Samples: 27843074. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-13 02:30:38,607][45375] Avg episode reward: [(0, '50.590'), (1, '49.490')] +[2023-10-13 02:30:41,120][46663] Updated weights for policy 1, policy_version 54341 (0.0008) +[2023-10-13 02:30:41,490][46663] Updated weights for policy 1, policy_version 54351 (0.0009) +[2023-10-13 02:30:41,804][46662] Updated weights for policy 0, policy_version 54410 (0.0009) +[2023-10-13 02:30:41,859][46663] Updated weights for policy 1, policy_version 54361 (0.0007) +[2023-10-13 02:30:42,167][46662] Updated weights for policy 0, policy_version 54420 (0.0007) +[2023-10-13 02:30:42,540][46662] Updated weights for policy 0, policy_version 54430 (0.0009) +[2023-10-13 02:30:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111411200. Throughput: 0: 1677.1, 1: 1666.8. Samples: 27854334. 
Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:30:43,607][45375] Avg episode reward: [(0, '50.900'), (1, '50.590')] +[2023-10-13 02:30:45,689][46663] Updated weights for policy 1, policy_version 54371 (0.0008) +[2023-10-13 02:30:46,071][46663] Updated weights for policy 1, policy_version 54381 (0.0009) +[2023-10-13 02:30:46,431][46663] Updated weights for policy 1, policy_version 54391 (0.0008) +[2023-10-13 02:30:46,675][46662] Updated weights for policy 0, policy_version 54440 (0.0007) +[2023-10-13 02:30:47,059][46662] Updated weights for policy 0, policy_version 54450 (0.0007) +[2023-10-13 02:30:47,424][46662] Updated weights for policy 0, policy_version 54460 (0.0010) +[2023-10-13 02:30:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111476736. Throughput: 0: 1667.0, 1: 1670.4. Samples: 27873958. Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:30:48,607][45375] Avg episode reward: [(0, '53.590'), (1, '51.850')] +[2023-10-13 02:30:50,408][46663] Updated weights for policy 1, policy_version 54401 (0.0007) +[2023-10-13 02:30:50,773][46663] Updated weights for policy 1, policy_version 54411 (0.0009) +[2023-10-13 02:30:51,156][46663] Updated weights for policy 1, policy_version 54421 (0.0009) +[2023-10-13 02:30:51,506][46662] Updated weights for policy 0, policy_version 54470 (0.0008) +[2023-10-13 02:30:51,531][46663] Updated weights for policy 1, policy_version 54431 (0.0007) +[2023-10-13 02:30:51,876][46662] Updated weights for policy 0, policy_version 54480 (0.0007) +[2023-10-13 02:30:52,241][46662] Updated weights for policy 0, policy_version 54490 (0.0009) +[2023-10-13 02:30:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111542272. Throughput: 0: 1657.0, 1: 1690.1. Samples: 27893594. 
Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:30:53,608][45375] Avg episode reward: [(0, '56.050'), (1, '51.860')] +[2023-10-13 02:30:55,690][46663] Updated weights for policy 1, policy_version 54441 (0.0010) +[2023-10-13 02:30:56,052][46663] Updated weights for policy 1, policy_version 54451 (0.0009) +[2023-10-13 02:30:56,244][46662] Updated weights for policy 0, policy_version 54500 (0.0009) +[2023-10-13 02:30:56,423][46663] Updated weights for policy 1, policy_version 54461 (0.0007) +[2023-10-13 02:30:56,617][46662] Updated weights for policy 0, policy_version 54510 (0.0009) +[2023-10-13 02:30:56,980][46662] Updated weights for policy 0, policy_version 54520 (0.0009) +[2023-10-13 02:30:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 111607808. Throughput: 0: 1672.7, 1: 1665.8. Samples: 27904306. Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:30:58,607][45375] Avg episode reward: [(0, '56.500'), (1, '52.360')] +[2023-10-13 02:30:58,608][46091] Saving new best policy, reward=56.500! +[2023-10-13 02:31:00,560][46663] Updated weights for policy 1, policy_version 54471 (0.0008) +[2023-10-13 02:31:00,933][46663] Updated weights for policy 1, policy_version 54481 (0.0008) +[2023-10-13 02:31:01,218][46662] Updated weights for policy 0, policy_version 54530 (0.0009) +[2023-10-13 02:31:01,300][46663] Updated weights for policy 1, policy_version 54491 (0.0007) +[2023-10-13 02:31:01,587][46662] Updated weights for policy 0, policy_version 54540 (0.0008) +[2023-10-13 02:31:01,941][46662] Updated weights for policy 0, policy_version 54550 (0.0010) +[2023-10-13 02:31:02,314][46662] Updated weights for policy 0, policy_version 54560 (0.0011) +[2023-10-13 02:31:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111673344. Throughput: 0: 1653.0, 1: 1682.8. Samples: 27923842. 
Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:31:03,607][45375] Avg episode reward: [(0, '56.830'), (1, '51.590')] +[2023-10-13 02:31:03,608][46091] Saving new best policy, reward=56.830! +[2023-10-13 02:31:05,412][46663] Updated weights for policy 1, policy_version 54501 (0.0009) +[2023-10-13 02:31:05,793][46663] Updated weights for policy 1, policy_version 54511 (0.0010) +[2023-10-13 02:31:06,156][46663] Updated weights for policy 1, policy_version 54521 (0.0010) +[2023-10-13 02:31:06,229][46662] Updated weights for policy 0, policy_version 54570 (0.0008) +[2023-10-13 02:31:06,598][46662] Updated weights for policy 0, policy_version 54580 (0.0007) +[2023-10-13 02:31:06,975][46662] Updated weights for policy 0, policy_version 54590 (0.0007) +[2023-10-13 02:31:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111738880. Throughput: 0: 1665.7, 1: 1691.8. Samples: 27944118. Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:31:08,608][45375] Avg episode reward: [(0, '57.760'), (1, '54.220')] +[2023-10-13 02:31:08,616][46091] Saving new best policy, reward=57.760! +[2023-10-13 02:31:10,191][46663] Updated weights for policy 1, policy_version 54531 (0.0008) +[2023-10-13 02:31:10,581][46663] Updated weights for policy 1, policy_version 54541 (0.0007) +[2023-10-13 02:31:10,958][46663] Updated weights for policy 1, policy_version 54551 (0.0008) +[2023-10-13 02:31:11,007][46662] Updated weights for policy 0, policy_version 54600 (0.0008) +[2023-10-13 02:31:11,377][46662] Updated weights for policy 0, policy_version 54610 (0.0008) +[2023-10-13 02:31:11,745][46662] Updated weights for policy 0, policy_version 54620 (0.0008) +[2023-10-13 02:31:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111804416. Throughput: 0: 1667.1, 1: 1662.0. Samples: 27954300. 
Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:31:13,607][45375] Avg episode reward: [(0, '59.040'), (1, '53.340')] +[2023-10-13 02:31:13,608][46091] Saving new best policy, reward=59.040! +[2023-10-13 02:31:15,050][46663] Updated weights for policy 1, policy_version 54561 (0.0008) +[2023-10-13 02:31:15,412][46663] Updated weights for policy 1, policy_version 54571 (0.0007) +[2023-10-13 02:31:15,779][46663] Updated weights for policy 1, policy_version 54581 (0.0009) +[2023-10-13 02:31:15,849][46662] Updated weights for policy 0, policy_version 54630 (0.0010) +[2023-10-13 02:31:16,142][46663] Updated weights for policy 1, policy_version 54591 (0.0007) +[2023-10-13 02:31:16,225][46662] Updated weights for policy 0, policy_version 54640 (0.0008) +[2023-10-13 02:31:16,598][46662] Updated weights for policy 0, policy_version 54650 (0.0009) +[2023-10-13 02:31:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111869952. Throughput: 0: 1651.2, 1: 1686.2. Samples: 27973946. Policy #0 lag: (min: 25.0, avg: 46.8, max: 57.0) +[2023-10-13 02:31:18,607][45375] Avg episode reward: [(0, '58.460'), (1, '52.700')] +[2023-10-13 02:31:20,209][46663] Updated weights for policy 1, policy_version 54601 (0.0009) +[2023-10-13 02:31:20,579][46663] Updated weights for policy 1, policy_version 54611 (0.0009) +[2023-10-13 02:31:20,696][46662] Updated weights for policy 0, policy_version 54660 (0.0009) +[2023-10-13 02:31:20,937][46663] Updated weights for policy 1, policy_version 54621 (0.0007) +[2023-10-13 02:31:21,071][46662] Updated weights for policy 0, policy_version 54670 (0.0009) +[2023-10-13 02:31:21,442][46662] Updated weights for policy 0, policy_version 54680 (0.0008) +[2023-10-13 02:31:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111935488. Throughput: 0: 1677.7, 1: 1687.8. Samples: 27994524. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:23,607][45375] Avg episode reward: [(0, '58.050'), (1, '52.300')] +[2023-10-13 02:31:24,920][46663] Updated weights for policy 1, policy_version 54631 (0.0008) +[2023-10-13 02:31:25,279][46663] Updated weights for policy 1, policy_version 54641 (0.0008) +[2023-10-13 02:31:25,475][46662] Updated weights for policy 0, policy_version 54690 (0.0009) +[2023-10-13 02:31:25,641][46663] Updated weights for policy 1, policy_version 54651 (0.0008) +[2023-10-13 02:31:25,844][46662] Updated weights for policy 0, policy_version 54700 (0.0009) +[2023-10-13 02:31:26,219][46662] Updated weights for policy 0, policy_version 54710 (0.0010) +[2023-10-13 02:31:26,585][46662] Updated weights for policy 0, policy_version 54720 (0.0010) +[2023-10-13 02:31:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112001024. Throughput: 0: 1670.0, 1: 1670.3. Samples: 28004644. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:28,607][45375] Avg episode reward: [(0, '57.690'), (1, '51.660')] +[2023-10-13 02:31:29,908][46663] Updated weights for policy 1, policy_version 54661 (0.0008) +[2023-10-13 02:31:30,278][46663] Updated weights for policy 1, policy_version 54671 (0.0009) +[2023-10-13 02:31:30,550][46662] Updated weights for policy 0, policy_version 54730 (0.0008) +[2023-10-13 02:31:30,634][46663] Updated weights for policy 1, policy_version 54681 (0.0010) +[2023-10-13 02:31:30,928][46662] Updated weights for policy 0, policy_version 54740 (0.0008) +[2023-10-13 02:31:31,295][46662] Updated weights for policy 0, policy_version 54750 (0.0009) +[2023-10-13 02:31:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112066560. Throughput: 0: 1664.8, 1: 1680.2. Samples: 28024480. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:33,607][45375] Avg episode reward: [(0, '57.640'), (1, '50.460')] +[2023-10-13 02:31:34,640][46663] Updated weights for policy 1, policy_version 54691 (0.0011) +[2023-10-13 02:31:35,010][46663] Updated weights for policy 1, policy_version 54701 (0.0010) +[2023-10-13 02:31:35,392][46663] Updated weights for policy 1, policy_version 54711 (0.0009) +[2023-10-13 02:31:35,656][46662] Updated weights for policy 0, policy_version 54760 (0.0008) +[2023-10-13 02:31:36,024][46662] Updated weights for policy 0, policy_version 54770 (0.0010) +[2023-10-13 02:31:36,397][46662] Updated weights for policy 0, policy_version 54780 (0.0008) +[2023-10-13 02:31:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112132096. Throughput: 0: 1681.5, 1: 1679.2. Samples: 28044826. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:38,607][45375] Avg episode reward: [(0, '57.110'), (1, '50.730')] +[2023-10-13 02:31:39,452][46663] Updated weights for policy 1, policy_version 54721 (0.0009) +[2023-10-13 02:31:39,811][46663] Updated weights for policy 1, policy_version 54731 (0.0007) +[2023-10-13 02:31:40,178][46663] Updated weights for policy 1, policy_version 54741 (0.0008) +[2023-10-13 02:31:40,516][46662] Updated weights for policy 0, policy_version 54790 (0.0008) +[2023-10-13 02:31:40,555][46663] Updated weights for policy 1, policy_version 54751 (0.0007) +[2023-10-13 02:31:40,891][46662] Updated weights for policy 0, policy_version 54800 (0.0009) +[2023-10-13 02:31:41,260][46662] Updated weights for policy 0, policy_version 54810 (0.0007) +[2023-10-13 02:31:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 112197632. Throughput: 0: 1666.3, 1: 1673.0. Samples: 28054574. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:43,608][45375] Avg episode reward: [(0, '57.540'), (1, '48.920')] +[2023-10-13 02:31:44,580][46663] Updated weights for policy 1, policy_version 54761 (0.0008) +[2023-10-13 02:31:44,955][46663] Updated weights for policy 1, policy_version 54771 (0.0008) +[2023-10-13 02:31:45,248][46662] Updated weights for policy 0, policy_version 54820 (0.0008) +[2023-10-13 02:31:45,321][46663] Updated weights for policy 1, policy_version 54781 (0.0007) +[2023-10-13 02:31:45,618][46662] Updated weights for policy 0, policy_version 54830 (0.0009) +[2023-10-13 02:31:45,988][46662] Updated weights for policy 0, policy_version 54840 (0.0009) +[2023-10-13 02:31:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112263168. Throughput: 0: 1671.9, 1: 1684.0. Samples: 28074858. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:48,607][45375] Avg episode reward: [(0, '56.150'), (1, '48.400')] +[2023-10-13 02:31:49,434][46663] Updated weights for policy 1, policy_version 54791 (0.0007) +[2023-10-13 02:31:49,799][46663] Updated weights for policy 1, policy_version 54801 (0.0008) +[2023-10-13 02:31:50,079][46662] Updated weights for policy 0, policy_version 54850 (0.0009) +[2023-10-13 02:31:50,179][46663] Updated weights for policy 1, policy_version 54811 (0.0009) +[2023-10-13 02:31:50,446][46662] Updated weights for policy 0, policy_version 54860 (0.0008) +[2023-10-13 02:31:50,813][46662] Updated weights for policy 0, policy_version 54870 (0.0009) +[2023-10-13 02:31:51,193][46662] Updated weights for policy 0, policy_version 54880 (0.0009) +[2023-10-13 02:31:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 112328704. Throughput: 0: 1678.6, 1: 1680.0. Samples: 28095254. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:53,607][45375] Avg episode reward: [(0, '53.760'), (1, '47.920')] +[2023-10-13 02:31:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000054880_56197120.pth... +[2023-10-13 02:31:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000054816_56131584.pth... +[2023-10-13 02:31:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000053248_54525952.pth +[2023-10-13 02:31:53,658][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000053312_54591488.pth +[2023-10-13 02:31:53,658][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000054816_56131584.pth +[2023-10-13 02:31:53,662][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000054880_56197120.pth +[2023-10-13 02:31:54,197][46663] Updated weights for policy 1, policy_version 54821 (0.0008) +[2023-10-13 02:31:54,566][46663] Updated weights for policy 1, policy_version 54831 (0.0010) +[2023-10-13 02:31:54,935][46663] Updated weights for policy 1, policy_version 54841 (0.0008) +[2023-10-13 02:31:55,204][46662] Updated weights for policy 0, policy_version 54890 (0.0008) +[2023-10-13 02:31:55,578][46662] Updated weights for policy 0, policy_version 54900 (0.0010) +[2023-10-13 02:31:55,949][46662] Updated weights for policy 0, policy_version 54910 (0.0010) +[2023-10-13 02:31:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112394240. Throughput: 0: 1661.3, 1: 1683.0. Samples: 28104794. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:31:58,607][45375] Avg episode reward: [(0, '51.500'), (1, '48.140')] +[2023-10-13 02:31:59,165][46663] Updated weights for policy 1, policy_version 54851 (0.0009) +[2023-10-13 02:31:59,549][46663] Updated weights for policy 1, policy_version 54861 (0.0010) +[2023-10-13 02:31:59,914][46663] Updated weights for policy 1, policy_version 54871 (0.0007) +[2023-10-13 02:32:00,016][46662] Updated weights for policy 0, policy_version 54920 (0.0009) +[2023-10-13 02:32:00,376][46662] Updated weights for policy 0, policy_version 54930 (0.0010) +[2023-10-13 02:32:00,753][46662] Updated weights for policy 0, policy_version 54940 (0.0008) +[2023-10-13 02:32:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112459776. Throughput: 0: 1676.8, 1: 1675.4. Samples: 28124794. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:03,608][45375] Avg episode reward: [(0, '51.320'), (1, '46.310')] +[2023-10-13 02:32:03,994][46663] Updated weights for policy 1, policy_version 54881 (0.0008) +[2023-10-13 02:32:04,365][46663] Updated weights for policy 1, policy_version 54891 (0.0011) +[2023-10-13 02:32:04,722][46663] Updated weights for policy 1, policy_version 54901 (0.0009) +[2023-10-13 02:32:04,846][46662] Updated weights for policy 0, policy_version 54950 (0.0007) +[2023-10-13 02:32:05,089][46663] Updated weights for policy 1, policy_version 54911 (0.0008) +[2023-10-13 02:32:05,210][46662] Updated weights for policy 0, policy_version 54960 (0.0008) +[2023-10-13 02:32:05,589][46662] Updated weights for policy 0, policy_version 54970 (0.0008) +[2023-10-13 02:32:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112525312. Throughput: 0: 1678.6, 1: 1677.2. Samples: 28145536. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:08,607][45375] Avg episode reward: [(0, '50.890'), (1, '46.130')] +[2023-10-13 02:32:09,215][46663] Updated weights for policy 1, policy_version 54921 (0.0011) +[2023-10-13 02:32:09,581][46663] Updated weights for policy 1, policy_version 54931 (0.0008) +[2023-10-13 02:32:09,621][46662] Updated weights for policy 0, policy_version 54980 (0.0008) +[2023-10-13 02:32:09,942][46663] Updated weights for policy 1, policy_version 54941 (0.0007) +[2023-10-13 02:32:09,992][46662] Updated weights for policy 0, policy_version 54990 (0.0009) +[2023-10-13 02:32:10,373][46662] Updated weights for policy 0, policy_version 55000 (0.0008) +[2023-10-13 02:32:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 112590848. Throughput: 0: 1657.7, 1: 1674.2. Samples: 28154578. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:13,608][45375] Avg episode reward: [(0, '50.830'), (1, '45.720')] +[2023-10-13 02:32:14,026][46663] Updated weights for policy 1, policy_version 54951 (0.0007) +[2023-10-13 02:32:14,386][46663] Updated weights for policy 1, policy_version 54961 (0.0011) +[2023-10-13 02:32:14,418][46662] Updated weights for policy 0, policy_version 55010 (0.0007) +[2023-10-13 02:32:14,758][46663] Updated weights for policy 1, policy_version 54971 (0.0007) +[2023-10-13 02:32:14,794][46662] Updated weights for policy 0, policy_version 55020 (0.0008) +[2023-10-13 02:32:15,164][46662] Updated weights for policy 0, policy_version 55030 (0.0008) +[2023-10-13 02:32:15,544][46662] Updated weights for policy 0, policy_version 55040 (0.0007) +[2023-10-13 02:32:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112656384. Throughput: 0: 1676.7, 1: 1676.1. Samples: 28175356. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:18,607][45375] Avg episode reward: [(0, '49.390'), (1, '45.660')] +[2023-10-13 02:32:18,846][46663] Updated weights for policy 1, policy_version 54981 (0.0007) +[2023-10-13 02:32:19,208][46663] Updated weights for policy 1, policy_version 54991 (0.0008) +[2023-10-13 02:32:19,582][46663] Updated weights for policy 1, policy_version 55001 (0.0008) +[2023-10-13 02:32:19,599][46662] Updated weights for policy 0, policy_version 55050 (0.0009) +[2023-10-13 02:32:19,968][46662] Updated weights for policy 0, policy_version 55060 (0.0009) +[2023-10-13 02:32:20,340][46662] Updated weights for policy 0, policy_version 55070 (0.0011) +[2023-10-13 02:32:23,478][46663] Updated weights for policy 1, policy_version 55011 (0.0008) +[2023-10-13 02:32:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112721920. Throughput: 0: 1683.1, 1: 1678.0. Samples: 28196076. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:23,607][45375] Avg episode reward: [(0, '51.490'), (1, '45.850')] +[2023-10-13 02:32:23,835][46663] Updated weights for policy 1, policy_version 55021 (0.0007) +[2023-10-13 02:32:24,211][46663] Updated weights for policy 1, policy_version 55031 (0.0007) +[2023-10-13 02:32:24,723][46662] Updated weights for policy 0, policy_version 55080 (0.0009) +[2023-10-13 02:32:25,101][46662] Updated weights for policy 0, policy_version 55090 (0.0009) +[2023-10-13 02:32:25,462][46662] Updated weights for policy 0, policy_version 55100 (0.0008) +[2023-10-13 02:32:28,317][46663] Updated weights for policy 1, policy_version 55041 (0.0009) +[2023-10-13 02:32:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112787456. Throughput: 0: 1668.0, 1: 1680.5. Samples: 28205256. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:28,607][45375] Avg episode reward: [(0, '50.280'), (1, '45.330')] +[2023-10-13 02:32:28,676][46663] Updated weights for policy 1, policy_version 55051 (0.0010) +[2023-10-13 02:32:29,046][46663] Updated weights for policy 1, policy_version 55061 (0.0010) +[2023-10-13 02:32:29,418][46663] Updated weights for policy 1, policy_version 55071 (0.0009) +[2023-10-13 02:32:29,447][46662] Updated weights for policy 0, policy_version 55110 (0.0008) +[2023-10-13 02:32:29,827][46662] Updated weights for policy 0, policy_version 55120 (0.0009) +[2023-10-13 02:32:30,205][46662] Updated weights for policy 0, policy_version 55130 (0.0009) +[2023-10-13 02:32:33,480][46663] Updated weights for policy 1, policy_version 55081 (0.0010) +[2023-10-13 02:32:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112852992. Throughput: 0: 1681.4, 1: 1676.0. Samples: 28225940. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:33,607][45375] Avg episode reward: [(0, '48.360'), (1, '46.130')] +[2023-10-13 02:32:33,848][46663] Updated weights for policy 1, policy_version 55091 (0.0010) +[2023-10-13 02:32:34,154][46662] Updated weights for policy 0, policy_version 55140 (0.0008) +[2023-10-13 02:32:34,212][46663] Updated weights for policy 1, policy_version 55101 (0.0009) +[2023-10-13 02:32:34,530][46662] Updated weights for policy 0, policy_version 55150 (0.0007) +[2023-10-13 02:32:34,900][46662] Updated weights for policy 0, policy_version 55160 (0.0008) +[2023-10-13 02:32:38,088][46663] Updated weights for policy 1, policy_version 55111 (0.0008) +[2023-10-13 02:32:38,446][46663] Updated weights for policy 1, policy_version 55121 (0.0008) +[2023-10-13 02:32:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 112918528. Throughput: 0: 1680.8, 1: 1671.1. Samples: 28246090. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 02:32:38,607][45375] Avg episode reward: [(0, '48.050'), (1, '46.970')] +[2023-10-13 02:32:38,812][46663] Updated weights for policy 1, policy_version 55131 (0.0008) +[2023-10-13 02:32:39,114][46662] Updated weights for policy 0, policy_version 55170 (0.0007) +[2023-10-13 02:32:39,481][46662] Updated weights for policy 0, policy_version 55180 (0.0009) +[2023-10-13 02:32:39,859][46662] Updated weights for policy 0, policy_version 55190 (0.0008) +[2023-10-13 02:32:40,229][46662] Updated weights for policy 0, policy_version 55200 (0.0007) +[2023-10-13 02:32:42,846][46663] Updated weights for policy 1, policy_version 55141 (0.0008) +[2023-10-13 02:32:43,207][46663] Updated weights for policy 1, policy_version 55151 (0.0008) +[2023-10-13 02:32:43,576][46663] Updated weights for policy 1, policy_version 55161 (0.0007) +[2023-10-13 02:32:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 112984064. Throughput: 0: 1674.4, 1: 1680.4. Samples: 28255764. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:32:43,607][45375] Avg episode reward: [(0, '47.840'), (1, '47.320')] +[2023-10-13 02:32:44,187][46662] Updated weights for policy 0, policy_version 55210 (0.0007) +[2023-10-13 02:32:44,557][46662] Updated weights for policy 0, policy_version 55220 (0.0009) +[2023-10-13 02:32:44,934][46662] Updated weights for policy 0, policy_version 55230 (0.0010) +[2023-10-13 02:32:47,817][46663] Updated weights for policy 1, policy_version 55171 (0.0010) +[2023-10-13 02:32:48,210][46663] Updated weights for policy 1, policy_version 55181 (0.0010) +[2023-10-13 02:32:48,576][46663] Updated weights for policy 1, policy_version 55191 (0.0011) +[2023-10-13 02:32:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113049600. Throughput: 0: 1683.2, 1: 1690.1. Samples: 28276592. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:32:48,607][45375] Avg episode reward: [(0, '46.230'), (1, '48.520')] +[2023-10-13 02:32:48,994][46662] Updated weights for policy 0, policy_version 55240 (0.0010) +[2023-10-13 02:32:49,365][46662] Updated weights for policy 0, policy_version 55250 (0.0007) +[2023-10-13 02:32:49,727][46662] Updated weights for policy 0, policy_version 55260 (0.0008) +[2023-10-13 02:32:52,726][46663] Updated weights for policy 1, policy_version 55201 (0.0008) +[2023-10-13 02:32:53,100][46663] Updated weights for policy 1, policy_version 55211 (0.0007) +[2023-10-13 02:32:53,463][46663] Updated weights for policy 1, policy_version 55221 (0.0007) +[2023-10-13 02:32:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113115136. Throughput: 0: 1682.3, 1: 1672.4. Samples: 28296498. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:32:53,607][45375] Avg episode reward: [(0, '46.900'), (1, '49.390')] +[2023-10-13 02:32:53,670][46662] Updated weights for policy 0, policy_version 55270 (0.0008) +[2023-10-13 02:32:53,832][46663] Updated weights for policy 1, policy_version 55231 (0.0007) +[2023-10-13 02:32:54,036][46662] Updated weights for policy 0, policy_version 55280 (0.0008) +[2023-10-13 02:32:54,399][46662] Updated weights for policy 0, policy_version 55290 (0.0009) +[2023-10-13 02:32:57,912][46663] Updated weights for policy 1, policy_version 55241 (0.0008) +[2023-10-13 02:32:58,278][46663] Updated weights for policy 1, policy_version 55251 (0.0009) +[2023-10-13 02:32:58,405][46662] Updated weights for policy 0, policy_version 55300 (0.0008) +[2023-10-13 02:32:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113180672. Throughput: 0: 1684.6, 1: 1689.2. Samples: 28306398. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:32:58,607][45375] Avg episode reward: [(0, '45.480'), (1, '49.880')] +[2023-10-13 02:32:58,642][46663] Updated weights for policy 1, policy_version 55261 (0.0008) +[2023-10-13 02:32:58,763][46662] Updated weights for policy 0, policy_version 55310 (0.0007) +[2023-10-13 02:32:59,138][46662] Updated weights for policy 0, policy_version 55320 (0.0008) +[2023-10-13 02:33:02,908][46663] Updated weights for policy 1, policy_version 55271 (0.0009) +[2023-10-13 02:33:03,230][46662] Updated weights for policy 0, policy_version 55330 (0.0009) +[2023-10-13 02:33:03,279][46663] Updated weights for policy 1, policy_version 55281 (0.0008) +[2023-10-13 02:33:03,598][46662] Updated weights for policy 0, policy_version 55340 (0.0007) +[2023-10-13 02:33:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113246208. Throughput: 0: 1681.9, 1: 1685.3. Samples: 28326880. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:33:03,607][45375] Avg episode reward: [(0, '45.100'), (1, '50.270')] +[2023-10-13 02:33:03,645][46663] Updated weights for policy 1, policy_version 55291 (0.0008) +[2023-10-13 02:33:03,965][46662] Updated weights for policy 0, policy_version 55350 (0.0009) +[2023-10-13 02:33:04,330][46662] Updated weights for policy 0, policy_version 55360 (0.0010) +[2023-10-13 02:33:07,570][46663] Updated weights for policy 1, policy_version 55301 (0.0008) +[2023-10-13 02:33:07,945][46663] Updated weights for policy 1, policy_version 55311 (0.0009) +[2023-10-13 02:33:08,307][46663] Updated weights for policy 1, policy_version 55321 (0.0009) +[2023-10-13 02:33:08,450][46662] Updated weights for policy 0, policy_version 55370 (0.0007) +[2023-10-13 02:33:08,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113344512. Throughput: 0: 1685.6, 1: 1659.0. Samples: 28346584. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:33:08,607][45375] Avg episode reward: [(0, '46.940'), (1, '48.660')] +[2023-10-13 02:33:08,832][46662] Updated weights for policy 0, policy_version 55380 (0.0010) +[2023-10-13 02:33:09,197][46662] Updated weights for policy 0, policy_version 55390 (0.0007) +[2023-10-13 02:33:12,568][46663] Updated weights for policy 1, policy_version 55331 (0.0010) +[2023-10-13 02:33:12,921][46663] Updated weights for policy 1, policy_version 55341 (0.0009) +[2023-10-13 02:33:13,237][46662] Updated weights for policy 0, policy_version 55400 (0.0009) +[2023-10-13 02:33:13,291][46663] Updated weights for policy 1, policy_version 55351 (0.0007) +[2023-10-13 02:33:13,603][46662] Updated weights for policy 0, policy_version 55410 (0.0008) +[2023-10-13 02:33:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113377280. Throughput: 0: 1685.5, 1: 1678.1. Samples: 28356620. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:33:13,607][45375] Avg episode reward: [(0, '46.710'), (1, '48.360')] +[2023-10-13 02:33:13,980][46662] Updated weights for policy 0, policy_version 55420 (0.0009) +[2023-10-13 02:33:17,377][46663] Updated weights for policy 1, policy_version 55361 (0.0008) +[2023-10-13 02:33:17,748][46663] Updated weights for policy 1, policy_version 55371 (0.0010) +[2023-10-13 02:33:18,003][46662] Updated weights for policy 0, policy_version 55430 (0.0009) +[2023-10-13 02:33:18,115][46663] Updated weights for policy 1, policy_version 55381 (0.0008) +[2023-10-13 02:33:18,363][46662] Updated weights for policy 0, policy_version 55440 (0.0009) +[2023-10-13 02:33:18,479][46663] Updated weights for policy 1, policy_version 55391 (0.0008) +[2023-10-13 02:33:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113475584. Throughput: 0: 1689.5, 1: 1677.8. Samples: 28377468. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) +[2023-10-13 02:33:18,607][45375] Avg episode reward: [(0, '46.540'), (1, '48.250')] +[2023-10-13 02:33:18,728][46662] Updated weights for policy 0, policy_version 55450 (0.0008) +[2023-10-13 02:33:22,418][46663] Updated weights for policy 1, policy_version 55401 (0.0010) +[2023-10-13 02:33:22,789][46662] Updated weights for policy 0, policy_version 55460 (0.0007) +[2023-10-13 02:33:22,792][46663] Updated weights for policy 1, policy_version 55411 (0.0007) +[2023-10-13 02:33:23,149][46663] Updated weights for policy 1, policy_version 55421 (0.0009) +[2023-10-13 02:33:23,159][46662] Updated weights for policy 0, policy_version 55470 (0.0007) +[2023-10-13 02:33:23,519][46662] Updated weights for policy 0, policy_version 55480 (0.0009) +[2023-10-13 02:33:23,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113541120. Throughput: 0: 1687.1, 1: 1660.0. Samples: 28396710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:23,607][45375] Avg episode reward: [(0, '46.830'), (1, '49.340')] +[2023-10-13 02:33:27,305][46663] Updated weights for policy 1, policy_version 55431 (0.0009) +[2023-10-13 02:33:27,675][46663] Updated weights for policy 1, policy_version 55441 (0.0009) +[2023-10-13 02:33:27,794][46662] Updated weights for policy 0, policy_version 55490 (0.0008) +[2023-10-13 02:33:28,044][46663] Updated weights for policy 1, policy_version 55451 (0.0008) +[2023-10-13 02:33:28,162][46662] Updated weights for policy 0, policy_version 55500 (0.0007) +[2023-10-13 02:33:28,533][46662] Updated weights for policy 0, policy_version 55510 (0.0007) +[2023-10-13 02:33:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 113606656. Throughput: 0: 1689.1, 1: 1678.8. Samples: 28407318. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:28,607][45375] Avg episode reward: [(0, '47.780'), (1, '49.320')] +[2023-10-13 02:33:28,894][46662] Updated weights for policy 0, policy_version 55520 (0.0010) +[2023-10-13 02:33:32,009][46663] Updated weights for policy 1, policy_version 55461 (0.0007) +[2023-10-13 02:33:32,376][46663] Updated weights for policy 1, policy_version 55471 (0.0008) +[2023-10-13 02:33:32,746][46663] Updated weights for policy 1, policy_version 55481 (0.0007) +[2023-10-13 02:33:33,051][46662] Updated weights for policy 0, policy_version 55530 (0.0007) +[2023-10-13 02:33:33,420][46662] Updated weights for policy 0, policy_version 55540 (0.0010) +[2023-10-13 02:33:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113672192. Throughput: 0: 1688.9, 1: 1666.9. Samples: 28427606. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:33,607][45375] Avg episode reward: [(0, '49.170'), (1, '49.900')] +[2023-10-13 02:33:33,798][46662] Updated weights for policy 0, policy_version 55550 (0.0007) +[2023-10-13 02:33:36,804][46663] Updated weights for policy 1, policy_version 55491 (0.0008) +[2023-10-13 02:33:37,227][46663] Updated weights for policy 1, policy_version 55501 (0.0009) +[2023-10-13 02:33:37,590][46663] Updated weights for policy 1, policy_version 55511 (0.0009) +[2023-10-13 02:33:37,903][46662] Updated weights for policy 0, policy_version 55560 (0.0009) +[2023-10-13 02:33:38,269][46662] Updated weights for policy 0, policy_version 55570 (0.0009) +[2023-10-13 02:33:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113737728. Throughput: 0: 1686.4, 1: 1664.4. Samples: 28447284. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:38,607][45375] Avg episode reward: [(0, '50.120'), (1, '50.110')] +[2023-10-13 02:33:38,643][46662] Updated weights for policy 0, policy_version 55580 (0.0007) +[2023-10-13 02:33:41,573][46663] Updated weights for policy 1, policy_version 55521 (0.0009) +[2023-10-13 02:33:41,935][46663] Updated weights for policy 1, policy_version 55531 (0.0008) +[2023-10-13 02:33:42,297][46663] Updated weights for policy 1, policy_version 55541 (0.0009) +[2023-10-13 02:33:42,570][46662] Updated weights for policy 0, policy_version 55590 (0.0009) +[2023-10-13 02:33:42,658][46663] Updated weights for policy 1, policy_version 55551 (0.0011) +[2023-10-13 02:33:42,933][46662] Updated weights for policy 0, policy_version 55600 (0.0008) +[2023-10-13 02:33:43,310][46662] Updated weights for policy 0, policy_version 55610 (0.0008) +[2023-10-13 02:33:43,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 113836032. Throughput: 0: 1688.6, 1: 1677.3. Samples: 28457864. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:43,608][45375] Avg episode reward: [(0, '50.970'), (1, '49.870')] +[2023-10-13 02:33:46,632][46663] Updated weights for policy 1, policy_version 55561 (0.0007) +[2023-10-13 02:33:46,990][46663] Updated weights for policy 1, policy_version 55571 (0.0008) +[2023-10-13 02:33:47,249][46662] Updated weights for policy 0, policy_version 55620 (0.0007) +[2023-10-13 02:33:47,367][46663] Updated weights for policy 1, policy_version 55581 (0.0007) +[2023-10-13 02:33:47,616][46662] Updated weights for policy 0, policy_version 55630 (0.0010) +[2023-10-13 02:33:47,984][46662] Updated weights for policy 0, policy_version 55640 (0.0010) +[2023-10-13 02:33:48,606][45375] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 113901568. Throughput: 0: 1694.6, 1: 1658.3. Samples: 28477760. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:48,607][45375] Avg episode reward: [(0, '51.360'), (1, '49.690')] +[2023-10-13 02:33:51,529][46663] Updated weights for policy 1, policy_version 55591 (0.0010) +[2023-10-13 02:33:51,893][46663] Updated weights for policy 1, policy_version 55601 (0.0009) +[2023-10-13 02:33:52,155][46662] Updated weights for policy 0, policy_version 55650 (0.0008) +[2023-10-13 02:33:52,254][46663] Updated weights for policy 1, policy_version 55611 (0.0008) +[2023-10-13 02:33:52,528][46662] Updated weights for policy 0, policy_version 55660 (0.0008) +[2023-10-13 02:33:52,892][46662] Updated weights for policy 0, policy_version 55670 (0.0007) +[2023-10-13 02:33:53,264][46662] Updated weights for policy 0, policy_version 55680 (0.0008) +[2023-10-13 02:33:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 113967104. Throughput: 0: 1673.5, 1: 1676.8. Samples: 28497352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:53,608][45375] Avg episode reward: [(0, '52.240'), (1, '48.220')] +[2023-10-13 02:33:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000055680_57016320.pth... +[2023-10-13 02:33:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000055616_56950784.pth... 
+[2023-10-13 02:33:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000054112_55410688.pth +[2023-10-13 02:33:53,658][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000054048_55345152.pth +[2023-10-13 02:33:56,409][46663] Updated weights for policy 1, policy_version 55621 (0.0008) +[2023-10-13 02:33:56,778][46663] Updated weights for policy 1, policy_version 55631 (0.0007) +[2023-10-13 02:33:57,145][46663] Updated weights for policy 1, policy_version 55641 (0.0007) +[2023-10-13 02:33:57,201][46662] Updated weights for policy 0, policy_version 55690 (0.0009) +[2023-10-13 02:33:57,559][46662] Updated weights for policy 0, policy_version 55700 (0.0011) +[2023-10-13 02:33:57,925][46662] Updated weights for policy 0, policy_version 55710 (0.0009) +[2023-10-13 02:33:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 114032640. Throughput: 0: 1690.6, 1: 1680.6. Samples: 28508324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:33:58,607][45375] Avg episode reward: [(0, '52.130'), (1, '47.190')] +[2023-10-13 02:34:01,426][46663] Updated weights for policy 1, policy_version 55651 (0.0008) +[2023-10-13 02:34:01,793][46663] Updated weights for policy 1, policy_version 55661 (0.0008) +[2023-10-13 02:34:01,945][46662] Updated weights for policy 0, policy_version 55720 (0.0008) +[2023-10-13 02:34:02,164][46663] Updated weights for policy 1, policy_version 55671 (0.0008) +[2023-10-13 02:34:02,324][46662] Updated weights for policy 0, policy_version 55730 (0.0008) +[2023-10-13 02:34:02,681][46662] Updated weights for policy 0, policy_version 55740 (0.0009) +[2023-10-13 02:34:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 114098176. Throughput: 0: 1687.0, 1: 1654.9. Samples: 28527852. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:03,608][45375] Avg episode reward: [(0, '54.100'), (1, '47.930')] +[2023-10-13 02:34:06,259][46663] Updated weights for policy 1, policy_version 55681 (0.0009) +[2023-10-13 02:34:06,620][46663] Updated weights for policy 1, policy_version 55691 (0.0008) +[2023-10-13 02:34:06,762][46662] Updated weights for policy 0, policy_version 55750 (0.0008) +[2023-10-13 02:34:06,983][46663] Updated weights for policy 1, policy_version 55701 (0.0008) +[2023-10-13 02:34:07,128][46662] Updated weights for policy 0, policy_version 55760 (0.0008) +[2023-10-13 02:34:07,350][46663] Updated weights for policy 1, policy_version 55711 (0.0008) +[2023-10-13 02:34:07,494][46662] Updated weights for policy 0, policy_version 55770 (0.0008) +[2023-10-13 02:34:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 114163712. Throughput: 0: 1667.1, 1: 1680.3. Samples: 28547346. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:08,608][45375] Avg episode reward: [(0, '54.110'), (1, '48.090')] +[2023-10-13 02:34:11,456][46663] Updated weights for policy 1, policy_version 55721 (0.0008) +[2023-10-13 02:34:11,501][46662] Updated weights for policy 0, policy_version 55780 (0.0009) +[2023-10-13 02:34:11,823][46663] Updated weights for policy 1, policy_version 55731 (0.0009) +[2023-10-13 02:34:11,872][46662] Updated weights for policy 0, policy_version 55790 (0.0009) +[2023-10-13 02:34:12,191][46663] Updated weights for policy 1, policy_version 55741 (0.0008) +[2023-10-13 02:34:12,234][46662] Updated weights for policy 0, policy_version 55800 (0.0008) +[2023-10-13 02:34:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 114229248. Throughput: 0: 1695.7, 1: 1672.6. Samples: 28558894. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:13,607][45375] Avg episode reward: [(0, '55.610'), (1, '48.970')] +[2023-10-13 02:34:16,161][46662] Updated weights for policy 0, policy_version 55810 (0.0009) +[2023-10-13 02:34:16,219][46663] Updated weights for policy 1, policy_version 55751 (0.0008) +[2023-10-13 02:34:16,535][46662] Updated weights for policy 0, policy_version 55820 (0.0009) +[2023-10-13 02:34:16,583][46663] Updated weights for policy 1, policy_version 55761 (0.0008) +[2023-10-13 02:34:16,910][46662] Updated weights for policy 0, policy_version 55830 (0.0008) +[2023-10-13 02:34:16,936][46663] Updated weights for policy 1, policy_version 55771 (0.0007) +[2023-10-13 02:34:17,275][46662] Updated weights for policy 0, policy_version 55840 (0.0008) +[2023-10-13 02:34:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 114294784. Throughput: 0: 1682.8, 1: 1655.9. Samples: 28577850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:18,607][45375] Avg episode reward: [(0, '54.040'), (1, '49.460')] +[2023-10-13 02:34:21,116][46663] Updated weights for policy 1, policy_version 55781 (0.0009) +[2023-10-13 02:34:21,355][46662] Updated weights for policy 0, policy_version 55850 (0.0008) +[2023-10-13 02:34:21,489][46663] Updated weights for policy 1, policy_version 55791 (0.0008) +[2023-10-13 02:34:21,728][46662] Updated weights for policy 0, policy_version 55860 (0.0008) +[2023-10-13 02:34:21,845][46663] Updated weights for policy 1, policy_version 55801 (0.0008) +[2023-10-13 02:34:22,099][46662] Updated weights for policy 0, policy_version 55870 (0.0009) +[2023-10-13 02:34:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114360320. Throughput: 0: 1669.6, 1: 1676.8. Samples: 28597874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:23,607][45375] Avg episode reward: [(0, '55.940'), (1, '49.620')] +[2023-10-13 02:34:26,111][46663] Updated weights for policy 1, policy_version 55811 (0.0008) +[2023-10-13 02:34:26,222][46662] Updated weights for policy 0, policy_version 55880 (0.0007) +[2023-10-13 02:34:26,505][46663] Updated weights for policy 1, policy_version 55821 (0.0010) +[2023-10-13 02:34:26,591][46662] Updated weights for policy 0, policy_version 55890 (0.0007) +[2023-10-13 02:34:26,870][46663] Updated weights for policy 1, policy_version 55831 (0.0009) +[2023-10-13 02:34:26,956][46662] Updated weights for policy 0, policy_version 55900 (0.0007) +[2023-10-13 02:34:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114425856. Throughput: 0: 1691.6, 1: 1664.9. Samples: 28608902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:28,607][45375] Avg episode reward: [(0, '58.230'), (1, '49.170')] +[2023-10-13 02:34:30,929][46663] Updated weights for policy 1, policy_version 55841 (0.0009) +[2023-10-13 02:34:31,158][46662] Updated weights for policy 0, policy_version 55910 (0.0007) +[2023-10-13 02:34:31,294][46663] Updated weights for policy 1, policy_version 55851 (0.0008) +[2023-10-13 02:34:31,527][46662] Updated weights for policy 0, policy_version 55920 (0.0008) +[2023-10-13 02:34:31,658][46663] Updated weights for policy 1, policy_version 55861 (0.0008) +[2023-10-13 02:34:31,890][46662] Updated weights for policy 0, policy_version 55930 (0.0008) +[2023-10-13 02:34:32,019][46663] Updated weights for policy 1, policy_version 55871 (0.0007) +[2023-10-13 02:34:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114491392. Throughput: 0: 1661.6, 1: 1668.7. Samples: 28627624. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:33,608][45375] Avg episode reward: [(0, '57.350'), (1, '48.980')] +[2023-10-13 02:34:35,942][46663] Updated weights for policy 1, policy_version 55881 (0.0009) +[2023-10-13 02:34:36,235][46662] Updated weights for policy 0, policy_version 55940 (0.0008) +[2023-10-13 02:34:36,297][46663] Updated weights for policy 1, policy_version 55891 (0.0010) +[2023-10-13 02:34:36,603][46662] Updated weights for policy 0, policy_version 55950 (0.0007) +[2023-10-13 02:34:36,655][46663] Updated weights for policy 1, policy_version 55901 (0.0008) +[2023-10-13 02:34:36,982][46662] Updated weights for policy 0, policy_version 55960 (0.0007) +[2023-10-13 02:34:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114556928. Throughput: 0: 1665.5, 1: 1676.6. Samples: 28647746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:38,607][45375] Avg episode reward: [(0, '56.600'), (1, '48.430')] +[2023-10-13 02:34:40,846][46663] Updated weights for policy 1, policy_version 55911 (0.0009) +[2023-10-13 02:34:40,897][46662] Updated weights for policy 0, policy_version 55970 (0.0007) +[2023-10-13 02:34:41,217][46663] Updated weights for policy 1, policy_version 55921 (0.0009) +[2023-10-13 02:34:41,283][46662] Updated weights for policy 0, policy_version 55980 (0.0008) +[2023-10-13 02:34:41,573][46663] Updated weights for policy 1, policy_version 55931 (0.0007) +[2023-10-13 02:34:41,644][46662] Updated weights for policy 0, policy_version 55990 (0.0009) +[2023-10-13 02:34:42,008][46662] Updated weights for policy 0, policy_version 56000 (0.0007) +[2023-10-13 02:34:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114622464. Throughput: 0: 1680.0, 1: 1659.6. Samples: 28658604. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:43,608][45375] Avg episode reward: [(0, '56.400'), (1, '47.510')] +[2023-10-13 02:34:45,773][46663] Updated weights for policy 1, policy_version 55941 (0.0010) +[2023-10-13 02:34:46,144][46663] Updated weights for policy 1, policy_version 55951 (0.0009) +[2023-10-13 02:34:46,209][46662] Updated weights for policy 0, policy_version 56010 (0.0007) +[2023-10-13 02:34:46,502][46663] Updated weights for policy 1, policy_version 55961 (0.0009) +[2023-10-13 02:34:46,575][46662] Updated weights for policy 0, policy_version 56020 (0.0008) +[2023-10-13 02:34:46,937][46662] Updated weights for policy 0, policy_version 56030 (0.0008) +[2023-10-13 02:34:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114688000. Throughput: 0: 1659.3, 1: 1671.8. Samples: 28677750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:48,607][45375] Avg episode reward: [(0, '55.720'), (1, '47.100')] +[2023-10-13 02:34:50,675][46663] Updated weights for policy 1, policy_version 55971 (0.0008) +[2023-10-13 02:34:51,046][46663] Updated weights for policy 1, policy_version 55981 (0.0007) +[2023-10-13 02:34:51,054][46662] Updated weights for policy 0, policy_version 56040 (0.0009) +[2023-10-13 02:34:51,406][46663] Updated weights for policy 1, policy_version 55991 (0.0008) +[2023-10-13 02:34:51,431][46662] Updated weights for policy 0, policy_version 56050 (0.0008) +[2023-10-13 02:34:51,803][46662] Updated weights for policy 0, policy_version 56060 (0.0007) +[2023-10-13 02:34:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114753536. Throughput: 0: 1671.2, 1: 1673.5. Samples: 28697856. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:53,608][45375] Avg episode reward: [(0, '54.210'), (1, '48.000')] +[2023-10-13 02:34:55,455][46663] Updated weights for policy 1, policy_version 56001 (0.0008) +[2023-10-13 02:34:55,736][46662] Updated weights for policy 0, policy_version 56070 (0.0009) +[2023-10-13 02:34:55,810][46663] Updated weights for policy 1, policy_version 56011 (0.0008) +[2023-10-13 02:34:56,101][46662] Updated weights for policy 0, policy_version 56080 (0.0008) +[2023-10-13 02:34:56,181][46663] Updated weights for policy 1, policy_version 56021 (0.0008) +[2023-10-13 02:34:56,476][46662] Updated weights for policy 0, policy_version 56090 (0.0008) +[2023-10-13 02:34:56,549][46663] Updated weights for policy 1, policy_version 56031 (0.0007) +[2023-10-13 02:34:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114819072. Throughput: 0: 1659.8, 1: 1656.0. Samples: 28708102. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:34:58,608][45375] Avg episode reward: [(0, '54.080'), (1, '47.160')] +[2023-10-13 02:35:00,633][46663] Updated weights for policy 1, policy_version 56041 (0.0007) +[2023-10-13 02:35:00,670][46662] Updated weights for policy 0, policy_version 56100 (0.0009) +[2023-10-13 02:35:01,007][46663] Updated weights for policy 1, policy_version 56051 (0.0007) +[2023-10-13 02:35:01,042][46662] Updated weights for policy 0, policy_version 56110 (0.0007) +[2023-10-13 02:35:01,373][46663] Updated weights for policy 1, policy_version 56061 (0.0009) +[2023-10-13 02:35:01,411][46662] Updated weights for policy 0, policy_version 56120 (0.0008) +[2023-10-13 02:35:03,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114884608. Throughput: 0: 1647.7, 1: 1674.6. Samples: 28727356. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:35:03,607][45375] Avg episode reward: [(0, '53.370'), (1, '46.800')] +[2023-10-13 02:35:05,368][46663] Updated weights for policy 1, policy_version 56071 (0.0008) +[2023-10-13 02:35:05,449][46662] Updated weights for policy 0, policy_version 56130 (0.0008) +[2023-10-13 02:35:05,736][46663] Updated weights for policy 1, policy_version 56081 (0.0008) +[2023-10-13 02:35:05,815][46662] Updated weights for policy 0, policy_version 56140 (0.0009) +[2023-10-13 02:35:06,096][46663] Updated weights for policy 1, policy_version 56091 (0.0008) +[2023-10-13 02:35:06,188][46662] Updated weights for policy 0, policy_version 56150 (0.0007) +[2023-10-13 02:35:06,558][46662] Updated weights for policy 0, policy_version 56160 (0.0008) +[2023-10-13 02:35:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114950144. Throughput: 0: 1667.9, 1: 1673.4. Samples: 28748230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:35:08,607][45375] Avg episode reward: [(0, '52.580'), (1, '47.530')] +[2023-10-13 02:35:10,291][46663] Updated weights for policy 1, policy_version 56101 (0.0009) +[2023-10-13 02:35:10,574][46662] Updated weights for policy 0, policy_version 56170 (0.0008) +[2023-10-13 02:35:10,653][46663] Updated weights for policy 1, policy_version 56111 (0.0009) +[2023-10-13 02:35:10,945][46662] Updated weights for policy 0, policy_version 56180 (0.0008) +[2023-10-13 02:35:11,018][46663] Updated weights for policy 1, policy_version 56121 (0.0009) +[2023-10-13 02:35:11,322][46662] Updated weights for policy 0, policy_version 56190 (0.0009) +[2023-10-13 02:35:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115015680. Throughput: 0: 1655.2, 1: 1654.9. Samples: 28757858. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:35:13,607][45375] Avg episode reward: [(0, '52.640'), (1, '46.650')] +[2023-10-13 02:35:15,136][46663] Updated weights for policy 1, policy_version 56131 (0.0007) +[2023-10-13 02:35:15,397][46662] Updated weights for policy 0, policy_version 56200 (0.0009) +[2023-10-13 02:35:15,545][46663] Updated weights for policy 1, policy_version 56141 (0.0008) +[2023-10-13 02:35:15,757][46662] Updated weights for policy 0, policy_version 56210 (0.0007) +[2023-10-13 02:35:15,908][46663] Updated weights for policy 1, policy_version 56151 (0.0008) +[2023-10-13 02:35:16,121][46662] Updated weights for policy 0, policy_version 56220 (0.0007) +[2023-10-13 02:35:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115081216. Throughput: 0: 1662.2, 1: 1674.6. Samples: 28777780. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:18,607][45375] Avg episode reward: [(0, '51.670'), (1, '46.400')] +[2023-10-13 02:35:19,926][46663] Updated weights for policy 1, policy_version 56161 (0.0008) +[2023-10-13 02:35:20,299][46663] Updated weights for policy 1, policy_version 56171 (0.0007) +[2023-10-13 02:35:20,459][46662] Updated weights for policy 0, policy_version 56230 (0.0009) +[2023-10-13 02:35:20,665][46663] Updated weights for policy 1, policy_version 56181 (0.0007) +[2023-10-13 02:35:20,817][46662] Updated weights for policy 0, policy_version 56240 (0.0008) +[2023-10-13 02:35:21,025][46663] Updated weights for policy 1, policy_version 56191 (0.0008) +[2023-10-13 02:35:21,185][46662] Updated weights for policy 0, policy_version 56250 (0.0008) +[2023-10-13 02:35:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115146752. Throughput: 0: 1676.7, 1: 1670.5. Samples: 28798368. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:23,607][45375] Avg episode reward: [(0, '50.150'), (1, '46.260')] +[2023-10-13 02:35:25,056][46663] Updated weights for policy 1, policy_version 56201 (0.0011) +[2023-10-13 02:35:25,151][46662] Updated weights for policy 0, policy_version 56260 (0.0009) +[2023-10-13 02:35:25,420][46663] Updated weights for policy 1, policy_version 56211 (0.0009) +[2023-10-13 02:35:25,515][46662] Updated weights for policy 0, policy_version 56270 (0.0007) +[2023-10-13 02:35:25,786][46663] Updated weights for policy 1, policy_version 56221 (0.0008) +[2023-10-13 02:35:25,888][46662] Updated weights for policy 0, policy_version 56280 (0.0007) +[2023-10-13 02:35:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115212288. Throughput: 0: 1653.3, 1: 1666.2. Samples: 28807980. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:28,607][45375] Avg episode reward: [(0, '47.910'), (1, '45.960')] +[2023-10-13 02:35:29,913][46663] Updated weights for policy 1, policy_version 56231 (0.0010) +[2023-10-13 02:35:30,095][46662] Updated weights for policy 0, policy_version 56290 (0.0009) +[2023-10-13 02:35:30,287][46663] Updated weights for policy 1, policy_version 56241 (0.0008) +[2023-10-13 02:35:30,459][46662] Updated weights for policy 0, policy_version 56300 (0.0007) +[2023-10-13 02:35:30,648][46663] Updated weights for policy 1, policy_version 56251 (0.0008) +[2023-10-13 02:35:30,836][46662] Updated weights for policy 0, policy_version 56310 (0.0008) +[2023-10-13 02:35:31,201][46662] Updated weights for policy 0, policy_version 56320 (0.0009) +[2023-10-13 02:35:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 115277824. Throughput: 0: 1660.0, 1: 1675.8. Samples: 28827862. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:33,608][45375] Avg episode reward: [(0, '48.560'), (1, '47.140')] +[2023-10-13 02:35:34,729][46663] Updated weights for policy 1, policy_version 56261 (0.0009) +[2023-10-13 02:35:35,091][46663] Updated weights for policy 1, policy_version 56271 (0.0009) +[2023-10-13 02:35:35,347][46662] Updated weights for policy 0, policy_version 56330 (0.0007) +[2023-10-13 02:35:35,460][46663] Updated weights for policy 1, policy_version 56281 (0.0009) +[2023-10-13 02:35:35,723][46662] Updated weights for policy 0, policy_version 56340 (0.0009) +[2023-10-13 02:35:36,100][46662] Updated weights for policy 0, policy_version 56350 (0.0009) +[2023-10-13 02:35:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 115343360. Throughput: 0: 1669.1, 1: 1671.7. Samples: 28848192. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:38,608][45375] Avg episode reward: [(0, '48.330'), (1, '48.840')] +[2023-10-13 02:35:39,475][46663] Updated weights for policy 1, policy_version 56291 (0.0009) +[2023-10-13 02:35:39,849][46663] Updated weights for policy 1, policy_version 56301 (0.0007) +[2023-10-13 02:35:40,225][46663] Updated weights for policy 1, policy_version 56311 (0.0008) +[2023-10-13 02:35:40,242][46662] Updated weights for policy 0, policy_version 56360 (0.0008) +[2023-10-13 02:35:40,617][46662] Updated weights for policy 0, policy_version 56370 (0.0007) +[2023-10-13 02:35:40,978][46662] Updated weights for policy 0, policy_version 56380 (0.0008) +[2023-10-13 02:35:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 115408896. Throughput: 0: 1654.9, 1: 1666.9. Samples: 28857582. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:43,608][45375] Avg episode reward: [(0, '47.050'), (1, '50.640')] +[2023-10-13 02:35:44,243][46663] Updated weights for policy 1, policy_version 56321 (0.0007) +[2023-10-13 02:35:44,619][46663] Updated weights for policy 1, policy_version 56331 (0.0009) +[2023-10-13 02:35:44,905][46662] Updated weights for policy 0, policy_version 56390 (0.0008) +[2023-10-13 02:35:44,981][46663] Updated weights for policy 1, policy_version 56341 (0.0008) +[2023-10-13 02:35:45,275][46662] Updated weights for policy 0, policy_version 56400 (0.0008) +[2023-10-13 02:35:45,344][46663] Updated weights for policy 1, policy_version 56351 (0.0009) +[2023-10-13 02:35:45,641][46662] Updated weights for policy 0, policy_version 56410 (0.0008) +[2023-10-13 02:35:48,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115474432. Throughput: 0: 1673.8, 1: 1677.7. Samples: 28878172. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:48,607][45375] Avg episode reward: [(0, '47.850'), (1, '49.800')] +[2023-10-13 02:35:49,456][46663] Updated weights for policy 1, policy_version 56361 (0.0008) +[2023-10-13 02:35:49,751][46662] Updated weights for policy 0, policy_version 56420 (0.0009) +[2023-10-13 02:35:49,824][46663] Updated weights for policy 1, policy_version 56371 (0.0009) +[2023-10-13 02:35:50,112][46662] Updated weights for policy 0, policy_version 56430 (0.0008) +[2023-10-13 02:35:50,183][46663] Updated weights for policy 1, policy_version 56381 (0.0010) +[2023-10-13 02:35:50,482][46662] Updated weights for policy 0, policy_version 56440 (0.0009) +[2023-10-13 02:35:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 115539968. Throughput: 0: 1669.1, 1: 1677.4. Samples: 28898824. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 02:35:53,608][45375] Avg episode reward: [(0, '46.800'), (1, '50.120')] +[2023-10-13 02:35:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000056384_57737216.pth... +[2023-10-13 02:35:53,621][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000056448_57802752.pth... +[2023-10-13 02:35:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000054880_56197120.pth +[2023-10-13 02:35:53,659][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000054816_56131584.pth +[2023-10-13 02:35:54,301][46663] Updated weights for policy 1, policy_version 56391 (0.0009) +[2023-10-13 02:35:54,563][46662] Updated weights for policy 0, policy_version 56450 (0.0009) +[2023-10-13 02:35:54,673][46663] Updated weights for policy 1, policy_version 56401 (0.0010) +[2023-10-13 02:35:54,929][46662] Updated weights for policy 0, policy_version 56460 (0.0007) +[2023-10-13 02:35:55,034][46663] Updated weights for policy 1, policy_version 56411 (0.0011) +[2023-10-13 02:35:55,299][46662] Updated weights for policy 0, policy_version 56470 (0.0011) +[2023-10-13 02:35:55,670][46662] Updated weights for policy 0, policy_version 56480 (0.0012) +[2023-10-13 02:35:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115605504. Throughput: 0: 1657.4, 1: 1674.5. Samples: 28907794. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:35:58,607][45375] Avg episode reward: [(0, '46.780'), (1, '49.500')] +[2023-10-13 02:35:59,177][46663] Updated weights for policy 1, policy_version 56421 (0.0010) +[2023-10-13 02:35:59,539][46663] Updated weights for policy 1, policy_version 56431 (0.0009) +[2023-10-13 02:35:59,697][46662] Updated weights for policy 0, policy_version 56490 (0.0010) +[2023-10-13 02:35:59,902][46663] Updated weights for policy 1, policy_version 56441 (0.0007) +[2023-10-13 02:36:00,064][46662] Updated weights for policy 0, policy_version 56500 (0.0008) +[2023-10-13 02:36:00,445][46662] Updated weights for policy 0, policy_version 56510 (0.0007) +[2023-10-13 02:36:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 115671040. Throughput: 0: 1670.7, 1: 1673.0. Samples: 28928248. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:03,608][45375] Avg episode reward: [(0, '47.510'), (1, '50.200')] +[2023-10-13 02:36:04,132][46663] Updated weights for policy 1, policy_version 56451 (0.0009) +[2023-10-13 02:36:04,545][46663] Updated weights for policy 1, policy_version 56461 (0.0007) +[2023-10-13 02:36:04,556][46662] Updated weights for policy 0, policy_version 56520 (0.0009) +[2023-10-13 02:36:04,915][46662] Updated weights for policy 0, policy_version 56530 (0.0008) +[2023-10-13 02:36:04,925][46663] Updated weights for policy 1, policy_version 56471 (0.0007) +[2023-10-13 02:36:05,293][46662] Updated weights for policy 0, policy_version 56540 (0.0009) +[2023-10-13 02:36:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115736576. Throughput: 0: 1669.6, 1: 1669.1. Samples: 28948610. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:08,607][45375] Avg episode reward: [(0, '47.690'), (1, '51.420')] +[2023-10-13 02:36:08,946][46663] Updated weights for policy 1, policy_version 56481 (0.0008) +[2023-10-13 02:36:09,319][46663] Updated weights for policy 1, policy_version 56491 (0.0010) +[2023-10-13 02:36:09,539][46662] Updated weights for policy 0, policy_version 56550 (0.0009) +[2023-10-13 02:36:09,688][46663] Updated weights for policy 1, policy_version 56501 (0.0007) +[2023-10-13 02:36:09,898][46662] Updated weights for policy 0, policy_version 56560 (0.0007) +[2023-10-13 02:36:10,052][46663] Updated weights for policy 1, policy_version 56511 (0.0007) +[2023-10-13 02:36:10,270][46662] Updated weights for policy 0, policy_version 56570 (0.0008) +[2023-10-13 02:36:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 115802112. Throughput: 0: 1659.5, 1: 1665.8. Samples: 28957620. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:13,608][45375] Avg episode reward: [(0, '47.760'), (1, '51.910')] +[2023-10-13 02:36:14,174][46663] Updated weights for policy 1, policy_version 56521 (0.0008) +[2023-10-13 02:36:14,340][46662] Updated weights for policy 0, policy_version 56580 (0.0009) +[2023-10-13 02:36:14,542][46663] Updated weights for policy 1, policy_version 56531 (0.0008) +[2023-10-13 02:36:14,701][46662] Updated weights for policy 0, policy_version 56590 (0.0008) +[2023-10-13 02:36:14,911][46663] Updated weights for policy 1, policy_version 56541 (0.0007) +[2023-10-13 02:36:15,079][46662] Updated weights for policy 0, policy_version 56600 (0.0008) +[2023-10-13 02:36:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115867648. Throughput: 0: 1675.1, 1: 1667.2. Samples: 28978264. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:18,607][45375] Avg episode reward: [(0, '47.940'), (1, '52.370')] +[2023-10-13 02:36:19,030][46662] Updated weights for policy 0, policy_version 56610 (0.0008) +[2023-10-13 02:36:19,050][46663] Updated weights for policy 1, policy_version 56551 (0.0008) +[2023-10-13 02:36:19,410][46662] Updated weights for policy 0, policy_version 56620 (0.0008) +[2023-10-13 02:36:19,415][46663] Updated weights for policy 1, policy_version 56561 (0.0008) +[2023-10-13 02:36:19,783][46662] Updated weights for policy 0, policy_version 56630 (0.0008) +[2023-10-13 02:36:19,792][46663] Updated weights for policy 1, policy_version 56571 (0.0009) +[2023-10-13 02:36:20,147][46662] Updated weights for policy 0, policy_version 56640 (0.0009) +[2023-10-13 02:36:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115933184. Throughput: 0: 1675.9, 1: 1671.3. Samples: 28998812. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:23,607][45375] Avg episode reward: [(0, '48.640'), (1, '53.130')] +[2023-10-13 02:36:23,923][46663] Updated weights for policy 1, policy_version 56581 (0.0007) +[2023-10-13 02:36:24,292][46663] Updated weights for policy 1, policy_version 56591 (0.0007) +[2023-10-13 02:36:24,342][46662] Updated weights for policy 0, policy_version 56650 (0.0007) +[2023-10-13 02:36:24,669][46663] Updated weights for policy 1, policy_version 56601 (0.0011) +[2023-10-13 02:36:24,715][46662] Updated weights for policy 0, policy_version 56660 (0.0009) +[2023-10-13 02:36:25,090][46662] Updated weights for policy 0, policy_version 56670 (0.0008) +[2023-10-13 02:36:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115998720. Throughput: 0: 1669.9, 1: 1670.3. Samples: 29007892. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:28,607][45375] Avg episode reward: [(0, '49.480'), (1, '55.020')] +[2023-10-13 02:36:28,726][46663] Updated weights for policy 1, policy_version 56611 (0.0009) +[2023-10-13 02:36:29,087][46663] Updated weights for policy 1, policy_version 56621 (0.0007) +[2023-10-13 02:36:29,358][46662] Updated weights for policy 0, policy_version 56680 (0.0008) +[2023-10-13 02:36:29,465][46663] Updated weights for policy 1, policy_version 56631 (0.0008) +[2023-10-13 02:36:29,731][46662] Updated weights for policy 0, policy_version 56690 (0.0010) +[2023-10-13 02:36:30,112][46662] Updated weights for policy 0, policy_version 56700 (0.0010) +[2023-10-13 02:36:33,599][46663] Updated weights for policy 1, policy_version 56641 (0.0010) +[2023-10-13 02:36:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116064256. Throughput: 0: 1672.4, 1: 1665.0. Samples: 29028352. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-13 02:36:33,607][45375] Avg episode reward: [(0, '50.450'), (1, '54.690')] +[2023-10-13 02:36:33,963][46663] Updated weights for policy 1, policy_version 56651 (0.0009) +[2023-10-13 02:36:34,185][46662] Updated weights for policy 0, policy_version 56710 (0.0009) +[2023-10-13 02:36:34,328][46663] Updated weights for policy 1, policy_version 56661 (0.0008) +[2023-10-13 02:36:34,551][46662] Updated weights for policy 0, policy_version 56720 (0.0009) +[2023-10-13 02:36:34,689][46663] Updated weights for policy 1, policy_version 56671 (0.0009) +[2023-10-13 02:36:34,923][46662] Updated weights for policy 0, policy_version 56730 (0.0007) +[2023-10-13 02:36:38,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116129792. Throughput: 0: 1669.3, 1: 1664.2. Samples: 29048832. 
Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:36:38,608][45375] Avg episode reward: [(0, '49.020'), (1, '53.330')] +[2023-10-13 02:36:38,779][46663] Updated weights for policy 1, policy_version 56681 (0.0008) +[2023-10-13 02:36:38,967][46662] Updated weights for policy 0, policy_version 56740 (0.0009) +[2023-10-13 02:36:39,152][46663] Updated weights for policy 1, policy_version 56691 (0.0008) +[2023-10-13 02:36:39,337][46662] Updated weights for policy 0, policy_version 56750 (0.0010) +[2023-10-13 02:36:39,520][46663] Updated weights for policy 1, policy_version 56701 (0.0009) +[2023-10-13 02:36:39,704][46662] Updated weights for policy 0, policy_version 56760 (0.0011) +[2023-10-13 02:36:43,558][46663] Updated weights for policy 1, policy_version 56711 (0.0008) +[2023-10-13 02:36:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116195328. Throughput: 0: 1669.1, 1: 1668.1. Samples: 29057970. Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:36:43,608][45375] Avg episode reward: [(0, '49.780'), (1, '51.850')] +[2023-10-13 02:36:43,861][46662] Updated weights for policy 0, policy_version 56770 (0.0008) +[2023-10-13 02:36:43,924][46663] Updated weights for policy 1, policy_version 56721 (0.0007) +[2023-10-13 02:36:44,230][46662] Updated weights for policy 0, policy_version 56780 (0.0008) +[2023-10-13 02:36:44,296][46663] Updated weights for policy 1, policy_version 56731 (0.0008) +[2023-10-13 02:36:44,591][46662] Updated weights for policy 0, policy_version 56790 (0.0008) +[2023-10-13 02:36:44,961][46662] Updated weights for policy 0, policy_version 56800 (0.0007) +[2023-10-13 02:36:48,456][46663] Updated weights for policy 1, policy_version 56741 (0.0007) +[2023-10-13 02:36:48,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116260864. Throughput: 0: 1669.8, 1: 1676.0. Samples: 29078808. 
Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:36:48,607][45375] Avg episode reward: [(0, '48.790'), (1, '50.880')] +[2023-10-13 02:36:48,854][46663] Updated weights for policy 1, policy_version 56751 (0.0009) +[2023-10-13 02:36:49,005][46662] Updated weights for policy 0, policy_version 56810 (0.0007) +[2023-10-13 02:36:49,218][46663] Updated weights for policy 1, policy_version 56761 (0.0008) +[2023-10-13 02:36:49,378][46662] Updated weights for policy 0, policy_version 56820 (0.0007) +[2023-10-13 02:36:49,749][46662] Updated weights for policy 0, policy_version 56830 (0.0009) +[2023-10-13 02:36:53,446][46663] Updated weights for policy 1, policy_version 56771 (0.0008) +[2023-10-13 02:36:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 116326400. Throughput: 0: 1669.5, 1: 1673.2. Samples: 29099034. Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:36:53,608][45375] Avg episode reward: [(0, '48.540'), (1, '51.370')] +[2023-10-13 02:36:53,818][46663] Updated weights for policy 1, policy_version 56781 (0.0008) +[2023-10-13 02:36:53,855][46662] Updated weights for policy 0, policy_version 56840 (0.0010) +[2023-10-13 02:36:54,187][46663] Updated weights for policy 1, policy_version 56791 (0.0008) +[2023-10-13 02:36:54,226][46662] Updated weights for policy 0, policy_version 56850 (0.0010) +[2023-10-13 02:36:54,596][46662] Updated weights for policy 0, policy_version 56860 (0.0009) +[2023-10-13 02:36:58,306][46663] Updated weights for policy 1, policy_version 56801 (0.0009) +[2023-10-13 02:36:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116391936. Throughput: 0: 1668.6, 1: 1671.3. Samples: 29107914. 
Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:36:58,607][45375] Avg episode reward: [(0, '47.720'), (1, '52.070')] +[2023-10-13 02:36:58,672][46663] Updated weights for policy 1, policy_version 56811 (0.0008) +[2023-10-13 02:36:58,761][46662] Updated weights for policy 0, policy_version 56870 (0.0008) +[2023-10-13 02:36:59,038][46663] Updated weights for policy 1, policy_version 56821 (0.0007) +[2023-10-13 02:36:59,122][46662] Updated weights for policy 0, policy_version 56880 (0.0008) +[2023-10-13 02:36:59,411][46663] Updated weights for policy 1, policy_version 56831 (0.0008) +[2023-10-13 02:36:59,497][46662] Updated weights for policy 0, policy_version 56890 (0.0008) +[2023-10-13 02:37:03,567][46663] Updated weights for policy 1, policy_version 56841 (0.0007) +[2023-10-13 02:37:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116457472. Throughput: 0: 1664.0, 1: 1670.1. Samples: 29128300. Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:37:03,608][45375] Avg episode reward: [(0, '47.190'), (1, '51.780')] +[2023-10-13 02:37:03,639][46662] Updated weights for policy 0, policy_version 56900 (0.0008) +[2023-10-13 02:37:03,925][46663] Updated weights for policy 1, policy_version 56851 (0.0008) +[2023-10-13 02:37:04,008][46662] Updated weights for policy 0, policy_version 56910 (0.0008) +[2023-10-13 02:37:04,298][46663] Updated weights for policy 1, policy_version 56861 (0.0007) +[2023-10-13 02:37:04,374][46662] Updated weights for policy 0, policy_version 56920 (0.0009) +[2023-10-13 02:37:08,206][46662] Updated weights for policy 0, policy_version 56930 (0.0008) +[2023-10-13 02:37:08,373][46663] Updated weights for policy 1, policy_version 56871 (0.0007) +[2023-10-13 02:37:08,564][46662] Updated weights for policy 0, policy_version 56940 (0.0008) +[2023-10-13 02:37:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116523008. 
Throughput: 0: 1673.0, 1: 1660.0. Samples: 29148796. Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:37:08,607][45375] Avg episode reward: [(0, '47.270'), (1, '50.280')] +[2023-10-13 02:37:08,748][46663] Updated weights for policy 1, policy_version 56881 (0.0008) +[2023-10-13 02:37:08,943][46662] Updated weights for policy 0, policy_version 56950 (0.0009) +[2023-10-13 02:37:09,115][46663] Updated weights for policy 1, policy_version 56891 (0.0008) +[2023-10-13 02:37:09,309][46662] Updated weights for policy 0, policy_version 56960 (0.0009) +[2023-10-13 02:37:13,197][46663] Updated weights for policy 1, policy_version 56901 (0.0009) +[2023-10-13 02:37:13,556][46662] Updated weights for policy 0, policy_version 56970 (0.0007) +[2023-10-13 02:37:13,565][46663] Updated weights for policy 1, policy_version 56911 (0.0009) +[2023-10-13 02:37:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 116588544. Throughput: 0: 1669.5, 1: 1666.3. Samples: 29158000. Policy #0 lag: (min: 15.0, avg: 21.1, max: 47.0) +[2023-10-13 02:37:13,607][45375] Avg episode reward: [(0, '45.480'), (1, '51.360')] +[2023-10-13 02:37:13,921][46662] Updated weights for policy 0, policy_version 56980 (0.0007) +[2023-10-13 02:37:13,929][46663] Updated weights for policy 1, policy_version 56921 (0.0009) +[2023-10-13 02:37:14,293][46662] Updated weights for policy 0, policy_version 56990 (0.0008) +[2023-10-13 02:37:17,870][46663] Updated weights for policy 1, policy_version 56931 (0.0010) +[2023-10-13 02:37:18,235][46663] Updated weights for policy 1, policy_version 56941 (0.0010) +[2023-10-13 02:37:18,382][46662] Updated weights for policy 0, policy_version 57000 (0.0010) +[2023-10-13 02:37:18,601][46663] Updated weights for policy 1, policy_version 56951 (0.0009) +[2023-10-13 02:37:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116654080. Throughput: 0: 1671.0, 1: 1670.6. Samples: 29178724. 
Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:18,607][45375] Avg episode reward: [(0, '45.740'), (1, '49.370')] +[2023-10-13 02:37:18,757][46662] Updated weights for policy 0, policy_version 57010 (0.0008) +[2023-10-13 02:37:19,125][46662] Updated weights for policy 0, policy_version 57020 (0.0009) +[2023-10-13 02:37:22,761][46663] Updated weights for policy 1, policy_version 56961 (0.0008) +[2023-10-13 02:37:23,127][46663] Updated weights for policy 1, policy_version 56971 (0.0009) +[2023-10-13 02:37:23,149][46662] Updated weights for policy 0, policy_version 57030 (0.0009) +[2023-10-13 02:37:23,486][46663] Updated weights for policy 1, policy_version 56981 (0.0007) +[2023-10-13 02:37:23,511][46662] Updated weights for policy 0, policy_version 57040 (0.0008) +[2023-10-13 02:37:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116719616. Throughput: 0: 1674.1, 1: 1655.3. Samples: 29198656. Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:23,607][45375] Avg episode reward: [(0, '45.230'), (1, '48.180')] +[2023-10-13 02:37:23,857][46663] Updated weights for policy 1, policy_version 56991 (0.0007) +[2023-10-13 02:37:23,885][46662] Updated weights for policy 0, policy_version 57050 (0.0008) +[2023-10-13 02:37:27,944][46663] Updated weights for policy 1, policy_version 57001 (0.0010) +[2023-10-13 02:37:27,959][46662] Updated weights for policy 0, policy_version 57060 (0.0008) +[2023-10-13 02:37:28,307][46663] Updated weights for policy 1, policy_version 57011 (0.0008) +[2023-10-13 02:37:28,315][46662] Updated weights for policy 0, policy_version 57070 (0.0007) +[2023-10-13 02:37:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116785152. Throughput: 0: 1674.7, 1: 1671.2. Samples: 29208536. 
Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:28,607][45375] Avg episode reward: [(0, '45.530'), (1, '46.770')] +[2023-10-13 02:37:28,683][46662] Updated weights for policy 0, policy_version 57080 (0.0007) +[2023-10-13 02:37:28,685][46663] Updated weights for policy 1, policy_version 57021 (0.0010) +[2023-10-13 02:37:32,796][46663] Updated weights for policy 1, policy_version 57031 (0.0009) +[2023-10-13 02:37:32,918][46662] Updated weights for policy 0, policy_version 57090 (0.0008) +[2023-10-13 02:37:33,164][46663] Updated weights for policy 1, policy_version 57041 (0.0007) +[2023-10-13 02:37:33,283][46662] Updated weights for policy 0, policy_version 57100 (0.0009) +[2023-10-13 02:37:33,533][46663] Updated weights for policy 1, policy_version 57051 (0.0007) +[2023-10-13 02:37:33,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 116850688. Throughput: 0: 1671.1, 1: 1665.6. Samples: 29228960. Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:33,608][45375] Avg episode reward: [(0, '44.790'), (1, '46.650')] +[2023-10-13 02:37:33,655][46662] Updated weights for policy 0, policy_version 57110 (0.0010) +[2023-10-13 02:37:34,016][46662] Updated weights for policy 0, policy_version 57120 (0.0011) +[2023-10-13 02:37:37,734][46663] Updated weights for policy 1, policy_version 57061 (0.0008) +[2023-10-13 02:37:38,099][46663] Updated weights for policy 1, policy_version 57071 (0.0008) +[2023-10-13 02:37:38,177][46662] Updated weights for policy 0, policy_version 57130 (0.0008) +[2023-10-13 02:37:38,466][46663] Updated weights for policy 1, policy_version 57081 (0.0009) +[2023-10-13 02:37:38,552][46662] Updated weights for policy 0, policy_version 57140 (0.0009) +[2023-10-13 02:37:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 116916224. Throughput: 0: 1668.3, 1: 1651.7. Samples: 29248432. 
Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:38,607][45375] Avg episode reward: [(0, '44.810'), (1, '46.400')] +[2023-10-13 02:37:38,916][46662] Updated weights for policy 0, policy_version 57150 (0.0010) +[2023-10-13 02:37:42,370][46663] Updated weights for policy 1, policy_version 57091 (0.0008) +[2023-10-13 02:37:42,736][46663] Updated weights for policy 1, policy_version 57101 (0.0008) +[2023-10-13 02:37:42,988][46662] Updated weights for policy 0, policy_version 57160 (0.0009) +[2023-10-13 02:37:43,104][46663] Updated weights for policy 1, policy_version 57111 (0.0008) +[2023-10-13 02:37:43,368][46662] Updated weights for policy 0, policy_version 57170 (0.0007) +[2023-10-13 02:37:43,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117014528. Throughput: 0: 1674.0, 1: 1671.6. Samples: 29258468. Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:43,608][45375] Avg episode reward: [(0, '46.670'), (1, '47.770')] +[2023-10-13 02:37:43,743][46662] Updated weights for policy 0, policy_version 57180 (0.0009) +[2023-10-13 02:37:47,201][46663] Updated weights for policy 1, policy_version 57121 (0.0008) +[2023-10-13 02:37:47,562][46663] Updated weights for policy 1, policy_version 57131 (0.0009) +[2023-10-13 02:37:47,661][46662] Updated weights for policy 0, policy_version 57190 (0.0009) +[2023-10-13 02:37:47,918][46663] Updated weights for policy 1, policy_version 57141 (0.0008) +[2023-10-13 02:37:48,030][46662] Updated weights for policy 0, policy_version 57200 (0.0009) +[2023-10-13 02:37:48,294][46663] Updated weights for policy 1, policy_version 57151 (0.0009) +[2023-10-13 02:37:48,396][46662] Updated weights for policy 0, policy_version 57210 (0.0009) +[2023-10-13 02:37:48,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117080064. Throughput: 0: 1680.6, 1: 1668.0. Samples: 29278990. 
Policy #0 lag: (min: 5.0, avg: 12.8, max: 37.0) +[2023-10-13 02:37:48,607][45375] Avg episode reward: [(0, '47.020'), (1, '48.110')] +[2023-10-13 02:37:52,260][46663] Updated weights for policy 1, policy_version 57161 (0.0010) +[2023-10-13 02:37:52,389][46662] Updated weights for policy 0, policy_version 57220 (0.0008) +[2023-10-13 02:37:52,622][46663] Updated weights for policy 1, policy_version 57171 (0.0008) +[2023-10-13 02:37:52,767][46662] Updated weights for policy 0, policy_version 57230 (0.0008) +[2023-10-13 02:37:52,988][46663] Updated weights for policy 1, policy_version 57181 (0.0007) +[2023-10-13 02:37:53,149][46662] Updated weights for policy 0, policy_version 57240 (0.0007) +[2023-10-13 02:37:53,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 117178368. Throughput: 0: 1664.7, 1: 1658.1. Samples: 29298322. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:37:53,608][45375] Avg episode reward: [(0, '46.510'), (1, '47.840')] +[2023-10-13 02:37:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000057248_58621952.pth... +[2023-10-13 02:37:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000057184_58556416.pth... 
+[2023-10-13 02:37:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000055680_57016320.pth +[2023-10-13 02:37:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000055616_56950784.pth +[2023-10-13 02:37:57,077][46663] Updated weights for policy 1, policy_version 57191 (0.0007) +[2023-10-13 02:37:57,179][46662] Updated weights for policy 0, policy_version 57250 (0.0007) +[2023-10-13 02:37:57,440][46663] Updated weights for policy 1, policy_version 57201 (0.0009) +[2023-10-13 02:37:57,556][46662] Updated weights for policy 0, policy_version 57260 (0.0007) +[2023-10-13 02:37:57,811][46663] Updated weights for policy 1, policy_version 57211 (0.0009) +[2023-10-13 02:37:57,921][46662] Updated weights for policy 0, policy_version 57270 (0.0008) +[2023-10-13 02:37:58,292][46662] Updated weights for policy 0, policy_version 57280 (0.0008) +[2023-10-13 02:37:58,607][45375] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 117243904. Throughput: 0: 1681.8, 1: 1678.4. Samples: 29309210. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:37:58,607][45375] Avg episode reward: [(0, '47.060'), (1, '47.960')] +[2023-10-13 02:38:02,092][46663] Updated weights for policy 1, policy_version 57221 (0.0008) +[2023-10-13 02:38:02,400][46662] Updated weights for policy 0, policy_version 57290 (0.0009) +[2023-10-13 02:38:02,457][46663] Updated weights for policy 1, policy_version 57231 (0.0009) +[2023-10-13 02:38:02,774][46662] Updated weights for policy 0, policy_version 57300 (0.0009) +[2023-10-13 02:38:02,823][46663] Updated weights for policy 1, policy_version 57241 (0.0009) +[2023-10-13 02:38:03,143][46662] Updated weights for policy 0, policy_version 57310 (0.0007) +[2023-10-13 02:38:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 117309440. Throughput: 0: 1680.4, 1: 1658.8. Samples: 29328984. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:38:03,607][45375] Avg episode reward: [(0, '49.070'), (1, '47.890')] +[2023-10-13 02:38:06,949][46663] Updated weights for policy 1, policy_version 57251 (0.0009) +[2023-10-13 02:38:07,312][46663] Updated weights for policy 1, policy_version 57261 (0.0007) +[2023-10-13 02:38:07,345][46662] Updated weights for policy 0, policy_version 57320 (0.0009) +[2023-10-13 02:38:07,687][46663] Updated weights for policy 1, policy_version 57271 (0.0009) +[2023-10-13 02:38:07,730][46662] Updated weights for policy 0, policy_version 57330 (0.0009) +[2023-10-13 02:38:08,092][46662] Updated weights for policy 0, policy_version 57340 (0.0008) +[2023-10-13 02:38:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 117374976. Throughput: 0: 1658.4, 1: 1657.4. Samples: 29347870. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:38:08,607][45375] Avg episode reward: [(0, '48.390'), (1, '47.330')] +[2023-10-13 02:38:11,820][46663] Updated weights for policy 1, policy_version 57281 (0.0009) +[2023-10-13 02:38:12,182][46663] Updated weights for policy 1, policy_version 57291 (0.0009) +[2023-10-13 02:38:12,319][46662] Updated weights for policy 0, policy_version 57350 (0.0008) +[2023-10-13 02:38:12,553][46663] Updated weights for policy 1, policy_version 57301 (0.0008) +[2023-10-13 02:38:12,681][46662] Updated weights for policy 0, policy_version 57360 (0.0008) +[2023-10-13 02:38:12,920][46663] Updated weights for policy 1, policy_version 57311 (0.0007) +[2023-10-13 02:38:13,047][46662] Updated weights for policy 0, policy_version 57370 (0.0007) +[2023-10-13 02:38:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 117440512. Throughput: 0: 1677.6, 1: 1668.6. Samples: 29359118. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:38:13,608][45375] Avg episode reward: [(0, '48.750'), (1, '45.920')] +[2023-10-13 02:38:16,871][46663] Updated weights for policy 1, policy_version 57321 (0.0008) +[2023-10-13 02:38:17,237][46663] Updated weights for policy 1, policy_version 57331 (0.0009) +[2023-10-13 02:38:17,246][46662] Updated weights for policy 0, policy_version 57380 (0.0008) +[2023-10-13 02:38:17,607][46663] Updated weights for policy 1, policy_version 57341 (0.0008) +[2023-10-13 02:38:17,613][46662] Updated weights for policy 0, policy_version 57390 (0.0009) +[2023-10-13 02:38:17,979][46662] Updated weights for policy 0, policy_version 57400 (0.0009) +[2023-10-13 02:38:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 117506048. Throughput: 0: 1679.4, 1: 1651.3. Samples: 29378842. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:38:18,607][45375] Avg episode reward: [(0, '50.130'), (1, '46.000')] +[2023-10-13 02:38:21,809][46663] Updated weights for policy 1, policy_version 57351 (0.0008) +[2023-10-13 02:38:22,087][46662] Updated weights for policy 0, policy_version 57410 (0.0009) +[2023-10-13 02:38:22,173][46663] Updated weights for policy 1, policy_version 57361 (0.0007) +[2023-10-13 02:38:22,452][46662] Updated weights for policy 0, policy_version 57420 (0.0010) +[2023-10-13 02:38:22,532][46663] Updated weights for policy 1, policy_version 57371 (0.0008) +[2023-10-13 02:38:22,821][46662] Updated weights for policy 0, policy_version 57430 (0.0007) +[2023-10-13 02:38:23,180][46662] Updated weights for policy 0, policy_version 57440 (0.0009) +[2023-10-13 02:38:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 117571584. Throughput: 0: 1662.4, 1: 1661.1. Samples: 29397988. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:38:23,607][45375] Avg episode reward: [(0, '48.670'), (1, '46.390')] +[2023-10-13 02:38:26,774][46663] Updated weights for policy 1, policy_version 57381 (0.0007) +[2023-10-13 02:38:27,089][46662] Updated weights for policy 0, policy_version 57450 (0.0008) +[2023-10-13 02:38:27,170][46663] Updated weights for policy 1, policy_version 57391 (0.0008) +[2023-10-13 02:38:27,464][46662] Updated weights for policy 0, policy_version 57460 (0.0010) +[2023-10-13 02:38:27,539][46663] Updated weights for policy 1, policy_version 57401 (0.0010) +[2023-10-13 02:38:27,841][46662] Updated weights for policy 0, policy_version 57470 (0.0008) +[2023-10-13 02:38:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 117637120. Throughput: 0: 1678.4, 1: 1669.0. Samples: 29409100. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 02:38:28,607][45375] Avg episode reward: [(0, '47.930'), (1, '47.100')] +[2023-10-13 02:38:31,643][46663] Updated weights for policy 1, policy_version 57411 (0.0009) +[2023-10-13 02:38:32,011][46663] Updated weights for policy 1, policy_version 57421 (0.0007) +[2023-10-13 02:38:32,029][46662] Updated weights for policy 0, policy_version 57480 (0.0008) +[2023-10-13 02:38:32,370][46663] Updated weights for policy 1, policy_version 57431 (0.0007) +[2023-10-13 02:38:32,399][46662] Updated weights for policy 0, policy_version 57490 (0.0009) +[2023-10-13 02:38:32,773][46662] Updated weights for policy 0, policy_version 57500 (0.0009) +[2023-10-13 02:38:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 117702656. Throughput: 0: 1671.0, 1: 1651.4. Samples: 29428498. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:38:33,607][45375] Avg episode reward: [(0, '46.920'), (1, '46.380')] +[2023-10-13 02:38:36,321][46663] Updated weights for policy 1, policy_version 57441 (0.0007) +[2023-10-13 02:38:36,704][46663] Updated weights for policy 1, policy_version 57451 (0.0008) +[2023-10-13 02:38:36,817][46662] Updated weights for policy 0, policy_version 57510 (0.0010) +[2023-10-13 02:38:37,067][46663] Updated weights for policy 1, policy_version 57461 (0.0007) +[2023-10-13 02:38:37,183][46662] Updated weights for policy 0, policy_version 57520 (0.0007) +[2023-10-13 02:38:37,438][46663] Updated weights for policy 1, policy_version 57471 (0.0008) +[2023-10-13 02:38:37,549][46662] Updated weights for policy 0, policy_version 57530 (0.0007) +[2023-10-13 02:38:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 117768192. Throughput: 0: 1657.7, 1: 1662.6. Samples: 29447734. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:38:38,607][45375] Avg episode reward: [(0, '47.240'), (1, '46.560')] +[2023-10-13 02:38:41,514][46662] Updated weights for policy 0, policy_version 57540 (0.0009) +[2023-10-13 02:38:41,594][46663] Updated weights for policy 1, policy_version 57481 (0.0009) +[2023-10-13 02:38:41,878][46662] Updated weights for policy 0, policy_version 57550 (0.0008) +[2023-10-13 02:38:41,962][46663] Updated weights for policy 1, policy_version 57491 (0.0008) +[2023-10-13 02:38:42,253][46662] Updated weights for policy 0, policy_version 57560 (0.0009) +[2023-10-13 02:38:42,328][46663] Updated weights for policy 1, policy_version 57501 (0.0010) +[2023-10-13 02:38:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 117833728. Throughput: 0: 1675.4, 1: 1661.2. Samples: 29459356. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:38:43,607][45375] Avg episode reward: [(0, '46.390'), (1, '45.880')] +[2023-10-13 02:38:46,327][46663] Updated weights for policy 1, policy_version 57511 (0.0010) +[2023-10-13 02:38:46,442][46662] Updated weights for policy 0, policy_version 57570 (0.0009) +[2023-10-13 02:38:46,700][46663] Updated weights for policy 1, policy_version 57521 (0.0008) +[2023-10-13 02:38:46,812][46662] Updated weights for policy 0, policy_version 57580 (0.0008) +[2023-10-13 02:38:47,062][46663] Updated weights for policy 1, policy_version 57531 (0.0009) +[2023-10-13 02:38:47,192][46662] Updated weights for policy 0, policy_version 57590 (0.0007) +[2023-10-13 02:38:47,551][46662] Updated weights for policy 0, policy_version 57600 (0.0008) +[2023-10-13 02:38:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 117899264. Throughput: 0: 1672.2, 1: 1653.0. Samples: 29478618. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:38:48,607][45375] Avg episode reward: [(0, '46.110'), (1, '44.160')] +[2023-10-13 02:38:51,159][46663] Updated weights for policy 1, policy_version 57541 (0.0008) +[2023-10-13 02:38:51,497][46662] Updated weights for policy 0, policy_version 57610 (0.0007) +[2023-10-13 02:38:51,520][46663] Updated weights for policy 1, policy_version 57551 (0.0007) +[2023-10-13 02:38:51,862][46662] Updated weights for policy 0, policy_version 57620 (0.0008) +[2023-10-13 02:38:51,880][46663] Updated weights for policy 1, policy_version 57561 (0.0008) +[2023-10-13 02:38:52,239][46662] Updated weights for policy 0, policy_version 57630 (0.0008) +[2023-10-13 02:38:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 117964800. Throughput: 0: 1674.7, 1: 1672.4. Samples: 29498488. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:38:53,607][45375] Avg episode reward: [(0, '46.770'), (1, '44.330')] +[2023-10-13 02:38:55,975][46663] Updated weights for policy 1, policy_version 57571 (0.0009) +[2023-10-13 02:38:56,328][46663] Updated weights for policy 1, policy_version 57581 (0.0008) +[2023-10-13 02:38:56,390][46662] Updated weights for policy 0, policy_version 57640 (0.0008) +[2023-10-13 02:38:56,697][46663] Updated weights for policy 1, policy_version 57591 (0.0009) +[2023-10-13 02:38:56,764][46662] Updated weights for policy 0, policy_version 57650 (0.0007) +[2023-10-13 02:38:57,135][46662] Updated weights for policy 0, policy_version 57660 (0.0009) +[2023-10-13 02:38:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118030336. Throughput: 0: 1681.0, 1: 1664.5. Samples: 29509666. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:38:58,607][45375] Avg episode reward: [(0, '46.470'), (1, '43.670')] +[2023-10-13 02:39:00,860][46663] Updated weights for policy 1, policy_version 57601 (0.0009) +[2023-10-13 02:39:01,166][46662] Updated weights for policy 0, policy_version 57670 (0.0008) +[2023-10-13 02:39:01,220][46663] Updated weights for policy 1, policy_version 57611 (0.0008) +[2023-10-13 02:39:01,542][46662] Updated weights for policy 0, policy_version 57680 (0.0008) +[2023-10-13 02:39:01,579][46663] Updated weights for policy 1, policy_version 57621 (0.0007) +[2023-10-13 02:39:01,900][46662] Updated weights for policy 0, policy_version 57690 (0.0008) +[2023-10-13 02:39:01,951][46663] Updated weights for policy 1, policy_version 57631 (0.0007) +[2023-10-13 02:39:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 118095872. Throughput: 0: 1659.8, 1: 1665.3. Samples: 29528474. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:39:03,608][45375] Avg episode reward: [(0, '44.800'), (1, '44.090')] +[2023-10-13 02:39:05,902][46662] Updated weights for policy 0, policy_version 57700 (0.0007) +[2023-10-13 02:39:06,118][46663] Updated weights for policy 1, policy_version 57641 (0.0009) +[2023-10-13 02:39:06,266][46662] Updated weights for policy 0, policy_version 57710 (0.0009) +[2023-10-13 02:39:06,491][46663] Updated weights for policy 1, policy_version 57651 (0.0008) +[2023-10-13 02:39:06,636][46662] Updated weights for policy 0, policy_version 57720 (0.0008) +[2023-10-13 02:39:06,861][46663] Updated weights for policy 1, policy_version 57661 (0.0009) +[2023-10-13 02:39:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118161408. Throughput: 0: 1674.3, 1: 1675.6. Samples: 29548734. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-13 02:39:08,607][45375] Avg episode reward: [(0, '45.610'), (1, '45.490')] +[2023-10-13 02:39:10,576][46662] Updated weights for policy 0, policy_version 57730 (0.0009) +[2023-10-13 02:39:10,947][46662] Updated weights for policy 0, policy_version 57740 (0.0008) +[2023-10-13 02:39:10,979][46663] Updated weights for policy 1, policy_version 57671 (0.0007) +[2023-10-13 02:39:11,322][46662] Updated weights for policy 0, policy_version 57750 (0.0007) +[2023-10-13 02:39:11,352][46663] Updated weights for policy 1, policy_version 57681 (0.0007) +[2023-10-13 02:39:11,692][46662] Updated weights for policy 0, policy_version 57760 (0.0009) +[2023-10-13 02:39:11,716][46663] Updated weights for policy 1, policy_version 57691 (0.0007) +[2023-10-13 02:39:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118226944. Throughput: 0: 1678.0, 1: 1662.2. Samples: 29559406. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:13,607][45375] Avg episode reward: [(0, '47.000'), (1, '45.630')] +[2023-10-13 02:39:15,686][46663] Updated weights for policy 1, policy_version 57701 (0.0008) +[2023-10-13 02:39:15,889][46662] Updated weights for policy 0, policy_version 57770 (0.0008) +[2023-10-13 02:39:16,048][46663] Updated weights for policy 1, policy_version 57711 (0.0010) +[2023-10-13 02:39:16,255][46662] Updated weights for policy 0, policy_version 57780 (0.0008) +[2023-10-13 02:39:16,418][46663] Updated weights for policy 1, policy_version 57721 (0.0008) +[2023-10-13 02:39:16,622][46662] Updated weights for policy 0, policy_version 57790 (0.0007) +[2023-10-13 02:39:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118292480. Throughput: 0: 1661.3, 1: 1674.1. Samples: 29578592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:18,607][45375] Avg episode reward: [(0, '47.390'), (1, '46.060')] +[2023-10-13 02:39:20,406][46663] Updated weights for policy 1, policy_version 57731 (0.0009) +[2023-10-13 02:39:20,772][46663] Updated weights for policy 1, policy_version 57741 (0.0007) +[2023-10-13 02:39:20,786][46662] Updated weights for policy 0, policy_version 57800 (0.0010) +[2023-10-13 02:39:21,129][46663] Updated weights for policy 1, policy_version 57751 (0.0008) +[2023-10-13 02:39:21,146][46662] Updated weights for policy 0, policy_version 57810 (0.0008) +[2023-10-13 02:39:21,517][46662] Updated weights for policy 0, policy_version 57820 (0.0007) +[2023-10-13 02:39:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118358016. Throughput: 0: 1681.3, 1: 1687.3. Samples: 29599320. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:23,607][45375] Avg episode reward: [(0, '46.220'), (1, '46.040')] +[2023-10-13 02:39:25,125][46663] Updated weights for policy 1, policy_version 57761 (0.0007) +[2023-10-13 02:39:25,492][46663] Updated weights for policy 1, policy_version 57771 (0.0007) +[2023-10-13 02:39:25,568][46662] Updated weights for policy 0, policy_version 57830 (0.0010) +[2023-10-13 02:39:25,852][46663] Updated weights for policy 1, policy_version 57781 (0.0008) +[2023-10-13 02:39:25,934][46662] Updated weights for policy 0, policy_version 57840 (0.0008) +[2023-10-13 02:39:26,227][46663] Updated weights for policy 1, policy_version 57791 (0.0010) +[2023-10-13 02:39:26,302][46662] Updated weights for policy 0, policy_version 57850 (0.0007) +[2023-10-13 02:39:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118423552. Throughput: 0: 1665.4, 1: 1665.2. Samples: 29609232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:28,607][45375] Avg episode reward: [(0, '46.100'), (1, '46.660')] +[2023-10-13 02:39:30,323][46662] Updated weights for policy 0, policy_version 57860 (0.0008) +[2023-10-13 02:39:30,476][46663] Updated weights for policy 1, policy_version 57801 (0.0009) +[2023-10-13 02:39:30,694][46662] Updated weights for policy 0, policy_version 57870 (0.0008) +[2023-10-13 02:39:30,844][46663] Updated weights for policy 1, policy_version 57811 (0.0008) +[2023-10-13 02:39:31,056][46662] Updated weights for policy 0, policy_version 57880 (0.0011) +[2023-10-13 02:39:31,200][46663] Updated weights for policy 1, policy_version 57821 (0.0007) +[2023-10-13 02:39:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118489088. Throughput: 0: 1655.1, 1: 1685.4. Samples: 29628940. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:33,607][45375] Avg episode reward: [(0, '47.230'), (1, '47.430')] +[2023-10-13 02:39:35,116][46663] Updated weights for policy 1, policy_version 57831 (0.0009) +[2023-10-13 02:39:35,223][46662] Updated weights for policy 0, policy_version 57890 (0.0008) +[2023-10-13 02:39:35,485][46663] Updated weights for policy 1, policy_version 57841 (0.0009) +[2023-10-13 02:39:35,590][46662] Updated weights for policy 0, policy_version 57900 (0.0008) +[2023-10-13 02:39:35,845][46663] Updated weights for policy 1, policy_version 57851 (0.0009) +[2023-10-13 02:39:35,952][46662] Updated weights for policy 0, policy_version 57910 (0.0009) +[2023-10-13 02:39:36,324][46662] Updated weights for policy 0, policy_version 57920 (0.0008) +[2023-10-13 02:39:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118554624. Throughput: 0: 1671.8, 1: 1685.9. Samples: 29649582. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:38,608][45375] Avg episode reward: [(0, '48.160'), (1, '48.310')] +[2023-10-13 02:39:40,032][46663] Updated weights for policy 1, policy_version 57861 (0.0008) +[2023-10-13 02:39:40,394][46663] Updated weights for policy 1, policy_version 57871 (0.0007) +[2023-10-13 02:39:40,494][46662] Updated weights for policy 0, policy_version 57930 (0.0009) +[2023-10-13 02:39:40,757][46663] Updated weights for policy 1, policy_version 57881 (0.0009) +[2023-10-13 02:39:40,877][46662] Updated weights for policy 0, policy_version 57940 (0.0010) +[2023-10-13 02:39:41,242][46662] Updated weights for policy 0, policy_version 57950 (0.0008) +[2023-10-13 02:39:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 118620160. Throughput: 0: 1657.6, 1: 1667.2. Samples: 29659284. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:43,608][45375] Avg episode reward: [(0, '46.220'), (1, '49.300')] +[2023-10-13 02:39:44,755][46663] Updated weights for policy 1, policy_version 57891 (0.0008) +[2023-10-13 02:39:45,123][46663] Updated weights for policy 1, policy_version 57901 (0.0007) +[2023-10-13 02:39:45,485][46663] Updated weights for policy 1, policy_version 57911 (0.0007) +[2023-10-13 02:39:45,526][46662] Updated weights for policy 0, policy_version 57960 (0.0009) +[2023-10-13 02:39:45,896][46662] Updated weights for policy 0, policy_version 57970 (0.0008) +[2023-10-13 02:39:46,273][46662] Updated weights for policy 0, policy_version 57980 (0.0010) +[2023-10-13 02:39:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118685696. Throughput: 0: 1663.2, 1: 1689.4. Samples: 29679344. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:39:48,607][45375] Avg episode reward: [(0, '47.530'), (1, '49.370')] +[2023-10-13 02:39:49,648][46663] Updated weights for policy 1, policy_version 57921 (0.0008) +[2023-10-13 02:39:50,019][46663] Updated weights for policy 1, policy_version 57931 (0.0008) +[2023-10-13 02:39:50,333][46662] Updated weights for policy 0, policy_version 57990 (0.0007) +[2023-10-13 02:39:50,388][46663] Updated weights for policy 1, policy_version 57941 (0.0008) +[2023-10-13 02:39:50,707][46662] Updated weights for policy 0, policy_version 58000 (0.0009) +[2023-10-13 02:39:50,745][46663] Updated weights for policy 1, policy_version 57951 (0.0008) +[2023-10-13 02:39:51,076][46662] Updated weights for policy 0, policy_version 58010 (0.0007) +[2023-10-13 02:39:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118751232. Throughput: 0: 1675.5, 1: 1686.9. Samples: 29700042. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:39:53,607][45375] Avg episode reward: [(0, '46.610'), (1, '50.840')] +[2023-10-13 02:39:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000057952_59342848.pth... +[2023-10-13 02:39:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000058016_59408384.pth... +[2023-10-13 02:39:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000056448_57802752.pth +[2023-10-13 02:39:53,658][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000056384_57737216.pth +[2023-10-13 02:39:54,873][46663] Updated weights for policy 1, policy_version 57961 (0.0008) +[2023-10-13 02:39:55,055][46662] Updated weights for policy 0, policy_version 58020 (0.0008) +[2023-10-13 02:39:55,246][46663] Updated weights for policy 1, policy_version 57971 (0.0009) +[2023-10-13 02:39:55,422][46662] Updated weights for policy 0, policy_version 58030 (0.0009) +[2023-10-13 02:39:55,612][46663] Updated weights for policy 1, policy_version 57981 (0.0009) +[2023-10-13 02:39:55,795][46662] Updated weights for policy 0, policy_version 58040 (0.0008) +[2023-10-13 02:39:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118816768. Throughput: 0: 1662.4, 1: 1674.0. Samples: 29709544. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:39:58,607][45375] Avg episode reward: [(0, '45.950'), (1, '51.490')] +[2023-10-13 02:39:59,732][46663] Updated weights for policy 1, policy_version 57991 (0.0008) +[2023-10-13 02:39:59,831][46662] Updated weights for policy 0, policy_version 58050 (0.0008) +[2023-10-13 02:40:00,100][46663] Updated weights for policy 1, policy_version 58001 (0.0010) +[2023-10-13 02:40:00,198][46662] Updated weights for policy 0, policy_version 58060 (0.0009) +[2023-10-13 02:40:00,468][46663] Updated weights for policy 1, policy_version 58011 (0.0010) +[2023-10-13 02:40:00,564][46662] Updated weights for policy 0, policy_version 58070 (0.0009) +[2023-10-13 02:40:00,935][46662] Updated weights for policy 0, policy_version 58080 (0.0009) +[2023-10-13 02:40:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118882304. Throughput: 0: 1674.3, 1: 1680.3. Samples: 29729550. Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:40:03,608][45375] Avg episode reward: [(0, '46.290'), (1, '51.440')] +[2023-10-13 02:40:04,566][46663] Updated weights for policy 1, policy_version 58021 (0.0008) +[2023-10-13 02:40:04,941][46663] Updated weights for policy 1, policy_version 58031 (0.0009) +[2023-10-13 02:40:05,035][46662] Updated weights for policy 0, policy_version 58090 (0.0009) +[2023-10-13 02:40:05,314][46663] Updated weights for policy 1, policy_version 58041 (0.0010) +[2023-10-13 02:40:05,402][46662] Updated weights for policy 0, policy_version 58100 (0.0009) +[2023-10-13 02:40:05,778][46662] Updated weights for policy 0, policy_version 58110 (0.0007) +[2023-10-13 02:40:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118947840. Throughput: 0: 1682.1, 1: 1670.9. Samples: 29750206. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:40:08,607][45375] Avg episode reward: [(0, '44.980'), (1, '51.710')] +[2023-10-13 02:40:09,532][46663] Updated weights for policy 1, policy_version 58051 (0.0008) +[2023-10-13 02:40:09,702][46662] Updated weights for policy 0, policy_version 58120 (0.0008) +[2023-10-13 02:40:09,896][46663] Updated weights for policy 1, policy_version 58061 (0.0008) +[2023-10-13 02:40:10,061][46662] Updated weights for policy 0, policy_version 58130 (0.0008) +[2023-10-13 02:40:10,264][46663] Updated weights for policy 1, policy_version 58071 (0.0009) +[2023-10-13 02:40:10,430][46662] Updated weights for policy 0, policy_version 58140 (0.0008) +[2023-10-13 02:40:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119013376. Throughput: 0: 1665.2, 1: 1665.6. Samples: 29759116. Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:40:13,607][45375] Avg episode reward: [(0, '43.680'), (1, '51.530')] +[2023-10-13 02:40:14,294][46663] Updated weights for policy 1, policy_version 58081 (0.0007) +[2023-10-13 02:40:14,395][46662] Updated weights for policy 0, policy_version 58150 (0.0009) +[2023-10-13 02:40:14,666][46663] Updated weights for policy 1, policy_version 58091 (0.0008) +[2023-10-13 02:40:14,773][46662] Updated weights for policy 0, policy_version 58160 (0.0009) +[2023-10-13 02:40:15,042][46663] Updated weights for policy 1, policy_version 58101 (0.0008) +[2023-10-13 02:40:15,136][46662] Updated weights for policy 0, policy_version 58170 (0.0009) +[2023-10-13 02:40:15,398][46663] Updated weights for policy 1, policy_version 58111 (0.0010) +[2023-10-13 02:40:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119078912. Throughput: 0: 1685.0, 1: 1673.7. Samples: 29780082. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:40:18,607][45375] Avg episode reward: [(0, '44.430'), (1, '53.030')] +[2023-10-13 02:40:19,321][46662] Updated weights for policy 0, policy_version 58180 (0.0008) +[2023-10-13 02:40:19,494][46663] Updated weights for policy 1, policy_version 58121 (0.0009) +[2023-10-13 02:40:19,701][46662] Updated weights for policy 0, policy_version 58190 (0.0009) +[2023-10-13 02:40:19,861][46663] Updated weights for policy 1, policy_version 58131 (0.0009) +[2023-10-13 02:40:20,065][46662] Updated weights for policy 0, policy_version 58200 (0.0007) +[2023-10-13 02:40:20,229][46663] Updated weights for policy 1, policy_version 58141 (0.0007) +[2023-10-13 02:40:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119144448. Throughput: 0: 1684.6, 1: 1666.8. Samples: 29800394. Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:40:23,607][45375] Avg episode reward: [(0, '44.860'), (1, '52.820')] +[2023-10-13 02:40:24,141][46662] Updated weights for policy 0, policy_version 58210 (0.0007) +[2023-10-13 02:40:24,267][46663] Updated weights for policy 1, policy_version 58151 (0.0008) +[2023-10-13 02:40:24,498][46662] Updated weights for policy 0, policy_version 58220 (0.0007) +[2023-10-13 02:40:24,635][46663] Updated weights for policy 1, policy_version 58161 (0.0007) +[2023-10-13 02:40:24,876][46662] Updated weights for policy 0, policy_version 58230 (0.0008) +[2023-10-13 02:40:24,989][46663] Updated weights for policy 1, policy_version 58171 (0.0009) +[2023-10-13 02:40:25,245][46662] Updated weights for policy 0, policy_version 58240 (0.0008) +[2023-10-13 02:40:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119209984. Throughput: 0: 1669.7, 1: 1667.7. Samples: 29809466. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-13 02:40:28,607][45375] Avg episode reward: [(0, '44.900'), (1, '53.080')] +[2023-10-13 02:40:29,297][46663] Updated weights for policy 1, policy_version 58181 (0.0009) +[2023-10-13 02:40:29,442][46662] Updated weights for policy 0, policy_version 58250 (0.0008) +[2023-10-13 02:40:29,675][46663] Updated weights for policy 1, policy_version 58191 (0.0007) +[2023-10-13 02:40:29,816][46662] Updated weights for policy 0, policy_version 58260 (0.0008) +[2023-10-13 02:40:30,037][46663] Updated weights for policy 1, policy_version 58201 (0.0007) +[2023-10-13 02:40:30,185][46662] Updated weights for policy 0, policy_version 58270 (0.0008) +[2023-10-13 02:40:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 119275520. Throughput: 0: 1686.5, 1: 1655.3. Samples: 29829728. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:40:33,608][45375] Avg episode reward: [(0, '44.170'), (1, '53.630')] +[2023-10-13 02:40:34,225][46663] Updated weights for policy 1, policy_version 58211 (0.0007) +[2023-10-13 02:40:34,333][46662] Updated weights for policy 0, policy_version 58280 (0.0007) +[2023-10-13 02:40:34,594][46663] Updated weights for policy 1, policy_version 58221 (0.0007) +[2023-10-13 02:40:34,702][46662] Updated weights for policy 0, policy_version 58290 (0.0007) +[2023-10-13 02:40:34,949][46663] Updated weights for policy 1, policy_version 58231 (0.0007) +[2023-10-13 02:40:35,073][46662] Updated weights for policy 0, policy_version 58300 (0.0007) +[2023-10-13 02:40:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119341056. Throughput: 0: 1675.3, 1: 1658.1. Samples: 29850046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:40:38,607][45375] Avg episode reward: [(0, '44.090'), (1, '53.830')] +[2023-10-13 02:40:39,130][46662] Updated weights for policy 0, policy_version 58310 (0.0009) +[2023-10-13 02:40:39,158][46663] Updated weights for policy 1, policy_version 58241 (0.0008) +[2023-10-13 02:40:39,504][46662] Updated weights for policy 0, policy_version 58320 (0.0007) +[2023-10-13 02:40:39,515][46663] Updated weights for policy 1, policy_version 58251 (0.0007) +[2023-10-13 02:40:39,866][46662] Updated weights for policy 0, policy_version 58330 (0.0008) +[2023-10-13 02:40:39,885][46663] Updated weights for policy 1, policy_version 58261 (0.0008) +[2023-10-13 02:40:40,250][46663] Updated weights for policy 1, policy_version 58271 (0.0008) +[2023-10-13 02:40:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 119406592. Throughput: 0: 1665.0, 1: 1659.8. Samples: 29859158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:40:43,608][45375] Avg episode reward: [(0, '44.780'), (1, '53.140')] +[2023-10-13 02:40:43,894][46662] Updated weights for policy 0, policy_version 58340 (0.0008) +[2023-10-13 02:40:44,266][46662] Updated weights for policy 0, policy_version 58350 (0.0008) +[2023-10-13 02:40:44,342][46663] Updated weights for policy 1, policy_version 58281 (0.0007) +[2023-10-13 02:40:44,643][46662] Updated weights for policy 0, policy_version 58360 (0.0009) +[2023-10-13 02:40:44,704][46663] Updated weights for policy 1, policy_version 58291 (0.0008) +[2023-10-13 02:40:45,073][46663] Updated weights for policy 1, policy_version 58301 (0.0011) +[2023-10-13 02:40:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119472128. Throughput: 0: 1679.8, 1: 1662.6. Samples: 29879960. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:40:48,607][45375] Avg episode reward: [(0, '45.280'), (1, '52.410')] +[2023-10-13 02:40:48,651][46662] Updated weights for policy 0, policy_version 58370 (0.0009) +[2023-10-13 02:40:49,022][46662] Updated weights for policy 0, policy_version 58380 (0.0010) +[2023-10-13 02:40:49,193][46663] Updated weights for policy 1, policy_version 58311 (0.0009) +[2023-10-13 02:40:49,394][46662] Updated weights for policy 0, policy_version 58390 (0.0007) +[2023-10-13 02:40:49,557][46663] Updated weights for policy 1, policy_version 58321 (0.0009) +[2023-10-13 02:40:49,763][46662] Updated weights for policy 0, policy_version 58400 (0.0009) +[2023-10-13 02:40:49,922][46663] Updated weights for policy 1, policy_version 58331 (0.0008) +[2023-10-13 02:40:53,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119537664. Throughput: 0: 1676.5, 1: 1662.5. Samples: 29900460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:40:53,607][45375] Avg episode reward: [(0, '45.830'), (1, '53.340')] +[2023-10-13 02:40:53,788][46662] Updated weights for policy 0, policy_version 58410 (0.0011) +[2023-10-13 02:40:54,128][46663] Updated weights for policy 1, policy_version 58341 (0.0008) +[2023-10-13 02:40:54,152][46662] Updated weights for policy 0, policy_version 58420 (0.0007) +[2023-10-13 02:40:54,506][46663] Updated weights for policy 1, policy_version 58351 (0.0009) +[2023-10-13 02:40:54,514][46662] Updated weights for policy 0, policy_version 58430 (0.0008) +[2023-10-13 02:40:54,868][46663] Updated weights for policy 1, policy_version 58361 (0.0009) +[2023-10-13 02:40:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119603200. Throughput: 0: 1678.8, 1: 1664.6. Samples: 29909568. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:40:58,607][45375] Avg episode reward: [(0, '47.260'), (1, '52.170')] +[2023-10-13 02:40:58,719][46662] Updated weights for policy 0, policy_version 58440 (0.0007) +[2023-10-13 02:40:59,067][46663] Updated weights for policy 1, policy_version 58371 (0.0009) +[2023-10-13 02:40:59,085][46662] Updated weights for policy 0, policy_version 58450 (0.0010) +[2023-10-13 02:40:59,428][46663] Updated weights for policy 1, policy_version 58381 (0.0007) +[2023-10-13 02:40:59,452][46662] Updated weights for policy 0, policy_version 58460 (0.0008) +[2023-10-13 02:40:59,795][46663] Updated weights for policy 1, policy_version 58391 (0.0011) +[2023-10-13 02:41:03,531][46662] Updated weights for policy 0, policy_version 58470 (0.0010) +[2023-10-13 02:41:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119668736. Throughput: 0: 1675.7, 1: 1663.9. Samples: 29930362. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:03,607][45375] Avg episode reward: [(0, '46.870'), (1, '52.000')] +[2023-10-13 02:41:03,904][46662] Updated weights for policy 0, policy_version 58480 (0.0007) +[2023-10-13 02:41:04,035][46663] Updated weights for policy 1, policy_version 58401 (0.0011) +[2023-10-13 02:41:04,282][46662] Updated weights for policy 0, policy_version 58490 (0.0007) +[2023-10-13 02:41:04,395][46663] Updated weights for policy 1, policy_version 58411 (0.0007) +[2023-10-13 02:41:04,757][46663] Updated weights for policy 1, policy_version 58421 (0.0009) +[2023-10-13 02:41:05,121][46663] Updated weights for policy 1, policy_version 58431 (0.0009) +[2023-10-13 02:41:08,229][46662] Updated weights for policy 0, policy_version 58500 (0.0007) +[2023-10-13 02:41:08,600][46662] Updated weights for policy 0, policy_version 58510 (0.0007) +[2023-10-13 02:41:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119734272. 
Throughput: 0: 1678.9, 1: 1667.3. Samples: 29950976. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:08,607][45375] Avg episode reward: [(0, '50.210'), (1, '53.490')] +[2023-10-13 02:41:08,964][46662] Updated weights for policy 0, policy_version 58520 (0.0007) +[2023-10-13 02:41:09,106][46663] Updated weights for policy 1, policy_version 58441 (0.0007) +[2023-10-13 02:41:09,473][46663] Updated weights for policy 1, policy_version 58451 (0.0007) +[2023-10-13 02:41:09,838][46663] Updated weights for policy 1, policy_version 58461 (0.0007) +[2023-10-13 02:41:13,046][46662] Updated weights for policy 0, policy_version 58530 (0.0007) +[2023-10-13 02:41:13,416][46662] Updated weights for policy 0, policy_version 58540 (0.0007) +[2023-10-13 02:41:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119799808. Throughput: 0: 1680.9, 1: 1665.6. Samples: 29960058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:13,608][45375] Avg episode reward: [(0, '49.900'), (1, '52.940')] +[2023-10-13 02:41:13,784][46662] Updated weights for policy 0, policy_version 58550 (0.0007) +[2023-10-13 02:41:13,980][46663] Updated weights for policy 1, policy_version 58471 (0.0008) +[2023-10-13 02:41:14,156][46662] Updated weights for policy 0, policy_version 58560 (0.0008) +[2023-10-13 02:41:14,345][46663] Updated weights for policy 1, policy_version 58481 (0.0008) +[2023-10-13 02:41:14,708][46663] Updated weights for policy 1, policy_version 58491 (0.0008) +[2023-10-13 02:41:18,215][46662] Updated weights for policy 0, policy_version 58570 (0.0008) +[2023-10-13 02:41:18,585][46662] Updated weights for policy 0, policy_version 58580 (0.0007) +[2023-10-13 02:41:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 119865344. Throughput: 0: 1683.2, 1: 1671.0. Samples: 29980668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:18,607][45375] Avg episode reward: [(0, '49.770'), (1, '52.490')] +[2023-10-13 02:41:18,897][46663] Updated weights for policy 1, policy_version 58501 (0.0008) +[2023-10-13 02:41:18,955][46662] Updated weights for policy 0, policy_version 58590 (0.0009) +[2023-10-13 02:41:19,258][46663] Updated weights for policy 1, policy_version 58511 (0.0008) +[2023-10-13 02:41:19,621][46663] Updated weights for policy 1, policy_version 58521 (0.0009) +[2023-10-13 02:41:23,223][46662] Updated weights for policy 0, policy_version 58600 (0.0009) +[2023-10-13 02:41:23,601][46663] Updated weights for policy 1, policy_version 58531 (0.0009) +[2023-10-13 02:41:23,607][46662] Updated weights for policy 0, policy_version 58610 (0.0009) +[2023-10-13 02:41:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 119930880. Throughput: 0: 1688.7, 1: 1670.1. Samples: 30001194. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:23,608][45375] Avg episode reward: [(0, '50.720'), (1, '51.980')] +[2023-10-13 02:41:23,962][46663] Updated weights for policy 1, policy_version 58541 (0.0007) +[2023-10-13 02:41:23,969][46662] Updated weights for policy 0, policy_version 58620 (0.0008) +[2023-10-13 02:41:24,325][46663] Updated weights for policy 1, policy_version 58551 (0.0008) +[2023-10-13 02:41:28,208][46662] Updated weights for policy 0, policy_version 58630 (0.0007) +[2023-10-13 02:41:28,421][46663] Updated weights for policy 1, policy_version 58561 (0.0008) +[2023-10-13 02:41:28,581][46662] Updated weights for policy 0, policy_version 58640 (0.0007) +[2023-10-13 02:41:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119996416. Throughput: 0: 1682.5, 1: 1670.7. Samples: 30010050. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:28,607][45375] Avg episode reward: [(0, '52.690'), (1, '52.200')] +[2023-10-13 02:41:28,774][46663] Updated weights for policy 1, policy_version 58571 (0.0009) +[2023-10-13 02:41:28,959][46662] Updated weights for policy 0, policy_version 58650 (0.0008) +[2023-10-13 02:41:29,154][46663] Updated weights for policy 1, policy_version 58581 (0.0009) +[2023-10-13 02:41:29,520][46663] Updated weights for policy 1, policy_version 58591 (0.0010) +[2023-10-13 02:41:33,186][46662] Updated weights for policy 0, policy_version 58660 (0.0009) +[2023-10-13 02:41:33,562][46662] Updated weights for policy 0, policy_version 58670 (0.0009) +[2023-10-13 02:41:33,565][46663] Updated weights for policy 1, policy_version 58601 (0.0008) +[2023-10-13 02:41:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 120061952. Throughput: 0: 1673.5, 1: 1670.3. Samples: 30030430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:33,607][45375] Avg episode reward: [(0, '53.380'), (1, '51.730')] +[2023-10-13 02:41:33,926][46663] Updated weights for policy 1, policy_version 58611 (0.0008) +[2023-10-13 02:41:33,929][46662] Updated weights for policy 0, policy_version 58680 (0.0008) +[2023-10-13 02:41:34,281][46663] Updated weights for policy 1, policy_version 58621 (0.0010) +[2023-10-13 02:41:38,054][46662] Updated weights for policy 0, policy_version 58690 (0.0009) +[2023-10-13 02:41:38,424][46662] Updated weights for policy 0, policy_version 58700 (0.0010) +[2023-10-13 02:41:38,465][46663] Updated weights for policy 1, policy_version 58631 (0.0009) +[2023-10-13 02:41:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 120127488. Throughput: 0: 1666.4, 1: 1666.2. Samples: 30050424. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:38,607][45375] Avg episode reward: [(0, '53.700'), (1, '52.020')] +[2023-10-13 02:41:38,798][46662] Updated weights for policy 0, policy_version 58710 (0.0008) +[2023-10-13 02:41:38,853][46663] Updated weights for policy 1, policy_version 58641 (0.0007) +[2023-10-13 02:41:39,172][46662] Updated weights for policy 0, policy_version 58720 (0.0008) +[2023-10-13 02:41:39,226][46663] Updated weights for policy 1, policy_version 58651 (0.0008) +[2023-10-13 02:41:43,145][46662] Updated weights for policy 0, policy_version 58730 (0.0009) +[2023-10-13 02:41:43,359][46663] Updated weights for policy 1, policy_version 58661 (0.0008) +[2023-10-13 02:41:43,519][46662] Updated weights for policy 0, policy_version 58740 (0.0007) +[2023-10-13 02:41:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 120193024. Throughput: 0: 1662.7, 1: 1671.7. Samples: 30059616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:43,607][45375] Avg episode reward: [(0, '53.000'), (1, '50.910')] +[2023-10-13 02:41:43,724][46663] Updated weights for policy 1, policy_version 58671 (0.0007) +[2023-10-13 02:41:43,881][46662] Updated weights for policy 0, policy_version 58750 (0.0008) +[2023-10-13 02:41:44,087][46663] Updated weights for policy 1, policy_version 58681 (0.0007) +[2023-10-13 02:41:47,969][46662] Updated weights for policy 0, policy_version 58760 (0.0011) +[2023-10-13 02:41:47,984][46663] Updated weights for policy 1, policy_version 58691 (0.0008) +[2023-10-13 02:41:48,342][46662] Updated weights for policy 0, policy_version 58770 (0.0009) +[2023-10-13 02:41:48,356][46663] Updated weights for policy 1, policy_version 58701 (0.0010) +[2023-10-13 02:41:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 120258560. Throughput: 0: 1661.3, 1: 1675.0. Samples: 30080498. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:41:48,608][45375] Avg episode reward: [(0, '52.780'), (1, '51.110')] +[2023-10-13 02:41:48,714][46662] Updated weights for policy 0, policy_version 58780 (0.0007) +[2023-10-13 02:41:48,718][46663] Updated weights for policy 1, policy_version 58711 (0.0008) +[2023-10-13 02:41:52,763][46662] Updated weights for policy 0, policy_version 58790 (0.0009) +[2023-10-13 02:41:52,847][46663] Updated weights for policy 1, policy_version 58721 (0.0007) +[2023-10-13 02:41:53,139][46662] Updated weights for policy 0, policy_version 58800 (0.0007) +[2023-10-13 02:41:53,213][46663] Updated weights for policy 1, policy_version 58731 (0.0009) +[2023-10-13 02:41:53,495][46662] Updated weights for policy 0, policy_version 58810 (0.0007) +[2023-10-13 02:41:53,581][46663] Updated weights for policy 1, policy_version 58741 (0.0008) +[2023-10-13 02:41:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 120324096. Throughput: 0: 1658.4, 1: 1664.0. Samples: 30100486. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:41:53,607][45375] Avg episode reward: [(0, '51.480'), (1, '49.730')] +[2023-10-13 02:41:53,714][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000058816_60227584.pth... +[2023-10-13 02:41:53,743][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000057248_58621952.pth +[2023-10-13 02:41:53,945][46663] Updated weights for policy 1, policy_version 58751 (0.0008) +[2023-10-13 02:41:53,981][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000058752_60162048.pth... 
+[2023-10-13 02:41:54,009][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000057184_58556416.pth +[2023-10-13 02:41:57,535][46662] Updated weights for policy 0, policy_version 58820 (0.0008) +[2023-10-13 02:41:57,897][46662] Updated weights for policy 0, policy_version 58830 (0.0008) +[2023-10-13 02:41:57,991][46663] Updated weights for policy 1, policy_version 58761 (0.0008) +[2023-10-13 02:41:58,268][46662] Updated weights for policy 0, policy_version 58840 (0.0008) +[2023-10-13 02:41:58,348][46663] Updated weights for policy 1, policy_version 58771 (0.0008) +[2023-10-13 02:41:58,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 120422400. Throughput: 0: 1666.0, 1: 1680.7. Samples: 30110660. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:41:58,607][45375] Avg episode reward: [(0, '52.470'), (1, '49.790')] +[2023-10-13 02:41:58,720][46663] Updated weights for policy 1, policy_version 58781 (0.0008) +[2023-10-13 02:42:02,495][46662] Updated weights for policy 0, policy_version 58850 (0.0008) +[2023-10-13 02:42:02,613][46663] Updated weights for policy 1, policy_version 58791 (0.0008) +[2023-10-13 02:42:02,859][46662] Updated weights for policy 0, policy_version 58860 (0.0007) +[2023-10-13 02:42:02,980][46663] Updated weights for policy 1, policy_version 58801 (0.0008) +[2023-10-13 02:42:03,226][46662] Updated weights for policy 0, policy_version 58870 (0.0007) +[2023-10-13 02:42:03,342][46663] Updated weights for policy 1, policy_version 58811 (0.0008) +[2023-10-13 02:42:03,596][46662] Updated weights for policy 0, policy_version 58880 (0.0008) +[2023-10-13 02:42:03,607][45375] Fps is (10 sec: 19661.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120520704. Throughput: 0: 1664.4, 1: 1681.9. Samples: 30131248. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:42:03,607][45375] Avg episode reward: [(0, '52.240'), (1, '51.020')] +[2023-10-13 02:42:07,329][46663] Updated weights for policy 1, policy_version 58821 (0.0008) +[2023-10-13 02:42:07,658][46662] Updated weights for policy 0, policy_version 58890 (0.0008) +[2023-10-13 02:42:07,687][46663] Updated weights for policy 1, policy_version 58831 (0.0007) +[2023-10-13 02:42:08,029][46662] Updated weights for policy 0, policy_version 58900 (0.0008) +[2023-10-13 02:42:08,050][46663] Updated weights for policy 1, policy_version 58841 (0.0008) +[2023-10-13 02:42:08,402][46662] Updated weights for policy 0, policy_version 58910 (0.0010) +[2023-10-13 02:42:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120586240. Throughput: 0: 1650.6, 1: 1657.1. Samples: 30150042. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:42:08,607][45375] Avg episode reward: [(0, '52.830'), (1, '51.400')] +[2023-10-13 02:42:12,190][46663] Updated weights for policy 1, policy_version 58851 (0.0008) +[2023-10-13 02:42:12,507][46662] Updated weights for policy 0, policy_version 58920 (0.0008) +[2023-10-13 02:42:12,548][46663] Updated weights for policy 1, policy_version 58861 (0.0008) +[2023-10-13 02:42:12,867][46662] Updated weights for policy 0, policy_version 58930 (0.0007) +[2023-10-13 02:42:12,919][46663] Updated weights for policy 1, policy_version 58871 (0.0007) +[2023-10-13 02:42:13,231][46662] Updated weights for policy 0, policy_version 58940 (0.0008) +[2023-10-13 02:42:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120651776. Throughput: 0: 1669.1, 1: 1686.3. Samples: 30161040. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:42:13,607][45375] Avg episode reward: [(0, '50.830'), (1, '51.310')] +[2023-10-13 02:42:16,859][46663] Updated weights for policy 1, policy_version 58881 (0.0007) +[2023-10-13 02:42:17,233][46663] Updated weights for policy 1, policy_version 58891 (0.0008) +[2023-10-13 02:42:17,314][46662] Updated weights for policy 0, policy_version 58950 (0.0008) +[2023-10-13 02:42:17,589][46663] Updated weights for policy 1, policy_version 58901 (0.0007) +[2023-10-13 02:42:17,688][46662] Updated weights for policy 0, policy_version 58960 (0.0009) +[2023-10-13 02:42:17,957][46663] Updated weights for policy 1, policy_version 58911 (0.0010) +[2023-10-13 02:42:18,054][46662] Updated weights for policy 0, policy_version 58970 (0.0009) +[2023-10-13 02:42:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120717312. Throughput: 0: 1672.2, 1: 1680.0. Samples: 30181280. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:42:18,607][45375] Avg episode reward: [(0, '51.420'), (1, '51.150')] +[2023-10-13 02:42:22,000][46663] Updated weights for policy 1, policy_version 58921 (0.0009) +[2023-10-13 02:42:22,160][46662] Updated weights for policy 0, policy_version 58980 (0.0009) +[2023-10-13 02:42:22,370][46663] Updated weights for policy 1, policy_version 58931 (0.0008) +[2023-10-13 02:42:22,530][46662] Updated weights for policy 0, policy_version 58990 (0.0010) +[2023-10-13 02:42:22,735][46663] Updated weights for policy 1, policy_version 58941 (0.0008) +[2023-10-13 02:42:22,896][46662] Updated weights for policy 0, policy_version 59000 (0.0009) +[2023-10-13 02:42:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120782848. Throughput: 0: 1656.7, 1: 1677.5. Samples: 30200466. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-13 02:42:23,608][45375] Avg episode reward: [(0, '50.220'), (1, '50.360')] +[2023-10-13 02:42:26,802][46663] Updated weights for policy 1, policy_version 58951 (0.0008) +[2023-10-13 02:42:26,988][46662] Updated weights for policy 0, policy_version 59010 (0.0009) +[2023-10-13 02:42:27,169][46663] Updated weights for policy 1, policy_version 58961 (0.0008) +[2023-10-13 02:42:27,355][46662] Updated weights for policy 0, policy_version 59020 (0.0009) +[2023-10-13 02:42:27,532][46663] Updated weights for policy 1, policy_version 58971 (0.0008) +[2023-10-13 02:42:27,735][46662] Updated weights for policy 0, policy_version 59030 (0.0009) +[2023-10-13 02:42:28,105][46662] Updated weights for policy 0, policy_version 59040 (0.0008) +[2023-10-13 02:42:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 120848384. Throughput: 0: 1679.9, 1: 1704.2. Samples: 30211900. Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:28,607][45375] Avg episode reward: [(0, '48.960'), (1, '49.430')] +[2023-10-13 02:42:31,599][46663] Updated weights for policy 1, policy_version 58981 (0.0008) +[2023-10-13 02:42:31,962][46663] Updated weights for policy 1, policy_version 58991 (0.0007) +[2023-10-13 02:42:32,107][46662] Updated weights for policy 0, policy_version 59050 (0.0009) +[2023-10-13 02:42:32,333][46663] Updated weights for policy 1, policy_version 59001 (0.0008) +[2023-10-13 02:42:32,474][46662] Updated weights for policy 0, policy_version 59060 (0.0007) +[2023-10-13 02:42:32,845][46662] Updated weights for policy 0, policy_version 59070 (0.0009) +[2023-10-13 02:42:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120913920. Throughput: 0: 1680.0, 1: 1672.1. Samples: 30231342. 
Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:33,607][45375] Avg episode reward: [(0, '49.030'), (1, '49.520')] +[2023-10-13 02:42:36,585][46663] Updated weights for policy 1, policy_version 59011 (0.0009) +[2023-10-13 02:42:36,938][46662] Updated weights for policy 0, policy_version 59080 (0.0009) +[2023-10-13 02:42:36,956][46663] Updated weights for policy 1, policy_version 59021 (0.0007) +[2023-10-13 02:42:37,302][46662] Updated weights for policy 0, policy_version 59090 (0.0007) +[2023-10-13 02:42:37,326][46663] Updated weights for policy 1, policy_version 59031 (0.0009) +[2023-10-13 02:42:37,671][46662] Updated weights for policy 0, policy_version 59100 (0.0009) +[2023-10-13 02:42:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 120979456. Throughput: 0: 1654.3, 1: 1677.7. Samples: 30250426. Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:38,607][45375] Avg episode reward: [(0, '49.280'), (1, '50.470')] +[2023-10-13 02:42:41,285][46663] Updated weights for policy 1, policy_version 59041 (0.0008) +[2023-10-13 02:42:41,652][46663] Updated weights for policy 1, policy_version 59051 (0.0007) +[2023-10-13 02:42:41,837][46662] Updated weights for policy 0, policy_version 59110 (0.0008) +[2023-10-13 02:42:42,025][46663] Updated weights for policy 1, policy_version 59061 (0.0008) +[2023-10-13 02:42:42,211][46662] Updated weights for policy 0, policy_version 59120 (0.0009) +[2023-10-13 02:42:42,398][46663] Updated weights for policy 1, policy_version 59071 (0.0009) +[2023-10-13 02:42:42,571][46662] Updated weights for policy 0, policy_version 59130 (0.0008) +[2023-10-13 02:42:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 121044992. Throughput: 0: 1668.4, 1: 1692.0. Samples: 30261878. 
Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:43,607][45375] Avg episode reward: [(0, '49.910'), (1, '49.940')] +[2023-10-13 02:42:46,456][46663] Updated weights for policy 1, policy_version 59081 (0.0008) +[2023-10-13 02:42:46,645][46662] Updated weights for policy 0, policy_version 59140 (0.0010) +[2023-10-13 02:42:46,818][46663] Updated weights for policy 1, policy_version 59091 (0.0009) +[2023-10-13 02:42:47,015][46662] Updated weights for policy 0, policy_version 59150 (0.0010) +[2023-10-13 02:42:47,191][46663] Updated weights for policy 1, policy_version 59101 (0.0008) +[2023-10-13 02:42:47,381][46662] Updated weights for policy 0, policy_version 59160 (0.0010) +[2023-10-13 02:42:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 121110528. Throughput: 0: 1661.6, 1: 1669.3. Samples: 30281140. Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:48,607][45375] Avg episode reward: [(0, '52.360'), (1, '50.650')] +[2023-10-13 02:42:51,389][46663] Updated weights for policy 1, policy_version 59111 (0.0008) +[2023-10-13 02:42:51,536][46662] Updated weights for policy 0, policy_version 59170 (0.0008) +[2023-10-13 02:42:51,741][46663] Updated weights for policy 1, policy_version 59121 (0.0010) +[2023-10-13 02:42:51,901][46662] Updated weights for policy 0, policy_version 59180 (0.0007) +[2023-10-13 02:42:52,108][46663] Updated weights for policy 1, policy_version 59131 (0.0008) +[2023-10-13 02:42:52,268][46662] Updated weights for policy 0, policy_version 59190 (0.0007) +[2023-10-13 02:42:52,632][46662] Updated weights for policy 0, policy_version 59200 (0.0010) +[2023-10-13 02:42:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 121176064. Throughput: 0: 1651.8, 1: 1693.2. Samples: 30300566. 
Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:53,607][45375] Avg episode reward: [(0, '52.960'), (1, '50.330')] +[2023-10-13 02:42:56,099][46663] Updated weights for policy 1, policy_version 59141 (0.0010) +[2023-10-13 02:42:56,473][46663] Updated weights for policy 1, policy_version 59151 (0.0008) +[2023-10-13 02:42:56,609][46662] Updated weights for policy 0, policy_version 59210 (0.0008) +[2023-10-13 02:42:56,844][46663] Updated weights for policy 1, policy_version 59161 (0.0008) +[2023-10-13 02:42:56,980][46662] Updated weights for policy 0, policy_version 59220 (0.0007) +[2023-10-13 02:42:57,348][46662] Updated weights for policy 0, policy_version 59230 (0.0008) +[2023-10-13 02:42:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 121241600. Throughput: 0: 1674.7, 1: 1681.2. Samples: 30312054. Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:42:58,607][45375] Avg episode reward: [(0, '53.310'), (1, '51.840')] +[2023-10-13 02:43:00,881][46663] Updated weights for policy 1, policy_version 59171 (0.0009) +[2023-10-13 02:43:01,250][46663] Updated weights for policy 1, policy_version 59181 (0.0008) +[2023-10-13 02:43:01,454][46662] Updated weights for policy 0, policy_version 59240 (0.0007) +[2023-10-13 02:43:01,613][46663] Updated weights for policy 1, policy_version 59191 (0.0008) +[2023-10-13 02:43:01,823][46662] Updated weights for policy 0, policy_version 59250 (0.0008) +[2023-10-13 02:43:02,188][46662] Updated weights for policy 0, policy_version 59260 (0.0009) +[2023-10-13 02:43:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 121307136. Throughput: 0: 1655.5, 1: 1673.3. Samples: 30331076. 
Policy #0 lag: (min: 5.0, avg: 7.8, max: 37.0) +[2023-10-13 02:43:03,608][45375] Avg episode reward: [(0, '55.170'), (1, '51.950')] +[2023-10-13 02:43:05,617][46663] Updated weights for policy 1, policy_version 59201 (0.0008) +[2023-10-13 02:43:05,977][46663] Updated weights for policy 1, policy_version 59211 (0.0010) +[2023-10-13 02:43:06,292][46662] Updated weights for policy 0, policy_version 59270 (0.0008) +[2023-10-13 02:43:06,344][46663] Updated weights for policy 1, policy_version 59221 (0.0007) +[2023-10-13 02:43:06,656][46662] Updated weights for policy 0, policy_version 59280 (0.0007) +[2023-10-13 02:43:06,714][46663] Updated weights for policy 1, policy_version 59231 (0.0009) +[2023-10-13 02:43:07,035][46662] Updated weights for policy 0, policy_version 59290 (0.0009) +[2023-10-13 02:43:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121372672. Throughput: 0: 1661.1, 1: 1686.0. Samples: 30351082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:08,607][45375] Avg episode reward: [(0, '54.540'), (1, '50.350')] +[2023-10-13 02:43:10,647][46663] Updated weights for policy 1, policy_version 59241 (0.0009) +[2023-10-13 02:43:11,018][46663] Updated weights for policy 1, policy_version 59251 (0.0008) +[2023-10-13 02:43:11,085][46662] Updated weights for policy 0, policy_version 59300 (0.0008) +[2023-10-13 02:43:11,380][46663] Updated weights for policy 1, policy_version 59261 (0.0009) +[2023-10-13 02:43:11,460][46662] Updated weights for policy 0, policy_version 59310 (0.0007) +[2023-10-13 02:43:11,829][46662] Updated weights for policy 0, policy_version 59320 (0.0007) +[2023-10-13 02:43:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121438208. Throughput: 0: 1668.9, 1: 1663.6. Samples: 30361862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:13,607][45375] Avg episode reward: [(0, '55.380'), (1, '48.670')] +[2023-10-13 02:43:15,585][46663] Updated weights for policy 1, policy_version 59271 (0.0008) +[2023-10-13 02:43:15,963][46663] Updated weights for policy 1, policy_version 59281 (0.0009) +[2023-10-13 02:43:16,132][46662] Updated weights for policy 0, policy_version 59330 (0.0008) +[2023-10-13 02:43:16,319][46663] Updated weights for policy 1, policy_version 59291 (0.0009) +[2023-10-13 02:43:16,499][46662] Updated weights for policy 0, policy_version 59340 (0.0008) +[2023-10-13 02:43:16,860][46662] Updated weights for policy 0, policy_version 59350 (0.0008) +[2023-10-13 02:43:17,229][46662] Updated weights for policy 0, policy_version 59360 (0.0007) +[2023-10-13 02:43:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121503744. Throughput: 0: 1648.2, 1: 1685.7. Samples: 30381368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:18,607][45375] Avg episode reward: [(0, '54.640'), (1, '49.760')] +[2023-10-13 02:43:20,399][46663] Updated weights for policy 1, policy_version 59301 (0.0009) +[2023-10-13 02:43:20,774][46663] Updated weights for policy 1, policy_version 59311 (0.0009) +[2023-10-13 02:43:21,140][46663] Updated weights for policy 1, policy_version 59321 (0.0007) +[2023-10-13 02:43:21,426][46662] Updated weights for policy 0, policy_version 59370 (0.0009) +[2023-10-13 02:43:21,789][46662] Updated weights for policy 0, policy_version 59380 (0.0010) +[2023-10-13 02:43:22,163][46662] Updated weights for policy 0, policy_version 59390 (0.0007) +[2023-10-13 02:43:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 121569280. Throughput: 0: 1662.7, 1: 1686.9. Samples: 30401158. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:23,608][45375] Avg episode reward: [(0, '55.290'), (1, '49.680')] +[2023-10-13 02:43:25,282][46663] Updated weights for policy 1, policy_version 59331 (0.0008) +[2023-10-13 02:43:25,652][46663] Updated weights for policy 1, policy_version 59341 (0.0010) +[2023-10-13 02:43:26,019][46663] Updated weights for policy 1, policy_version 59351 (0.0008) +[2023-10-13 02:43:26,092][46662] Updated weights for policy 0, policy_version 59400 (0.0008) +[2023-10-13 02:43:26,462][46662] Updated weights for policy 0, policy_version 59410 (0.0007) +[2023-10-13 02:43:26,827][46662] Updated weights for policy 0, policy_version 59420 (0.0008) +[2023-10-13 02:43:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121634816. Throughput: 0: 1671.7, 1: 1658.0. Samples: 30411714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:28,607][45375] Avg episode reward: [(0, '55.320'), (1, '50.770')] +[2023-10-13 02:43:30,108][46663] Updated weights for policy 1, policy_version 59361 (0.0008) +[2023-10-13 02:43:30,470][46663] Updated weights for policy 1, policy_version 59371 (0.0007) +[2023-10-13 02:43:30,836][46663] Updated weights for policy 1, policy_version 59381 (0.0009) +[2023-10-13 02:43:30,924][46662] Updated weights for policy 0, policy_version 59430 (0.0007) +[2023-10-13 02:43:31,204][46663] Updated weights for policy 1, policy_version 59391 (0.0008) +[2023-10-13 02:43:31,304][46662] Updated weights for policy 0, policy_version 59440 (0.0008) +[2023-10-13 02:43:31,675][46662] Updated weights for policy 0, policy_version 59450 (0.0009) +[2023-10-13 02:43:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 121700352. Throughput: 0: 1654.7, 1: 1681.9. Samples: 30431288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:33,608][45375] Avg episode reward: [(0, '56.410'), (1, '53.890')] +[2023-10-13 02:43:35,182][46663] Updated weights for policy 1, policy_version 59401 (0.0009) +[2023-10-13 02:43:35,552][46663] Updated weights for policy 1, policy_version 59411 (0.0010) +[2023-10-13 02:43:35,848][46662] Updated weights for policy 0, policy_version 59460 (0.0008) +[2023-10-13 02:43:35,915][46663] Updated weights for policy 1, policy_version 59421 (0.0009) +[2023-10-13 02:43:36,225][46662] Updated weights for policy 0, policy_version 59470 (0.0009) +[2023-10-13 02:43:36,587][46662] Updated weights for policy 0, policy_version 59480 (0.0010) +[2023-10-13 02:43:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121765888. Throughput: 0: 1674.0, 1: 1682.8. Samples: 30451620. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:38,607][45375] Avg episode reward: [(0, '55.640'), (1, '54.070')] +[2023-10-13 02:43:39,931][46663] Updated weights for policy 1, policy_version 59431 (0.0008) +[2023-10-13 02:43:40,291][46663] Updated weights for policy 1, policy_version 59441 (0.0010) +[2023-10-13 02:43:40,632][46662] Updated weights for policy 0, policy_version 59490 (0.0009) +[2023-10-13 02:43:40,661][46663] Updated weights for policy 1, policy_version 59451 (0.0008) +[2023-10-13 02:43:41,007][46662] Updated weights for policy 0, policy_version 59500 (0.0009) +[2023-10-13 02:43:41,376][46662] Updated weights for policy 0, policy_version 59510 (0.0008) +[2023-10-13 02:43:41,746][46662] Updated weights for policy 0, policy_version 59520 (0.0008) +[2023-10-13 02:43:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 121831424. Throughput: 0: 1662.3, 1: 1665.4. Samples: 30461800. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:43:43,608][45375] Avg episode reward: [(0, '55.400'), (1, '53.430')] +[2023-10-13 02:43:44,855][46663] Updated weights for policy 1, policy_version 59461 (0.0008) +[2023-10-13 02:43:45,216][46663] Updated weights for policy 1, policy_version 59471 (0.0009) +[2023-10-13 02:43:45,584][46663] Updated weights for policy 1, policy_version 59481 (0.0007) +[2023-10-13 02:43:45,628][46662] Updated weights for policy 0, policy_version 59530 (0.0008) +[2023-10-13 02:43:46,008][46662] Updated weights for policy 0, policy_version 59540 (0.0008) +[2023-10-13 02:43:46,371][46662] Updated weights for policy 0, policy_version 59550 (0.0007) +[2023-10-13 02:43:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121896960. Throughput: 0: 1662.6, 1: 1678.6. Samples: 30481432. Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:43:48,607][45375] Avg episode reward: [(0, '55.750'), (1, '51.650')] +[2023-10-13 02:43:49,697][46663] Updated weights for policy 1, policy_version 59491 (0.0008) +[2023-10-13 02:43:50,068][46663] Updated weights for policy 1, policy_version 59501 (0.0009) +[2023-10-13 02:43:50,433][46663] Updated weights for policy 1, policy_version 59511 (0.0007) +[2023-10-13 02:43:50,531][46662] Updated weights for policy 0, policy_version 59560 (0.0007) +[2023-10-13 02:43:50,912][46662] Updated weights for policy 0, policy_version 59570 (0.0008) +[2023-10-13 02:43:51,283][46662] Updated weights for policy 0, policy_version 59580 (0.0008) +[2023-10-13 02:43:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121962496. Throughput: 0: 1678.9, 1: 1678.6. Samples: 30502168. 
Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:43:53,607][45375] Avg episode reward: [(0, '54.270'), (1, '51.350')] +[2023-10-13 02:43:53,614][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000059584_61014016.pth... +[2023-10-13 02:43:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000059520_60948480.pth... +[2023-10-13 02:43:53,644][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000058016_59408384.pth +[2023-10-13 02:43:53,645][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000057952_59342848.pth +[2023-10-13 02:43:54,420][46663] Updated weights for policy 1, policy_version 59521 (0.0009) +[2023-10-13 02:43:54,784][46663] Updated weights for policy 1, policy_version 59531 (0.0008) +[2023-10-13 02:43:55,158][46663] Updated weights for policy 1, policy_version 59541 (0.0008) +[2023-10-13 02:43:55,427][46662] Updated weights for policy 0, policy_version 59590 (0.0008) +[2023-10-13 02:43:55,522][46663] Updated weights for policy 1, policy_version 59551 (0.0007) +[2023-10-13 02:43:55,807][46662] Updated weights for policy 0, policy_version 59600 (0.0008) +[2023-10-13 02:43:56,173][46662] Updated weights for policy 0, policy_version 59610 (0.0007) +[2023-10-13 02:43:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122028032. Throughput: 0: 1663.8, 1: 1668.1. Samples: 30511796. 
Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:43:58,607][45375] Avg episode reward: [(0, '53.910'), (1, '50.260')] +[2023-10-13 02:43:59,724][46663] Updated weights for policy 1, policy_version 59561 (0.0010) +[2023-10-13 02:44:00,099][46663] Updated weights for policy 1, policy_version 59571 (0.0010) +[2023-10-13 02:44:00,159][46662] Updated weights for policy 0, policy_version 59620 (0.0009) +[2023-10-13 02:44:00,463][46663] Updated weights for policy 1, policy_version 59581 (0.0009) +[2023-10-13 02:44:00,533][46662] Updated weights for policy 0, policy_version 59630 (0.0008) +[2023-10-13 02:44:00,901][46662] Updated weights for policy 0, policy_version 59640 (0.0008) +[2023-10-13 02:44:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122093568. Throughput: 0: 1672.0, 1: 1672.9. Samples: 30531888. Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:44:03,608][45375] Avg episode reward: [(0, '53.720'), (1, '50.440')] +[2023-10-13 02:44:04,480][46663] Updated weights for policy 1, policy_version 59591 (0.0008) +[2023-10-13 02:44:04,852][46663] Updated weights for policy 1, policy_version 59601 (0.0010) +[2023-10-13 02:44:04,943][46662] Updated weights for policy 0, policy_version 59650 (0.0009) +[2023-10-13 02:44:05,208][46663] Updated weights for policy 1, policy_version 59611 (0.0008) +[2023-10-13 02:44:05,304][46662] Updated weights for policy 0, policy_version 59660 (0.0008) +[2023-10-13 02:44:05,678][46662] Updated weights for policy 0, policy_version 59670 (0.0009) +[2023-10-13 02:44:06,043][46662] Updated weights for policy 0, policy_version 59680 (0.0007) +[2023-10-13 02:44:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122159104. Throughput: 0: 1688.8, 1: 1675.4. Samples: 30552548. 
Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:44:08,607][45375] Avg episode reward: [(0, '52.900'), (1, '50.880')] +[2023-10-13 02:44:09,424][46663] Updated weights for policy 1, policy_version 59621 (0.0009) +[2023-10-13 02:44:09,794][46663] Updated weights for policy 1, policy_version 59631 (0.0010) +[2023-10-13 02:44:10,113][46662] Updated weights for policy 0, policy_version 59690 (0.0009) +[2023-10-13 02:44:10,167][46663] Updated weights for policy 1, policy_version 59641 (0.0010) +[2023-10-13 02:44:10,473][46662] Updated weights for policy 0, policy_version 59700 (0.0009) +[2023-10-13 02:44:10,849][46662] Updated weights for policy 0, policy_version 59710 (0.0009) +[2023-10-13 02:44:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 122224640. Throughput: 0: 1656.7, 1: 1675.5. Samples: 30561668. Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:44:13,608][45375] Avg episode reward: [(0, '51.690'), (1, '51.230')] +[2023-10-13 02:44:14,343][46663] Updated weights for policy 1, policy_version 59651 (0.0009) +[2023-10-13 02:44:14,716][46663] Updated weights for policy 1, policy_version 59661 (0.0008) +[2023-10-13 02:44:14,761][46662] Updated weights for policy 0, policy_version 59720 (0.0009) +[2023-10-13 02:44:15,089][46663] Updated weights for policy 1, policy_version 59671 (0.0009) +[2023-10-13 02:44:15,134][46662] Updated weights for policy 0, policy_version 59730 (0.0008) +[2023-10-13 02:44:15,500][46662] Updated weights for policy 0, policy_version 59740 (0.0008) +[2023-10-13 02:44:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122290176. Throughput: 0: 1681.5, 1: 1674.5. Samples: 30582308. 
Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:44:18,607][45375] Avg episode reward: [(0, '52.210'), (1, '51.650')] +[2023-10-13 02:44:19,038][46663] Updated weights for policy 1, policy_version 59681 (0.0008) +[2023-10-13 02:44:19,400][46663] Updated weights for policy 1, policy_version 59691 (0.0007) +[2023-10-13 02:44:19,608][46662] Updated weights for policy 0, policy_version 59750 (0.0009) +[2023-10-13 02:44:19,762][46663] Updated weights for policy 1, policy_version 59701 (0.0008) +[2023-10-13 02:44:19,985][46662] Updated weights for policy 0, policy_version 59760 (0.0008) +[2023-10-13 02:44:20,131][46663] Updated weights for policy 1, policy_version 59711 (0.0007) +[2023-10-13 02:44:20,357][46662] Updated weights for policy 0, policy_version 59770 (0.0007) +[2023-10-13 02:44:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 122355712. Throughput: 0: 1685.4, 1: 1669.2. Samples: 30602574. Policy #0 lag: (min: 18.0, avg: 29.1, max: 50.0) +[2023-10-13 02:44:23,608][45375] Avg episode reward: [(0, '51.970'), (1, '51.940')] +[2023-10-13 02:44:24,267][46663] Updated weights for policy 1, policy_version 59721 (0.0008) +[2023-10-13 02:44:24,551][46662] Updated weights for policy 0, policy_version 59780 (0.0008) +[2023-10-13 02:44:24,638][46663] Updated weights for policy 1, policy_version 59731 (0.0008) +[2023-10-13 02:44:24,918][46662] Updated weights for policy 0, policy_version 59790 (0.0008) +[2023-10-13 02:44:25,004][46663] Updated weights for policy 1, policy_version 59741 (0.0007) +[2023-10-13 02:44:25,286][46662] Updated weights for policy 0, policy_version 59800 (0.0008) +[2023-10-13 02:44:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122421248. Throughput: 0: 1660.5, 1: 1670.5. Samples: 30611696. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:28,607][45375] Avg episode reward: [(0, '50.420'), (1, '53.090')] +[2023-10-13 02:44:29,128][46663] Updated weights for policy 1, policy_version 59751 (0.0007) +[2023-10-13 02:44:29,486][46663] Updated weights for policy 1, policy_version 59761 (0.0010) +[2023-10-13 02:44:29,613][46662] Updated weights for policy 0, policy_version 59810 (0.0008) +[2023-10-13 02:44:29,858][46663] Updated weights for policy 1, policy_version 59771 (0.0008) +[2023-10-13 02:44:29,982][46662] Updated weights for policy 0, policy_version 59820 (0.0009) +[2023-10-13 02:44:30,352][46662] Updated weights for policy 0, policy_version 59830 (0.0008) +[2023-10-13 02:44:30,720][46662] Updated weights for policy 0, policy_version 59840 (0.0007) +[2023-10-13 02:44:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122486784. Throughput: 0: 1682.0, 1: 1672.0. Samples: 30632364. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:33,608][45375] Avg episode reward: [(0, '51.220'), (1, '49.970')] +[2023-10-13 02:44:33,924][46663] Updated weights for policy 1, policy_version 59781 (0.0007) +[2023-10-13 02:44:34,296][46663] Updated weights for policy 1, policy_version 59791 (0.0007) +[2023-10-13 02:44:34,615][46662] Updated weights for policy 0, policy_version 59850 (0.0009) +[2023-10-13 02:44:34,661][46663] Updated weights for policy 1, policy_version 59801 (0.0007) +[2023-10-13 02:44:34,981][46662] Updated weights for policy 0, policy_version 59860 (0.0007) +[2023-10-13 02:44:35,361][46662] Updated weights for policy 0, policy_version 59870 (0.0008) +[2023-10-13 02:44:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122552320. Throughput: 0: 1683.2, 1: 1674.8. Samples: 30653282. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:38,607][45375] Avg episode reward: [(0, '50.530'), (1, '48.480')] +[2023-10-13 02:44:38,847][46663] Updated weights for policy 1, policy_version 59811 (0.0008) +[2023-10-13 02:44:39,210][46663] Updated weights for policy 1, policy_version 59821 (0.0009) +[2023-10-13 02:44:39,486][46662] Updated weights for policy 0, policy_version 59880 (0.0008) +[2023-10-13 02:44:39,583][46663] Updated weights for policy 1, policy_version 59831 (0.0008) +[2023-10-13 02:44:39,864][46662] Updated weights for policy 0, policy_version 59890 (0.0007) +[2023-10-13 02:44:40,234][46662] Updated weights for policy 0, policy_version 59900 (0.0007) +[2023-10-13 02:44:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122617856. Throughput: 0: 1666.8, 1: 1677.7. Samples: 30662302. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:43,608][45375] Avg episode reward: [(0, '49.860'), (1, '49.750')] +[2023-10-13 02:44:43,669][46663] Updated weights for policy 1, policy_version 59841 (0.0009) +[2023-10-13 02:44:44,040][46663] Updated weights for policy 1, policy_version 59851 (0.0007) +[2023-10-13 02:44:44,200][46662] Updated weights for policy 0, policy_version 59910 (0.0008) +[2023-10-13 02:44:44,409][46663] Updated weights for policy 1, policy_version 59861 (0.0007) +[2023-10-13 02:44:44,568][46662] Updated weights for policy 0, policy_version 59920 (0.0007) +[2023-10-13 02:44:44,777][46663] Updated weights for policy 1, policy_version 59871 (0.0008) +[2023-10-13 02:44:44,932][46662] Updated weights for policy 0, policy_version 59930 (0.0007) +[2023-10-13 02:44:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122683392. Throughput: 0: 1685.8, 1: 1677.0. Samples: 30683212. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:48,607][45375] Avg episode reward: [(0, '49.980'), (1, '50.940')] +[2023-10-13 02:44:48,878][46663] Updated weights for policy 1, policy_version 59881 (0.0008) +[2023-10-13 02:44:49,001][46662] Updated weights for policy 0, policy_version 59940 (0.0009) +[2023-10-13 02:44:49,242][46663] Updated weights for policy 1, policy_version 59891 (0.0009) +[2023-10-13 02:44:49,369][46662] Updated weights for policy 0, policy_version 59950 (0.0007) +[2023-10-13 02:44:49,615][46663] Updated weights for policy 1, policy_version 59901 (0.0009) +[2023-10-13 02:44:49,729][46662] Updated weights for policy 0, policy_version 59960 (0.0008) +[2023-10-13 02:44:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 122748928. Throughput: 0: 1682.0, 1: 1678.6. Samples: 30703774. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:53,607][45375] Avg episode reward: [(0, '50.330'), (1, '49.780')] +[2023-10-13 02:44:53,788][46663] Updated weights for policy 1, policy_version 59911 (0.0009) +[2023-10-13 02:44:53,877][46662] Updated weights for policy 0, policy_version 59970 (0.0008) +[2023-10-13 02:44:54,147][46663] Updated weights for policy 1, policy_version 59921 (0.0008) +[2023-10-13 02:44:54,235][46662] Updated weights for policy 0, policy_version 59980 (0.0010) +[2023-10-13 02:44:54,507][46663] Updated weights for policy 1, policy_version 59931 (0.0008) +[2023-10-13 02:44:54,604][46662] Updated weights for policy 0, policy_version 59990 (0.0008) +[2023-10-13 02:44:54,973][46662] Updated weights for policy 0, policy_version 60000 (0.0009) +[2023-10-13 02:44:58,582][46663] Updated weights for policy 1, policy_version 59941 (0.0009) +[2023-10-13 02:44:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122814464. Throughput: 0: 1682.3, 1: 1674.9. Samples: 30712742. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:44:58,608][45375] Avg episode reward: [(0, '49.270'), (1, '49.820')] +[2023-10-13 02:44:58,949][46663] Updated weights for policy 1, policy_version 59951 (0.0009) +[2023-10-13 02:44:58,995][46662] Updated weights for policy 0, policy_version 60010 (0.0009) +[2023-10-13 02:44:59,316][46663] Updated weights for policy 1, policy_version 59961 (0.0008) +[2023-10-13 02:44:59,352][46662] Updated weights for policy 0, policy_version 60020 (0.0008) +[2023-10-13 02:44:59,731][46662] Updated weights for policy 0, policy_version 60030 (0.0008) +[2023-10-13 02:45:03,519][46663] Updated weights for policy 1, policy_version 59971 (0.0009) +[2023-10-13 02:45:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 122880000. Throughput: 0: 1685.2, 1: 1671.8. Samples: 30733376. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 02:45:03,608][45375] Avg episode reward: [(0, '49.460'), (1, '50.280')] +[2023-10-13 02:45:03,883][46662] Updated weights for policy 0, policy_version 60040 (0.0007) +[2023-10-13 02:45:03,884][46663] Updated weights for policy 1, policy_version 59981 (0.0010) +[2023-10-13 02:45:04,254][46663] Updated weights for policy 1, policy_version 59991 (0.0010) +[2023-10-13 02:45:04,262][46662] Updated weights for policy 0, policy_version 60050 (0.0007) +[2023-10-13 02:45:04,635][46662] Updated weights for policy 0, policy_version 60060 (0.0009) +[2023-10-13 02:45:08,408][46663] Updated weights for policy 1, policy_version 60001 (0.0009) +[2023-10-13 02:45:08,545][46662] Updated weights for policy 0, policy_version 60070 (0.0009) +[2023-10-13 02:45:08,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122945536. Throughput: 0: 1683.7, 1: 1676.1. Samples: 30753768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:08,607][45375] Avg episode reward: [(0, '48.830'), (1, '50.360')] +[2023-10-13 02:45:08,770][46663] Updated weights for policy 1, policy_version 60011 (0.0008) +[2023-10-13 02:45:08,905][46662] Updated weights for policy 0, policy_version 60080 (0.0009) +[2023-10-13 02:45:09,144][46663] Updated weights for policy 1, policy_version 60021 (0.0008) +[2023-10-13 02:45:09,280][46662] Updated weights for policy 0, policy_version 60090 (0.0008) +[2023-10-13 02:45:09,505][46663] Updated weights for policy 1, policy_version 60031 (0.0010) +[2023-10-13 02:45:13,422][46662] Updated weights for policy 0, policy_version 60100 (0.0009) +[2023-10-13 02:45:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 123011072. Throughput: 0: 1688.5, 1: 1672.5. Samples: 30762942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:13,608][45375] Avg episode reward: [(0, '48.810'), (1, '49.240')] +[2023-10-13 02:45:13,713][46663] Updated weights for policy 1, policy_version 60041 (0.0009) +[2023-10-13 02:45:13,801][46662] Updated weights for policy 0, policy_version 60110 (0.0008) +[2023-10-13 02:45:14,076][46663] Updated weights for policy 1, policy_version 60051 (0.0008) +[2023-10-13 02:45:14,180][46662] Updated weights for policy 0, policy_version 60120 (0.0007) +[2023-10-13 02:45:14,436][46663] Updated weights for policy 1, policy_version 60061 (0.0007) +[2023-10-13 02:45:18,236][46662] Updated weights for policy 0, policy_version 60130 (0.0010) +[2023-10-13 02:45:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123076608. Throughput: 0: 1683.8, 1: 1669.8. Samples: 30783278. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:18,607][46662] Updated weights for policy 0, policy_version 60140 (0.0008) +[2023-10-13 02:45:18,607][45375] Avg episode reward: [(0, '49.390'), (1, '50.140')] +[2023-10-13 02:45:18,611][46663] Updated weights for policy 1, policy_version 60071 (0.0007) +[2023-10-13 02:45:18,977][46663] Updated weights for policy 1, policy_version 60081 (0.0009) +[2023-10-13 02:45:18,978][46662] Updated weights for policy 0, policy_version 60150 (0.0008) +[2023-10-13 02:45:19,345][46663] Updated weights for policy 1, policy_version 60091 (0.0009) +[2023-10-13 02:45:19,351][46662] Updated weights for policy 0, policy_version 60160 (0.0007) +[2023-10-13 02:45:23,505][46663] Updated weights for policy 1, policy_version 60101 (0.0011) +[2023-10-13 02:45:23,529][46662] Updated weights for policy 0, policy_version 60170 (0.0008) +[2023-10-13 02:45:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 123142144. Throughput: 0: 1677.0, 1: 1660.0. Samples: 30803448. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:23,607][45375] Avg episode reward: [(0, '50.090'), (1, '49.790')] +[2023-10-13 02:45:23,868][46663] Updated weights for policy 1, policy_version 60111 (0.0008) +[2023-10-13 02:45:23,885][46662] Updated weights for policy 0, policy_version 60180 (0.0009) +[2023-10-13 02:45:24,240][46663] Updated weights for policy 1, policy_version 60121 (0.0008) +[2023-10-13 02:45:24,256][46662] Updated weights for policy 0, policy_version 60190 (0.0008) +[2023-10-13 02:45:28,352][46663] Updated weights for policy 1, policy_version 60131 (0.0008) +[2023-10-13 02:45:28,530][46662] Updated weights for policy 0, policy_version 60200 (0.0008) +[2023-10-13 02:45:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123207680. Throughput: 0: 1678.0, 1: 1659.0. Samples: 30812468. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:28,607][45375] Avg episode reward: [(0, '50.060'), (1, '49.720')] +[2023-10-13 02:45:28,718][46663] Updated weights for policy 1, policy_version 60141 (0.0009) +[2023-10-13 02:45:28,911][46662] Updated weights for policy 0, policy_version 60210 (0.0008) +[2023-10-13 02:45:29,089][46663] Updated weights for policy 1, policy_version 60151 (0.0008) +[2023-10-13 02:45:29,286][46662] Updated weights for policy 0, policy_version 60220 (0.0009) +[2023-10-13 02:45:33,002][46663] Updated weights for policy 1, policy_version 60161 (0.0008) +[2023-10-13 02:45:33,369][46663] Updated weights for policy 1, policy_version 60171 (0.0008) +[2023-10-13 02:45:33,430][46662] Updated weights for policy 0, policy_version 60230 (0.0009) +[2023-10-13 02:45:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123273216. Throughput: 0: 1662.4, 1: 1664.0. Samples: 30832902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:33,607][45375] Avg episode reward: [(0, '52.620'), (1, '49.660')] +[2023-10-13 02:45:33,738][46663] Updated weights for policy 1, policy_version 60181 (0.0008) +[2023-10-13 02:45:33,802][46662] Updated weights for policy 0, policy_version 60240 (0.0008) +[2023-10-13 02:45:34,108][46663] Updated weights for policy 1, policy_version 60191 (0.0008) +[2023-10-13 02:45:34,177][46662] Updated weights for policy 0, policy_version 60250 (0.0008) +[2023-10-13 02:45:38,164][46663] Updated weights for policy 1, policy_version 60201 (0.0010) +[2023-10-13 02:45:38,306][46662] Updated weights for policy 0, policy_version 60260 (0.0008) +[2023-10-13 02:45:38,530][46663] Updated weights for policy 1, policy_version 60211 (0.0009) +[2023-10-13 02:45:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123338752. Throughput: 0: 1658.0, 1: 1649.2. Samples: 30852602. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:38,607][45375] Avg episode reward: [(0, '53.180'), (1, '50.030')] +[2023-10-13 02:45:38,672][46662] Updated weights for policy 0, policy_version 60270 (0.0008) +[2023-10-13 02:45:38,895][46663] Updated weights for policy 1, policy_version 60221 (0.0010) +[2023-10-13 02:45:39,049][46662] Updated weights for policy 0, policy_version 60280 (0.0008) +[2023-10-13 02:45:43,120][46663] Updated weights for policy 1, policy_version 60231 (0.0008) +[2023-10-13 02:45:43,315][46662] Updated weights for policy 0, policy_version 60290 (0.0009) +[2023-10-13 02:45:43,492][46663] Updated weights for policy 1, policy_version 60241 (0.0009) +[2023-10-13 02:45:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123404288. Throughput: 0: 1657.1, 1: 1665.6. Samples: 30862262. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:45:43,607][45375] Avg episode reward: [(0, '52.150'), (1, '52.210')] +[2023-10-13 02:45:43,683][46662] Updated weights for policy 0, policy_version 60300 (0.0008) +[2023-10-13 02:45:43,858][46663] Updated weights for policy 1, policy_version 60251 (0.0008) +[2023-10-13 02:45:44,051][46662] Updated weights for policy 0, policy_version 60310 (0.0009) +[2023-10-13 02:45:44,430][46662] Updated weights for policy 0, policy_version 60320 (0.0008) +[2023-10-13 02:45:47,941][46663] Updated weights for policy 1, policy_version 60261 (0.0008) +[2023-10-13 02:45:48,315][46663] Updated weights for policy 1, policy_version 60271 (0.0009) +[2023-10-13 02:45:48,349][46662] Updated weights for policy 0, policy_version 60330 (0.0009) +[2023-10-13 02:45:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123469824. Throughput: 0: 1656.4, 1: 1658.8. Samples: 30882564. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:45:48,607][45375] Avg episode reward: [(0, '52.490'), (1, '51.360')] +[2023-10-13 02:45:48,681][46663] Updated weights for policy 1, policy_version 60281 (0.0009) +[2023-10-13 02:45:48,723][46662] Updated weights for policy 0, policy_version 60340 (0.0007) +[2023-10-13 02:45:49,094][46662] Updated weights for policy 0, policy_version 60350 (0.0009) +[2023-10-13 02:45:53,033][46663] Updated weights for policy 1, policy_version 60291 (0.0008) +[2023-10-13 02:45:53,169][46662] Updated weights for policy 0, policy_version 60360 (0.0007) +[2023-10-13 02:45:53,401][46663] Updated weights for policy 1, policy_version 60301 (0.0008) +[2023-10-13 02:45:53,537][46662] Updated weights for policy 0, policy_version 60370 (0.0007) +[2023-10-13 02:45:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 123535360. Throughput: 0: 1656.1, 1: 1648.3. Samples: 30902464. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:45:53,608][45375] Avg episode reward: [(0, '53.350'), (1, '51.370')] +[2023-10-13 02:45:53,765][46663] Updated weights for policy 1, policy_version 60311 (0.0008) +[2023-10-13 02:45:53,899][46662] Updated weights for policy 0, policy_version 60380 (0.0007) +[2023-10-13 02:45:54,044][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000060384_61833216.pth... +[2023-10-13 02:45:54,082][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000058816_60227584.pth +[2023-10-13 02:45:54,092][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000060320_61767680.pth... 
+[2023-10-13 02:45:54,123][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000058752_60162048.pth +[2023-10-13 02:45:57,981][46663] Updated weights for policy 1, policy_version 60321 (0.0009) +[2023-10-13 02:45:58,069][46662] Updated weights for policy 0, policy_version 60390 (0.0008) +[2023-10-13 02:45:58,345][46663] Updated weights for policy 1, policy_version 60331 (0.0009) +[2023-10-13 02:45:58,434][46662] Updated weights for policy 0, policy_version 60400 (0.0008) +[2023-10-13 02:45:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123600896. Throughput: 0: 1651.9, 1: 1656.4. Samples: 30911814. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:45:58,607][45375] Avg episode reward: [(0, '52.090'), (1, '51.830')] +[2023-10-13 02:45:58,701][46663] Updated weights for policy 1, policy_version 60341 (0.0008) +[2023-10-13 02:45:58,798][46662] Updated weights for policy 0, policy_version 60410 (0.0008) +[2023-10-13 02:45:59,066][46663] Updated weights for policy 1, policy_version 60351 (0.0008) +[2023-10-13 02:46:02,943][46662] Updated weights for policy 0, policy_version 60420 (0.0009) +[2023-10-13 02:46:03,232][46663] Updated weights for policy 1, policy_version 60361 (0.0009) +[2023-10-13 02:46:03,310][46662] Updated weights for policy 0, policy_version 60430 (0.0007) +[2023-10-13 02:46:03,604][46663] Updated weights for policy 1, policy_version 60371 (0.0009) +[2023-10-13 02:46:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123666432. Throughput: 0: 1656.8, 1: 1657.2. Samples: 30932408. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:46:03,608][45375] Avg episode reward: [(0, '51.220'), (1, '52.380')] +[2023-10-13 02:46:03,681][46662] Updated weights for policy 0, policy_version 60440 (0.0008) +[2023-10-13 02:46:03,978][46663] Updated weights for policy 1, policy_version 60381 (0.0008) +[2023-10-13 02:46:07,732][46662] Updated weights for policy 0, policy_version 60450 (0.0009) +[2023-10-13 02:46:07,953][46663] Updated weights for policy 1, policy_version 60391 (0.0010) +[2023-10-13 02:46:08,094][46662] Updated weights for policy 0, policy_version 60460 (0.0011) +[2023-10-13 02:46:08,316][46663] Updated weights for policy 1, policy_version 60401 (0.0008) +[2023-10-13 02:46:08,468][46662] Updated weights for policy 0, policy_version 60470 (0.0009) +[2023-10-13 02:46:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123731968. Throughput: 0: 1659.0, 1: 1646.1. Samples: 30952180. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:46:08,607][45375] Avg episode reward: [(0, '52.790'), (1, '54.190')] +[2023-10-13 02:46:08,675][46663] Updated weights for policy 1, policy_version 60411 (0.0008) +[2023-10-13 02:46:08,837][46662] Updated weights for policy 0, policy_version 60480 (0.0008) +[2023-10-13 02:46:12,531][46663] Updated weights for policy 1, policy_version 60421 (0.0009) +[2023-10-13 02:46:12,890][46663] Updated weights for policy 1, policy_version 60431 (0.0007) +[2023-10-13 02:46:13,009][46662] Updated weights for policy 0, policy_version 60490 (0.0008) +[2023-10-13 02:46:13,261][46663] Updated weights for policy 1, policy_version 60441 (0.0007) +[2023-10-13 02:46:13,388][46662] Updated weights for policy 0, policy_version 60500 (0.0011) +[2023-10-13 02:46:13,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 123830272. Throughput: 0: 1662.5, 1: 1665.9. Samples: 30962244. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:46:13,607][45375] Avg episode reward: [(0, '53.090'), (1, '54.110')] +[2023-10-13 02:46:13,746][46662] Updated weights for policy 0, policy_version 60510 (0.0010) +[2023-10-13 02:46:17,235][46663] Updated weights for policy 1, policy_version 60451 (0.0007) +[2023-10-13 02:46:17,600][46663] Updated weights for policy 1, policy_version 60461 (0.0008) +[2023-10-13 02:46:17,933][46662] Updated weights for policy 0, policy_version 60520 (0.0009) +[2023-10-13 02:46:17,966][46663] Updated weights for policy 1, policy_version 60471 (0.0008) +[2023-10-13 02:46:18,306][46662] Updated weights for policy 0, policy_version 60530 (0.0009) +[2023-10-13 02:46:18,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 123895808. Throughput: 0: 1668.9, 1: 1663.1. Samples: 30982842. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) +[2023-10-13 02:46:18,607][45375] Avg episode reward: [(0, '51.790'), (1, '54.260')] +[2023-10-13 02:46:18,670][46662] Updated weights for policy 0, policy_version 60540 (0.0008) +[2023-10-13 02:46:22,139][46663] Updated weights for policy 1, policy_version 60481 (0.0010) +[2023-10-13 02:46:22,511][46663] Updated weights for policy 1, policy_version 60491 (0.0009) +[2023-10-13 02:46:22,598][46662] Updated weights for policy 0, policy_version 60550 (0.0008) +[2023-10-13 02:46:22,881][46663] Updated weights for policy 1, policy_version 60501 (0.0008) +[2023-10-13 02:46:22,959][46662] Updated weights for policy 0, policy_version 60560 (0.0007) +[2023-10-13 02:46:23,248][46663] Updated weights for policy 1, policy_version 60511 (0.0008) +[2023-10-13 02:46:23,335][46662] Updated weights for policy 0, policy_version 60570 (0.0009) +[2023-10-13 02:46:23,607][45375] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 123994112. Throughput: 0: 1663.8, 1: 1658.6. Samples: 31002110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:23,608][45375] Avg episode reward: [(0, '51.890'), (1, '54.950')] +[2023-10-13 02:46:27,322][46663] Updated weights for policy 1, policy_version 60521 (0.0009) +[2023-10-13 02:46:27,415][46662] Updated weights for policy 0, policy_version 60580 (0.0009) +[2023-10-13 02:46:27,685][46663] Updated weights for policy 1, policy_version 60531 (0.0008) +[2023-10-13 02:46:27,784][46662] Updated weights for policy 0, policy_version 60590 (0.0008) +[2023-10-13 02:46:28,047][46663] Updated weights for policy 1, policy_version 60541 (0.0008) +[2023-10-13 02:46:28,151][46662] Updated weights for policy 0, policy_version 60600 (0.0008) +[2023-10-13 02:46:28,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 124059648. Throughput: 0: 1675.3, 1: 1671.3. Samples: 31012860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:28,607][45375] Avg episode reward: [(0, '50.630'), (1, '55.240')] +[2023-10-13 02:46:32,290][46663] Updated weights for policy 1, policy_version 60551 (0.0009) +[2023-10-13 02:46:32,336][46662] Updated weights for policy 0, policy_version 60610 (0.0007) +[2023-10-13 02:46:32,665][46663] Updated weights for policy 1, policy_version 60561 (0.0008) +[2023-10-13 02:46:32,696][46662] Updated weights for policy 0, policy_version 60620 (0.0009) +[2023-10-13 02:46:33,028][46663] Updated weights for policy 1, policy_version 60571 (0.0007) +[2023-10-13 02:46:33,067][46662] Updated weights for policy 0, policy_version 60630 (0.0009) +[2023-10-13 02:46:33,440][46662] Updated weights for policy 0, policy_version 60640 (0.0007) +[2023-10-13 02:46:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 124125184. Throughput: 0: 1673.6, 1: 1669.0. Samples: 31032982. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:33,607][45375] Avg episode reward: [(0, '51.480'), (1, '56.230')] +[2023-10-13 02:46:37,100][46663] Updated weights for policy 1, policy_version 60581 (0.0009) +[2023-10-13 02:46:37,468][46663] Updated weights for policy 1, policy_version 60591 (0.0008) +[2023-10-13 02:46:37,496][46662] Updated weights for policy 0, policy_version 60650 (0.0009) +[2023-10-13 02:46:37,837][46663] Updated weights for policy 1, policy_version 60601 (0.0007) +[2023-10-13 02:46:37,856][46662] Updated weights for policy 0, policy_version 60660 (0.0008) +[2023-10-13 02:46:38,226][46662] Updated weights for policy 0, policy_version 60670 (0.0007) +[2023-10-13 02:46:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 124190720. Throughput: 0: 1661.6, 1: 1658.8. Samples: 31051878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:38,607][45375] Avg episode reward: [(0, '50.410'), (1, '58.170')] +[2023-10-13 02:46:42,047][46663] Updated weights for policy 1, policy_version 60611 (0.0009) +[2023-10-13 02:46:42,352][46662] Updated weights for policy 0, policy_version 60680 (0.0008) +[2023-10-13 02:46:42,413][46663] Updated weights for policy 1, policy_version 60621 (0.0010) +[2023-10-13 02:46:42,721][46662] Updated weights for policy 0, policy_version 60690 (0.0009) +[2023-10-13 02:46:42,779][46663] Updated weights for policy 1, policy_version 60631 (0.0009) +[2023-10-13 02:46:43,080][46662] Updated weights for policy 0, policy_version 60700 (0.0008) +[2023-10-13 02:46:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 124256256. Throughput: 0: 1677.8, 1: 1676.8. Samples: 31062770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:43,608][45375] Avg episode reward: [(0, '50.100'), (1, '58.440')] +[2023-10-13 02:46:46,944][46663] Updated weights for policy 1, policy_version 60641 (0.0007) +[2023-10-13 02:46:47,126][46662] Updated weights for policy 0, policy_version 60710 (0.0010) +[2023-10-13 02:46:47,309][46663] Updated weights for policy 1, policy_version 60651 (0.0009) +[2023-10-13 02:46:47,494][46662] Updated weights for policy 0, policy_version 60720 (0.0008) +[2023-10-13 02:46:47,675][46663] Updated weights for policy 1, policy_version 60661 (0.0008) +[2023-10-13 02:46:47,853][46662] Updated weights for policy 0, policy_version 60730 (0.0008) +[2023-10-13 02:46:48,047][46663] Updated weights for policy 1, policy_version 60671 (0.0008) +[2023-10-13 02:46:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 124321792. Throughput: 0: 1678.0, 1: 1664.1. Samples: 31082798. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:48,607][45375] Avg episode reward: [(0, '49.210'), (1, '58.180')] +[2023-10-13 02:46:51,917][46662] Updated weights for policy 0, policy_version 60740 (0.0009) +[2023-10-13 02:46:52,082][46663] Updated weights for policy 1, policy_version 60681 (0.0008) +[2023-10-13 02:46:52,274][46662] Updated weights for policy 0, policy_version 60750 (0.0008) +[2023-10-13 02:46:52,435][46663] Updated weights for policy 1, policy_version 60691 (0.0009) +[2023-10-13 02:46:52,650][46662] Updated weights for policy 0, policy_version 60760 (0.0008) +[2023-10-13 02:46:52,807][46663] Updated weights for policy 1, policy_version 60701 (0.0008) +[2023-10-13 02:46:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 124387328. Throughput: 0: 1657.8, 1: 1660.4. Samples: 31101498. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:53,607][45375] Avg episode reward: [(0, '50.710'), (1, '57.160')] +[2023-10-13 02:46:56,751][46662] Updated weights for policy 0, policy_version 60770 (0.0008) +[2023-10-13 02:46:56,927][46663] Updated weights for policy 1, policy_version 60711 (0.0007) +[2023-10-13 02:46:57,109][46662] Updated weights for policy 0, policy_version 60780 (0.0008) +[2023-10-13 02:46:57,295][46663] Updated weights for policy 1, policy_version 60721 (0.0008) +[2023-10-13 02:46:57,475][46662] Updated weights for policy 0, policy_version 60790 (0.0008) +[2023-10-13 02:46:57,660][46663] Updated weights for policy 1, policy_version 60731 (0.0007) +[2023-10-13 02:46:57,846][46662] Updated weights for policy 0, policy_version 60800 (0.0009) +[2023-10-13 02:46:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 124452864. Throughput: 0: 1679.4, 1: 1667.9. Samples: 31112874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:46:58,607][45375] Avg episode reward: [(0, '52.030'), (1, '58.930')] +[2023-10-13 02:47:01,827][46663] Updated weights for policy 1, policy_version 60741 (0.0008) +[2023-10-13 02:47:02,039][46662] Updated weights for policy 0, policy_version 60810 (0.0007) +[2023-10-13 02:47:02,190][46663] Updated weights for policy 1, policy_version 60751 (0.0007) +[2023-10-13 02:47:02,406][46662] Updated weights for policy 0, policy_version 60820 (0.0008) +[2023-10-13 02:47:02,557][46663] Updated weights for policy 1, policy_version 60761 (0.0009) +[2023-10-13 02:47:02,771][46662] Updated weights for policy 0, policy_version 60830 (0.0009) +[2023-10-13 02:47:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 124518400. Throughput: 0: 1680.8, 1: 1646.3. Samples: 31132564. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:03,607][45375] Avg episode reward: [(0, '51.800'), (1, '58.590')] +[2023-10-13 02:47:06,642][46663] Updated weights for policy 1, policy_version 60771 (0.0009) +[2023-10-13 02:47:06,830][46662] Updated weights for policy 0, policy_version 60840 (0.0007) +[2023-10-13 02:47:07,008][46663] Updated weights for policy 1, policy_version 60781 (0.0008) +[2023-10-13 02:47:07,200][46662] Updated weights for policy 0, policy_version 60850 (0.0009) +[2023-10-13 02:47:07,380][46663] Updated weights for policy 1, policy_version 60791 (0.0007) +[2023-10-13 02:47:07,570][46662] Updated weights for policy 0, policy_version 60860 (0.0008) +[2023-10-13 02:47:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 124583936. Throughput: 0: 1662.5, 1: 1655.3. Samples: 31151412. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:08,607][45375] Avg episode reward: [(0, '52.470'), (1, '57.920')] +[2023-10-13 02:47:11,527][46663] Updated weights for policy 1, policy_version 60801 (0.0007) +[2023-10-13 02:47:11,629][46662] Updated weights for policy 0, policy_version 60870 (0.0008) +[2023-10-13 02:47:11,884][46663] Updated weights for policy 1, policy_version 60811 (0.0009) +[2023-10-13 02:47:11,996][46662] Updated weights for policy 0, policy_version 60880 (0.0009) +[2023-10-13 02:47:12,242][46663] Updated weights for policy 1, policy_version 60821 (0.0009) +[2023-10-13 02:47:12,367][46662] Updated weights for policy 0, policy_version 60890 (0.0009) +[2023-10-13 02:47:12,608][46663] Updated weights for policy 1, policy_version 60831 (0.0009) +[2023-10-13 02:47:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 124649472. Throughput: 0: 1677.8, 1: 1656.4. Samples: 31162898. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:13,607][45375] Avg episode reward: [(0, '52.510'), (1, '58.220')] +[2023-10-13 02:47:16,469][46662] Updated weights for policy 0, policy_version 60900 (0.0009) +[2023-10-13 02:47:16,781][46663] Updated weights for policy 1, policy_version 60841 (0.0009) +[2023-10-13 02:47:16,829][46662] Updated weights for policy 0, policy_version 60910 (0.0007) +[2023-10-13 02:47:17,160][46663] Updated weights for policy 1, policy_version 60851 (0.0008) +[2023-10-13 02:47:17,194][46662] Updated weights for policy 0, policy_version 60920 (0.0009) +[2023-10-13 02:47:17,523][46663] Updated weights for policy 1, policy_version 60861 (0.0009) +[2023-10-13 02:47:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 124715008. Throughput: 0: 1665.6, 1: 1639.6. Samples: 31181716. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:18,607][45375] Avg episode reward: [(0, '52.660'), (1, '58.420')] +[2023-10-13 02:47:21,368][46662] Updated weights for policy 0, policy_version 60930 (0.0010) +[2023-10-13 02:47:21,580][46663] Updated weights for policy 1, policy_version 60871 (0.0007) +[2023-10-13 02:47:21,741][46662] Updated weights for policy 0, policy_version 60940 (0.0008) +[2023-10-13 02:47:21,944][46663] Updated weights for policy 1, policy_version 60881 (0.0007) +[2023-10-13 02:47:22,103][46662] Updated weights for policy 0, policy_version 60950 (0.0007) +[2023-10-13 02:47:22,307][46663] Updated weights for policy 1, policy_version 60891 (0.0008) +[2023-10-13 02:47:22,477][46662] Updated weights for policy 0, policy_version 60960 (0.0010) +[2023-10-13 02:47:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 124780544. Throughput: 0: 1658.4, 1: 1658.4. Samples: 31201130. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:23,607][45375] Avg episode reward: [(0, '53.350'), (1, '56.840')] +[2023-10-13 02:47:26,447][46663] Updated weights for policy 1, policy_version 60901 (0.0008) +[2023-10-13 02:47:26,566][46662] Updated weights for policy 0, policy_version 60970 (0.0008) +[2023-10-13 02:47:26,811][46663] Updated weights for policy 1, policy_version 60911 (0.0007) +[2023-10-13 02:47:26,937][46662] Updated weights for policy 0, policy_version 60980 (0.0009) +[2023-10-13 02:47:27,168][46663] Updated weights for policy 1, policy_version 60921 (0.0008) +[2023-10-13 02:47:27,303][46662] Updated weights for policy 0, policy_version 60990 (0.0009) +[2023-10-13 02:47:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 124846080. Throughput: 0: 1671.6, 1: 1661.5. Samples: 31212758. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:28,607][45375] Avg episode reward: [(0, '53.690'), (1, '58.640')] +[2023-10-13 02:47:31,373][46663] Updated weights for policy 1, policy_version 60931 (0.0008) +[2023-10-13 02:47:31,561][46662] Updated weights for policy 0, policy_version 61000 (0.0008) +[2023-10-13 02:47:31,743][46663] Updated weights for policy 1, policy_version 60941 (0.0008) +[2023-10-13 02:47:31,933][46662] Updated weights for policy 0, policy_version 61010 (0.0008) +[2023-10-13 02:47:32,120][46663] Updated weights for policy 1, policy_version 60951 (0.0007) +[2023-10-13 02:47:32,302][46662] Updated weights for policy 0, policy_version 61020 (0.0007) +[2023-10-13 02:47:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 124911616. Throughput: 0: 1652.4, 1: 1651.9. Samples: 31231492. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:33,607][45375] Avg episode reward: [(0, '53.830'), (1, '58.180')] +[2023-10-13 02:47:36,034][46663] Updated weights for policy 1, policy_version 60961 (0.0010) +[2023-10-13 02:47:36,369][46662] Updated weights for policy 0, policy_version 61030 (0.0007) +[2023-10-13 02:47:36,395][46663] Updated weights for policy 1, policy_version 60971 (0.0009) +[2023-10-13 02:47:36,745][46662] Updated weights for policy 0, policy_version 61040 (0.0010) +[2023-10-13 02:47:36,756][46663] Updated weights for policy 1, policy_version 60981 (0.0008) +[2023-10-13 02:47:37,106][46662] Updated weights for policy 0, policy_version 61050 (0.0008) +[2023-10-13 02:47:37,118][46663] Updated weights for policy 1, policy_version 60991 (0.0007) +[2023-10-13 02:47:38,607][45375] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 124977152. Throughput: 0: 1659.3, 1: 1668.3. Samples: 31251240. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-13 02:47:38,608][45375] Avg episode reward: [(0, '53.440'), (1, '58.390')] +[2023-10-13 02:47:41,206][46662] Updated weights for policy 0, policy_version 61060 (0.0008) +[2023-10-13 02:47:41,319][46663] Updated weights for policy 1, policy_version 61001 (0.0009) +[2023-10-13 02:47:41,577][46662] Updated weights for policy 0, policy_version 61070 (0.0008) +[2023-10-13 02:47:41,691][46663] Updated weights for policy 1, policy_version 61011 (0.0009) +[2023-10-13 02:47:41,944][46662] Updated weights for policy 0, policy_version 61080 (0.0007) +[2023-10-13 02:47:42,057][46663] Updated weights for policy 1, policy_version 61021 (0.0007) +[2023-10-13 02:47:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125042688. Throughput: 0: 1665.6, 1: 1659.2. Samples: 31262490. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:47:43,607][45375] Avg episode reward: [(0, '53.190'), (1, '57.750')] +[2023-10-13 02:47:46,188][46662] Updated weights for policy 0, policy_version 61090 (0.0008) +[2023-10-13 02:47:46,214][46663] Updated weights for policy 1, policy_version 61031 (0.0009) +[2023-10-13 02:47:46,562][46662] Updated weights for policy 0, policy_version 61100 (0.0008) +[2023-10-13 02:47:46,587][46663] Updated weights for policy 1, policy_version 61041 (0.0008) +[2023-10-13 02:47:46,921][46662] Updated weights for policy 0, policy_version 61110 (0.0008) +[2023-10-13 02:47:46,945][46663] Updated weights for policy 1, policy_version 61051 (0.0008) +[2023-10-13 02:47:47,288][46662] Updated weights for policy 0, policy_version 61120 (0.0008) +[2023-10-13 02:47:48,607][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125108224. Throughput: 0: 1645.6, 1: 1658.9. Samples: 31281266. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:47:48,608][45375] Avg episode reward: [(0, '51.350'), (1, '55.900')] +[2023-10-13 02:47:51,178][46663] Updated weights for policy 1, policy_version 61061 (0.0008) +[2023-10-13 02:47:51,459][46662] Updated weights for policy 0, policy_version 61130 (0.0007) +[2023-10-13 02:47:51,553][46663] Updated weights for policy 1, policy_version 61071 (0.0007) +[2023-10-13 02:47:51,830][46662] Updated weights for policy 0, policy_version 61140 (0.0009) +[2023-10-13 02:47:51,922][46663] Updated weights for policy 1, policy_version 61081 (0.0007) +[2023-10-13 02:47:52,192][46662] Updated weights for policy 0, policy_version 61150 (0.0007) +[2023-10-13 02:47:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125173760. Throughput: 0: 1657.8, 1: 1673.5. Samples: 31301318. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:47:53,607][45375] Avg episode reward: [(0, '52.610'), (1, '56.560')] +[2023-10-13 02:47:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000061152_62619648.pth... +[2023-10-13 02:47:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000061088_62554112.pth... +[2023-10-13 02:47:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000059584_61014016.pth +[2023-10-13 02:47:53,666][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000059520_60948480.pth +[2023-10-13 02:47:55,803][46663] Updated weights for policy 1, policy_version 61091 (0.0008) +[2023-10-13 02:47:56,168][46663] Updated weights for policy 1, policy_version 61101 (0.0009) +[2023-10-13 02:47:56,350][46662] Updated weights for policy 0, policy_version 61160 (0.0010) +[2023-10-13 02:47:56,541][46663] Updated weights for policy 1, policy_version 61111 (0.0009) +[2023-10-13 02:47:56,722][46662] Updated weights for policy 0, policy_version 61170 (0.0009) +[2023-10-13 02:47:57,092][46662] Updated weights for policy 0, policy_version 61180 (0.0007) +[2023-10-13 02:47:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125239296. Throughput: 0: 1662.4, 1: 1661.2. Samples: 31312462. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:47:58,607][45375] Avg episode reward: [(0, '52.300'), (1, '56.420')] +[2023-10-13 02:48:00,665][46663] Updated weights for policy 1, policy_version 61121 (0.0009) +[2023-10-13 02:48:01,026][46663] Updated weights for policy 1, policy_version 61131 (0.0009) +[2023-10-13 02:48:01,030][46662] Updated weights for policy 0, policy_version 61190 (0.0008) +[2023-10-13 02:48:01,393][46662] Updated weights for policy 0, policy_version 61200 (0.0009) +[2023-10-13 02:48:01,397][46663] Updated weights for policy 1, policy_version 61141 (0.0009) +[2023-10-13 02:48:01,769][46662] Updated weights for policy 0, policy_version 61210 (0.0008) +[2023-10-13 02:48:01,770][46663] Updated weights for policy 1, policy_version 61151 (0.0008) +[2023-10-13 02:48:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 125304832. Throughput: 0: 1650.8, 1: 1676.2. Samples: 31331430. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:48:03,608][45375] Avg episode reward: [(0, '52.840'), (1, '55.870')] +[2023-10-13 02:48:05,637][46662] Updated weights for policy 0, policy_version 61220 (0.0009) +[2023-10-13 02:48:06,012][46662] Updated weights for policy 0, policy_version 61230 (0.0007) +[2023-10-13 02:48:06,044][46663] Updated weights for policy 1, policy_version 61161 (0.0010) +[2023-10-13 02:48:06,382][46662] Updated weights for policy 0, policy_version 61240 (0.0009) +[2023-10-13 02:48:06,418][46663] Updated weights for policy 1, policy_version 61171 (0.0007) +[2023-10-13 02:48:06,790][46663] Updated weights for policy 1, policy_version 61181 (0.0010) +[2023-10-13 02:48:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125370368. Throughput: 0: 1676.5, 1: 1670.8. Samples: 31351758. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:48:08,607][45375] Avg episode reward: [(0, '53.800'), (1, '55.330')] +[2023-10-13 02:48:10,475][46662] Updated weights for policy 0, policy_version 61250 (0.0008) +[2023-10-13 02:48:10,756][46663] Updated weights for policy 1, policy_version 61191 (0.0008) +[2023-10-13 02:48:10,839][46662] Updated weights for policy 0, policy_version 61260 (0.0009) +[2023-10-13 02:48:11,123][46663] Updated weights for policy 1, policy_version 61201 (0.0008) +[2023-10-13 02:48:11,206][46662] Updated weights for policy 0, policy_version 61270 (0.0008) +[2023-10-13 02:48:11,492][46663] Updated weights for policy 1, policy_version 61211 (0.0008) +[2023-10-13 02:48:11,568][46662] Updated weights for policy 0, policy_version 61280 (0.0009) +[2023-10-13 02:48:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 125435904. Throughput: 0: 1665.3, 1: 1655.1. Samples: 31362176. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:48:13,608][45375] Avg episode reward: [(0, '51.760'), (1, '55.480')] +[2023-10-13 02:48:15,509][46663] Updated weights for policy 1, policy_version 61221 (0.0008) +[2023-10-13 02:48:15,738][46662] Updated weights for policy 0, policy_version 61290 (0.0009) +[2023-10-13 02:48:15,876][46663] Updated weights for policy 1, policy_version 61231 (0.0007) +[2023-10-13 02:48:16,100][46662] Updated weights for policy 0, policy_version 61300 (0.0008) +[2023-10-13 02:48:16,249][46663] Updated weights for policy 1, policy_version 61241 (0.0008) +[2023-10-13 02:48:16,472][46662] Updated weights for policy 0, policy_version 61310 (0.0008) +[2023-10-13 02:48:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125501440. Throughput: 0: 1661.2, 1: 1674.5. Samples: 31381596. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 02:48:18,607][45375] Avg episode reward: [(0, '51.780'), (1, '54.620')] +[2023-10-13 02:48:20,132][46663] Updated weights for policy 1, policy_version 61251 (0.0007) +[2023-10-13 02:48:20,500][46663] Updated weights for policy 1, policy_version 61261 (0.0007) +[2023-10-13 02:48:20,656][46662] Updated weights for policy 0, policy_version 61320 (0.0008) +[2023-10-13 02:48:20,869][46663] Updated weights for policy 1, policy_version 61271 (0.0007) +[2023-10-13 02:48:21,019][46662] Updated weights for policy 0, policy_version 61330 (0.0009) +[2023-10-13 02:48:21,393][46662] Updated weights for policy 0, policy_version 61340 (0.0010) +[2023-10-13 02:48:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125566976. Throughput: 0: 1673.0, 1: 1680.2. Samples: 31402134. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:23,607][45375] Avg episode reward: [(0, '52.540'), (1, '54.270')] +[2023-10-13 02:48:24,991][46663] Updated weights for policy 1, policy_version 61281 (0.0009) +[2023-10-13 02:48:25,352][46663] Updated weights for policy 1, policy_version 61291 (0.0009) +[2023-10-13 02:48:25,419][46662] Updated weights for policy 0, policy_version 61350 (0.0010) +[2023-10-13 02:48:25,719][46663] Updated weights for policy 1, policy_version 61301 (0.0008) +[2023-10-13 02:48:25,799][46662] Updated weights for policy 0, policy_version 61360 (0.0007) +[2023-10-13 02:48:26,077][46663] Updated weights for policy 1, policy_version 61311 (0.0008) +[2023-10-13 02:48:26,169][46662] Updated weights for policy 0, policy_version 61370 (0.0010) +[2023-10-13 02:48:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125632512. Throughput: 0: 1661.2, 1: 1659.3. Samples: 31411916. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:28,607][45375] Avg episode reward: [(0, '52.060'), (1, '54.330')] +[2023-10-13 02:48:30,259][46662] Updated weights for policy 0, policy_version 61380 (0.0009) +[2023-10-13 02:48:30,366][46663] Updated weights for policy 1, policy_version 61321 (0.0007) +[2023-10-13 02:48:30,627][46662] Updated weights for policy 0, policy_version 61390 (0.0009) +[2023-10-13 02:48:30,745][46663] Updated weights for policy 1, policy_version 61331 (0.0008) +[2023-10-13 02:48:30,992][46662] Updated weights for policy 0, policy_version 61400 (0.0009) +[2023-10-13 02:48:31,115][46663] Updated weights for policy 1, policy_version 61341 (0.0009) +[2023-10-13 02:48:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125698048. Throughput: 0: 1668.0, 1: 1677.7. Samples: 31431820. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:33,607][45375] Avg episode reward: [(0, '52.450'), (1, '52.010')] +[2023-10-13 02:48:35,159][46663] Updated weights for policy 1, policy_version 61351 (0.0008) +[2023-10-13 02:48:35,239][46662] Updated weights for policy 0, policy_version 61410 (0.0010) +[2023-10-13 02:48:35,526][46663] Updated weights for policy 1, policy_version 61361 (0.0009) +[2023-10-13 02:48:35,618][46662] Updated weights for policy 0, policy_version 61420 (0.0008) +[2023-10-13 02:48:35,884][46663] Updated weights for policy 1, policy_version 61371 (0.0008) +[2023-10-13 02:48:35,991][46662] Updated weights for policy 0, policy_version 61430 (0.0007) +[2023-10-13 02:48:36,357][46662] Updated weights for policy 0, policy_version 61440 (0.0008) +[2023-10-13 02:48:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 125763584. Throughput: 0: 1681.1, 1: 1673.0. Samples: 31452254. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:38,607][45375] Avg episode reward: [(0, '52.530'), (1, '51.490')] +[2023-10-13 02:48:40,060][46663] Updated weights for policy 1, policy_version 61381 (0.0007) +[2023-10-13 02:48:40,375][46662] Updated weights for policy 0, policy_version 61450 (0.0009) +[2023-10-13 02:48:40,427][46663] Updated weights for policy 1, policy_version 61391 (0.0008) +[2023-10-13 02:48:40,749][46662] Updated weights for policy 0, policy_version 61460 (0.0007) +[2023-10-13 02:48:40,797][46663] Updated weights for policy 1, policy_version 61401 (0.0008) +[2023-10-13 02:48:41,118][46662] Updated weights for policy 0, policy_version 61470 (0.0009) +[2023-10-13 02:48:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 125829120. Throughput: 0: 1656.9, 1: 1657.5. Samples: 31461610. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:43,608][45375] Avg episode reward: [(0, '51.900'), (1, '51.280')] +[2023-10-13 02:48:44,953][46662] Updated weights for policy 0, policy_version 61480 (0.0009) +[2023-10-13 02:48:44,954][46663] Updated weights for policy 1, policy_version 61411 (0.0008) +[2023-10-13 02:48:45,327][46662] Updated weights for policy 0, policy_version 61490 (0.0009) +[2023-10-13 02:48:45,329][46663] Updated weights for policy 1, policy_version 61421 (0.0008) +[2023-10-13 02:48:45,696][46662] Updated weights for policy 0, policy_version 61500 (0.0007) +[2023-10-13 02:48:45,702][46663] Updated weights for policy 1, policy_version 61431 (0.0007) +[2023-10-13 02:48:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125894656. Throughput: 0: 1678.2, 1: 1673.1. Samples: 31482240. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:48,607][45375] Avg episode reward: [(0, '52.910'), (1, '50.820')] +[2023-10-13 02:48:49,744][46662] Updated weights for policy 0, policy_version 61510 (0.0007) +[2023-10-13 02:48:49,799][46663] Updated weights for policy 1, policy_version 61441 (0.0007) +[2023-10-13 02:48:50,124][46662] Updated weights for policy 0, policy_version 61520 (0.0007) +[2023-10-13 02:48:50,165][46663] Updated weights for policy 1, policy_version 61451 (0.0009) +[2023-10-13 02:48:50,486][46662] Updated weights for policy 0, policy_version 61530 (0.0008) +[2023-10-13 02:48:50,528][46663] Updated weights for policy 1, policy_version 61461 (0.0009) +[2023-10-13 02:48:50,890][46663] Updated weights for policy 1, policy_version 61471 (0.0009) +[2023-10-13 02:48:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125960192. Throughput: 0: 1675.0, 1: 1684.0. Samples: 31502916. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:53,607][45375] Avg episode reward: [(0, '54.150'), (1, '49.410')] +[2023-10-13 02:48:54,557][46662] Updated weights for policy 0, policy_version 61540 (0.0009) +[2023-10-13 02:48:54,926][46662] Updated weights for policy 0, policy_version 61550 (0.0009) +[2023-10-13 02:48:54,962][46663] Updated weights for policy 1, policy_version 61481 (0.0009) +[2023-10-13 02:48:55,287][46662] Updated weights for policy 0, policy_version 61560 (0.0009) +[2023-10-13 02:48:55,333][46663] Updated weights for policy 1, policy_version 61491 (0.0007) +[2023-10-13 02:48:55,707][46663] Updated weights for policy 1, policy_version 61501 (0.0009) +[2023-10-13 02:48:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126025728. Throughput: 0: 1661.5, 1: 1670.6. Samples: 31512122. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-13 02:48:58,607][45375] Avg episode reward: [(0, '52.960'), (1, '47.700')] +[2023-10-13 02:48:59,571][46662] Updated weights for policy 0, policy_version 61570 (0.0007) +[2023-10-13 02:48:59,737][46663] Updated weights for policy 1, policy_version 61511 (0.0009) +[2023-10-13 02:48:59,945][46662] Updated weights for policy 0, policy_version 61580 (0.0007) +[2023-10-13 02:49:00,111][46663] Updated weights for policy 1, policy_version 61521 (0.0009) +[2023-10-13 02:49:00,314][46662] Updated weights for policy 0, policy_version 61590 (0.0008) +[2023-10-13 02:49:00,489][46663] Updated weights for policy 1, policy_version 61531 (0.0010) +[2023-10-13 02:49:00,678][46662] Updated weights for policy 0, policy_version 61600 (0.0007) +[2023-10-13 02:49:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126091264. Throughput: 0: 1677.2, 1: 1675.2. Samples: 31532450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:03,608][45375] Avg episode reward: [(0, '53.740'), (1, '48.020')] +[2023-10-13 02:49:04,525][46663] Updated weights for policy 1, policy_version 61541 (0.0009) +[2023-10-13 02:49:04,699][46662] Updated weights for policy 0, policy_version 61610 (0.0009) +[2023-10-13 02:49:04,900][46663] Updated weights for policy 1, policy_version 61551 (0.0009) +[2023-10-13 02:49:05,068][46662] Updated weights for policy 0, policy_version 61620 (0.0009) +[2023-10-13 02:49:05,264][46663] Updated weights for policy 1, policy_version 61561 (0.0010) +[2023-10-13 02:49:05,434][46662] Updated weights for policy 0, policy_version 61630 (0.0007) +[2023-10-13 02:49:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126156800. Throughput: 0: 1681.5, 1: 1675.5. Samples: 31553196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:08,608][45375] Avg episode reward: [(0, '50.450'), (1, '47.520')] +[2023-10-13 02:49:09,353][46663] Updated weights for policy 1, policy_version 61571 (0.0008) +[2023-10-13 02:49:09,484][46662] Updated weights for policy 0, policy_version 61640 (0.0009) +[2023-10-13 02:49:09,723][46663] Updated weights for policy 1, policy_version 61581 (0.0007) +[2023-10-13 02:49:09,854][46662] Updated weights for policy 0, policy_version 61650 (0.0008) +[2023-10-13 02:49:10,087][46663] Updated weights for policy 1, policy_version 61591 (0.0007) +[2023-10-13 02:49:10,221][46662] Updated weights for policy 0, policy_version 61660 (0.0009) +[2023-10-13 02:49:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126222336. Throughput: 0: 1662.7, 1: 1679.6. Samples: 31562320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:13,607][45375] Avg episode reward: [(0, '49.570'), (1, '46.650')] +[2023-10-13 02:49:14,087][46663] Updated weights for policy 1, policy_version 61601 (0.0009) +[2023-10-13 02:49:14,207][46662] Updated weights for policy 0, policy_version 61670 (0.0007) +[2023-10-13 02:49:14,449][46663] Updated weights for policy 1, policy_version 61611 (0.0007) +[2023-10-13 02:49:14,576][46662] Updated weights for policy 0, policy_version 61680 (0.0007) +[2023-10-13 02:49:14,825][46663] Updated weights for policy 1, policy_version 61621 (0.0007) +[2023-10-13 02:49:14,940][46662] Updated weights for policy 0, policy_version 61690 (0.0008) +[2023-10-13 02:49:15,186][46663] Updated weights for policy 1, policy_version 61631 (0.0010) +[2023-10-13 02:49:18,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126287872. Throughput: 0: 1679.7, 1: 1677.6. Samples: 31582896. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:18,607][45375] Avg episode reward: [(0, '50.150'), (1, '45.990')] +[2023-10-13 02:49:19,121][46662] Updated weights for policy 0, policy_version 61700 (0.0010) +[2023-10-13 02:49:19,414][46663] Updated weights for policy 1, policy_version 61641 (0.0009) +[2023-10-13 02:49:19,488][46662] Updated weights for policy 0, policy_version 61710 (0.0008) +[2023-10-13 02:49:19,777][46663] Updated weights for policy 1, policy_version 61651 (0.0008) +[2023-10-13 02:49:19,859][46662] Updated weights for policy 0, policy_version 61720 (0.0008) +[2023-10-13 02:49:20,136][46663] Updated weights for policy 1, policy_version 61661 (0.0007) +[2023-10-13 02:49:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 126353408. Throughput: 0: 1681.8, 1: 1677.2. Samples: 31603412. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:23,608][45375] Avg episode reward: [(0, '49.820'), (1, '46.140')] +[2023-10-13 02:49:23,876][46662] Updated weights for policy 0, policy_version 61730 (0.0008) +[2023-10-13 02:49:24,158][46663] Updated weights for policy 1, policy_version 61671 (0.0008) +[2023-10-13 02:49:24,236][46662] Updated weights for policy 0, policy_version 61740 (0.0009) +[2023-10-13 02:49:24,518][46663] Updated weights for policy 1, policy_version 61681 (0.0008) +[2023-10-13 02:49:24,610][46662] Updated weights for policy 0, policy_version 61750 (0.0007) +[2023-10-13 02:49:24,891][46663] Updated weights for policy 1, policy_version 61691 (0.0007) +[2023-10-13 02:49:24,975][46662] Updated weights for policy 0, policy_version 61760 (0.0008) +[2023-10-13 02:49:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126418944. Throughput: 0: 1676.0, 1: 1678.9. Samples: 31612578. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:28,607][45375] Avg episode reward: [(0, '49.660'), (1, '45.700')] +[2023-10-13 02:49:28,997][46663] Updated weights for policy 1, policy_version 61701 (0.0009) +[2023-10-13 02:49:29,041][46662] Updated weights for policy 0, policy_version 61770 (0.0009) +[2023-10-13 02:49:29,368][46663] Updated weights for policy 1, policy_version 61711 (0.0008) +[2023-10-13 02:49:29,408][46662] Updated weights for policy 0, policy_version 61780 (0.0008) +[2023-10-13 02:49:29,739][46663] Updated weights for policy 1, policy_version 61721 (0.0008) +[2023-10-13 02:49:29,775][46662] Updated weights for policy 0, policy_version 61790 (0.0009) +[2023-10-13 02:49:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126484480. Throughput: 0: 1680.0, 1: 1674.3. Samples: 31633180. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:33,607][45375] Avg episode reward: [(0, '50.350'), (1, '46.900')] +[2023-10-13 02:49:33,826][46662] Updated weights for policy 0, policy_version 61800 (0.0009) +[2023-10-13 02:49:33,831][46663] Updated weights for policy 1, policy_version 61731 (0.0010) +[2023-10-13 02:49:34,192][46663] Updated weights for policy 1, policy_version 61741 (0.0007) +[2023-10-13 02:49:34,199][46662] Updated weights for policy 0, policy_version 61810 (0.0008) +[2023-10-13 02:49:34,555][46662] Updated weights for policy 0, policy_version 61820 (0.0008) +[2023-10-13 02:49:34,562][46663] Updated weights for policy 1, policy_version 61751 (0.0008) +[2023-10-13 02:49:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126550016. Throughput: 0: 1678.8, 1: 1669.9. Samples: 31653610. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:38,607][45375] Avg episode reward: [(0, '50.430'), (1, '46.660')] +[2023-10-13 02:49:38,819][46662] Updated weights for policy 0, policy_version 61830 (0.0010) +[2023-10-13 02:49:38,827][46663] Updated weights for policy 1, policy_version 61761 (0.0009) +[2023-10-13 02:49:39,181][46663] Updated weights for policy 1, policy_version 61771 (0.0009) +[2023-10-13 02:49:39,187][46662] Updated weights for policy 0, policy_version 61840 (0.0008) +[2023-10-13 02:49:39,552][46663] Updated weights for policy 1, policy_version 61781 (0.0007) +[2023-10-13 02:49:39,552][46662] Updated weights for policy 0, policy_version 61850 (0.0008) +[2023-10-13 02:49:39,917][46663] Updated weights for policy 1, policy_version 61791 (0.0007) +[2023-10-13 02:49:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 126615552. Throughput: 0: 1677.1, 1: 1671.3. Samples: 31662800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:43,607][45375] Avg episode reward: [(0, '50.610'), (1, '46.380')] +[2023-10-13 02:49:43,614][46662] Updated weights for policy 0, policy_version 61860 (0.0008) +[2023-10-13 02:49:43,988][46662] Updated weights for policy 0, policy_version 61870 (0.0007) +[2023-10-13 02:49:44,223][46663] Updated weights for policy 1, policy_version 61801 (0.0009) +[2023-10-13 02:49:44,367][46662] Updated weights for policy 0, policy_version 61880 (0.0007) +[2023-10-13 02:49:44,591][46663] Updated weights for policy 1, policy_version 61811 (0.0011) +[2023-10-13 02:49:44,961][46663] Updated weights for policy 1, policy_version 61821 (0.0008) +[2023-10-13 02:49:48,523][46662] Updated weights for policy 0, policy_version 61890 (0.0008) +[2023-10-13 02:49:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126681088. Throughput: 0: 1682.1, 1: 1669.5. Samples: 31683272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:48,607][45375] Avg episode reward: [(0, '51.190'), (1, '46.580')] +[2023-10-13 02:49:48,885][46662] Updated weights for policy 0, policy_version 61900 (0.0009) +[2023-10-13 02:49:49,135][46663] Updated weights for policy 1, policy_version 61831 (0.0008) +[2023-10-13 02:49:49,262][46662] Updated weights for policy 0, policy_version 61910 (0.0010) +[2023-10-13 02:49:49,499][46663] Updated weights for policy 1, policy_version 61841 (0.0007) +[2023-10-13 02:49:49,623][46662] Updated weights for policy 0, policy_version 61920 (0.0009) +[2023-10-13 02:49:49,872][46663] Updated weights for policy 1, policy_version 61851 (0.0008) +[2023-10-13 02:49:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126746624. Throughput: 0: 1682.8, 1: 1664.9. Samples: 31703844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:53,607][45375] Avg episode reward: [(0, '50.600'), (1, '47.420')] +[2023-10-13 02:49:53,664][46662] Updated weights for policy 0, policy_version 61930 (0.0009) +[2023-10-13 02:49:53,849][46663] Updated weights for policy 1, policy_version 61861 (0.0007) +[2023-10-13 02:49:54,037][46662] Updated weights for policy 0, policy_version 61940 (0.0007) +[2023-10-13 02:49:54,217][46663] Updated weights for policy 1, policy_version 61871 (0.0010) +[2023-10-13 02:49:54,395][46662] Updated weights for policy 0, policy_version 61950 (0.0008) +[2023-10-13 02:49:54,469][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000061952_63438848.pth... +[2023-10-13 02:49:54,498][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000060384_61833216.pth +[2023-10-13 02:49:54,581][46663] Updated weights for policy 1, policy_version 61881 (0.0008) +[2023-10-13 02:49:54,839][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000061888_63373312.pth... 
+[2023-10-13 02:49:54,877][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000060320_61767680.pth +[2023-10-13 02:49:58,603][46662] Updated weights for policy 0, policy_version 61960 (0.0007) +[2023-10-13 02:49:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126812160. Throughput: 0: 1680.1, 1: 1663.9. Samples: 31712800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:49:58,607][45375] Avg episode reward: [(0, '50.400'), (1, '48.800')] +[2023-10-13 02:49:58,736][46663] Updated weights for policy 1, policy_version 61891 (0.0008) +[2023-10-13 02:49:58,971][46662] Updated weights for policy 0, policy_version 61970 (0.0008) +[2023-10-13 02:49:59,102][46663] Updated weights for policy 1, policy_version 61901 (0.0008) +[2023-10-13 02:49:59,337][46662] Updated weights for policy 0, policy_version 61980 (0.0009) +[2023-10-13 02:49:59,469][46663] Updated weights for policy 1, policy_version 61911 (0.0008) +[2023-10-13 02:50:03,537][46662] Updated weights for policy 0, policy_version 61990 (0.0007) +[2023-10-13 02:50:03,574][46663] Updated weights for policy 1, policy_version 61921 (0.0010) +[2023-10-13 02:50:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 126877696. Throughput: 0: 1676.0, 1: 1667.2. Samples: 31733344. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:50:03,608][45375] Avg episode reward: [(0, '50.240'), (1, '48.860')] +[2023-10-13 02:50:03,901][46662] Updated weights for policy 0, policy_version 62000 (0.0007) +[2023-10-13 02:50:03,937][46663] Updated weights for policy 1, policy_version 61931 (0.0009) +[2023-10-13 02:50:04,273][46662] Updated weights for policy 0, policy_version 62010 (0.0009) +[2023-10-13 02:50:04,306][46663] Updated weights for policy 1, policy_version 61941 (0.0009) +[2023-10-13 02:50:04,669][46663] Updated weights for policy 1, policy_version 61951 (0.0008) +[2023-10-13 02:50:08,533][46662] Updated weights for policy 0, policy_version 62020 (0.0008) +[2023-10-13 02:50:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 126943232. Throughput: 0: 1670.8, 1: 1669.9. Samples: 31753740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:50:08,607][45375] Avg episode reward: [(0, '50.820'), (1, '49.270')] +[2023-10-13 02:50:08,697][46663] Updated weights for policy 1, policy_version 61961 (0.0007) +[2023-10-13 02:50:08,891][46662] Updated weights for policy 0, policy_version 62030 (0.0009) +[2023-10-13 02:50:09,065][46663] Updated weights for policy 1, policy_version 61971 (0.0007) +[2023-10-13 02:50:09,259][46662] Updated weights for policy 0, policy_version 62040 (0.0008) +[2023-10-13 02:50:09,428][46663] Updated weights for policy 1, policy_version 61981 (0.0007) +[2023-10-13 02:50:13,387][46662] Updated weights for policy 0, policy_version 62050 (0.0008) +[2023-10-13 02:50:13,546][46663] Updated weights for policy 1, policy_version 61991 (0.0007) +[2023-10-13 02:50:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127008768. Throughput: 0: 1669.5, 1: 1668.8. Samples: 31762798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:50:13,607][45375] Avg episode reward: [(0, '51.470'), (1, '49.090')] +[2023-10-13 02:50:13,799][46662] Updated weights for policy 0, policy_version 62060 (0.0008) +[2023-10-13 02:50:13,902][46663] Updated weights for policy 1, policy_version 62001 (0.0007) +[2023-10-13 02:50:14,160][46662] Updated weights for policy 0, policy_version 62070 (0.0009) +[2023-10-13 02:50:14,276][46663] Updated weights for policy 1, policy_version 62011 (0.0007) +[2023-10-13 02:50:14,527][46662] Updated weights for policy 0, policy_version 62080 (0.0009) +[2023-10-13 02:50:18,315][46663] Updated weights for policy 1, policy_version 62021 (0.0008) +[2023-10-13 02:50:18,566][46662] Updated weights for policy 0, policy_version 62090 (0.0007) +[2023-10-13 02:50:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127074304. Throughput: 0: 1666.1, 1: 1668.1. Samples: 31783220. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:50:18,607][45375] Avg episode reward: [(0, '52.860'), (1, '49.530')] +[2023-10-13 02:50:18,671][46663] Updated weights for policy 1, policy_version 62031 (0.0008) +[2023-10-13 02:50:18,924][46662] Updated weights for policy 0, policy_version 62100 (0.0008) +[2023-10-13 02:50:19,031][46663] Updated weights for policy 1, policy_version 62041 (0.0010) +[2023-10-13 02:50:19,307][46662] Updated weights for policy 0, policy_version 62110 (0.0010) +[2023-10-13 02:50:23,252][46663] Updated weights for policy 1, policy_version 62051 (0.0008) +[2023-10-13 02:50:23,491][46662] Updated weights for policy 0, policy_version 62120 (0.0007) +[2023-10-13 02:50:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127139840. Throughput: 0: 1664.6, 1: 1661.3. Samples: 31803278. 
Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:23,607][45375] Avg episode reward: [(0, '53.180'), (1, '50.420')] +[2023-10-13 02:50:23,612][46663] Updated weights for policy 1, policy_version 62061 (0.0008) +[2023-10-13 02:50:23,858][46662] Updated weights for policy 0, policy_version 62130 (0.0008) +[2023-10-13 02:50:23,983][46663] Updated weights for policy 1, policy_version 62071 (0.0009) +[2023-10-13 02:50:24,234][46662] Updated weights for policy 0, policy_version 62140 (0.0008) +[2023-10-13 02:50:28,141][46663] Updated weights for policy 1, policy_version 62081 (0.0009) +[2023-10-13 02:50:28,298][46662] Updated weights for policy 0, policy_version 62150 (0.0008) +[2023-10-13 02:50:28,509][46663] Updated weights for policy 1, policy_version 62091 (0.0008) +[2023-10-13 02:50:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127205376. Throughput: 0: 1660.8, 1: 1661.9. Samples: 31812318. Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:28,607][45375] Avg episode reward: [(0, '52.570'), (1, '53.040')] +[2023-10-13 02:50:28,665][46662] Updated weights for policy 0, policy_version 62160 (0.0008) +[2023-10-13 02:50:28,873][46663] Updated weights for policy 1, policy_version 62101 (0.0009) +[2023-10-13 02:50:29,029][46662] Updated weights for policy 0, policy_version 62170 (0.0009) +[2023-10-13 02:50:29,233][46663] Updated weights for policy 1, policy_version 62111 (0.0008) +[2023-10-13 02:50:33,094][46662] Updated weights for policy 0, policy_version 62180 (0.0008) +[2023-10-13 02:50:33,464][46662] Updated weights for policy 0, policy_version 62190 (0.0007) +[2023-10-13 02:50:33,500][46663] Updated weights for policy 1, policy_version 62121 (0.0008) +[2023-10-13 02:50:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127270912. Throughput: 0: 1659.1, 1: 1665.5. Samples: 31832878. 
Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:33,607][45375] Avg episode reward: [(0, '55.570'), (1, '51.590')] +[2023-10-13 02:50:33,827][46662] Updated weights for policy 0, policy_version 62200 (0.0008) +[2023-10-13 02:50:33,866][46663] Updated weights for policy 1, policy_version 62131 (0.0008) +[2023-10-13 02:50:34,244][46663] Updated weights for policy 1, policy_version 62141 (0.0007) +[2023-10-13 02:50:37,978][46662] Updated weights for policy 0, policy_version 62210 (0.0008) +[2023-10-13 02:50:38,246][46663] Updated weights for policy 1, policy_version 62151 (0.0009) +[2023-10-13 02:50:38,349][46662] Updated weights for policy 0, policy_version 62220 (0.0009) +[2023-10-13 02:50:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127336448. Throughput: 0: 1658.8, 1: 1653.4. Samples: 31852892. Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:38,607][45375] Avg episode reward: [(0, '54.950'), (1, '52.120')] +[2023-10-13 02:50:38,608][46663] Updated weights for policy 1, policy_version 62161 (0.0009) +[2023-10-13 02:50:38,716][46662] Updated weights for policy 0, policy_version 62230 (0.0008) +[2023-10-13 02:50:38,971][46663] Updated weights for policy 1, policy_version 62171 (0.0008) +[2023-10-13 02:50:39,087][46662] Updated weights for policy 0, policy_version 62240 (0.0008) +[2023-10-13 02:50:43,041][46663] Updated weights for policy 1, policy_version 62181 (0.0008) +[2023-10-13 02:50:43,158][46662] Updated weights for policy 0, policy_version 62250 (0.0011) +[2023-10-13 02:50:43,397][46663] Updated weights for policy 1, policy_version 62191 (0.0007) +[2023-10-13 02:50:43,520][46662] Updated weights for policy 0, policy_version 62260 (0.0008) +[2023-10-13 02:50:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127401984. Throughput: 0: 1662.5, 1: 1661.7. Samples: 31862388. 
Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:43,607][45375] Avg episode reward: [(0, '55.250'), (1, '51.100')] +[2023-10-13 02:50:43,771][46663] Updated weights for policy 1, policy_version 62201 (0.0008) +[2023-10-13 02:50:43,885][46662] Updated weights for policy 0, policy_version 62270 (0.0008) +[2023-10-13 02:50:47,984][46662] Updated weights for policy 0, policy_version 62280 (0.0008) +[2023-10-13 02:50:48,035][46663] Updated weights for policy 1, policy_version 62211 (0.0010) +[2023-10-13 02:50:48,351][46662] Updated weights for policy 0, policy_version 62290 (0.0008) +[2023-10-13 02:50:48,399][46663] Updated weights for policy 1, policy_version 62221 (0.0009) +[2023-10-13 02:50:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127467520. Throughput: 0: 1666.4, 1: 1658.7. Samples: 31882974. Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:48,607][45375] Avg episode reward: [(0, '52.840'), (1, '50.170')] +[2023-10-13 02:50:48,719][46662] Updated weights for policy 0, policy_version 62300 (0.0008) +[2023-10-13 02:50:48,767][46663] Updated weights for policy 1, policy_version 62231 (0.0010) +[2023-10-13 02:50:52,700][46662] Updated weights for policy 0, policy_version 62310 (0.0009) +[2023-10-13 02:50:52,944][46663] Updated weights for policy 1, policy_version 62241 (0.0008) +[2023-10-13 02:50:53,077][46662] Updated weights for policy 0, policy_version 62320 (0.0008) +[2023-10-13 02:50:53,309][46663] Updated weights for policy 1, policy_version 62251 (0.0008) +[2023-10-13 02:50:53,444][46662] Updated weights for policy 0, policy_version 62330 (0.0008) +[2023-10-13 02:50:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127533056. Throughput: 0: 1667.0, 1: 1647.6. Samples: 31902896. 
Policy #0 lag: (min: 4.0, avg: 17.0, max: 36.0) +[2023-10-13 02:50:53,607][45375] Avg episode reward: [(0, '52.260'), (1, '49.880')] +[2023-10-13 02:50:53,668][46663] Updated weights for policy 1, policy_version 62261 (0.0007) +[2023-10-13 02:50:54,032][46663] Updated weights for policy 1, policy_version 62271 (0.0008) +[2023-10-13 02:50:57,487][46662] Updated weights for policy 0, policy_version 62340 (0.0008) +[2023-10-13 02:50:57,862][46662] Updated weights for policy 0, policy_version 62350 (0.0008) +[2023-10-13 02:50:58,165][46663] Updated weights for policy 1, policy_version 62281 (0.0008) +[2023-10-13 02:50:58,229][46662] Updated weights for policy 0, policy_version 62360 (0.0009) +[2023-10-13 02:50:58,533][46663] Updated weights for policy 1, policy_version 62291 (0.0007) +[2023-10-13 02:50:58,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127631360. Throughput: 0: 1677.9, 1: 1654.8. Samples: 31912768. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:50:58,607][45375] Avg episode reward: [(0, '52.710'), (1, '50.400')] +[2023-10-13 02:50:58,898][46663] Updated weights for policy 1, policy_version 62301 (0.0011) +[2023-10-13 02:51:02,169][46662] Updated weights for policy 0, policy_version 62370 (0.0009) +[2023-10-13 02:51:02,571][46662] Updated weights for policy 0, policy_version 62380 (0.0009) +[2023-10-13 02:51:02,901][46663] Updated weights for policy 1, policy_version 62311 (0.0008) +[2023-10-13 02:51:02,955][46662] Updated weights for policy 0, policy_version 62390 (0.0007) +[2023-10-13 02:51:03,263][46663] Updated weights for policy 1, policy_version 62321 (0.0008) +[2023-10-13 02:51:03,313][46662] Updated weights for policy 0, policy_version 62400 (0.0009) +[2023-10-13 02:51:03,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 127696896. Throughput: 0: 1680.7, 1: 1656.1. Samples: 31933378. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:03,607][45375] Avg episode reward: [(0, '53.140'), (1, '50.840')] +[2023-10-13 02:51:03,636][46663] Updated weights for policy 1, policy_version 62331 (0.0009) +[2023-10-13 02:51:07,374][46662] Updated weights for policy 0, policy_version 62410 (0.0009) +[2023-10-13 02:51:07,685][46663] Updated weights for policy 1, policy_version 62341 (0.0008) +[2023-10-13 02:51:07,740][46662] Updated weights for policy 0, policy_version 62420 (0.0008) +[2023-10-13 02:51:08,048][46663] Updated weights for policy 1, policy_version 62351 (0.0009) +[2023-10-13 02:51:08,103][46662] Updated weights for policy 0, policy_version 62430 (0.0009) +[2023-10-13 02:51:08,425][46663] Updated weights for policy 1, policy_version 62361 (0.0007) +[2023-10-13 02:51:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 127762432. Throughput: 0: 1665.1, 1: 1647.7. Samples: 31952352. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:08,607][45375] Avg episode reward: [(0, '54.190'), (1, '50.470')] +[2023-10-13 02:51:12,132][46662] Updated weights for policy 0, policy_version 62440 (0.0009) +[2023-10-13 02:51:12,500][46662] Updated weights for policy 0, policy_version 62450 (0.0009) +[2023-10-13 02:51:12,538][46663] Updated weights for policy 1, policy_version 62371 (0.0008) +[2023-10-13 02:51:12,859][46662] Updated weights for policy 0, policy_version 62460 (0.0008) +[2023-10-13 02:51:12,901][46663] Updated weights for policy 1, policy_version 62381 (0.0009) +[2023-10-13 02:51:13,275][46663] Updated weights for policy 1, policy_version 62391 (0.0010) +[2023-10-13 02:51:13,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 127860736. Throughput: 0: 1685.2, 1: 1667.7. Samples: 31963200. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:13,607][45375] Avg episode reward: [(0, '54.840'), (1, '50.270')] +[2023-10-13 02:51:16,906][46662] Updated weights for policy 0, policy_version 62470 (0.0008) +[2023-10-13 02:51:17,248][46663] Updated weights for policy 1, policy_version 62401 (0.0009) +[2023-10-13 02:51:17,273][46662] Updated weights for policy 0, policy_version 62480 (0.0008) +[2023-10-13 02:51:17,653][46662] Updated weights for policy 0, policy_version 62490 (0.0007) +[2023-10-13 02:51:17,682][46663] Updated weights for policy 1, policy_version 62411 (0.0008) +[2023-10-13 02:51:18,044][46663] Updated weights for policy 1, policy_version 62421 (0.0008) +[2023-10-13 02:51:18,414][46663] Updated weights for policy 1, policy_version 62431 (0.0011) +[2023-10-13 02:51:18,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 127926272. Throughput: 0: 1686.8, 1: 1664.4. Samples: 31983680. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:18,607][45375] Avg episode reward: [(0, '54.680'), (1, '51.260')] +[2023-10-13 02:51:21,582][46662] Updated weights for policy 0, policy_version 62500 (0.0008) +[2023-10-13 02:51:21,955][46662] Updated weights for policy 0, policy_version 62510 (0.0008) +[2023-10-13 02:51:22,317][46662] Updated weights for policy 0, policy_version 62520 (0.0008) +[2023-10-13 02:51:22,551][46663] Updated weights for policy 1, policy_version 62441 (0.0009) +[2023-10-13 02:51:22,915][46663] Updated weights for policy 1, policy_version 62451 (0.0010) +[2023-10-13 02:51:23,286][46663] Updated weights for policy 1, policy_version 62461 (0.0008) +[2023-10-13 02:51:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 127991808. Throughput: 0: 1665.2, 1: 1654.1. Samples: 32002262. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:23,607][45375] Avg episode reward: [(0, '54.970'), (1, '51.690')] +[2023-10-13 02:51:26,313][46662] Updated weights for policy 0, policy_version 62530 (0.0007) +[2023-10-13 02:51:26,679][46662] Updated weights for policy 0, policy_version 62540 (0.0010) +[2023-10-13 02:51:27,056][46662] Updated weights for policy 0, policy_version 62550 (0.0008) +[2023-10-13 02:51:27,425][46662] Updated weights for policy 0, policy_version 62560 (0.0010) +[2023-10-13 02:51:27,523][46663] Updated weights for policy 1, policy_version 62471 (0.0008) +[2023-10-13 02:51:27,891][46663] Updated weights for policy 1, policy_version 62481 (0.0008) +[2023-10-13 02:51:28,249][46663] Updated weights for policy 1, policy_version 62491 (0.0009) +[2023-10-13 02:51:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 128057344. Throughput: 0: 1700.5, 1: 1667.4. Samples: 32013944. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:28,607][45375] Avg episode reward: [(0, '56.680'), (1, '50.960')] +[2023-10-13 02:51:31,485][46662] Updated weights for policy 0, policy_version 62570 (0.0009) +[2023-10-13 02:51:31,844][46662] Updated weights for policy 0, policy_version 62580 (0.0011) +[2023-10-13 02:51:32,216][46662] Updated weights for policy 0, policy_version 62590 (0.0009) +[2023-10-13 02:51:32,387][46663] Updated weights for policy 1, policy_version 62501 (0.0009) +[2023-10-13 02:51:32,750][46663] Updated weights for policy 1, policy_version 62511 (0.0008) +[2023-10-13 02:51:33,119][46663] Updated weights for policy 1, policy_version 62521 (0.0007) +[2023-10-13 02:51:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13329.3). Total num frames: 128122880. Throughput: 0: 1680.8, 1: 1671.0. Samples: 32033808. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:33,608][45375] Avg episode reward: [(0, '55.750'), (1, '50.650')] +[2023-10-13 02:51:36,278][46662] Updated weights for policy 0, policy_version 62600 (0.0010) +[2023-10-13 02:51:36,647][46662] Updated weights for policy 0, policy_version 62610 (0.0010) +[2023-10-13 02:51:37,007][46662] Updated weights for policy 0, policy_version 62620 (0.0011) +[2023-10-13 02:51:37,027][46663] Updated weights for policy 1, policy_version 62531 (0.0007) +[2023-10-13 02:51:37,392][46663] Updated weights for policy 1, policy_version 62541 (0.0009) +[2023-10-13 02:51:37,775][46663] Updated weights for policy 1, policy_version 62551 (0.0008) +[2023-10-13 02:51:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 128188416. Throughput: 0: 1674.6, 1: 1661.8. Samples: 32053034. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) +[2023-10-13 02:51:38,607][45375] Avg episode reward: [(0, '55.970'), (1, '49.590')] +[2023-10-13 02:51:41,143][46662] Updated weights for policy 0, policy_version 62630 (0.0007) +[2023-10-13 02:51:41,512][46662] Updated weights for policy 0, policy_version 62640 (0.0007) +[2023-10-13 02:51:41,747][46663] Updated weights for policy 1, policy_version 62561 (0.0010) +[2023-10-13 02:51:41,868][46662] Updated weights for policy 0, policy_version 62650 (0.0008) +[2023-10-13 02:51:42,121][46663] Updated weights for policy 1, policy_version 62571 (0.0007) +[2023-10-13 02:51:42,480][46663] Updated weights for policy 1, policy_version 62581 (0.0010) +[2023-10-13 02:51:42,855][46663] Updated weights for policy 1, policy_version 62591 (0.0009) +[2023-10-13 02:51:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 128253952. Throughput: 0: 1696.8, 1: 1684.1. Samples: 32064910. 
Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:51:43,607][45375] Avg episode reward: [(0, '55.980'), (1, '50.060')] +[2023-10-13 02:51:45,899][46662] Updated weights for policy 0, policy_version 62660 (0.0010) +[2023-10-13 02:51:46,276][46662] Updated weights for policy 0, policy_version 62670 (0.0007) +[2023-10-13 02:51:46,639][46662] Updated weights for policy 0, policy_version 62680 (0.0009) +[2023-10-13 02:51:46,855][46663] Updated weights for policy 1, policy_version 62601 (0.0008) +[2023-10-13 02:51:47,223][46663] Updated weights for policy 1, policy_version 62611 (0.0008) +[2023-10-13 02:51:47,585][46663] Updated weights for policy 1, policy_version 62621 (0.0010) +[2023-10-13 02:51:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 128319488. Throughput: 0: 1673.5, 1: 1672.2. Samples: 32083936. Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:51:48,607][45375] Avg episode reward: [(0, '55.250'), (1, '49.340')] +[2023-10-13 02:51:50,773][46662] Updated weights for policy 0, policy_version 62690 (0.0008) +[2023-10-13 02:51:51,163][46662] Updated weights for policy 0, policy_version 62700 (0.0009) +[2023-10-13 02:51:51,521][46662] Updated weights for policy 0, policy_version 62710 (0.0009) +[2023-10-13 02:51:51,628][46663] Updated weights for policy 1, policy_version 62631 (0.0008) +[2023-10-13 02:51:51,891][46662] Updated weights for policy 0, policy_version 62720 (0.0008) +[2023-10-13 02:51:51,995][46663] Updated weights for policy 1, policy_version 62641 (0.0008) +[2023-10-13 02:51:52,365][46663] Updated weights for policy 1, policy_version 62651 (0.0008) +[2023-10-13 02:51:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13329.3). Total num frames: 128385024. Throughput: 0: 1685.5, 1: 1684.2. Samples: 32103990. 
Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:51:53,608][45375] Avg episode reward: [(0, '54.190'), (1, '51.540')] +[2023-10-13 02:51:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000062656_64159744.pth... +[2023-10-13 02:51:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000062720_64225280.pth... +[2023-10-13 02:51:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000061152_62619648.pth +[2023-10-13 02:51:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000061088_62554112.pth +[2023-10-13 02:51:53,660][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000062720_64225280.pth +[2023-10-13 02:51:53,661][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000062656_64159744.pth +[2023-10-13 02:51:55,985][46662] Updated weights for policy 0, policy_version 62730 (0.0008) +[2023-10-13 02:51:56,322][46663] Updated weights for policy 1, policy_version 62661 (0.0009) +[2023-10-13 02:51:56,358][46662] Updated weights for policy 0, policy_version 62740 (0.0009) +[2023-10-13 02:51:56,697][46663] Updated weights for policy 1, policy_version 62671 (0.0009) +[2023-10-13 02:51:56,732][46662] Updated weights for policy 0, policy_version 62750 (0.0009) +[2023-10-13 02:51:57,064][46663] Updated weights for policy 1, policy_version 62681 (0.0008) +[2023-10-13 02:51:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 128450560. Throughput: 0: 1694.8, 1: 1688.7. Samples: 32115456. 
Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:51:58,607][45375] Avg episode reward: [(0, '56.190'), (1, '50.660')] +[2023-10-13 02:52:00,671][46662] Updated weights for policy 0, policy_version 62760 (0.0008) +[2023-10-13 02:52:01,046][46662] Updated weights for policy 0, policy_version 62770 (0.0007) +[2023-10-13 02:52:01,089][46663] Updated weights for policy 1, policy_version 62691 (0.0010) +[2023-10-13 02:52:01,403][46662] Updated weights for policy 0, policy_version 62780 (0.0008) +[2023-10-13 02:52:01,454][46663] Updated weights for policy 1, policy_version 62701 (0.0008) +[2023-10-13 02:52:01,819][46663] Updated weights for policy 1, policy_version 62711 (0.0009) +[2023-10-13 02:52:03,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 128516096. Throughput: 0: 1671.2, 1: 1672.5. Samples: 32134146. Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:52:03,607][45375] Avg episode reward: [(0, '55.570'), (1, '50.040')] +[2023-10-13 02:52:05,412][46662] Updated weights for policy 0, policy_version 62790 (0.0007) +[2023-10-13 02:52:05,774][46662] Updated weights for policy 0, policy_version 62800 (0.0008) +[2023-10-13 02:52:05,857][46663] Updated weights for policy 1, policy_version 62721 (0.0009) +[2023-10-13 02:52:06,148][46662] Updated weights for policy 0, policy_version 62810 (0.0008) +[2023-10-13 02:52:06,269][46663] Updated weights for policy 1, policy_version 62731 (0.0010) +[2023-10-13 02:52:06,646][46663] Updated weights for policy 1, policy_version 62741 (0.0010) +[2023-10-13 02:52:07,012][46663] Updated weights for policy 1, policy_version 62751 (0.0009) +[2023-10-13 02:52:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 128581632. Throughput: 0: 1696.2, 1: 1695.0. Samples: 32154866. 
Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:52:08,607][45375] Avg episode reward: [(0, '54.240'), (1, '49.460')] +[2023-10-13 02:52:10,256][46662] Updated weights for policy 0, policy_version 62820 (0.0009) +[2023-10-13 02:52:10,625][46662] Updated weights for policy 0, policy_version 62830 (0.0011) +[2023-10-13 02:52:11,000][46662] Updated weights for policy 0, policy_version 62840 (0.0010) +[2023-10-13 02:52:11,152][46663] Updated weights for policy 1, policy_version 62761 (0.0008) +[2023-10-13 02:52:11,515][46663] Updated weights for policy 1, policy_version 62771 (0.0008) +[2023-10-13 02:52:11,888][46663] Updated weights for policy 1, policy_version 62781 (0.0012) +[2023-10-13 02:52:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128647168. Throughput: 0: 1673.7, 1: 1688.2. Samples: 32165228. Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:52:13,607][45375] Avg episode reward: [(0, '52.870'), (1, '49.290')] +[2023-10-13 02:52:15,023][46662] Updated weights for policy 0, policy_version 62850 (0.0010) +[2023-10-13 02:52:15,389][46662] Updated weights for policy 0, policy_version 62860 (0.0007) +[2023-10-13 02:52:15,758][46662] Updated weights for policy 0, policy_version 62870 (0.0010) +[2023-10-13 02:52:16,035][46663] Updated weights for policy 1, policy_version 62791 (0.0009) +[2023-10-13 02:52:16,125][46662] Updated weights for policy 0, policy_version 62880 (0.0008) +[2023-10-13 02:52:16,401][46663] Updated weights for policy 1, policy_version 62801 (0.0008) +[2023-10-13 02:52:16,769][46663] Updated weights for policy 1, policy_version 62811 (0.0007) +[2023-10-13 02:52:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128712704. Throughput: 0: 1681.0, 1: 1670.4. Samples: 32184620. 
Policy #0 lag: (min: 6.0, avg: 6.7, max: 24.0) +[2023-10-13 02:52:18,607][45375] Avg episode reward: [(0, '51.370'), (1, '47.220')] +[2023-10-13 02:52:20,240][46662] Updated weights for policy 0, policy_version 62890 (0.0012) +[2023-10-13 02:52:20,599][46662] Updated weights for policy 0, policy_version 62900 (0.0007) +[2023-10-13 02:52:20,913][46663] Updated weights for policy 1, policy_version 62821 (0.0007) +[2023-10-13 02:52:20,963][46662] Updated weights for policy 0, policy_version 62910 (0.0010) +[2023-10-13 02:52:21,278][46663] Updated weights for policy 1, policy_version 62831 (0.0008) +[2023-10-13 02:52:21,641][46663] Updated weights for policy 1, policy_version 62841 (0.0008) +[2023-10-13 02:52:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 128778240. Throughput: 0: 1692.3, 1: 1684.5. Samples: 32204992. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:23,608][45375] Avg episode reward: [(0, '51.110'), (1, '47.430')] +[2023-10-13 02:52:25,069][46662] Updated weights for policy 0, policy_version 62920 (0.0011) +[2023-10-13 02:52:25,440][46662] Updated weights for policy 0, policy_version 62930 (0.0010) +[2023-10-13 02:52:25,791][46663] Updated weights for policy 1, policy_version 62851 (0.0008) +[2023-10-13 02:52:25,801][46662] Updated weights for policy 0, policy_version 62940 (0.0008) +[2023-10-13 02:52:26,168][46663] Updated weights for policy 1, policy_version 62861 (0.0010) +[2023-10-13 02:52:26,527][46663] Updated weights for policy 1, policy_version 62871 (0.0008) +[2023-10-13 02:52:28,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128843776. Throughput: 0: 1663.5, 1: 1667.6. Samples: 32214808. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:28,607][45375] Avg episode reward: [(0, '51.930'), (1, '47.870')] +[2023-10-13 02:52:29,767][46662] Updated weights for policy 0, policy_version 62950 (0.0008) +[2023-10-13 02:52:30,127][46662] Updated weights for policy 0, policy_version 62960 (0.0008) +[2023-10-13 02:52:30,397][46663] Updated weights for policy 1, policy_version 62881 (0.0007) +[2023-10-13 02:52:30,492][46662] Updated weights for policy 0, policy_version 62970 (0.0008) +[2023-10-13 02:52:30,759][46663] Updated weights for policy 1, policy_version 62891 (0.0007) +[2023-10-13 02:52:31,135][46663] Updated weights for policy 1, policy_version 62901 (0.0007) +[2023-10-13 02:52:31,510][46663] Updated weights for policy 1, policy_version 62911 (0.0008) +[2023-10-13 02:52:33,606][45375] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 128909312. Throughput: 0: 1684.4, 1: 1668.8. Samples: 32234828. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:33,607][45375] Avg episode reward: [(0, '51.720'), (1, '46.220')] +[2023-10-13 02:52:34,585][46662] Updated weights for policy 0, policy_version 62980 (0.0008) +[2023-10-13 02:52:34,953][46662] Updated weights for policy 0, policy_version 62990 (0.0007) +[2023-10-13 02:52:35,317][46662] Updated weights for policy 0, policy_version 63000 (0.0008) +[2023-10-13 02:52:35,687][46663] Updated weights for policy 1, policy_version 62921 (0.0009) +[2023-10-13 02:52:36,043][46663] Updated weights for policy 1, policy_version 62931 (0.0010) +[2023-10-13 02:52:36,411][46663] Updated weights for policy 1, policy_version 62941 (0.0009) +[2023-10-13 02:52:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128974848. Throughput: 0: 1694.4, 1: 1676.1. Samples: 32255660. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:38,607][45375] Avg episode reward: [(0, '51.130'), (1, '47.350')] +[2023-10-13 02:52:39,433][46662] Updated weights for policy 0, policy_version 63010 (0.0008) +[2023-10-13 02:52:39,839][46662] Updated weights for policy 0, policy_version 63020 (0.0008) +[2023-10-13 02:52:40,202][46662] Updated weights for policy 0, policy_version 63030 (0.0007) +[2023-10-13 02:52:40,343][46663] Updated weights for policy 1, policy_version 62951 (0.0008) +[2023-10-13 02:52:40,567][46662] Updated weights for policy 0, policy_version 63040 (0.0007) +[2023-10-13 02:52:40,710][46663] Updated weights for policy 1, policy_version 62961 (0.0009) +[2023-10-13 02:52:41,073][46663] Updated weights for policy 1, policy_version 62971 (0.0010) +[2023-10-13 02:52:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129040384. Throughput: 0: 1661.2, 1: 1649.8. Samples: 32264454. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:43,608][45375] Avg episode reward: [(0, '51.670'), (1, '48.100')] +[2023-10-13 02:52:44,612][46662] Updated weights for policy 0, policy_version 63050 (0.0007) +[2023-10-13 02:52:44,979][46662] Updated weights for policy 0, policy_version 63060 (0.0009) +[2023-10-13 02:52:45,205][46663] Updated weights for policy 1, policy_version 62981 (0.0010) +[2023-10-13 02:52:45,347][46662] Updated weights for policy 0, policy_version 63070 (0.0008) +[2023-10-13 02:52:45,562][46663] Updated weights for policy 1, policy_version 62991 (0.0008) +[2023-10-13 02:52:45,929][46663] Updated weights for policy 1, policy_version 63001 (0.0007) +[2023-10-13 02:52:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129105920. Throughput: 0: 1685.2, 1: 1670.0. Samples: 32285126. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:48,607][45375] Avg episode reward: [(0, '51.040'), (1, '48.590')] +[2023-10-13 02:52:49,577][46662] Updated weights for policy 0, policy_version 63080 (0.0008) +[2023-10-13 02:52:49,951][46662] Updated weights for policy 0, policy_version 63090 (0.0009) +[2023-10-13 02:52:50,126][46663] Updated weights for policy 1, policy_version 63011 (0.0009) +[2023-10-13 02:52:50,315][46662] Updated weights for policy 0, policy_version 63100 (0.0008) +[2023-10-13 02:52:50,495][46663] Updated weights for policy 1, policy_version 63021 (0.0008) +[2023-10-13 02:52:50,855][46663] Updated weights for policy 1, policy_version 63031 (0.0009) +[2023-10-13 02:52:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 129171456. Throughput: 0: 1678.6, 1: 1669.2. Samples: 32305516. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:53,607][45375] Avg episode reward: [(0, '52.080'), (1, '49.070')] +[2023-10-13 02:52:54,462][46662] Updated weights for policy 0, policy_version 63110 (0.0009) +[2023-10-13 02:52:54,836][46662] Updated weights for policy 0, policy_version 63120 (0.0008) +[2023-10-13 02:52:55,110][46663] Updated weights for policy 1, policy_version 63041 (0.0008) +[2023-10-13 02:52:55,210][46662] Updated weights for policy 0, policy_version 63130 (0.0007) +[2023-10-13 02:52:55,515][46663] Updated weights for policy 1, policy_version 63051 (0.0009) +[2023-10-13 02:52:55,879][46663] Updated weights for policy 1, policy_version 63061 (0.0012) +[2023-10-13 02:52:56,242][46663] Updated weights for policy 1, policy_version 63071 (0.0010) +[2023-10-13 02:52:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129236992. Throughput: 0: 1667.5, 1: 1652.2. Samples: 32314614. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:52:58,607][45375] Avg episode reward: [(0, '53.340'), (1, '48.510')] +[2023-10-13 02:52:59,196][46662] Updated weights for policy 0, policy_version 63140 (0.0008) +[2023-10-13 02:52:59,560][46662] Updated weights for policy 0, policy_version 63150 (0.0007) +[2023-10-13 02:52:59,924][46662] Updated weights for policy 0, policy_version 63160 (0.0009) +[2023-10-13 02:53:00,464][46663] Updated weights for policy 1, policy_version 63081 (0.0009) +[2023-10-13 02:53:00,827][46663] Updated weights for policy 1, policy_version 63091 (0.0009) +[2023-10-13 02:53:01,185][46663] Updated weights for policy 1, policy_version 63101 (0.0010) +[2023-10-13 02:53:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 129302528. Throughput: 0: 1679.1, 1: 1670.8. Samples: 32335368. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-13 02:53:03,608][45375] Avg episode reward: [(0, '55.110'), (1, '49.400')] +[2023-10-13 02:53:03,933][46662] Updated weights for policy 0, policy_version 63170 (0.0008) +[2023-10-13 02:53:04,310][46662] Updated weights for policy 0, policy_version 63180 (0.0008) +[2023-10-13 02:53:04,672][46662] Updated weights for policy 0, policy_version 63190 (0.0009) +[2023-10-13 02:53:05,041][46662] Updated weights for policy 0, policy_version 63200 (0.0009) +[2023-10-13 02:53:05,105][46663] Updated weights for policy 1, policy_version 63111 (0.0009) +[2023-10-13 02:53:05,478][46663] Updated weights for policy 1, policy_version 63121 (0.0010) +[2023-10-13 02:53:05,847][46663] Updated weights for policy 1, policy_version 63131 (0.0009) +[2023-10-13 02:53:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129368064. Throughput: 0: 1678.8, 1: 1678.5. Samples: 32356066. 
Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:08,607][45375] Avg episode reward: [(0, '54.240'), (1, '49.530')] +[2023-10-13 02:53:09,120][46662] Updated weights for policy 0, policy_version 63210 (0.0008) +[2023-10-13 02:53:09,489][46662] Updated weights for policy 0, policy_version 63220 (0.0007) +[2023-10-13 02:53:09,855][46662] Updated weights for policy 0, policy_version 63230 (0.0007) +[2023-10-13 02:53:09,903][46663] Updated weights for policy 1, policy_version 63141 (0.0007) +[2023-10-13 02:53:10,278][46663] Updated weights for policy 1, policy_version 63151 (0.0008) +[2023-10-13 02:53:10,648][46663] Updated weights for policy 1, policy_version 63161 (0.0008) +[2023-10-13 02:53:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129433600. Throughput: 0: 1675.8, 1: 1663.6. Samples: 32365082. Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:13,607][45375] Avg episode reward: [(0, '55.380'), (1, '49.660')] +[2023-10-13 02:53:13,917][46662] Updated weights for policy 0, policy_version 63240 (0.0009) +[2023-10-13 02:53:14,303][46662] Updated weights for policy 0, policy_version 63250 (0.0010) +[2023-10-13 02:53:14,614][46663] Updated weights for policy 1, policy_version 63171 (0.0009) +[2023-10-13 02:53:14,664][46662] Updated weights for policy 0, policy_version 63260 (0.0008) +[2023-10-13 02:53:14,974][46663] Updated weights for policy 1, policy_version 63181 (0.0011) +[2023-10-13 02:53:15,352][46663] Updated weights for policy 1, policy_version 63191 (0.0009) +[2023-10-13 02:53:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129499136. Throughput: 0: 1676.9, 1: 1676.8. Samples: 32385742. 
Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:18,607][45375] Avg episode reward: [(0, '56.120'), (1, '48.980')] +[2023-10-13 02:53:18,703][46662] Updated weights for policy 0, policy_version 63270 (0.0008) +[2023-10-13 02:53:19,068][46662] Updated weights for policy 0, policy_version 63280 (0.0007) +[2023-10-13 02:53:19,357][46663] Updated weights for policy 1, policy_version 63201 (0.0009) +[2023-10-13 02:53:19,430][46662] Updated weights for policy 0, policy_version 63290 (0.0007) +[2023-10-13 02:53:19,725][46663] Updated weights for policy 1, policy_version 63211 (0.0007) +[2023-10-13 02:53:20,097][46663] Updated weights for policy 1, policy_version 63221 (0.0009) +[2023-10-13 02:53:20,466][46663] Updated weights for policy 1, policy_version 63231 (0.0007) +[2023-10-13 02:53:23,500][46662] Updated weights for policy 0, policy_version 63300 (0.0008) +[2023-10-13 02:53:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 129564672. Throughput: 0: 1672.9, 1: 1676.3. Samples: 32406374. Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:23,608][45375] Avg episode reward: [(0, '56.960'), (1, '50.010')] +[2023-10-13 02:53:23,872][46662] Updated weights for policy 0, policy_version 63310 (0.0008) +[2023-10-13 02:53:24,241][46662] Updated weights for policy 0, policy_version 63320 (0.0009) +[2023-10-13 02:53:24,558][46663] Updated weights for policy 1, policy_version 63241 (0.0008) +[2023-10-13 02:53:24,923][46663] Updated weights for policy 1, policy_version 63251 (0.0007) +[2023-10-13 02:53:25,291][46663] Updated weights for policy 1, policy_version 63261 (0.0009) +[2023-10-13 02:53:28,396][46662] Updated weights for policy 0, policy_version 63330 (0.0007) +[2023-10-13 02:53:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 129630208. Throughput: 0: 1679.1, 1: 1676.8. Samples: 32415468. 
Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:28,607][45375] Avg episode reward: [(0, '57.770'), (1, '50.380')] +[2023-10-13 02:53:28,797][46662] Updated weights for policy 0, policy_version 63340 (0.0008) +[2023-10-13 02:53:29,159][46662] Updated weights for policy 0, policy_version 63350 (0.0009) +[2023-10-13 02:53:29,509][46663] Updated weights for policy 1, policy_version 63271 (0.0010) +[2023-10-13 02:53:29,527][46662] Updated weights for policy 0, policy_version 63360 (0.0008) +[2023-10-13 02:53:29,881][46663] Updated weights for policy 1, policy_version 63281 (0.0010) +[2023-10-13 02:53:30,265][46663] Updated weights for policy 1, policy_version 63291 (0.0011) +[2023-10-13 02:53:33,504][46662] Updated weights for policy 0, policy_version 63370 (0.0008) +[2023-10-13 02:53:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129695744. Throughput: 0: 1676.0, 1: 1671.9. Samples: 32435782. Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:33,607][45375] Avg episode reward: [(0, '57.790'), (1, '50.880')] +[2023-10-13 02:53:33,873][46662] Updated weights for policy 0, policy_version 63380 (0.0008) +[2023-10-13 02:53:34,245][46662] Updated weights for policy 0, policy_version 63390 (0.0009) +[2023-10-13 02:53:34,265][46663] Updated weights for policy 1, policy_version 63301 (0.0008) +[2023-10-13 02:53:34,629][46663] Updated weights for policy 1, policy_version 63311 (0.0007) +[2023-10-13 02:53:34,988][46663] Updated weights for policy 1, policy_version 63321 (0.0008) +[2023-10-13 02:53:38,363][46662] Updated weights for policy 0, policy_version 63400 (0.0008) +[2023-10-13 02:53:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129761280. Throughput: 0: 1679.5, 1: 1671.9. Samples: 32456328. 
Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:38,607][45375] Avg episode reward: [(0, '57.330'), (1, '52.270')] +[2023-10-13 02:53:38,739][46662] Updated weights for policy 0, policy_version 63410 (0.0008) +[2023-10-13 02:53:39,113][46662] Updated weights for policy 0, policy_version 63420 (0.0010) +[2023-10-13 02:53:39,233][46663] Updated weights for policy 1, policy_version 63331 (0.0010) +[2023-10-13 02:53:39,602][46663] Updated weights for policy 1, policy_version 63341 (0.0008) +[2023-10-13 02:53:39,967][46663] Updated weights for policy 1, policy_version 63351 (0.0008) +[2023-10-13 02:53:43,284][46662] Updated weights for policy 0, policy_version 63430 (0.0010) +[2023-10-13 02:53:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129826816. Throughput: 0: 1679.6, 1: 1670.0. Samples: 32465342. Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:43,607][45375] Avg episode reward: [(0, '57.420'), (1, '54.520')] +[2023-10-13 02:53:43,648][46662] Updated weights for policy 0, policy_version 63440 (0.0007) +[2023-10-13 02:53:44,017][46662] Updated weights for policy 0, policy_version 63450 (0.0007) +[2023-10-13 02:53:44,262][46663] Updated weights for policy 1, policy_version 63361 (0.0009) +[2023-10-13 02:53:44,690][46663] Updated weights for policy 1, policy_version 63371 (0.0008) +[2023-10-13 02:53:45,059][46663] Updated weights for policy 1, policy_version 63381 (0.0008) +[2023-10-13 02:53:45,427][46663] Updated weights for policy 1, policy_version 63391 (0.0009) +[2023-10-13 02:53:48,065][46662] Updated weights for policy 0, policy_version 63460 (0.0008) +[2023-10-13 02:53:48,441][46662] Updated weights for policy 0, policy_version 63470 (0.0010) +[2023-10-13 02:53:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129892352. Throughput: 0: 1682.7, 1: 1664.9. Samples: 32486010. 
Policy #0 lag: (min: 12.0, avg: 26.8, max: 44.0) +[2023-10-13 02:53:48,607][45375] Avg episode reward: [(0, '57.290'), (1, '53.380')] +[2023-10-13 02:53:48,808][46662] Updated weights for policy 0, policy_version 63480 (0.0007) +[2023-10-13 02:53:49,650][46663] Updated weights for policy 1, policy_version 63401 (0.0010) +[2023-10-13 02:53:50,030][46663] Updated weights for policy 1, policy_version 63411 (0.0008) +[2023-10-13 02:53:50,392][46663] Updated weights for policy 1, policy_version 63421 (0.0008) +[2023-10-13 02:53:52,815][46662] Updated weights for policy 0, policy_version 63490 (0.0008) +[2023-10-13 02:53:53,187][46662] Updated weights for policy 0, policy_version 63500 (0.0009) +[2023-10-13 02:53:53,561][46662] Updated weights for policy 0, policy_version 63510 (0.0008) +[2023-10-13 02:53:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129957888. Throughput: 0: 1680.8, 1: 1665.4. Samples: 32506648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:53:53,607][45375] Avg episode reward: [(0, '58.460'), (1, '52.920')] +[2023-10-13 02:53:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000063424_64946176.pth... +[2023-10-13 02:53:53,647][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000061888_63373312.pth +[2023-10-13 02:53:53,930][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000063520_65044480.pth... 
+[2023-10-13 02:53:53,935][46662] Updated weights for policy 0, policy_version 63520 (0.0008) +[2023-10-13 02:53:53,971][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000061952_63438848.pth +[2023-10-13 02:53:54,339][46663] Updated weights for policy 1, policy_version 63431 (0.0007) +[2023-10-13 02:53:54,701][46663] Updated weights for policy 1, policy_version 63441 (0.0008) +[2023-10-13 02:53:55,078][46663] Updated weights for policy 1, policy_version 63451 (0.0008) +[2023-10-13 02:53:58,049][46662] Updated weights for policy 0, policy_version 63530 (0.0009) +[2023-10-13 02:53:58,420][46662] Updated weights for policy 0, policy_version 63540 (0.0009) +[2023-10-13 02:53:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130023424. Throughput: 0: 1684.4, 1: 1668.0. Samples: 32515942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:53:58,607][45375] Avg episode reward: [(0, '59.670'), (1, '52.460')] +[2023-10-13 02:53:58,792][46662] Updated weights for policy 0, policy_version 63550 (0.0009) +[2023-10-13 02:53:58,857][46091] Saving new best policy, reward=59.670! +[2023-10-13 02:53:59,291][46663] Updated weights for policy 1, policy_version 63461 (0.0008) +[2023-10-13 02:53:59,656][46663] Updated weights for policy 1, policy_version 63471 (0.0007) +[2023-10-13 02:54:00,039][46663] Updated weights for policy 1, policy_version 63481 (0.0010) +[2023-10-13 02:54:02,833][46662] Updated weights for policy 0, policy_version 63560 (0.0008) +[2023-10-13 02:54:03,208][46662] Updated weights for policy 0, policy_version 63570 (0.0009) +[2023-10-13 02:54:03,581][46662] Updated weights for policy 0, policy_version 63580 (0.0007) +[2023-10-13 02:54:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130088960. Throughput: 0: 1687.3, 1: 1668.0. Samples: 32536732. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:03,607][45375] Avg episode reward: [(0, '58.860'), (1, '52.940')] +[2023-10-13 02:54:04,205][46663] Updated weights for policy 1, policy_version 63491 (0.0008) +[2023-10-13 02:54:04,569][46663] Updated weights for policy 1, policy_version 63501 (0.0009) +[2023-10-13 02:54:04,937][46663] Updated weights for policy 1, policy_version 63511 (0.0008) +[2023-10-13 02:54:07,703][46662] Updated weights for policy 0, policy_version 63590 (0.0009) +[2023-10-13 02:54:08,070][46662] Updated weights for policy 0, policy_version 63600 (0.0008) +[2023-10-13 02:54:08,436][46662] Updated weights for policy 0, policy_version 63610 (0.0007) +[2023-10-13 02:54:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130154496. Throughput: 0: 1678.4, 1: 1667.7. Samples: 32556948. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:08,607][45375] Avg episode reward: [(0, '57.310'), (1, '54.500')] +[2023-10-13 02:54:08,904][46663] Updated weights for policy 1, policy_version 63521 (0.0008) +[2023-10-13 02:54:09,270][46663] Updated weights for policy 1, policy_version 63531 (0.0011) +[2023-10-13 02:54:09,630][46663] Updated weights for policy 1, policy_version 63541 (0.0011) +[2023-10-13 02:54:09,992][46663] Updated weights for policy 1, policy_version 63551 (0.0011) +[2023-10-13 02:54:12,443][46662] Updated weights for policy 0, policy_version 63620 (0.0009) +[2023-10-13 02:54:12,817][46662] Updated weights for policy 0, policy_version 63630 (0.0009) +[2023-10-13 02:54:13,205][46662] Updated weights for policy 0, policy_version 63640 (0.0009) +[2023-10-13 02:54:13,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130252800. Throughput: 0: 1688.9, 1: 1668.2. Samples: 32566540. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:13,607][45375] Avg episode reward: [(0, '58.990'), (1, '53.550')] +[2023-10-13 02:54:13,906][46663] Updated weights for policy 1, policy_version 63561 (0.0010) +[2023-10-13 02:54:14,267][46663] Updated weights for policy 1, policy_version 63571 (0.0009) +[2023-10-13 02:54:14,634][46663] Updated weights for policy 1, policy_version 63581 (0.0008) +[2023-10-13 02:54:17,227][46662] Updated weights for policy 0, policy_version 63650 (0.0008) +[2023-10-13 02:54:17,634][46662] Updated weights for policy 0, policy_version 63660 (0.0010) +[2023-10-13 02:54:18,004][46662] Updated weights for policy 0, policy_version 63670 (0.0009) +[2023-10-13 02:54:18,376][46662] Updated weights for policy 0, policy_version 63680 (0.0009) +[2023-10-13 02:54:18,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130318336. Throughput: 0: 1692.6, 1: 1673.0. Samples: 32587236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:18,607][45375] Avg episode reward: [(0, '58.270'), (1, '55.130')] +[2023-10-13 02:54:18,739][46663] Updated weights for policy 1, policy_version 63591 (0.0009) +[2023-10-13 02:54:19,095][46663] Updated weights for policy 1, policy_version 63601 (0.0011) +[2023-10-13 02:54:19,478][46663] Updated weights for policy 1, policy_version 63611 (0.0010) +[2023-10-13 02:54:22,217][46662] Updated weights for policy 0, policy_version 63690 (0.0009) +[2023-10-13 02:54:22,593][46662] Updated weights for policy 0, policy_version 63700 (0.0009) +[2023-10-13 02:54:22,957][46662] Updated weights for policy 0, policy_version 63710 (0.0008) +[2023-10-13 02:54:23,594][46663] Updated weights for policy 1, policy_version 63621 (0.0011) +[2023-10-13 02:54:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 130383872. Throughput: 0: 1670.8, 1: 1675.8. Samples: 32606924. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:23,607][45375] Avg episode reward: [(0, '59.090'), (1, '54.540')] +[2023-10-13 02:54:23,950][46663] Updated weights for policy 1, policy_version 63631 (0.0009) +[2023-10-13 02:54:24,321][46663] Updated weights for policy 1, policy_version 63641 (0.0008) +[2023-10-13 02:54:27,042][46662] Updated weights for policy 0, policy_version 63720 (0.0009) +[2023-10-13 02:54:27,415][46662] Updated weights for policy 0, policy_version 63730 (0.0008) +[2023-10-13 02:54:27,780][46662] Updated weights for policy 0, policy_version 63740 (0.0007) +[2023-10-13 02:54:28,452][46663] Updated weights for policy 1, policy_version 63651 (0.0010) +[2023-10-13 02:54:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 130449408. Throughput: 0: 1693.7, 1: 1678.8. Samples: 32617108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:28,607][45375] Avg episode reward: [(0, '60.310'), (1, '55.340')] +[2023-10-13 02:54:28,608][46091] Saving new best policy, reward=60.310! +[2023-10-13 02:54:28,853][46663] Updated weights for policy 1, policy_version 63661 (0.0007) +[2023-10-13 02:54:29,218][46663] Updated weights for policy 1, policy_version 63671 (0.0007) +[2023-10-13 02:54:31,635][46662] Updated weights for policy 0, policy_version 63750 (0.0007) +[2023-10-13 02:54:32,017][46662] Updated weights for policy 0, policy_version 63760 (0.0009) +[2023-10-13 02:54:32,386][46662] Updated weights for policy 0, policy_version 63770 (0.0008) +[2023-10-13 02:54:33,166][46663] Updated weights for policy 1, policy_version 63681 (0.0007) +[2023-10-13 02:54:33,538][46663] Updated weights for policy 1, policy_version 63691 (0.0008) +[2023-10-13 02:54:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130514944. Throughput: 0: 1681.7, 1: 1682.3. Samples: 32637390. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:33,607][45375] Avg episode reward: [(0, '60.970'), (1, '55.110')] +[2023-10-13 02:54:33,608][46091] Saving new best policy, reward=60.970! +[2023-10-13 02:54:33,908][46663] Updated weights for policy 1, policy_version 63701 (0.0010) +[2023-10-13 02:54:34,273][46663] Updated weights for policy 1, policy_version 63711 (0.0008) +[2023-10-13 02:54:36,524][46662] Updated weights for policy 0, policy_version 63780 (0.0008) +[2023-10-13 02:54:36,895][46662] Updated weights for policy 0, policy_version 63790 (0.0010) +[2023-10-13 02:54:37,260][46662] Updated weights for policy 0, policy_version 63800 (0.0009) +[2023-10-13 02:54:38,399][46663] Updated weights for policy 1, policy_version 63721 (0.0008) +[2023-10-13 02:54:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130580480. Throughput: 0: 1666.4, 1: 1671.2. Samples: 32656840. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:38,607][45375] Avg episode reward: [(0, '60.470'), (1, '54.740')] +[2023-10-13 02:54:38,763][46663] Updated weights for policy 1, policy_version 63731 (0.0008) +[2023-10-13 02:54:39,127][46663] Updated weights for policy 1, policy_version 63741 (0.0009) +[2023-10-13 02:54:41,224][46662] Updated weights for policy 0, policy_version 63810 (0.0008) +[2023-10-13 02:54:41,587][46662] Updated weights for policy 0, policy_version 63820 (0.0008) +[2023-10-13 02:54:41,960][46662] Updated weights for policy 0, policy_version 63830 (0.0009) +[2023-10-13 02:54:42,322][46662] Updated weights for policy 0, policy_version 63840 (0.0009) +[2023-10-13 02:54:43,106][46663] Updated weights for policy 1, policy_version 63751 (0.0011) +[2023-10-13 02:54:43,466][46663] Updated weights for policy 1, policy_version 63761 (0.0008) +[2023-10-13 02:54:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130646016. Throughput: 0: 1693.3, 1: 1679.7. 
Samples: 32667730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:43,608][45375] Avg episode reward: [(0, '60.070'), (1, '54.070')] +[2023-10-13 02:54:43,829][46663] Updated weights for policy 1, policy_version 63771 (0.0007) +[2023-10-13 02:54:46,441][46662] Updated weights for policy 0, policy_version 63850 (0.0007) +[2023-10-13 02:54:46,800][46662] Updated weights for policy 0, policy_version 63860 (0.0007) +[2023-10-13 02:54:47,166][46662] Updated weights for policy 0, policy_version 63870 (0.0007) +[2023-10-13 02:54:47,457][46663] Updated weights for policy 1, policy_version 63781 (0.0009) +[2023-10-13 02:54:47,817][46663] Updated weights for policy 1, policy_version 63791 (0.0009) +[2023-10-13 02:54:48,168][46663] Updated weights for policy 1, policy_version 63801 (0.0008) +[2023-10-13 02:54:48,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 130744320. Throughput: 0: 1680.3, 1: 1693.9. Samples: 32688572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:48,608][45375] Avg episode reward: [(0, '61.170'), (1, '52.410')] +[2023-10-13 02:54:48,609][46091] Saving new best policy, reward=61.170! +[2023-10-13 02:54:51,313][46662] Updated weights for policy 0, policy_version 63880 (0.0008) +[2023-10-13 02:54:51,671][46662] Updated weights for policy 0, policy_version 63890 (0.0011) +[2023-10-13 02:54:52,042][46662] Updated weights for policy 0, policy_version 63900 (0.0008) +[2023-10-13 02:54:52,139][46663] Updated weights for policy 1, policy_version 63811 (0.0009) +[2023-10-13 02:54:52,510][46663] Updated weights for policy 1, policy_version 63821 (0.0009) +[2023-10-13 02:54:52,865][46663] Updated weights for policy 1, policy_version 63831 (0.0010) +[2023-10-13 02:54:53,607][45375] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130809856. Throughput: 0: 1677.6, 1: 1675.4. Samples: 32707834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:53,607][45375] Avg episode reward: [(0, '60.330'), (1, '51.190')] +[2023-10-13 02:54:56,079][46662] Updated weights for policy 0, policy_version 63910 (0.0010) +[2023-10-13 02:54:56,453][46662] Updated weights for policy 0, policy_version 63920 (0.0009) +[2023-10-13 02:54:56,831][46662] Updated weights for policy 0, policy_version 63930 (0.0009) +[2023-10-13 02:54:57,119][46663] Updated weights for policy 1, policy_version 63841 (0.0009) +[2023-10-13 02:54:57,490][46663] Updated weights for policy 1, policy_version 63851 (0.0009) +[2023-10-13 02:54:57,849][46663] Updated weights for policy 1, policy_version 63861 (0.0010) +[2023-10-13 02:54:58,214][46663] Updated weights for policy 1, policy_version 63871 (0.0010) +[2023-10-13 02:54:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130875392. Throughput: 0: 1696.3, 1: 1700.4. Samples: 32719388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:54:58,607][45375] Avg episode reward: [(0, '58.280'), (1, '52.070')] +[2023-10-13 02:55:00,809][46662] Updated weights for policy 0, policy_version 63940 (0.0009) +[2023-10-13 02:55:01,187][46662] Updated weights for policy 0, policy_version 63950 (0.0009) +[2023-10-13 02:55:01,551][46662] Updated weights for policy 0, policy_version 63960 (0.0009) +[2023-10-13 02:55:02,415][46663] Updated weights for policy 1, policy_version 63881 (0.0009) +[2023-10-13 02:55:02,778][46663] Updated weights for policy 1, policy_version 63891 (0.0007) +[2023-10-13 02:55:03,148][46663] Updated weights for policy 1, policy_version 63901 (0.0008) +[2023-10-13 02:55:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130940928. Throughput: 0: 1673.6, 1: 1694.8. Samples: 32738816. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:55:03,607][45375] Avg episode reward: [(0, '58.930'), (1, '52.860')] +[2023-10-13 02:55:05,668][46662] Updated weights for policy 0, policy_version 63970 (0.0008) +[2023-10-13 02:55:06,065][46662] Updated weights for policy 0, policy_version 63980 (0.0009) +[2023-10-13 02:55:06,432][46662] Updated weights for policy 0, policy_version 63990 (0.0007) +[2023-10-13 02:55:06,802][46662] Updated weights for policy 0, policy_version 64000 (0.0007) +[2023-10-13 02:55:07,283][46663] Updated weights for policy 1, policy_version 63911 (0.0008) +[2023-10-13 02:55:07,657][46663] Updated weights for policy 1, policy_version 63921 (0.0008) +[2023-10-13 02:55:08,021][46663] Updated weights for policy 1, policy_version 63931 (0.0010) +[2023-10-13 02:55:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 131006464. Throughput: 0: 1693.3, 1: 1673.2. Samples: 32758418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:55:08,607][45375] Avg episode reward: [(0, '57.390'), (1, '51.930')] +[2023-10-13 02:55:10,764][46662] Updated weights for policy 0, policy_version 64010 (0.0008) +[2023-10-13 02:55:11,133][46662] Updated weights for policy 0, policy_version 64020 (0.0007) +[2023-10-13 02:55:11,507][46662] Updated weights for policy 0, policy_version 64030 (0.0007) +[2023-10-13 02:55:12,305][46663] Updated weights for policy 1, policy_version 63941 (0.0010) +[2023-10-13 02:55:12,673][46663] Updated weights for policy 1, policy_version 63951 (0.0009) +[2023-10-13 02:55:13,033][46663] Updated weights for policy 1, policy_version 63961 (0.0009) +[2023-10-13 02:55:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 131072000. Throughput: 0: 1689.9, 1: 1700.1. Samples: 32769658. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:55:13,607][45375] Avg episode reward: [(0, '57.710'), (1, '51.010')] +[2023-10-13 02:55:15,399][46662] Updated weights for policy 0, policy_version 64040 (0.0010) +[2023-10-13 02:55:15,782][46662] Updated weights for policy 0, policy_version 64050 (0.0010) +[2023-10-13 02:55:16,148][46662] Updated weights for policy 0, policy_version 64060 (0.0009) +[2023-10-13 02:55:17,082][46663] Updated weights for policy 1, policy_version 63971 (0.0009) +[2023-10-13 02:55:17,487][46663] Updated weights for policy 1, policy_version 63981 (0.0010) +[2023-10-13 02:55:17,847][46663] Updated weights for policy 1, policy_version 63991 (0.0010) +[2023-10-13 02:55:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 131137536. Throughput: 0: 1681.4, 1: 1692.2. Samples: 32789202. Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:18,607][45375] Avg episode reward: [(0, '58.220'), (1, '49.860')] +[2023-10-13 02:55:20,218][46662] Updated weights for policy 0, policy_version 64070 (0.0009) +[2023-10-13 02:55:20,588][46662] Updated weights for policy 0, policy_version 64080 (0.0008) +[2023-10-13 02:55:20,945][46662] Updated weights for policy 0, policy_version 64090 (0.0008) +[2023-10-13 02:55:21,861][46663] Updated weights for policy 1, policy_version 64001 (0.0010) +[2023-10-13 02:55:22,229][46663] Updated weights for policy 1, policy_version 64011 (0.0008) +[2023-10-13 02:55:22,590][46663] Updated weights for policy 1, policy_version 64021 (0.0009) +[2023-10-13 02:55:22,956][46663] Updated weights for policy 1, policy_version 64031 (0.0008) +[2023-10-13 02:55:23,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 131203072. Throughput: 0: 1700.5, 1: 1682.3. Samples: 32809068. 
Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:23,608][45375] Avg episode reward: [(0, '58.090'), (1, '48.630')] +[2023-10-13 02:55:24,896][46662] Updated weights for policy 0, policy_version 64100 (0.0010) +[2023-10-13 02:55:25,266][46662] Updated weights for policy 0, policy_version 64110 (0.0009) +[2023-10-13 02:55:25,644][46662] Updated weights for policy 0, policy_version 64120 (0.0010) +[2023-10-13 02:55:27,094][46663] Updated weights for policy 1, policy_version 64041 (0.0010) +[2023-10-13 02:55:27,460][46663] Updated weights for policy 1, policy_version 64051 (0.0008) +[2023-10-13 02:55:27,825][46663] Updated weights for policy 1, policy_version 64061 (0.0009) +[2023-10-13 02:55:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 131268608. Throughput: 0: 1673.8, 1: 1699.9. Samples: 32819548. Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:28,607][45375] Avg episode reward: [(0, '58.330'), (1, '49.410')] +[2023-10-13 02:55:29,893][46662] Updated weights for policy 0, policy_version 64130 (0.0009) +[2023-10-13 02:55:30,252][46662] Updated weights for policy 0, policy_version 64140 (0.0008) +[2023-10-13 02:55:30,625][46662] Updated weights for policy 0, policy_version 64150 (0.0008) +[2023-10-13 02:55:30,991][46662] Updated weights for policy 0, policy_version 64160 (0.0008) +[2023-10-13 02:55:31,803][46663] Updated weights for policy 1, policy_version 64071 (0.0009) +[2023-10-13 02:55:32,156][46663] Updated weights for policy 1, policy_version 64081 (0.0008) +[2023-10-13 02:55:32,520][46663] Updated weights for policy 1, policy_version 64091 (0.0009) +[2023-10-13 02:55:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 131334144. Throughput: 0: 1681.6, 1: 1669.2. Samples: 32839358. 
Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:33,608][45375] Avg episode reward: [(0, '57.870'), (1, '48.450')] +[2023-10-13 02:55:35,082][46662] Updated weights for policy 0, policy_version 64170 (0.0009) +[2023-10-13 02:55:35,442][46662] Updated weights for policy 0, policy_version 64180 (0.0008) +[2023-10-13 02:55:35,807][46662] Updated weights for policy 0, policy_version 64190 (0.0007) +[2023-10-13 02:55:36,644][46663] Updated weights for policy 1, policy_version 64101 (0.0009) +[2023-10-13 02:55:37,014][46663] Updated weights for policy 1, policy_version 64111 (0.0012) +[2023-10-13 02:55:37,390][46663] Updated weights for policy 1, policy_version 64121 (0.0009) +[2023-10-13 02:55:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 131399680. Throughput: 0: 1694.8, 1: 1677.1. Samples: 32859570. Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:38,607][45375] Avg episode reward: [(0, '55.680'), (1, '48.410')] +[2023-10-13 02:55:39,886][46662] Updated weights for policy 0, policy_version 64200 (0.0008) +[2023-10-13 02:55:40,248][46662] Updated weights for policy 0, policy_version 64210 (0.0009) +[2023-10-13 02:55:40,617][46662] Updated weights for policy 0, policy_version 64220 (0.0009) +[2023-10-13 02:55:41,588][46663] Updated weights for policy 1, policy_version 64131 (0.0008) +[2023-10-13 02:55:41,949][46663] Updated weights for policy 1, policy_version 64141 (0.0010) +[2023-10-13 02:55:42,318][46663] Updated weights for policy 1, policy_version 64151 (0.0009) +[2023-10-13 02:55:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 131465216. Throughput: 0: 1666.1, 1: 1676.8. Samples: 32869820. 
Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:43,607][45375] Avg episode reward: [(0, '55.800'), (1, '48.890')] +[2023-10-13 02:55:44,635][46662] Updated weights for policy 0, policy_version 64230 (0.0009) +[2023-10-13 02:55:45,003][46662] Updated weights for policy 0, policy_version 64240 (0.0008) +[2023-10-13 02:55:45,381][46662] Updated weights for policy 0, policy_version 64250 (0.0008) +[2023-10-13 02:55:46,421][46663] Updated weights for policy 1, policy_version 64161 (0.0009) +[2023-10-13 02:55:46,799][46663] Updated weights for policy 1, policy_version 64171 (0.0008) +[2023-10-13 02:55:47,167][46663] Updated weights for policy 1, policy_version 64181 (0.0007) +[2023-10-13 02:55:47,521][46663] Updated weights for policy 1, policy_version 64191 (0.0007) +[2023-10-13 02:55:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 131530752. Throughput: 0: 1692.4, 1: 1658.7. Samples: 32889616. Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:48,608][45375] Avg episode reward: [(0, '55.340'), (1, '49.980')] +[2023-10-13 02:55:49,406][46662] Updated weights for policy 0, policy_version 64260 (0.0009) +[2023-10-13 02:55:49,777][46662] Updated weights for policy 0, policy_version 64270 (0.0008) +[2023-10-13 02:55:50,147][46662] Updated weights for policy 0, policy_version 64280 (0.0007) +[2023-10-13 02:55:51,507][46663] Updated weights for policy 1, policy_version 64201 (0.0009) +[2023-10-13 02:55:51,874][46663] Updated weights for policy 1, policy_version 64211 (0.0009) +[2023-10-13 02:55:52,254][46663] Updated weights for policy 1, policy_version 64221 (0.0010) +[2023-10-13 02:55:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131596288. Throughput: 0: 1693.5, 1: 1676.8. Samples: 32910080. 
Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:53,607][45375] Avg episode reward: [(0, '56.600'), (1, '50.320')] +[2023-10-13 02:55:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000064224_65765376.pth... +[2023-10-13 02:55:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000064288_65830912.pth... +[2023-10-13 02:55:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000062720_64225280.pth +[2023-10-13 02:55:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000062656_64159744.pth +[2023-10-13 02:55:54,187][46662] Updated weights for policy 0, policy_version 64290 (0.0008) +[2023-10-13 02:55:54,600][46662] Updated weights for policy 0, policy_version 64300 (0.0008) +[2023-10-13 02:55:54,969][46662] Updated weights for policy 0, policy_version 64310 (0.0009) +[2023-10-13 02:55:55,341][46662] Updated weights for policy 0, policy_version 64320 (0.0010) +[2023-10-13 02:55:56,281][46663] Updated weights for policy 1, policy_version 64231 (0.0008) +[2023-10-13 02:55:56,655][46663] Updated weights for policy 1, policy_version 64241 (0.0007) +[2023-10-13 02:55:57,026][46663] Updated weights for policy 1, policy_version 64251 (0.0007) +[2023-10-13 02:55:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131661824. Throughput: 0: 1669.1, 1: 1670.5. Samples: 32919942. 
Policy #0 lag: (min: 43.0, avg: 47.9, max: 48.0) +[2023-10-13 02:55:58,607][45375] Avg episode reward: [(0, '56.120'), (1, '51.610')] +[2023-10-13 02:55:59,394][46662] Updated weights for policy 0, policy_version 64330 (0.0008) +[2023-10-13 02:55:59,761][46662] Updated weights for policy 0, policy_version 64340 (0.0008) +[2023-10-13 02:56:00,137][46662] Updated weights for policy 0, policy_version 64350 (0.0008) +[2023-10-13 02:56:01,176][46663] Updated weights for policy 1, policy_version 64261 (0.0008) +[2023-10-13 02:56:01,555][46663] Updated weights for policy 1, policy_version 64271 (0.0009) +[2023-10-13 02:56:01,917][46663] Updated weights for policy 1, policy_version 64281 (0.0007) +[2023-10-13 02:56:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131727360. Throughput: 0: 1687.4, 1: 1655.7. Samples: 32939644. Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:03,607][45375] Avg episode reward: [(0, '56.140'), (1, '53.100')] +[2023-10-13 02:56:04,050][46662] Updated weights for policy 0, policy_version 64360 (0.0008) +[2023-10-13 02:56:04,420][46662] Updated weights for policy 0, policy_version 64370 (0.0009) +[2023-10-13 02:56:04,798][46662] Updated weights for policy 0, policy_version 64380 (0.0012) +[2023-10-13 02:56:06,133][46663] Updated weights for policy 1, policy_version 64291 (0.0007) +[2023-10-13 02:56:06,524][46663] Updated weights for policy 1, policy_version 64301 (0.0008) +[2023-10-13 02:56:06,892][46663] Updated weights for policy 1, policy_version 64311 (0.0007) +[2023-10-13 02:56:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131792896. Throughput: 0: 1692.7, 1: 1667.6. Samples: 32960278. 
Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:08,607][45375] Avg episode reward: [(0, '57.320'), (1, '53.190')] +[2023-10-13 02:56:08,715][46662] Updated weights for policy 0, policy_version 64390 (0.0010) +[2023-10-13 02:56:09,077][46662] Updated weights for policy 0, policy_version 64400 (0.0009) +[2023-10-13 02:56:09,462][46662] Updated weights for policy 0, policy_version 64410 (0.0007) +[2023-10-13 02:56:10,935][46663] Updated weights for policy 1, policy_version 64321 (0.0009) +[2023-10-13 02:56:11,303][46663] Updated weights for policy 1, policy_version 64331 (0.0010) +[2023-10-13 02:56:11,673][46663] Updated weights for policy 1, policy_version 64341 (0.0008) +[2023-10-13 02:56:12,048][46663] Updated weights for policy 1, policy_version 64351 (0.0008) +[2023-10-13 02:56:13,581][46662] Updated weights for policy 0, policy_version 64420 (0.0008) +[2023-10-13 02:56:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 131858432. Throughput: 0: 1686.8, 1: 1659.0. Samples: 32970112. Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:13,608][45375] Avg episode reward: [(0, '58.100'), (1, '53.160')] +[2023-10-13 02:56:13,944][46662] Updated weights for policy 0, policy_version 64430 (0.0009) +[2023-10-13 02:56:14,317][46662] Updated weights for policy 0, policy_version 64440 (0.0010) +[2023-10-13 02:56:16,190][46663] Updated weights for policy 1, policy_version 64361 (0.0008) +[2023-10-13 02:56:16,555][46663] Updated weights for policy 1, policy_version 64371 (0.0007) +[2023-10-13 02:56:16,926][46663] Updated weights for policy 1, policy_version 64381 (0.0007) +[2023-10-13 02:56:18,497][46662] Updated weights for policy 0, policy_version 64450 (0.0010) +[2023-10-13 02:56:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131923968. Throughput: 0: 1690.9, 1: 1659.0. Samples: 32990102. 
Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:18,607][45375] Avg episode reward: [(0, '59.140'), (1, '56.110')] +[2023-10-13 02:56:18,871][46662] Updated weights for policy 0, policy_version 64460 (0.0008) +[2023-10-13 02:56:19,243][46662] Updated weights for policy 0, policy_version 64470 (0.0008) +[2023-10-13 02:56:19,605][46662] Updated weights for policy 0, policy_version 64480 (0.0009) +[2023-10-13 02:56:20,901][46663] Updated weights for policy 1, policy_version 64391 (0.0007) +[2023-10-13 02:56:21,268][46663] Updated weights for policy 1, policy_version 64401 (0.0007) +[2023-10-13 02:56:21,632][46663] Updated weights for policy 1, policy_version 64411 (0.0009) +[2023-10-13 02:56:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 131989504. Throughput: 0: 1685.8, 1: 1672.1. Samples: 33010674. Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:23,608][45375] Avg episode reward: [(0, '58.700'), (1, '56.910')] +[2023-10-13 02:56:23,741][46662] Updated weights for policy 0, policy_version 64490 (0.0007) +[2023-10-13 02:56:24,114][46662] Updated weights for policy 0, policy_version 64500 (0.0007) +[2023-10-13 02:56:24,486][46662] Updated weights for policy 0, policy_version 64510 (0.0007) +[2023-10-13 02:56:25,650][46663] Updated weights for policy 1, policy_version 64421 (0.0007) +[2023-10-13 02:56:26,025][46663] Updated weights for policy 1, policy_version 64431 (0.0008) +[2023-10-13 02:56:26,388][46663] Updated weights for policy 1, policy_version 64441 (0.0010) +[2023-10-13 02:56:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132055040. Throughput: 0: 1685.3, 1: 1656.0. Samples: 33020176. 
Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:28,607][45375] Avg episode reward: [(0, '59.160'), (1, '57.140')] +[2023-10-13 02:56:28,692][46662] Updated weights for policy 0, policy_version 64520 (0.0009) +[2023-10-13 02:56:29,067][46662] Updated weights for policy 0, policy_version 64530 (0.0009) +[2023-10-13 02:56:29,445][46662] Updated weights for policy 0, policy_version 64540 (0.0008) +[2023-10-13 02:56:30,332][46663] Updated weights for policy 1, policy_version 64451 (0.0009) +[2023-10-13 02:56:30,713][46663] Updated weights for policy 1, policy_version 64461 (0.0007) +[2023-10-13 02:56:31,069][46663] Updated weights for policy 1, policy_version 64471 (0.0008) +[2023-10-13 02:56:33,568][46662] Updated weights for policy 0, policy_version 64550 (0.0009) +[2023-10-13 02:56:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132120576. Throughput: 0: 1679.4, 1: 1672.2. Samples: 33040440. Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:33,608][45375] Avg episode reward: [(0, '58.750'), (1, '56.930')] +[2023-10-13 02:56:33,936][46662] Updated weights for policy 0, policy_version 64560 (0.0008) +[2023-10-13 02:56:34,308][46662] Updated weights for policy 0, policy_version 64570 (0.0008) +[2023-10-13 02:56:35,066][46663] Updated weights for policy 1, policy_version 64481 (0.0007) +[2023-10-13 02:56:35,418][46663] Updated weights for policy 1, policy_version 64491 (0.0007) +[2023-10-13 02:56:35,779][46663] Updated weights for policy 1, policy_version 64501 (0.0008) +[2023-10-13 02:56:36,139][46663] Updated weights for policy 1, policy_version 64511 (0.0008) +[2023-10-13 02:56:38,326][46662] Updated weights for policy 0, policy_version 64580 (0.0010) +[2023-10-13 02:56:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132186112. Throughput: 0: 1681.9, 1: 1680.3. Samples: 33061380. 
Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:38,607][45375] Avg episode reward: [(0, '58.170'), (1, '56.130')] +[2023-10-13 02:56:38,690][46662] Updated weights for policy 0, policy_version 64590 (0.0010) +[2023-10-13 02:56:39,053][46662] Updated weights for policy 0, policy_version 64600 (0.0010) +[2023-10-13 02:56:40,128][46663] Updated weights for policy 1, policy_version 64521 (0.0008) +[2023-10-13 02:56:40,484][46663] Updated weights for policy 1, policy_version 64531 (0.0009) +[2023-10-13 02:56:40,856][46663] Updated weights for policy 1, policy_version 64541 (0.0009) +[2023-10-13 02:56:43,188][46662] Updated weights for policy 0, policy_version 64610 (0.0009) +[2023-10-13 02:56:43,599][46662] Updated weights for policy 0, policy_version 64620 (0.0008) +[2023-10-13 02:56:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132251648. Throughput: 0: 1685.2, 1: 1662.4. Samples: 33070588. Policy #0 lag: (min: 17.0, avg: 24.9, max: 49.0) +[2023-10-13 02:56:43,607][45375] Avg episode reward: [(0, '58.290'), (1, '56.530')] +[2023-10-13 02:56:43,966][46662] Updated weights for policy 0, policy_version 64630 (0.0009) +[2023-10-13 02:56:44,337][46662] Updated weights for policy 0, policy_version 64640 (0.0009) +[2023-10-13 02:56:44,930][46663] Updated weights for policy 1, policy_version 64551 (0.0008) +[2023-10-13 02:56:45,297][46663] Updated weights for policy 1, policy_version 64561 (0.0008) +[2023-10-13 02:56:45,674][46663] Updated weights for policy 1, policy_version 64571 (0.0009) +[2023-10-13 02:56:48,278][46662] Updated weights for policy 0, policy_version 64650 (0.0010) +[2023-10-13 02:56:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132317184. Throughput: 0: 1686.2, 1: 1689.8. Samples: 33091564. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:56:48,607][45375] Avg episode reward: [(0, '58.480'), (1, '56.690')] +[2023-10-13 02:56:48,654][46662] Updated weights for policy 0, policy_version 64660 (0.0010) +[2023-10-13 02:56:49,026][46662] Updated weights for policy 0, policy_version 64670 (0.0008) +[2023-10-13 02:56:49,695][46663] Updated weights for policy 1, policy_version 64581 (0.0007) +[2023-10-13 02:56:50,061][46663] Updated weights for policy 1, policy_version 64591 (0.0009) +[2023-10-13 02:56:50,433][46663] Updated weights for policy 1, policy_version 64601 (0.0011) +[2023-10-13 02:56:53,161][46662] Updated weights for policy 0, policy_version 64680 (0.0008) +[2023-10-13 02:56:53,531][46662] Updated weights for policy 0, policy_version 64690 (0.0008) +[2023-10-13 02:56:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132382720. Throughput: 0: 1679.3, 1: 1693.8. Samples: 33112068. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:56:53,607][45375] Avg episode reward: [(0, '57.980'), (1, '56.640')] +[2023-10-13 02:56:53,900][46662] Updated weights for policy 0, policy_version 64700 (0.0009) +[2023-10-13 02:56:54,665][46663] Updated weights for policy 1, policy_version 64611 (0.0010) +[2023-10-13 02:56:55,030][46663] Updated weights for policy 1, policy_version 64621 (0.0008) +[2023-10-13 02:56:55,408][46663] Updated weights for policy 1, policy_version 64631 (0.0010) +[2023-10-13 02:56:57,918][46662] Updated weights for policy 0, policy_version 64710 (0.0007) +[2023-10-13 02:56:58,296][46662] Updated weights for policy 0, policy_version 64720 (0.0009) +[2023-10-13 02:56:58,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132448256. Throughput: 0: 1684.3, 1: 1673.9. Samples: 33121230. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:56:58,607][45375] Avg episode reward: [(0, '58.760'), (1, '56.650')] +[2023-10-13 02:56:58,657][46662] Updated weights for policy 0, policy_version 64730 (0.0008) +[2023-10-13 02:56:59,559][46663] Updated weights for policy 1, policy_version 64641 (0.0009) +[2023-10-13 02:56:59,914][46663] Updated weights for policy 1, policy_version 64651 (0.0008) +[2023-10-13 02:57:00,288][46663] Updated weights for policy 1, policy_version 64661 (0.0008) +[2023-10-13 02:57:00,659][46663] Updated weights for policy 1, policy_version 64671 (0.0008) +[2023-10-13 02:57:02,735][46662] Updated weights for policy 0, policy_version 64740 (0.0008) +[2023-10-13 02:57:03,113][46662] Updated weights for policy 0, policy_version 64750 (0.0010) +[2023-10-13 02:57:03,489][46662] Updated weights for policy 0, policy_version 64760 (0.0010) +[2023-10-13 02:57:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 132513792. Throughput: 0: 1682.7, 1: 1690.4. Samples: 33141888. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:57:03,608][45375] Avg episode reward: [(0, '59.010'), (1, '55.410')] +[2023-10-13 02:57:04,802][46663] Updated weights for policy 1, policy_version 64681 (0.0008) +[2023-10-13 02:57:05,166][46663] Updated weights for policy 1, policy_version 64691 (0.0009) +[2023-10-13 02:57:05,531][46663] Updated weights for policy 1, policy_version 64701 (0.0007) +[2023-10-13 02:57:07,577][46662] Updated weights for policy 0, policy_version 64770 (0.0009) +[2023-10-13 02:57:07,949][46662] Updated weights for policy 0, policy_version 64780 (0.0008) +[2023-10-13 02:57:08,320][46662] Updated weights for policy 0, policy_version 64790 (0.0007) +[2023-10-13 02:57:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132579328. Throughput: 0: 1679.1, 1: 1687.3. Samples: 33162160. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:57:08,607][45375] Avg episode reward: [(0, '58.580'), (1, '55.060')] +[2023-10-13 02:57:08,686][46662] Updated weights for policy 0, policy_version 64800 (0.0009) +[2023-10-13 02:57:09,682][46663] Updated weights for policy 1, policy_version 64711 (0.0008) +[2023-10-13 02:57:10,056][46663] Updated weights for policy 1, policy_version 64721 (0.0009) +[2023-10-13 02:57:10,419][46663] Updated weights for policy 1, policy_version 64731 (0.0009) +[2023-10-13 02:57:12,707][46662] Updated weights for policy 0, policy_version 64810 (0.0008) +[2023-10-13 02:57:13,075][46662] Updated weights for policy 0, policy_version 64820 (0.0007) +[2023-10-13 02:57:13,457][46662] Updated weights for policy 0, policy_version 64830 (0.0009) +[2023-10-13 02:57:13,606][45375] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132677632. Throughput: 0: 1685.2, 1: 1678.9. Samples: 33171564. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:57:13,607][45375] Avg episode reward: [(0, '58.160'), (1, '53.830')] +[2023-10-13 02:57:14,429][46663] Updated weights for policy 1, policy_version 64741 (0.0007) +[2023-10-13 02:57:14,798][46663] Updated weights for policy 1, policy_version 64751 (0.0007) +[2023-10-13 02:57:15,155][46663] Updated weights for policy 1, policy_version 64761 (0.0008) +[2023-10-13 02:57:17,318][46662] Updated weights for policy 0, policy_version 64840 (0.0009) +[2023-10-13 02:57:17,694][46662] Updated weights for policy 0, policy_version 64850 (0.0009) +[2023-10-13 02:57:18,059][46662] Updated weights for policy 0, policy_version 64860 (0.0009) +[2023-10-13 02:57:18,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 132743168. Throughput: 0: 1690.5, 1: 1687.0. Samples: 33192428. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:57:18,607][45375] Avg episode reward: [(0, '57.500'), (1, '54.470')] +[2023-10-13 02:57:19,243][46663] Updated weights for policy 1, policy_version 64771 (0.0009) +[2023-10-13 02:57:19,611][46663] Updated weights for policy 1, policy_version 64781 (0.0009) +[2023-10-13 02:57:19,985][46663] Updated weights for policy 1, policy_version 64791 (0.0010) +[2023-10-13 02:57:22,133][46662] Updated weights for policy 0, policy_version 64870 (0.0010) +[2023-10-13 02:57:22,497][46662] Updated weights for policy 0, policy_version 64880 (0.0008) +[2023-10-13 02:57:22,867][46662] Updated weights for policy 0, policy_version 64890 (0.0007) +[2023-10-13 02:57:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132808704. Throughput: 0: 1669.4, 1: 1683.2. Samples: 33212248. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 02:57:23,608][45375] Avg episode reward: [(0, '54.120'), (1, '52.980')] +[2023-10-13 02:57:23,977][46663] Updated weights for policy 1, policy_version 64801 (0.0007) +[2023-10-13 02:57:24,344][46663] Updated weights for policy 1, policy_version 64811 (0.0007) +[2023-10-13 02:57:24,712][46663] Updated weights for policy 1, policy_version 64821 (0.0010) +[2023-10-13 02:57:25,083][46663] Updated weights for policy 1, policy_version 64831 (0.0009) +[2023-10-13 02:57:26,862][46662] Updated weights for policy 0, policy_version 64900 (0.0008) +[2023-10-13 02:57:27,240][46662] Updated weights for policy 0, policy_version 64910 (0.0010) +[2023-10-13 02:57:27,609][46662] Updated weights for policy 0, policy_version 64920 (0.0009) +[2023-10-13 02:57:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132874240. Throughput: 0: 1690.4, 1: 1682.4. Samples: 33222360. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:28,607][45375] Avg episode reward: [(0, '53.280'), (1, '51.040')] +[2023-10-13 02:57:29,213][46663] Updated weights for policy 1, policy_version 64841 (0.0007) +[2023-10-13 02:57:29,565][46663] Updated weights for policy 1, policy_version 64851 (0.0008) +[2023-10-13 02:57:29,938][46663] Updated weights for policy 1, policy_version 64861 (0.0009) +[2023-10-13 02:57:31,696][46662] Updated weights for policy 0, policy_version 64930 (0.0010) +[2023-10-13 02:57:32,091][46662] Updated weights for policy 0, policy_version 64940 (0.0007) +[2023-10-13 02:57:32,461][46662] Updated weights for policy 0, policy_version 64950 (0.0008) +[2023-10-13 02:57:32,831][46662] Updated weights for policy 0, policy_version 64960 (0.0010) +[2023-10-13 02:57:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132939776. Throughput: 0: 1682.5, 1: 1676.4. Samples: 33242714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:33,607][45375] Avg episode reward: [(0, '52.050'), (1, '51.070')] +[2023-10-13 02:57:33,966][46663] Updated weights for policy 1, policy_version 64871 (0.0009) +[2023-10-13 02:57:34,340][46663] Updated weights for policy 1, policy_version 64881 (0.0007) +[2023-10-13 02:57:34,705][46663] Updated weights for policy 1, policy_version 64891 (0.0008) +[2023-10-13 02:57:37,093][46662] Updated weights for policy 0, policy_version 64970 (0.0007) +[2023-10-13 02:57:37,460][46662] Updated weights for policy 0, policy_version 64980 (0.0008) +[2023-10-13 02:57:37,833][46662] Updated weights for policy 0, policy_version 64990 (0.0008) +[2023-10-13 02:57:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133005312. Throughput: 0: 1656.7, 1: 1681.5. Samples: 33262286. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:38,607][45375] Avg episode reward: [(0, '51.510'), (1, '50.770')] +[2023-10-13 02:57:38,688][46663] Updated weights for policy 1, policy_version 64901 (0.0008) +[2023-10-13 02:57:39,045][46663] Updated weights for policy 1, policy_version 64911 (0.0009) +[2023-10-13 02:57:39,409][46663] Updated weights for policy 1, policy_version 64921 (0.0008) +[2023-10-13 02:57:41,831][46662] Updated weights for policy 0, policy_version 65000 (0.0010) +[2023-10-13 02:57:42,195][46662] Updated weights for policy 0, policy_version 65010 (0.0012) +[2023-10-13 02:57:42,567][46662] Updated weights for policy 0, policy_version 65020 (0.0010) +[2023-10-13 02:57:43,560][46663] Updated weights for policy 1, policy_version 64931 (0.0008) +[2023-10-13 02:57:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133070848. Throughput: 0: 1684.8, 1: 1679.7. Samples: 33272636. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:43,607][45375] Avg episode reward: [(0, '52.840'), (1, '50.870')] +[2023-10-13 02:57:43,952][46663] Updated weights for policy 1, policy_version 64941 (0.0009) +[2023-10-13 02:57:44,316][46663] Updated weights for policy 1, policy_version 64951 (0.0010) +[2023-10-13 02:57:46,720][46662] Updated weights for policy 0, policy_version 65030 (0.0009) +[2023-10-13 02:57:47,098][46662] Updated weights for policy 0, policy_version 65040 (0.0009) +[2023-10-13 02:57:47,464][46662] Updated weights for policy 0, policy_version 65050 (0.0008) +[2023-10-13 02:57:48,492][46663] Updated weights for policy 1, policy_version 64961 (0.0010) +[2023-10-13 02:57:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133136384. Throughput: 0: 1676.5, 1: 1676.0. Samples: 33292748. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:48,607][45375] Avg episode reward: [(0, '51.380'), (1, '50.640')] +[2023-10-13 02:57:48,858][46663] Updated weights for policy 1, policy_version 64971 (0.0008) +[2023-10-13 02:57:49,229][46663] Updated weights for policy 1, policy_version 64981 (0.0008) +[2023-10-13 02:57:49,591][46663] Updated weights for policy 1, policy_version 64991 (0.0008) +[2023-10-13 02:57:51,567][46662] Updated weights for policy 0, policy_version 65060 (0.0007) +[2023-10-13 02:57:51,934][46662] Updated weights for policy 0, policy_version 65070 (0.0007) +[2023-10-13 02:57:52,295][46662] Updated weights for policy 0, policy_version 65080 (0.0008) +[2023-10-13 02:57:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133201920. Throughput: 0: 1659.3, 1: 1674.3. Samples: 33312174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:53,608][45375] Avg episode reward: [(0, '51.610'), (1, '50.560')] +[2023-10-13 02:57:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000065088_66650112.pth... +[2023-10-13 02:57:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000063520_65044480.pth +[2023-10-13 02:57:53,800][46663] Updated weights for policy 1, policy_version 65001 (0.0008) +[2023-10-13 02:57:54,179][46663] Updated weights for policy 1, policy_version 65011 (0.0009) +[2023-10-13 02:57:54,538][46663] Updated weights for policy 1, policy_version 65021 (0.0007) +[2023-10-13 02:57:54,648][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000065024_66584576.pth... 
+[2023-10-13 02:57:54,678][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000063424_64946176.pth +[2023-10-13 02:57:56,342][46662] Updated weights for policy 0, policy_version 65090 (0.0009) +[2023-10-13 02:57:56,718][46662] Updated weights for policy 0, policy_version 65100 (0.0009) +[2023-10-13 02:57:57,089][46662] Updated weights for policy 0, policy_version 65110 (0.0009) +[2023-10-13 02:57:57,453][46662] Updated weights for policy 0, policy_version 65120 (0.0008) +[2023-10-13 02:57:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133267456. Throughput: 0: 1685.0, 1: 1671.1. Samples: 33322590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:57:58,607][45375] Avg episode reward: [(0, '52.380'), (1, '50.190')] +[2023-10-13 02:57:58,752][46663] Updated weights for policy 1, policy_version 65031 (0.0009) +[2023-10-13 02:57:59,118][46663] Updated weights for policy 1, policy_version 65041 (0.0007) +[2023-10-13 02:57:59,489][46663] Updated weights for policy 1, policy_version 65051 (0.0011) +[2023-10-13 02:58:01,500][46662] Updated weights for policy 0, policy_version 65130 (0.0009) +[2023-10-13 02:58:01,861][46662] Updated weights for policy 0, policy_version 65140 (0.0009) +[2023-10-13 02:58:02,235][46662] Updated weights for policy 0, policy_version 65150 (0.0007) +[2023-10-13 02:58:03,457][46663] Updated weights for policy 1, policy_version 65061 (0.0009) +[2023-10-13 02:58:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133332992. Throughput: 0: 1667.3, 1: 1677.9. Samples: 33342962. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:03,607][45375] Avg episode reward: [(0, '50.770'), (1, '49.630')] +[2023-10-13 02:58:03,826][46663] Updated weights for policy 1, policy_version 65071 (0.0007) +[2023-10-13 02:58:04,199][46663] Updated weights for policy 1, policy_version 65081 (0.0008) +[2023-10-13 02:58:06,165][46662] Updated weights for policy 0, policy_version 65160 (0.0009) +[2023-10-13 02:58:06,534][46662] Updated weights for policy 0, policy_version 65170 (0.0007) +[2023-10-13 02:58:06,902][46662] Updated weights for policy 0, policy_version 65180 (0.0007) +[2023-10-13 02:58:08,252][46663] Updated weights for policy 1, policy_version 65091 (0.0008) +[2023-10-13 02:58:08,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133398528. Throughput: 0: 1674.6, 1: 1669.3. Samples: 33362724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:08,607][45375] Avg episode reward: [(0, '49.480'), (1, '47.370')] +[2023-10-13 02:58:08,617][46663] Updated weights for policy 1, policy_version 65101 (0.0009) +[2023-10-13 02:58:08,980][46663] Updated weights for policy 1, policy_version 65111 (0.0010) +[2023-10-13 02:58:11,051][46662] Updated weights for policy 0, policy_version 65190 (0.0008) +[2023-10-13 02:58:11,417][46662] Updated weights for policy 0, policy_version 65200 (0.0011) +[2023-10-13 02:58:11,788][46662] Updated weights for policy 0, policy_version 65210 (0.0008) +[2023-10-13 02:58:13,019][46663] Updated weights for policy 1, policy_version 65121 (0.0010) +[2023-10-13 02:58:13,380][46663] Updated weights for policy 1, policy_version 65131 (0.0007) +[2023-10-13 02:58:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 133464064. Throughput: 0: 1678.2, 1: 1675.8. Samples: 33373292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:13,608][45375] Avg episode reward: [(0, '49.430'), (1, '46.640')] +[2023-10-13 02:58:13,741][46663] Updated weights for policy 1, policy_version 65141 (0.0009) +[2023-10-13 02:58:14,115][46663] Updated weights for policy 1, policy_version 65151 (0.0008) +[2023-10-13 02:58:15,825][46662] Updated weights for policy 0, policy_version 65220 (0.0009) +[2023-10-13 02:58:16,203][46662] Updated weights for policy 0, policy_version 65230 (0.0009) +[2023-10-13 02:58:16,573][46662] Updated weights for policy 0, policy_version 65240 (0.0007) +[2023-10-13 02:58:18,073][46663] Updated weights for policy 1, policy_version 65161 (0.0008) +[2023-10-13 02:58:18,434][46663] Updated weights for policy 1, policy_version 65171 (0.0009) +[2023-10-13 02:58:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133529600. Throughput: 0: 1660.0, 1: 1684.0. Samples: 33393194. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:18,607][45375] Avg episode reward: [(0, '50.670'), (1, '45.660')] +[2023-10-13 02:58:18,801][46663] Updated weights for policy 1, policy_version 65181 (0.0009) +[2023-10-13 02:58:20,546][46662] Updated weights for policy 0, policy_version 65250 (0.0009) +[2023-10-13 02:58:20,942][46662] Updated weights for policy 0, policy_version 65260 (0.0009) +[2023-10-13 02:58:21,313][46662] Updated weights for policy 0, policy_version 65270 (0.0009) +[2023-10-13 02:58:21,682][46662] Updated weights for policy 0, policy_version 65280 (0.0009) +[2023-10-13 02:58:22,856][46663] Updated weights for policy 1, policy_version 65191 (0.0008) +[2023-10-13 02:58:23,222][46663] Updated weights for policy 1, policy_version 65201 (0.0007) +[2023-10-13 02:58:23,587][46663] Updated weights for policy 1, policy_version 65211 (0.0008) +[2023-10-13 02:58:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133595136. 
Throughput: 0: 1683.8, 1: 1666.9. Samples: 33413070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:23,607][45375] Avg episode reward: [(0, '50.560'), (1, '46.050')] +[2023-10-13 02:58:25,562][46662] Updated weights for policy 0, policy_version 65290 (0.0010) +[2023-10-13 02:58:25,941][46662] Updated weights for policy 0, policy_version 65300 (0.0009) +[2023-10-13 02:58:26,305][46662] Updated weights for policy 0, policy_version 65310 (0.0010) +[2023-10-13 02:58:27,570][46663] Updated weights for policy 1, policy_version 65221 (0.0009) +[2023-10-13 02:58:27,940][46663] Updated weights for policy 1, policy_version 65231 (0.0010) +[2023-10-13 02:58:28,300][46663] Updated weights for policy 1, policy_version 65241 (0.0010) +[2023-10-13 02:58:28,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133693440. Throughput: 0: 1671.4, 1: 1686.0. Samples: 33423720. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:28,607][45375] Avg episode reward: [(0, '52.710'), (1, '45.150')] +[2023-10-13 02:58:30,384][46662] Updated weights for policy 0, policy_version 65320 (0.0009) +[2023-10-13 02:58:30,751][46662] Updated weights for policy 0, policy_version 65330 (0.0009) +[2023-10-13 02:58:31,120][46662] Updated weights for policy 0, policy_version 65340 (0.0010) +[2023-10-13 02:58:32,232][46663] Updated weights for policy 1, policy_version 65251 (0.0008) +[2023-10-13 02:58:32,609][46663] Updated weights for policy 1, policy_version 65261 (0.0009) +[2023-10-13 02:58:32,966][46663] Updated weights for policy 1, policy_version 65271 (0.0007) +[2023-10-13 02:58:33,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133758976. Throughput: 0: 1669.9, 1: 1686.4. Samples: 33443782. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:33,608][45375] Avg episode reward: [(0, '53.290'), (1, '45.930')] +[2023-10-13 02:58:35,329][46662] Updated weights for policy 0, policy_version 65350 (0.0009) +[2023-10-13 02:58:35,694][46662] Updated weights for policy 0, policy_version 65360 (0.0010) +[2023-10-13 02:58:36,059][46662] Updated weights for policy 0, policy_version 65370 (0.0008) +[2023-10-13 02:58:37,226][46663] Updated weights for policy 1, policy_version 65281 (0.0010) +[2023-10-13 02:58:37,601][46663] Updated weights for policy 1, policy_version 65291 (0.0009) +[2023-10-13 02:58:37,967][46663] Updated weights for policy 1, policy_version 65301 (0.0009) +[2023-10-13 02:58:38,334][46663] Updated weights for policy 1, policy_version 65311 (0.0010) +[2023-10-13 02:58:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133824512. Throughput: 0: 1692.9, 1: 1663.9. Samples: 33463230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:38,607][45375] Avg episode reward: [(0, '54.750'), (1, '46.900')] +[2023-10-13 02:58:40,154][46662] Updated weights for policy 0, policy_version 65380 (0.0008) +[2023-10-13 02:58:40,516][46662] Updated weights for policy 0, policy_version 65390 (0.0007) +[2023-10-13 02:58:40,885][46662] Updated weights for policy 0, policy_version 65400 (0.0008) +[2023-10-13 02:58:42,491][46663] Updated weights for policy 1, policy_version 65321 (0.0008) +[2023-10-13 02:58:42,846][46663] Updated weights for policy 1, policy_version 65331 (0.0010) +[2023-10-13 02:58:43,222][46663] Updated weights for policy 1, policy_version 65341 (0.0010) +[2023-10-13 02:58:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133890048. Throughput: 0: 1673.1, 1: 1692.0. Samples: 33474022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:43,608][45375] Avg episode reward: [(0, '54.680'), (1, '46.980')] +[2023-10-13 02:58:44,838][46662] Updated weights for policy 0, policy_version 65410 (0.0008) +[2023-10-13 02:58:45,213][46662] Updated weights for policy 0, policy_version 65420 (0.0007) +[2023-10-13 02:58:45,580][46662] Updated weights for policy 0, policy_version 65430 (0.0007) +[2023-10-13 02:58:45,954][46662] Updated weights for policy 0, policy_version 65440 (0.0009) +[2023-10-13 02:58:47,278][46663] Updated weights for policy 1, policy_version 65351 (0.0007) +[2023-10-13 02:58:47,649][46663] Updated weights for policy 1, policy_version 65361 (0.0008) +[2023-10-13 02:58:48,012][46663] Updated weights for policy 1, policy_version 65371 (0.0010) +[2023-10-13 02:58:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133955584. Throughput: 0: 1678.4, 1: 1677.6. Samples: 33493984. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:48,607][45375] Avg episode reward: [(0, '54.850'), (1, '48.510')] +[2023-10-13 02:58:50,017][46662] Updated weights for policy 0, policy_version 65450 (0.0008) +[2023-10-13 02:58:50,387][46662] Updated weights for policy 0, policy_version 65460 (0.0009) +[2023-10-13 02:58:50,751][46662] Updated weights for policy 0, policy_version 65470 (0.0009) +[2023-10-13 02:58:52,007][46663] Updated weights for policy 1, policy_version 65381 (0.0008) +[2023-10-13 02:58:52,365][46663] Updated weights for policy 1, policy_version 65391 (0.0010) +[2023-10-13 02:58:52,729][46663] Updated weights for policy 1, policy_version 65401 (0.0011) +[2023-10-13 02:58:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134021120. Throughput: 0: 1692.5, 1: 1671.2. Samples: 33514092. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:53,608][45375] Avg episode reward: [(0, '55.720'), (1, '49.010')] +[2023-10-13 02:58:54,730][46662] Updated weights for policy 0, policy_version 65480 (0.0009) +[2023-10-13 02:58:55,100][46662] Updated weights for policy 0, policy_version 65490 (0.0009) +[2023-10-13 02:58:55,481][46662] Updated weights for policy 0, policy_version 65500 (0.0007) +[2023-10-13 02:58:56,700][46663] Updated weights for policy 1, policy_version 65411 (0.0008) +[2023-10-13 02:58:57,065][46663] Updated weights for policy 1, policy_version 65421 (0.0008) +[2023-10-13 02:58:57,440][46663] Updated weights for policy 1, policy_version 65431 (0.0007) +[2023-10-13 02:58:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134086656. Throughput: 0: 1668.0, 1: 1696.1. Samples: 33524676. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:58:58,608][45375] Avg episode reward: [(0, '55.860'), (1, '46.520')] +[2023-10-13 02:58:59,689][46662] Updated weights for policy 0, policy_version 65510 (0.0008) +[2023-10-13 02:59:00,056][46662] Updated weights for policy 0, policy_version 65520 (0.0010) +[2023-10-13 02:59:00,425][46662] Updated weights for policy 0, policy_version 65530 (0.0011) +[2023-10-13 02:59:01,506][46663] Updated weights for policy 1, policy_version 65441 (0.0007) +[2023-10-13 02:59:01,875][46663] Updated weights for policy 1, policy_version 65451 (0.0007) +[2023-10-13 02:59:02,240][46663] Updated weights for policy 1, policy_version 65461 (0.0008) +[2023-10-13 02:59:02,606][46663] Updated weights for policy 1, policy_version 65471 (0.0008) +[2023-10-13 02:59:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134152192. Throughput: 0: 1684.8, 1: 1668.1. Samples: 33544078. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:03,608][45375] Avg episode reward: [(0, '56.360'), (1, '46.860')] +[2023-10-13 02:59:04,515][46662] Updated weights for policy 0, policy_version 65540 (0.0009) +[2023-10-13 02:59:04,879][46662] Updated weights for policy 0, policy_version 65550 (0.0007) +[2023-10-13 02:59:05,260][46662] Updated weights for policy 0, policy_version 65560 (0.0010) +[2023-10-13 02:59:06,714][46663] Updated weights for policy 1, policy_version 65481 (0.0009) +[2023-10-13 02:59:07,085][46663] Updated weights for policy 1, policy_version 65491 (0.0010) +[2023-10-13 02:59:07,447][46663] Updated weights for policy 1, policy_version 65501 (0.0007) +[2023-10-13 02:59:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134217728. Throughput: 0: 1693.9, 1: 1672.7. Samples: 33564566. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:08,607][45375] Avg episode reward: [(0, '56.290'), (1, '48.840')] +[2023-10-13 02:59:09,304][46662] Updated weights for policy 0, policy_version 65570 (0.0009) +[2023-10-13 02:59:09,688][46662] Updated weights for policy 0, policy_version 65580 (0.0007) +[2023-10-13 02:59:10,063][46662] Updated weights for policy 0, policy_version 65590 (0.0007) +[2023-10-13 02:59:10,422][46662] Updated weights for policy 0, policy_version 65600 (0.0008) +[2023-10-13 02:59:11,590][46663] Updated weights for policy 1, policy_version 65511 (0.0008) +[2023-10-13 02:59:11,952][46663] Updated weights for policy 1, policy_version 65521 (0.0011) +[2023-10-13 02:59:12,317][46663] Updated weights for policy 1, policy_version 65531 (0.0009) +[2023-10-13 02:59:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134283264. Throughput: 0: 1676.2, 1: 1683.3. Samples: 33574898. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:13,608][45375] Avg episode reward: [(0, '56.140'), (1, '49.180')] +[2023-10-13 02:59:14,321][46662] Updated weights for policy 0, policy_version 65610 (0.0007) +[2023-10-13 02:59:14,690][46662] Updated weights for policy 0, policy_version 65620 (0.0007) +[2023-10-13 02:59:15,068][46662] Updated weights for policy 0, policy_version 65630 (0.0007) +[2023-10-13 02:59:16,504][46663] Updated weights for policy 1, policy_version 65541 (0.0007) +[2023-10-13 02:59:16,865][46663] Updated weights for policy 1, policy_version 65551 (0.0007) +[2023-10-13 02:59:17,232][46663] Updated weights for policy 1, policy_version 65561 (0.0008) +[2023-10-13 02:59:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134348800. Throughput: 0: 1690.0, 1: 1660.2. Samples: 33594540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:18,607][45375] Avg episode reward: [(0, '57.740'), (1, '49.380')] +[2023-10-13 02:59:19,224][46662] Updated weights for policy 0, policy_version 65640 (0.0010) +[2023-10-13 02:59:19,588][46662] Updated weights for policy 0, policy_version 65650 (0.0009) +[2023-10-13 02:59:19,963][46662] Updated weights for policy 0, policy_version 65660 (0.0009) +[2023-10-13 02:59:21,149][46663] Updated weights for policy 1, policy_version 65571 (0.0009) +[2023-10-13 02:59:21,506][46663] Updated weights for policy 1, policy_version 65581 (0.0009) +[2023-10-13 02:59:21,881][46663] Updated weights for policy 1, policy_version 65591 (0.0009) +[2023-10-13 02:59:23,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134414336. Throughput: 0: 1692.4, 1: 1688.3. Samples: 33615358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:23,607][45375] Avg episode reward: [(0, '56.970'), (1, '48.680')] +[2023-10-13 02:59:23,978][46662] Updated weights for policy 0, policy_version 65670 (0.0010) +[2023-10-13 02:59:24,358][46662] Updated weights for policy 0, policy_version 65680 (0.0009) +[2023-10-13 02:59:24,724][46662] Updated weights for policy 0, policy_version 65690 (0.0007) +[2023-10-13 02:59:25,814][46663] Updated weights for policy 1, policy_version 65601 (0.0011) +[2023-10-13 02:59:26,186][46663] Updated weights for policy 1, policy_version 65611 (0.0009) +[2023-10-13 02:59:26,551][46663] Updated weights for policy 1, policy_version 65621 (0.0008) +[2023-10-13 02:59:26,922][46663] Updated weights for policy 1, policy_version 65631 (0.0010) +[2023-10-13 02:59:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134479872. Throughput: 0: 1677.8, 1: 1679.7. Samples: 33625110. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:28,607][45375] Avg episode reward: [(0, '56.910'), (1, '48.480')] +[2023-10-13 02:59:28,723][46662] Updated weights for policy 0, policy_version 65700 (0.0009) +[2023-10-13 02:59:29,088][46662] Updated weights for policy 0, policy_version 65710 (0.0009) +[2023-10-13 02:59:29,456][46662] Updated weights for policy 0, policy_version 65720 (0.0009) +[2023-10-13 02:59:30,986][46663] Updated weights for policy 1, policy_version 65641 (0.0009) +[2023-10-13 02:59:31,350][46663] Updated weights for policy 1, policy_version 65651 (0.0009) +[2023-10-13 02:59:31,722][46663] Updated weights for policy 1, policy_version 65661 (0.0009) +[2023-10-13 02:59:33,544][46662] Updated weights for policy 0, policy_version 65730 (0.0008) +[2023-10-13 02:59:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134545408. Throughput: 0: 1691.4, 1: 1669.4. Samples: 33645222. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:33,607][45375] Avg episode reward: [(0, '56.980'), (1, '48.780')] +[2023-10-13 02:59:33,913][46662] Updated weights for policy 0, policy_version 65740 (0.0009) +[2023-10-13 02:59:34,281][46662] Updated weights for policy 0, policy_version 65750 (0.0009) +[2023-10-13 02:59:34,648][46662] Updated weights for policy 0, policy_version 65760 (0.0008) +[2023-10-13 02:59:35,810][46663] Updated weights for policy 1, policy_version 65671 (0.0008) +[2023-10-13 02:59:36,179][46663] Updated weights for policy 1, policy_version 65681 (0.0009) +[2023-10-13 02:59:36,542][46663] Updated weights for policy 1, policy_version 65691 (0.0008) +[2023-10-13 02:59:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134610944. Throughput: 0: 1688.4, 1: 1692.7. Samples: 33666238. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 02:59:38,607][45375] Avg episode reward: [(0, '57.410'), (1, '48.660')] +[2023-10-13 02:59:38,725][46662] Updated weights for policy 0, policy_version 65770 (0.0007) +[2023-10-13 02:59:39,101][46662] Updated weights for policy 0, policy_version 65780 (0.0008) +[2023-10-13 02:59:39,469][46662] Updated weights for policy 0, policy_version 65790 (0.0007) +[2023-10-13 02:59:40,442][46663] Updated weights for policy 1, policy_version 65701 (0.0010) +[2023-10-13 02:59:40,820][46663] Updated weights for policy 1, policy_version 65711 (0.0009) +[2023-10-13 02:59:41,187][46663] Updated weights for policy 1, policy_version 65721 (0.0010) +[2023-10-13 02:59:43,429][46662] Updated weights for policy 0, policy_version 65800 (0.0009) +[2023-10-13 02:59:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134676480. Throughput: 0: 1688.7, 1: 1663.8. Samples: 33675538. 
Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 02:59:43,607][45375] Avg episode reward: [(0, '57.450'), (1, '47.500')] +[2023-10-13 02:59:43,803][46662] Updated weights for policy 0, policy_version 65810 (0.0010) +[2023-10-13 02:59:44,177][46662] Updated weights for policy 0, policy_version 65820 (0.0010) +[2023-10-13 02:59:45,393][46663] Updated weights for policy 1, policy_version 65731 (0.0010) +[2023-10-13 02:59:45,754][46663] Updated weights for policy 1, policy_version 65741 (0.0009) +[2023-10-13 02:59:46,125][46663] Updated weights for policy 1, policy_version 65751 (0.0007) +[2023-10-13 02:59:48,198][46662] Updated weights for policy 0, policy_version 65830 (0.0009) +[2023-10-13 02:59:48,569][46662] Updated weights for policy 0, policy_version 65840 (0.0007) +[2023-10-13 02:59:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134742016. Throughput: 0: 1696.7, 1: 1682.5. Samples: 33696140. Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 02:59:48,607][45375] Avg episode reward: [(0, '57.200'), (1, '48.630')] +[2023-10-13 02:59:48,939][46662] Updated weights for policy 0, policy_version 65850 (0.0011) +[2023-10-13 02:59:50,115][46663] Updated weights for policy 1, policy_version 65761 (0.0008) +[2023-10-13 02:59:50,480][46663] Updated weights for policy 1, policy_version 65771 (0.0009) +[2023-10-13 02:59:50,845][46663] Updated weights for policy 1, policy_version 65781 (0.0008) +[2023-10-13 02:59:51,205][46663] Updated weights for policy 1, policy_version 65791 (0.0008) +[2023-10-13 02:59:52,906][46662] Updated weights for policy 0, policy_version 65860 (0.0008) +[2023-10-13 02:59:53,277][46662] Updated weights for policy 0, policy_version 65870 (0.0012) +[2023-10-13 02:59:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134807552. Throughput: 0: 1689.7, 1: 1693.7. Samples: 33716820. 
Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 02:59:53,608][45375] Avg episode reward: [(0, '56.880'), (1, '47.830')] +[2023-10-13 02:59:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000065792_67371008.pth... +[2023-10-13 02:59:53,648][46662] Updated weights for policy 0, policy_version 65880 (0.0010) +[2023-10-13 02:59:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000064224_65765376.pth +[2023-10-13 02:59:53,935][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000065888_67469312.pth... +[2023-10-13 02:59:53,964][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000064288_65830912.pth +[2023-10-13 02:59:55,249][46663] Updated weights for policy 1, policy_version 65801 (0.0009) +[2023-10-13 02:59:55,616][46663] Updated weights for policy 1, policy_version 65811 (0.0009) +[2023-10-13 02:59:55,989][46663] Updated weights for policy 1, policy_version 65821 (0.0009) +[2023-10-13 02:59:57,901][46662] Updated weights for policy 0, policy_version 65890 (0.0008) +[2023-10-13 02:59:58,315][46662] Updated weights for policy 0, policy_version 65900 (0.0009) +[2023-10-13 02:59:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134873088. Throughput: 0: 1687.5, 1: 1669.5. Samples: 33725962. 
Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 02:59:58,607][45375] Avg episode reward: [(0, '59.280'), (1, '47.590')] +[2023-10-13 02:59:58,680][46662] Updated weights for policy 0, policy_version 65910 (0.0008) +[2023-10-13 02:59:59,056][46662] Updated weights for policy 0, policy_version 65920 (0.0008) +[2023-10-13 03:00:00,086][46663] Updated weights for policy 1, policy_version 65831 (0.0010) +[2023-10-13 03:00:00,463][46663] Updated weights for policy 1, policy_version 65841 (0.0009) +[2023-10-13 03:00:00,820][46663] Updated weights for policy 1, policy_version 65851 (0.0010) +[2023-10-13 03:00:02,900][46662] Updated weights for policy 0, policy_version 65930 (0.0007) +[2023-10-13 03:00:03,271][46662] Updated weights for policy 0, policy_version 65940 (0.0009) +[2023-10-13 03:00:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134938624. Throughput: 0: 1683.9, 1: 1692.5. Samples: 33746476. Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 03:00:03,607][45375] Avg episode reward: [(0, '59.060'), (1, '47.990')] +[2023-10-13 03:00:03,647][46662] Updated weights for policy 0, policy_version 65950 (0.0008) +[2023-10-13 03:00:04,861][46663] Updated weights for policy 1, policy_version 65861 (0.0008) +[2023-10-13 03:00:05,231][46663] Updated weights for policy 1, policy_version 65871 (0.0008) +[2023-10-13 03:00:05,605][46663] Updated weights for policy 1, policy_version 65881 (0.0007) +[2023-10-13 03:00:07,759][46662] Updated weights for policy 0, policy_version 65960 (0.0008) +[2023-10-13 03:00:08,137][46662] Updated weights for policy 0, policy_version 65970 (0.0007) +[2023-10-13 03:00:08,508][46662] Updated weights for policy 0, policy_version 65980 (0.0009) +[2023-10-13 03:00:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135004160. Throughput: 0: 1672.7, 1: 1693.5. Samples: 33766836. 
Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 03:00:08,607][45375] Avg episode reward: [(0, '58.470'), (1, '47.600')] +[2023-10-13 03:00:09,719][46663] Updated weights for policy 1, policy_version 65891 (0.0007) +[2023-10-13 03:00:10,115][46663] Updated weights for policy 1, policy_version 65901 (0.0009) +[2023-10-13 03:00:10,493][46663] Updated weights for policy 1, policy_version 65911 (0.0008) +[2023-10-13 03:00:12,630][46662] Updated weights for policy 0, policy_version 65990 (0.0010) +[2023-10-13 03:00:12,992][46662] Updated weights for policy 0, policy_version 66000 (0.0009) +[2023-10-13 03:00:13,361][46662] Updated weights for policy 0, policy_version 66010 (0.0007) +[2023-10-13 03:00:13,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135102464. Throughput: 0: 1686.5, 1: 1670.7. Samples: 33776188. Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 03:00:13,608][45375] Avg episode reward: [(0, '58.250'), (1, '48.000')] +[2023-10-13 03:00:14,623][46663] Updated weights for policy 1, policy_version 65921 (0.0009) +[2023-10-13 03:00:14,992][46663] Updated weights for policy 1, policy_version 65931 (0.0010) +[2023-10-13 03:00:15,349][46663] Updated weights for policy 1, policy_version 65941 (0.0008) +[2023-10-13 03:00:15,713][46663] Updated weights for policy 1, policy_version 65951 (0.0010) +[2023-10-13 03:00:17,373][46662] Updated weights for policy 0, policy_version 66020 (0.0009) +[2023-10-13 03:00:17,746][46662] Updated weights for policy 0, policy_version 66030 (0.0011) +[2023-10-13 03:00:18,112][46662] Updated weights for policy 0, policy_version 66040 (0.0008) +[2023-10-13 03:00:18,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135168000. Throughput: 0: 1685.9, 1: 1682.1. Samples: 33796782. 
Policy #0 lag: (min: 2.0, avg: 4.2, max: 34.0) +[2023-10-13 03:00:18,607][45375] Avg episode reward: [(0, '58.760'), (1, '48.870')] +[2023-10-13 03:00:19,897][46663] Updated weights for policy 1, policy_version 65961 (0.0011) +[2023-10-13 03:00:20,260][46663] Updated weights for policy 1, policy_version 65971 (0.0010) +[2023-10-13 03:00:20,624][46663] Updated weights for policy 1, policy_version 65981 (0.0011) +[2023-10-13 03:00:22,160][46662] Updated weights for policy 0, policy_version 66050 (0.0008) +[2023-10-13 03:00:22,525][46662] Updated weights for policy 0, policy_version 66060 (0.0008) +[2023-10-13 03:00:22,897][46662] Updated weights for policy 0, policy_version 66070 (0.0010) +[2023-10-13 03:00:23,264][46662] Updated weights for policy 0, policy_version 66080 (0.0010) +[2023-10-13 03:00:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135233536. Throughput: 0: 1669.3, 1: 1674.4. Samples: 33816706. Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:23,608][45375] Avg episode reward: [(0, '59.750'), (1, '49.690')] +[2023-10-13 03:00:24,766][46663] Updated weights for policy 1, policy_version 65991 (0.0009) +[2023-10-13 03:00:25,132][46663] Updated weights for policy 1, policy_version 66001 (0.0009) +[2023-10-13 03:00:25,496][46663] Updated weights for policy 1, policy_version 66011 (0.0009) +[2023-10-13 03:00:27,381][46662] Updated weights for policy 0, policy_version 66090 (0.0010) +[2023-10-13 03:00:27,748][46662] Updated weights for policy 0, policy_version 66100 (0.0009) +[2023-10-13 03:00:28,119][46662] Updated weights for policy 0, policy_version 66110 (0.0011) +[2023-10-13 03:00:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135299072. Throughput: 0: 1681.8, 1: 1667.2. Samples: 33826244. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:28,607][45375] Avg episode reward: [(0, '60.040'), (1, '49.810')] +[2023-10-13 03:00:29,578][46663] Updated weights for policy 1, policy_version 66021 (0.0007) +[2023-10-13 03:00:29,944][46663] Updated weights for policy 1, policy_version 66031 (0.0007) +[2023-10-13 03:00:30,316][46663] Updated weights for policy 1, policy_version 66041 (0.0008) +[2023-10-13 03:00:32,234][46662] Updated weights for policy 0, policy_version 66120 (0.0010) +[2023-10-13 03:00:32,596][46662] Updated weights for policy 0, policy_version 66130 (0.0010) +[2023-10-13 03:00:32,961][46662] Updated weights for policy 0, policy_version 66140 (0.0011) +[2023-10-13 03:00:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135364608. Throughput: 0: 1680.7, 1: 1667.2. Samples: 33846800. Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:33,608][45375] Avg episode reward: [(0, '59.140'), (1, '50.190')] +[2023-10-13 03:00:34,377][46663] Updated weights for policy 1, policy_version 66051 (0.0009) +[2023-10-13 03:00:34,749][46663] Updated weights for policy 1, policy_version 66061 (0.0010) +[2023-10-13 03:00:35,121][46663] Updated weights for policy 1, policy_version 66071 (0.0010) +[2023-10-13 03:00:36,922][46662] Updated weights for policy 0, policy_version 66150 (0.0010) +[2023-10-13 03:00:37,287][46662] Updated weights for policy 0, policy_version 66160 (0.0009) +[2023-10-13 03:00:37,654][46662] Updated weights for policy 0, policy_version 66170 (0.0010) +[2023-10-13 03:00:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135430144. Throughput: 0: 1659.6, 1: 1662.5. Samples: 33866312. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:38,607][45375] Avg episode reward: [(0, '59.780'), (1, '49.590')] +[2023-10-13 03:00:39,334][46663] Updated weights for policy 1, policy_version 66081 (0.0009) +[2023-10-13 03:00:39,698][46663] Updated weights for policy 1, policy_version 66091 (0.0008) +[2023-10-13 03:00:40,058][46663] Updated weights for policy 1, policy_version 66101 (0.0007) +[2023-10-13 03:00:40,429][46663] Updated weights for policy 1, policy_version 66111 (0.0007) +[2023-10-13 03:00:41,873][46662] Updated weights for policy 0, policy_version 66180 (0.0009) +[2023-10-13 03:00:42,246][46662] Updated weights for policy 0, policy_version 66190 (0.0009) +[2023-10-13 03:00:42,619][46662] Updated weights for policy 0, policy_version 66200 (0.0009) +[2023-10-13 03:00:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135495680. Throughput: 0: 1681.4, 1: 1658.6. Samples: 33876260. Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:43,607][45375] Avg episode reward: [(0, '58.720'), (1, '49.380')] +[2023-10-13 03:00:44,611][46663] Updated weights for policy 1, policy_version 66121 (0.0009) +[2023-10-13 03:00:44,969][46663] Updated weights for policy 1, policy_version 66131 (0.0007) +[2023-10-13 03:00:45,337][46663] Updated weights for policy 1, policy_version 66141 (0.0007) +[2023-10-13 03:00:46,794][46662] Updated weights for policy 0, policy_version 66210 (0.0008) +[2023-10-13 03:00:47,213][46662] Updated weights for policy 0, policy_version 66220 (0.0008) +[2023-10-13 03:00:47,584][46662] Updated weights for policy 0, policy_version 66230 (0.0008) +[2023-10-13 03:00:47,948][46662] Updated weights for policy 0, policy_version 66240 (0.0009) +[2023-10-13 03:00:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135561216. Throughput: 0: 1678.7, 1: 1666.1. Samples: 33896992. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:48,607][45375] Avg episode reward: [(0, '58.010'), (1, '51.060')] +[2023-10-13 03:00:49,369][46663] Updated weights for policy 1, policy_version 66151 (0.0007) +[2023-10-13 03:00:49,737][46663] Updated weights for policy 1, policy_version 66161 (0.0008) +[2023-10-13 03:00:50,112][46663] Updated weights for policy 1, policy_version 66171 (0.0008) +[2023-10-13 03:00:51,983][46662] Updated weights for policy 0, policy_version 66250 (0.0009) +[2023-10-13 03:00:52,356][46662] Updated weights for policy 0, policy_version 66260 (0.0007) +[2023-10-13 03:00:52,727][46662] Updated weights for policy 0, policy_version 66270 (0.0007) +[2023-10-13 03:00:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135626752. Throughput: 0: 1666.5, 1: 1667.2. Samples: 33916852. Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:53,608][45375] Avg episode reward: [(0, '57.040'), (1, '48.460')] +[2023-10-13 03:00:54,128][46663] Updated weights for policy 1, policy_version 66181 (0.0007) +[2023-10-13 03:00:54,514][46663] Updated weights for policy 1, policy_version 66191 (0.0007) +[2023-10-13 03:00:54,876][46663] Updated weights for policy 1, policy_version 66201 (0.0007) +[2023-10-13 03:00:56,756][46662] Updated weights for policy 0, policy_version 66280 (0.0007) +[2023-10-13 03:00:57,120][46662] Updated weights for policy 0, policy_version 66290 (0.0007) +[2023-10-13 03:00:57,497][46662] Updated weights for policy 0, policy_version 66300 (0.0008) +[2023-10-13 03:00:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135692288. Throughput: 0: 1683.4, 1: 1669.1. Samples: 33927052. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:00:58,607][45375] Avg episode reward: [(0, '57.810'), (1, '50.540')] +[2023-10-13 03:00:58,894][46663] Updated weights for policy 1, policy_version 66211 (0.0009) +[2023-10-13 03:00:59,257][46663] Updated weights for policy 1, policy_version 66221 (0.0007) +[2023-10-13 03:00:59,626][46663] Updated weights for policy 1, policy_version 66231 (0.0007) +[2023-10-13 03:01:01,574][46662] Updated weights for policy 0, policy_version 66310 (0.0010) +[2023-10-13 03:01:01,938][46662] Updated weights for policy 0, policy_version 66320 (0.0008) +[2023-10-13 03:01:02,311][46662] Updated weights for policy 0, policy_version 66330 (0.0008) +[2023-10-13 03:01:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135757824. Throughput: 0: 1666.3, 1: 1681.4. Samples: 33947428. Policy #0 lag: (min: 31.0, avg: 33.0, max: 61.0) +[2023-10-13 03:01:03,607][45375] Avg episode reward: [(0, '57.710'), (1, '50.810')] +[2023-10-13 03:01:03,627][46663] Updated weights for policy 1, policy_version 66241 (0.0011) +[2023-10-13 03:01:03,982][46663] Updated weights for policy 1, policy_version 66251 (0.0011) +[2023-10-13 03:01:04,354][46663] Updated weights for policy 1, policy_version 66261 (0.0009) +[2023-10-13 03:01:04,720][46663] Updated weights for policy 1, policy_version 66271 (0.0009) +[2023-10-13 03:01:06,386][46662] Updated weights for policy 0, policy_version 66340 (0.0010) +[2023-10-13 03:01:06,751][46662] Updated weights for policy 0, policy_version 66350 (0.0007) +[2023-10-13 03:01:07,119][46662] Updated weights for policy 0, policy_version 66360 (0.0009) +[2023-10-13 03:01:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135823360. Throughput: 0: 1665.5, 1: 1679.5. Samples: 33967230. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:08,607][45375] Avg episode reward: [(0, '57.080'), (1, '50.090')] +[2023-10-13 03:01:08,838][46663] Updated weights for policy 1, policy_version 66281 (0.0010) +[2023-10-13 03:01:09,207][46663] Updated weights for policy 1, policy_version 66291 (0.0008) +[2023-10-13 03:01:09,571][46663] Updated weights for policy 1, policy_version 66301 (0.0008) +[2023-10-13 03:01:11,101][46662] Updated weights for policy 0, policy_version 66370 (0.0009) +[2023-10-13 03:01:11,476][46662] Updated weights for policy 0, policy_version 66380 (0.0009) +[2023-10-13 03:01:11,848][46662] Updated weights for policy 0, policy_version 66390 (0.0009) +[2023-10-13 03:01:12,217][46662] Updated weights for policy 0, policy_version 66400 (0.0007) +[2023-10-13 03:01:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 135888896. Throughput: 0: 1683.6, 1: 1685.6. Samples: 33977854. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:13,608][45375] Avg episode reward: [(0, '54.590'), (1, '52.350')] +[2023-10-13 03:01:13,611][46663] Updated weights for policy 1, policy_version 66311 (0.0008) +[2023-10-13 03:01:13,982][46663] Updated weights for policy 1, policy_version 66321 (0.0007) +[2023-10-13 03:01:14,342][46663] Updated weights for policy 1, policy_version 66331 (0.0009) +[2023-10-13 03:01:16,230][46662] Updated weights for policy 0, policy_version 66410 (0.0009) +[2023-10-13 03:01:16,603][46662] Updated weights for policy 0, policy_version 66420 (0.0010) +[2023-10-13 03:01:16,970][46662] Updated weights for policy 0, policy_version 66430 (0.0010) +[2023-10-13 03:01:18,596][46663] Updated weights for policy 1, policy_version 66341 (0.0009) +[2023-10-13 03:01:18,606][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 135954432. Throughput: 0: 1663.8, 1: 1694.2. Samples: 33997910. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:18,607][45375] Avg episode reward: [(0, '55.120'), (1, '50.970')] +[2023-10-13 03:01:18,960][46663] Updated weights for policy 1, policy_version 66351 (0.0008) +[2023-10-13 03:01:19,326][46663] Updated weights for policy 1, policy_version 66361 (0.0007) +[2023-10-13 03:01:21,002][46662] Updated weights for policy 0, policy_version 66440 (0.0010) +[2023-10-13 03:01:21,362][46662] Updated weights for policy 0, policy_version 66450 (0.0008) +[2023-10-13 03:01:21,740][46662] Updated weights for policy 0, policy_version 66460 (0.0009) +[2023-10-13 03:01:23,286][46663] Updated weights for policy 1, policy_version 66371 (0.0008) +[2023-10-13 03:01:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136019968. Throughput: 0: 1677.2, 1: 1699.6. Samples: 34018266. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:23,607][45375] Avg episode reward: [(0, '55.580'), (1, '50.680')] +[2023-10-13 03:01:23,655][46663] Updated weights for policy 1, policy_version 66381 (0.0007) +[2023-10-13 03:01:24,013][46663] Updated weights for policy 1, policy_version 66391 (0.0007) +[2023-10-13 03:01:25,859][46662] Updated weights for policy 0, policy_version 66470 (0.0010) +[2023-10-13 03:01:26,228][46662] Updated weights for policy 0, policy_version 66480 (0.0007) +[2023-10-13 03:01:26,601][46662] Updated weights for policy 0, policy_version 66490 (0.0010) +[2023-10-13 03:01:28,047][46663] Updated weights for policy 1, policy_version 66401 (0.0007) +[2023-10-13 03:01:28,414][46663] Updated weights for policy 1, policy_version 66411 (0.0009) +[2023-10-13 03:01:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136085504. Throughput: 0: 1681.2, 1: 1705.7. Samples: 34028670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:28,607][45375] Avg episode reward: [(0, '55.090'), (1, '50.660')] +[2023-10-13 03:01:28,780][46663] Updated weights for policy 1, policy_version 66421 (0.0009) +[2023-10-13 03:01:29,150][46663] Updated weights for policy 1, policy_version 66431 (0.0007) +[2023-10-13 03:01:30,715][46662] Updated weights for policy 0, policy_version 66500 (0.0009) +[2023-10-13 03:01:31,079][46662] Updated weights for policy 0, policy_version 66510 (0.0010) +[2023-10-13 03:01:31,454][46662] Updated weights for policy 0, policy_version 66520 (0.0010) +[2023-10-13 03:01:33,065][46663] Updated weights for policy 1, policy_version 66441 (0.0009) +[2023-10-13 03:01:33,428][46663] Updated weights for policy 1, policy_version 66451 (0.0007) +[2023-10-13 03:01:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136151040. Throughput: 0: 1656.0, 1: 1708.7. Samples: 34048406. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:33,607][45375] Avg episode reward: [(0, '53.740'), (1, '51.900')] +[2023-10-13 03:01:33,795][46663] Updated weights for policy 1, policy_version 66461 (0.0008) +[2023-10-13 03:01:35,572][46662] Updated weights for policy 0, policy_version 66530 (0.0010) +[2023-10-13 03:01:35,962][46662] Updated weights for policy 0, policy_version 66540 (0.0010) +[2023-10-13 03:01:36,328][46662] Updated weights for policy 0, policy_version 66550 (0.0010) +[2023-10-13 03:01:36,698][46662] Updated weights for policy 0, policy_version 66560 (0.0010) +[2023-10-13 03:01:37,835][46663] Updated weights for policy 1, policy_version 66471 (0.0008) +[2023-10-13 03:01:38,207][46663] Updated weights for policy 1, policy_version 66481 (0.0007) +[2023-10-13 03:01:38,566][46663] Updated weights for policy 1, policy_version 66491 (0.0009) +[2023-10-13 03:01:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136216576. 
Throughput: 0: 1675.4, 1: 1686.6. Samples: 34068144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:38,607][45375] Avg episode reward: [(0, '54.190'), (1, '51.800')] +[2023-10-13 03:01:40,936][46662] Updated weights for policy 0, policy_version 66570 (0.0009) +[2023-10-13 03:01:41,312][46662] Updated weights for policy 0, policy_version 66580 (0.0009) +[2023-10-13 03:01:41,685][46662] Updated weights for policy 0, policy_version 66590 (0.0009) +[2023-10-13 03:01:42,487][46663] Updated weights for policy 1, policy_version 66501 (0.0011) +[2023-10-13 03:01:42,885][46663] Updated weights for policy 1, policy_version 66511 (0.0011) +[2023-10-13 03:01:43,256][46663] Updated weights for policy 1, policy_version 66521 (0.0009) +[2023-10-13 03:01:43,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136314880. Throughput: 0: 1668.9, 1: 1708.6. Samples: 34079042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:43,607][45375] Avg episode reward: [(0, '53.400'), (1, '52.110')] +[2023-10-13 03:01:45,773][46662] Updated weights for policy 0, policy_version 66600 (0.0009) +[2023-10-13 03:01:46,141][46662] Updated weights for policy 0, policy_version 66610 (0.0007) +[2023-10-13 03:01:46,506][46662] Updated weights for policy 0, policy_version 66620 (0.0010) +[2023-10-13 03:01:47,116][46663] Updated weights for policy 1, policy_version 66531 (0.0007) +[2023-10-13 03:01:47,469][46663] Updated weights for policy 1, policy_version 66541 (0.0009) +[2023-10-13 03:01:47,836][46663] Updated weights for policy 1, policy_version 66551 (0.0009) +[2023-10-13 03:01:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136380416. Throughput: 0: 1660.2, 1: 1694.4. Samples: 34098388. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:01:48,607][45375] Avg episode reward: [(0, '53.950'), (1, '52.170')] +[2023-10-13 03:01:50,564][46662] Updated weights for policy 0, policy_version 66630 (0.0010) +[2023-10-13 03:01:50,933][46662] Updated weights for policy 0, policy_version 66640 (0.0008) +[2023-10-13 03:01:51,303][46662] Updated weights for policy 0, policy_version 66650 (0.0007) +[2023-10-13 03:01:51,933][46663] Updated weights for policy 1, policy_version 66561 (0.0009) +[2023-10-13 03:01:52,297][46663] Updated weights for policy 1, policy_version 66571 (0.0008) +[2023-10-13 03:01:52,656][46663] Updated weights for policy 1, policy_version 66581 (0.0010) +[2023-10-13 03:01:53,028][46663] Updated weights for policy 1, policy_version 66591 (0.0008) +[2023-10-13 03:01:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 136445952. Throughput: 0: 1680.6, 1: 1675.6. Samples: 34118260. Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:01:53,607][45375] Avg episode reward: [(0, '51.640'), (1, '52.100')] +[2023-10-13 03:01:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000066656_68255744.pth... +[2023-10-13 03:01:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000066592_68190208.pth... 
+[2023-10-13 03:01:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000065088_66650112.pth +[2023-10-13 03:01:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000065024_66584576.pth +[2023-10-13 03:01:55,295][46662] Updated weights for policy 0, policy_version 66660 (0.0009) +[2023-10-13 03:01:55,662][46662] Updated weights for policy 0, policy_version 66670 (0.0008) +[2023-10-13 03:01:56,028][46662] Updated weights for policy 0, policy_version 66680 (0.0010) +[2023-10-13 03:01:57,148][46663] Updated weights for policy 1, policy_version 66601 (0.0009) +[2023-10-13 03:01:57,519][46663] Updated weights for policy 1, policy_version 66611 (0.0007) +[2023-10-13 03:01:57,891][46663] Updated weights for policy 1, policy_version 66621 (0.0008) +[2023-10-13 03:01:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136511488. Throughput: 0: 1664.0, 1: 1705.5. Samples: 34129482. Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:01:58,607][45375] Avg episode reward: [(0, '52.060'), (1, '52.410')] +[2023-10-13 03:02:00,120][46662] Updated weights for policy 0, policy_version 66690 (0.0010) +[2023-10-13 03:02:00,497][46662] Updated weights for policy 0, policy_version 66700 (0.0009) +[2023-10-13 03:02:00,862][46662] Updated weights for policy 0, policy_version 66710 (0.0008) +[2023-10-13 03:02:01,224][46662] Updated weights for policy 0, policy_version 66720 (0.0008) +[2023-10-13 03:02:01,960][46663] Updated weights for policy 1, policy_version 66631 (0.0007) +[2023-10-13 03:02:02,327][46663] Updated weights for policy 1, policy_version 66641 (0.0008) +[2023-10-13 03:02:02,686][46663] Updated weights for policy 1, policy_version 66651 (0.0009) +[2023-10-13 03:02:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136577024. Throughput: 0: 1670.6, 1: 1690.3. Samples: 34149150. 
Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:03,608][45375] Avg episode reward: [(0, '51.820'), (1, '52.400')] +[2023-10-13 03:02:05,107][46662] Updated weights for policy 0, policy_version 66730 (0.0008) +[2023-10-13 03:02:05,477][46662] Updated weights for policy 0, policy_version 66740 (0.0007) +[2023-10-13 03:02:05,845][46662] Updated weights for policy 0, policy_version 66750 (0.0008) +[2023-10-13 03:02:06,802][46663] Updated weights for policy 1, policy_version 66661 (0.0009) +[2023-10-13 03:02:07,177][46663] Updated weights for policy 1, policy_version 66671 (0.0009) +[2023-10-13 03:02:07,536][46663] Updated weights for policy 1, policy_version 66681 (0.0007) +[2023-10-13 03:02:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136642560. Throughput: 0: 1683.5, 1: 1677.9. Samples: 34169526. Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:08,607][45375] Avg episode reward: [(0, '50.510'), (1, '52.070')] +[2023-10-13 03:02:09,913][46662] Updated weights for policy 0, policy_version 66760 (0.0009) +[2023-10-13 03:02:10,284][46662] Updated weights for policy 0, policy_version 66770 (0.0009) +[2023-10-13 03:02:10,659][46662] Updated weights for policy 0, policy_version 66780 (0.0011) +[2023-10-13 03:02:11,696][46663] Updated weights for policy 1, policy_version 66691 (0.0008) +[2023-10-13 03:02:12,055][46663] Updated weights for policy 1, policy_version 66701 (0.0010) +[2023-10-13 03:02:12,423][46663] Updated weights for policy 1, policy_version 66711 (0.0009) +[2023-10-13 03:02:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 136708096. Throughput: 0: 1656.6, 1: 1699.6. Samples: 34179696. 
Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:13,607][45375] Avg episode reward: [(0, '49.880'), (1, '53.210')] +[2023-10-13 03:02:14,809][46662] Updated weights for policy 0, policy_version 66790 (0.0009) +[2023-10-13 03:02:15,170][46662] Updated weights for policy 0, policy_version 66800 (0.0008) +[2023-10-13 03:02:15,542][46662] Updated weights for policy 0, policy_version 66810 (0.0008) +[2023-10-13 03:02:16,435][46663] Updated weights for policy 1, policy_version 66721 (0.0008) +[2023-10-13 03:02:16,805][46663] Updated weights for policy 1, policy_version 66731 (0.0008) +[2023-10-13 03:02:17,174][46663] Updated weights for policy 1, policy_version 66741 (0.0008) +[2023-10-13 03:02:17,541][46663] Updated weights for policy 1, policy_version 66751 (0.0011) +[2023-10-13 03:02:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136773632. Throughput: 0: 1683.4, 1: 1671.9. Samples: 34199394. Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:18,607][45375] Avg episode reward: [(0, '50.950'), (1, '52.410')] +[2023-10-13 03:02:19,562][46662] Updated weights for policy 0, policy_version 66820 (0.0010) +[2023-10-13 03:02:19,940][46662] Updated weights for policy 0, policy_version 66830 (0.0008) +[2023-10-13 03:02:20,309][46662] Updated weights for policy 0, policy_version 66840 (0.0008) +[2023-10-13 03:02:21,561][46663] Updated weights for policy 1, policy_version 66761 (0.0008) +[2023-10-13 03:02:21,923][46663] Updated weights for policy 1, policy_version 66771 (0.0008) +[2023-10-13 03:02:22,292][46663] Updated weights for policy 1, policy_version 66781 (0.0008) +[2023-10-13 03:02:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136839168. Throughput: 0: 1687.6, 1: 1680.9. Samples: 34219726. 
Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:23,608][45375] Avg episode reward: [(0, '51.500'), (1, '51.830')] +[2023-10-13 03:02:24,408][46662] Updated weights for policy 0, policy_version 66850 (0.0008) +[2023-10-13 03:02:24,826][46662] Updated weights for policy 0, policy_version 66860 (0.0008) +[2023-10-13 03:02:25,195][46662] Updated weights for policy 0, policy_version 66870 (0.0009) +[2023-10-13 03:02:25,567][46662] Updated weights for policy 0, policy_version 66880 (0.0008) +[2023-10-13 03:02:26,458][46663] Updated weights for policy 1, policy_version 66791 (0.0007) +[2023-10-13 03:02:26,827][46663] Updated weights for policy 1, policy_version 66801 (0.0007) +[2023-10-13 03:02:27,188][46663] Updated weights for policy 1, policy_version 66811 (0.0009) +[2023-10-13 03:02:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136904704. Throughput: 0: 1661.1, 1: 1686.9. Samples: 34229706. Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:28,608][45375] Avg episode reward: [(0, '50.790'), (1, '52.800')] +[2023-10-13 03:02:29,540][46662] Updated weights for policy 0, policy_version 66890 (0.0009) +[2023-10-13 03:02:29,909][46662] Updated weights for policy 0, policy_version 66900 (0.0008) +[2023-10-13 03:02:30,273][46662] Updated weights for policy 0, policy_version 66910 (0.0007) +[2023-10-13 03:02:31,115][46663] Updated weights for policy 1, policy_version 66821 (0.0009) +[2023-10-13 03:02:31,486][46663] Updated weights for policy 1, policy_version 66831 (0.0010) +[2023-10-13 03:02:31,852][46663] Updated weights for policy 1, policy_version 66841 (0.0010) +[2023-10-13 03:02:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136970240. Throughput: 0: 1682.7, 1: 1667.0. Samples: 34249122. 
Policy #0 lag: (min: 17.0, avg: 26.3, max: 49.0) +[2023-10-13 03:02:33,607][45375] Avg episode reward: [(0, '49.540'), (1, '53.290')] +[2023-10-13 03:02:34,397][46662] Updated weights for policy 0, policy_version 66920 (0.0008) +[2023-10-13 03:02:34,767][46662] Updated weights for policy 0, policy_version 66930 (0.0008) +[2023-10-13 03:02:35,149][46662] Updated weights for policy 0, policy_version 66940 (0.0008) +[2023-10-13 03:02:36,084][46663] Updated weights for policy 1, policy_version 66851 (0.0010) +[2023-10-13 03:02:36,445][46663] Updated weights for policy 1, policy_version 66861 (0.0008) +[2023-10-13 03:02:36,803][46663] Updated weights for policy 1, policy_version 66871 (0.0008) +[2023-10-13 03:02:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137035776. Throughput: 0: 1678.8, 1: 1685.2. Samples: 34269638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:02:38,607][45375] Avg episode reward: [(0, '50.560'), (1, '52.520')] +[2023-10-13 03:02:39,332][46662] Updated weights for policy 0, policy_version 66950 (0.0008) +[2023-10-13 03:02:39,706][46662] Updated weights for policy 0, policy_version 66960 (0.0008) +[2023-10-13 03:02:40,076][46662] Updated weights for policy 0, policy_version 66970 (0.0009) +[2023-10-13 03:02:40,815][46663] Updated weights for policy 1, policy_version 66881 (0.0008) +[2023-10-13 03:02:41,183][46663] Updated weights for policy 1, policy_version 66891 (0.0010) +[2023-10-13 03:02:41,550][46663] Updated weights for policy 1, policy_version 66901 (0.0011) +[2023-10-13 03:02:41,919][46663] Updated weights for policy 1, policy_version 66911 (0.0008) +[2023-10-13 03:02:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137101312. Throughput: 0: 1664.0, 1: 1668.0. Samples: 34279422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:02:43,608][45375] Avg episode reward: [(0, '51.430'), (1, '53.050')] +[2023-10-13 03:02:44,259][46662] Updated weights for policy 0, policy_version 66980 (0.0008) +[2023-10-13 03:02:44,643][46662] Updated weights for policy 0, policy_version 66990 (0.0009) +[2023-10-13 03:02:45,016][46662] Updated weights for policy 0, policy_version 67000 (0.0010) +[2023-10-13 03:02:45,959][46663] Updated weights for policy 1, policy_version 66921 (0.0008) +[2023-10-13 03:02:46,334][46663] Updated weights for policy 1, policy_version 66931 (0.0008) +[2023-10-13 03:02:46,704][46663] Updated weights for policy 1, policy_version 66941 (0.0011) +[2023-10-13 03:02:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137166848. Throughput: 0: 1679.2, 1: 1663.5. Samples: 34299572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:02:48,607][45375] Avg episode reward: [(0, '52.500'), (1, '53.770')] +[2023-10-13 03:02:49,087][46662] Updated weights for policy 0, policy_version 67010 (0.0009) +[2023-10-13 03:02:49,455][46662] Updated weights for policy 0, policy_version 67020 (0.0007) +[2023-10-13 03:02:49,819][46662] Updated weights for policy 0, policy_version 67030 (0.0007) +[2023-10-13 03:02:50,183][46662] Updated weights for policy 0, policy_version 67040 (0.0008) +[2023-10-13 03:02:50,699][46663] Updated weights for policy 1, policy_version 66951 (0.0009) +[2023-10-13 03:02:51,068][46663] Updated weights for policy 1, policy_version 66961 (0.0008) +[2023-10-13 03:02:51,435][46663] Updated weights for policy 1, policy_version 66971 (0.0009) +[2023-10-13 03:02:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137232384. Throughput: 0: 1673.6, 1: 1676.1. Samples: 34320262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:02:53,607][45375] Avg episode reward: [(0, '52.340'), (1, '54.050')] +[2023-10-13 03:02:54,216][46662] Updated weights for policy 0, policy_version 67050 (0.0008) +[2023-10-13 03:02:54,583][46662] Updated weights for policy 0, policy_version 67060 (0.0008) +[2023-10-13 03:02:54,954][46662] Updated weights for policy 0, policy_version 67070 (0.0008) +[2023-10-13 03:02:55,397][46663] Updated weights for policy 1, policy_version 66981 (0.0009) +[2023-10-13 03:02:55,770][46663] Updated weights for policy 1, policy_version 66991 (0.0011) +[2023-10-13 03:02:56,135][46663] Updated weights for policy 1, policy_version 67001 (0.0010) +[2023-10-13 03:02:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137297920. Throughput: 0: 1675.5, 1: 1654.4. Samples: 34329538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:02:58,607][45375] Avg episode reward: [(0, '53.030'), (1, '53.840')] +[2023-10-13 03:02:59,039][46662] Updated weights for policy 0, policy_version 67080 (0.0009) +[2023-10-13 03:02:59,416][46662] Updated weights for policy 0, policy_version 67090 (0.0010) +[2023-10-13 03:02:59,793][46662] Updated weights for policy 0, policy_version 67100 (0.0007) +[2023-10-13 03:03:00,389][46663] Updated weights for policy 1, policy_version 67011 (0.0008) +[2023-10-13 03:03:00,759][46663] Updated weights for policy 1, policy_version 67021 (0.0007) +[2023-10-13 03:03:01,125][46663] Updated weights for policy 1, policy_version 67031 (0.0009) +[2023-10-13 03:03:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137363456. Throughput: 0: 1680.6, 1: 1669.4. Samples: 34350144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:03:03,607][45375] Avg episode reward: [(0, '53.060'), (1, '54.000')] +[2023-10-13 03:03:03,959][46662] Updated weights for policy 0, policy_version 67110 (0.0008) +[2023-10-13 03:03:04,326][46662] Updated weights for policy 0, policy_version 67120 (0.0009) +[2023-10-13 03:03:04,686][46662] Updated weights for policy 0, policy_version 67130 (0.0009) +[2023-10-13 03:03:05,371][46663] Updated weights for policy 1, policy_version 67041 (0.0008) +[2023-10-13 03:03:05,730][46663] Updated weights for policy 1, policy_version 67051 (0.0008) +[2023-10-13 03:03:06,096][46663] Updated weights for policy 1, policy_version 67061 (0.0010) +[2023-10-13 03:03:06,469][46663] Updated weights for policy 1, policy_version 67071 (0.0010) +[2023-10-13 03:03:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137428992. Throughput: 0: 1679.4, 1: 1679.1. Samples: 34370856. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:03:08,608][45375] Avg episode reward: [(0, '53.320'), (1, '54.050')] +[2023-10-13 03:03:08,768][46662] Updated weights for policy 0, policy_version 67140 (0.0008) +[2023-10-13 03:03:09,146][46662] Updated weights for policy 0, policy_version 67150 (0.0008) +[2023-10-13 03:03:09,505][46662] Updated weights for policy 0, policy_version 67160 (0.0007) +[2023-10-13 03:03:10,544][46663] Updated weights for policy 1, policy_version 67081 (0.0008) +[2023-10-13 03:03:10,914][46663] Updated weights for policy 1, policy_version 67091 (0.0008) +[2023-10-13 03:03:11,279][46663] Updated weights for policy 1, policy_version 67101 (0.0008) +[2023-10-13 03:03:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137494528. Throughput: 0: 1683.7, 1: 1656.8. Samples: 34380026. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:03:13,607][45375] Avg episode reward: [(0, '52.910'), (1, '54.970')] +[2023-10-13 03:03:13,608][46662] Updated weights for policy 0, policy_version 67170 (0.0009) +[2023-10-13 03:03:14,018][46662] Updated weights for policy 0, policy_version 67180 (0.0008) +[2023-10-13 03:03:14,388][46662] Updated weights for policy 0, policy_version 67190 (0.0010) +[2023-10-13 03:03:14,753][46662] Updated weights for policy 0, policy_version 67200 (0.0008) +[2023-10-13 03:03:15,372][46663] Updated weights for policy 1, policy_version 67111 (0.0009) +[2023-10-13 03:03:15,742][46663] Updated weights for policy 1, policy_version 67121 (0.0011) +[2023-10-13 03:03:16,111][46663] Updated weights for policy 1, policy_version 67131 (0.0008) +[2023-10-13 03:03:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137560064. Throughput: 0: 1682.4, 1: 1677.7. Samples: 34400330. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:03:18,607][45375] Avg episode reward: [(0, '55.270'), (1, '54.920')] +[2023-10-13 03:03:18,808][46662] Updated weights for policy 0, policy_version 67210 (0.0010) +[2023-10-13 03:03:19,175][46662] Updated weights for policy 0, policy_version 67220 (0.0010) +[2023-10-13 03:03:19,539][46662] Updated weights for policy 0, policy_version 67230 (0.0011) +[2023-10-13 03:03:20,169][46663] Updated weights for policy 1, policy_version 67141 (0.0007) +[2023-10-13 03:03:20,561][46663] Updated weights for policy 1, policy_version 67151 (0.0007) +[2023-10-13 03:03:20,933][46663] Updated weights for policy 1, policy_version 67161 (0.0010) +[2023-10-13 03:03:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137625600. Throughput: 0: 1681.1, 1: 1677.3. Samples: 34420768. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:23,607][45375] Avg episode reward: [(0, '56.240'), (1, '54.170')] +[2023-10-13 03:03:23,694][46662] Updated weights for policy 0, policy_version 67240 (0.0011) +[2023-10-13 03:03:24,063][46662] Updated weights for policy 0, policy_version 67250 (0.0010) +[2023-10-13 03:03:24,442][46662] Updated weights for policy 0, policy_version 67260 (0.0010) +[2023-10-13 03:03:24,953][46663] Updated weights for policy 1, policy_version 67171 (0.0008) +[2023-10-13 03:03:25,329][46663] Updated weights for policy 1, policy_version 67181 (0.0009) +[2023-10-13 03:03:25,685][46663] Updated weights for policy 1, policy_version 67191 (0.0008) +[2023-10-13 03:03:28,501][46662] Updated weights for policy 0, policy_version 67270 (0.0009) +[2023-10-13 03:03:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137691136. Throughput: 0: 1678.5, 1: 1662.1. Samples: 34429750. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:28,607][45375] Avg episode reward: [(0, '58.590'), (1, '53.690')] +[2023-10-13 03:03:28,873][46662] Updated weights for policy 0, policy_version 67280 (0.0011) +[2023-10-13 03:03:29,250][46662] Updated weights for policy 0, policy_version 67290 (0.0009) +[2023-10-13 03:03:29,613][46663] Updated weights for policy 1, policy_version 67201 (0.0008) +[2023-10-13 03:03:29,966][46663] Updated weights for policy 1, policy_version 67211 (0.0011) +[2023-10-13 03:03:30,335][46663] Updated weights for policy 1, policy_version 67221 (0.0009) +[2023-10-13 03:03:30,696][46663] Updated weights for policy 1, policy_version 67231 (0.0010) +[2023-10-13 03:03:33,246][46662] Updated weights for policy 0, policy_version 67300 (0.0009) +[2023-10-13 03:03:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137756672. Throughput: 0: 1679.3, 1: 1678.7. Samples: 34450684. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:33,607][45375] Avg episode reward: [(0, '57.330'), (1, '54.930')] +[2023-10-13 03:03:33,617][46662] Updated weights for policy 0, policy_version 67310 (0.0007) +[2023-10-13 03:03:33,988][46662] Updated weights for policy 0, policy_version 67320 (0.0008) +[2023-10-13 03:03:34,864][46663] Updated weights for policy 1, policy_version 67241 (0.0007) +[2023-10-13 03:03:35,232][46663] Updated weights for policy 1, policy_version 67251 (0.0008) +[2023-10-13 03:03:35,605][46663] Updated weights for policy 1, policy_version 67261 (0.0008) +[2023-10-13 03:03:38,069][46662] Updated weights for policy 0, policy_version 67330 (0.0007) +[2023-10-13 03:03:38,442][46662] Updated weights for policy 0, policy_version 67340 (0.0007) +[2023-10-13 03:03:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137822208. Throughput: 0: 1679.4, 1: 1676.7. Samples: 34471286. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:38,608][45375] Avg episode reward: [(0, '57.240'), (1, '54.250')] +[2023-10-13 03:03:38,824][46662] Updated weights for policy 0, policy_version 67350 (0.0007) +[2023-10-13 03:03:39,184][46662] Updated weights for policy 0, policy_version 67360 (0.0008) +[2023-10-13 03:03:39,801][46663] Updated weights for policy 1, policy_version 67271 (0.0007) +[2023-10-13 03:03:40,167][46663] Updated weights for policy 1, policy_version 67281 (0.0007) +[2023-10-13 03:03:40,542][46663] Updated weights for policy 1, policy_version 67291 (0.0008) +[2023-10-13 03:03:43,142][46662] Updated weights for policy 0, policy_version 67370 (0.0008) +[2023-10-13 03:03:43,504][46662] Updated weights for policy 0, policy_version 67380 (0.0008) +[2023-10-13 03:03:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137887744. Throughput: 0: 1682.5, 1: 1671.4. Samples: 34480464. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:43,608][45375] Avg episode reward: [(0, '57.240'), (1, '55.610')] +[2023-10-13 03:03:43,881][46662] Updated weights for policy 0, policy_version 67390 (0.0008) +[2023-10-13 03:03:44,612][46663] Updated weights for policy 1, policy_version 67301 (0.0009) +[2023-10-13 03:03:44,988][46663] Updated weights for policy 1, policy_version 67311 (0.0007) +[2023-10-13 03:03:45,347][46663] Updated weights for policy 1, policy_version 67321 (0.0008) +[2023-10-13 03:03:47,812][46662] Updated weights for policy 0, policy_version 67400 (0.0012) +[2023-10-13 03:03:48,181][46662] Updated weights for policy 0, policy_version 67410 (0.0011) +[2023-10-13 03:03:48,558][46662] Updated weights for policy 0, policy_version 67420 (0.0008) +[2023-10-13 03:03:48,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137953280. Throughput: 0: 1680.1, 1: 1673.1. Samples: 34501040. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:48,607][45375] Avg episode reward: [(0, '60.860'), (1, '54.490')] +[2023-10-13 03:03:49,598][46663] Updated weights for policy 1, policy_version 67331 (0.0010) +[2023-10-13 03:03:49,969][46663] Updated weights for policy 1, policy_version 67341 (0.0010) +[2023-10-13 03:03:50,323][46663] Updated weights for policy 1, policy_version 67351 (0.0010) +[2023-10-13 03:03:52,762][46662] Updated weights for policy 0, policy_version 67430 (0.0009) +[2023-10-13 03:03:53,135][46662] Updated weights for policy 0, policy_version 67440 (0.0007) +[2023-10-13 03:03:53,501][46662] Updated weights for policy 0, policy_version 67450 (0.0008) +[2023-10-13 03:03:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138018816. Throughput: 0: 1669.2, 1: 1668.2. Samples: 34521040. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:53,607][45375] Avg episode reward: [(0, '61.870'), (1, '53.620')] +[2023-10-13 03:03:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000067360_68976640.pth... +[2023-10-13 03:03:53,650][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000065792_67371008.pth +[2023-10-13 03:03:53,724][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000067456_69074944.pth... +[2023-10-13 03:03:53,764][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000065888_67469312.pth +[2023-10-13 03:03:53,769][46091] Saving new best policy, reward=61.870! +[2023-10-13 03:03:54,401][46663] Updated weights for policy 1, policy_version 67361 (0.0007) +[2023-10-13 03:03:54,758][46663] Updated weights for policy 1, policy_version 67371 (0.0008) +[2023-10-13 03:03:55,130][46663] Updated weights for policy 1, policy_version 67381 (0.0009) +[2023-10-13 03:03:55,486][46663] Updated weights for policy 1, policy_version 67391 (0.0011) +[2023-10-13 03:03:57,565][46662] Updated weights for policy 0, policy_version 67460 (0.0009) +[2023-10-13 03:03:57,934][46662] Updated weights for policy 0, policy_version 67470 (0.0007) +[2023-10-13 03:03:58,299][46662] Updated weights for policy 0, policy_version 67480 (0.0007) +[2023-10-13 03:03:58,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138117120. Throughput: 0: 1675.9, 1: 1667.2. Samples: 34530464. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:03:58,607][45375] Avg episode reward: [(0, '61.470'), (1, '53.830')] +[2023-10-13 03:03:59,536][46663] Updated weights for policy 1, policy_version 67401 (0.0009) +[2023-10-13 03:03:59,900][46663] Updated weights for policy 1, policy_version 67411 (0.0008) +[2023-10-13 03:04:00,254][46663] Updated weights for policy 1, policy_version 67421 (0.0010) +[2023-10-13 03:04:02,602][46662] Updated weights for policy 0, policy_version 67490 (0.0008) +[2023-10-13 03:04:02,988][46662] Updated weights for policy 0, policy_version 67500 (0.0008) +[2023-10-13 03:04:03,347][46662] Updated weights for policy 0, policy_version 67510 (0.0007) +[2023-10-13 03:04:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138149888. Throughput: 0: 1677.1, 1: 1681.4. Samples: 34551462. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-13 03:04:03,607][45375] Avg episode reward: [(0, '59.910'), (1, '52.380')] +[2023-10-13 03:04:03,723][46662] Updated weights for policy 0, policy_version 67520 (0.0009) +[2023-10-13 03:04:04,368][46663] Updated weights for policy 1, policy_version 67431 (0.0009) +[2023-10-13 03:04:04,732][46663] Updated weights for policy 1, policy_version 67441 (0.0009) +[2023-10-13 03:04:05,092][46663] Updated weights for policy 1, policy_version 67451 (0.0009) +[2023-10-13 03:04:07,899][46662] Updated weights for policy 0, policy_version 67530 (0.0008) +[2023-10-13 03:04:08,264][46662] Updated weights for policy 0, policy_version 67540 (0.0009) +[2023-10-13 03:04:08,607][45375] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138215424. Throughput: 0: 1669.3, 1: 1682.4. Samples: 34571598. 
Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:08,607][45375] Avg episode reward: [(0, '59.530'), (1, '52.230')] +[2023-10-13 03:04:08,627][46662] Updated weights for policy 0, policy_version 67550 (0.0009) +[2023-10-13 03:04:09,240][46663] Updated weights for policy 1, policy_version 67461 (0.0009) +[2023-10-13 03:04:09,634][46663] Updated weights for policy 1, policy_version 67471 (0.0007) +[2023-10-13 03:04:10,003][46663] Updated weights for policy 1, policy_version 67481 (0.0008) +[2023-10-13 03:04:12,684][46662] Updated weights for policy 0, policy_version 67560 (0.0008) +[2023-10-13 03:04:13,052][46662] Updated weights for policy 0, policy_version 67570 (0.0008) +[2023-10-13 03:04:13,426][46662] Updated weights for policy 0, policy_version 67580 (0.0008) +[2023-10-13 03:04:13,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138313728. Throughput: 0: 1679.2, 1: 1680.6. Samples: 34580942. Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:13,607][45375] Avg episode reward: [(0, '57.870'), (1, '51.740')] +[2023-10-13 03:04:14,042][46663] Updated weights for policy 1, policy_version 67491 (0.0007) +[2023-10-13 03:04:14,408][46663] Updated weights for policy 1, policy_version 67501 (0.0008) +[2023-10-13 03:04:14,786][46663] Updated weights for policy 1, policy_version 67511 (0.0011) +[2023-10-13 03:04:17,355][46662] Updated weights for policy 0, policy_version 67590 (0.0008) +[2023-10-13 03:04:17,732][46662] Updated weights for policy 0, policy_version 67600 (0.0009) +[2023-10-13 03:04:18,099][46662] Updated weights for policy 0, policy_version 67610 (0.0009) +[2023-10-13 03:04:18,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138379264. Throughput: 0: 1677.2, 1: 1677.7. Samples: 34601654. 
Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:18,608][45375] Avg episode reward: [(0, '58.230'), (1, '50.660')] +[2023-10-13 03:04:18,848][46663] Updated weights for policy 1, policy_version 67521 (0.0009) +[2023-10-13 03:04:19,226][46663] Updated weights for policy 1, policy_version 67531 (0.0008) +[2023-10-13 03:04:19,584][46663] Updated weights for policy 1, policy_version 67541 (0.0009) +[2023-10-13 03:04:19,955][46663] Updated weights for policy 1, policy_version 67551 (0.0009) +[2023-10-13 03:04:22,298][46662] Updated weights for policy 0, policy_version 67620 (0.0010) +[2023-10-13 03:04:22,662][46662] Updated weights for policy 0, policy_version 67630 (0.0007) +[2023-10-13 03:04:23,041][46662] Updated weights for policy 0, policy_version 67640 (0.0007) +[2023-10-13 03:04:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138444800. Throughput: 0: 1662.3, 1: 1681.5. Samples: 34621756. Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:23,607][45375] Avg episode reward: [(0, '58.810'), (1, '51.950')] +[2023-10-13 03:04:24,033][46663] Updated weights for policy 1, policy_version 67561 (0.0010) +[2023-10-13 03:04:24,387][46663] Updated weights for policy 1, policy_version 67571 (0.0009) +[2023-10-13 03:04:24,761][46663] Updated weights for policy 1, policy_version 67581 (0.0010) +[2023-10-13 03:04:26,910][46662] Updated weights for policy 0, policy_version 67650 (0.0007) +[2023-10-13 03:04:27,284][46662] Updated weights for policy 0, policy_version 67660 (0.0008) +[2023-10-13 03:04:27,645][46662] Updated weights for policy 0, policy_version 67670 (0.0009) +[2023-10-13 03:04:28,021][46662] Updated weights for policy 0, policy_version 67680 (0.0007) +[2023-10-13 03:04:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138510336. Throughput: 0: 1677.4, 1: 1681.4. Samples: 34631612. 
Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:28,607][45375] Avg episode reward: [(0, '57.640'), (1, '53.460')] +[2023-10-13 03:04:28,951][46663] Updated weights for policy 1, policy_version 67591 (0.0008) +[2023-10-13 03:04:29,310][46663] Updated weights for policy 1, policy_version 67601 (0.0008) +[2023-10-13 03:04:29,685][46663] Updated weights for policy 1, policy_version 67611 (0.0007) +[2023-10-13 03:04:32,058][46662] Updated weights for policy 0, policy_version 67690 (0.0009) +[2023-10-13 03:04:32,417][46662] Updated weights for policy 0, policy_version 67700 (0.0008) +[2023-10-13 03:04:32,785][46662] Updated weights for policy 0, policy_version 67710 (0.0008) +[2023-10-13 03:04:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138575872. Throughput: 0: 1675.6, 1: 1684.8. Samples: 34652262. Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:33,607][45375] Avg episode reward: [(0, '56.470'), (1, '54.470')] +[2023-10-13 03:04:33,703][46663] Updated weights for policy 1, policy_version 67621 (0.0009) +[2023-10-13 03:04:34,068][46663] Updated weights for policy 1, policy_version 67631 (0.0011) +[2023-10-13 03:04:34,437][46663] Updated weights for policy 1, policy_version 67641 (0.0011) +[2023-10-13 03:04:36,705][46662] Updated weights for policy 0, policy_version 67720 (0.0008) +[2023-10-13 03:04:37,077][46662] Updated weights for policy 0, policy_version 67730 (0.0009) +[2023-10-13 03:04:37,446][46662] Updated weights for policy 0, policy_version 67740 (0.0008) +[2023-10-13 03:04:38,551][46663] Updated weights for policy 1, policy_version 67651 (0.0008) +[2023-10-13 03:04:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 138641408. Throughput: 0: 1663.6, 1: 1692.1. Samples: 34672044. 
Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:38,607][45375] Avg episode reward: [(0, '55.060'), (1, '54.440')] +[2023-10-13 03:04:38,920][46663] Updated weights for policy 1, policy_version 67661 (0.0009) +[2023-10-13 03:04:39,280][46663] Updated weights for policy 1, policy_version 67671 (0.0008) +[2023-10-13 03:04:41,648][46662] Updated weights for policy 0, policy_version 67750 (0.0009) +[2023-10-13 03:04:42,025][46662] Updated weights for policy 0, policy_version 67760 (0.0009) +[2023-10-13 03:04:42,394][46662] Updated weights for policy 0, policy_version 67770 (0.0009) +[2023-10-13 03:04:43,333][46663] Updated weights for policy 1, policy_version 67681 (0.0007) +[2023-10-13 03:04:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 138706944. Throughput: 0: 1684.8, 1: 1691.7. Samples: 34682406. Policy #0 lag: (min: 10.0, avg: 10.9, max: 31.0) +[2023-10-13 03:04:43,607][45375] Avg episode reward: [(0, '54.870'), (1, '54.080')] +[2023-10-13 03:04:43,695][46663] Updated weights for policy 1, policy_version 67691 (0.0007) +[2023-10-13 03:04:44,061][46663] Updated weights for policy 1, policy_version 67701 (0.0007) +[2023-10-13 03:04:44,418][46663] Updated weights for policy 1, policy_version 67711 (0.0009) +[2023-10-13 03:04:46,537][46662] Updated weights for policy 0, policy_version 67780 (0.0009) +[2023-10-13 03:04:46,908][46662] Updated weights for policy 0, policy_version 67790 (0.0008) +[2023-10-13 03:04:47,277][46662] Updated weights for policy 0, policy_version 67800 (0.0010) +[2023-10-13 03:04:48,429][46663] Updated weights for policy 1, policy_version 67721 (0.0010) +[2023-10-13 03:04:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138772480. Throughput: 0: 1676.4, 1: 1686.1. Samples: 34702774. 
Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:04:48,607][45375] Avg episode reward: [(0, '55.810'), (1, '54.310')] +[2023-10-13 03:04:48,789][46663] Updated weights for policy 1, policy_version 67731 (0.0008) +[2023-10-13 03:04:49,159][46663] Updated weights for policy 1, policy_version 67741 (0.0008) +[2023-10-13 03:04:51,449][46662] Updated weights for policy 0, policy_version 67810 (0.0009) +[2023-10-13 03:04:51,860][46662] Updated weights for policy 0, policy_version 67820 (0.0007) +[2023-10-13 03:04:52,232][46662] Updated weights for policy 0, policy_version 67830 (0.0009) +[2023-10-13 03:04:52,588][46662] Updated weights for policy 0, policy_version 67840 (0.0011) +[2023-10-13 03:04:53,257][46663] Updated weights for policy 1, policy_version 67751 (0.0009) +[2023-10-13 03:04:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138838016. Throughput: 0: 1663.7, 1: 1678.1. Samples: 34721978. Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:04:53,607][45375] Avg episode reward: [(0, '55.940'), (1, '54.810')] +[2023-10-13 03:04:53,633][46663] Updated weights for policy 1, policy_version 67761 (0.0008) +[2023-10-13 03:04:54,005][46663] Updated weights for policy 1, policy_version 67771 (0.0007) +[2023-10-13 03:04:56,632][46662] Updated weights for policy 0, policy_version 67850 (0.0011) +[2023-10-13 03:04:57,012][46662] Updated weights for policy 0, policy_version 67860 (0.0007) +[2023-10-13 03:04:57,380][46662] Updated weights for policy 0, policy_version 67870 (0.0008) +[2023-10-13 03:04:58,054][46663] Updated weights for policy 1, policy_version 67781 (0.0010) +[2023-10-13 03:04:58,461][46663] Updated weights for policy 1, policy_version 67791 (0.0008) +[2023-10-13 03:04:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 138903552. Throughput: 0: 1686.9, 1: 1687.8. Samples: 34732804. 
Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:04:58,607][45375] Avg episode reward: [(0, '55.200'), (1, '54.500')] +[2023-10-13 03:04:58,818][46663] Updated weights for policy 1, policy_version 67801 (0.0008) +[2023-10-13 03:05:01,327][46662] Updated weights for policy 0, policy_version 67880 (0.0011) +[2023-10-13 03:05:01,697][46662] Updated weights for policy 0, policy_version 67890 (0.0009) +[2023-10-13 03:05:02,084][46662] Updated weights for policy 0, policy_version 67900 (0.0009) +[2023-10-13 03:05:02,825][46663] Updated weights for policy 1, policy_version 67811 (0.0007) +[2023-10-13 03:05:03,195][46663] Updated weights for policy 1, policy_version 67821 (0.0009) +[2023-10-13 03:05:03,552][46663] Updated weights for policy 1, policy_version 67831 (0.0009) +[2023-10-13 03:05:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138969088. Throughput: 0: 1670.6, 1: 1684.0. Samples: 34752608. Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:05:03,607][45375] Avg episode reward: [(0, '53.740'), (1, '56.110')] +[2023-10-13 03:05:06,104][46662] Updated weights for policy 0, policy_version 67910 (0.0009) +[2023-10-13 03:05:06,479][46662] Updated weights for policy 0, policy_version 67920 (0.0009) +[2023-10-13 03:05:06,838][46662] Updated weights for policy 0, policy_version 67930 (0.0009) +[2023-10-13 03:05:07,701][46663] Updated weights for policy 1, policy_version 67841 (0.0007) +[2023-10-13 03:05:08,054][46663] Updated weights for policy 1, policy_version 67851 (0.0011) +[2023-10-13 03:05:08,414][46663] Updated weights for policy 1, policy_version 67861 (0.0011) +[2023-10-13 03:05:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 139034624. Throughput: 0: 1672.8, 1: 1664.2. Samples: 34771924. 
Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:05:08,607][45375] Avg episode reward: [(0, '55.030'), (1, '56.230')] +[2023-10-13 03:05:08,778][46663] Updated weights for policy 1, policy_version 67871 (0.0009) +[2023-10-13 03:05:11,035][46662] Updated weights for policy 0, policy_version 67940 (0.0008) +[2023-10-13 03:05:11,403][46662] Updated weights for policy 0, policy_version 67950 (0.0008) +[2023-10-13 03:05:11,763][46662] Updated weights for policy 0, policy_version 67960 (0.0010) +[2023-10-13 03:05:12,692][46663] Updated weights for policy 1, policy_version 67881 (0.0010) +[2023-10-13 03:05:13,052][46663] Updated weights for policy 1, policy_version 67891 (0.0009) +[2023-10-13 03:05:13,433][46663] Updated weights for policy 1, policy_version 67901 (0.0009) +[2023-10-13 03:05:13,607][45375] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139132928. Throughput: 0: 1684.3, 1: 1683.9. Samples: 34783184. Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:05:13,608][45375] Avg episode reward: [(0, '54.650'), (1, '57.960')] +[2023-10-13 03:05:15,719][46662] Updated weights for policy 0, policy_version 67970 (0.0010) +[2023-10-13 03:05:16,097][46662] Updated weights for policy 0, policy_version 67980 (0.0008) +[2023-10-13 03:05:16,465][46662] Updated weights for policy 0, policy_version 67990 (0.0007) +[2023-10-13 03:05:16,834][46662] Updated weights for policy 0, policy_version 68000 (0.0007) +[2023-10-13 03:05:17,569][46663] Updated weights for policy 1, policy_version 67911 (0.0009) +[2023-10-13 03:05:17,944][46663] Updated weights for policy 1, policy_version 67921 (0.0009) +[2023-10-13 03:05:18,313][46663] Updated weights for policy 1, policy_version 67931 (0.0007) +[2023-10-13 03:05:18,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 139198464. Throughput: 0: 1659.8, 1: 1682.8. Samples: 34802680. 
Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:05:18,607][45375] Avg episode reward: [(0, '55.660'), (1, '58.310')] +[2023-10-13 03:05:20,899][46662] Updated weights for policy 0, policy_version 68010 (0.0009) +[2023-10-13 03:05:21,273][46662] Updated weights for policy 0, policy_version 68020 (0.0011) +[2023-10-13 03:05:21,640][46662] Updated weights for policy 0, policy_version 68030 (0.0008) +[2023-10-13 03:05:22,317][46663] Updated weights for policy 1, policy_version 67941 (0.0009) +[2023-10-13 03:05:22,685][46663] Updated weights for policy 1, policy_version 67951 (0.0009) +[2023-10-13 03:05:23,052][46663] Updated weights for policy 1, policy_version 67961 (0.0008) +[2023-10-13 03:05:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139264000. Throughput: 0: 1682.7, 1: 1650.7. Samples: 34822052. Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:05:23,608][45375] Avg episode reward: [(0, '55.940'), (1, '57.890')] +[2023-10-13 03:05:25,663][46662] Updated weights for policy 0, policy_version 68040 (0.0010) +[2023-10-13 03:05:26,037][46662] Updated weights for policy 0, policy_version 68050 (0.0008) +[2023-10-13 03:05:26,411][46662] Updated weights for policy 0, policy_version 68060 (0.0009) +[2023-10-13 03:05:27,290][46663] Updated weights for policy 1, policy_version 67971 (0.0011) +[2023-10-13 03:05:27,657][46663] Updated weights for policy 1, policy_version 67981 (0.0008) +[2023-10-13 03:05:28,019][46663] Updated weights for policy 1, policy_version 67991 (0.0008) +[2023-10-13 03:05:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139329536. Throughput: 0: 1673.5, 1: 1676.8. Samples: 34833170. 
Policy #0 lag: (min: 8.0, avg: 35.1, max: 40.0) +[2023-10-13 03:05:28,607][45375] Avg episode reward: [(0, '56.000'), (1, '58.860')] +[2023-10-13 03:05:30,489][46662] Updated weights for policy 0, policy_version 68070 (0.0009) +[2023-10-13 03:05:30,860][46662] Updated weights for policy 0, policy_version 68080 (0.0009) +[2023-10-13 03:05:31,223][46662] Updated weights for policy 0, policy_version 68090 (0.0009) +[2023-10-13 03:05:32,080][46663] Updated weights for policy 1, policy_version 68001 (0.0009) +[2023-10-13 03:05:32,449][46663] Updated weights for policy 1, policy_version 68011 (0.0008) +[2023-10-13 03:05:32,814][46663] Updated weights for policy 1, policy_version 68021 (0.0007) +[2023-10-13 03:05:33,185][46663] Updated weights for policy 1, policy_version 68031 (0.0010) +[2023-10-13 03:05:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139395072. Throughput: 0: 1665.9, 1: 1665.8. Samples: 34852698. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:05:33,607][45375] Avg episode reward: [(0, '56.640'), (1, '60.050')] +[2023-10-13 03:05:35,325][46662] Updated weights for policy 0, policy_version 68100 (0.0008) +[2023-10-13 03:05:35,695][46662] Updated weights for policy 0, policy_version 68110 (0.0010) +[2023-10-13 03:05:36,066][46662] Updated weights for policy 0, policy_version 68120 (0.0011) +[2023-10-13 03:05:37,371][46663] Updated weights for policy 1, policy_version 68041 (0.0011) +[2023-10-13 03:05:37,738][46663] Updated weights for policy 1, policy_version 68051 (0.0011) +[2023-10-13 03:05:38,107][46663] Updated weights for policy 1, policy_version 68061 (0.0009) +[2023-10-13 03:05:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139460608. Throughput: 0: 1691.0, 1: 1655.9. Samples: 34872590. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:05:38,607][45375] Avg episode reward: [(0, '57.120'), (1, '60.070')] +[2023-10-13 03:05:39,999][46662] Updated weights for policy 0, policy_version 68130 (0.0011) +[2023-10-13 03:05:40,410][46662] Updated weights for policy 0, policy_version 68140 (0.0010) +[2023-10-13 03:05:40,775][46662] Updated weights for policy 0, policy_version 68150 (0.0009) +[2023-10-13 03:05:41,151][46662] Updated weights for policy 0, policy_version 68160 (0.0009) +[2023-10-13 03:05:41,935][46663] Updated weights for policy 1, policy_version 68071 (0.0007) +[2023-10-13 03:05:42,295][46663] Updated weights for policy 1, policy_version 68081 (0.0009) +[2023-10-13 03:05:42,661][46663] Updated weights for policy 1, policy_version 68091 (0.0011) +[2023-10-13 03:05:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139526144. Throughput: 0: 1667.5, 1: 1679.9. Samples: 34883438. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:05:43,608][45375] Avg episode reward: [(0, '56.650'), (1, '58.590')] +[2023-10-13 03:05:45,023][46662] Updated weights for policy 0, policy_version 68170 (0.0007) +[2023-10-13 03:05:45,391][46662] Updated weights for policy 0, policy_version 68180 (0.0009) +[2023-10-13 03:05:45,755][46662] Updated weights for policy 0, policy_version 68190 (0.0009) +[2023-10-13 03:05:47,038][46663] Updated weights for policy 1, policy_version 68101 (0.0009) +[2023-10-13 03:05:47,431][46663] Updated weights for policy 1, policy_version 68111 (0.0008) +[2023-10-13 03:05:47,803][46663] Updated weights for policy 1, policy_version 68121 (0.0008) +[2023-10-13 03:05:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139591680. Throughput: 0: 1678.2, 1: 1665.8. Samples: 34903090. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:05:48,607][45375] Avg episode reward: [(0, '58.470'), (1, '58.490')] +[2023-10-13 03:05:49,903][46662] Updated weights for policy 0, policy_version 68200 (0.0010) +[2023-10-13 03:05:50,275][46662] Updated weights for policy 0, policy_version 68210 (0.0007) +[2023-10-13 03:05:50,636][46662] Updated weights for policy 0, policy_version 68220 (0.0007) +[2023-10-13 03:05:51,680][46663] Updated weights for policy 1, policy_version 68131 (0.0009) +[2023-10-13 03:05:52,056][46663] Updated weights for policy 1, policy_version 68141 (0.0010) +[2023-10-13 03:05:52,429][46663] Updated weights for policy 1, policy_version 68151 (0.0010) +[2023-10-13 03:05:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139657216. Throughput: 0: 1693.1, 1: 1663.6. Samples: 34922978. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:05:53,607][45375] Avg episode reward: [(0, '57.990'), (1, '58.170')] +[2023-10-13 03:05:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000068224_69861376.pth... +[2023-10-13 03:05:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000068160_69795840.pth... 
+[2023-10-13 03:05:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000066656_68255744.pth +[2023-10-13 03:05:53,655][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000066592_68190208.pth +[2023-10-13 03:05:54,722][46662] Updated weights for policy 0, policy_version 68230 (0.0007) +[2023-10-13 03:05:55,090][46662] Updated weights for policy 0, policy_version 68240 (0.0008) +[2023-10-13 03:05:55,462][46662] Updated weights for policy 0, policy_version 68250 (0.0007) +[2023-10-13 03:05:56,483][46663] Updated weights for policy 1, policy_version 68161 (0.0010) +[2023-10-13 03:05:56,849][46663] Updated weights for policy 1, policy_version 68171 (0.0007) +[2023-10-13 03:05:57,224][46663] Updated weights for policy 1, policy_version 68181 (0.0007) +[2023-10-13 03:05:57,588][46663] Updated weights for policy 1, policy_version 68191 (0.0009) +[2023-10-13 03:05:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139722752. Throughput: 0: 1662.7, 1: 1670.9. Samples: 34933194. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:05:58,607][45375] Avg episode reward: [(0, '57.180'), (1, '59.600')] +[2023-10-13 03:05:59,387][46662] Updated weights for policy 0, policy_version 68260 (0.0011) +[2023-10-13 03:05:59,752][46662] Updated weights for policy 0, policy_version 68270 (0.0011) +[2023-10-13 03:06:00,128][46662] Updated weights for policy 0, policy_version 68280 (0.0011) +[2023-10-13 03:06:01,662][46663] Updated weights for policy 1, policy_version 68201 (0.0009) +[2023-10-13 03:06:02,031][46663] Updated weights for policy 1, policy_version 68211 (0.0008) +[2023-10-13 03:06:02,395][46663] Updated weights for policy 1, policy_version 68221 (0.0010) +[2023-10-13 03:06:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139788288. Throughput: 0: 1687.6, 1: 1646.8. Samples: 34952730. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:06:03,607][45375] Avg episode reward: [(0, '56.000'), (1, '59.840')] +[2023-10-13 03:06:04,347][46662] Updated weights for policy 0, policy_version 68290 (0.0008) +[2023-10-13 03:06:04,722][46662] Updated weights for policy 0, policy_version 68300 (0.0010) +[2023-10-13 03:06:05,088][46662] Updated weights for policy 0, policy_version 68310 (0.0010) +[2023-10-13 03:06:05,458][46662] Updated weights for policy 0, policy_version 68320 (0.0007) +[2023-10-13 03:06:06,540][46663] Updated weights for policy 1, policy_version 68231 (0.0011) +[2023-10-13 03:06:06,920][46663] Updated weights for policy 1, policy_version 68241 (0.0010) +[2023-10-13 03:06:07,290][46663] Updated weights for policy 1, policy_version 68251 (0.0010) +[2023-10-13 03:06:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139853824. Throughput: 0: 1686.3, 1: 1668.4. Samples: 34973010. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:06:08,607][45375] Avg episode reward: [(0, '55.040'), (1, '60.000')] +[2023-10-13 03:06:09,588][46662] Updated weights for policy 0, policy_version 68330 (0.0007) +[2023-10-13 03:06:09,950][46662] Updated weights for policy 0, policy_version 68340 (0.0009) +[2023-10-13 03:06:10,327][46662] Updated weights for policy 0, policy_version 68350 (0.0008) +[2023-10-13 03:06:11,389][46663] Updated weights for policy 1, policy_version 68261 (0.0009) +[2023-10-13 03:06:11,761][46663] Updated weights for policy 1, policy_version 68271 (0.0008) +[2023-10-13 03:06:12,119][46663] Updated weights for policy 1, policy_version 68281 (0.0010) +[2023-10-13 03:06:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 139919360. Throughput: 0: 1665.8, 1: 1667.2. Samples: 34983158. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-13 03:06:13,608][45375] Avg episode reward: [(0, '56.670'), (1, '58.100')] +[2023-10-13 03:06:14,300][46662] Updated weights for policy 0, policy_version 68360 (0.0007) +[2023-10-13 03:06:14,668][46662] Updated weights for policy 0, policy_version 68370 (0.0007) +[2023-10-13 03:06:15,034][46662] Updated weights for policy 0, policy_version 68380 (0.0008) +[2023-10-13 03:06:16,333][46663] Updated weights for policy 1, policy_version 68291 (0.0009) +[2023-10-13 03:06:16,696][46663] Updated weights for policy 1, policy_version 68301 (0.0007) +[2023-10-13 03:06:17,061][46663] Updated weights for policy 1, policy_version 68311 (0.0009) +[2023-10-13 03:06:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 139984896. Throughput: 0: 1688.5, 1: 1660.4. Samples: 35003398. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:18,607][45375] Avg episode reward: [(0, '56.830'), (1, '59.150')] +[2023-10-13 03:06:19,201][46662] Updated weights for policy 0, policy_version 68390 (0.0009) +[2023-10-13 03:06:19,572][46662] Updated weights for policy 0, policy_version 68400 (0.0009) +[2023-10-13 03:06:19,944][46662] Updated weights for policy 0, policy_version 68410 (0.0008) +[2023-10-13 03:06:20,953][46663] Updated weights for policy 1, policy_version 68321 (0.0008) +[2023-10-13 03:06:21,327][46663] Updated weights for policy 1, policy_version 68331 (0.0010) +[2023-10-13 03:06:21,687][46663] Updated weights for policy 1, policy_version 68341 (0.0009) +[2023-10-13 03:06:22,052][46663] Updated weights for policy 1, policy_version 68351 (0.0008) +[2023-10-13 03:06:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 140050432. Throughput: 0: 1683.1, 1: 1679.0. Samples: 35023884. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:23,607][45375] Avg episode reward: [(0, '57.070'), (1, '59.640')] +[2023-10-13 03:06:23,942][46662] Updated weights for policy 0, policy_version 68420 (0.0008) +[2023-10-13 03:06:24,305][46662] Updated weights for policy 0, policy_version 68430 (0.0007) +[2023-10-13 03:06:24,675][46662] Updated weights for policy 0, policy_version 68440 (0.0008) +[2023-10-13 03:06:26,199][46663] Updated weights for policy 1, policy_version 68361 (0.0008) +[2023-10-13 03:06:26,574][46663] Updated weights for policy 1, policy_version 68371 (0.0007) +[2023-10-13 03:06:26,945][46663] Updated weights for policy 1, policy_version 68381 (0.0008) +[2023-10-13 03:06:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140115968. Throughput: 0: 1676.9, 1: 1662.3. Samples: 35033702. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:28,607][45375] Avg episode reward: [(0, '57.030'), (1, '60.310')] +[2023-10-13 03:06:28,821][46662] Updated weights for policy 0, policy_version 68450 (0.0011) +[2023-10-13 03:06:29,194][46662] Updated weights for policy 0, policy_version 68460 (0.0011) +[2023-10-13 03:06:29,566][46662] Updated weights for policy 0, policy_version 68470 (0.0010) +[2023-10-13 03:06:29,933][46662] Updated weights for policy 0, policy_version 68480 (0.0007) +[2023-10-13 03:06:31,191][46663] Updated weights for policy 1, policy_version 68391 (0.0009) +[2023-10-13 03:06:31,549][46663] Updated weights for policy 1, policy_version 68401 (0.0009) +[2023-10-13 03:06:31,914][46663] Updated weights for policy 1, policy_version 68411 (0.0007) +[2023-10-13 03:06:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140181504. Throughput: 0: 1680.8, 1: 1660.0. Samples: 35053424. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:33,607][45375] Avg episode reward: [(0, '56.980'), (1, '59.230')] +[2023-10-13 03:06:33,864][46662] Updated weights for policy 0, policy_version 68490 (0.0007) +[2023-10-13 03:06:34,241][46662] Updated weights for policy 0, policy_version 68500 (0.0008) +[2023-10-13 03:06:34,610][46662] Updated weights for policy 0, policy_version 68510 (0.0008) +[2023-10-13 03:06:35,998][46663] Updated weights for policy 1, policy_version 68421 (0.0009) +[2023-10-13 03:06:36,401][46663] Updated weights for policy 1, policy_version 68431 (0.0008) +[2023-10-13 03:06:36,761][46663] Updated weights for policy 1, policy_version 68441 (0.0010) +[2023-10-13 03:06:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 140247040. Throughput: 0: 1679.9, 1: 1676.7. Samples: 35074024. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:38,607][45375] Avg episode reward: [(0, '56.190'), (1, '59.490')] +[2023-10-13 03:06:38,818][46662] Updated weights for policy 0, policy_version 68520 (0.0008) +[2023-10-13 03:06:39,178][46662] Updated weights for policy 0, policy_version 68530 (0.0010) +[2023-10-13 03:06:39,552][46662] Updated weights for policy 0, policy_version 68540 (0.0009) +[2023-10-13 03:06:40,809][46663] Updated weights for policy 1, policy_version 68451 (0.0009) +[2023-10-13 03:06:41,169][46663] Updated weights for policy 1, policy_version 68461 (0.0010) +[2023-10-13 03:06:41,539][46663] Updated weights for policy 1, policy_version 68471 (0.0007) +[2023-10-13 03:06:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140312576. Throughput: 0: 1678.4, 1: 1665.7. Samples: 35083678. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:43,607][45375] Avg episode reward: [(0, '55.010'), (1, '59.050')] +[2023-10-13 03:06:43,730][46662] Updated weights for policy 0, policy_version 68550 (0.0007) +[2023-10-13 03:06:44,105][46662] Updated weights for policy 0, policy_version 68560 (0.0008) +[2023-10-13 03:06:44,467][46662] Updated weights for policy 0, policy_version 68570 (0.0007) +[2023-10-13 03:06:45,585][46663] Updated weights for policy 1, policy_version 68481 (0.0008) +[2023-10-13 03:06:45,951][46663] Updated weights for policy 1, policy_version 68491 (0.0007) +[2023-10-13 03:06:46,320][46663] Updated weights for policy 1, policy_version 68501 (0.0007) +[2023-10-13 03:06:46,689][46663] Updated weights for policy 1, policy_version 68511 (0.0007) +[2023-10-13 03:06:48,447][46662] Updated weights for policy 0, policy_version 68580 (0.0008) +[2023-10-13 03:06:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140378112. Throughput: 0: 1683.7, 1: 1679.5. Samples: 35104076. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:48,607][45375] Avg episode reward: [(0, '54.590'), (1, '60.080')] +[2023-10-13 03:06:48,823][46662] Updated weights for policy 0, policy_version 68590 (0.0010) +[2023-10-13 03:06:49,188][46662] Updated weights for policy 0, policy_version 68600 (0.0010) +[2023-10-13 03:06:50,818][46663] Updated weights for policy 1, policy_version 68521 (0.0009) +[2023-10-13 03:06:51,181][46663] Updated weights for policy 1, policy_version 68531 (0.0012) +[2023-10-13 03:06:51,548][46663] Updated weights for policy 1, policy_version 68541 (0.0010) +[2023-10-13 03:06:53,422][46662] Updated weights for policy 0, policy_version 68610 (0.0010) +[2023-10-13 03:06:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140443648. Throughput: 0: 1683.2, 1: 1683.1. Samples: 35124492. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:53,608][45375] Avg episode reward: [(0, '52.080'), (1, '58.870')] +[2023-10-13 03:06:53,793][46662] Updated weights for policy 0, policy_version 68620 (0.0009) +[2023-10-13 03:06:54,162][46662] Updated weights for policy 0, policy_version 68630 (0.0009) +[2023-10-13 03:06:54,521][46662] Updated weights for policy 0, policy_version 68640 (0.0007) +[2023-10-13 03:06:55,770][46663] Updated weights for policy 1, policy_version 68551 (0.0008) +[2023-10-13 03:06:56,130][46663] Updated weights for policy 1, policy_version 68561 (0.0007) +[2023-10-13 03:06:56,491][46663] Updated weights for policy 1, policy_version 68571 (0.0007) +[2023-10-13 03:06:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140509184. Throughput: 0: 1684.0, 1: 1669.5. Samples: 35134068. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) +[2023-10-13 03:06:58,607][45375] Avg episode reward: [(0, '52.590'), (1, '59.170')] +[2023-10-13 03:06:58,641][46662] Updated weights for policy 0, policy_version 68650 (0.0008) +[2023-10-13 03:06:59,012][46662] Updated weights for policy 0, policy_version 68660 (0.0008) +[2023-10-13 03:06:59,375][46662] Updated weights for policy 0, policy_version 68670 (0.0008) +[2023-10-13 03:07:00,318][46663] Updated weights for policy 1, policy_version 68581 (0.0009) +[2023-10-13 03:07:00,680][46663] Updated weights for policy 1, policy_version 68591 (0.0010) +[2023-10-13 03:07:01,046][46663] Updated weights for policy 1, policy_version 68601 (0.0010) +[2023-10-13 03:07:03,493][46662] Updated weights for policy 0, policy_version 68680 (0.0009) +[2023-10-13 03:07:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140574720. Throughput: 0: 1675.5, 1: 1682.9. Samples: 35154526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:03,607][45375] Avg episode reward: [(0, '53.280'), (1, '58.950')] +[2023-10-13 03:07:03,865][46662] Updated weights for policy 0, policy_version 68690 (0.0011) +[2023-10-13 03:07:04,236][46662] Updated weights for policy 0, policy_version 68700 (0.0010) +[2023-10-13 03:07:05,222][46663] Updated weights for policy 1, policy_version 68611 (0.0010) +[2023-10-13 03:07:05,594][46663] Updated weights for policy 1, policy_version 68621 (0.0011) +[2023-10-13 03:07:05,960][46663] Updated weights for policy 1, policy_version 68631 (0.0008) +[2023-10-13 03:07:08,171][46662] Updated weights for policy 0, policy_version 68710 (0.0008) +[2023-10-13 03:07:08,539][46662] Updated weights for policy 0, policy_version 68720 (0.0007) +[2023-10-13 03:07:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140640256. Throughput: 0: 1679.9, 1: 1679.9. Samples: 35175074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:08,607][45375] Avg episode reward: [(0, '53.540'), (1, '58.100')] +[2023-10-13 03:07:08,911][46662] Updated weights for policy 0, policy_version 68730 (0.0008) +[2023-10-13 03:07:10,008][46663] Updated weights for policy 1, policy_version 68641 (0.0008) +[2023-10-13 03:07:10,372][46663] Updated weights for policy 1, policy_version 68651 (0.0011) +[2023-10-13 03:07:10,734][46663] Updated weights for policy 1, policy_version 68661 (0.0010) +[2023-10-13 03:07:11,097][46663] Updated weights for policy 1, policy_version 68671 (0.0010) +[2023-10-13 03:07:13,014][46662] Updated weights for policy 0, policy_version 68740 (0.0008) +[2023-10-13 03:07:13,411][46662] Updated weights for policy 0, policy_version 68750 (0.0009) +[2023-10-13 03:07:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140705792. Throughput: 0: 1681.7, 1: 1663.7. Samples: 35184248. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:13,607][45375] Avg episode reward: [(0, '53.880'), (1, '57.680')] +[2023-10-13 03:07:13,780][46662] Updated weights for policy 0, policy_version 68760 (0.0007) +[2023-10-13 03:07:15,130][46663] Updated weights for policy 1, policy_version 68681 (0.0007) +[2023-10-13 03:07:15,506][46663] Updated weights for policy 1, policy_version 68691 (0.0009) +[2023-10-13 03:07:15,876][46663] Updated weights for policy 1, policy_version 68701 (0.0007) +[2023-10-13 03:07:17,867][46662] Updated weights for policy 0, policy_version 68770 (0.0008) +[2023-10-13 03:07:18,242][46662] Updated weights for policy 0, policy_version 68780 (0.0010) +[2023-10-13 03:07:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140771328. Throughput: 0: 1676.0, 1: 1688.9. Samples: 35204842. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:18,607][45375] Avg episode reward: [(0, '53.810'), (1, '59.070')] +[2023-10-13 03:07:18,623][46662] Updated weights for policy 0, policy_version 68790 (0.0009) +[2023-10-13 03:07:18,982][46662] Updated weights for policy 0, policy_version 68800 (0.0009) +[2023-10-13 03:07:19,843][46663] Updated weights for policy 1, policy_version 68711 (0.0007) +[2023-10-13 03:07:20,212][46663] Updated weights for policy 1, policy_version 68721 (0.0007) +[2023-10-13 03:07:20,569][46663] Updated weights for policy 1, policy_version 68731 (0.0008) +[2023-10-13 03:07:22,940][46662] Updated weights for policy 0, policy_version 68810 (0.0007) +[2023-10-13 03:07:23,315][46662] Updated weights for policy 0, policy_version 68820 (0.0007) +[2023-10-13 03:07:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140836864. Throughput: 0: 1671.0, 1: 1691.1. Samples: 35225318. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:23,607][45375] Avg episode reward: [(0, '52.680'), (1, '59.020')] +[2023-10-13 03:07:23,687][46662] Updated weights for policy 0, policy_version 68830 (0.0008) +[2023-10-13 03:07:24,631][46663] Updated weights for policy 1, policy_version 68741 (0.0008) +[2023-10-13 03:07:25,013][46663] Updated weights for policy 1, policy_version 68751 (0.0009) +[2023-10-13 03:07:25,388][46663] Updated weights for policy 1, policy_version 68761 (0.0009) +[2023-10-13 03:07:27,740][46662] Updated weights for policy 0, policy_version 68840 (0.0008) +[2023-10-13 03:07:28,114][46662] Updated weights for policy 0, policy_version 68850 (0.0009) +[2023-10-13 03:07:28,484][46662] Updated weights for policy 0, policy_version 68860 (0.0008) +[2023-10-13 03:07:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140902400. Throughput: 0: 1681.7, 1: 1677.0. Samples: 35234818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:28,607][45375] Avg episode reward: [(0, '53.260'), (1, '58.220')] +[2023-10-13 03:07:29,222][46663] Updated weights for policy 1, policy_version 68771 (0.0007) +[2023-10-13 03:07:29,589][46663] Updated weights for policy 1, policy_version 68781 (0.0008) +[2023-10-13 03:07:29,953][46663] Updated weights for policy 1, policy_version 68791 (0.0009) +[2023-10-13 03:07:32,593][46662] Updated weights for policy 0, policy_version 68870 (0.0008) +[2023-10-13 03:07:32,963][46662] Updated weights for policy 0, policy_version 68880 (0.0008) +[2023-10-13 03:07:33,329][46662] Updated weights for policy 0, policy_version 68890 (0.0010) +[2023-10-13 03:07:33,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141000704. Throughput: 0: 1674.5, 1: 1693.3. Samples: 35255626. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:33,607][45375] Avg episode reward: [(0, '53.280'), (1, '57.820')] +[2023-10-13 03:07:34,116][46663] Updated weights for policy 1, policy_version 68801 (0.0008) +[2023-10-13 03:07:34,491][46663] Updated weights for policy 1, policy_version 68811 (0.0009) +[2023-10-13 03:07:34,856][46663] Updated weights for policy 1, policy_version 68821 (0.0007) +[2023-10-13 03:07:35,230][46663] Updated weights for policy 1, policy_version 68831 (0.0009) +[2023-10-13 03:07:37,467][46662] Updated weights for policy 0, policy_version 68900 (0.0009) +[2023-10-13 03:07:37,835][46662] Updated weights for policy 0, policy_version 68910 (0.0011) +[2023-10-13 03:07:38,211][46662] Updated weights for policy 0, policy_version 68920 (0.0010) +[2023-10-13 03:07:38,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 141066240. Throughput: 0: 1663.6, 1: 1697.6. Samples: 35275746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:38,607][45375] Avg episode reward: [(0, '54.660'), (1, '57.420')] +[2023-10-13 03:07:39,324][46663] Updated weights for policy 1, policy_version 68841 (0.0008) +[2023-10-13 03:07:39,698][46663] Updated weights for policy 1, policy_version 68851 (0.0009) +[2023-10-13 03:07:40,071][46663] Updated weights for policy 1, policy_version 68861 (0.0010) +[2023-10-13 03:07:42,200][46662] Updated weights for policy 0, policy_version 68930 (0.0008) +[2023-10-13 03:07:42,575][46662] Updated weights for policy 0, policy_version 68940 (0.0009) +[2023-10-13 03:07:42,945][46662] Updated weights for policy 0, policy_version 68950 (0.0010) +[2023-10-13 03:07:43,312][46662] Updated weights for policy 0, policy_version 68960 (0.0010) +[2023-10-13 03:07:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141131776. Throughput: 0: 1679.4, 1: 1684.0. Samples: 35285420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:43,607][45375] Avg episode reward: [(0, '54.940'), (1, '59.020')] +[2023-10-13 03:07:44,333][46663] Updated weights for policy 1, policy_version 68871 (0.0008) +[2023-10-13 03:07:44,704][46663] Updated weights for policy 1, policy_version 68881 (0.0008) +[2023-10-13 03:07:45,062][46663] Updated weights for policy 1, policy_version 68891 (0.0010) +[2023-10-13 03:07:47,302][46662] Updated weights for policy 0, policy_version 68970 (0.0011) +[2023-10-13 03:07:47,668][46662] Updated weights for policy 0, policy_version 68980 (0.0010) +[2023-10-13 03:07:48,036][46662] Updated weights for policy 0, policy_version 68990 (0.0010) +[2023-10-13 03:07:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 141197312. Throughput: 0: 1685.2, 1: 1684.5. Samples: 35306164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:48,607][45375] Avg episode reward: [(0, '57.200'), (1, '58.420')] +[2023-10-13 03:07:49,186][46663] Updated weights for policy 1, policy_version 68901 (0.0009) +[2023-10-13 03:07:49,561][46663] Updated weights for policy 1, policy_version 68911 (0.0007) +[2023-10-13 03:07:49,924][46663] Updated weights for policy 1, policy_version 68921 (0.0009) +[2023-10-13 03:07:52,184][46662] Updated weights for policy 0, policy_version 69000 (0.0010) +[2023-10-13 03:07:52,552][46662] Updated weights for policy 0, policy_version 69010 (0.0009) +[2023-10-13 03:07:52,918][46662] Updated weights for policy 0, policy_version 69020 (0.0008) +[2023-10-13 03:07:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 141262848. Throughput: 0: 1662.4, 1: 1687.0. Samples: 35325794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:53,607][45375] Avg episode reward: [(0, '57.790'), (1, '57.560')] +[2023-10-13 03:07:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000069024_70680576.pth... +[2023-10-13 03:07:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000068928_70582272.pth... +[2023-10-13 03:07:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000067456_69074944.pth +[2023-10-13 03:07:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000067360_68976640.pth +[2023-10-13 03:07:54,067][46663] Updated weights for policy 1, policy_version 68931 (0.0008) +[2023-10-13 03:07:54,433][46663] Updated weights for policy 1, policy_version 68941 (0.0007) +[2023-10-13 03:07:54,800][46663] Updated weights for policy 1, policy_version 68951 (0.0009) +[2023-10-13 03:07:57,048][46662] Updated weights for policy 0, policy_version 69030 (0.0009) +[2023-10-13 03:07:57,416][46662] Updated weights for policy 0, policy_version 69040 (0.0009) +[2023-10-13 03:07:57,792][46662] Updated weights for policy 0, policy_version 69050 (0.0009) +[2023-10-13 03:07:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 141328384. Throughput: 0: 1686.1, 1: 1686.4. Samples: 35336008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:07:58,607][45375] Avg episode reward: [(0, '58.890'), (1, '56.640')] +[2023-10-13 03:07:58,993][46663] Updated weights for policy 1, policy_version 68961 (0.0009) +[2023-10-13 03:07:59,369][46663] Updated weights for policy 1, policy_version 68971 (0.0010) +[2023-10-13 03:07:59,726][46663] Updated weights for policy 1, policy_version 68981 (0.0007) +[2023-10-13 03:08:00,097][46663] Updated weights for policy 1, policy_version 68991 (0.0007) +[2023-10-13 03:08:01,947][46662] Updated weights for policy 0, policy_version 69060 (0.0007) +[2023-10-13 03:08:02,345][46662] Updated weights for policy 0, policy_version 69070 (0.0008) +[2023-10-13 03:08:02,716][46662] Updated weights for policy 0, policy_version 69080 (0.0008) +[2023-10-13 03:08:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141393920. Throughput: 0: 1689.2, 1: 1681.7. Samples: 35356536. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:03,607][45375] Avg episode reward: [(0, '59.360'), (1, '57.150')] +[2023-10-13 03:08:04,069][46663] Updated weights for policy 1, policy_version 69001 (0.0011) +[2023-10-13 03:08:04,438][46663] Updated weights for policy 1, policy_version 69011 (0.0010) +[2023-10-13 03:08:04,791][46663] Updated weights for policy 1, policy_version 69021 (0.0008) +[2023-10-13 03:08:06,761][46662] Updated weights for policy 0, policy_version 69090 (0.0008) +[2023-10-13 03:08:07,131][46662] Updated weights for policy 0, policy_version 69100 (0.0010) +[2023-10-13 03:08:07,501][46662] Updated weights for policy 0, policy_version 69110 (0.0008) +[2023-10-13 03:08:07,878][46662] Updated weights for policy 0, policy_version 69120 (0.0009) +[2023-10-13 03:08:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141459456. Throughput: 0: 1669.1, 1: 1686.0. Samples: 35376296. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:08,607][45375] Avg episode reward: [(0, '59.560'), (1, '55.790')] +[2023-10-13 03:08:08,847][46663] Updated weights for policy 1, policy_version 69031 (0.0009) +[2023-10-13 03:08:09,213][46663] Updated weights for policy 1, policy_version 69041 (0.0011) +[2023-10-13 03:08:09,574][46663] Updated weights for policy 1, policy_version 69051 (0.0011) +[2023-10-13 03:08:11,907][46662] Updated weights for policy 0, policy_version 69130 (0.0011) +[2023-10-13 03:08:12,279][46662] Updated weights for policy 0, policy_version 69140 (0.0009) +[2023-10-13 03:08:12,648][46662] Updated weights for policy 0, policy_version 69150 (0.0010) +[2023-10-13 03:08:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141524992. Throughput: 0: 1688.9, 1: 1683.8. Samples: 35386590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:13,608][45375] Avg episode reward: [(0, '60.190'), (1, '54.130')] +[2023-10-13 03:08:13,662][46663] Updated weights for policy 1, policy_version 69061 (0.0008) +[2023-10-13 03:08:14,037][46663] Updated weights for policy 1, policy_version 69071 (0.0008) +[2023-10-13 03:08:14,408][46663] Updated weights for policy 1, policy_version 69081 (0.0008) +[2023-10-13 03:08:16,666][46662] Updated weights for policy 0, policy_version 69160 (0.0007) +[2023-10-13 03:08:17,038][46662] Updated weights for policy 0, policy_version 69170 (0.0009) +[2023-10-13 03:08:17,409][46662] Updated weights for policy 0, policy_version 69180 (0.0009) +[2023-10-13 03:08:18,480][46663] Updated weights for policy 1, policy_version 69091 (0.0011) +[2023-10-13 03:08:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141590528. Throughput: 0: 1679.7, 1: 1680.7. Samples: 35406846. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:18,607][45375] Avg episode reward: [(0, '58.660'), (1, '53.430')] +[2023-10-13 03:08:18,852][46663] Updated weights for policy 1, policy_version 69101 (0.0008) +[2023-10-13 03:08:19,215][46663] Updated weights for policy 1, policy_version 69111 (0.0007) +[2023-10-13 03:08:21,286][46662] Updated weights for policy 0, policy_version 69190 (0.0009) +[2023-10-13 03:08:21,662][46662] Updated weights for policy 0, policy_version 69200 (0.0010) +[2023-10-13 03:08:22,023][46662] Updated weights for policy 0, policy_version 69210 (0.0010) +[2023-10-13 03:08:23,260][46663] Updated weights for policy 1, policy_version 69121 (0.0007) +[2023-10-13 03:08:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141656064. Throughput: 0: 1676.3, 1: 1675.6. Samples: 35426580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:23,607][45375] Avg episode reward: [(0, '58.610'), (1, '51.790')] +[2023-10-13 03:08:23,622][46663] Updated weights for policy 1, policy_version 69131 (0.0008) +[2023-10-13 03:08:24,001][46663] Updated weights for policy 1, policy_version 69141 (0.0008) +[2023-10-13 03:08:24,366][46663] Updated weights for policy 1, policy_version 69151 (0.0007) +[2023-10-13 03:08:26,087][46662] Updated weights for policy 0, policy_version 69220 (0.0008) +[2023-10-13 03:08:26,456][46662] Updated weights for policy 0, policy_version 69230 (0.0008) +[2023-10-13 03:08:26,829][46662] Updated weights for policy 0, policy_version 69240 (0.0008) +[2023-10-13 03:08:28,385][46663] Updated weights for policy 1, policy_version 69161 (0.0010) +[2023-10-13 03:08:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141721600. Throughput: 0: 1693.6, 1: 1681.7. Samples: 35437308. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:28,607][45375] Avg episode reward: [(0, '58.020'), (1, '52.320')] +[2023-10-13 03:08:28,753][46663] Updated weights for policy 1, policy_version 69171 (0.0010) +[2023-10-13 03:08:29,131][46663] Updated weights for policy 1, policy_version 69181 (0.0009) +[2023-10-13 03:08:30,917][46662] Updated weights for policy 0, policy_version 69250 (0.0009) +[2023-10-13 03:08:31,289][46662] Updated weights for policy 0, policy_version 69260 (0.0008) +[2023-10-13 03:08:31,671][46662] Updated weights for policy 0, policy_version 69270 (0.0008) +[2023-10-13 03:08:32,036][46662] Updated weights for policy 0, policy_version 69280 (0.0007) +[2023-10-13 03:08:33,195][46663] Updated weights for policy 1, policy_version 69191 (0.0009) +[2023-10-13 03:08:33,561][46663] Updated weights for policy 1, policy_version 69201 (0.0008) +[2023-10-13 03:08:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141787136. Throughput: 0: 1670.5, 1: 1685.5. Samples: 35457184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:33,607][45375] Avg episode reward: [(0, '58.990'), (1, '51.950')] +[2023-10-13 03:08:33,928][46663] Updated weights for policy 1, policy_version 69211 (0.0010) +[2023-10-13 03:08:35,976][46662] Updated weights for policy 0, policy_version 69290 (0.0007) +[2023-10-13 03:08:36,338][46662] Updated weights for policy 0, policy_version 69300 (0.0008) +[2023-10-13 03:08:36,705][46662] Updated weights for policy 0, policy_version 69310 (0.0009) +[2023-10-13 03:08:37,882][46663] Updated weights for policy 1, policy_version 69221 (0.0010) +[2023-10-13 03:08:38,248][46663] Updated weights for policy 1, policy_version 69231 (0.0007) +[2023-10-13 03:08:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141852672. Throughput: 0: 1688.1, 1: 1673.9. Samples: 35477086. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:38,607][45375] Avg episode reward: [(0, '59.100'), (1, '51.290')] +[2023-10-13 03:08:38,612][46663] Updated weights for policy 1, policy_version 69241 (0.0008) +[2023-10-13 03:08:40,698][46662] Updated weights for policy 0, policy_version 69320 (0.0008) +[2023-10-13 03:08:41,074][46662] Updated weights for policy 0, policy_version 69330 (0.0009) +[2023-10-13 03:08:41,440][46662] Updated weights for policy 0, policy_version 69340 (0.0009) +[2023-10-13 03:08:42,675][46663] Updated weights for policy 1, policy_version 69251 (0.0007) +[2023-10-13 03:08:43,040][46663] Updated weights for policy 1, policy_version 69261 (0.0008) +[2023-10-13 03:08:43,411][46663] Updated weights for policy 1, policy_version 69271 (0.0007) +[2023-10-13 03:08:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141918208. Throughput: 0: 1681.7, 1: 1693.3. Samples: 35487884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:43,607][45375] Avg episode reward: [(0, '58.510'), (1, '50.430')] +[2023-10-13 03:08:45,422][46662] Updated weights for policy 0, policy_version 69350 (0.0009) +[2023-10-13 03:08:45,796][46662] Updated weights for policy 0, policy_version 69360 (0.0008) +[2023-10-13 03:08:46,156][46662] Updated weights for policy 0, policy_version 69370 (0.0008) +[2023-10-13 03:08:47,516][46663] Updated weights for policy 1, policy_version 69281 (0.0009) +[2023-10-13 03:08:47,885][46663] Updated weights for policy 1, policy_version 69291 (0.0009) +[2023-10-13 03:08:48,241][46663] Updated weights for policy 1, policy_version 69301 (0.0009) +[2023-10-13 03:08:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141983744. Throughput: 0: 1665.9, 1: 1690.6. Samples: 35507578. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:48,608][45375] Avg episode reward: [(0, '59.690'), (1, '49.280')] +[2023-10-13 03:08:48,615][46663] Updated weights for policy 1, policy_version 69311 (0.0008) +[2023-10-13 03:08:50,289][46662] Updated weights for policy 0, policy_version 69380 (0.0008) +[2023-10-13 03:08:50,685][46662] Updated weights for policy 0, policy_version 69390 (0.0008) +[2023-10-13 03:08:51,046][46662] Updated weights for policy 0, policy_version 69400 (0.0008) +[2023-10-13 03:08:52,642][46663] Updated weights for policy 1, policy_version 69321 (0.0010) +[2023-10-13 03:08:53,003][46663] Updated weights for policy 1, policy_version 69331 (0.0009) +[2023-10-13 03:08:53,380][46663] Updated weights for policy 1, policy_version 69341 (0.0007) +[2023-10-13 03:08:53,607][45375] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142082048. Throughput: 0: 1692.7, 1: 1662.5. Samples: 35527280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:53,608][45375] Avg episode reward: [(0, '59.020'), (1, '49.960')] +[2023-10-13 03:08:55,058][46662] Updated weights for policy 0, policy_version 69410 (0.0008) +[2023-10-13 03:08:55,432][46662] Updated weights for policy 0, policy_version 69420 (0.0007) +[2023-10-13 03:08:55,801][46662] Updated weights for policy 0, policy_version 69430 (0.0009) +[2023-10-13 03:08:56,163][46662] Updated weights for policy 0, policy_version 69440 (0.0008) +[2023-10-13 03:08:57,423][46663] Updated weights for policy 1, policy_version 69351 (0.0010) +[2023-10-13 03:08:57,789][46663] Updated weights for policy 1, policy_version 69361 (0.0008) +[2023-10-13 03:08:58,158][46663] Updated weights for policy 1, policy_version 69371 (0.0007) +[2023-10-13 03:08:58,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142147584. Throughput: 0: 1674.4, 1: 1691.7. Samples: 35538066. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:08:58,607][45375] Avg episode reward: [(0, '57.490'), (1, '50.830')] +[2023-10-13 03:09:00,296][46662] Updated weights for policy 0, policy_version 69450 (0.0010) +[2023-10-13 03:09:00,676][46662] Updated weights for policy 0, policy_version 69460 (0.0008) +[2023-10-13 03:09:01,044][46662] Updated weights for policy 0, policy_version 69470 (0.0009) +[2023-10-13 03:09:02,350][46663] Updated weights for policy 1, policy_version 69381 (0.0009) +[2023-10-13 03:09:02,749][46663] Updated weights for policy 1, policy_version 69391 (0.0009) +[2023-10-13 03:09:03,122][46663] Updated weights for policy 1, policy_version 69401 (0.0008) +[2023-10-13 03:09:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142213120. Throughput: 0: 1675.2, 1: 1682.8. Samples: 35557958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:09:03,608][45375] Avg episode reward: [(0, '56.140'), (1, '51.070')] +[2023-10-13 03:09:05,250][46662] Updated weights for policy 0, policy_version 69480 (0.0009) +[2023-10-13 03:09:05,614][46662] Updated weights for policy 0, policy_version 69490 (0.0008) +[2023-10-13 03:09:05,986][46662] Updated weights for policy 0, policy_version 69500 (0.0009) +[2023-10-13 03:09:07,171][46663] Updated weights for policy 1, policy_version 69411 (0.0007) +[2023-10-13 03:09:07,549][46663] Updated weights for policy 1, policy_version 69421 (0.0008) +[2023-10-13 03:09:07,914][46663] Updated weights for policy 1, policy_version 69431 (0.0010) +[2023-10-13 03:09:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142278656. Throughput: 0: 1694.1, 1: 1664.6. Samples: 35577724. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:09:08,607][45375] Avg episode reward: [(0, '55.510'), (1, '51.110')] +[2023-10-13 03:09:09,960][46662] Updated weights for policy 0, policy_version 69510 (0.0007) +[2023-10-13 03:09:10,332][46662] Updated weights for policy 0, policy_version 69520 (0.0007) +[2023-10-13 03:09:10,698][46662] Updated weights for policy 0, policy_version 69530 (0.0009) +[2023-10-13 03:09:11,979][46663] Updated weights for policy 1, policy_version 69441 (0.0009) +[2023-10-13 03:09:12,345][46663] Updated weights for policy 1, policy_version 69451 (0.0008) +[2023-10-13 03:09:12,725][46663] Updated weights for policy 1, policy_version 69461 (0.0010) +[2023-10-13 03:09:13,087][46663] Updated weights for policy 1, policy_version 69471 (0.0011) +[2023-10-13 03:09:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142344192. Throughput: 0: 1664.3, 1: 1687.9. Samples: 35588158. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:13,608][45375] Avg episode reward: [(0, '55.730'), (1, '50.960')] +[2023-10-13 03:09:14,776][46662] Updated weights for policy 0, policy_version 69540 (0.0009) +[2023-10-13 03:09:15,150][46662] Updated weights for policy 0, policy_version 69550 (0.0009) +[2023-10-13 03:09:15,507][46662] Updated weights for policy 0, policy_version 69560 (0.0009) +[2023-10-13 03:09:17,175][46663] Updated weights for policy 1, policy_version 69481 (0.0010) +[2023-10-13 03:09:17,545][46663] Updated weights for policy 1, policy_version 69491 (0.0009) +[2023-10-13 03:09:17,914][46663] Updated weights for policy 1, policy_version 69501 (0.0008) +[2023-10-13 03:09:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142409728. Throughput: 0: 1680.0, 1: 1672.9. Samples: 35608062. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:18,607][45375] Avg episode reward: [(0, '54.750'), (1, '51.990')] +[2023-10-13 03:09:19,642][46662] Updated weights for policy 0, policy_version 69570 (0.0010) +[2023-10-13 03:09:20,005][46662] Updated weights for policy 0, policy_version 69580 (0.0009) +[2023-10-13 03:09:20,376][46662] Updated weights for policy 0, policy_version 69590 (0.0009) +[2023-10-13 03:09:20,741][46662] Updated weights for policy 0, policy_version 69600 (0.0009) +[2023-10-13 03:09:21,978][46663] Updated weights for policy 1, policy_version 69511 (0.0009) +[2023-10-13 03:09:22,340][46663] Updated weights for policy 1, policy_version 69521 (0.0009) +[2023-10-13 03:09:22,708][46663] Updated weights for policy 1, policy_version 69531 (0.0011) +[2023-10-13 03:09:23,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142475264. Throughput: 0: 1687.7, 1: 1671.0. Samples: 35628228. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:23,607][45375] Avg episode reward: [(0, '54.770'), (1, '52.750')] +[2023-10-13 03:09:24,767][46662] Updated weights for policy 0, policy_version 69610 (0.0008) +[2023-10-13 03:09:25,127][46662] Updated weights for policy 0, policy_version 69620 (0.0011) +[2023-10-13 03:09:25,511][46662] Updated weights for policy 0, policy_version 69630 (0.0009) +[2023-10-13 03:09:26,602][46663] Updated weights for policy 1, policy_version 69541 (0.0008) +[2023-10-13 03:09:26,976][46663] Updated weights for policy 1, policy_version 69551 (0.0009) +[2023-10-13 03:09:27,337][46663] Updated weights for policy 1, policy_version 69561 (0.0009) +[2023-10-13 03:09:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142540800. Throughput: 0: 1665.6, 1: 1688.3. Samples: 35638810. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:28,607][45375] Avg episode reward: [(0, '55.020'), (1, '53.770')] +[2023-10-13 03:09:29,655][46662] Updated weights for policy 0, policy_version 69640 (0.0009) +[2023-10-13 03:09:30,020][46662] Updated weights for policy 0, policy_version 69650 (0.0009) +[2023-10-13 03:09:30,403][46662] Updated weights for policy 0, policy_version 69660 (0.0007) +[2023-10-13 03:09:31,419][46663] Updated weights for policy 1, policy_version 69571 (0.0009) +[2023-10-13 03:09:31,781][46663] Updated weights for policy 1, policy_version 69581 (0.0009) +[2023-10-13 03:09:32,149][46663] Updated weights for policy 1, policy_version 69591 (0.0010) +[2023-10-13 03:09:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142606336. Throughput: 0: 1682.4, 1: 1664.1. Samples: 35658168. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:33,607][45375] Avg episode reward: [(0, '54.770'), (1, '53.050')] +[2023-10-13 03:09:34,404][46662] Updated weights for policy 0, policy_version 69670 (0.0009) +[2023-10-13 03:09:34,772][46662] Updated weights for policy 0, policy_version 69680 (0.0009) +[2023-10-13 03:09:35,142][46662] Updated weights for policy 0, policy_version 69690 (0.0007) +[2023-10-13 03:09:36,208][46663] Updated weights for policy 1, policy_version 69601 (0.0010) +[2023-10-13 03:09:36,581][46663] Updated weights for policy 1, policy_version 69611 (0.0007) +[2023-10-13 03:09:36,947][46663] Updated weights for policy 1, policy_version 69621 (0.0007) +[2023-10-13 03:09:37,314][46663] Updated weights for policy 1, policy_version 69631 (0.0008) +[2023-10-13 03:09:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142671872. Throughput: 0: 1683.6, 1: 1684.4. Samples: 35678840. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:38,607][45375] Avg episode reward: [(0, '55.470'), (1, '54.370')] +[2023-10-13 03:09:39,226][46662] Updated weights for policy 0, policy_version 69700 (0.0008) +[2023-10-13 03:09:39,612][46662] Updated weights for policy 0, policy_version 69710 (0.0009) +[2023-10-13 03:09:39,993][46662] Updated weights for policy 0, policy_version 69720 (0.0010) +[2023-10-13 03:09:41,157][46663] Updated weights for policy 1, policy_version 69641 (0.0007) +[2023-10-13 03:09:41,523][46663] Updated weights for policy 1, policy_version 69651 (0.0007) +[2023-10-13 03:09:41,887][46663] Updated weights for policy 1, policy_version 69661 (0.0008) +[2023-10-13 03:09:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142737408. Throughput: 0: 1671.2, 1: 1677.6. Samples: 35688766. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:43,607][45375] Avg episode reward: [(0, '54.910'), (1, '54.550')] +[2023-10-13 03:09:44,075][46662] Updated weights for policy 0, policy_version 69730 (0.0008) +[2023-10-13 03:09:44,446][46662] Updated weights for policy 0, policy_version 69740 (0.0009) +[2023-10-13 03:09:44,823][46662] Updated weights for policy 0, policy_version 69750 (0.0008) +[2023-10-13 03:09:45,184][46662] Updated weights for policy 0, policy_version 69760 (0.0010) +[2023-10-13 03:09:45,990][46663] Updated weights for policy 1, policy_version 69671 (0.0009) +[2023-10-13 03:09:46,351][46663] Updated weights for policy 1, policy_version 69681 (0.0010) +[2023-10-13 03:09:46,715][46663] Updated weights for policy 1, policy_version 69691 (0.0010) +[2023-10-13 03:09:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 142802944. Throughput: 0: 1682.3, 1: 1672.1. Samples: 35708904. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:48,607][45375] Avg episode reward: [(0, '54.560'), (1, '55.220')] +[2023-10-13 03:09:49,363][46662] Updated weights for policy 0, policy_version 69770 (0.0009) +[2023-10-13 03:09:49,737][46662] Updated weights for policy 0, policy_version 69780 (0.0009) +[2023-10-13 03:09:50,092][46662] Updated weights for policy 0, policy_version 69790 (0.0008) +[2023-10-13 03:09:50,791][46663] Updated weights for policy 1, policy_version 69701 (0.0008) +[2023-10-13 03:09:51,180][46663] Updated weights for policy 1, policy_version 69711 (0.0010) +[2023-10-13 03:09:51,547][46663] Updated weights for policy 1, policy_version 69721 (0.0009) +[2023-10-13 03:09:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142868480. Throughput: 0: 1678.5, 1: 1694.2. Samples: 35729498. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:09:53,607][45375] Avg episode reward: [(0, '52.920'), (1, '56.350')] +[2023-10-13 03:09:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000069792_71467008.pth... +[2023-10-13 03:09:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000069728_71401472.pth... 
+[2023-10-13 03:09:53,645][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000068224_69861376.pth +[2023-10-13 03:09:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000068160_69795840.pth +[2023-10-13 03:09:54,141][46662] Updated weights for policy 0, policy_version 69800 (0.0008) +[2023-10-13 03:09:54,503][46662] Updated weights for policy 0, policy_version 69810 (0.0007) +[2023-10-13 03:09:54,881][46662] Updated weights for policy 0, policy_version 69820 (0.0011) +[2023-10-13 03:09:55,591][46663] Updated weights for policy 1, policy_version 69731 (0.0009) +[2023-10-13 03:09:55,952][46663] Updated weights for policy 1, policy_version 69741 (0.0008) +[2023-10-13 03:09:56,315][46663] Updated weights for policy 1, policy_version 69751 (0.0008) +[2023-10-13 03:09:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142934016. Throughput: 0: 1678.2, 1: 1673.7. Samples: 35738992. Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:09:58,607][45375] Avg episode reward: [(0, '52.850'), (1, '57.050')] +[2023-10-13 03:09:59,014][46662] Updated weights for policy 0, policy_version 69830 (0.0011) +[2023-10-13 03:09:59,380][46662] Updated weights for policy 0, policy_version 69840 (0.0009) +[2023-10-13 03:09:59,759][46662] Updated weights for policy 0, policy_version 69850 (0.0007) +[2023-10-13 03:10:00,316][46663] Updated weights for policy 1, policy_version 69761 (0.0008) +[2023-10-13 03:10:00,674][46663] Updated weights for policy 1, policy_version 69771 (0.0007) +[2023-10-13 03:10:01,039][46663] Updated weights for policy 1, policy_version 69781 (0.0007) +[2023-10-13 03:10:01,406][46663] Updated weights for policy 1, policy_version 69791 (0.0008) +[2023-10-13 03:10:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142999552. Throughput: 0: 1681.8, 1: 1681.2. Samples: 35759396. 
Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:03,607][45375] Avg episode reward: [(0, '53.530'), (1, '55.950')] +[2023-10-13 03:10:03,761][46662] Updated weights for policy 0, policy_version 69860 (0.0008) +[2023-10-13 03:10:04,137][46662] Updated weights for policy 0, policy_version 69870 (0.0008) +[2023-10-13 03:10:04,503][46662] Updated weights for policy 0, policy_version 69880 (0.0010) +[2023-10-13 03:10:05,582][46663] Updated weights for policy 1, policy_version 69801 (0.0008) +[2023-10-13 03:10:05,953][46663] Updated weights for policy 1, policy_version 69811 (0.0008) +[2023-10-13 03:10:06,328][46663] Updated weights for policy 1, policy_version 69821 (0.0008) +[2023-10-13 03:10:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143065088. Throughput: 0: 1679.4, 1: 1701.6. Samples: 35780372. Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:08,607][45375] Avg episode reward: [(0, '53.700'), (1, '55.230')] +[2023-10-13 03:10:08,655][46662] Updated weights for policy 0, policy_version 69890 (0.0008) +[2023-10-13 03:10:09,017][46662] Updated weights for policy 0, policy_version 69900 (0.0009) +[2023-10-13 03:10:09,386][46662] Updated weights for policy 0, policy_version 69910 (0.0010) +[2023-10-13 03:10:09,765][46662] Updated weights for policy 0, policy_version 69920 (0.0010) +[2023-10-13 03:10:10,373][46663] Updated weights for policy 1, policy_version 69831 (0.0008) +[2023-10-13 03:10:10,730][46663] Updated weights for policy 1, policy_version 69841 (0.0008) +[2023-10-13 03:10:11,094][46663] Updated weights for policy 1, policy_version 69851 (0.0010) +[2023-10-13 03:10:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143130624. Throughput: 0: 1678.4, 1: 1668.4. Samples: 35789418. 
Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:13,607][45375] Avg episode reward: [(0, '53.730'), (1, '56.900')] +[2023-10-13 03:10:13,876][46662] Updated weights for policy 0, policy_version 69930 (0.0008) +[2023-10-13 03:10:14,253][46662] Updated weights for policy 0, policy_version 69940 (0.0008) +[2023-10-13 03:10:14,629][46662] Updated weights for policy 0, policy_version 69950 (0.0008) +[2023-10-13 03:10:15,187][46663] Updated weights for policy 1, policy_version 69861 (0.0010) +[2023-10-13 03:10:15,549][46663] Updated weights for policy 1, policy_version 69871 (0.0007) +[2023-10-13 03:10:15,918][46663] Updated weights for policy 1, policy_version 69881 (0.0010) +[2023-10-13 03:10:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143196160. Throughput: 0: 1681.5, 1: 1698.8. Samples: 35810282. Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:18,607][45375] Avg episode reward: [(0, '52.090'), (1, '56.780')] +[2023-10-13 03:10:18,614][46662] Updated weights for policy 0, policy_version 69960 (0.0008) +[2023-10-13 03:10:18,987][46662] Updated weights for policy 0, policy_version 69970 (0.0007) +[2023-10-13 03:10:19,356][46662] Updated weights for policy 0, policy_version 69980 (0.0010) +[2023-10-13 03:10:19,871][46663] Updated weights for policy 1, policy_version 69891 (0.0009) +[2023-10-13 03:10:20,242][46663] Updated weights for policy 1, policy_version 69901 (0.0007) +[2023-10-13 03:10:20,615][46663] Updated weights for policy 1, policy_version 69911 (0.0007) +[2023-10-13 03:10:23,229][46662] Updated weights for policy 0, policy_version 69990 (0.0008) +[2023-10-13 03:10:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143261696. Throughput: 0: 1681.2, 1: 1706.4. Samples: 35831282. 
Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:23,607][45375] Avg episode reward: [(0, '51.870'), (1, '56.980')] +[2023-10-13 03:10:23,609][46662] Updated weights for policy 0, policy_version 70000 (0.0008) +[2023-10-13 03:10:23,986][46662] Updated weights for policy 0, policy_version 70010 (0.0008) +[2023-10-13 03:10:24,602][46663] Updated weights for policy 1, policy_version 69921 (0.0008) +[2023-10-13 03:10:24,963][46663] Updated weights for policy 1, policy_version 69931 (0.0009) +[2023-10-13 03:10:25,335][46663] Updated weights for policy 1, policy_version 69941 (0.0009) +[2023-10-13 03:10:25,709][46663] Updated weights for policy 1, policy_version 69951 (0.0009) +[2023-10-13 03:10:28,097][46662] Updated weights for policy 0, policy_version 70020 (0.0010) +[2023-10-13 03:10:28,496][46662] Updated weights for policy 0, policy_version 70030 (0.0008) +[2023-10-13 03:10:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143327232. Throughput: 0: 1683.9, 1: 1683.9. Samples: 35840316. Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:28,607][45375] Avg episode reward: [(0, '54.330'), (1, '56.740')] +[2023-10-13 03:10:28,861][46662] Updated weights for policy 0, policy_version 70040 (0.0008) +[2023-10-13 03:10:29,840][46663] Updated weights for policy 1, policy_version 69961 (0.0009) +[2023-10-13 03:10:30,202][46663] Updated weights for policy 1, policy_version 69971 (0.0007) +[2023-10-13 03:10:30,578][46663] Updated weights for policy 1, policy_version 69981 (0.0008) +[2023-10-13 03:10:32,969][46662] Updated weights for policy 0, policy_version 70050 (0.0007) +[2023-10-13 03:10:33,334][46662] Updated weights for policy 0, policy_version 70060 (0.0007) +[2023-10-13 03:10:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 143392768. Throughput: 0: 1678.5, 1: 1696.3. Samples: 35860770. 
Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:33,608][45375] Avg episode reward: [(0, '54.050'), (1, '56.730')] +[2023-10-13 03:10:33,706][46662] Updated weights for policy 0, policy_version 70070 (0.0008) +[2023-10-13 03:10:34,068][46662] Updated weights for policy 0, policy_version 70080 (0.0007) +[2023-10-13 03:10:34,571][46663] Updated weights for policy 1, policy_version 69991 (0.0009) +[2023-10-13 03:10:34,935][46663] Updated weights for policy 1, policy_version 70001 (0.0007) +[2023-10-13 03:10:35,306][46663] Updated weights for policy 1, policy_version 70011 (0.0008) +[2023-10-13 03:10:37,950][46662] Updated weights for policy 0, policy_version 70090 (0.0009) +[2023-10-13 03:10:38,333][46662] Updated weights for policy 0, policy_version 70100 (0.0008) +[2023-10-13 03:10:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143458304. Throughput: 0: 1678.3, 1: 1698.7. Samples: 35881460. Policy #0 lag: (min: 41.0, avg: 55.0, max: 56.0) +[2023-10-13 03:10:38,607][45375] Avg episode reward: [(0, '53.590'), (1, '58.030')] +[2023-10-13 03:10:38,697][46662] Updated weights for policy 0, policy_version 70110 (0.0008) +[2023-10-13 03:10:39,463][46663] Updated weights for policy 1, policy_version 70021 (0.0008) +[2023-10-13 03:10:39,868][46663] Updated weights for policy 1, policy_version 70031 (0.0009) +[2023-10-13 03:10:40,236][46663] Updated weights for policy 1, policy_version 70041 (0.0009) +[2023-10-13 03:10:42,801][46662] Updated weights for policy 0, policy_version 70120 (0.0008) +[2023-10-13 03:10:43,163][46662] Updated weights for policy 0, policy_version 70130 (0.0008) +[2023-10-13 03:10:43,536][46662] Updated weights for policy 0, policy_version 70140 (0.0011) +[2023-10-13 03:10:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143523840. Throughput: 0: 1681.3, 1: 1686.2. Samples: 35890530. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:10:43,607][45375] Avg episode reward: [(0, '52.610'), (1, '57.760')] +[2023-10-13 03:10:44,102][46663] Updated weights for policy 1, policy_version 70051 (0.0009) +[2023-10-13 03:10:44,476][46663] Updated weights for policy 1, policy_version 70061 (0.0007) +[2023-10-13 03:10:44,837][46663] Updated weights for policy 1, policy_version 70071 (0.0007) +[2023-10-13 03:10:47,701][46662] Updated weights for policy 0, policy_version 70150 (0.0009) +[2023-10-13 03:10:48,060][46662] Updated weights for policy 0, policy_version 70160 (0.0008) +[2023-10-13 03:10:48,434][46662] Updated weights for policy 0, policy_version 70170 (0.0008) +[2023-10-13 03:10:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143589376. Throughput: 0: 1678.9, 1: 1694.4. Samples: 35911196. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:10:48,607][45375] Avg episode reward: [(0, '53.360'), (1, '56.950')] +[2023-10-13 03:10:48,972][46663] Updated weights for policy 1, policy_version 70081 (0.0009) +[2023-10-13 03:10:49,340][46663] Updated weights for policy 1, policy_version 70091 (0.0008) +[2023-10-13 03:10:49,700][46663] Updated weights for policy 1, policy_version 70101 (0.0007) +[2023-10-13 03:10:50,074][46663] Updated weights for policy 1, policy_version 70111 (0.0008) +[2023-10-13 03:10:52,523][46662] Updated weights for policy 0, policy_version 70180 (0.0008) +[2023-10-13 03:10:52,890][46662] Updated weights for policy 0, policy_version 70190 (0.0008) +[2023-10-13 03:10:53,255][46662] Updated weights for policy 0, policy_version 70200 (0.0011) +[2023-10-13 03:10:53,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 143687680. Throughput: 0: 1671.0, 1: 1688.8. Samples: 35931562. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:10:53,607][45375] Avg episode reward: [(0, '53.350'), (1, '56.050')] +[2023-10-13 03:10:54,135][46663] Updated weights for policy 1, policy_version 70121 (0.0007) +[2023-10-13 03:10:54,504][46663] Updated weights for policy 1, policy_version 70131 (0.0008) +[2023-10-13 03:10:54,863][46663] Updated weights for policy 1, policy_version 70141 (0.0007) +[2023-10-13 03:10:57,238][46662] Updated weights for policy 0, policy_version 70210 (0.0009) +[2023-10-13 03:10:57,606][46662] Updated weights for policy 0, policy_version 70220 (0.0008) +[2023-10-13 03:10:57,983][46662] Updated weights for policy 0, policy_version 70230 (0.0009) +[2023-10-13 03:10:58,348][46662] Updated weights for policy 0, policy_version 70240 (0.0008) +[2023-10-13 03:10:58,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 143753216. Throughput: 0: 1687.1, 1: 1688.4. Samples: 35941314. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:10:58,608][45375] Avg episode reward: [(0, '53.170'), (1, '55.830')] +[2023-10-13 03:10:58,937][46663] Updated weights for policy 1, policy_version 70151 (0.0008) +[2023-10-13 03:10:59,314][46663] Updated weights for policy 1, policy_version 70161 (0.0008) +[2023-10-13 03:10:59,678][46663] Updated weights for policy 1, policy_version 70171 (0.0010) +[2023-10-13 03:11:02,480][46662] Updated weights for policy 0, policy_version 70250 (0.0008) +[2023-10-13 03:11:02,859][46662] Updated weights for policy 0, policy_version 70260 (0.0008) +[2023-10-13 03:11:03,228][46662] Updated weights for policy 0, policy_version 70270 (0.0009) +[2023-10-13 03:11:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 143818752. Throughput: 0: 1685.3, 1: 1687.3. Samples: 35962050. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:11:03,607][45375] Avg episode reward: [(0, '54.510'), (1, '55.530')] +[2023-10-13 03:11:03,834][46663] Updated weights for policy 1, policy_version 70181 (0.0008) +[2023-10-13 03:11:04,198][46663] Updated weights for policy 1, policy_version 70191 (0.0007) +[2023-10-13 03:11:04,558][46663] Updated weights for policy 1, policy_version 70201 (0.0007) +[2023-10-13 03:11:07,277][46662] Updated weights for policy 0, policy_version 70280 (0.0009) +[2023-10-13 03:11:07,643][46662] Updated weights for policy 0, policy_version 70290 (0.0008) +[2023-10-13 03:11:08,024][46662] Updated weights for policy 0, policy_version 70300 (0.0010) +[2023-10-13 03:11:08,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 143884288. Throughput: 0: 1664.6, 1: 1679.4. Samples: 35981764. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:11:08,607][45375] Avg episode reward: [(0, '54.850'), (1, '54.820')] +[2023-10-13 03:11:08,766][46663] Updated weights for policy 1, policy_version 70211 (0.0008) +[2023-10-13 03:11:09,138][46663] Updated weights for policy 1, policy_version 70221 (0.0008) +[2023-10-13 03:11:09,506][46663] Updated weights for policy 1, policy_version 70231 (0.0007) +[2023-10-13 03:11:12,002][46662] Updated weights for policy 0, policy_version 70310 (0.0007) +[2023-10-13 03:11:12,360][46662] Updated weights for policy 0, policy_version 70320 (0.0009) +[2023-10-13 03:11:12,732][46662] Updated weights for policy 0, policy_version 70330 (0.0009) +[2023-10-13 03:11:13,373][46663] Updated weights for policy 1, policy_version 70241 (0.0007) +[2023-10-13 03:11:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 143949824. Throughput: 0: 1683.2, 1: 1681.3. Samples: 35991718. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:11:13,607][45375] Avg episode reward: [(0, '54.350'), (1, '55.000')] +[2023-10-13 03:11:13,731][46663] Updated weights for policy 1, policy_version 70251 (0.0008) +[2023-10-13 03:11:14,095][46663] Updated weights for policy 1, policy_version 70261 (0.0008) +[2023-10-13 03:11:14,463][46663] Updated weights for policy 1, policy_version 70271 (0.0009) +[2023-10-13 03:11:16,824][46662] Updated weights for policy 0, policy_version 70340 (0.0008) +[2023-10-13 03:11:17,211][46662] Updated weights for policy 0, policy_version 70350 (0.0008) +[2023-10-13 03:11:17,573][46662] Updated weights for policy 0, policy_version 70360 (0.0008) +[2023-10-13 03:11:18,504][46663] Updated weights for policy 1, policy_version 70281 (0.0008) +[2023-10-13 03:11:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144015360. Throughput: 0: 1681.9, 1: 1686.1. Samples: 36012332. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-13 03:11:18,607][45375] Avg episode reward: [(0, '54.690'), (1, '53.840')] +[2023-10-13 03:11:18,869][46663] Updated weights for policy 1, policy_version 70291 (0.0007) +[2023-10-13 03:11:19,235][46663] Updated weights for policy 1, policy_version 70301 (0.0008) +[2023-10-13 03:11:21,565][46662] Updated weights for policy 0, policy_version 70370 (0.0010) +[2023-10-13 03:11:21,937][46662] Updated weights for policy 0, policy_version 70380 (0.0007) +[2023-10-13 03:11:22,306][46662] Updated weights for policy 0, policy_version 70390 (0.0008) +[2023-10-13 03:11:22,674][46662] Updated weights for policy 0, policy_version 70400 (0.0009) +[2023-10-13 03:11:23,295][46663] Updated weights for policy 1, policy_version 70311 (0.0009) +[2023-10-13 03:11:23,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 144080896. Throughput: 0: 1660.8, 1: 1676.0. Samples: 36031620. 
Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:23,608][45375] Avg episode reward: [(0, '54.660'), (1, '53.530')] +[2023-10-13 03:11:23,662][46663] Updated weights for policy 1, policy_version 70321 (0.0007) +[2023-10-13 03:11:24,040][46663] Updated weights for policy 1, policy_version 70331 (0.0007) +[2023-10-13 03:11:26,773][46662] Updated weights for policy 0, policy_version 70410 (0.0008) +[2023-10-13 03:11:27,132][46662] Updated weights for policy 0, policy_version 70420 (0.0008) +[2023-10-13 03:11:27,504][46662] Updated weights for policy 0, policy_version 70430 (0.0009) +[2023-10-13 03:11:28,111][46663] Updated weights for policy 1, policy_version 70341 (0.0008) +[2023-10-13 03:11:28,496][46663] Updated weights for policy 1, policy_version 70351 (0.0007) +[2023-10-13 03:11:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144146432. Throughput: 0: 1685.3, 1: 1691.5. Samples: 36042484. Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:28,607][45375] Avg episode reward: [(0, '54.870'), (1, '52.940')] +[2023-10-13 03:11:28,856][46663] Updated weights for policy 1, policy_version 70361 (0.0008) +[2023-10-13 03:11:31,518][46662] Updated weights for policy 0, policy_version 70440 (0.0009) +[2023-10-13 03:11:31,875][46662] Updated weights for policy 0, policy_version 70450 (0.0009) +[2023-10-13 03:11:32,245][46662] Updated weights for policy 0, policy_version 70460 (0.0008) +[2023-10-13 03:11:32,974][46663] Updated weights for policy 1, policy_version 70371 (0.0008) +[2023-10-13 03:11:33,352][46663] Updated weights for policy 1, policy_version 70381 (0.0008) +[2023-10-13 03:11:33,606][45375] Fps is (10 sec: 13107.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144211968. Throughput: 0: 1679.5, 1: 1687.2. Samples: 36062696. 
Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:33,607][45375] Avg episode reward: [(0, '55.490'), (1, '55.690')] +[2023-10-13 03:11:33,712][46663] Updated weights for policy 1, policy_version 70391 (0.0009) +[2023-10-13 03:11:36,317][46662] Updated weights for policy 0, policy_version 70470 (0.0008) +[2023-10-13 03:11:36,691][46662] Updated weights for policy 0, policy_version 70480 (0.0008) +[2023-10-13 03:11:37,065][46662] Updated weights for policy 0, policy_version 70490 (0.0009) +[2023-10-13 03:11:37,865][46663] Updated weights for policy 1, policy_version 70401 (0.0010) +[2023-10-13 03:11:38,222][46663] Updated weights for policy 1, policy_version 70411 (0.0008) +[2023-10-13 03:11:38,593][46663] Updated weights for policy 1, policy_version 70421 (0.0007) +[2023-10-13 03:11:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144277504. Throughput: 0: 1668.8, 1: 1674.0. Samples: 36081988. Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:38,607][45375] Avg episode reward: [(0, '55.500'), (1, '56.220')] +[2023-10-13 03:11:38,953][46663] Updated weights for policy 1, policy_version 70431 (0.0009) +[2023-10-13 03:11:41,088][46662] Updated weights for policy 0, policy_version 70500 (0.0009) +[2023-10-13 03:11:41,453][46662] Updated weights for policy 0, policy_version 70510 (0.0008) +[2023-10-13 03:11:41,825][46662] Updated weights for policy 0, policy_version 70520 (0.0008) +[2023-10-13 03:11:43,022][46663] Updated weights for policy 1, policy_version 70441 (0.0007) +[2023-10-13 03:11:43,383][46663] Updated weights for policy 1, policy_version 70451 (0.0009) +[2023-10-13 03:11:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144343040. Throughput: 0: 1686.3, 1: 1684.6. Samples: 36093006. 
Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:43,607][45375] Avg episode reward: [(0, '56.810'), (1, '55.790')] +[2023-10-13 03:11:43,756][46663] Updated weights for policy 1, policy_version 70461 (0.0008) +[2023-10-13 03:11:45,950][46662] Updated weights for policy 0, policy_version 70530 (0.0010) +[2023-10-13 03:11:46,318][46662] Updated weights for policy 0, policy_version 70540 (0.0007) +[2023-10-13 03:11:46,690][46662] Updated weights for policy 0, policy_version 70550 (0.0008) +[2023-10-13 03:11:47,049][46662] Updated weights for policy 0, policy_version 70560 (0.0007) +[2023-10-13 03:11:47,692][46663] Updated weights for policy 1, policy_version 70471 (0.0008) +[2023-10-13 03:11:48,063][46663] Updated weights for policy 1, policy_version 70481 (0.0010) +[2023-10-13 03:11:48,430][46663] Updated weights for policy 1, policy_version 70491 (0.0011) +[2023-10-13 03:11:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 144441344. Throughput: 0: 1666.6, 1: 1684.7. Samples: 36112860. Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:48,607][45375] Avg episode reward: [(0, '58.640'), (1, '55.960')] +[2023-10-13 03:11:51,058][46662] Updated weights for policy 0, policy_version 70570 (0.0008) +[2023-10-13 03:11:51,439][46662] Updated weights for policy 0, policy_version 70580 (0.0008) +[2023-10-13 03:11:51,806][46662] Updated weights for policy 0, policy_version 70590 (0.0009) +[2023-10-13 03:11:52,482][46663] Updated weights for policy 1, policy_version 70501 (0.0007) +[2023-10-13 03:11:52,851][46663] Updated weights for policy 1, policy_version 70511 (0.0010) +[2023-10-13 03:11:53,216][46663] Updated weights for policy 1, policy_version 70521 (0.0009) +[2023-10-13 03:11:53,607][45375] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144506880. Throughput: 0: 1679.0, 1: 1666.2. Samples: 36132300. 
Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:53,608][45375] Avg episode reward: [(0, '58.100'), (1, '55.770')] +[2023-10-13 03:11:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000070592_72286208.pth... +[2023-10-13 03:11:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000070528_72220672.pth... +[2023-10-13 03:11:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000069024_70680576.pth +[2023-10-13 03:11:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000068928_70582272.pth +[2023-10-13 03:11:53,661][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000070592_72286208.pth +[2023-10-13 03:11:53,662][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000070528_72220672.pth +[2023-10-13 03:11:55,924][46662] Updated weights for policy 0, policy_version 70600 (0.0007) +[2023-10-13 03:11:56,293][46662] Updated weights for policy 0, policy_version 70610 (0.0007) +[2023-10-13 03:11:56,663][46662] Updated weights for policy 0, policy_version 70620 (0.0008) +[2023-10-13 03:11:57,230][46663] Updated weights for policy 1, policy_version 70531 (0.0007) +[2023-10-13 03:11:57,592][46663] Updated weights for policy 1, policy_version 70541 (0.0007) +[2023-10-13 03:11:57,955][46663] Updated weights for policy 1, policy_version 70551 (0.0008) +[2023-10-13 03:11:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 144572416. Throughput: 0: 1689.0, 1: 1691.2. Samples: 36143826. 
Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:11:58,607][45375] Avg episode reward: [(0, '57.330'), (1, '56.090')] +[2023-10-13 03:12:00,836][46662] Updated weights for policy 0, policy_version 70630 (0.0008) +[2023-10-13 03:12:01,208][46662] Updated weights for policy 0, policy_version 70640 (0.0008) +[2023-10-13 03:12:01,568][46662] Updated weights for policy 0, policy_version 70650 (0.0010) +[2023-10-13 03:12:02,078][46663] Updated weights for policy 1, policy_version 70561 (0.0009) +[2023-10-13 03:12:02,445][46663] Updated weights for policy 1, policy_version 70571 (0.0010) +[2023-10-13 03:12:02,819][46663] Updated weights for policy 1, policy_version 70581 (0.0008) +[2023-10-13 03:12:03,175][46663] Updated weights for policy 1, policy_version 70591 (0.0008) +[2023-10-13 03:12:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144637952. Throughput: 0: 1673.8, 1: 1681.7. Samples: 36163330. Policy #0 lag: (min: 5.0, avg: 5.2, max: 16.0) +[2023-10-13 03:12:03,607][45375] Avg episode reward: [(0, '56.960'), (1, '55.850')] +[2023-10-13 03:12:05,621][46662] Updated weights for policy 0, policy_version 70660 (0.0009) +[2023-10-13 03:12:06,018][46662] Updated weights for policy 0, policy_version 70670 (0.0008) +[2023-10-13 03:12:06,387][46662] Updated weights for policy 0, policy_version 70680 (0.0007) +[2023-10-13 03:12:07,166][46663] Updated weights for policy 1, policy_version 70601 (0.0009) +[2023-10-13 03:12:07,530][46663] Updated weights for policy 1, policy_version 70611 (0.0007) +[2023-10-13 03:12:07,896][46663] Updated weights for policy 1, policy_version 70621 (0.0007) +[2023-10-13 03:12:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144703488. Throughput: 0: 1695.4, 1: 1670.4. Samples: 36183078. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:08,607][45375] Avg episode reward: [(0, '56.910'), (1, '55.540')] +[2023-10-13 03:12:10,362][46662] Updated weights for policy 0, policy_version 70690 (0.0008) +[2023-10-13 03:12:10,735][46662] Updated weights for policy 0, policy_version 70700 (0.0008) +[2023-10-13 03:12:11,101][46662] Updated weights for policy 0, policy_version 70710 (0.0008) +[2023-10-13 03:12:11,469][46662] Updated weights for policy 0, policy_version 70720 (0.0011) +[2023-10-13 03:12:11,999][46663] Updated weights for policy 1, policy_version 70631 (0.0007) +[2023-10-13 03:12:12,367][46663] Updated weights for policy 1, policy_version 70641 (0.0009) +[2023-10-13 03:12:12,736][46663] Updated weights for policy 1, policy_version 70651 (0.0008) +[2023-10-13 03:12:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144769024. Throughput: 0: 1683.6, 1: 1690.0. Samples: 36194298. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:13,608][45375] Avg episode reward: [(0, '55.200'), (1, '56.960')] +[2023-10-13 03:12:15,625][46662] Updated weights for policy 0, policy_version 70730 (0.0008) +[2023-10-13 03:12:15,993][46662] Updated weights for policy 0, policy_version 70740 (0.0007) +[2023-10-13 03:12:16,371][46662] Updated weights for policy 0, policy_version 70750 (0.0007) +[2023-10-13 03:12:16,789][46663] Updated weights for policy 1, policy_version 70661 (0.0009) +[2023-10-13 03:12:17,187][46663] Updated weights for policy 1, policy_version 70671 (0.0007) +[2023-10-13 03:12:17,552][46663] Updated weights for policy 1, policy_version 70681 (0.0007) +[2023-10-13 03:12:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144834560. Throughput: 0: 1679.3, 1: 1674.0. Samples: 36213598. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:18,607][45375] Avg episode reward: [(0, '54.380'), (1, '57.460')] +[2023-10-13 03:12:20,329][46662] Updated weights for policy 0, policy_version 70760 (0.0008) +[2023-10-13 03:12:20,699][46662] Updated weights for policy 0, policy_version 70770 (0.0009) +[2023-10-13 03:12:21,083][46662] Updated weights for policy 0, policy_version 70780 (0.0010) +[2023-10-13 03:12:21,660][46663] Updated weights for policy 1, policy_version 70691 (0.0008) +[2023-10-13 03:12:22,035][46663] Updated weights for policy 1, policy_version 70701 (0.0009) +[2023-10-13 03:12:22,396][46663] Updated weights for policy 1, policy_version 70711 (0.0010) +[2023-10-13 03:12:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 144900096. Throughput: 0: 1699.1, 1: 1674.1. Samples: 36233784. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:23,608][45375] Avg episode reward: [(0, '55.610'), (1, '56.820')] +[2023-10-13 03:12:24,966][46662] Updated weights for policy 0, policy_version 70790 (0.0007) +[2023-10-13 03:12:25,332][46662] Updated weights for policy 0, policy_version 70800 (0.0010) +[2023-10-13 03:12:25,717][46662] Updated weights for policy 0, policy_version 70810 (0.0010) +[2023-10-13 03:12:26,493][46663] Updated weights for policy 1, policy_version 70721 (0.0009) +[2023-10-13 03:12:26,859][46663] Updated weights for policy 1, policy_version 70731 (0.0008) +[2023-10-13 03:12:27,220][46663] Updated weights for policy 1, policy_version 70741 (0.0008) +[2023-10-13 03:12:27,595][46663] Updated weights for policy 1, policy_version 70751 (0.0008) +[2023-10-13 03:12:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144965632. Throughput: 0: 1672.7, 1: 1686.9. Samples: 36244186. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:28,607][45375] Avg episode reward: [(0, '54.020'), (1, '57.330')] +[2023-10-13 03:12:29,987][46662] Updated weights for policy 0, policy_version 70820 (0.0009) +[2023-10-13 03:12:30,363][46662] Updated weights for policy 0, policy_version 70830 (0.0010) +[2023-10-13 03:12:30,734][46662] Updated weights for policy 0, policy_version 70840 (0.0010) +[2023-10-13 03:12:31,637][46663] Updated weights for policy 1, policy_version 70761 (0.0008) +[2023-10-13 03:12:31,994][46663] Updated weights for policy 1, policy_version 70771 (0.0007) +[2023-10-13 03:12:32,369][46663] Updated weights for policy 1, policy_version 70781 (0.0008) +[2023-10-13 03:12:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145031168. Throughput: 0: 1685.9, 1: 1659.3. Samples: 36263392. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:33,607][45375] Avg episode reward: [(0, '54.900'), (1, '57.430')] +[2023-10-13 03:12:34,747][46662] Updated weights for policy 0, policy_version 70850 (0.0010) +[2023-10-13 03:12:35,121][46662] Updated weights for policy 0, policy_version 70860 (0.0008) +[2023-10-13 03:12:35,486][46662] Updated weights for policy 0, policy_version 70870 (0.0009) +[2023-10-13 03:12:35,859][46662] Updated weights for policy 0, policy_version 70880 (0.0008) +[2023-10-13 03:12:36,546][46663] Updated weights for policy 1, policy_version 70791 (0.0011) +[2023-10-13 03:12:36,922][46663] Updated weights for policy 1, policy_version 70801 (0.0009) +[2023-10-13 03:12:37,288][46663] Updated weights for policy 1, policy_version 70811 (0.0010) +[2023-10-13 03:12:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145096704. Throughput: 0: 1692.5, 1: 1678.7. Samples: 36284006. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:38,607][45375] Avg episode reward: [(0, '54.460'), (1, '55.910')] +[2023-10-13 03:12:40,047][46662] Updated weights for policy 0, policy_version 70890 (0.0011) +[2023-10-13 03:12:40,412][46662] Updated weights for policy 0, policy_version 70900 (0.0010) +[2023-10-13 03:12:40,778][46662] Updated weights for policy 0, policy_version 70910 (0.0009) +[2023-10-13 03:12:41,160][46663] Updated weights for policy 1, policy_version 70821 (0.0009) +[2023-10-13 03:12:41,532][46663] Updated weights for policy 1, policy_version 70831 (0.0007) +[2023-10-13 03:12:41,902][46663] Updated weights for policy 1, policy_version 70841 (0.0007) +[2023-10-13 03:12:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145162240. Throughput: 0: 1663.6, 1: 1674.2. Samples: 36294028. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:43,608][45375] Avg episode reward: [(0, '53.240'), (1, '55.540')] +[2023-10-13 03:12:44,815][46662] Updated weights for policy 0, policy_version 70920 (0.0010) +[2023-10-13 03:12:45,188][46662] Updated weights for policy 0, policy_version 70930 (0.0008) +[2023-10-13 03:12:45,550][46662] Updated weights for policy 0, policy_version 70940 (0.0007) +[2023-10-13 03:12:46,000][46663] Updated weights for policy 1, policy_version 70851 (0.0008) +[2023-10-13 03:12:46,375][46663] Updated weights for policy 1, policy_version 70861 (0.0010) +[2023-10-13 03:12:46,744][46663] Updated weights for policy 1, policy_version 70871 (0.0007) +[2023-10-13 03:12:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145227776. Throughput: 0: 1680.5, 1: 1663.3. Samples: 36313802. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:48,608][45375] Avg episode reward: [(0, '52.170'), (1, '55.230')] +[2023-10-13 03:12:49,634][46662] Updated weights for policy 0, policy_version 70950 (0.0007) +[2023-10-13 03:12:50,000][46662] Updated weights for policy 0, policy_version 70960 (0.0007) +[2023-10-13 03:12:50,377][46662] Updated weights for policy 0, policy_version 70970 (0.0007) +[2023-10-13 03:12:50,778][46663] Updated weights for policy 1, policy_version 70881 (0.0008) +[2023-10-13 03:12:51,140][46663] Updated weights for policy 1, policy_version 70891 (0.0011) +[2023-10-13 03:12:51,506][46663] Updated weights for policy 1, policy_version 70901 (0.0009) +[2023-10-13 03:12:51,870][46663] Updated weights for policy 1, policy_version 70911 (0.0008) +[2023-10-13 03:12:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145293312. Throughput: 0: 1682.7, 1: 1686.2. Samples: 36334676. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 03:12:53,608][45375] Avg episode reward: [(0, '51.410'), (1, '55.460')] +[2023-10-13 03:12:54,549][46662] Updated weights for policy 0, policy_version 70980 (0.0007) +[2023-10-13 03:12:54,940][46662] Updated weights for policy 0, policy_version 70990 (0.0009) +[2023-10-13 03:12:55,301][46662] Updated weights for policy 0, policy_version 71000 (0.0009) +[2023-10-13 03:12:55,988][46663] Updated weights for policy 1, policy_version 70921 (0.0008) +[2023-10-13 03:12:56,360][46663] Updated weights for policy 1, policy_version 70931 (0.0007) +[2023-10-13 03:12:56,728][46663] Updated weights for policy 1, policy_version 70941 (0.0007) +[2023-10-13 03:12:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145358848. Throughput: 0: 1662.7, 1: 1670.4. Samples: 36344288. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:12:58,607][45375] Avg episode reward: [(0, '51.810'), (1, '55.250')] +[2023-10-13 03:12:59,219][46662] Updated weights for policy 0, policy_version 71010 (0.0008) +[2023-10-13 03:12:59,590][46662] Updated weights for policy 0, policy_version 71020 (0.0008) +[2023-10-13 03:12:59,965][46662] Updated weights for policy 0, policy_version 71030 (0.0010) +[2023-10-13 03:13:00,330][46662] Updated weights for policy 0, policy_version 71040 (0.0007) +[2023-10-13 03:13:00,674][46663] Updated weights for policy 1, policy_version 70951 (0.0008) +[2023-10-13 03:13:01,037][46663] Updated weights for policy 1, policy_version 70961 (0.0009) +[2023-10-13 03:13:01,405][46663] Updated weights for policy 1, policy_version 70971 (0.0009) +[2023-10-13 03:13:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145424384. Throughput: 0: 1677.6, 1: 1683.8. Samples: 36364864. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:03,607][45375] Avg episode reward: [(0, '52.730'), (1, '53.050')] +[2023-10-13 03:13:04,517][46662] Updated weights for policy 0, policy_version 71050 (0.0010) +[2023-10-13 03:13:04,890][46662] Updated weights for policy 0, policy_version 71060 (0.0009) +[2023-10-13 03:13:05,259][46662] Updated weights for policy 0, policy_version 71070 (0.0007) +[2023-10-13 03:13:05,557][46663] Updated weights for policy 1, policy_version 70981 (0.0009) +[2023-10-13 03:13:05,943][46663] Updated weights for policy 1, policy_version 70991 (0.0010) +[2023-10-13 03:13:06,320][46663] Updated weights for policy 1, policy_version 71001 (0.0009) +[2023-10-13 03:13:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145489920. Throughput: 0: 1677.7, 1: 1695.1. Samples: 36385558. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:08,607][45375] Avg episode reward: [(0, '53.150'), (1, '54.050')] +[2023-10-13 03:13:09,266][46662] Updated weights for policy 0, policy_version 71080 (0.0008) +[2023-10-13 03:13:09,641][46662] Updated weights for policy 0, policy_version 71090 (0.0008) +[2023-10-13 03:13:10,008][46662] Updated weights for policy 0, policy_version 71100 (0.0008) +[2023-10-13 03:13:10,227][46663] Updated weights for policy 1, policy_version 71011 (0.0008) +[2023-10-13 03:13:10,592][46663] Updated weights for policy 1, policy_version 71021 (0.0009) +[2023-10-13 03:13:10,955][46663] Updated weights for policy 1, policy_version 71031 (0.0008) +[2023-10-13 03:13:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145555456. Throughput: 0: 1674.0, 1: 1670.7. Samples: 36394696. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:13,607][45375] Avg episode reward: [(0, '54.550'), (1, '54.020')] +[2023-10-13 03:13:13,945][46662] Updated weights for policy 0, policy_version 71110 (0.0009) +[2023-10-13 03:13:14,312][46662] Updated weights for policy 0, policy_version 71120 (0.0008) +[2023-10-13 03:13:14,681][46662] Updated weights for policy 0, policy_version 71130 (0.0009) +[2023-10-13 03:13:14,906][46663] Updated weights for policy 1, policy_version 71041 (0.0008) +[2023-10-13 03:13:15,275][46663] Updated weights for policy 1, policy_version 71051 (0.0010) +[2023-10-13 03:13:15,642][46663] Updated weights for policy 1, policy_version 71061 (0.0009) +[2023-10-13 03:13:16,014][46663] Updated weights for policy 1, policy_version 71071 (0.0008) +[2023-10-13 03:13:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145620992. Throughput: 0: 1683.3, 1: 1700.0. Samples: 36415638. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:18,608][45375] Avg episode reward: [(0, '54.520'), (1, '53.290')] +[2023-10-13 03:13:18,744][46662] Updated weights for policy 0, policy_version 71140 (0.0008) +[2023-10-13 03:13:19,102][46662] Updated weights for policy 0, policy_version 71150 (0.0007) +[2023-10-13 03:13:19,478][46662] Updated weights for policy 0, policy_version 71160 (0.0009) +[2023-10-13 03:13:20,027][46663] Updated weights for policy 1, policy_version 71081 (0.0008) +[2023-10-13 03:13:20,385][46663] Updated weights for policy 1, policy_version 71091 (0.0009) +[2023-10-13 03:13:20,749][46663] Updated weights for policy 1, policy_version 71101 (0.0010) +[2023-10-13 03:13:23,544][46662] Updated weights for policy 0, policy_version 71170 (0.0008) +[2023-10-13 03:13:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145686528. Throughput: 0: 1683.6, 1: 1701.3. Samples: 36436328. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:23,607][45375] Avg episode reward: [(0, '55.490'), (1, '53.400')] +[2023-10-13 03:13:23,921][46662] Updated weights for policy 0, policy_version 71180 (0.0007) +[2023-10-13 03:13:24,292][46662] Updated weights for policy 0, policy_version 71190 (0.0007) +[2023-10-13 03:13:24,660][46662] Updated weights for policy 0, policy_version 71200 (0.0007) +[2023-10-13 03:13:24,961][46663] Updated weights for policy 1, policy_version 71111 (0.0008) +[2023-10-13 03:13:25,335][46663] Updated weights for policy 1, policy_version 71121 (0.0009) +[2023-10-13 03:13:25,695][46663] Updated weights for policy 1, policy_version 71131 (0.0008) +[2023-10-13 03:13:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145752064. Throughput: 0: 1683.5, 1: 1678.2. Samples: 36445302. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:28,607][45375] Avg episode reward: [(0, '55.980'), (1, '53.110')] +[2023-10-13 03:13:28,668][46662] Updated weights for policy 0, policy_version 71210 (0.0007) +[2023-10-13 03:13:29,039][46662] Updated weights for policy 0, policy_version 71220 (0.0009) +[2023-10-13 03:13:29,417][46662] Updated weights for policy 0, policy_version 71230 (0.0008) +[2023-10-13 03:13:29,799][46663] Updated weights for policy 1, policy_version 71141 (0.0007) +[2023-10-13 03:13:30,161][46663] Updated weights for policy 1, policy_version 71151 (0.0007) +[2023-10-13 03:13:30,519][46663] Updated weights for policy 1, policy_version 71161 (0.0007) +[2023-10-13 03:13:33,588][46662] Updated weights for policy 0, policy_version 71240 (0.0009) +[2023-10-13 03:13:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145817600. Throughput: 0: 1688.0, 1: 1697.8. Samples: 36466164. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:33,607][45375] Avg episode reward: [(0, '56.930'), (1, '52.340')] +[2023-10-13 03:13:33,965][46662] Updated weights for policy 0, policy_version 71250 (0.0009) +[2023-10-13 03:13:34,328][46662] Updated weights for policy 0, policy_version 71260 (0.0008) +[2023-10-13 03:13:34,379][46663] Updated weights for policy 1, policy_version 71171 (0.0010) +[2023-10-13 03:13:34,745][46663] Updated weights for policy 1, policy_version 71181 (0.0008) +[2023-10-13 03:13:35,099][46663] Updated weights for policy 1, policy_version 71191 (0.0009) +[2023-10-13 03:13:38,337][46662] Updated weights for policy 0, policy_version 71270 (0.0008) +[2023-10-13 03:13:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145883136. Throughput: 0: 1688.5, 1: 1695.7. Samples: 36486966. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:38,607][45375] Avg episode reward: [(0, '57.770'), (1, '52.770')] +[2023-10-13 03:13:38,708][46662] Updated weights for policy 0, policy_version 71280 (0.0009) +[2023-10-13 03:13:39,068][46662] Updated weights for policy 0, policy_version 71290 (0.0008) +[2023-10-13 03:13:39,084][46663] Updated weights for policy 1, policy_version 71201 (0.0008) +[2023-10-13 03:13:39,453][46663] Updated weights for policy 1, policy_version 71211 (0.0010) +[2023-10-13 03:13:39,822][46663] Updated weights for policy 1, policy_version 71221 (0.0009) +[2023-10-13 03:13:40,186][46663] Updated weights for policy 1, policy_version 71231 (0.0009) +[2023-10-13 03:13:43,389][46662] Updated weights for policy 0, policy_version 71300 (0.0008) +[2023-10-13 03:13:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 145948672. Throughput: 0: 1693.1, 1: 1681.4. Samples: 36496140. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:13:43,607][45375] Avg episode reward: [(0, '57.510'), (1, '53.430')] +[2023-10-13 03:13:43,793][46662] Updated weights for policy 0, policy_version 71310 (0.0007) +[2023-10-13 03:13:44,154][46662] Updated weights for policy 0, policy_version 71320 (0.0007) +[2023-10-13 03:13:44,319][46663] Updated weights for policy 1, policy_version 71241 (0.0008) +[2023-10-13 03:13:44,680][46663] Updated weights for policy 1, policy_version 71251 (0.0007) +[2023-10-13 03:13:45,044][46663] Updated weights for policy 1, policy_version 71261 (0.0007) +[2023-10-13 03:13:48,241][46662] Updated weights for policy 0, policy_version 71330 (0.0007) +[2023-10-13 03:13:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146014208. Throughput: 0: 1681.1, 1: 1689.0. Samples: 36516518. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:13:48,607][45375] Avg episode reward: [(0, '56.720'), (1, '52.800')] +[2023-10-13 03:13:48,607][46662] Updated weights for policy 0, policy_version 71340 (0.0008) +[2023-10-13 03:13:48,972][46662] Updated weights for policy 0, policy_version 71350 (0.0010) +[2023-10-13 03:13:49,093][46663] Updated weights for policy 1, policy_version 71271 (0.0007) +[2023-10-13 03:13:49,337][46662] Updated weights for policy 0, policy_version 71360 (0.0010) +[2023-10-13 03:13:49,454][46663] Updated weights for policy 1, policy_version 71281 (0.0007) +[2023-10-13 03:13:49,829][46663] Updated weights for policy 1, policy_version 71291 (0.0007) +[2023-10-13 03:13:53,450][46662] Updated weights for policy 0, policy_version 71370 (0.0009) +[2023-10-13 03:13:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 146079744. Throughput: 0: 1677.8, 1: 1688.9. Samples: 36537062. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:13:53,607][45375] Avg episode reward: [(0, '56.410'), (1, '51.820')] +[2023-10-13 03:13:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000071296_73007104.pth... +[2023-10-13 03:13:53,652][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000069728_71401472.pth +[2023-10-13 03:13:53,814][46662] Updated weights for policy 0, policy_version 71380 (0.0008) +[2023-10-13 03:13:53,990][46663] Updated weights for policy 1, policy_version 71301 (0.0008) +[2023-10-13 03:13:54,184][46662] Updated weights for policy 0, policy_version 71390 (0.0009) +[2023-10-13 03:13:54,256][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000071392_73105408.pth... 
+[2023-10-13 03:13:54,284][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000069792_71467008.pth +[2023-10-13 03:13:54,368][46663] Updated weights for policy 1, policy_version 71311 (0.0009) +[2023-10-13 03:13:54,734][46663] Updated weights for policy 1, policy_version 71321 (0.0008) +[2023-10-13 03:13:58,431][46662] Updated weights for policy 0, policy_version 71400 (0.0007) +[2023-10-13 03:13:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146145280. Throughput: 0: 1674.1, 1: 1688.9. Samples: 36546032. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:13:58,607][45375] Avg episode reward: [(0, '57.020'), (1, '51.220')] +[2023-10-13 03:13:58,770][46663] Updated weights for policy 1, policy_version 71331 (0.0007) +[2023-10-13 03:13:58,810][46662] Updated weights for policy 0, policy_version 71410 (0.0009) +[2023-10-13 03:13:59,133][46663] Updated weights for policy 1, policy_version 71341 (0.0008) +[2023-10-13 03:13:59,176][46662] Updated weights for policy 0, policy_version 71420 (0.0007) +[2023-10-13 03:13:59,503][46663] Updated weights for policy 1, policy_version 71351 (0.0008) +[2023-10-13 03:14:03,281][46662] Updated weights for policy 0, policy_version 71430 (0.0007) +[2023-10-13 03:14:03,596][46663] Updated weights for policy 1, policy_version 71361 (0.0009) +[2023-10-13 03:14:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146210816. Throughput: 0: 1671.3, 1: 1688.3. Samples: 36566818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:03,608][45375] Avg episode reward: [(0, '58.570'), (1, '51.940')] +[2023-10-13 03:14:03,662][46662] Updated weights for policy 0, policy_version 71440 (0.0007) +[2023-10-13 03:14:03,958][46663] Updated weights for policy 1, policy_version 71371 (0.0009) +[2023-10-13 03:14:04,031][46662] Updated weights for policy 0, policy_version 71450 (0.0007) +[2023-10-13 03:14:04,326][46663] Updated weights for policy 1, policy_version 71381 (0.0008) +[2023-10-13 03:14:04,697][46663] Updated weights for policy 1, policy_version 71391 (0.0009) +[2023-10-13 03:14:07,944][46662] Updated weights for policy 0, policy_version 71460 (0.0009) +[2023-10-13 03:14:08,321][46662] Updated weights for policy 0, policy_version 71470 (0.0010) +[2023-10-13 03:14:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146276352. Throughput: 0: 1669.2, 1: 1689.8. Samples: 36587484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:08,607][45375] Avg episode reward: [(0, '59.940'), (1, '52.360')] +[2023-10-13 03:14:08,701][46662] Updated weights for policy 0, policy_version 71480 (0.0009) +[2023-10-13 03:14:08,806][46663] Updated weights for policy 1, policy_version 71401 (0.0008) +[2023-10-13 03:14:09,175][46663] Updated weights for policy 1, policy_version 71411 (0.0009) +[2023-10-13 03:14:09,541][46663] Updated weights for policy 1, policy_version 71421 (0.0009) +[2023-10-13 03:14:12,717][46662] Updated weights for policy 0, policy_version 71490 (0.0008) +[2023-10-13 03:14:13,082][46662] Updated weights for policy 0, policy_version 71500 (0.0008) +[2023-10-13 03:14:13,446][46662] Updated weights for policy 0, policy_version 71510 (0.0008) +[2023-10-13 03:14:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 146341888. Throughput: 0: 1668.0, 1: 1692.1. Samples: 36596506. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:13,608][45375] Avg episode reward: [(0, '60.640'), (1, '55.830')] +[2023-10-13 03:14:13,615][46663] Updated weights for policy 1, policy_version 71431 (0.0008) +[2023-10-13 03:14:13,820][46662] Updated weights for policy 0, policy_version 71520 (0.0008) +[2023-10-13 03:14:13,978][46663] Updated weights for policy 1, policy_version 71441 (0.0007) +[2023-10-13 03:14:14,350][46663] Updated weights for policy 1, policy_version 71451 (0.0008) +[2023-10-13 03:14:17,776][46662] Updated weights for policy 0, policy_version 71530 (0.0009) +[2023-10-13 03:14:18,151][46662] Updated weights for policy 0, policy_version 71540 (0.0010) +[2023-10-13 03:14:18,390][46663] Updated weights for policy 1, policy_version 71461 (0.0007) +[2023-10-13 03:14:18,522][46662] Updated weights for policy 0, policy_version 71550 (0.0007) +[2023-10-13 03:14:18,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146440192. Throughput: 0: 1666.8, 1: 1689.8. Samples: 36617208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:18,607][45375] Avg episode reward: [(0, '59.040'), (1, '56.750')] +[2023-10-13 03:14:18,755][46663] Updated weights for policy 1, policy_version 71471 (0.0009) +[2023-10-13 03:14:19,126][46663] Updated weights for policy 1, policy_version 71481 (0.0008) +[2023-10-13 03:14:22,624][46662] Updated weights for policy 0, policy_version 71560 (0.0007) +[2023-10-13 03:14:22,994][46662] Updated weights for policy 0, policy_version 71570 (0.0009) +[2023-10-13 03:14:23,146][46663] Updated weights for policy 1, policy_version 71491 (0.0007) +[2023-10-13 03:14:23,366][46662] Updated weights for policy 0, policy_version 71580 (0.0009) +[2023-10-13 03:14:23,520][46663] Updated weights for policy 1, policy_version 71501 (0.0009) +[2023-10-13 03:14:23,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146505728. 
Throughput: 0: 1653.9, 1: 1683.2. Samples: 36637136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:23,607][45375] Avg episode reward: [(0, '59.040'), (1, '55.180')] +[2023-10-13 03:14:23,873][46663] Updated weights for policy 1, policy_version 71511 (0.0010) +[2023-10-13 03:14:27,520][46662] Updated weights for policy 0, policy_version 71590 (0.0008) +[2023-10-13 03:14:27,879][46662] Updated weights for policy 0, policy_version 71600 (0.0010) +[2023-10-13 03:14:27,968][46663] Updated weights for policy 1, policy_version 71521 (0.0010) +[2023-10-13 03:14:28,257][46662] Updated weights for policy 0, policy_version 71610 (0.0008) +[2023-10-13 03:14:28,322][46663] Updated weights for policy 1, policy_version 71531 (0.0007) +[2023-10-13 03:14:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146571264. Throughput: 0: 1665.0, 1: 1692.8. Samples: 36647242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:28,608][45375] Avg episode reward: [(0, '59.190'), (1, '54.520')] +[2023-10-13 03:14:28,691][46663] Updated weights for policy 1, policy_version 71541 (0.0010) +[2023-10-13 03:14:29,046][46663] Updated weights for policy 1, policy_version 71551 (0.0008) +[2023-10-13 03:14:32,457][46662] Updated weights for policy 0, policy_version 71620 (0.0009) +[2023-10-13 03:14:32,856][46662] Updated weights for policy 0, policy_version 71630 (0.0009) +[2023-10-13 03:14:33,146][46663] Updated weights for policy 1, policy_version 71561 (0.0007) +[2023-10-13 03:14:33,222][46662] Updated weights for policy 0, policy_version 71640 (0.0008) +[2023-10-13 03:14:33,516][46663] Updated weights for policy 1, policy_version 71571 (0.0008) +[2023-10-13 03:14:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146636800. Throughput: 0: 1673.6, 1: 1690.6. Samples: 36667910. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:33,607][45375] Avg episode reward: [(0, '57.670'), (1, '53.460')] +[2023-10-13 03:14:33,888][46663] Updated weights for policy 1, policy_version 71581 (0.0008) +[2023-10-13 03:14:37,336][46662] Updated weights for policy 0, policy_version 71650 (0.0009) +[2023-10-13 03:14:37,703][46662] Updated weights for policy 0, policy_version 71660 (0.0008) +[2023-10-13 03:14:37,981][46663] Updated weights for policy 1, policy_version 71591 (0.0007) +[2023-10-13 03:14:38,080][46662] Updated weights for policy 0, policy_version 71670 (0.0009) +[2023-10-13 03:14:38,351][46663] Updated weights for policy 1, policy_version 71601 (0.0007) +[2023-10-13 03:14:38,454][46662] Updated weights for policy 0, policy_version 71680 (0.0010) +[2023-10-13 03:14:38,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146702336. Throughput: 0: 1656.6, 1: 1675.3. Samples: 36686996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:38,607][45375] Avg episode reward: [(0, '55.370'), (1, '54.320')] +[2023-10-13 03:14:38,715][46663] Updated weights for policy 1, policy_version 71611 (0.0008) +[2023-10-13 03:14:42,403][46662] Updated weights for policy 0, policy_version 71690 (0.0007) +[2023-10-13 03:14:42,770][46662] Updated weights for policy 0, policy_version 71700 (0.0008) +[2023-10-13 03:14:42,862][46663] Updated weights for policy 1, policy_version 71621 (0.0008) +[2023-10-13 03:14:43,136][46662] Updated weights for policy 0, policy_version 71710 (0.0008) +[2023-10-13 03:14:43,247][46663] Updated weights for policy 1, policy_version 71631 (0.0010) +[2023-10-13 03:14:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146767872. Throughput: 0: 1672.4, 1: 1688.6. Samples: 36697274. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:43,607][45375] Avg episode reward: [(0, '54.640'), (1, '54.010')] +[2023-10-13 03:14:43,621][46663] Updated weights for policy 1, policy_version 71641 (0.0011) +[2023-10-13 03:14:47,294][46662] Updated weights for policy 0, policy_version 71720 (0.0009) +[2023-10-13 03:14:47,667][46662] Updated weights for policy 0, policy_version 71730 (0.0009) +[2023-10-13 03:14:47,775][46663] Updated weights for policy 1, policy_version 71651 (0.0008) +[2023-10-13 03:14:48,032][46662] Updated weights for policy 0, policy_version 71740 (0.0008) +[2023-10-13 03:14:48,137][46663] Updated weights for policy 1, policy_version 71661 (0.0007) +[2023-10-13 03:14:48,504][46663] Updated weights for policy 1, policy_version 71671 (0.0009) +[2023-10-13 03:14:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146833408. Throughput: 0: 1675.1, 1: 1680.3. Samples: 36717810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:48,607][45375] Avg episode reward: [(0, '54.040'), (1, '54.720')] +[2023-10-13 03:14:52,031][46662] Updated weights for policy 0, policy_version 71750 (0.0011) +[2023-10-13 03:14:52,394][46662] Updated weights for policy 0, policy_version 71760 (0.0011) +[2023-10-13 03:14:52,600][46663] Updated weights for policy 1, policy_version 71681 (0.0010) +[2023-10-13 03:14:52,776][46662] Updated weights for policy 0, policy_version 71770 (0.0010) +[2023-10-13 03:14:52,966][46663] Updated weights for policy 1, policy_version 71691 (0.0009) +[2023-10-13 03:14:53,331][46663] Updated weights for policy 1, policy_version 71701 (0.0011) +[2023-10-13 03:14:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146898944. Throughput: 0: 1649.7, 1: 1661.7. Samples: 36736500. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:53,607][45375] Avg episode reward: [(0, '54.620'), (1, '56.080')] +[2023-10-13 03:14:53,696][46663] Updated weights for policy 1, policy_version 71711 (0.0008) +[2023-10-13 03:14:56,772][46662] Updated weights for policy 0, policy_version 71780 (0.0008) +[2023-10-13 03:14:57,135][46662] Updated weights for policy 0, policy_version 71790 (0.0008) +[2023-10-13 03:14:57,505][46662] Updated weights for policy 0, policy_version 71800 (0.0009) +[2023-10-13 03:14:57,828][46663] Updated weights for policy 1, policy_version 71721 (0.0007) +[2023-10-13 03:14:58,202][46663] Updated weights for policy 1, policy_version 71731 (0.0009) +[2023-10-13 03:14:58,578][46663] Updated weights for policy 1, policy_version 71741 (0.0008) +[2023-10-13 03:14:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146964480. Throughput: 0: 1677.8, 1: 1679.4. Samples: 36747580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:14:58,607][45375] Avg episode reward: [(0, '54.280'), (1, '57.470')] +[2023-10-13 03:15:01,787][46662] Updated weights for policy 0, policy_version 71810 (0.0008) +[2023-10-13 03:15:02,158][46662] Updated weights for policy 0, policy_version 71820 (0.0010) +[2023-10-13 03:15:02,524][46662] Updated weights for policy 0, policy_version 71830 (0.0009) +[2023-10-13 03:15:02,773][46663] Updated weights for policy 1, policy_version 71751 (0.0008) +[2023-10-13 03:15:02,898][46662] Updated weights for policy 0, policy_version 71840 (0.0009) +[2023-10-13 03:15:03,135][46663] Updated weights for policy 1, policy_version 71761 (0.0009) +[2023-10-13 03:15:03,500][46663] Updated weights for policy 1, policy_version 71771 (0.0008) +[2023-10-13 03:15:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147030016. Throughput: 0: 1671.3, 1: 1676.5. Samples: 36767858. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:15:03,608][45375] Avg episode reward: [(0, '54.750'), (1, '56.790')] +[2023-10-13 03:15:06,981][46662] Updated weights for policy 0, policy_version 71850 (0.0010) +[2023-10-13 03:15:07,349][46662] Updated weights for policy 0, policy_version 71860 (0.0008) +[2023-10-13 03:15:07,600][46663] Updated weights for policy 1, policy_version 71781 (0.0009) +[2023-10-13 03:15:07,718][46662] Updated weights for policy 0, policy_version 71870 (0.0008) +[2023-10-13 03:15:07,976][46663] Updated weights for policy 1, policy_version 71791 (0.0008) +[2023-10-13 03:15:08,342][46663] Updated weights for policy 1, policy_version 71801 (0.0008) +[2023-10-13 03:15:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 147128320. Throughput: 0: 1655.2, 1: 1661.8. Samples: 36786402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:15:08,608][45375] Avg episode reward: [(0, '55.820'), (1, '54.860')] +[2023-10-13 03:15:11,847][46662] Updated weights for policy 0, policy_version 71880 (0.0008) +[2023-10-13 03:15:12,214][46662] Updated weights for policy 0, policy_version 71890 (0.0008) +[2023-10-13 03:15:12,434][46663] Updated weights for policy 1, policy_version 71811 (0.0008) +[2023-10-13 03:15:12,580][46662] Updated weights for policy 0, policy_version 71900 (0.0009) +[2023-10-13 03:15:12,796][46663] Updated weights for policy 1, policy_version 71821 (0.0011) +[2023-10-13 03:15:13,159][46663] Updated weights for policy 1, policy_version 71831 (0.0011) +[2023-10-13 03:15:13,606][45375] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 147193856. Throughput: 0: 1669.3, 1: 1674.1. Samples: 36797694. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:15:13,607][45375] Avg episode reward: [(0, '56.380'), (1, '55.400')] +[2023-10-13 03:15:16,569][46662] Updated weights for policy 0, policy_version 71910 (0.0009) +[2023-10-13 03:15:16,934][46662] Updated weights for policy 0, policy_version 71920 (0.0009) +[2023-10-13 03:15:17,203][46663] Updated weights for policy 1, policy_version 71841 (0.0009) +[2023-10-13 03:15:17,297][46662] Updated weights for policy 0, policy_version 71930 (0.0008) +[2023-10-13 03:15:17,573][46663] Updated weights for policy 1, policy_version 71851 (0.0007) +[2023-10-13 03:15:17,933][46663] Updated weights for policy 1, policy_version 71861 (0.0009) +[2023-10-13 03:15:18,306][46663] Updated weights for policy 1, policy_version 71871 (0.0007) +[2023-10-13 03:15:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 147259392. Throughput: 0: 1664.5, 1: 1669.8. Samples: 36817954. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:15:18,607][45375] Avg episode reward: [(0, '56.210'), (1, '53.580')] +[2023-10-13 03:15:21,540][46662] Updated weights for policy 0, policy_version 71940 (0.0008) +[2023-10-13 03:15:21,927][46662] Updated weights for policy 0, policy_version 71950 (0.0008) +[2023-10-13 03:15:22,270][46663] Updated weights for policy 1, policy_version 71881 (0.0009) +[2023-10-13 03:15:22,296][46662] Updated weights for policy 0, policy_version 71960 (0.0009) +[2023-10-13 03:15:22,632][46663] Updated weights for policy 1, policy_version 71891 (0.0010) +[2023-10-13 03:15:22,997][46663] Updated weights for policy 1, policy_version 71901 (0.0008) +[2023-10-13 03:15:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147324928. Throughput: 0: 1660.1, 1: 1662.4. Samples: 36836506. 
Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:23,607][45375] Avg episode reward: [(0, '55.700'), (1, '52.950')] +[2023-10-13 03:15:26,518][46662] Updated weights for policy 0, policy_version 71970 (0.0008) +[2023-10-13 03:15:26,891][46662] Updated weights for policy 0, policy_version 71980 (0.0009) +[2023-10-13 03:15:26,961][46663] Updated weights for policy 1, policy_version 71911 (0.0009) +[2023-10-13 03:15:27,266][46662] Updated weights for policy 0, policy_version 71990 (0.0009) +[2023-10-13 03:15:27,335][46663] Updated weights for policy 1, policy_version 71921 (0.0009) +[2023-10-13 03:15:27,641][46662] Updated weights for policy 0, policy_version 72000 (0.0010) +[2023-10-13 03:15:27,704][46663] Updated weights for policy 1, policy_version 71931 (0.0007) +[2023-10-13 03:15:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 147390464. Throughput: 0: 1669.2, 1: 1679.0. Samples: 36847942. Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:28,607][45375] Avg episode reward: [(0, '55.310'), (1, '53.270')] +[2023-10-13 03:15:31,607][46662] Updated weights for policy 0, policy_version 72010 (0.0008) +[2023-10-13 03:15:31,823][46663] Updated weights for policy 1, policy_version 71941 (0.0008) +[2023-10-13 03:15:31,966][46662] Updated weights for policy 0, policy_version 72020 (0.0007) +[2023-10-13 03:15:32,180][46663] Updated weights for policy 1, policy_version 71951 (0.0008) +[2023-10-13 03:15:32,335][46662] Updated weights for policy 0, policy_version 72030 (0.0009) +[2023-10-13 03:15:32,542][46663] Updated weights for policy 1, policy_version 71961 (0.0008) +[2023-10-13 03:15:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147456000. Throughput: 0: 1654.8, 1: 1664.8. Samples: 36867192. 
Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:33,607][45375] Avg episode reward: [(0, '54.350'), (1, '55.830')] +[2023-10-13 03:15:36,482][46662] Updated weights for policy 0, policy_version 72040 (0.0008) +[2023-10-13 03:15:36,598][46663] Updated weights for policy 1, policy_version 71971 (0.0008) +[2023-10-13 03:15:36,857][46662] Updated weights for policy 0, policy_version 72050 (0.0009) +[2023-10-13 03:15:36,967][46663] Updated weights for policy 1, policy_version 71981 (0.0008) +[2023-10-13 03:15:37,235][46662] Updated weights for policy 0, policy_version 72060 (0.0007) +[2023-10-13 03:15:37,331][46663] Updated weights for policy 1, policy_version 71991 (0.0008) +[2023-10-13 03:15:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147521536. Throughput: 0: 1662.5, 1: 1670.6. Samples: 36886492. Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:38,607][45375] Avg episode reward: [(0, '55.100'), (1, '57.220')] +[2023-10-13 03:15:41,054][46662] Updated weights for policy 0, policy_version 72070 (0.0009) +[2023-10-13 03:15:41,424][46662] Updated weights for policy 0, policy_version 72080 (0.0010) +[2023-10-13 03:15:41,566][46663] Updated weights for policy 1, policy_version 72001 (0.0008) +[2023-10-13 03:15:41,792][46662] Updated weights for policy 0, policy_version 72090 (0.0008) +[2023-10-13 03:15:41,935][46663] Updated weights for policy 1, policy_version 72011 (0.0008) +[2023-10-13 03:15:42,315][46663] Updated weights for policy 1, policy_version 72021 (0.0009) +[2023-10-13 03:15:42,672][46663] Updated weights for policy 1, policy_version 72031 (0.0010) +[2023-10-13 03:15:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147587072. Throughput: 0: 1668.9, 1: 1678.0. Samples: 36898192. 
Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:43,607][45375] Avg episode reward: [(0, '55.620'), (1, '56.560')] +[2023-10-13 03:15:45,881][46662] Updated weights for policy 0, policy_version 72100 (0.0010) +[2023-10-13 03:15:46,244][46662] Updated weights for policy 0, policy_version 72110 (0.0008) +[2023-10-13 03:15:46,615][46662] Updated weights for policy 0, policy_version 72120 (0.0007) +[2023-10-13 03:15:46,851][46663] Updated weights for policy 1, policy_version 72041 (0.0008) +[2023-10-13 03:15:47,215][46663] Updated weights for policy 1, policy_version 72051 (0.0009) +[2023-10-13 03:15:47,586][46663] Updated weights for policy 1, policy_version 72061 (0.0009) +[2023-10-13 03:15:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147652608. Throughput: 0: 1655.2, 1: 1660.4. Samples: 36917062. Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:48,607][45375] Avg episode reward: [(0, '56.320'), (1, '56.830')] +[2023-10-13 03:15:50,669][46662] Updated weights for policy 0, policy_version 72130 (0.0007) +[2023-10-13 03:15:51,046][46662] Updated weights for policy 0, policy_version 72140 (0.0010) +[2023-10-13 03:15:51,421][46662] Updated weights for policy 0, policy_version 72150 (0.0008) +[2023-10-13 03:15:51,596][46663] Updated weights for policy 1, policy_version 72071 (0.0007) +[2023-10-13 03:15:51,795][46662] Updated weights for policy 0, policy_version 72160 (0.0007) +[2023-10-13 03:15:51,956][46663] Updated weights for policy 1, policy_version 72081 (0.0008) +[2023-10-13 03:15:52,327][46663] Updated weights for policy 1, policy_version 72091 (0.0007) +[2023-10-13 03:15:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147718144. Throughput: 0: 1677.9, 1: 1674.8. Samples: 36937272. 
Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:53,607][45375] Avg episode reward: [(0, '57.340'), (1, '54.310')] +[2023-10-13 03:15:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000072096_73826304.pth... +[2023-10-13 03:15:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000072160_73891840.pth... +[2023-10-13 03:15:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000070592_72286208.pth +[2023-10-13 03:15:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000070528_72220672.pth +[2023-10-13 03:15:55,750][46662] Updated weights for policy 0, policy_version 72170 (0.0011) +[2023-10-13 03:15:56,123][46662] Updated weights for policy 0, policy_version 72180 (0.0010) +[2023-10-13 03:15:56,302][46663] Updated weights for policy 1, policy_version 72101 (0.0008) +[2023-10-13 03:15:56,500][46662] Updated weights for policy 0, policy_version 72190 (0.0008) +[2023-10-13 03:15:56,671][46663] Updated weights for policy 1, policy_version 72111 (0.0010) +[2023-10-13 03:15:57,031][46663] Updated weights for policy 1, policy_version 72121 (0.0010) +[2023-10-13 03:15:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147783680. Throughput: 0: 1669.4, 1: 1679.6. Samples: 36948398. 
Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:15:58,607][45375] Avg episode reward: [(0, '56.970'), (1, '53.330')] +[2023-10-13 03:16:00,729][46662] Updated weights for policy 0, policy_version 72200 (0.0009) +[2023-10-13 03:16:01,103][46662] Updated weights for policy 0, policy_version 72210 (0.0009) +[2023-10-13 03:16:01,191][46663] Updated weights for policy 1, policy_version 72131 (0.0008) +[2023-10-13 03:16:01,466][46662] Updated weights for policy 0, policy_version 72220 (0.0009) +[2023-10-13 03:16:01,552][46663] Updated weights for policy 1, policy_version 72141 (0.0010) +[2023-10-13 03:16:01,927][46663] Updated weights for policy 1, policy_version 72151 (0.0008) +[2023-10-13 03:16:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147849216. Throughput: 0: 1654.4, 1: 1658.3. Samples: 36967028. Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:16:03,608][45375] Avg episode reward: [(0, '57.530'), (1, '51.780')] +[2023-10-13 03:16:05,630][46662] Updated weights for policy 0, policy_version 72230 (0.0007) +[2023-10-13 03:16:05,960][46663] Updated weights for policy 1, policy_version 72161 (0.0007) +[2023-10-13 03:16:06,005][46662] Updated weights for policy 0, policy_version 72240 (0.0007) +[2023-10-13 03:16:06,338][46663] Updated weights for policy 1, policy_version 72171 (0.0010) +[2023-10-13 03:16:06,375][46662] Updated weights for policy 0, policy_version 72250 (0.0008) +[2023-10-13 03:16:06,710][46663] Updated weights for policy 1, policy_version 72181 (0.0010) +[2023-10-13 03:16:07,067][46663] Updated weights for policy 1, policy_version 72191 (0.0007) +[2023-10-13 03:16:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147914752. Throughput: 0: 1676.6, 1: 1678.5. Samples: 36987488. 
Policy #0 lag: (min: 11.0, avg: 13.5, max: 43.0) +[2023-10-13 03:16:08,607][45375] Avg episode reward: [(0, '58.170'), (1, '53.100')] +[2023-10-13 03:16:10,455][46662] Updated weights for policy 0, policy_version 72260 (0.0009) +[2023-10-13 03:16:10,828][46662] Updated weights for policy 0, policy_version 72270 (0.0010) +[2023-10-13 03:16:11,198][46662] Updated weights for policy 0, policy_version 72280 (0.0008) +[2023-10-13 03:16:11,247][46663] Updated weights for policy 1, policy_version 72201 (0.0008) +[2023-10-13 03:16:11,614][46663] Updated weights for policy 1, policy_version 72211 (0.0009) +[2023-10-13 03:16:11,986][46663] Updated weights for policy 1, policy_version 72221 (0.0007) +[2023-10-13 03:16:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147980288. Throughput: 0: 1672.2, 1: 1665.0. Samples: 36998114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:13,607][45375] Avg episode reward: [(0, '59.130'), (1, '54.480')] +[2023-10-13 03:16:15,086][46662] Updated weights for policy 0, policy_version 72290 (0.0010) +[2023-10-13 03:16:15,451][46662] Updated weights for policy 0, policy_version 72300 (0.0007) +[2023-10-13 03:16:15,825][46662] Updated weights for policy 0, policy_version 72310 (0.0007) +[2023-10-13 03:16:15,970][46663] Updated weights for policy 1, policy_version 72231 (0.0009) +[2023-10-13 03:16:16,202][46662] Updated weights for policy 0, policy_version 72320 (0.0008) +[2023-10-13 03:16:16,338][46663] Updated weights for policy 1, policy_version 72241 (0.0010) +[2023-10-13 03:16:16,712][46663] Updated weights for policy 1, policy_version 72251 (0.0011) +[2023-10-13 03:16:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 148045824. Throughput: 0: 1675.5, 1: 1667.6. Samples: 37017628. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:18,607][45375] Avg episode reward: [(0, '57.790'), (1, '55.020')] +[2023-10-13 03:16:20,397][46662] Updated weights for policy 0, policy_version 72330 (0.0008) +[2023-10-13 03:16:20,762][46662] Updated weights for policy 0, policy_version 72340 (0.0008) +[2023-10-13 03:16:20,932][46663] Updated weights for policy 1, policy_version 72261 (0.0010) +[2023-10-13 03:16:21,131][46662] Updated weights for policy 0, policy_version 72350 (0.0009) +[2023-10-13 03:16:21,333][46663] Updated weights for policy 1, policy_version 72271 (0.0008) +[2023-10-13 03:16:21,703][46663] Updated weights for policy 1, policy_version 72281 (0.0008) +[2023-10-13 03:16:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 148111360. Throughput: 0: 1692.0, 1: 1677.9. Samples: 37038136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:23,608][45375] Avg episode reward: [(0, '55.340'), (1, '54.770')] +[2023-10-13 03:16:25,064][46662] Updated weights for policy 0, policy_version 72360 (0.0010) +[2023-10-13 03:16:25,434][46662] Updated weights for policy 0, policy_version 72370 (0.0010) +[2023-10-13 03:16:25,651][46663] Updated weights for policy 1, policy_version 72291 (0.0009) +[2023-10-13 03:16:25,810][46662] Updated weights for policy 0, policy_version 72380 (0.0009) +[2023-10-13 03:16:26,013][46663] Updated weights for policy 1, policy_version 72301 (0.0009) +[2023-10-13 03:16:26,385][46663] Updated weights for policy 1, policy_version 72311 (0.0010) +[2023-10-13 03:16:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148176896. Throughput: 0: 1663.4, 1: 1662.6. Samples: 37047862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:28,607][45375] Avg episode reward: [(0, '54.700'), (1, '54.290')] +[2023-10-13 03:16:29,921][46662] Updated weights for policy 0, policy_version 72390 (0.0007) +[2023-10-13 03:16:30,287][46662] Updated weights for policy 0, policy_version 72400 (0.0010) +[2023-10-13 03:16:30,476][46663] Updated weights for policy 1, policy_version 72321 (0.0009) +[2023-10-13 03:16:30,662][46662] Updated weights for policy 0, policy_version 72410 (0.0009) +[2023-10-13 03:16:30,846][46663] Updated weights for policy 1, policy_version 72331 (0.0010) +[2023-10-13 03:16:31,222][46663] Updated weights for policy 1, policy_version 72341 (0.0008) +[2023-10-13 03:16:31,585][46663] Updated weights for policy 1, policy_version 72351 (0.0010) +[2023-10-13 03:16:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148242432. Throughput: 0: 1681.7, 1: 1670.3. Samples: 37067900. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:33,608][45375] Avg episode reward: [(0, '56.070'), (1, '52.860')] +[2023-10-13 03:16:34,753][46662] Updated weights for policy 0, policy_version 72420 (0.0009) +[2023-10-13 03:16:35,121][46662] Updated weights for policy 0, policy_version 72430 (0.0010) +[2023-10-13 03:16:35,493][46662] Updated weights for policy 0, policy_version 72440 (0.0009) +[2023-10-13 03:16:35,675][46663] Updated weights for policy 1, policy_version 72361 (0.0008) +[2023-10-13 03:16:36,041][46663] Updated weights for policy 1, policy_version 72371 (0.0008) +[2023-10-13 03:16:36,408][46663] Updated weights for policy 1, policy_version 72381 (0.0010) +[2023-10-13 03:16:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148307968. Throughput: 0: 1684.8, 1: 1675.9. Samples: 37088500. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:38,607][45375] Avg episode reward: [(0, '57.200'), (1, '53.730')] +[2023-10-13 03:16:39,504][46662] Updated weights for policy 0, policy_version 72450 (0.0010) +[2023-10-13 03:16:39,879][46662] Updated weights for policy 0, policy_version 72460 (0.0011) +[2023-10-13 03:16:40,263][46662] Updated weights for policy 0, policy_version 72470 (0.0011) +[2023-10-13 03:16:40,609][46663] Updated weights for policy 1, policy_version 72391 (0.0008) +[2023-10-13 03:16:40,627][46662] Updated weights for policy 0, policy_version 72480 (0.0008) +[2023-10-13 03:16:40,968][46663] Updated weights for policy 1, policy_version 72401 (0.0008) +[2023-10-13 03:16:41,335][46663] Updated weights for policy 1, policy_version 72411 (0.0009) +[2023-10-13 03:16:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148373504. Throughput: 0: 1666.8, 1: 1657.9. Samples: 37098006. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:43,607][45375] Avg episode reward: [(0, '57.650'), (1, '54.750')] +[2023-10-13 03:16:44,637][46662] Updated weights for policy 0, policy_version 72490 (0.0007) +[2023-10-13 03:16:45,017][46662] Updated weights for policy 0, policy_version 72500 (0.0009) +[2023-10-13 03:16:45,377][46662] Updated weights for policy 0, policy_version 72510 (0.0009) +[2023-10-13 03:16:45,452][46663] Updated weights for policy 1, policy_version 72421 (0.0009) +[2023-10-13 03:16:45,820][46663] Updated weights for policy 1, policy_version 72431 (0.0007) +[2023-10-13 03:16:46,175][46663] Updated weights for policy 1, policy_version 72441 (0.0008) +[2023-10-13 03:16:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148439040. Throughput: 0: 1691.2, 1: 1677.4. Samples: 37118618. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:48,608][45375] Avg episode reward: [(0, '56.260'), (1, '54.010')] +[2023-10-13 03:16:49,469][46662] Updated weights for policy 0, policy_version 72520 (0.0009) +[2023-10-13 03:16:49,839][46662] Updated weights for policy 0, policy_version 72530 (0.0007) +[2023-10-13 03:16:50,215][46662] Updated weights for policy 0, policy_version 72540 (0.0008) +[2023-10-13 03:16:50,342][46663] Updated weights for policy 1, policy_version 72451 (0.0007) +[2023-10-13 03:16:50,707][46663] Updated weights for policy 1, policy_version 72461 (0.0009) +[2023-10-13 03:16:51,070][46663] Updated weights for policy 1, policy_version 72471 (0.0008) +[2023-10-13 03:16:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148504576. Throughput: 0: 1689.8, 1: 1674.2. Samples: 37138866. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:53,607][45375] Avg episode reward: [(0, '56.940'), (1, '55.320')] +[2023-10-13 03:16:54,403][46662] Updated weights for policy 0, policy_version 72550 (0.0008) +[2023-10-13 03:16:54,775][46662] Updated weights for policy 0, policy_version 72560 (0.0008) +[2023-10-13 03:16:55,152][46662] Updated weights for policy 0, policy_version 72570 (0.0009) +[2023-10-13 03:16:55,209][46663] Updated weights for policy 1, policy_version 72481 (0.0008) +[2023-10-13 03:16:55,573][46663] Updated weights for policy 1, policy_version 72491 (0.0007) +[2023-10-13 03:16:55,938][46663] Updated weights for policy 1, policy_version 72501 (0.0008) +[2023-10-13 03:16:56,313][46663] Updated weights for policy 1, policy_version 72511 (0.0008) +[2023-10-13 03:16:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148570112. Throughput: 0: 1673.1, 1: 1660.3. Samples: 37148116. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:16:58,607][45375] Avg episode reward: [(0, '56.910'), (1, '55.990')] +[2023-10-13 03:16:59,351][46662] Updated weights for policy 0, policy_version 72580 (0.0008) +[2023-10-13 03:16:59,722][46662] Updated weights for policy 0, policy_version 72590 (0.0008) +[2023-10-13 03:17:00,096][46662] Updated weights for policy 0, policy_version 72600 (0.0009) +[2023-10-13 03:17:00,319][46663] Updated weights for policy 1, policy_version 72521 (0.0007) +[2023-10-13 03:17:00,706][46663] Updated weights for policy 1, policy_version 72531 (0.0008) +[2023-10-13 03:17:01,069][46663] Updated weights for policy 1, policy_version 72541 (0.0010) +[2023-10-13 03:17:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148635648. Throughput: 0: 1680.7, 1: 1671.4. Samples: 37168472. Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:03,607][45375] Avg episode reward: [(0, '56.290'), (1, '57.050')] +[2023-10-13 03:17:04,147][46662] Updated weights for policy 0, policy_version 72610 (0.0009) +[2023-10-13 03:17:04,538][46662] Updated weights for policy 0, policy_version 72620 (0.0009) +[2023-10-13 03:17:04,910][46662] Updated weights for policy 0, policy_version 72630 (0.0007) +[2023-10-13 03:17:05,114][46663] Updated weights for policy 1, policy_version 72551 (0.0008) +[2023-10-13 03:17:05,274][46662] Updated weights for policy 0, policy_version 72640 (0.0008) +[2023-10-13 03:17:05,475][46663] Updated weights for policy 1, policy_version 72561 (0.0007) +[2023-10-13 03:17:05,839][46663] Updated weights for policy 1, policy_version 72571 (0.0008) +[2023-10-13 03:17:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148701184. Throughput: 0: 1680.0, 1: 1673.4. Samples: 37189038. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:08,607][45375] Avg episode reward: [(0, '56.790'), (1, '56.390')] +[2023-10-13 03:17:09,368][46662] Updated weights for policy 0, policy_version 72650 (0.0008) +[2023-10-13 03:17:09,739][46662] Updated weights for policy 0, policy_version 72660 (0.0008) +[2023-10-13 03:17:09,976][46663] Updated weights for policy 1, policy_version 72581 (0.0010) +[2023-10-13 03:17:10,109][46662] Updated weights for policy 0, policy_version 72670 (0.0007) +[2023-10-13 03:17:10,364][46663] Updated weights for policy 1, policy_version 72591 (0.0007) +[2023-10-13 03:17:10,725][46663] Updated weights for policy 1, policy_version 72601 (0.0009) +[2023-10-13 03:17:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148766720. Throughput: 0: 1674.4, 1: 1662.2. Samples: 37198010. Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:13,607][45375] Avg episode reward: [(0, '56.780'), (1, '57.370')] +[2023-10-13 03:17:13,962][46662] Updated weights for policy 0, policy_version 72680 (0.0009) +[2023-10-13 03:17:14,332][46662] Updated weights for policy 0, policy_version 72690 (0.0008) +[2023-10-13 03:17:14,694][46662] Updated weights for policy 0, policy_version 72700 (0.0008) +[2023-10-13 03:17:14,757][46663] Updated weights for policy 1, policy_version 72611 (0.0007) +[2023-10-13 03:17:15,121][46663] Updated weights for policy 1, policy_version 72621 (0.0008) +[2023-10-13 03:17:15,490][46663] Updated weights for policy 1, policy_version 72631 (0.0010) +[2023-10-13 03:17:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148832256. Throughput: 0: 1677.6, 1: 1681.0. Samples: 37219038. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:18,607][45375] Avg episode reward: [(0, '55.200'), (1, '56.210')] +[2023-10-13 03:17:18,774][46662] Updated weights for policy 0, policy_version 72710 (0.0009) +[2023-10-13 03:17:19,148][46662] Updated weights for policy 0, policy_version 72720 (0.0011) +[2023-10-13 03:17:19,506][46663] Updated weights for policy 1, policy_version 72641 (0.0011) +[2023-10-13 03:17:19,524][46662] Updated weights for policy 0, policy_version 72730 (0.0009) +[2023-10-13 03:17:19,879][46663] Updated weights for policy 1, policy_version 72651 (0.0009) +[2023-10-13 03:17:20,242][46663] Updated weights for policy 1, policy_version 72661 (0.0007) +[2023-10-13 03:17:20,612][46663] Updated weights for policy 1, policy_version 72671 (0.0009) +[2023-10-13 03:17:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148897792. Throughput: 0: 1682.8, 1: 1680.6. Samples: 37239852. Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:23,608][45375] Avg episode reward: [(0, '54.480'), (1, '55.550')] +[2023-10-13 03:17:23,612][46662] Updated weights for policy 0, policy_version 72740 (0.0008) +[2023-10-13 03:17:23,981][46662] Updated weights for policy 0, policy_version 72750 (0.0008) +[2023-10-13 03:17:24,355][46662] Updated weights for policy 0, policy_version 72760 (0.0009) +[2023-10-13 03:17:24,733][46663] Updated weights for policy 1, policy_version 72681 (0.0009) +[2023-10-13 03:17:25,099][46663] Updated weights for policy 1, policy_version 72691 (0.0008) +[2023-10-13 03:17:25,476][46663] Updated weights for policy 1, policy_version 72701 (0.0008) +[2023-10-13 03:17:28,338][46662] Updated weights for policy 0, policy_version 72770 (0.0008) +[2023-10-13 03:17:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148963328. Throughput: 0: 1683.7, 1: 1672.3. Samples: 37249026. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:28,607][45375] Avg episode reward: [(0, '53.640'), (1, '55.870')] +[2023-10-13 03:17:28,700][46662] Updated weights for policy 0, policy_version 72780 (0.0010) +[2023-10-13 03:17:29,066][46662] Updated weights for policy 0, policy_version 72790 (0.0009) +[2023-10-13 03:17:29,444][46662] Updated weights for policy 0, policy_version 72800 (0.0007) +[2023-10-13 03:17:29,558][46663] Updated weights for policy 1, policy_version 72711 (0.0008) +[2023-10-13 03:17:29,914][46663] Updated weights for policy 1, policy_version 72721 (0.0008) +[2023-10-13 03:17:30,283][46663] Updated weights for policy 1, policy_version 72731 (0.0012) +[2023-10-13 03:17:33,472][46662] Updated weights for policy 0, policy_version 72810 (0.0008) +[2023-10-13 03:17:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 149028864. Throughput: 0: 1684.3, 1: 1671.7. Samples: 37269638. Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:33,608][45375] Avg episode reward: [(0, '55.140'), (1, '55.750')] +[2023-10-13 03:17:33,841][46662] Updated weights for policy 0, policy_version 72820 (0.0007) +[2023-10-13 03:17:34,216][46662] Updated weights for policy 0, policy_version 72830 (0.0007) +[2023-10-13 03:17:34,306][46663] Updated weights for policy 1, policy_version 72741 (0.0010) +[2023-10-13 03:17:34,669][46663] Updated weights for policy 1, policy_version 72751 (0.0008) +[2023-10-13 03:17:35,034][46663] Updated weights for policy 1, policy_version 72761 (0.0009) +[2023-10-13 03:17:38,163][46662] Updated weights for policy 0, policy_version 72840 (0.0008) +[2023-10-13 03:17:38,537][46662] Updated weights for policy 0, policy_version 72850 (0.0008) +[2023-10-13 03:17:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149094400. Throughput: 0: 1689.8, 1: 1680.6. Samples: 37290534. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:38,608][45375] Avg episode reward: [(0, '55.450'), (1, '55.760')] +[2023-10-13 03:17:38,916][46662] Updated weights for policy 0, policy_version 72860 (0.0007) +[2023-10-13 03:17:39,156][46663] Updated weights for policy 1, policy_version 72771 (0.0009) +[2023-10-13 03:17:39,516][46663] Updated weights for policy 1, policy_version 72781 (0.0008) +[2023-10-13 03:17:39,887][46663] Updated weights for policy 1, policy_version 72791 (0.0007) +[2023-10-13 03:17:42,947][46662] Updated weights for policy 0, policy_version 72870 (0.0008) +[2023-10-13 03:17:43,308][46662] Updated weights for policy 0, policy_version 72880 (0.0008) +[2023-10-13 03:17:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149159936. Throughput: 0: 1690.1, 1: 1672.3. Samples: 37299428. Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:43,608][45375] Avg episode reward: [(0, '55.110'), (1, '55.950')] +[2023-10-13 03:17:43,682][46662] Updated weights for policy 0, policy_version 72890 (0.0009) +[2023-10-13 03:17:44,025][46663] Updated weights for policy 1, policy_version 72801 (0.0010) +[2023-10-13 03:17:44,383][46663] Updated weights for policy 1, policy_version 72811 (0.0008) +[2023-10-13 03:17:44,752][46663] Updated weights for policy 1, policy_version 72821 (0.0007) +[2023-10-13 03:17:45,117][46663] Updated weights for policy 1, policy_version 72831 (0.0008) +[2023-10-13 03:17:47,742][46662] Updated weights for policy 0, policy_version 72900 (0.0008) +[2023-10-13 03:17:48,109][46662] Updated weights for policy 0, policy_version 72910 (0.0009) +[2023-10-13 03:17:48,471][46662] Updated weights for policy 0, policy_version 72920 (0.0011) +[2023-10-13 03:17:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149225472. Throughput: 0: 1696.2, 1: 1679.1. Samples: 37320360. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 32.0) +[2023-10-13 03:17:48,607][45375] Avg episode reward: [(0, '54.290'), (1, '55.240')] +[2023-10-13 03:17:49,241][46663] Updated weights for policy 1, policy_version 72841 (0.0009) +[2023-10-13 03:17:49,609][46663] Updated weights for policy 1, policy_version 72851 (0.0010) +[2023-10-13 03:17:49,981][46663] Updated weights for policy 1, policy_version 72861 (0.0010) +[2023-10-13 03:17:52,544][46662] Updated weights for policy 0, policy_version 72930 (0.0011) +[2023-10-13 03:17:52,939][46662] Updated weights for policy 0, policy_version 72940 (0.0010) +[2023-10-13 03:17:53,302][46662] Updated weights for policy 0, policy_version 72950 (0.0010) +[2023-10-13 03:17:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 149291008. Throughput: 0: 1687.9, 1: 1680.4. Samples: 37340610. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:17:53,607][45375] Avg episode reward: [(0, '54.360'), (1, '54.400')] +[2023-10-13 03:17:53,668][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000072960_74711040.pth... +[2023-10-13 03:17:53,671][46662] Updated weights for policy 0, policy_version 72960 (0.0009) +[2023-10-13 03:17:53,703][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000071392_73105408.pth +[2023-10-13 03:17:53,861][46663] Updated weights for policy 1, policy_version 72871 (0.0008) +[2023-10-13 03:17:54,221][46663] Updated weights for policy 1, policy_version 72881 (0.0008) +[2023-10-13 03:17:54,583][46663] Updated weights for policy 1, policy_version 72891 (0.0009) +[2023-10-13 03:17:54,763][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000072896_74645504.pth... 
+[2023-10-13 03:17:54,796][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000071296_73007104.pth +[2023-10-13 03:17:57,676][46662] Updated weights for policy 0, policy_version 72970 (0.0011) +[2023-10-13 03:17:58,049][46662] Updated weights for policy 0, policy_version 72980 (0.0008) +[2023-10-13 03:17:58,425][46662] Updated weights for policy 0, policy_version 72990 (0.0010) +[2023-10-13 03:17:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149389312. Throughput: 0: 1697.5, 1: 1682.7. Samples: 37350118. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:17:58,607][45375] Avg episode reward: [(0, '55.520'), (1, '53.750')] +[2023-10-13 03:17:58,837][46663] Updated weights for policy 1, policy_version 72901 (0.0009) +[2023-10-13 03:17:59,212][46663] Updated weights for policy 1, policy_version 72911 (0.0011) +[2023-10-13 03:17:59,583][46663] Updated weights for policy 1, policy_version 72921 (0.0010) +[2023-10-13 03:18:02,558][46662] Updated weights for policy 0, policy_version 73000 (0.0009) +[2023-10-13 03:18:02,936][46662] Updated weights for policy 0, policy_version 73010 (0.0011) +[2023-10-13 03:18:03,311][46662] Updated weights for policy 0, policy_version 73020 (0.0010) +[2023-10-13 03:18:03,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149454848. Throughput: 0: 1696.4, 1: 1670.7. Samples: 37370558. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:03,607][45375] Avg episode reward: [(0, '56.490'), (1, '52.510')] +[2023-10-13 03:18:03,707][46663] Updated weights for policy 1, policy_version 72931 (0.0011) +[2023-10-13 03:18:04,085][46663] Updated weights for policy 1, policy_version 72941 (0.0008) +[2023-10-13 03:18:04,459][46663] Updated weights for policy 1, policy_version 72951 (0.0007) +[2023-10-13 03:18:07,230][46662] Updated weights for policy 0, policy_version 73030 (0.0010) +[2023-10-13 03:18:07,594][46662] Updated weights for policy 0, policy_version 73040 (0.0010) +[2023-10-13 03:18:07,966][46662] Updated weights for policy 0, policy_version 73050 (0.0008) +[2023-10-13 03:18:08,528][46663] Updated weights for policy 1, policy_version 72961 (0.0008) +[2023-10-13 03:18:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149520384. Throughput: 0: 1672.9, 1: 1673.8. Samples: 37390454. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:08,607][45375] Avg episode reward: [(0, '55.490'), (1, '52.370')] +[2023-10-13 03:18:08,882][46663] Updated weights for policy 1, policy_version 72971 (0.0011) +[2023-10-13 03:18:09,250][46663] Updated weights for policy 1, policy_version 72981 (0.0010) +[2023-10-13 03:18:09,615][46663] Updated weights for policy 1, policy_version 72991 (0.0012) +[2023-10-13 03:18:12,016][46662] Updated weights for policy 0, policy_version 73060 (0.0007) +[2023-10-13 03:18:12,391][46662] Updated weights for policy 0, policy_version 73070 (0.0007) +[2023-10-13 03:18:12,757][46662] Updated weights for policy 0, policy_version 73080 (0.0007) +[2023-10-13 03:18:13,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149585920. Throughput: 0: 1694.4, 1: 1671.8. Samples: 37400508. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:13,607][45375] Avg episode reward: [(0, '56.010'), (1, '52.000')] +[2023-10-13 03:18:13,620][46663] Updated weights for policy 1, policy_version 73001 (0.0008) +[2023-10-13 03:18:13,981][46663] Updated weights for policy 1, policy_version 73011 (0.0008) +[2023-10-13 03:18:14,343][46663] Updated weights for policy 1, policy_version 73021 (0.0009) +[2023-10-13 03:18:16,863][46662] Updated weights for policy 0, policy_version 73090 (0.0008) +[2023-10-13 03:18:17,235][46662] Updated weights for policy 0, policy_version 73100 (0.0008) +[2023-10-13 03:18:17,600][46662] Updated weights for policy 0, policy_version 73110 (0.0007) +[2023-10-13 03:18:17,971][46662] Updated weights for policy 0, policy_version 73120 (0.0010) +[2023-10-13 03:18:18,448][46663] Updated weights for policy 1, policy_version 73031 (0.0010) +[2023-10-13 03:18:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149651456. Throughput: 0: 1692.6, 1: 1678.8. Samples: 37421348. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:18,607][45375] Avg episode reward: [(0, '55.440'), (1, '52.160')] +[2023-10-13 03:18:18,817][46663] Updated weights for policy 1, policy_version 73041 (0.0007) +[2023-10-13 03:18:19,175][46663] Updated weights for policy 1, policy_version 73051 (0.0007) +[2023-10-13 03:18:22,109][46662] Updated weights for policy 0, policy_version 73130 (0.0007) +[2023-10-13 03:18:22,481][46662] Updated weights for policy 0, policy_version 73140 (0.0008) +[2023-10-13 03:18:22,858][46662] Updated weights for policy 0, policy_version 73150 (0.0008) +[2023-10-13 03:18:23,257][46663] Updated weights for policy 1, policy_version 73061 (0.0008) +[2023-10-13 03:18:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149716992. Throughput: 0: 1667.0, 1: 1672.8. Samples: 37440826. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:23,607][45375] Avg episode reward: [(0, '55.950'), (1, '53.770')] +[2023-10-13 03:18:23,624][46663] Updated weights for policy 1, policy_version 73071 (0.0008) +[2023-10-13 03:18:23,992][46663] Updated weights for policy 1, policy_version 73081 (0.0008) +[2023-10-13 03:18:26,865][46662] Updated weights for policy 0, policy_version 73160 (0.0010) +[2023-10-13 03:18:27,232][46662] Updated weights for policy 0, policy_version 73170 (0.0009) +[2023-10-13 03:18:27,596][46662] Updated weights for policy 0, policy_version 73180 (0.0009) +[2023-10-13 03:18:28,053][46663] Updated weights for policy 1, policy_version 73091 (0.0010) +[2023-10-13 03:18:28,420][46663] Updated weights for policy 1, policy_version 73101 (0.0010) +[2023-10-13 03:18:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149782528. Throughput: 0: 1694.5, 1: 1683.5. Samples: 37451438. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:28,607][45375] Avg episode reward: [(0, '55.510'), (1, '53.480')] +[2023-10-13 03:18:28,799][46663] Updated weights for policy 1, policy_version 73111 (0.0009) +[2023-10-13 03:18:31,610][46662] Updated weights for policy 0, policy_version 73190 (0.0008) +[2023-10-13 03:18:31,975][46662] Updated weights for policy 0, policy_version 73200 (0.0010) +[2023-10-13 03:18:32,340][46662] Updated weights for policy 0, policy_version 73210 (0.0010) +[2023-10-13 03:18:32,682][46663] Updated weights for policy 1, policy_version 73121 (0.0009) +[2023-10-13 03:18:33,046][46663] Updated weights for policy 1, policy_version 73131 (0.0007) +[2023-10-13 03:18:33,422][46663] Updated weights for policy 1, policy_version 73141 (0.0008) +[2023-10-13 03:18:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149848064. Throughput: 0: 1680.3, 1: 1681.2. Samples: 37471630. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-13 03:18:33,607][45375] Avg episode reward: [(0, '56.790'), (1, '51.250')] +[2023-10-13 03:18:33,791][46663] Updated weights for policy 1, policy_version 73151 (0.0009) +[2023-10-13 03:18:36,464][46662] Updated weights for policy 0, policy_version 73220 (0.0010) +[2023-10-13 03:18:36,830][46662] Updated weights for policy 0, policy_version 73230 (0.0007) +[2023-10-13 03:18:37,197][46662] Updated weights for policy 0, policy_version 73240 (0.0007) +[2023-10-13 03:18:37,914][46663] Updated weights for policy 1, policy_version 73161 (0.0008) +[2023-10-13 03:18:38,277][46663] Updated weights for policy 1, policy_version 73171 (0.0007) +[2023-10-13 03:18:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149913600. Throughput: 0: 1668.2, 1: 1667.6. Samples: 37490720. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:18:38,607][45375] Avg episode reward: [(0, '56.770'), (1, '50.760')] +[2023-10-13 03:18:38,645][46663] Updated weights for policy 1, policy_version 73181 (0.0007) +[2023-10-13 03:18:41,199][46662] Updated weights for policy 0, policy_version 73250 (0.0008) +[2023-10-13 03:18:41,611][46662] Updated weights for policy 0, policy_version 73260 (0.0008) +[2023-10-13 03:18:41,981][46662] Updated weights for policy 0, policy_version 73270 (0.0007) +[2023-10-13 03:18:42,343][46662] Updated weights for policy 0, policy_version 73280 (0.0008) +[2023-10-13 03:18:42,894][46663] Updated weights for policy 1, policy_version 73191 (0.0008) +[2023-10-13 03:18:43,248][46663] Updated weights for policy 1, policy_version 73201 (0.0009) +[2023-10-13 03:18:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149979136. Throughput: 0: 1692.0, 1: 1683.6. Samples: 37502018. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:18:43,607][45375] Avg episode reward: [(0, '56.010'), (1, '50.930')] +[2023-10-13 03:18:43,621][46663] Updated weights for policy 1, policy_version 73211 (0.0007) +[2023-10-13 03:18:46,354][46662] Updated weights for policy 0, policy_version 73290 (0.0007) +[2023-10-13 03:18:46,724][46662] Updated weights for policy 0, policy_version 73300 (0.0008) +[2023-10-13 03:18:47,096][46662] Updated weights for policy 0, policy_version 73310 (0.0010) +[2023-10-13 03:18:47,740][46663] Updated weights for policy 1, policy_version 73221 (0.0009) +[2023-10-13 03:18:48,137][46663] Updated weights for policy 1, policy_version 73231 (0.0010) +[2023-10-13 03:18:48,498][46663] Updated weights for policy 1, policy_version 73241 (0.0009) +[2023-10-13 03:18:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150044672. Throughput: 0: 1672.6, 1: 1691.2. Samples: 37521932. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:18:48,607][45375] Avg episode reward: [(0, '55.640'), (1, '49.380')] +[2023-10-13 03:18:51,063][46662] Updated weights for policy 0, policy_version 73320 (0.0010) +[2023-10-13 03:18:51,430][46662] Updated weights for policy 0, policy_version 73330 (0.0007) +[2023-10-13 03:18:51,796][46662] Updated weights for policy 0, policy_version 73340 (0.0007) +[2023-10-13 03:18:52,655][46663] Updated weights for policy 1, policy_version 73251 (0.0011) +[2023-10-13 03:18:53,021][46663] Updated weights for policy 1, policy_version 73261 (0.0007) +[2023-10-13 03:18:53,389][46663] Updated weights for policy 1, policy_version 73271 (0.0007) +[2023-10-13 03:18:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150110208. Throughput: 0: 1688.4, 1: 1661.1. Samples: 37541184. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:18:53,608][45375] Avg episode reward: [(0, '55.460'), (1, '51.400')] +[2023-10-13 03:18:55,806][46662] Updated weights for policy 0, policy_version 73350 (0.0008) +[2023-10-13 03:18:56,171][46662] Updated weights for policy 0, policy_version 73360 (0.0011) +[2023-10-13 03:18:56,555][46662] Updated weights for policy 0, policy_version 73370 (0.0010) +[2023-10-13 03:18:57,308][46663] Updated weights for policy 1, policy_version 73281 (0.0008) +[2023-10-13 03:18:57,679][46663] Updated weights for policy 1, policy_version 73291 (0.0011) +[2023-10-13 03:18:58,041][46663] Updated weights for policy 1, policy_version 73301 (0.0010) +[2023-10-13 03:18:58,413][46663] Updated weights for policy 1, policy_version 73311 (0.0007) +[2023-10-13 03:18:58,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150208512. Throughput: 0: 1692.4, 1: 1682.2. Samples: 37552364. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:18:58,607][45375] Avg episode reward: [(0, '54.810'), (1, '51.640')] +[2023-10-13 03:19:00,634][46662] Updated weights for policy 0, policy_version 73380 (0.0008) +[2023-10-13 03:19:00,998][46662] Updated weights for policy 0, policy_version 73390 (0.0008) +[2023-10-13 03:19:01,367][46662] Updated weights for policy 0, policy_version 73400 (0.0011) +[2023-10-13 03:19:02,551][46663] Updated weights for policy 1, policy_version 73321 (0.0009) +[2023-10-13 03:19:02,916][46663] Updated weights for policy 1, policy_version 73331 (0.0008) +[2023-10-13 03:19:03,280][46663] Updated weights for policy 1, policy_version 73341 (0.0008) +[2023-10-13 03:19:03,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150274048. Throughput: 0: 1665.5, 1: 1680.6. Samples: 37571920. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:19:03,608][45375] Avg episode reward: [(0, '53.850'), (1, '51.360')] +[2023-10-13 03:19:05,418][46662] Updated weights for policy 0, policy_version 73410 (0.0009) +[2023-10-13 03:19:05,787][46662] Updated weights for policy 0, policy_version 73420 (0.0009) +[2023-10-13 03:19:06,166][46662] Updated weights for policy 0, policy_version 73430 (0.0010) +[2023-10-13 03:19:06,537][46662] Updated weights for policy 0, policy_version 73440 (0.0010) +[2023-10-13 03:19:07,287][46663] Updated weights for policy 1, policy_version 73351 (0.0008) +[2023-10-13 03:19:07,654][46663] Updated weights for policy 1, policy_version 73361 (0.0009) +[2023-10-13 03:19:08,024][46663] Updated weights for policy 1, policy_version 73371 (0.0010) +[2023-10-13 03:19:08,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150339584. Throughput: 0: 1690.9, 1: 1660.6. Samples: 37591646. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:19:08,608][45375] Avg episode reward: [(0, '52.890'), (1, '51.290')] +[2023-10-13 03:19:10,594][46662] Updated weights for policy 0, policy_version 73450 (0.0008) +[2023-10-13 03:19:10,971][46662] Updated weights for policy 0, policy_version 73460 (0.0010) +[2023-10-13 03:19:11,336][46662] Updated weights for policy 0, policy_version 73470 (0.0009) +[2023-10-13 03:19:12,193][46663] Updated weights for policy 1, policy_version 73381 (0.0011) +[2023-10-13 03:19:12,555][46663] Updated weights for policy 1, policy_version 73391 (0.0010) +[2023-10-13 03:19:12,930][46663] Updated weights for policy 1, policy_version 73401 (0.0008) +[2023-10-13 03:19:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150405120. Throughput: 0: 1678.0, 1: 1682.4. Samples: 37602656. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:19:13,608][45375] Avg episode reward: [(0, '53.490'), (1, '51.490')] +[2023-10-13 03:19:15,428][46662] Updated weights for policy 0, policy_version 73480 (0.0008) +[2023-10-13 03:19:15,795][46662] Updated weights for policy 0, policy_version 73490 (0.0008) +[2023-10-13 03:19:16,165][46662] Updated weights for policy 0, policy_version 73500 (0.0008) +[2023-10-13 03:19:16,920][46663] Updated weights for policy 1, policy_version 73411 (0.0008) +[2023-10-13 03:19:17,281][46663] Updated weights for policy 1, policy_version 73421 (0.0009) +[2023-10-13 03:19:17,656][46663] Updated weights for policy 1, policy_version 73431 (0.0008) +[2023-10-13 03:19:18,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150470656. Throughput: 0: 1676.0, 1: 1672.8. Samples: 37622322. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:19:18,607][45375] Avg episode reward: [(0, '53.220'), (1, '51.390')] +[2023-10-13 03:19:20,148][46662] Updated weights for policy 0, policy_version 73510 (0.0010) +[2023-10-13 03:19:20,511][46662] Updated weights for policy 0, policy_version 73520 (0.0010) +[2023-10-13 03:19:20,893][46662] Updated weights for policy 0, policy_version 73530 (0.0011) +[2023-10-13 03:19:21,579][46663] Updated weights for policy 1, policy_version 73441 (0.0008) +[2023-10-13 03:19:21,944][46663] Updated weights for policy 1, policy_version 73451 (0.0008) +[2023-10-13 03:19:22,314][46663] Updated weights for policy 1, policy_version 73461 (0.0009) +[2023-10-13 03:19:22,677][46663] Updated weights for policy 1, policy_version 73471 (0.0009) +[2023-10-13 03:19:23,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150536192. Throughput: 0: 1700.0, 1: 1672.3. Samples: 37642472. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-13 03:19:23,607][45375] Avg episode reward: [(0, '53.650'), (1, '51.130')] +[2023-10-13 03:19:24,847][46662] Updated weights for policy 0, policy_version 73540 (0.0009) +[2023-10-13 03:19:25,228][46662] Updated weights for policy 0, policy_version 73550 (0.0009) +[2023-10-13 03:19:25,587][46662] Updated weights for policy 0, policy_version 73560 (0.0011) +[2023-10-13 03:19:26,787][46663] Updated weights for policy 1, policy_version 73481 (0.0007) +[2023-10-13 03:19:27,158][46663] Updated weights for policy 1, policy_version 73491 (0.0007) +[2023-10-13 03:19:27,518][46663] Updated weights for policy 1, policy_version 73501 (0.0009) +[2023-10-13 03:19:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150601728. Throughput: 0: 1674.8, 1: 1681.9. Samples: 37653072. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:28,607][45375] Avg episode reward: [(0, '53.670'), (1, '50.690')] +[2023-10-13 03:19:29,697][46662] Updated weights for policy 0, policy_version 73570 (0.0010) +[2023-10-13 03:19:30,060][46662] Updated weights for policy 0, policy_version 73580 (0.0010) +[2023-10-13 03:19:30,447][46662] Updated weights for policy 0, policy_version 73590 (0.0011) +[2023-10-13 03:19:30,804][46662] Updated weights for policy 0, policy_version 73600 (0.0011) +[2023-10-13 03:19:31,489][46663] Updated weights for policy 1, policy_version 73511 (0.0008) +[2023-10-13 03:19:31,845][46663] Updated weights for policy 1, policy_version 73521 (0.0007) +[2023-10-13 03:19:32,216][46663] Updated weights for policy 1, policy_version 73531 (0.0008) +[2023-10-13 03:19:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150667264. Throughput: 0: 1687.8, 1: 1657.5. Samples: 37672470. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:33,607][45375] Avg episode reward: [(0, '54.770'), (1, '50.970')] +[2023-10-13 03:19:34,828][46662] Updated weights for policy 0, policy_version 73610 (0.0007) +[2023-10-13 03:19:35,206][46662] Updated weights for policy 0, policy_version 73620 (0.0010) +[2023-10-13 03:19:35,570][46662] Updated weights for policy 0, policy_version 73630 (0.0011) +[2023-10-13 03:19:36,465][46663] Updated weights for policy 1, policy_version 73541 (0.0010) +[2023-10-13 03:19:36,856][46663] Updated weights for policy 1, policy_version 73551 (0.0010) +[2023-10-13 03:19:37,235][46663] Updated weights for policy 1, policy_version 73561 (0.0008) +[2023-10-13 03:19:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 150732800. Throughput: 0: 1695.5, 1: 1679.7. Samples: 37693068. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:38,607][45375] Avg episode reward: [(0, '56.280'), (1, '52.480')] +[2023-10-13 03:19:39,711][46662] Updated weights for policy 0, policy_version 73640 (0.0008) +[2023-10-13 03:19:40,079][46662] Updated weights for policy 0, policy_version 73650 (0.0007) +[2023-10-13 03:19:40,451][46662] Updated weights for policy 0, policy_version 73660 (0.0009) +[2023-10-13 03:19:41,277][46663] Updated weights for policy 1, policy_version 73571 (0.0008) +[2023-10-13 03:19:41,648][46663] Updated weights for policy 1, policy_version 73581 (0.0008) +[2023-10-13 03:19:42,025][46663] Updated weights for policy 1, policy_version 73591 (0.0007) +[2023-10-13 03:19:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150798336. Throughput: 0: 1666.6, 1: 1682.8. Samples: 37703088. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:43,608][45375] Avg episode reward: [(0, '55.040'), (1, '53.340')] +[2023-10-13 03:19:44,570][46662] Updated weights for policy 0, policy_version 73670 (0.0009) +[2023-10-13 03:19:44,948][46662] Updated weights for policy 0, policy_version 73680 (0.0008) +[2023-10-13 03:19:45,311][46662] Updated weights for policy 0, policy_version 73690 (0.0008) +[2023-10-13 03:19:46,002][46663] Updated weights for policy 1, policy_version 73601 (0.0009) +[2023-10-13 03:19:46,365][46663] Updated weights for policy 1, policy_version 73611 (0.0007) +[2023-10-13 03:19:46,732][46663] Updated weights for policy 1, policy_version 73621 (0.0007) +[2023-10-13 03:19:47,100][46663] Updated weights for policy 1, policy_version 73631 (0.0009) +[2023-10-13 03:19:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150863872. Throughput: 0: 1690.5, 1: 1661.0. Samples: 37722738. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:48,607][45375] Avg episode reward: [(0, '55.920'), (1, '53.350')] +[2023-10-13 03:19:49,462][46662] Updated weights for policy 0, policy_version 73700 (0.0009) +[2023-10-13 03:19:49,830][46662] Updated weights for policy 0, policy_version 73710 (0.0008) +[2023-10-13 03:19:50,208][46662] Updated weights for policy 0, policy_version 73720 (0.0007) +[2023-10-13 03:19:51,096][46663] Updated weights for policy 1, policy_version 73641 (0.0009) +[2023-10-13 03:19:51,462][46663] Updated weights for policy 1, policy_version 73651 (0.0009) +[2023-10-13 03:19:51,829][46663] Updated weights for policy 1, policy_version 73661 (0.0007) +[2023-10-13 03:19:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 150929408. Throughput: 0: 1685.9, 1: 1686.3. Samples: 37743396. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:53,607][45375] Avg episode reward: [(0, '55.130'), (1, '55.440')] +[2023-10-13 03:19:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000073664_75431936.pth... +[2023-10-13 03:19:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000073728_75497472.pth... +[2023-10-13 03:19:53,651][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000072160_73891840.pth +[2023-10-13 03:19:53,656][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000072096_73826304.pth +[2023-10-13 03:19:54,197][46662] Updated weights for policy 0, policy_version 73730 (0.0007) +[2023-10-13 03:19:54,562][46662] Updated weights for policy 0, policy_version 73740 (0.0008) +[2023-10-13 03:19:54,924][46662] Updated weights for policy 0, policy_version 73750 (0.0008) +[2023-10-13 03:19:55,294][46662] Updated weights for policy 0, policy_version 73760 (0.0009) +[2023-10-13 03:19:55,905][46663] Updated weights for policy 1, policy_version 73671 (0.0009) +[2023-10-13 03:19:56,267][46663] Updated weights for policy 1, policy_version 73681 (0.0008) +[2023-10-13 03:19:56,635][46663] Updated weights for policy 1, policy_version 73691 (0.0008) +[2023-10-13 03:19:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150994944. Throughput: 0: 1672.1, 1: 1676.2. Samples: 37753330. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:19:58,607][45375] Avg episode reward: [(0, '55.900'), (1, '55.330')] +[2023-10-13 03:19:59,419][46662] Updated weights for policy 0, policy_version 73770 (0.0007) +[2023-10-13 03:19:59,793][46662] Updated weights for policy 0, policy_version 73780 (0.0008) +[2023-10-13 03:20:00,169][46662] Updated weights for policy 0, policy_version 73790 (0.0009) +[2023-10-13 03:20:00,611][46663] Updated weights for policy 1, policy_version 73701 (0.0008) +[2023-10-13 03:20:00,977][46663] Updated weights for policy 1, policy_version 73711 (0.0007) +[2023-10-13 03:20:01,341][46663] Updated weights for policy 1, policy_version 73721 (0.0008) +[2023-10-13 03:20:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151060480. Throughput: 0: 1682.9, 1: 1678.7. Samples: 37773592. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:20:03,608][45375] Avg episode reward: [(0, '56.770'), (1, '55.290')] +[2023-10-13 03:20:04,282][46662] Updated weights for policy 0, policy_version 73800 (0.0008) +[2023-10-13 03:20:04,647][46662] Updated weights for policy 0, policy_version 73810 (0.0009) +[2023-10-13 03:20:05,013][46662] Updated weights for policy 0, policy_version 73820 (0.0007) +[2023-10-13 03:20:05,368][46663] Updated weights for policy 1, policy_version 73731 (0.0008) +[2023-10-13 03:20:05,745][46663] Updated weights for policy 1, policy_version 73741 (0.0010) +[2023-10-13 03:20:06,111][46663] Updated weights for policy 1, policy_version 73751 (0.0007) +[2023-10-13 03:20:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 151126016. Throughput: 0: 1682.1, 1: 1691.7. Samples: 37794296. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:20:08,607][45375] Avg episode reward: [(0, '57.010'), (1, '54.390')] +[2023-10-13 03:20:09,016][46662] Updated weights for policy 0, policy_version 73830 (0.0007) +[2023-10-13 03:20:09,388][46662] Updated weights for policy 0, policy_version 73840 (0.0007) +[2023-10-13 03:20:09,746][46662] Updated weights for policy 0, policy_version 73850 (0.0009) +[2023-10-13 03:20:10,175][46663] Updated weights for policy 1, policy_version 73761 (0.0009) +[2023-10-13 03:20:10,538][46663] Updated weights for policy 1, policy_version 73771 (0.0009) +[2023-10-13 03:20:10,908][46663] Updated weights for policy 1, policy_version 73781 (0.0010) +[2023-10-13 03:20:11,268][46663] Updated weights for policy 1, policy_version 73791 (0.0007) +[2023-10-13 03:20:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151191552. Throughput: 0: 1675.5, 1: 1664.8. Samples: 37803384. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-13 03:20:13,607][45375] Avg episode reward: [(0, '56.790'), (1, '53.590')] +[2023-10-13 03:20:13,667][46662] Updated weights for policy 0, policy_version 73860 (0.0008) +[2023-10-13 03:20:14,048][46662] Updated weights for policy 0, policy_version 73870 (0.0008) +[2023-10-13 03:20:14,416][46662] Updated weights for policy 0, policy_version 73880 (0.0009) +[2023-10-13 03:20:15,216][46663] Updated weights for policy 1, policy_version 73801 (0.0008) +[2023-10-13 03:20:15,585][46663] Updated weights for policy 1, policy_version 73811 (0.0009) +[2023-10-13 03:20:15,958][46663] Updated weights for policy 1, policy_version 73821 (0.0008) +[2023-10-13 03:20:18,535][46662] Updated weights for policy 0, policy_version 73890 (0.0009) +[2023-10-13 03:20:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151257088. Throughput: 0: 1684.4, 1: 1689.3. Samples: 37824288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:18,607][45375] Avg episode reward: [(0, '56.660'), (1, '53.040')] +[2023-10-13 03:20:18,913][46662] Updated weights for policy 0, policy_version 73900 (0.0010) +[2023-10-13 03:20:19,282][46662] Updated weights for policy 0, policy_version 73910 (0.0008) +[2023-10-13 03:20:19,647][46662] Updated weights for policy 0, policy_version 73920 (0.0007) +[2023-10-13 03:20:20,110][46663] Updated weights for policy 1, policy_version 73831 (0.0008) +[2023-10-13 03:20:20,480][46663] Updated weights for policy 1, policy_version 73841 (0.0010) +[2023-10-13 03:20:20,842][46663] Updated weights for policy 1, policy_version 73851 (0.0008) +[2023-10-13 03:20:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151322624. Throughput: 0: 1682.8, 1: 1694.0. Samples: 37845024. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:23,607][45375] Avg episode reward: [(0, '57.510'), (1, '54.500')] +[2023-10-13 03:20:23,699][46662] Updated weights for policy 0, policy_version 73930 (0.0008) +[2023-10-13 03:20:24,065][46662] Updated weights for policy 0, policy_version 73940 (0.0007) +[2023-10-13 03:20:24,434][46662] Updated weights for policy 0, policy_version 73950 (0.0008) +[2023-10-13 03:20:24,917][46663] Updated weights for policy 1, policy_version 73861 (0.0009) +[2023-10-13 03:20:25,305][46663] Updated weights for policy 1, policy_version 73871 (0.0008) +[2023-10-13 03:20:25,665][46663] Updated weights for policy 1, policy_version 73881 (0.0009) +[2023-10-13 03:20:28,433][46662] Updated weights for policy 0, policy_version 73960 (0.0008) +[2023-10-13 03:20:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 151388160. Throughput: 0: 1685.4, 1: 1671.3. Samples: 37854140. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:28,608][45375] Avg episode reward: [(0, '56.980'), (1, '54.010')] +[2023-10-13 03:20:28,807][46662] Updated weights for policy 0, policy_version 73970 (0.0007) +[2023-10-13 03:20:29,181][46662] Updated weights for policy 0, policy_version 73980 (0.0008) +[2023-10-13 03:20:29,566][46663] Updated weights for policy 1, policy_version 73891 (0.0008) +[2023-10-13 03:20:29,943][46663] Updated weights for policy 1, policy_version 73901 (0.0009) +[2023-10-13 03:20:30,316][46663] Updated weights for policy 1, policy_version 73911 (0.0007) +[2023-10-13 03:20:33,357][46662] Updated weights for policy 0, policy_version 73990 (0.0008) +[2023-10-13 03:20:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151453696. Throughput: 0: 1683.2, 1: 1697.2. Samples: 37874852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:33,607][45375] Avg episode reward: [(0, '55.850'), (1, '54.030')] +[2023-10-13 03:20:33,727][46662] Updated weights for policy 0, policy_version 74000 (0.0009) +[2023-10-13 03:20:34,110][46662] Updated weights for policy 0, policy_version 74010 (0.0009) +[2023-10-13 03:20:34,519][46663] Updated weights for policy 1, policy_version 73921 (0.0008) +[2023-10-13 03:20:34,876][46663] Updated weights for policy 1, policy_version 73931 (0.0010) +[2023-10-13 03:20:35,248][46663] Updated weights for policy 1, policy_version 73941 (0.0009) +[2023-10-13 03:20:35,622][46663] Updated weights for policy 1, policy_version 73951 (0.0010) +[2023-10-13 03:20:38,297][46662] Updated weights for policy 0, policy_version 74020 (0.0008) +[2023-10-13 03:20:38,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151519232. Throughput: 0: 1685.1, 1: 1693.2. Samples: 37895418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:38,607][45375] Avg episode reward: [(0, '56.480'), (1, '53.720')] +[2023-10-13 03:20:38,659][46662] Updated weights for policy 0, policy_version 74030 (0.0007) +[2023-10-13 03:20:39,036][46662] Updated weights for policy 0, policy_version 74040 (0.0008) +[2023-10-13 03:20:39,715][46663] Updated weights for policy 1, policy_version 73961 (0.0008) +[2023-10-13 03:20:40,080][46663] Updated weights for policy 1, policy_version 73971 (0.0008) +[2023-10-13 03:20:40,450][46663] Updated weights for policy 1, policy_version 73981 (0.0008) +[2023-10-13 03:20:43,211][46662] Updated weights for policy 0, policy_version 74050 (0.0009) +[2023-10-13 03:20:43,593][46662] Updated weights for policy 0, policy_version 74060 (0.0008) +[2023-10-13 03:20:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151584768. Throughput: 0: 1680.8, 1: 1674.9. Samples: 37904336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:43,608][45375] Avg episode reward: [(0, '53.450'), (1, '54.100')] +[2023-10-13 03:20:43,968][46662] Updated weights for policy 0, policy_version 74070 (0.0009) +[2023-10-13 03:20:44,336][46662] Updated weights for policy 0, policy_version 74080 (0.0009) +[2023-10-13 03:20:44,492][46663] Updated weights for policy 1, policy_version 73991 (0.0008) +[2023-10-13 03:20:44,861][46663] Updated weights for policy 1, policy_version 74001 (0.0007) +[2023-10-13 03:20:45,226][46663] Updated weights for policy 1, policy_version 74011 (0.0007) +[2023-10-13 03:20:48,383][46662] Updated weights for policy 0, policy_version 74090 (0.0010) +[2023-10-13 03:20:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151650304. Throughput: 0: 1678.0, 1: 1688.5. Samples: 37925084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:48,607][45375] Avg episode reward: [(0, '54.110'), (1, '53.220')] +[2023-10-13 03:20:48,744][46662] Updated weights for policy 0, policy_version 74100 (0.0008) +[2023-10-13 03:20:49,117][46662] Updated weights for policy 0, policy_version 74110 (0.0010) +[2023-10-13 03:20:49,340][46663] Updated weights for policy 1, policy_version 74021 (0.0008) +[2023-10-13 03:20:49,709][46663] Updated weights for policy 1, policy_version 74031 (0.0008) +[2023-10-13 03:20:50,074][46663] Updated weights for policy 1, policy_version 74041 (0.0009) +[2023-10-13 03:20:53,230][46662] Updated weights for policy 0, policy_version 74120 (0.0010) +[2023-10-13 03:20:53,602][46662] Updated weights for policy 0, policy_version 74130 (0.0007) +[2023-10-13 03:20:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151715840. Throughput: 0: 1678.6, 1: 1686.1. Samples: 37945710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:53,607][45375] Avg episode reward: [(0, '54.040'), (1, '51.440')] +[2023-10-13 03:20:53,975][46662] Updated weights for policy 0, policy_version 74140 (0.0009) +[2023-10-13 03:20:54,170][46663] Updated weights for policy 1, policy_version 74051 (0.0009) +[2023-10-13 03:20:54,534][46663] Updated weights for policy 1, policy_version 74061 (0.0007) +[2023-10-13 03:20:54,912][46663] Updated weights for policy 1, policy_version 74071 (0.0007) +[2023-10-13 03:20:58,060][46662] Updated weights for policy 0, policy_version 74150 (0.0010) +[2023-10-13 03:20:58,427][46662] Updated weights for policy 0, policy_version 74160 (0.0010) +[2023-10-13 03:20:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151781376. Throughput: 0: 1678.9, 1: 1686.5. Samples: 37954828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:20:58,607][45375] Avg episode reward: [(0, '54.040'), (1, '49.570')] +[2023-10-13 03:20:58,791][46662] Updated weights for policy 0, policy_version 74170 (0.0009) +[2023-10-13 03:20:58,997][46663] Updated weights for policy 1, policy_version 74081 (0.0008) +[2023-10-13 03:20:59,360][46663] Updated weights for policy 1, policy_version 74091 (0.0008) +[2023-10-13 03:20:59,722][46663] Updated weights for policy 1, policy_version 74101 (0.0007) +[2023-10-13 03:21:00,092][46663] Updated weights for policy 1, policy_version 74111 (0.0009) +[2023-10-13 03:21:02,638][46662] Updated weights for policy 0, policy_version 74180 (0.0009) +[2023-10-13 03:21:03,009][46662] Updated weights for policy 0, policy_version 74190 (0.0008) +[2023-10-13 03:21:03,379][46662] Updated weights for policy 0, policy_version 74200 (0.0009) +[2023-10-13 03:21:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151846912. Throughput: 0: 1677.2, 1: 1683.7. Samples: 37975532. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:03,607][45375] Avg episode reward: [(0, '54.490'), (1, '48.510')] +[2023-10-13 03:21:04,229][46663] Updated weights for policy 1, policy_version 74121 (0.0007) +[2023-10-13 03:21:04,599][46663] Updated weights for policy 1, policy_version 74131 (0.0010) +[2023-10-13 03:21:04,964][46663] Updated weights for policy 1, policy_version 74141 (0.0008) +[2023-10-13 03:21:07,392][46662] Updated weights for policy 0, policy_version 74210 (0.0008) +[2023-10-13 03:21:07,762][46662] Updated weights for policy 0, policy_version 74220 (0.0009) +[2023-10-13 03:21:08,137][46662] Updated weights for policy 0, policy_version 74230 (0.0008) +[2023-10-13 03:21:08,518][46662] Updated weights for policy 0, policy_version 74240 (0.0009) +[2023-10-13 03:21:08,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 151945216. 
Throughput: 0: 1669.2, 1: 1685.1. Samples: 37995964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:08,607][45375] Avg episode reward: [(0, '53.690'), (1, '49.210')] +[2023-10-13 03:21:09,027][46663] Updated weights for policy 1, policy_version 74151 (0.0007) +[2023-10-13 03:21:09,400][46663] Updated weights for policy 1, policy_version 74161 (0.0007) +[2023-10-13 03:21:09,769][46663] Updated weights for policy 1, policy_version 74171 (0.0007) +[2023-10-13 03:21:12,900][46662] Updated weights for policy 0, policy_version 74250 (0.0009) +[2023-10-13 03:21:13,281][46662] Updated weights for policy 0, policy_version 74260 (0.0007) +[2023-10-13 03:21:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151977984. Throughput: 0: 1676.9, 1: 1685.9. Samples: 38005464. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:13,607][45375] Avg episode reward: [(0, '53.340'), (1, '49.440')] +[2023-10-13 03:21:13,648][46662] Updated weights for policy 0, policy_version 74270 (0.0008) +[2023-10-13 03:21:13,933][46663] Updated weights for policy 1, policy_version 74181 (0.0007) +[2023-10-13 03:21:14,333][46663] Updated weights for policy 1, policy_version 74191 (0.0009) +[2023-10-13 03:21:14,694][46663] Updated weights for policy 1, policy_version 74201 (0.0008) +[2023-10-13 03:21:17,581][46662] Updated weights for policy 0, policy_version 74280 (0.0009) +[2023-10-13 03:21:17,945][46662] Updated weights for policy 0, policy_version 74290 (0.0008) +[2023-10-13 03:21:18,318][46662] Updated weights for policy 0, policy_version 74300 (0.0008) +[2023-10-13 03:21:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152076288. Throughput: 0: 1681.1, 1: 1678.5. Samples: 38026034. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:18,608][45375] Avg episode reward: [(0, '53.510'), (1, '49.730')] +[2023-10-13 03:21:18,669][46663] Updated weights for policy 1, policy_version 74211 (0.0008) +[2023-10-13 03:21:19,032][46663] Updated weights for policy 1, policy_version 74221 (0.0008) +[2023-10-13 03:21:19,406][46663] Updated weights for policy 1, policy_version 74231 (0.0008) +[2023-10-13 03:21:22,485][46662] Updated weights for policy 0, policy_version 74310 (0.0010) +[2023-10-13 03:21:22,863][46662] Updated weights for policy 0, policy_version 74320 (0.0008) +[2023-10-13 03:21:23,224][46662] Updated weights for policy 0, policy_version 74330 (0.0011) +[2023-10-13 03:21:23,390][46663] Updated weights for policy 1, policy_version 74241 (0.0008) +[2023-10-13 03:21:23,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152141824. Throughput: 0: 1667.1, 1: 1685.9. Samples: 38046308. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:23,608][45375] Avg episode reward: [(0, '54.730'), (1, '50.370')] +[2023-10-13 03:21:23,759][46663] Updated weights for policy 1, policy_version 74251 (0.0009) +[2023-10-13 03:21:24,132][46663] Updated weights for policy 1, policy_version 74261 (0.0010) +[2023-10-13 03:21:24,507][46663] Updated weights for policy 1, policy_version 74271 (0.0008) +[2023-10-13 03:21:27,117][46662] Updated weights for policy 0, policy_version 74340 (0.0009) +[2023-10-13 03:21:27,489][46662] Updated weights for policy 0, policy_version 74350 (0.0010) +[2023-10-13 03:21:27,852][46662] Updated weights for policy 0, policy_version 74360 (0.0010) +[2023-10-13 03:21:28,573][46663] Updated weights for policy 1, policy_version 74281 (0.0008) +[2023-10-13 03:21:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152207360. Throughput: 0: 1683.1, 1: 1683.7. Samples: 38055838. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:28,607][45375] Avg episode reward: [(0, '55.080'), (1, '50.990')] +[2023-10-13 03:21:28,938][46663] Updated weights for policy 1, policy_version 74291 (0.0008) +[2023-10-13 03:21:29,304][46663] Updated weights for policy 1, policy_version 74301 (0.0008) +[2023-10-13 03:21:31,816][46662] Updated weights for policy 0, policy_version 74370 (0.0009) +[2023-10-13 03:21:32,178][46662] Updated weights for policy 0, policy_version 74380 (0.0008) +[2023-10-13 03:21:32,550][46662] Updated weights for policy 0, policy_version 74390 (0.0007) +[2023-10-13 03:21:32,921][46662] Updated weights for policy 0, policy_version 74400 (0.0008) +[2023-10-13 03:21:33,500][46663] Updated weights for policy 1, policy_version 74311 (0.0008) +[2023-10-13 03:21:33,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152272896. Throughput: 0: 1685.7, 1: 1673.2. Samples: 38076236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:33,607][45375] Avg episode reward: [(0, '54.540'), (1, '50.750')] +[2023-10-13 03:21:33,863][46663] Updated weights for policy 1, policy_version 74321 (0.0009) +[2023-10-13 03:21:34,230][46663] Updated weights for policy 1, policy_version 74331 (0.0007) +[2023-10-13 03:21:36,937][46662] Updated weights for policy 0, policy_version 74410 (0.0007) +[2023-10-13 03:21:37,311][46662] Updated weights for policy 0, policy_version 74420 (0.0008) +[2023-10-13 03:21:37,687][46662] Updated weights for policy 0, policy_version 74430 (0.0007) +[2023-10-13 03:21:38,196][46663] Updated weights for policy 1, policy_version 74341 (0.0007) +[2023-10-13 03:21:38,564][46663] Updated weights for policy 1, policy_version 74351 (0.0009) +[2023-10-13 03:21:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152338432. Throughput: 0: 1661.0, 1: 1668.0. Samples: 38095518. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:38,607][45375] Avg episode reward: [(0, '55.850'), (1, '52.190')] +[2023-10-13 03:21:38,932][46663] Updated weights for policy 1, policy_version 74361 (0.0009) +[2023-10-13 03:21:41,613][46662] Updated weights for policy 0, policy_version 74440 (0.0008) +[2023-10-13 03:21:41,990][46662] Updated weights for policy 0, policy_version 74450 (0.0008) +[2023-10-13 03:21:42,346][46662] Updated weights for policy 0, policy_version 74460 (0.0010) +[2023-10-13 03:21:43,086][46663] Updated weights for policy 1, policy_version 74371 (0.0009) +[2023-10-13 03:21:43,459][46663] Updated weights for policy 1, policy_version 74381 (0.0008) +[2023-10-13 03:21:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152403968. Throughput: 0: 1694.5, 1: 1672.6. Samples: 38106346. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:43,607][45375] Avg episode reward: [(0, '56.280'), (1, '54.050')] +[2023-10-13 03:21:43,827][46663] Updated weights for policy 1, policy_version 74391 (0.0007) +[2023-10-13 03:21:46,484][46662] Updated weights for policy 0, policy_version 74470 (0.0008) +[2023-10-13 03:21:46,861][46662] Updated weights for policy 0, policy_version 74480 (0.0008) +[2023-10-13 03:21:47,221][46662] Updated weights for policy 0, policy_version 74490 (0.0009) +[2023-10-13 03:21:47,908][46663] Updated weights for policy 1, policy_version 74401 (0.0008) +[2023-10-13 03:21:48,268][46663] Updated weights for policy 1, policy_version 74411 (0.0009) +[2023-10-13 03:21:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152469504. Throughput: 0: 1680.3, 1: 1677.7. Samples: 38126642. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:21:48,607][45375] Avg episode reward: [(0, '59.030'), (1, '51.780')] +[2023-10-13 03:21:48,636][46663] Updated weights for policy 1, policy_version 74421 (0.0008) +[2023-10-13 03:21:49,010][46663] Updated weights for policy 1, policy_version 74431 (0.0007) +[2023-10-13 03:21:51,444][46662] Updated weights for policy 0, policy_version 74500 (0.0008) +[2023-10-13 03:21:51,816][46662] Updated weights for policy 0, policy_version 74510 (0.0008) +[2023-10-13 03:21:52,184][46662] Updated weights for policy 0, policy_version 74520 (0.0008) +[2023-10-13 03:21:53,064][46663] Updated weights for policy 1, policy_version 74441 (0.0008) +[2023-10-13 03:21:53,418][46663] Updated weights for policy 1, policy_version 74451 (0.0010) +[2023-10-13 03:21:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152535040. Throughput: 0: 1670.3, 1: 1667.9. Samples: 38146182. Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:21:53,607][45375] Avg episode reward: [(0, '59.280'), (1, '53.000')] +[2023-10-13 03:21:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000074528_76316672.pth... +[2023-10-13 03:21:53,646][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000072960_74711040.pth +[2023-10-13 03:21:53,786][46663] Updated weights for policy 1, policy_version 74461 (0.0008) +[2023-10-13 03:21:53,892][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000074464_76251136.pth... 
+[2023-10-13 03:21:53,921][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000072896_74645504.pth +[2023-10-13 03:21:56,130][46662] Updated weights for policy 0, policy_version 74530 (0.0009) +[2023-10-13 03:21:56,499][46662] Updated weights for policy 0, policy_version 74540 (0.0007) +[2023-10-13 03:21:56,871][46662] Updated weights for policy 0, policy_version 74550 (0.0009) +[2023-10-13 03:21:57,247][46662] Updated weights for policy 0, policy_version 74560 (0.0009) +[2023-10-13 03:21:57,973][46663] Updated weights for policy 1, policy_version 74471 (0.0008) +[2023-10-13 03:21:58,348][46663] Updated weights for policy 1, policy_version 74481 (0.0007) +[2023-10-13 03:21:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152600576. Throughput: 0: 1691.6, 1: 1681.0. Samples: 38157230. Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:21:58,607][45375] Avg episode reward: [(0, '59.910'), (1, '53.590')] +[2023-10-13 03:21:58,723][46663] Updated weights for policy 1, policy_version 74491 (0.0010) +[2023-10-13 03:22:01,258][46662] Updated weights for policy 0, policy_version 74570 (0.0009) +[2023-10-13 03:22:01,634][46662] Updated weights for policy 0, policy_version 74580 (0.0009) +[2023-10-13 03:22:02,018][46662] Updated weights for policy 0, policy_version 74590 (0.0008) +[2023-10-13 03:22:02,835][46663] Updated weights for policy 1, policy_version 74501 (0.0008) +[2023-10-13 03:22:03,231][46663] Updated weights for policy 1, policy_version 74511 (0.0009) +[2023-10-13 03:22:03,603][46663] Updated weights for policy 1, policy_version 74521 (0.0009) +[2023-10-13 03:22:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152666112. Throughput: 0: 1670.2, 1: 1684.2. Samples: 38176982. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:03,607][45375] Avg episode reward: [(0, '58.850'), (1, '55.330')] +[2023-10-13 03:22:06,240][46662] Updated weights for policy 0, policy_version 74600 (0.0008) +[2023-10-13 03:22:06,614][46662] Updated weights for policy 0, policy_version 74610 (0.0007) +[2023-10-13 03:22:06,991][46662] Updated weights for policy 0, policy_version 74620 (0.0009) +[2023-10-13 03:22:07,617][46663] Updated weights for policy 1, policy_version 74531 (0.0007) +[2023-10-13 03:22:07,983][46663] Updated weights for policy 1, policy_version 74541 (0.0007) +[2023-10-13 03:22:08,340][46663] Updated weights for policy 1, policy_version 74551 (0.0009) +[2023-10-13 03:22:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 152731648. Throughput: 0: 1674.6, 1: 1660.8. Samples: 38196402. Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:08,607][45375] Avg episode reward: [(0, '58.440'), (1, '54.610')] +[2023-10-13 03:22:11,140][46662] Updated weights for policy 0, policy_version 74630 (0.0010) +[2023-10-13 03:22:11,514][46662] Updated weights for policy 0, policy_version 74640 (0.0007) +[2023-10-13 03:22:11,881][46662] Updated weights for policy 0, policy_version 74650 (0.0007) +[2023-10-13 03:22:12,350][46663] Updated weights for policy 1, policy_version 74561 (0.0011) +[2023-10-13 03:22:12,717][46663] Updated weights for policy 1, policy_version 74571 (0.0009) +[2023-10-13 03:22:13,075][46663] Updated weights for policy 1, policy_version 74581 (0.0009) +[2023-10-13 03:22:13,448][46663] Updated weights for policy 1, policy_version 74591 (0.0008) +[2023-10-13 03:22:13,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 152829952. Throughput: 0: 1689.2, 1: 1685.3. Samples: 38207690. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:13,607][45375] Avg episode reward: [(0, '58.730'), (1, '55.340')] +[2023-10-13 03:22:15,979][46662] Updated weights for policy 0, policy_version 74660 (0.0010) +[2023-10-13 03:22:16,363][46662] Updated weights for policy 0, policy_version 74670 (0.0011) +[2023-10-13 03:22:16,733][46662] Updated weights for policy 0, policy_version 74680 (0.0010) +[2023-10-13 03:22:17,609][46663] Updated weights for policy 1, policy_version 74601 (0.0011) +[2023-10-13 03:22:17,976][46663] Updated weights for policy 1, policy_version 74611 (0.0011) +[2023-10-13 03:22:18,346][46663] Updated weights for policy 1, policy_version 74621 (0.0007) +[2023-10-13 03:22:18,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 152895488. Throughput: 0: 1668.2, 1: 1689.2. Samples: 38227320. Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:18,607][45375] Avg episode reward: [(0, '58.770'), (1, '56.160')] +[2023-10-13 03:22:20,738][46662] Updated weights for policy 0, policy_version 74690 (0.0010) +[2023-10-13 03:22:21,113][46662] Updated weights for policy 0, policy_version 74700 (0.0007) +[2023-10-13 03:22:21,478][46662] Updated weights for policy 0, policy_version 74710 (0.0007) +[2023-10-13 03:22:21,849][46662] Updated weights for policy 0, policy_version 74720 (0.0009) +[2023-10-13 03:22:22,424][46663] Updated weights for policy 1, policy_version 74631 (0.0011) +[2023-10-13 03:22:22,785][46663] Updated weights for policy 1, policy_version 74641 (0.0010) +[2023-10-13 03:22:23,146][46663] Updated weights for policy 1, policy_version 74651 (0.0010) +[2023-10-13 03:22:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152961024. Throughput: 0: 1687.7, 1: 1665.5. Samples: 38246416. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:23,608][45375] Avg episode reward: [(0, '57.500'), (1, '56.440')] +[2023-10-13 03:22:25,762][46662] Updated weights for policy 0, policy_version 74730 (0.0008) +[2023-10-13 03:22:26,138][46662] Updated weights for policy 0, policy_version 74740 (0.0010) +[2023-10-13 03:22:26,502][46662] Updated weights for policy 0, policy_version 74750 (0.0010) +[2023-10-13 03:22:27,306][46663] Updated weights for policy 1, policy_version 74661 (0.0009) +[2023-10-13 03:22:27,677][46663] Updated weights for policy 1, policy_version 74671 (0.0009) +[2023-10-13 03:22:28,044][46663] Updated weights for policy 1, policy_version 74681 (0.0010) +[2023-10-13 03:22:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153026560. Throughput: 0: 1675.8, 1: 1685.8. Samples: 38257618. Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:28,607][45375] Avg episode reward: [(0, '57.540'), (1, '56.090')] +[2023-10-13 03:22:30,339][46662] Updated weights for policy 0, policy_version 74760 (0.0008) +[2023-10-13 03:22:30,714][46662] Updated weights for policy 0, policy_version 74770 (0.0009) +[2023-10-13 03:22:31,071][46662] Updated weights for policy 0, policy_version 74780 (0.0010) +[2023-10-13 03:22:32,290][46663] Updated weights for policy 1, policy_version 74691 (0.0008) +[2023-10-13 03:22:32,654][46663] Updated weights for policy 1, policy_version 74701 (0.0008) +[2023-10-13 03:22:33,021][46663] Updated weights for policy 1, policy_version 74711 (0.0010) +[2023-10-13 03:22:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153092096. Throughput: 0: 1671.9, 1: 1671.6. Samples: 38277098. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:33,607][45375] Avg episode reward: [(0, '57.280'), (1, '53.430')] +[2023-10-13 03:22:35,116][46662] Updated weights for policy 0, policy_version 74790 (0.0011) +[2023-10-13 03:22:35,483][46662] Updated weights for policy 0, policy_version 74800 (0.0009) +[2023-10-13 03:22:35,853][46662] Updated weights for policy 0, policy_version 74810 (0.0009) +[2023-10-13 03:22:36,916][46663] Updated weights for policy 1, policy_version 74721 (0.0009) +[2023-10-13 03:22:37,272][46663] Updated weights for policy 1, policy_version 74731 (0.0011) +[2023-10-13 03:22:37,640][46663] Updated weights for policy 1, policy_version 74741 (0.0008) +[2023-10-13 03:22:38,005][46663] Updated weights for policy 1, policy_version 74751 (0.0007) +[2023-10-13 03:22:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153157632. Throughput: 0: 1690.8, 1: 1659.6. Samples: 38296952. Policy #0 lag: (min: 18.0, avg: 18.0, max: 22.0) +[2023-10-13 03:22:38,608][45375] Avg episode reward: [(0, '57.130'), (1, '53.590')] +[2023-10-13 03:22:39,974][46662] Updated weights for policy 0, policy_version 74820 (0.0009) +[2023-10-13 03:22:40,351][46662] Updated weights for policy 0, policy_version 74830 (0.0008) +[2023-10-13 03:22:40,711][46662] Updated weights for policy 0, policy_version 74840 (0.0008) +[2023-10-13 03:22:41,955][46663] Updated weights for policy 1, policy_version 74761 (0.0007) +[2023-10-13 03:22:42,329][46663] Updated weights for policy 1, policy_version 74771 (0.0009) +[2023-10-13 03:22:42,701][46663] Updated weights for policy 1, policy_version 74781 (0.0008) +[2023-10-13 03:22:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153223168. Throughput: 0: 1666.1, 1: 1677.9. Samples: 38307708. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:22:43,608][45375] Avg episode reward: [(0, '55.790'), (1, '51.420')] +[2023-10-13 03:22:44,830][46662] Updated weights for policy 0, policy_version 74850 (0.0009) +[2023-10-13 03:22:45,192][46662] Updated weights for policy 0, policy_version 74860 (0.0010) +[2023-10-13 03:22:45,576][46662] Updated weights for policy 0, policy_version 74870 (0.0009) +[2023-10-13 03:22:45,941][46662] Updated weights for policy 0, policy_version 74880 (0.0011) +[2023-10-13 03:22:46,706][46663] Updated weights for policy 1, policy_version 74791 (0.0007) +[2023-10-13 03:22:47,080][46663] Updated weights for policy 1, policy_version 74801 (0.0008) +[2023-10-13 03:22:47,446][46663] Updated weights for policy 1, policy_version 74811 (0.0008) +[2023-10-13 03:22:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153288704. Throughput: 0: 1681.7, 1: 1661.9. Samples: 38327442. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:22:48,608][45375] Avg episode reward: [(0, '54.490'), (1, '51.480')] +[2023-10-13 03:22:50,043][46662] Updated weights for policy 0, policy_version 74890 (0.0008) +[2023-10-13 03:22:50,414][46662] Updated weights for policy 0, policy_version 74900 (0.0010) +[2023-10-13 03:22:50,787][46662] Updated weights for policy 0, policy_version 74910 (0.0008) +[2023-10-13 03:22:51,576][46663] Updated weights for policy 1, policy_version 74821 (0.0009) +[2023-10-13 03:22:51,960][46663] Updated weights for policy 1, policy_version 74831 (0.0008) +[2023-10-13 03:22:52,326][46663] Updated weights for policy 1, policy_version 74841 (0.0010) +[2023-10-13 03:22:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153354240. Throughput: 0: 1689.8, 1: 1672.0. Samples: 38347686. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:22:53,607][45375] Avg episode reward: [(0, '51.880'), (1, '52.690')] +[2023-10-13 03:22:55,017][46662] Updated weights for policy 0, policy_version 74920 (0.0008) +[2023-10-13 03:22:55,387][46662] Updated weights for policy 0, policy_version 74930 (0.0007) +[2023-10-13 03:22:55,760][46662] Updated weights for policy 0, policy_version 74940 (0.0010) +[2023-10-13 03:22:56,315][46663] Updated weights for policy 1, policy_version 74851 (0.0011) +[2023-10-13 03:22:56,680][46663] Updated weights for policy 1, policy_version 74861 (0.0009) +[2023-10-13 03:22:57,045][46663] Updated weights for policy 1, policy_version 74871 (0.0008) +[2023-10-13 03:22:58,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153419776. Throughput: 0: 1661.2, 1: 1680.6. Samples: 38358070. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:22:58,607][45375] Avg episode reward: [(0, '50.900'), (1, '52.720')] +[2023-10-13 03:22:59,570][46662] Updated weights for policy 0, policy_version 74950 (0.0008) +[2023-10-13 03:22:59,935][46662] Updated weights for policy 0, policy_version 74960 (0.0007) +[2023-10-13 03:23:00,298][46662] Updated weights for policy 0, policy_version 74970 (0.0008) +[2023-10-13 03:23:01,060][46663] Updated weights for policy 1, policy_version 74881 (0.0007) +[2023-10-13 03:23:01,415][46663] Updated weights for policy 1, policy_version 74891 (0.0008) +[2023-10-13 03:23:01,790][46663] Updated weights for policy 1, policy_version 74901 (0.0008) +[2023-10-13 03:23:02,160][46663] Updated weights for policy 1, policy_version 74911 (0.0007) +[2023-10-13 03:23:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153485312. Throughput: 0: 1683.2, 1: 1660.1. Samples: 38377770. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:23:03,608][45375] Avg episode reward: [(0, '51.040'), (1, '52.480')] +[2023-10-13 03:23:04,433][46662] Updated weights for policy 0, policy_version 74980 (0.0007) +[2023-10-13 03:23:04,794][46662] Updated weights for policy 0, policy_version 74990 (0.0007) +[2023-10-13 03:23:05,171][46662] Updated weights for policy 0, policy_version 75000 (0.0008) +[2023-10-13 03:23:06,316][46663] Updated weights for policy 1, policy_version 74921 (0.0009) +[2023-10-13 03:23:06,681][46663] Updated weights for policy 1, policy_version 74931 (0.0009) +[2023-10-13 03:23:07,051][46663] Updated weights for policy 1, policy_version 74941 (0.0008) +[2023-10-13 03:23:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153550848. Throughput: 0: 1688.7, 1: 1693.5. Samples: 38398616. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:23:08,607][45375] Avg episode reward: [(0, '51.820'), (1, '53.020')] +[2023-10-13 03:23:09,296][46662] Updated weights for policy 0, policy_version 75010 (0.0009) +[2023-10-13 03:23:09,676][46662] Updated weights for policy 0, policy_version 75020 (0.0007) +[2023-10-13 03:23:10,051][46662] Updated weights for policy 0, policy_version 75030 (0.0008) +[2023-10-13 03:23:10,417][46662] Updated weights for policy 0, policy_version 75040 (0.0007) +[2023-10-13 03:23:11,178][46663] Updated weights for policy 1, policy_version 74951 (0.0009) +[2023-10-13 03:23:11,547][46663] Updated weights for policy 1, policy_version 74961 (0.0009) +[2023-10-13 03:23:11,909][46663] Updated weights for policy 1, policy_version 74971 (0.0009) +[2023-10-13 03:23:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153616384. Throughput: 0: 1665.5, 1: 1684.0. Samples: 38408342. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:23:13,607][45375] Avg episode reward: [(0, '52.380'), (1, '53.090')] +[2023-10-13 03:23:14,409][46662] Updated weights for policy 0, policy_version 75050 (0.0007) +[2023-10-13 03:23:14,773][46662] Updated weights for policy 0, policy_version 75060 (0.0008) +[2023-10-13 03:23:15,156][46662] Updated weights for policy 0, policy_version 75070 (0.0009) +[2023-10-13 03:23:15,905][46663] Updated weights for policy 1, policy_version 74981 (0.0009) +[2023-10-13 03:23:16,260][46663] Updated weights for policy 1, policy_version 74991 (0.0009) +[2023-10-13 03:23:16,631][46663] Updated weights for policy 1, policy_version 75001 (0.0008) +[2023-10-13 03:23:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153681920. Throughput: 0: 1687.4, 1: 1678.4. Samples: 38428562. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:23:18,607][45375] Avg episode reward: [(0, '51.120'), (1, '52.090')] +[2023-10-13 03:23:19,063][46662] Updated weights for policy 0, policy_version 75080 (0.0007) +[2023-10-13 03:23:19,435][46662] Updated weights for policy 0, policy_version 75090 (0.0009) +[2023-10-13 03:23:19,805][46662] Updated weights for policy 0, policy_version 75100 (0.0009) +[2023-10-13 03:23:20,650][46663] Updated weights for policy 1, policy_version 75011 (0.0007) +[2023-10-13 03:23:21,015][46663] Updated weights for policy 1, policy_version 75021 (0.0010) +[2023-10-13 03:23:21,382][46663] Updated weights for policy 1, policy_version 75031 (0.0008) +[2023-10-13 03:23:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153747456. Throughput: 0: 1688.6, 1: 1701.7. Samples: 38449514. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:23:23,607][45375] Avg episode reward: [(0, '53.480'), (1, '53.480')] +[2023-10-13 03:23:23,884][46662] Updated weights for policy 0, policy_version 75110 (0.0009) +[2023-10-13 03:23:24,248][46662] Updated weights for policy 0, policy_version 75120 (0.0012) +[2023-10-13 03:23:24,623][46662] Updated weights for policy 0, policy_version 75130 (0.0009) +[2023-10-13 03:23:25,626][46663] Updated weights for policy 1, policy_version 75041 (0.0008) +[2023-10-13 03:23:26,000][46663] Updated weights for policy 1, policy_version 75051 (0.0009) +[2023-10-13 03:23:26,366][46663] Updated weights for policy 1, policy_version 75061 (0.0008) +[2023-10-13 03:23:26,740][46663] Updated weights for policy 1, policy_version 75071 (0.0009) +[2023-10-13 03:23:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153812992. Throughput: 0: 1680.8, 1: 1680.4. Samples: 38458962. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 03:23:28,607][45375] Avg episode reward: [(0, '52.920'), (1, '51.740')] +[2023-10-13 03:23:28,929][46662] Updated weights for policy 0, policy_version 75140 (0.0008) +[2023-10-13 03:23:29,301][46662] Updated weights for policy 0, policy_version 75150 (0.0009) +[2023-10-13 03:23:29,673][46662] Updated weights for policy 0, policy_version 75160 (0.0008) +[2023-10-13 03:23:30,746][46663] Updated weights for policy 1, policy_version 75081 (0.0009) +[2023-10-13 03:23:31,120][46663] Updated weights for policy 1, policy_version 75091 (0.0007) +[2023-10-13 03:23:31,484][46663] Updated weights for policy 1, policy_version 75101 (0.0007) +[2023-10-13 03:23:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153878528. Throughput: 0: 1680.9, 1: 1686.8. Samples: 38478990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:23:33,607][45375] Avg episode reward: [(0, '51.920'), (1, '51.870')] +[2023-10-13 03:23:33,755][46662] Updated weights for policy 0, policy_version 75170 (0.0008) +[2023-10-13 03:23:34,125][46662] Updated weights for policy 0, policy_version 75180 (0.0009) +[2023-10-13 03:23:34,499][46662] Updated weights for policy 0, policy_version 75190 (0.0009) +[2023-10-13 03:23:34,857][46662] Updated weights for policy 0, policy_version 75200 (0.0007) +[2023-10-13 03:23:35,383][46663] Updated weights for policy 1, policy_version 75111 (0.0010) +[2023-10-13 03:23:35,747][46663] Updated weights for policy 1, policy_version 75121 (0.0009) +[2023-10-13 03:23:36,116][46663] Updated weights for policy 1, policy_version 75131 (0.0008) +[2023-10-13 03:23:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153944064. Throughput: 0: 1682.5, 1: 1695.9. Samples: 38499714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:23:38,607][45375] Avg episode reward: [(0, '52.920'), (1, '51.880')] +[2023-10-13 03:23:38,860][46662] Updated weights for policy 0, policy_version 75210 (0.0011) +[2023-10-13 03:23:39,227][46662] Updated weights for policy 0, policy_version 75220 (0.0009) +[2023-10-13 03:23:39,600][46662] Updated weights for policy 0, policy_version 75230 (0.0007) +[2023-10-13 03:23:40,289][46663] Updated weights for policy 1, policy_version 75141 (0.0007) +[2023-10-13 03:23:40,682][46663] Updated weights for policy 1, policy_version 75151 (0.0007) +[2023-10-13 03:23:41,052][46663] Updated weights for policy 1, policy_version 75161 (0.0009) +[2023-10-13 03:23:43,573][46662] Updated weights for policy 0, policy_version 75240 (0.0007) +[2023-10-13 03:23:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154009600. Throughput: 0: 1682.0, 1: 1666.1. Samples: 38508736. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:23:43,607][45375] Avg episode reward: [(0, '53.350'), (1, '53.420')] +[2023-10-13 03:23:43,939][46662] Updated weights for policy 0, policy_version 75250 (0.0009) +[2023-10-13 03:23:44,314][46662] Updated weights for policy 0, policy_version 75260 (0.0009) +[2023-10-13 03:23:45,067][46663] Updated weights for policy 1, policy_version 75171 (0.0007) +[2023-10-13 03:23:45,434][46663] Updated weights for policy 1, policy_version 75181 (0.0007) +[2023-10-13 03:23:45,807][46663] Updated weights for policy 1, policy_version 75191 (0.0007) +[2023-10-13 03:23:48,324][46662] Updated weights for policy 0, policy_version 75270 (0.0009) +[2023-10-13 03:23:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154075136. Throughput: 0: 1683.1, 1: 1690.0. Samples: 38529556. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:23:48,607][45375] Avg episode reward: [(0, '54.990'), (1, '53.490')] +[2023-10-13 03:23:48,697][46662] Updated weights for policy 0, policy_version 75280 (0.0007) +[2023-10-13 03:23:49,050][46662] Updated weights for policy 0, policy_version 75290 (0.0009) +[2023-10-13 03:23:49,874][46663] Updated weights for policy 1, policy_version 75201 (0.0009) +[2023-10-13 03:23:50,249][46663] Updated weights for policy 1, policy_version 75211 (0.0010) +[2023-10-13 03:23:50,617][46663] Updated weights for policy 1, policy_version 75221 (0.0011) +[2023-10-13 03:23:50,988][46663] Updated weights for policy 1, policy_version 75231 (0.0012) +[2023-10-13 03:23:53,371][46662] Updated weights for policy 0, policy_version 75300 (0.0009) +[2023-10-13 03:23:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 154140672. Throughput: 0: 1684.5, 1: 1684.6. Samples: 38550226. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:23:53,608][45375] Avg episode reward: [(0, '54.750'), (1, '52.380')] +[2023-10-13 03:23:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000075232_77037568.pth... +[2023-10-13 03:23:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000073664_75431936.pth +[2023-10-13 03:23:53,743][46662] Updated weights for policy 0, policy_version 75310 (0.0007) +[2023-10-13 03:23:54,117][46662] Updated weights for policy 0, policy_version 75320 (0.0007) +[2023-10-13 03:23:54,409][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000075328_77135872.pth... +[2023-10-13 03:23:54,437][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000073728_75497472.pth +[2023-10-13 03:23:54,891][46663] Updated weights for policy 1, policy_version 75241 (0.0012) +[2023-10-13 03:23:55,263][46663] Updated weights for policy 1, policy_version 75251 (0.0008) +[2023-10-13 03:23:55,633][46663] Updated weights for policy 1, policy_version 75261 (0.0007) +[2023-10-13 03:23:58,279][46662] Updated weights for policy 0, policy_version 75330 (0.0008) +[2023-10-13 03:23:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154206208. Throughput: 0: 1688.9, 1: 1669.9. Samples: 38559490. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:23:58,607][45375] Avg episode reward: [(0, '55.760'), (1, '53.200')] +[2023-10-13 03:23:58,646][46662] Updated weights for policy 0, policy_version 75340 (0.0008) +[2023-10-13 03:23:59,017][46662] Updated weights for policy 0, policy_version 75350 (0.0010) +[2023-10-13 03:23:59,380][46662] Updated weights for policy 0, policy_version 75360 (0.0010) +[2023-10-13 03:23:59,801][46663] Updated weights for policy 1, policy_version 75271 (0.0007) +[2023-10-13 03:24:00,174][46663] Updated weights for policy 1, policy_version 75281 (0.0007) +[2023-10-13 03:24:00,543][46663] Updated weights for policy 1, policy_version 75291 (0.0007) +[2023-10-13 03:24:03,383][46662] Updated weights for policy 0, policy_version 75370 (0.0009) +[2023-10-13 03:24:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154271744. Throughput: 0: 1686.2, 1: 1685.5. Samples: 38580286. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:03,607][45375] Avg episode reward: [(0, '58.930'), (1, '51.820')] +[2023-10-13 03:24:03,754][46662] Updated weights for policy 0, policy_version 75380 (0.0007) +[2023-10-13 03:24:04,114][46662] Updated weights for policy 0, policy_version 75390 (0.0009) +[2023-10-13 03:24:04,499][46663] Updated weights for policy 1, policy_version 75301 (0.0008) +[2023-10-13 03:24:04,869][46663] Updated weights for policy 1, policy_version 75311 (0.0007) +[2023-10-13 03:24:05,237][46663] Updated weights for policy 1, policy_version 75321 (0.0008) +[2023-10-13 03:24:08,229][46662] Updated weights for policy 0, policy_version 75400 (0.0008) +[2023-10-13 03:24:08,606][46662] Updated weights for policy 0, policy_version 75410 (0.0009) +[2023-10-13 03:24:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154337280. Throughput: 0: 1678.0, 1: 1681.3. Samples: 38600682. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:08,607][45375] Avg episode reward: [(0, '59.460'), (1, '52.380')] +[2023-10-13 03:24:08,971][46662] Updated weights for policy 0, policy_version 75420 (0.0010) +[2023-10-13 03:24:09,468][46663] Updated weights for policy 1, policy_version 75331 (0.0010) +[2023-10-13 03:24:09,840][46663] Updated weights for policy 1, policy_version 75341 (0.0008) +[2023-10-13 03:24:10,203][46663] Updated weights for policy 1, policy_version 75351 (0.0009) +[2023-10-13 03:24:13,158][46662] Updated weights for policy 0, policy_version 75430 (0.0007) +[2023-10-13 03:24:13,527][46662] Updated weights for policy 0, policy_version 75440 (0.0007) +[2023-10-13 03:24:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 154402816. Throughput: 0: 1677.1, 1: 1670.0. Samples: 38609580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:13,608][45375] Avg episode reward: [(0, '58.560'), (1, '51.950')] +[2023-10-13 03:24:13,893][46662] Updated weights for policy 0, policy_version 75450 (0.0007) +[2023-10-13 03:24:14,261][46663] Updated weights for policy 1, policy_version 75361 (0.0009) +[2023-10-13 03:24:14,624][46663] Updated weights for policy 1, policy_version 75371 (0.0008) +[2023-10-13 03:24:14,999][46663] Updated weights for policy 1, policy_version 75381 (0.0007) +[2023-10-13 03:24:15,365][46663] Updated weights for policy 1, policy_version 75391 (0.0009) +[2023-10-13 03:24:17,714][46662] Updated weights for policy 0, policy_version 75460 (0.0008) +[2023-10-13 03:24:18,092][46662] Updated weights for policy 0, policy_version 75470 (0.0008) +[2023-10-13 03:24:18,457][46662] Updated weights for policy 0, policy_version 75480 (0.0009) +[2023-10-13 03:24:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154468352. Throughput: 0: 1685.3, 1: 1677.8. Samples: 38630330. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:18,607][45375] Avg episode reward: [(0, '61.570'), (1, '52.120')] +[2023-10-13 03:24:19,545][46663] Updated weights for policy 1, policy_version 75401 (0.0009) +[2023-10-13 03:24:19,908][46663] Updated weights for policy 1, policy_version 75411 (0.0009) +[2023-10-13 03:24:20,278][46663] Updated weights for policy 1, policy_version 75421 (0.0009) +[2023-10-13 03:24:22,440][46662] Updated weights for policy 0, policy_version 75490 (0.0008) +[2023-10-13 03:24:22,822][46662] Updated weights for policy 0, policy_version 75500 (0.0008) +[2023-10-13 03:24:23,193][46662] Updated weights for policy 0, policy_version 75510 (0.0007) +[2023-10-13 03:24:23,553][46662] Updated weights for policy 0, policy_version 75520 (0.0008) +[2023-10-13 03:24:23,606][45375] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154566656. Throughput: 0: 1679.3, 1: 1677.0. Samples: 38650748. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:23,607][45375] Avg episode reward: [(0, '62.050'), (1, '50.870')] +[2023-10-13 03:24:23,616][46091] Saving new best policy, reward=62.050! +[2023-10-13 03:24:24,374][46663] Updated weights for policy 1, policy_version 75431 (0.0009) +[2023-10-13 03:24:24,735][46663] Updated weights for policy 1, policy_version 75441 (0.0008) +[2023-10-13 03:24:25,098][46663] Updated weights for policy 1, policy_version 75451 (0.0008) +[2023-10-13 03:24:27,716][46662] Updated weights for policy 0, policy_version 75530 (0.0007) +[2023-10-13 03:24:28,078][46662] Updated weights for policy 0, policy_version 75540 (0.0007) +[2023-10-13 03:24:28,446][46662] Updated weights for policy 0, policy_version 75550 (0.0007) +[2023-10-13 03:24:28,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154632192. Throughput: 0: 1689.5, 1: 1676.9. Samples: 38660224. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:28,607][45375] Avg episode reward: [(0, '61.620'), (1, '51.540')] +[2023-10-13 03:24:29,286][46663] Updated weights for policy 1, policy_version 75461 (0.0010) +[2023-10-13 03:24:29,650][46663] Updated weights for policy 1, policy_version 75471 (0.0011) +[2023-10-13 03:24:30,015][46663] Updated weights for policy 1, policy_version 75481 (0.0011) +[2023-10-13 03:24:32,683][46662] Updated weights for policy 0, policy_version 75560 (0.0008) +[2023-10-13 03:24:33,058][46662] Updated weights for policy 0, policy_version 75570 (0.0009) +[2023-10-13 03:24:33,434][46662] Updated weights for policy 0, policy_version 75580 (0.0007) +[2023-10-13 03:24:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154697728. Throughput: 0: 1689.1, 1: 1675.5. Samples: 38680962. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:33,607][45375] Avg episode reward: [(0, '62.530'), (1, '50.920')] +[2023-10-13 03:24:33,608][46091] Saving new best policy, reward=62.530! +[2023-10-13 03:24:34,148][46663] Updated weights for policy 1, policy_version 75491 (0.0009) +[2023-10-13 03:24:34,521][46663] Updated weights for policy 1, policy_version 75501 (0.0009) +[2023-10-13 03:24:34,891][46663] Updated weights for policy 1, policy_version 75511 (0.0007) +[2023-10-13 03:24:37,512][46662] Updated weights for policy 0, policy_version 75590 (0.0008) +[2023-10-13 03:24:37,883][46662] Updated weights for policy 0, policy_version 75600 (0.0009) +[2023-10-13 03:24:38,263][46662] Updated weights for policy 0, policy_version 75610 (0.0008) +[2023-10-13 03:24:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154763264. Throughput: 0: 1673.3, 1: 1677.5. Samples: 38701008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:38,607][45375] Avg episode reward: [(0, '62.140'), (1, '50.800')] +[2023-10-13 03:24:38,935][46663] Updated weights for policy 1, policy_version 75521 (0.0008) +[2023-10-13 03:24:39,309][46663] Updated weights for policy 1, policy_version 75531 (0.0008) +[2023-10-13 03:24:39,663][46663] Updated weights for policy 1, policy_version 75541 (0.0009) +[2023-10-13 03:24:40,026][46663] Updated weights for policy 1, policy_version 75551 (0.0008) +[2023-10-13 03:24:42,220][46662] Updated weights for policy 0, policy_version 75620 (0.0008) +[2023-10-13 03:24:42,584][46662] Updated weights for policy 0, policy_version 75630 (0.0010) +[2023-10-13 03:24:42,965][46662] Updated weights for policy 0, policy_version 75640 (0.0007) +[2023-10-13 03:24:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154828800. Throughput: 0: 1683.5, 1: 1675.3. Samples: 38710638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:43,607][45375] Avg episode reward: [(0, '62.590'), (1, '50.490')] +[2023-10-13 03:24:43,608][46091] Saving new best policy, reward=62.590! +[2023-10-13 03:24:44,111][46663] Updated weights for policy 1, policy_version 75561 (0.0007) +[2023-10-13 03:24:44,468][46663] Updated weights for policy 1, policy_version 75571 (0.0008) +[2023-10-13 03:24:44,833][46663] Updated weights for policy 1, policy_version 75581 (0.0007) +[2023-10-13 03:24:46,967][46662] Updated weights for policy 0, policy_version 75650 (0.0007) +[2023-10-13 03:24:47,330][46662] Updated weights for policy 0, policy_version 75660 (0.0007) +[2023-10-13 03:24:47,701][46662] Updated weights for policy 0, policy_version 75670 (0.0008) +[2023-10-13 03:24:48,074][46662] Updated weights for policy 0, policy_version 75680 (0.0010) +[2023-10-13 03:24:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154894336. Throughput: 0: 1686.4, 1: 1676.3. 
Samples: 38731608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:48,607][45375] Avg episode reward: [(0, '63.050'), (1, '49.050')] +[2023-10-13 03:24:48,608][46091] Saving new best policy, reward=63.050! +[2023-10-13 03:24:48,862][46663] Updated weights for policy 1, policy_version 75591 (0.0009) +[2023-10-13 03:24:49,228][46663] Updated weights for policy 1, policy_version 75601 (0.0008) +[2023-10-13 03:24:49,589][46663] Updated weights for policy 1, policy_version 75611 (0.0007) +[2023-10-13 03:24:51,894][46662] Updated weights for policy 0, policy_version 75690 (0.0007) +[2023-10-13 03:24:52,272][46662] Updated weights for policy 0, policy_version 75700 (0.0008) +[2023-10-13 03:24:52,641][46662] Updated weights for policy 0, policy_version 75710 (0.0010) +[2023-10-13 03:24:53,547][46663] Updated weights for policy 1, policy_version 75621 (0.0008) +[2023-10-13 03:24:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154959872. Throughput: 0: 1668.8, 1: 1683.8. Samples: 38751546. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:53,607][45375] Avg episode reward: [(0, '62.570'), (1, '50.240')] +[2023-10-13 03:24:53,906][46663] Updated weights for policy 1, policy_version 75631 (0.0009) +[2023-10-13 03:24:54,270][46663] Updated weights for policy 1, policy_version 75641 (0.0007) +[2023-10-13 03:24:56,874][46662] Updated weights for policy 0, policy_version 75720 (0.0008) +[2023-10-13 03:24:57,233][46662] Updated weights for policy 0, policy_version 75730 (0.0009) +[2023-10-13 03:24:57,600][46662] Updated weights for policy 0, policy_version 75740 (0.0007) +[2023-10-13 03:24:58,396][46663] Updated weights for policy 1, policy_version 75651 (0.0008) +[2023-10-13 03:24:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155025408. Throughput: 0: 1699.0, 1: 1683.8. Samples: 38761806. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:24:58,607][45375] Avg episode reward: [(0, '64.040'), (1, '50.070')] +[2023-10-13 03:24:58,608][46091] Saving new best policy, reward=64.040! +[2023-10-13 03:24:58,759][46663] Updated weights for policy 1, policy_version 75661 (0.0008) +[2023-10-13 03:24:59,133][46663] Updated weights for policy 1, policy_version 75671 (0.0007) +[2023-10-13 03:25:01,722][46662] Updated weights for policy 0, policy_version 75750 (0.0010) +[2023-10-13 03:25:02,087][46662] Updated weights for policy 0, policy_version 75760 (0.0008) +[2023-10-13 03:25:02,453][46662] Updated weights for policy 0, policy_version 75770 (0.0010) +[2023-10-13 03:25:03,240][46663] Updated weights for policy 1, policy_version 75681 (0.0008) +[2023-10-13 03:25:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155090944. Throughput: 0: 1686.6, 1: 1693.5. Samples: 38782436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:25:03,608][45375] Avg episode reward: [(0, '64.170'), (1, '50.780')] +[2023-10-13 03:25:03,609][46091] Saving new best policy, reward=64.170! +[2023-10-13 03:25:03,615][46663] Updated weights for policy 1, policy_version 75691 (0.0007) +[2023-10-13 03:25:03,981][46663] Updated weights for policy 1, policy_version 75701 (0.0008) +[2023-10-13 03:25:04,347][46663] Updated weights for policy 1, policy_version 75711 (0.0007) +[2023-10-13 03:25:06,450][46662] Updated weights for policy 0, policy_version 75780 (0.0008) +[2023-10-13 03:25:06,817][46662] Updated weights for policy 0, policy_version 75790 (0.0010) +[2023-10-13 03:25:07,184][46662] Updated weights for policy 0, policy_version 75800 (0.0008) +[2023-10-13 03:25:08,316][46663] Updated weights for policy 1, policy_version 75721 (0.0007) +[2023-10-13 03:25:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155156480. Throughput: 0: 1672.7, 1: 1687.0. Samples: 38801932. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:08,607][45375] Avg episode reward: [(0, '63.720'), (1, '50.630')] +[2023-10-13 03:25:08,683][46663] Updated weights for policy 1, policy_version 75731 (0.0008) +[2023-10-13 03:25:09,054][46663] Updated weights for policy 1, policy_version 75741 (0.0009) +[2023-10-13 03:25:11,262][46662] Updated weights for policy 0, policy_version 75810 (0.0007) +[2023-10-13 03:25:11,626][46662] Updated weights for policy 0, policy_version 75820 (0.0009) +[2023-10-13 03:25:11,989][46662] Updated weights for policy 0, policy_version 75830 (0.0008) +[2023-10-13 03:25:12,359][46662] Updated weights for policy 0, policy_version 75840 (0.0008) +[2023-10-13 03:25:13,137][46663] Updated weights for policy 1, policy_version 75751 (0.0008) +[2023-10-13 03:25:13,509][46663] Updated weights for policy 1, policy_version 75761 (0.0009) +[2023-10-13 03:25:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155222016. Throughput: 0: 1695.3, 1: 1693.7. Samples: 38812730. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:13,607][45375] Avg episode reward: [(0, '62.640'), (1, '52.110')] +[2023-10-13 03:25:13,886][46663] Updated weights for policy 1, policy_version 75771 (0.0008) +[2023-10-13 03:25:16,462][46662] Updated weights for policy 0, policy_version 75850 (0.0009) +[2023-10-13 03:25:16,832][46662] Updated weights for policy 0, policy_version 75860 (0.0008) +[2023-10-13 03:25:17,199][46662] Updated weights for policy 0, policy_version 75870 (0.0009) +[2023-10-13 03:25:17,993][46663] Updated weights for policy 1, policy_version 75781 (0.0008) +[2023-10-13 03:25:18,387][46663] Updated weights for policy 1, policy_version 75791 (0.0010) +[2023-10-13 03:25:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155287552. Throughput: 0: 1678.9, 1: 1696.3. Samples: 38832846. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:18,607][45375] Avg episode reward: [(0, '59.670'), (1, '51.960')] +[2023-10-13 03:25:18,751][46663] Updated weights for policy 1, policy_version 75801 (0.0011) +[2023-10-13 03:25:21,317][46662] Updated weights for policy 0, policy_version 75880 (0.0008) +[2023-10-13 03:25:21,680][46662] Updated weights for policy 0, policy_version 75890 (0.0007) +[2023-10-13 03:25:22,040][46662] Updated weights for policy 0, policy_version 75900 (0.0007) +[2023-10-13 03:25:22,802][46663] Updated weights for policy 1, policy_version 75811 (0.0010) +[2023-10-13 03:25:23,163][46663] Updated weights for policy 1, policy_version 75821 (0.0010) +[2023-10-13 03:25:23,520][46663] Updated weights for policy 1, policy_version 75831 (0.0009) +[2023-10-13 03:25:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155353088. Throughput: 0: 1679.8, 1: 1678.3. Samples: 38852120. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:23,607][45375] Avg episode reward: [(0, '60.760'), (1, '51.860')] +[2023-10-13 03:25:26,089][46662] Updated weights for policy 0, policy_version 75910 (0.0008) +[2023-10-13 03:25:26,463][46662] Updated weights for policy 0, policy_version 75920 (0.0007) +[2023-10-13 03:25:26,824][46662] Updated weights for policy 0, policy_version 75930 (0.0007) +[2023-10-13 03:25:27,525][46663] Updated weights for policy 1, policy_version 75841 (0.0010) +[2023-10-13 03:25:27,893][46663] Updated weights for policy 1, policy_version 75851 (0.0008) +[2023-10-13 03:25:28,264][46663] Updated weights for policy 1, policy_version 75861 (0.0007) +[2023-10-13 03:25:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155418624. Throughput: 0: 1694.8, 1: 1695.1. Samples: 38863180. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:28,607][45375] Avg episode reward: [(0, '58.580'), (1, '51.840')] +[2023-10-13 03:25:28,624][46663] Updated weights for policy 1, policy_version 75871 (0.0008) +[2023-10-13 03:25:30,843][46662] Updated weights for policy 0, policy_version 75940 (0.0009) +[2023-10-13 03:25:31,209][46662] Updated weights for policy 0, policy_version 75950 (0.0009) +[2023-10-13 03:25:31,575][46662] Updated weights for policy 0, policy_version 75960 (0.0008) +[2023-10-13 03:25:32,510][46663] Updated weights for policy 1, policy_version 75881 (0.0009) +[2023-10-13 03:25:32,874][46663] Updated weights for policy 1, policy_version 75891 (0.0008) +[2023-10-13 03:25:33,244][46663] Updated weights for policy 1, policy_version 75901 (0.0009) +[2023-10-13 03:25:33,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155516928. Throughput: 0: 1664.7, 1: 1691.6. Samples: 38882640. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:33,607][45375] Avg episode reward: [(0, '58.490'), (1, '53.290')] +[2023-10-13 03:25:35,691][46662] Updated weights for policy 0, policy_version 75970 (0.0008) +[2023-10-13 03:25:36,064][46662] Updated weights for policy 0, policy_version 75980 (0.0008) +[2023-10-13 03:25:36,433][46662] Updated weights for policy 0, policy_version 75990 (0.0008) +[2023-10-13 03:25:36,805][46662] Updated weights for policy 0, policy_version 76000 (0.0007) +[2023-10-13 03:25:37,228][46663] Updated weights for policy 1, policy_version 75911 (0.0009) +[2023-10-13 03:25:37,595][46663] Updated weights for policy 1, policy_version 75921 (0.0011) +[2023-10-13 03:25:37,966][46663] Updated weights for policy 1, policy_version 75931 (0.0010) +[2023-10-13 03:25:38,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155582464. Throughput: 0: 1681.5, 1: 1665.0. Samples: 38902140. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:38,607][45375] Avg episode reward: [(0, '58.410'), (1, '54.850')] +[2023-10-13 03:25:40,830][46662] Updated weights for policy 0, policy_version 76010 (0.0008) +[2023-10-13 03:25:41,204][46662] Updated weights for policy 0, policy_version 76020 (0.0008) +[2023-10-13 03:25:41,563][46662] Updated weights for policy 0, policy_version 76030 (0.0008) +[2023-10-13 03:25:41,952][46663] Updated weights for policy 1, policy_version 75941 (0.0011) +[2023-10-13 03:25:42,315][46663] Updated weights for policy 1, policy_version 75951 (0.0010) +[2023-10-13 03:25:42,687][46663] Updated weights for policy 1, policy_version 75961 (0.0009) +[2023-10-13 03:25:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155648000. Throughput: 0: 1673.8, 1: 1695.6. Samples: 38913430. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:43,608][45375] Avg episode reward: [(0, '58.560'), (1, '54.610')] +[2023-10-13 03:25:45,697][46662] Updated weights for policy 0, policy_version 76040 (0.0009) +[2023-10-13 03:25:46,056][46662] Updated weights for policy 0, policy_version 76050 (0.0008) +[2023-10-13 03:25:46,435][46662] Updated weights for policy 0, policy_version 76060 (0.0007) +[2023-10-13 03:25:46,697][46663] Updated weights for policy 1, policy_version 75971 (0.0008) +[2023-10-13 03:25:47,064][46663] Updated weights for policy 1, policy_version 75981 (0.0009) +[2023-10-13 03:25:47,432][46663] Updated weights for policy 1, policy_version 75991 (0.0007) +[2023-10-13 03:25:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155713536. Throughput: 0: 1666.6, 1: 1676.4. Samples: 38932868. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:48,607][45375] Avg episode reward: [(0, '58.840'), (1, '55.540')] +[2023-10-13 03:25:50,468][46662] Updated weights for policy 0, policy_version 76070 (0.0008) +[2023-10-13 03:25:50,842][46662] Updated weights for policy 0, policy_version 76080 (0.0009) +[2023-10-13 03:25:51,205][46662] Updated weights for policy 0, policy_version 76090 (0.0009) +[2023-10-13 03:25:51,566][46663] Updated weights for policy 1, policy_version 76001 (0.0008) +[2023-10-13 03:25:51,944][46663] Updated weights for policy 1, policy_version 76011 (0.0009) +[2023-10-13 03:25:52,309][46663] Updated weights for policy 1, policy_version 76021 (0.0010) +[2023-10-13 03:25:52,687][46663] Updated weights for policy 1, policy_version 76031 (0.0009) +[2023-10-13 03:25:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155779072. Throughput: 0: 1687.5, 1: 1671.4. Samples: 38953082. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-13 03:25:53,607][45375] Avg episode reward: [(0, '58.400'), (1, '55.700')] +[2023-10-13 03:25:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000076096_77922304.pth... +[2023-10-13 03:25:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000076032_77856768.pth... 
+[2023-10-13 03:25:53,653][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000074464_76251136.pth +[2023-10-13 03:25:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000074528_76316672.pth +[2023-10-13 03:25:55,270][46662] Updated weights for policy 0, policy_version 76100 (0.0008) +[2023-10-13 03:25:55,650][46662] Updated weights for policy 0, policy_version 76110 (0.0011) +[2023-10-13 03:25:56,016][46662] Updated weights for policy 0, policy_version 76120 (0.0011) +[2023-10-13 03:25:56,668][46663] Updated weights for policy 1, policy_version 76041 (0.0007) +[2023-10-13 03:25:57,035][46663] Updated weights for policy 1, policy_version 76051 (0.0007) +[2023-10-13 03:25:57,403][46663] Updated weights for policy 1, policy_version 76061 (0.0008) +[2023-10-13 03:25:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155844608. Throughput: 0: 1668.5, 1: 1693.2. Samples: 38964008. Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:25:58,607][45375] Avg episode reward: [(0, '59.330'), (1, '56.700')] +[2023-10-13 03:26:00,036][46662] Updated weights for policy 0, policy_version 76130 (0.0010) +[2023-10-13 03:26:00,406][46662] Updated weights for policy 0, policy_version 76140 (0.0009) +[2023-10-13 03:26:00,780][46662] Updated weights for policy 0, policy_version 76150 (0.0010) +[2023-10-13 03:26:01,156][46662] Updated weights for policy 0, policy_version 76160 (0.0007) +[2023-10-13 03:26:01,527][46663] Updated weights for policy 1, policy_version 76071 (0.0009) +[2023-10-13 03:26:01,894][46663] Updated weights for policy 1, policy_version 76081 (0.0008) +[2023-10-13 03:26:02,258][46663] Updated weights for policy 1, policy_version 76091 (0.0010) +[2023-10-13 03:26:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155910144. Throughput: 0: 1670.0, 1: 1665.9. Samples: 38982960. 
Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:03,607][45375] Avg episode reward: [(0, '61.250'), (1, '58.330')] +[2023-10-13 03:26:05,250][46662] Updated weights for policy 0, policy_version 76170 (0.0010) +[2023-10-13 03:26:05,619][46662] Updated weights for policy 0, policy_version 76180 (0.0010) +[2023-10-13 03:26:06,001][46662] Updated weights for policy 0, policy_version 76190 (0.0010) +[2023-10-13 03:26:06,266][46663] Updated weights for policy 1, policy_version 76101 (0.0010) +[2023-10-13 03:26:06,651][46663] Updated weights for policy 1, policy_version 76111 (0.0007) +[2023-10-13 03:26:07,025][46663] Updated weights for policy 1, policy_version 76121 (0.0009) +[2023-10-13 03:26:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155975680. Throughput: 0: 1678.0, 1: 1678.6. Samples: 39003168. Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:08,607][45375] Avg episode reward: [(0, '60.370'), (1, '58.710')] +[2023-10-13 03:26:10,216][46662] Updated weights for policy 0, policy_version 76200 (0.0008) +[2023-10-13 03:26:10,587][46662] Updated weights for policy 0, policy_version 76210 (0.0010) +[2023-10-13 03:26:10,950][46662] Updated weights for policy 0, policy_version 76220 (0.0009) +[2023-10-13 03:26:11,243][46663] Updated weights for policy 1, policy_version 76131 (0.0010) +[2023-10-13 03:26:11,603][46663] Updated weights for policy 1, policy_version 76141 (0.0008) +[2023-10-13 03:26:11,968][46663] Updated weights for policy 1, policy_version 76151 (0.0009) +[2023-10-13 03:26:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156041216. Throughput: 0: 1653.1, 1: 1687.9. Samples: 39013528. 
Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:13,608][45375] Avg episode reward: [(0, '60.390'), (1, '57.310')] +[2023-10-13 03:26:15,040][46662] Updated weights for policy 0, policy_version 76230 (0.0008) +[2023-10-13 03:26:15,412][46662] Updated weights for policy 0, policy_version 76240 (0.0007) +[2023-10-13 03:26:15,774][46662] Updated weights for policy 0, policy_version 76250 (0.0007) +[2023-10-13 03:26:15,869][46663] Updated weights for policy 1, policy_version 76161 (0.0009) +[2023-10-13 03:26:16,232][46663] Updated weights for policy 1, policy_version 76171 (0.0010) +[2023-10-13 03:26:16,602][46663] Updated weights for policy 1, policy_version 76181 (0.0009) +[2023-10-13 03:26:16,957][46663] Updated weights for policy 1, policy_version 76191 (0.0009) +[2023-10-13 03:26:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156106752. Throughput: 0: 1674.4, 1: 1666.4. Samples: 39032976. Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:18,607][45375] Avg episode reward: [(0, '59.860'), (1, '56.440')] +[2023-10-13 03:26:19,936][46662] Updated weights for policy 0, policy_version 76260 (0.0009) +[2023-10-13 03:26:20,306][46662] Updated weights for policy 0, policy_version 76270 (0.0009) +[2023-10-13 03:26:20,673][46662] Updated weights for policy 0, policy_version 76280 (0.0007) +[2023-10-13 03:26:21,194][46663] Updated weights for policy 1, policy_version 76201 (0.0007) +[2023-10-13 03:26:21,557][46663] Updated weights for policy 1, policy_version 76211 (0.0008) +[2023-10-13 03:26:21,925][46663] Updated weights for policy 1, policy_version 76221 (0.0008) +[2023-10-13 03:26:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156172288. Throughput: 0: 1679.9, 1: 1685.8. Samples: 39053600. 
Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:23,608][45375] Avg episode reward: [(0, '61.210'), (1, '55.360')] +[2023-10-13 03:26:24,762][46662] Updated weights for policy 0, policy_version 76290 (0.0008) +[2023-10-13 03:26:25,139][46662] Updated weights for policy 0, policy_version 76300 (0.0009) +[2023-10-13 03:26:25,503][46662] Updated weights for policy 0, policy_version 76310 (0.0008) +[2023-10-13 03:26:25,866][46663] Updated weights for policy 1, policy_version 76231 (0.0007) +[2023-10-13 03:26:25,874][46662] Updated weights for policy 0, policy_version 76320 (0.0007) +[2023-10-13 03:26:26,234][46663] Updated weights for policy 1, policy_version 76241 (0.0011) +[2023-10-13 03:26:26,601][46663] Updated weights for policy 1, policy_version 76251 (0.0009) +[2023-10-13 03:26:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156237824. Throughput: 0: 1658.6, 1: 1670.1. Samples: 39063224. Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:28,607][45375] Avg episode reward: [(0, '59.540'), (1, '54.620')] +[2023-10-13 03:26:29,879][46662] Updated weights for policy 0, policy_version 76330 (0.0007) +[2023-10-13 03:26:30,246][46662] Updated weights for policy 0, policy_version 76340 (0.0007) +[2023-10-13 03:26:30,620][46662] Updated weights for policy 0, policy_version 76350 (0.0008) +[2023-10-13 03:26:30,705][46663] Updated weights for policy 1, policy_version 76261 (0.0007) +[2023-10-13 03:26:31,079][46663] Updated weights for policy 1, policy_version 76271 (0.0010) +[2023-10-13 03:26:31,457][46663] Updated weights for policy 1, policy_version 76281 (0.0011) +[2023-10-13 03:26:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156303360. Throughput: 0: 1676.1, 1: 1671.6. Samples: 39083514. 
Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:33,607][45375] Avg episode reward: [(0, '60.640'), (1, '56.810')] +[2023-10-13 03:26:34,582][46662] Updated weights for policy 0, policy_version 76360 (0.0008) +[2023-10-13 03:26:34,947][46662] Updated weights for policy 0, policy_version 76370 (0.0010) +[2023-10-13 03:26:35,320][46662] Updated weights for policy 0, policy_version 76380 (0.0009) +[2023-10-13 03:26:35,532][46663] Updated weights for policy 1, policy_version 76291 (0.0007) +[2023-10-13 03:26:35,899][46663] Updated weights for policy 1, policy_version 76301 (0.0009) +[2023-10-13 03:26:36,259][46663] Updated weights for policy 1, policy_version 76311 (0.0007) +[2023-10-13 03:26:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156368896. Throughput: 0: 1676.6, 1: 1687.9. Samples: 39104486. Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:38,607][45375] Avg episode reward: [(0, '60.570'), (1, '58.200')] +[2023-10-13 03:26:39,392][46662] Updated weights for policy 0, policy_version 76390 (0.0009) +[2023-10-13 03:26:39,763][46662] Updated weights for policy 0, policy_version 76400 (0.0009) +[2023-10-13 03:26:40,138][46662] Updated weights for policy 0, policy_version 76410 (0.0010) +[2023-10-13 03:26:40,253][46663] Updated weights for policy 1, policy_version 76321 (0.0007) +[2023-10-13 03:26:40,629][46663] Updated weights for policy 1, policy_version 76331 (0.0009) +[2023-10-13 03:26:40,990][46663] Updated weights for policy 1, policy_version 76341 (0.0011) +[2023-10-13 03:26:41,373][46663] Updated weights for policy 1, policy_version 76351 (0.0008) +[2023-10-13 03:26:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156434432. Throughput: 0: 1661.2, 1: 1662.0. Samples: 39113550. 
Policy #0 lag: (min: 5.0, avg: 6.0, max: 27.0) +[2023-10-13 03:26:43,607][45375] Avg episode reward: [(0, '60.950'), (1, '55.910')] +[2023-10-13 03:26:44,160][46662] Updated weights for policy 0, policy_version 76420 (0.0009) +[2023-10-13 03:26:44,532][46662] Updated weights for policy 0, policy_version 76430 (0.0008) +[2023-10-13 03:26:44,900][46662] Updated weights for policy 0, policy_version 76440 (0.0007) +[2023-10-13 03:26:45,460][46663] Updated weights for policy 1, policy_version 76361 (0.0008) +[2023-10-13 03:26:45,823][46663] Updated weights for policy 1, policy_version 76371 (0.0010) +[2023-10-13 03:26:46,197][46663] Updated weights for policy 1, policy_version 76381 (0.0008) +[2023-10-13 03:26:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156499968. Throughput: 0: 1680.4, 1: 1679.6. Samples: 39134158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:26:48,607][45375] Avg episode reward: [(0, '60.300'), (1, '56.130')] +[2023-10-13 03:26:48,845][46662] Updated weights for policy 0, policy_version 76450 (0.0010) +[2023-10-13 03:26:49,218][46662] Updated weights for policy 0, policy_version 76460 (0.0007) +[2023-10-13 03:26:49,596][46662] Updated weights for policy 0, policy_version 76470 (0.0010) +[2023-10-13 03:26:49,960][46662] Updated weights for policy 0, policy_version 76480 (0.0009) +[2023-10-13 03:26:50,364][46663] Updated weights for policy 1, policy_version 76391 (0.0009) +[2023-10-13 03:26:50,737][46663] Updated weights for policy 1, policy_version 76401 (0.0007) +[2023-10-13 03:26:51,100][46663] Updated weights for policy 1, policy_version 76411 (0.0008) +[2023-10-13 03:26:53,607][45375] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 156565504. Throughput: 0: 1687.9, 1: 1684.1. Samples: 39154912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:26:53,608][45375] Avg episode reward: [(0, '60.670'), (1, '55.610')] +[2023-10-13 03:26:53,925][46662] Updated weights for policy 0, policy_version 76490 (0.0007) +[2023-10-13 03:26:54,295][46662] Updated weights for policy 0, policy_version 76500 (0.0007) +[2023-10-13 03:26:54,661][46662] Updated weights for policy 0, policy_version 76510 (0.0008) +[2023-10-13 03:26:55,274][46663] Updated weights for policy 1, policy_version 76421 (0.0009) +[2023-10-13 03:26:55,680][46663] Updated weights for policy 1, policy_version 76431 (0.0009) +[2023-10-13 03:26:56,039][46663] Updated weights for policy 1, policy_version 76441 (0.0009) +[2023-10-13 03:26:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156631040. Throughput: 0: 1687.4, 1: 1654.7. Samples: 39163922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:26:58,607][45375] Avg episode reward: [(0, '60.430'), (1, '55.630')] +[2023-10-13 03:26:58,791][46662] Updated weights for policy 0, policy_version 76520 (0.0008) +[2023-10-13 03:26:59,166][46662] Updated weights for policy 0, policy_version 76530 (0.0009) +[2023-10-13 03:26:59,541][46662] Updated weights for policy 0, policy_version 76540 (0.0009) +[2023-10-13 03:27:00,127][46663] Updated weights for policy 1, policy_version 76451 (0.0010) +[2023-10-13 03:27:00,489][46663] Updated weights for policy 1, policy_version 76461 (0.0008) +[2023-10-13 03:27:00,860][46663] Updated weights for policy 1, policy_version 76471 (0.0009) +[2023-10-13 03:27:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156696576. Throughput: 0: 1691.8, 1: 1678.1. Samples: 39184622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:03,608][45375] Avg episode reward: [(0, '60.090'), (1, '55.960')] +[2023-10-13 03:27:03,679][46662] Updated weights for policy 0, policy_version 76550 (0.0010) +[2023-10-13 03:27:04,056][46662] Updated weights for policy 0, policy_version 76560 (0.0010) +[2023-10-13 03:27:04,418][46662] Updated weights for policy 0, policy_version 76570 (0.0010) +[2023-10-13 03:27:04,770][46663] Updated weights for policy 1, policy_version 76481 (0.0007) +[2023-10-13 03:27:05,136][46663] Updated weights for policy 1, policy_version 76491 (0.0007) +[2023-10-13 03:27:05,507][46663] Updated weights for policy 1, policy_version 76501 (0.0008) +[2023-10-13 03:27:05,883][46663] Updated weights for policy 1, policy_version 76511 (0.0011) +[2023-10-13 03:27:08,443][46662] Updated weights for policy 0, policy_version 76580 (0.0009) +[2023-10-13 03:27:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 156762112. Throughput: 0: 1691.1, 1: 1683.6. Samples: 39205460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:08,607][45375] Avg episode reward: [(0, '59.600'), (1, '54.640')] +[2023-10-13 03:27:08,810][46662] Updated weights for policy 0, policy_version 76590 (0.0009) +[2023-10-13 03:27:09,182][46662] Updated weights for policy 0, policy_version 76600 (0.0009) +[2023-10-13 03:27:09,925][46663] Updated weights for policy 1, policy_version 76521 (0.0008) +[2023-10-13 03:27:10,292][46663] Updated weights for policy 1, policy_version 76531 (0.0007) +[2023-10-13 03:27:10,654][46663] Updated weights for policy 1, policy_version 76541 (0.0008) +[2023-10-13 03:27:13,003][46662] Updated weights for policy 0, policy_version 76610 (0.0008) +[2023-10-13 03:27:13,368][46662] Updated weights for policy 0, policy_version 76620 (0.0010) +[2023-10-13 03:27:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 156827648. 
Throughput: 0: 1693.2, 1: 1671.3. Samples: 39214624. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:13,607][45375] Avg episode reward: [(0, '58.470'), (1, '53.050')] +[2023-10-13 03:27:13,748][46662] Updated weights for policy 0, policy_version 76630 (0.0008) +[2023-10-13 03:27:14,119][46662] Updated weights for policy 0, policy_version 76640 (0.0010) +[2023-10-13 03:27:14,776][46663] Updated weights for policy 1, policy_version 76551 (0.0009) +[2023-10-13 03:27:15,144][46663] Updated weights for policy 1, policy_version 76561 (0.0008) +[2023-10-13 03:27:15,516][46663] Updated weights for policy 1, policy_version 76571 (0.0007) +[2023-10-13 03:27:18,323][46662] Updated weights for policy 0, policy_version 76650 (0.0007) +[2023-10-13 03:27:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 156893184. Throughput: 0: 1687.4, 1: 1682.3. Samples: 39235150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:18,607][45375] Avg episode reward: [(0, '56.570'), (1, '53.260')] +[2023-10-13 03:27:18,686][46662] Updated weights for policy 0, policy_version 76660 (0.0009) +[2023-10-13 03:27:19,057][46662] Updated weights for policy 0, policy_version 76670 (0.0010) +[2023-10-13 03:27:19,615][46663] Updated weights for policy 1, policy_version 76581 (0.0007) +[2023-10-13 03:27:19,976][46663] Updated weights for policy 1, policy_version 76591 (0.0009) +[2023-10-13 03:27:20,347][46663] Updated weights for policy 1, policy_version 76601 (0.0009) +[2023-10-13 03:27:23,204][46662] Updated weights for policy 0, policy_version 76680 (0.0009) +[2023-10-13 03:27:23,591][46662] Updated weights for policy 0, policy_version 76690 (0.0010) +[2023-10-13 03:27:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 156958720. Throughput: 0: 1687.4, 1: 1682.4. Samples: 39256128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:23,608][45375] Avg episode reward: [(0, '54.800'), (1, '54.700')] +[2023-10-13 03:27:23,959][46662] Updated weights for policy 0, policy_version 76700 (0.0010) +[2023-10-13 03:27:24,303][46663] Updated weights for policy 1, policy_version 76611 (0.0007) +[2023-10-13 03:27:24,665][46663] Updated weights for policy 1, policy_version 76621 (0.0008) +[2023-10-13 03:27:25,032][46663] Updated weights for policy 1, policy_version 76631 (0.0008) +[2023-10-13 03:27:27,759][46662] Updated weights for policy 0, policy_version 76710 (0.0008) +[2023-10-13 03:27:28,127][46662] Updated weights for policy 0, policy_version 76720 (0.0008) +[2023-10-13 03:27:28,488][46662] Updated weights for policy 0, policy_version 76730 (0.0009) +[2023-10-13 03:27:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 157024256. Throughput: 0: 1692.5, 1: 1681.2. Samples: 39265366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:28,607][45375] Avg episode reward: [(0, '54.040'), (1, '56.370')] +[2023-10-13 03:27:29,213][46663] Updated weights for policy 1, policy_version 76641 (0.0008) +[2023-10-13 03:27:29,571][46663] Updated weights for policy 1, policy_version 76651 (0.0009) +[2023-10-13 03:27:29,934][46663] Updated weights for policy 1, policy_version 76661 (0.0010) +[2023-10-13 03:27:30,307][46663] Updated weights for policy 1, policy_version 76671 (0.0008) +[2023-10-13 03:27:32,674][46662] Updated weights for policy 0, policy_version 76740 (0.0009) +[2023-10-13 03:27:33,055][46662] Updated weights for policy 0, policy_version 76750 (0.0009) +[2023-10-13 03:27:33,423][46662] Updated weights for policy 0, policy_version 76760 (0.0009) +[2023-10-13 03:27:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 157089792. Throughput: 0: 1691.8, 1: 1689.6. Samples: 39286320. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:27:33,607][45375] Avg episode reward: [(0, '52.870'), (1, '56.880')] +[2023-10-13 03:27:34,258][46663] Updated weights for policy 1, policy_version 76681 (0.0011) +[2023-10-13 03:27:34,627][46663] Updated weights for policy 1, policy_version 76691 (0.0008) +[2023-10-13 03:27:35,005][46663] Updated weights for policy 1, policy_version 76701 (0.0008) +[2023-10-13 03:27:37,505][46662] Updated weights for policy 0, policy_version 76770 (0.0008) +[2023-10-13 03:27:37,881][46662] Updated weights for policy 0, policy_version 76780 (0.0009) +[2023-10-13 03:27:38,248][46662] Updated weights for policy 0, policy_version 76790 (0.0009) +[2023-10-13 03:27:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 157155328. Throughput: 0: 1680.9, 1: 1696.5. Samples: 39306896. Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:27:38,607][45375] Avg episode reward: [(0, '52.860'), (1, '54.570')] +[2023-10-13 03:27:38,618][46662] Updated weights for policy 0, policy_version 76800 (0.0008) +[2023-10-13 03:27:38,988][46663] Updated weights for policy 1, policy_version 76711 (0.0008) +[2023-10-13 03:27:39,353][46663] Updated weights for policy 1, policy_version 76721 (0.0007) +[2023-10-13 03:27:39,725][46663] Updated weights for policy 1, policy_version 76731 (0.0009) +[2023-10-13 03:27:42,768][46662] Updated weights for policy 0, policy_version 76810 (0.0008) +[2023-10-13 03:27:43,142][46662] Updated weights for policy 0, policy_version 76820 (0.0008) +[2023-10-13 03:27:43,512][46662] Updated weights for policy 0, policy_version 76830 (0.0007) +[2023-10-13 03:27:43,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157253632. Throughput: 0: 1685.1, 1: 1699.7. Samples: 39316236. 
Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:27:43,607][45375] Avg episode reward: [(0, '53.160'), (1, '52.550')] +[2023-10-13 03:27:44,019][46663] Updated weights for policy 1, policy_version 76741 (0.0009) +[2023-10-13 03:27:44,410][46663] Updated weights for policy 1, policy_version 76751 (0.0007) +[2023-10-13 03:27:44,772][46663] Updated weights for policy 1, policy_version 76761 (0.0009) +[2023-10-13 03:27:47,670][46662] Updated weights for policy 0, policy_version 76840 (0.0007) +[2023-10-13 03:27:48,039][46662] Updated weights for policy 0, policy_version 76850 (0.0008) +[2023-10-13 03:27:48,409][46662] Updated weights for policy 0, policy_version 76860 (0.0009) +[2023-10-13 03:27:48,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157319168. Throughput: 0: 1687.4, 1: 1693.4. Samples: 39336758. Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:27:48,607][45375] Avg episode reward: [(0, '52.930'), (1, '54.200')] +[2023-10-13 03:27:48,810][46663] Updated weights for policy 1, policy_version 76771 (0.0008) +[2023-10-13 03:27:49,177][46663] Updated weights for policy 1, policy_version 76781 (0.0007) +[2023-10-13 03:27:49,537][46663] Updated weights for policy 1, policy_version 76791 (0.0008) +[2023-10-13 03:27:52,518][46662] Updated weights for policy 0, policy_version 76870 (0.0008) +[2023-10-13 03:27:52,885][46662] Updated weights for policy 0, policy_version 76880 (0.0007) +[2023-10-13 03:27:53,252][46662] Updated weights for policy 0, policy_version 76890 (0.0008) +[2023-10-13 03:27:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 157384704. Throughput: 0: 1674.3, 1: 1690.6. Samples: 39356878. 
Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:27:53,607][45375] Avg episode reward: [(0, '51.900'), (1, '53.840')] +[2023-10-13 03:27:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000076896_78741504.pth... +[2023-10-13 03:27:53,647][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000075328_77135872.pth +[2023-10-13 03:27:53,722][46663] Updated weights for policy 1, policy_version 76801 (0.0010) +[2023-10-13 03:27:54,099][46663] Updated weights for policy 1, policy_version 76811 (0.0008) +[2023-10-13 03:27:54,465][46663] Updated weights for policy 1, policy_version 76821 (0.0008) +[2023-10-13 03:27:54,834][46663] Updated weights for policy 1, policy_version 76831 (0.0007) +[2023-10-13 03:27:54,866][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000076832_78675968.pth... +[2023-10-13 03:27:54,904][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000075232_77037568.pth +[2023-10-13 03:27:57,123][46662] Updated weights for policy 0, policy_version 76900 (0.0008) +[2023-10-13 03:27:57,495][46662] Updated weights for policy 0, policy_version 76910 (0.0008) +[2023-10-13 03:27:57,866][46662] Updated weights for policy 0, policy_version 76920 (0.0009) +[2023-10-13 03:27:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157450240. Throughput: 0: 1687.9, 1: 1686.5. Samples: 39366470. 
Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:27:58,607][45375] Avg episode reward: [(0, '51.210'), (1, '53.510')] +[2023-10-13 03:27:58,978][46663] Updated weights for policy 1, policy_version 76841 (0.0009) +[2023-10-13 03:27:59,358][46663] Updated weights for policy 1, policy_version 76851 (0.0008) +[2023-10-13 03:27:59,721][46663] Updated weights for policy 1, policy_version 76861 (0.0007) +[2023-10-13 03:28:01,914][46662] Updated weights for policy 0, policy_version 76930 (0.0009) +[2023-10-13 03:28:02,286][46662] Updated weights for policy 0, policy_version 76940 (0.0010) +[2023-10-13 03:28:02,653][46662] Updated weights for policy 0, policy_version 76950 (0.0008) +[2023-10-13 03:28:03,020][46662] Updated weights for policy 0, policy_version 76960 (0.0008) +[2023-10-13 03:28:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 157515776. Throughput: 0: 1694.0, 1: 1687.7. Samples: 39387328. Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:28:03,607][45375] Avg episode reward: [(0, '48.850'), (1, '52.630')] +[2023-10-13 03:28:03,693][46663] Updated weights for policy 1, policy_version 76871 (0.0007) +[2023-10-13 03:28:04,062][46663] Updated weights for policy 1, policy_version 76881 (0.0007) +[2023-10-13 03:28:04,425][46663] Updated weights for policy 1, policy_version 76891 (0.0009) +[2023-10-13 03:28:07,095][46662] Updated weights for policy 0, policy_version 76970 (0.0007) +[2023-10-13 03:28:07,460][46662] Updated weights for policy 0, policy_version 76980 (0.0010) +[2023-10-13 03:28:07,825][46662] Updated weights for policy 0, policy_version 76990 (0.0010) +[2023-10-13 03:28:08,556][46663] Updated weights for policy 1, policy_version 76901 (0.0009) +[2023-10-13 03:28:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157581312. Throughput: 0: 1665.7, 1: 1685.4. Samples: 39406930. 
Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:28:08,607][45375] Avg episode reward: [(0, '48.730'), (1, '51.080')] +[2023-10-13 03:28:08,917][46663] Updated weights for policy 1, policy_version 76911 (0.0009) +[2023-10-13 03:28:09,290][46663] Updated weights for policy 1, policy_version 76921 (0.0007) +[2023-10-13 03:28:11,911][46662] Updated weights for policy 0, policy_version 77000 (0.0009) +[2023-10-13 03:28:12,280][46662] Updated weights for policy 0, policy_version 77010 (0.0008) +[2023-10-13 03:28:12,649][46662] Updated weights for policy 0, policy_version 77020 (0.0010) +[2023-10-13 03:28:13,352][46663] Updated weights for policy 1, policy_version 76931 (0.0009) +[2023-10-13 03:28:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157646848. Throughput: 0: 1690.3, 1: 1682.1. Samples: 39417126. Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:28:13,607][45375] Avg episode reward: [(0, '49.400'), (1, '51.050')] +[2023-10-13 03:28:13,734][46663] Updated weights for policy 1, policy_version 76941 (0.0009) +[2023-10-13 03:28:14,098][46663] Updated weights for policy 1, policy_version 76951 (0.0007) +[2023-10-13 03:28:16,633][46662] Updated weights for policy 0, policy_version 77030 (0.0010) +[2023-10-13 03:28:17,005][46662] Updated weights for policy 0, policy_version 77040 (0.0008) +[2023-10-13 03:28:17,375][46662] Updated weights for policy 0, policy_version 77050 (0.0008) +[2023-10-13 03:28:18,189][46663] Updated weights for policy 1, policy_version 76961 (0.0007) +[2023-10-13 03:28:18,557][46663] Updated weights for policy 1, policy_version 76971 (0.0009) +[2023-10-13 03:28:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157712384. Throughput: 0: 1675.8, 1: 1683.5. Samples: 39437490. 
Policy #0 lag: (min: 9.0, avg: 10.4, max: 34.0) +[2023-10-13 03:28:18,607][45375] Avg episode reward: [(0, '49.750'), (1, '52.760')] +[2023-10-13 03:28:18,934][46663] Updated weights for policy 1, policy_version 76981 (0.0010) +[2023-10-13 03:28:19,291][46663] Updated weights for policy 1, policy_version 76991 (0.0010) +[2023-10-13 03:28:21,263][46662] Updated weights for policy 0, policy_version 77060 (0.0008) +[2023-10-13 03:28:21,645][46662] Updated weights for policy 0, policy_version 77070 (0.0009) +[2023-10-13 03:28:22,017][46662] Updated weights for policy 0, policy_version 77080 (0.0009) +[2023-10-13 03:28:23,292][46663] Updated weights for policy 1, policy_version 77001 (0.0009) +[2023-10-13 03:28:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157777920. Throughput: 0: 1669.1, 1: 1670.4. Samples: 39457178. Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:23,608][45375] Avg episode reward: [(0, '50.030'), (1, '52.120')] +[2023-10-13 03:28:23,663][46663] Updated weights for policy 1, policy_version 77011 (0.0010) +[2023-10-13 03:28:24,028][46663] Updated weights for policy 1, policy_version 77021 (0.0010) +[2023-10-13 03:28:26,080][46662] Updated weights for policy 0, policy_version 77090 (0.0008) +[2023-10-13 03:28:26,445][46662] Updated weights for policy 0, policy_version 77100 (0.0007) +[2023-10-13 03:28:26,819][46662] Updated weights for policy 0, policy_version 77110 (0.0008) +[2023-10-13 03:28:27,181][46662] Updated weights for policy 0, policy_version 77120 (0.0010) +[2023-10-13 03:28:28,199][46663] Updated weights for policy 1, policy_version 77031 (0.0008) +[2023-10-13 03:28:28,567][46663] Updated weights for policy 1, policy_version 77041 (0.0008) +[2023-10-13 03:28:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157843456. Throughput: 0: 1694.1, 1: 1675.3. Samples: 39467860. 
Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:28,607][45375] Avg episode reward: [(0, '53.810'), (1, '52.560')] +[2023-10-13 03:28:28,927][46663] Updated weights for policy 1, policy_version 77051 (0.0008) +[2023-10-13 03:28:31,363][46662] Updated weights for policy 0, policy_version 77130 (0.0010) +[2023-10-13 03:28:31,737][46662] Updated weights for policy 0, policy_version 77140 (0.0011) +[2023-10-13 03:28:32,107][46662] Updated weights for policy 0, policy_version 77150 (0.0010) +[2023-10-13 03:28:32,998][46663] Updated weights for policy 1, policy_version 77061 (0.0008) +[2023-10-13 03:28:33,369][46663] Updated weights for policy 1, policy_version 77071 (0.0011) +[2023-10-13 03:28:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157908992. Throughput: 0: 1668.1, 1: 1678.7. Samples: 39487362. Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:33,608][45375] Avg episode reward: [(0, '53.960'), (1, '52.150')] +[2023-10-13 03:28:33,745][46663] Updated weights for policy 1, policy_version 77081 (0.0010) +[2023-10-13 03:28:36,153][46662] Updated weights for policy 0, policy_version 77160 (0.0009) +[2023-10-13 03:28:36,518][46662] Updated weights for policy 0, policy_version 77170 (0.0007) +[2023-10-13 03:28:36,891][46662] Updated weights for policy 0, policy_version 77180 (0.0007) +[2023-10-13 03:28:37,755][46663] Updated weights for policy 1, policy_version 77091 (0.0010) +[2023-10-13 03:28:38,116][46663] Updated weights for policy 1, policy_version 77101 (0.0011) +[2023-10-13 03:28:38,479][46663] Updated weights for policy 1, policy_version 77111 (0.0011) +[2023-10-13 03:28:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157974528. Throughput: 0: 1674.8, 1: 1660.6. Samples: 39506972. 
Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:38,607][45375] Avg episode reward: [(0, '55.580'), (1, '52.320')] +[2023-10-13 03:28:40,899][46662] Updated weights for policy 0, policy_version 77190 (0.0009) +[2023-10-13 03:28:41,264][46662] Updated weights for policy 0, policy_version 77200 (0.0007) +[2023-10-13 03:28:41,633][46662] Updated weights for policy 0, policy_version 77210 (0.0008) +[2023-10-13 03:28:42,475][46663] Updated weights for policy 1, policy_version 77121 (0.0012) +[2023-10-13 03:28:42,837][46663] Updated weights for policy 1, policy_version 77131 (0.0012) +[2023-10-13 03:28:43,211][46663] Updated weights for policy 1, policy_version 77141 (0.0011) +[2023-10-13 03:28:43,585][46663] Updated weights for policy 1, policy_version 77151 (0.0010) +[2023-10-13 03:28:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158040064. Throughput: 0: 1688.8, 1: 1679.6. Samples: 39518048. Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:43,607][45375] Avg episode reward: [(0, '55.300'), (1, '52.720')] +[2023-10-13 03:28:45,729][46662] Updated weights for policy 0, policy_version 77220 (0.0008) +[2023-10-13 03:28:46,094][46662] Updated weights for policy 0, policy_version 77230 (0.0009) +[2023-10-13 03:28:46,464][46662] Updated weights for policy 0, policy_version 77240 (0.0008) +[2023-10-13 03:28:47,690][46663] Updated weights for policy 1, policy_version 77161 (0.0008) +[2023-10-13 03:28:48,066][46663] Updated weights for policy 1, policy_version 77171 (0.0010) +[2023-10-13 03:28:48,422][46663] Updated weights for policy 1, policy_version 77181 (0.0009) +[2023-10-13 03:28:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158138368. Throughput: 0: 1660.5, 1: 1679.4. Samples: 39537624. 
Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:48,607][45375] Avg episode reward: [(0, '57.240'), (1, '55.150')] +[2023-10-13 03:28:50,583][46662] Updated weights for policy 0, policy_version 77250 (0.0008) +[2023-10-13 03:28:50,941][46662] Updated weights for policy 0, policy_version 77260 (0.0008) +[2023-10-13 03:28:51,314][46662] Updated weights for policy 0, policy_version 77270 (0.0010) +[2023-10-13 03:28:51,676][46662] Updated weights for policy 0, policy_version 77280 (0.0007) +[2023-10-13 03:28:52,499][46663] Updated weights for policy 1, policy_version 77191 (0.0009) +[2023-10-13 03:28:52,879][46663] Updated weights for policy 1, policy_version 77201 (0.0009) +[2023-10-13 03:28:53,255][46663] Updated weights for policy 1, policy_version 77211 (0.0008) +[2023-10-13 03:28:53,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158203904. Throughput: 0: 1688.4, 1: 1649.0. Samples: 39557114. Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:53,608][45375] Avg episode reward: [(0, '56.000'), (1, '55.680')] +[2023-10-13 03:28:55,721][46662] Updated weights for policy 0, policy_version 77290 (0.0010) +[2023-10-13 03:28:56,086][46662] Updated weights for policy 0, policy_version 77300 (0.0011) +[2023-10-13 03:28:56,450][46662] Updated weights for policy 0, policy_version 77310 (0.0012) +[2023-10-13 03:28:57,493][46663] Updated weights for policy 1, policy_version 77221 (0.0009) +[2023-10-13 03:28:57,854][46663] Updated weights for policy 1, policy_version 77231 (0.0009) +[2023-10-13 03:28:58,222][46663] Updated weights for policy 1, policy_version 77241 (0.0008) +[2023-10-13 03:28:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158269440. Throughput: 0: 1678.8, 1: 1676.4. Samples: 39568110. 
Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:28:58,607][45375] Avg episode reward: [(0, '55.230'), (1, '55.490')] +[2023-10-13 03:29:00,697][46662] Updated weights for policy 0, policy_version 77320 (0.0010) +[2023-10-13 03:29:01,057][46662] Updated weights for policy 0, policy_version 77330 (0.0008) +[2023-10-13 03:29:01,424][46662] Updated weights for policy 0, policy_version 77340 (0.0009) +[2023-10-13 03:29:02,314][46663] Updated weights for policy 1, policy_version 77251 (0.0008) +[2023-10-13 03:29:02,678][46663] Updated weights for policy 1, policy_version 77261 (0.0008) +[2023-10-13 03:29:03,048][46663] Updated weights for policy 1, policy_version 77271 (0.0008) +[2023-10-13 03:29:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158334976. Throughput: 0: 1668.6, 1: 1672.8. Samples: 39587856. Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:29:03,608][45375] Avg episode reward: [(0, '55.390'), (1, '56.080')] +[2023-10-13 03:29:05,480][46662] Updated weights for policy 0, policy_version 77350 (0.0009) +[2023-10-13 03:29:05,848][46662] Updated weights for policy 0, policy_version 77360 (0.0010) +[2023-10-13 03:29:06,214][46662] Updated weights for policy 0, policy_version 77370 (0.0008) +[2023-10-13 03:29:07,047][46663] Updated weights for policy 1, policy_version 77281 (0.0009) +[2023-10-13 03:29:07,413][46663] Updated weights for policy 1, policy_version 77291 (0.0009) +[2023-10-13 03:29:07,773][46663] Updated weights for policy 1, policy_version 77301 (0.0008) +[2023-10-13 03:29:08,144][46663] Updated weights for policy 1, policy_version 77311 (0.0009) +[2023-10-13 03:29:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158400512. Throughput: 0: 1683.2, 1: 1651.0. Samples: 39607218. 
Policy #0 lag: (min: 17.0, avg: 23.9, max: 49.0) +[2023-10-13 03:29:08,607][45375] Avg episode reward: [(0, '55.870'), (1, '57.510')] +[2023-10-13 03:29:10,267][46662] Updated weights for policy 0, policy_version 77380 (0.0008) +[2023-10-13 03:29:10,640][46662] Updated weights for policy 0, policy_version 77390 (0.0008) +[2023-10-13 03:29:10,996][46662] Updated weights for policy 0, policy_version 77400 (0.0010) +[2023-10-13 03:29:12,311][46663] Updated weights for policy 1, policy_version 77321 (0.0010) +[2023-10-13 03:29:12,675][46663] Updated weights for policy 1, policy_version 77331 (0.0008) +[2023-10-13 03:29:13,043][46663] Updated weights for policy 1, policy_version 77341 (0.0007) +[2023-10-13 03:29:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158466048. Throughput: 0: 1661.5, 1: 1677.1. Samples: 39618096. Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:13,608][45375] Avg episode reward: [(0, '58.430'), (1, '57.640')] +[2023-10-13 03:29:14,984][46662] Updated weights for policy 0, policy_version 77410 (0.0011) +[2023-10-13 03:29:15,363][46662] Updated weights for policy 0, policy_version 77420 (0.0008) +[2023-10-13 03:29:15,734][46662] Updated weights for policy 0, policy_version 77430 (0.0007) +[2023-10-13 03:29:16,107][46662] Updated weights for policy 0, policy_version 77440 (0.0007) +[2023-10-13 03:29:17,037][46663] Updated weights for policy 1, policy_version 77351 (0.0007) +[2023-10-13 03:29:17,400][46663] Updated weights for policy 1, policy_version 77361 (0.0010) +[2023-10-13 03:29:17,763][46663] Updated weights for policy 1, policy_version 77371 (0.0008) +[2023-10-13 03:29:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158531584. Throughput: 0: 1676.1, 1: 1667.3. Samples: 39637818. 
Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:18,607][45375] Avg episode reward: [(0, '57.680'), (1, '58.190')] +[2023-10-13 03:29:20,159][46662] Updated weights for policy 0, policy_version 77450 (0.0009) +[2023-10-13 03:29:20,523][46662] Updated weights for policy 0, policy_version 77460 (0.0009) +[2023-10-13 03:29:20,893][46662] Updated weights for policy 0, policy_version 77470 (0.0008) +[2023-10-13 03:29:21,791][46663] Updated weights for policy 1, policy_version 77381 (0.0009) +[2023-10-13 03:29:22,179][46663] Updated weights for policy 1, policy_version 77391 (0.0010) +[2023-10-13 03:29:22,541][46663] Updated weights for policy 1, policy_version 77401 (0.0011) +[2023-10-13 03:29:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158597120. Throughput: 0: 1689.9, 1: 1666.6. Samples: 39658016. Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:23,607][45375] Avg episode reward: [(0, '56.730'), (1, '58.940')] +[2023-10-13 03:29:24,775][46662] Updated weights for policy 0, policy_version 77480 (0.0009) +[2023-10-13 03:29:25,142][46662] Updated weights for policy 0, policy_version 77490 (0.0009) +[2023-10-13 03:29:25,519][46662] Updated weights for policy 0, policy_version 77500 (0.0008) +[2023-10-13 03:29:26,612][46663] Updated weights for policy 1, policy_version 77411 (0.0010) +[2023-10-13 03:29:26,980][46663] Updated weights for policy 1, policy_version 77421 (0.0008) +[2023-10-13 03:29:27,353][46663] Updated weights for policy 1, policy_version 77431 (0.0009) +[2023-10-13 03:29:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158662656. Throughput: 0: 1661.9, 1: 1681.4. Samples: 39668498. 
Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:28,607][45375] Avg episode reward: [(0, '56.980'), (1, '59.620')] +[2023-10-13 03:29:29,586][46662] Updated weights for policy 0, policy_version 77510 (0.0007) +[2023-10-13 03:29:29,957][46662] Updated weights for policy 0, policy_version 77520 (0.0007) +[2023-10-13 03:29:30,327][46662] Updated weights for policy 0, policy_version 77530 (0.0007) +[2023-10-13 03:29:31,412][46663] Updated weights for policy 1, policy_version 77441 (0.0010) +[2023-10-13 03:29:31,778][46663] Updated weights for policy 1, policy_version 77451 (0.0009) +[2023-10-13 03:29:32,134][46663] Updated weights for policy 1, policy_version 77461 (0.0007) +[2023-10-13 03:29:32,508][46663] Updated weights for policy 1, policy_version 77471 (0.0008) +[2023-10-13 03:29:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158728192. Throughput: 0: 1691.3, 1: 1655.0. Samples: 39688208. Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:33,607][45375] Avg episode reward: [(0, '57.110'), (1, '59.550')] +[2023-10-13 03:29:34,356][46662] Updated weights for policy 0, policy_version 77540 (0.0009) +[2023-10-13 03:29:34,729][46662] Updated weights for policy 0, policy_version 77550 (0.0007) +[2023-10-13 03:29:35,092][46662] Updated weights for policy 0, policy_version 77560 (0.0008) +[2023-10-13 03:29:36,611][46663] Updated weights for policy 1, policy_version 77481 (0.0008) +[2023-10-13 03:29:36,978][46663] Updated weights for policy 1, policy_version 77491 (0.0009) +[2023-10-13 03:29:37,344][46663] Updated weights for policy 1, policy_version 77501 (0.0009) +[2023-10-13 03:29:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158793728. Throughput: 0: 1696.5, 1: 1681.4. Samples: 39709116. 
Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:38,607][45375] Avg episode reward: [(0, '53.910'), (1, '58.020')] +[2023-10-13 03:29:39,121][46662] Updated weights for policy 0, policy_version 77570 (0.0009) +[2023-10-13 03:29:39,490][46662] Updated weights for policy 0, policy_version 77580 (0.0007) +[2023-10-13 03:29:39,865][46662] Updated weights for policy 0, policy_version 77590 (0.0008) +[2023-10-13 03:29:40,244][46662] Updated weights for policy 0, policy_version 77600 (0.0010) +[2023-10-13 03:29:41,514][46663] Updated weights for policy 1, policy_version 77511 (0.0009) +[2023-10-13 03:29:41,881][46663] Updated weights for policy 1, policy_version 77521 (0.0008) +[2023-10-13 03:29:42,244][46663] Updated weights for policy 1, policy_version 77531 (0.0008) +[2023-10-13 03:29:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158859264. Throughput: 0: 1680.6, 1: 1682.0. Samples: 39719426. Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:43,608][45375] Avg episode reward: [(0, '54.400'), (1, '57.850')] +[2023-10-13 03:29:44,159][46662] Updated weights for policy 0, policy_version 77610 (0.0007) +[2023-10-13 03:29:44,530][46662] Updated weights for policy 0, policy_version 77620 (0.0009) +[2023-10-13 03:29:44,899][46662] Updated weights for policy 0, policy_version 77630 (0.0011) +[2023-10-13 03:29:46,278][46663] Updated weights for policy 1, policy_version 77541 (0.0009) +[2023-10-13 03:29:46,647][46663] Updated weights for policy 1, policy_version 77551 (0.0009) +[2023-10-13 03:29:47,015][46663] Updated weights for policy 1, policy_version 77561 (0.0010) +[2023-10-13 03:29:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158924800. Throughput: 0: 1703.3, 1: 1654.8. Samples: 39738974. 
Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:48,607][45375] Avg episode reward: [(0, '54.460'), (1, '57.310')] +[2023-10-13 03:29:49,132][46662] Updated weights for policy 0, policy_version 77640 (0.0009) +[2023-10-13 03:29:49,488][46662] Updated weights for policy 0, policy_version 77650 (0.0007) +[2023-10-13 03:29:49,863][46662] Updated weights for policy 0, policy_version 77660 (0.0009) +[2023-10-13 03:29:51,136][46663] Updated weights for policy 1, policy_version 77571 (0.0008) +[2023-10-13 03:29:51,509][46663] Updated weights for policy 1, policy_version 77581 (0.0007) +[2023-10-13 03:29:51,881][46663] Updated weights for policy 1, policy_version 77591 (0.0007) +[2023-10-13 03:29:53,607][45375] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 158990336. Throughput: 0: 1700.6, 1: 1681.8. Samples: 39759424. Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:53,608][45375] Avg episode reward: [(0, '53.530'), (1, '56.630')] +[2023-10-13 03:29:53,619][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000077600_79462400.pth... +[2023-10-13 03:29:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000076032_77856768.pth +[2023-10-13 03:29:53,821][46662] Updated weights for policy 0, policy_version 77670 (0.0008) +[2023-10-13 03:29:54,190][46662] Updated weights for policy 0, policy_version 77680 (0.0007) +[2023-10-13 03:29:54,559][46662] Updated weights for policy 0, policy_version 77690 (0.0008) +[2023-10-13 03:29:54,771][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000077696_79560704.pth... 
+[2023-10-13 03:29:54,804][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000076096_77922304.pth +[2023-10-13 03:29:56,072][46663] Updated weights for policy 1, policy_version 77601 (0.0008) +[2023-10-13 03:29:56,437][46663] Updated weights for policy 1, policy_version 77611 (0.0007) +[2023-10-13 03:29:56,813][46663] Updated weights for policy 1, policy_version 77621 (0.0008) +[2023-10-13 03:29:57,173][46663] Updated weights for policy 1, policy_version 77631 (0.0009) +[2023-10-13 03:29:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159055872. Throughput: 0: 1693.5, 1: 1674.1. Samples: 39769638. Policy #0 lag: (min: 25.0, avg: 37.8, max: 57.0) +[2023-10-13 03:29:58,607][45375] Avg episode reward: [(0, '53.720'), (1, '58.080')] +[2023-10-13 03:29:58,624][46662] Updated weights for policy 0, policy_version 77700 (0.0010) +[2023-10-13 03:29:59,005][46662] Updated weights for policy 0, policy_version 77710 (0.0010) +[2023-10-13 03:29:59,377][46662] Updated weights for policy 0, policy_version 77720 (0.0007) +[2023-10-13 03:30:01,406][46663] Updated weights for policy 1, policy_version 77641 (0.0008) +[2023-10-13 03:30:01,772][46663] Updated weights for policy 1, policy_version 77651 (0.0008) +[2023-10-13 03:30:02,131][46663] Updated weights for policy 1, policy_version 77661 (0.0008) +[2023-10-13 03:30:03,378][46662] Updated weights for policy 0, policy_version 77730 (0.0011) +[2023-10-13 03:30:03,607][45375] Fps is (10 sec: 13107.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159121408. Throughput: 0: 1701.6, 1: 1658.8. Samples: 39789040. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:03,607][45375] Avg episode reward: [(0, '54.010'), (1, '56.940')] +[2023-10-13 03:30:03,743][46662] Updated weights for policy 0, policy_version 77740 (0.0010) +[2023-10-13 03:30:04,108][46662] Updated weights for policy 0, policy_version 77750 (0.0007) +[2023-10-13 03:30:04,473][46662] Updated weights for policy 0, policy_version 77760 (0.0008) +[2023-10-13 03:30:06,340][46663] Updated weights for policy 1, policy_version 77671 (0.0010) +[2023-10-13 03:30:06,713][46663] Updated weights for policy 1, policy_version 77681 (0.0011) +[2023-10-13 03:30:07,087][46663] Updated weights for policy 1, policy_version 77691 (0.0011) +[2023-10-13 03:30:08,606][46662] Updated weights for policy 0, policy_version 77770 (0.0008) +[2023-10-13 03:30:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159186944. Throughput: 0: 1693.1, 1: 1674.1. Samples: 39809540. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:08,607][45375] Avg episode reward: [(0, '54.090'), (1, '58.400')] +[2023-10-13 03:30:08,984][46662] Updated weights for policy 0, policy_version 77780 (0.0008) +[2023-10-13 03:30:09,354][46662] Updated weights for policy 0, policy_version 77790 (0.0007) +[2023-10-13 03:30:11,131][46663] Updated weights for policy 1, policy_version 77701 (0.0009) +[2023-10-13 03:30:11,523][46663] Updated weights for policy 1, policy_version 77711 (0.0010) +[2023-10-13 03:30:11,884][46663] Updated weights for policy 1, policy_version 77721 (0.0010) +[2023-10-13 03:30:13,325][46662] Updated weights for policy 0, policy_version 77800 (0.0008) +[2023-10-13 03:30:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159252480. Throughput: 0: 1693.2, 1: 1662.3. Samples: 39819498. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:13,607][45375] Avg episode reward: [(0, '53.170'), (1, '56.130')] +[2023-10-13 03:30:13,703][46662] Updated weights for policy 0, policy_version 77810 (0.0008) +[2023-10-13 03:30:14,084][46662] Updated weights for policy 0, policy_version 77820 (0.0008) +[2023-10-13 03:30:15,823][46663] Updated weights for policy 1, policy_version 77731 (0.0009) +[2023-10-13 03:30:16,199][46663] Updated weights for policy 1, policy_version 77741 (0.0007) +[2023-10-13 03:30:16,564][46663] Updated weights for policy 1, policy_version 77751 (0.0007) +[2023-10-13 03:30:18,109][46662] Updated weights for policy 0, policy_version 77830 (0.0009) +[2023-10-13 03:30:18,478][46662] Updated weights for policy 0, policy_version 77840 (0.0009) +[2023-10-13 03:30:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159318016. Throughput: 0: 1691.4, 1: 1667.7. Samples: 39839366. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:18,607][45375] Avg episode reward: [(0, '52.600'), (1, '55.790')] +[2023-10-13 03:30:18,840][46662] Updated weights for policy 0, policy_version 77850 (0.0009) +[2023-10-13 03:30:20,722][46663] Updated weights for policy 1, policy_version 77761 (0.0007) +[2023-10-13 03:30:21,088][46663] Updated weights for policy 1, policy_version 77771 (0.0008) +[2023-10-13 03:30:21,451][46663] Updated weights for policy 1, policy_version 77781 (0.0007) +[2023-10-13 03:30:21,830][46663] Updated weights for policy 1, policy_version 77791 (0.0008) +[2023-10-13 03:30:22,761][46662] Updated weights for policy 0, policy_version 77860 (0.0008) +[2023-10-13 03:30:23,130][46662] Updated weights for policy 0, policy_version 77870 (0.0008) +[2023-10-13 03:30:23,506][46662] Updated weights for policy 0, policy_version 77880 (0.0008) +[2023-10-13 03:30:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159383552. 
Throughput: 0: 1682.4, 1: 1671.6. Samples: 39860042. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:23,607][45375] Avg episode reward: [(0, '52.520'), (1, '56.220')] +[2023-10-13 03:30:25,778][46663] Updated weights for policy 1, policy_version 77801 (0.0008) +[2023-10-13 03:30:26,140][46663] Updated weights for policy 1, policy_version 77811 (0.0010) +[2023-10-13 03:30:26,503][46663] Updated weights for policy 1, policy_version 77821 (0.0008) +[2023-10-13 03:30:27,766][46662] Updated weights for policy 0, policy_version 77890 (0.0008) +[2023-10-13 03:30:28,134][46662] Updated weights for policy 0, policy_version 77900 (0.0011) +[2023-10-13 03:30:28,504][46662] Updated weights for policy 0, policy_version 77910 (0.0012) +[2023-10-13 03:30:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 159449088. Throughput: 0: 1684.8, 1: 1652.9. Samples: 39869622. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:28,607][45375] Avg episode reward: [(0, '53.200'), (1, '55.850')] +[2023-10-13 03:30:28,876][46662] Updated weights for policy 0, policy_version 77920 (0.0010) +[2023-10-13 03:30:30,656][46663] Updated weights for policy 1, policy_version 77831 (0.0010) +[2023-10-13 03:30:31,021][46663] Updated weights for policy 1, policy_version 77841 (0.0008) +[2023-10-13 03:30:31,386][46663] Updated weights for policy 1, policy_version 77851 (0.0010) +[2023-10-13 03:30:33,116][46662] Updated weights for policy 0, policy_version 77930 (0.0009) +[2023-10-13 03:30:33,486][46662] Updated weights for policy 0, policy_version 77940 (0.0009) +[2023-10-13 03:30:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 159514624. Throughput: 0: 1683.6, 1: 1670.4. Samples: 39889906. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:33,607][45375] Avg episode reward: [(0, '54.560'), (1, '55.740')] +[2023-10-13 03:30:33,861][46662] Updated weights for policy 0, policy_version 77950 (0.0007) +[2023-10-13 03:30:35,433][46663] Updated weights for policy 1, policy_version 77861 (0.0010) +[2023-10-13 03:30:35,812][46663] Updated weights for policy 1, policy_version 77871 (0.0008) +[2023-10-13 03:30:36,179][46663] Updated weights for policy 1, policy_version 77881 (0.0009) +[2023-10-13 03:30:38,003][46662] Updated weights for policy 0, policy_version 77960 (0.0009) +[2023-10-13 03:30:38,367][46662] Updated weights for policy 0, policy_version 77970 (0.0007) +[2023-10-13 03:30:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 159580160. Throughput: 0: 1680.4, 1: 1674.4. Samples: 39910388. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:38,607][45375] Avg episode reward: [(0, '53.400'), (1, '57.280')] +[2023-10-13 03:30:38,745][46662] Updated weights for policy 0, policy_version 77980 (0.0010) +[2023-10-13 03:30:40,225][46663] Updated weights for policy 1, policy_version 77891 (0.0010) +[2023-10-13 03:30:40,591][46663] Updated weights for policy 1, policy_version 77901 (0.0008) +[2023-10-13 03:30:40,960][46663] Updated weights for policy 1, policy_version 77911 (0.0008) +[2023-10-13 03:30:42,900][46662] Updated weights for policy 0, policy_version 77990 (0.0009) +[2023-10-13 03:30:43,266][46662] Updated weights for policy 0, policy_version 78000 (0.0008) +[2023-10-13 03:30:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 159645696. Throughput: 0: 1680.3, 1: 1653.7. Samples: 39919668. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:43,607][45375] Avg episode reward: [(0, '53.040'), (1, '58.440')] +[2023-10-13 03:30:43,633][46662] Updated weights for policy 0, policy_version 78010 (0.0007) +[2023-10-13 03:30:44,953][46663] Updated weights for policy 1, policy_version 77921 (0.0010) +[2023-10-13 03:30:45,323][46663] Updated weights for policy 1, policy_version 77931 (0.0009) +[2023-10-13 03:30:45,686][46663] Updated weights for policy 1, policy_version 77941 (0.0008) +[2023-10-13 03:30:46,062][46663] Updated weights for policy 1, policy_version 77951 (0.0007) +[2023-10-13 03:30:47,692][46662] Updated weights for policy 0, policy_version 78020 (0.0008) +[2023-10-13 03:30:48,069][46662] Updated weights for policy 0, policy_version 78030 (0.0009) +[2023-10-13 03:30:48,435][46662] Updated weights for policy 0, policy_version 78040 (0.0008) +[2023-10-13 03:30:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 159711232. Throughput: 0: 1682.4, 1: 1689.5. Samples: 39940774. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-13 03:30:48,607][45375] Avg episode reward: [(0, '52.850'), (1, '59.230')] +[2023-10-13 03:30:50,136][46663] Updated weights for policy 1, policy_version 77961 (0.0009) +[2023-10-13 03:30:50,504][46663] Updated weights for policy 1, policy_version 77971 (0.0010) +[2023-10-13 03:30:50,874][46663] Updated weights for policy 1, policy_version 77981 (0.0011) +[2023-10-13 03:30:52,469][46662] Updated weights for policy 0, policy_version 78050 (0.0009) +[2023-10-13 03:30:52,844][46662] Updated weights for policy 0, policy_version 78060 (0.0009) +[2023-10-13 03:30:53,216][46662] Updated weights for policy 0, policy_version 78070 (0.0009) +[2023-10-13 03:30:53,594][46662] Updated weights for policy 0, policy_version 78080 (0.0007) +[2023-10-13 03:30:53,607][45375] Fps is (10 sec: 16383.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 159809536. 
Throughput: 0: 1671.9, 1: 1688.4. Samples: 39960756. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:30:53,608][45375] Avg episode reward: [(0, '52.890'), (1, '58.780')] +[2023-10-13 03:30:54,898][46663] Updated weights for policy 1, policy_version 77991 (0.0009) +[2023-10-13 03:30:55,260][46663] Updated weights for policy 1, policy_version 78001 (0.0008) +[2023-10-13 03:30:55,619][46663] Updated weights for policy 1, policy_version 78011 (0.0010) +[2023-10-13 03:30:57,696][46662] Updated weights for policy 0, policy_version 78090 (0.0007) +[2023-10-13 03:30:58,071][46662] Updated weights for policy 0, policy_version 78100 (0.0009) +[2023-10-13 03:30:58,432][46662] Updated weights for policy 0, policy_version 78110 (0.0008) +[2023-10-13 03:30:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 159875072. Throughput: 0: 1680.1, 1: 1665.8. Samples: 39970064. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:30:58,607][45375] Avg episode reward: [(0, '52.870'), (1, '58.320')] +[2023-10-13 03:30:59,943][46663] Updated weights for policy 1, policy_version 78021 (0.0008) +[2023-10-13 03:31:00,302][46663] Updated weights for policy 1, policy_version 78031 (0.0008) +[2023-10-13 03:31:00,666][46663] Updated weights for policy 1, policy_version 78041 (0.0007) +[2023-10-13 03:31:02,447][46662] Updated weights for policy 0, policy_version 78120 (0.0009) +[2023-10-13 03:31:02,829][46662] Updated weights for policy 0, policy_version 78130 (0.0009) +[2023-10-13 03:31:03,199][46662] Updated weights for policy 0, policy_version 78140 (0.0007) +[2023-10-13 03:31:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159940608. Throughput: 0: 1680.1, 1: 1687.7. Samples: 39990918. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:03,608][45375] Avg episode reward: [(0, '52.870'), (1, '59.210')] +[2023-10-13 03:31:04,548][46663] Updated weights for policy 1, policy_version 78051 (0.0008) +[2023-10-13 03:31:04,950][46663] Updated weights for policy 1, policy_version 78061 (0.0008) +[2023-10-13 03:31:05,314][46663] Updated weights for policy 1, policy_version 78071 (0.0011) +[2023-10-13 03:31:07,082][46662] Updated weights for policy 0, policy_version 78150 (0.0008) +[2023-10-13 03:31:07,446][46662] Updated weights for policy 0, policy_version 78160 (0.0010) +[2023-10-13 03:31:07,823][46662] Updated weights for policy 0, policy_version 78170 (0.0011) +[2023-10-13 03:31:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160006144. Throughput: 0: 1658.4, 1: 1686.8. Samples: 40010578. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:08,607][45375] Avg episode reward: [(0, '53.120'), (1, '59.580')] +[2023-10-13 03:31:09,377][46663] Updated weights for policy 1, policy_version 78081 (0.0011) +[2023-10-13 03:31:09,743][46663] Updated weights for policy 1, policy_version 78091 (0.0007) +[2023-10-13 03:31:10,115][46663] Updated weights for policy 1, policy_version 78101 (0.0008) +[2023-10-13 03:31:10,479][46663] Updated weights for policy 1, policy_version 78111 (0.0007) +[2023-10-13 03:31:11,809][46662] Updated weights for policy 0, policy_version 78180 (0.0009) +[2023-10-13 03:31:12,189][46662] Updated weights for policy 0, policy_version 78190 (0.0009) +[2023-10-13 03:31:12,551][46662] Updated weights for policy 0, policy_version 78200 (0.0011) +[2023-10-13 03:31:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160071680. Throughput: 0: 1677.6, 1: 1679.9. Samples: 40020708. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:13,607][45375] Avg episode reward: [(0, '53.700'), (1, '59.180')] +[2023-10-13 03:31:14,524][46663] Updated weights for policy 1, policy_version 78121 (0.0008) +[2023-10-13 03:31:14,892][46663] Updated weights for policy 1, policy_version 78131 (0.0008) +[2023-10-13 03:31:15,261][46663] Updated weights for policy 1, policy_version 78141 (0.0009) +[2023-10-13 03:31:16,407][46662] Updated weights for policy 0, policy_version 78210 (0.0010) +[2023-10-13 03:31:16,773][46662] Updated weights for policy 0, policy_version 78220 (0.0010) +[2023-10-13 03:31:17,140][46662] Updated weights for policy 0, policy_version 78230 (0.0008) +[2023-10-13 03:31:17,504][46662] Updated weights for policy 0, policy_version 78240 (0.0008) +[2023-10-13 03:31:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160137216. Throughput: 0: 1672.4, 1: 1693.8. Samples: 40041386. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:18,607][45375] Avg episode reward: [(0, '54.000'), (1, '59.780')] +[2023-10-13 03:31:19,336][46663] Updated weights for policy 1, policy_version 78151 (0.0007) +[2023-10-13 03:31:19,720][46663] Updated weights for policy 1, policy_version 78161 (0.0011) +[2023-10-13 03:31:20,087][46663] Updated weights for policy 1, policy_version 78171 (0.0008) +[2023-10-13 03:31:21,569][46662] Updated weights for policy 0, policy_version 78250 (0.0010) +[2023-10-13 03:31:21,947][46662] Updated weights for policy 0, policy_version 78260 (0.0009) +[2023-10-13 03:31:22,309][46662] Updated weights for policy 0, policy_version 78270 (0.0009) +[2023-10-13 03:31:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160202752. Throughput: 0: 1660.0, 1: 1695.3. Samples: 40061374. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:23,607][45375] Avg episode reward: [(0, '55.340'), (1, '59.480')] +[2023-10-13 03:31:24,042][46663] Updated weights for policy 1, policy_version 78181 (0.0007) +[2023-10-13 03:31:24,403][46663] Updated weights for policy 1, policy_version 78191 (0.0009) +[2023-10-13 03:31:24,774][46663] Updated weights for policy 1, policy_version 78201 (0.0007) +[2023-10-13 03:31:26,314][46662] Updated weights for policy 0, policy_version 78280 (0.0008) +[2023-10-13 03:31:26,673][46662] Updated weights for policy 0, policy_version 78290 (0.0007) +[2023-10-13 03:31:27,043][46662] Updated weights for policy 0, policy_version 78300 (0.0007) +[2023-10-13 03:31:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160268288. Throughput: 0: 1691.9, 1: 1697.7. Samples: 40072200. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:28,607][45375] Avg episode reward: [(0, '56.770'), (1, '59.420')] +[2023-10-13 03:31:28,740][46663] Updated weights for policy 1, policy_version 78211 (0.0009) +[2023-10-13 03:31:29,101][46663] Updated weights for policy 1, policy_version 78221 (0.0010) +[2023-10-13 03:31:29,462][46663] Updated weights for policy 1, policy_version 78231 (0.0009) +[2023-10-13 03:31:31,099][46662] Updated weights for policy 0, policy_version 78310 (0.0009) +[2023-10-13 03:31:31,465][46662] Updated weights for policy 0, policy_version 78320 (0.0010) +[2023-10-13 03:31:31,844][46662] Updated weights for policy 0, policy_version 78330 (0.0008) +[2023-10-13 03:31:33,509][46663] Updated weights for policy 1, policy_version 78241 (0.0007) +[2023-10-13 03:31:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160333824. Throughput: 0: 1669.5, 1: 1691.2. Samples: 40092004. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:33,607][45375] Avg episode reward: [(0, '58.690'), (1, '59.540')] +[2023-10-13 03:31:33,880][46663] Updated weights for policy 1, policy_version 78251 (0.0008) +[2023-10-13 03:31:34,241][46663] Updated weights for policy 1, policy_version 78261 (0.0010) +[2023-10-13 03:31:34,610][46663] Updated weights for policy 1, policy_version 78271 (0.0007) +[2023-10-13 03:31:35,837][46662] Updated weights for policy 0, policy_version 78340 (0.0008) +[2023-10-13 03:31:36,207][46662] Updated weights for policy 0, policy_version 78350 (0.0009) +[2023-10-13 03:31:36,581][46662] Updated weights for policy 0, policy_version 78360 (0.0007) +[2023-10-13 03:31:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 160399360. Throughput: 0: 1674.9, 1: 1693.4. Samples: 40112330. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:31:38,607][45375] Avg episode reward: [(0, '58.970'), (1, '58.790')] +[2023-10-13 03:31:38,649][46663] Updated weights for policy 1, policy_version 78281 (0.0008) +[2023-10-13 03:31:39,029][46663] Updated weights for policy 1, policy_version 78291 (0.0008) +[2023-10-13 03:31:39,397][46663] Updated weights for policy 1, policy_version 78301 (0.0007) +[2023-10-13 03:31:40,713][46662] Updated weights for policy 0, policy_version 78370 (0.0010) +[2023-10-13 03:31:41,089][46662] Updated weights for policy 0, policy_version 78380 (0.0010) +[2023-10-13 03:31:41,462][46662] Updated weights for policy 0, policy_version 78390 (0.0007) +[2023-10-13 03:31:41,832][46662] Updated weights for policy 0, policy_version 78400 (0.0008) +[2023-10-13 03:31:43,361][46663] Updated weights for policy 1, policy_version 78311 (0.0009) +[2023-10-13 03:31:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160464896. Throughput: 0: 1690.6, 1: 1697.5. Samples: 40122528. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:31:43,607][45375] Avg episode reward: [(0, '59.220'), (1, '56.080')] +[2023-10-13 03:31:43,733][46663] Updated weights for policy 1, policy_version 78321 (0.0008) +[2023-10-13 03:31:44,099][46663] Updated weights for policy 1, policy_version 78331 (0.0008) +[2023-10-13 03:31:45,827][46662] Updated weights for policy 0, policy_version 78410 (0.0007) +[2023-10-13 03:31:46,205][46662] Updated weights for policy 0, policy_version 78420 (0.0010) +[2023-10-13 03:31:46,584][46662] Updated weights for policy 0, policy_version 78430 (0.0010) +[2023-10-13 03:31:48,078][46663] Updated weights for policy 1, policy_version 78341 (0.0009) +[2023-10-13 03:31:48,452][46663] Updated weights for policy 1, policy_version 78351 (0.0008) +[2023-10-13 03:31:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 160530432. Throughput: 0: 1667.6, 1: 1699.2. Samples: 40142422. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:31:48,607][45375] Avg episode reward: [(0, '56.670'), (1, '54.870')] +[2023-10-13 03:31:48,823][46663] Updated weights for policy 1, policy_version 78361 (0.0010) +[2023-10-13 03:31:50,856][46662] Updated weights for policy 0, policy_version 78440 (0.0008) +[2023-10-13 03:31:51,231][46662] Updated weights for policy 0, policy_version 78450 (0.0008) +[2023-10-13 03:31:51,606][46662] Updated weights for policy 0, policy_version 78460 (0.0007) +[2023-10-13 03:31:52,937][46663] Updated weights for policy 1, policy_version 78371 (0.0011) +[2023-10-13 03:31:53,312][46663] Updated weights for policy 1, policy_version 78381 (0.0008) +[2023-10-13 03:31:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160595968. Throughput: 0: 1686.9, 1: 1686.0. Samples: 40162356. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:31:53,607][45375] Avg episode reward: [(0, '57.310'), (1, '56.250')] +[2023-10-13 03:31:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000078464_80347136.pth... +[2023-10-13 03:31:53,650][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000076896_78741504.pth +[2023-10-13 03:31:53,654][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000078464_80347136.pth +[2023-10-13 03:31:53,680][46663] Updated weights for policy 1, policy_version 78391 (0.0008) +[2023-10-13 03:31:54,007][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000078400_80281600.pth... +[2023-10-13 03:31:54,045][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000076832_78675968.pth +[2023-10-13 03:31:54,050][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000078400_80281600.pth +[2023-10-13 03:31:55,719][46662] Updated weights for policy 0, policy_version 78470 (0.0009) +[2023-10-13 03:31:56,096][46662] Updated weights for policy 0, policy_version 78480 (0.0008) +[2023-10-13 03:31:56,460][46662] Updated weights for policy 0, policy_version 78490 (0.0007) +[2023-10-13 03:31:57,763][46663] Updated weights for policy 1, policy_version 78401 (0.0007) +[2023-10-13 03:31:58,129][46663] Updated weights for policy 1, policy_version 78411 (0.0010) +[2023-10-13 03:31:58,492][46663] Updated weights for policy 1, policy_version 78421 (0.0010) +[2023-10-13 03:31:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160661504. Throughput: 0: 1688.7, 1: 1698.0. Samples: 40173112. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:31:58,607][45375] Avg episode reward: [(0, '58.340'), (1, '55.920')] +[2023-10-13 03:31:58,861][46663] Updated weights for policy 1, policy_version 78431 (0.0010) +[2023-10-13 03:32:00,650][46662] Updated weights for policy 0, policy_version 78500 (0.0010) +[2023-10-13 03:32:01,018][46662] Updated weights for policy 0, policy_version 78510 (0.0009) +[2023-10-13 03:32:01,393][46662] Updated weights for policy 0, policy_version 78520 (0.0007) +[2023-10-13 03:32:03,025][46663] Updated weights for policy 1, policy_version 78441 (0.0009) +[2023-10-13 03:32:03,389][46663] Updated weights for policy 1, policy_version 78451 (0.0007) +[2023-10-13 03:32:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160727040. Throughput: 0: 1669.6, 1: 1693.6. Samples: 40192734. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:32:03,608][45375] Avg episode reward: [(0, '60.110'), (1, '58.530')] +[2023-10-13 03:32:03,760][46663] Updated weights for policy 1, policy_version 78461 (0.0007) +[2023-10-13 03:32:05,480][46662] Updated weights for policy 0, policy_version 78530 (0.0008) +[2023-10-13 03:32:05,859][46662] Updated weights for policy 0, policy_version 78540 (0.0010) +[2023-10-13 03:32:06,222][46662] Updated weights for policy 0, policy_version 78550 (0.0010) +[2023-10-13 03:32:06,597][46662] Updated weights for policy 0, policy_version 78560 (0.0011) +[2023-10-13 03:32:07,787][46663] Updated weights for policy 1, policy_version 78471 (0.0009) +[2023-10-13 03:32:08,160][46663] Updated weights for policy 1, policy_version 78481 (0.0008) +[2023-10-13 03:32:08,525][46663] Updated weights for policy 1, policy_version 78491 (0.0009) +[2023-10-13 03:32:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160792576. Throughput: 0: 1688.3, 1: 1671.3. Samples: 40212554. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:32:08,607][45375] Avg episode reward: [(0, '62.090'), (1, '57.030')] +[2023-10-13 03:32:10,531][46662] Updated weights for policy 0, policy_version 78570 (0.0008) +[2023-10-13 03:32:10,903][46662] Updated weights for policy 0, policy_version 78580 (0.0008) +[2023-10-13 03:32:11,265][46662] Updated weights for policy 0, policy_version 78590 (0.0010) +[2023-10-13 03:32:12,663][46663] Updated weights for policy 1, policy_version 78501 (0.0009) +[2023-10-13 03:32:13,037][46663] Updated weights for policy 1, policy_version 78511 (0.0012) +[2023-10-13 03:32:13,405][46663] Updated weights for policy 1, policy_version 78521 (0.0007) +[2023-10-13 03:32:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160858112. Throughput: 0: 1664.2, 1: 1687.4. Samples: 40223020. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:32:13,607][45375] Avg episode reward: [(0, '61.800'), (1, '56.090')] +[2023-10-13 03:32:15,361][46662] Updated weights for policy 0, policy_version 78600 (0.0011) +[2023-10-13 03:32:15,733][46662] Updated weights for policy 0, policy_version 78610 (0.0008) +[2023-10-13 03:32:16,095][46662] Updated weights for policy 0, policy_version 78620 (0.0007) +[2023-10-13 03:32:17,541][46663] Updated weights for policy 1, policy_version 78531 (0.0008) +[2023-10-13 03:32:17,915][46663] Updated weights for policy 1, policy_version 78541 (0.0009) +[2023-10-13 03:32:18,286][46663] Updated weights for policy 1, policy_version 78551 (0.0008) +[2023-10-13 03:32:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160923648. Throughput: 0: 1671.9, 1: 1687.7. Samples: 40243188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:32:18,608][45375] Avg episode reward: [(0, '63.430'), (1, '56.390')] +[2023-10-13 03:32:20,196][46662] Updated weights for policy 0, policy_version 78630 (0.0009) +[2023-10-13 03:32:20,571][46662] Updated weights for policy 0, policy_version 78640 (0.0010) +[2023-10-13 03:32:20,929][46662] Updated weights for policy 0, policy_version 78650 (0.0011) +[2023-10-13 03:32:22,179][46663] Updated weights for policy 1, policy_version 78561 (0.0011) +[2023-10-13 03:32:22,544][46663] Updated weights for policy 1, policy_version 78571 (0.0009) +[2023-10-13 03:32:22,904][46663] Updated weights for policy 1, policy_version 78581 (0.0007) +[2023-10-13 03:32:23,279][46663] Updated weights for policy 1, policy_version 78591 (0.0010) +[2023-10-13 03:32:23,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161021952. Throughput: 0: 1679.2, 1: 1659.6. Samples: 40262574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:32:23,608][45375] Avg episode reward: [(0, '63.630'), (1, '57.060')] +[2023-10-13 03:32:24,935][46662] Updated weights for policy 0, policy_version 78660 (0.0009) +[2023-10-13 03:32:25,299][46662] Updated weights for policy 0, policy_version 78670 (0.0008) +[2023-10-13 03:32:25,670][46662] Updated weights for policy 0, policy_version 78680 (0.0009) +[2023-10-13 03:32:27,386][46663] Updated weights for policy 1, policy_version 78601 (0.0009) +[2023-10-13 03:32:27,761][46663] Updated weights for policy 1, policy_version 78611 (0.0009) +[2023-10-13 03:32:28,133][46663] Updated weights for policy 1, policy_version 78621 (0.0007) +[2023-10-13 03:32:28,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161087488. Throughput: 0: 1662.8, 1: 1688.5. Samples: 40273338. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:32:28,607][45375] Avg episode reward: [(0, '64.740'), (1, '56.690')] +[2023-10-13 03:32:28,608][46091] Saving new best policy, reward=64.740! +[2023-10-13 03:32:29,549][46662] Updated weights for policy 0, policy_version 78690 (0.0008) +[2023-10-13 03:32:29,910][46662] Updated weights for policy 0, policy_version 78700 (0.0010) +[2023-10-13 03:32:30,288][46662] Updated weights for policy 0, policy_version 78710 (0.0008) +[2023-10-13 03:32:30,653][46662] Updated weights for policy 0, policy_version 78720 (0.0009) +[2023-10-13 03:32:32,223][46663] Updated weights for policy 1, policy_version 78631 (0.0009) +[2023-10-13 03:32:32,588][46663] Updated weights for policy 1, policy_version 78641 (0.0008) +[2023-10-13 03:32:32,952][46663] Updated weights for policy 1, policy_version 78651 (0.0010) +[2023-10-13 03:32:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161153024. Throughput: 0: 1681.2, 1: 1671.4. Samples: 40293288. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:32:33,608][45375] Avg episode reward: [(0, '65.160'), (1, '57.920')] +[2023-10-13 03:32:33,609][46091] Saving new best policy, reward=65.160! +[2023-10-13 03:32:34,796][46662] Updated weights for policy 0, policy_version 78730 (0.0011) +[2023-10-13 03:32:35,167][46662] Updated weights for policy 0, policy_version 78740 (0.0010) +[2023-10-13 03:32:35,533][46662] Updated weights for policy 0, policy_version 78750 (0.0009) +[2023-10-13 03:32:36,793][46663] Updated weights for policy 1, policy_version 78661 (0.0007) +[2023-10-13 03:32:37,156][46663] Updated weights for policy 1, policy_version 78671 (0.0009) +[2023-10-13 03:32:37,529][46663] Updated weights for policy 1, policy_version 78681 (0.0008) +[2023-10-13 03:32:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161218560. Throughput: 0: 1687.0, 1: 1671.4. Samples: 40313484. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:32:38,607][45375] Avg episode reward: [(0, '66.290'), (1, '60.130')] +[2023-10-13 03:32:38,613][46091] Saving new best policy, reward=66.290! +[2023-10-13 03:32:39,808][46662] Updated weights for policy 0, policy_version 78760 (0.0007) +[2023-10-13 03:32:40,192][46662] Updated weights for policy 0, policy_version 78770 (0.0010) +[2023-10-13 03:32:40,567][46662] Updated weights for policy 0, policy_version 78780 (0.0009) +[2023-10-13 03:32:41,732][46663] Updated weights for policy 1, policy_version 78691 (0.0008) +[2023-10-13 03:32:42,143][46663] Updated weights for policy 1, policy_version 78701 (0.0010) +[2023-10-13 03:32:42,517][46663] Updated weights for policy 1, policy_version 78711 (0.0010) +[2023-10-13 03:32:43,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161284096. Throughput: 0: 1658.6, 1: 1688.4. Samples: 40323728. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:32:43,608][45375] Avg episode reward: [(0, '68.210'), (1, '58.810')] +[2023-10-13 03:32:43,609][46091] Saving new best policy, reward=68.210! +[2023-10-13 03:32:44,602][46662] Updated weights for policy 0, policy_version 78790 (0.0007) +[2023-10-13 03:32:44,971][46662] Updated weights for policy 0, policy_version 78800 (0.0009) +[2023-10-13 03:32:45,335][46662] Updated weights for policy 0, policy_version 78810 (0.0008) +[2023-10-13 03:32:46,641][46663] Updated weights for policy 1, policy_version 78721 (0.0009) +[2023-10-13 03:32:47,011][46663] Updated weights for policy 1, policy_version 78731 (0.0007) +[2023-10-13 03:32:47,377][46663] Updated weights for policy 1, policy_version 78741 (0.0007) +[2023-10-13 03:32:47,738][46663] Updated weights for policy 1, policy_version 78751 (0.0008) +[2023-10-13 03:32:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161349632. Throughput: 0: 1682.8, 1: 1670.0. Samples: 40343606. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:32:48,607][45375] Avg episode reward: [(0, '65.790'), (1, '59.070')] +[2023-10-13 03:32:49,505][46662] Updated weights for policy 0, policy_version 78820 (0.0009) +[2023-10-13 03:32:49,871][46662] Updated weights for policy 0, policy_version 78830 (0.0009) +[2023-10-13 03:32:50,249][46662] Updated weights for policy 0, policy_version 78840 (0.0010) +[2023-10-13 03:32:51,755][46663] Updated weights for policy 1, policy_version 78761 (0.0008) +[2023-10-13 03:32:52,121][46663] Updated weights for policy 1, policy_version 78771 (0.0009) +[2023-10-13 03:32:52,490][46663] Updated weights for policy 1, policy_version 78781 (0.0009) +[2023-10-13 03:32:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161415168. Throughput: 0: 1681.0, 1: 1677.4. Samples: 40363680. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:32:53,608][45375] Avg episode reward: [(0, '66.090'), (1, '59.710')] +[2023-10-13 03:32:54,213][46662] Updated weights for policy 0, policy_version 78850 (0.0011) +[2023-10-13 03:32:54,581][46662] Updated weights for policy 0, policy_version 78860 (0.0011) +[2023-10-13 03:32:54,956][46662] Updated weights for policy 0, policy_version 78870 (0.0010) +[2023-10-13 03:32:55,328][46662] Updated weights for policy 0, policy_version 78880 (0.0008) +[2023-10-13 03:32:56,600][46663] Updated weights for policy 1, policy_version 78791 (0.0008) +[2023-10-13 03:32:56,968][46663] Updated weights for policy 1, policy_version 78801 (0.0009) +[2023-10-13 03:32:57,337][46663] Updated weights for policy 1, policy_version 78811 (0.0008) +[2023-10-13 03:32:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161480704. Throughput: 0: 1673.5, 1: 1684.3. Samples: 40374122. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:32:58,607][45375] Avg episode reward: [(0, '64.880'), (1, '60.960')] +[2023-10-13 03:32:59,410][46662] Updated weights for policy 0, policy_version 78890 (0.0008) +[2023-10-13 03:32:59,774][46662] Updated weights for policy 0, policy_version 78900 (0.0008) +[2023-10-13 03:33:00,149][46662] Updated weights for policy 0, policy_version 78910 (0.0009) +[2023-10-13 03:33:01,460][46663] Updated weights for policy 1, policy_version 78821 (0.0008) +[2023-10-13 03:33:01,839][46663] Updated weights for policy 1, policy_version 78831 (0.0008) +[2023-10-13 03:33:02,211][46663] Updated weights for policy 1, policy_version 78841 (0.0012) +[2023-10-13 03:33:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161546240. Throughput: 0: 1687.5, 1: 1656.5. Samples: 40393666. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:33:03,607][45375] Avg episode reward: [(0, '64.520'), (1, '59.490')] +[2023-10-13 03:33:04,150][46662] Updated weights for policy 0, policy_version 78920 (0.0008) +[2023-10-13 03:33:04,517][46662] Updated weights for policy 0, policy_version 78930 (0.0009) +[2023-10-13 03:33:04,885][46662] Updated weights for policy 0, policy_version 78940 (0.0008) +[2023-10-13 03:33:06,215][46663] Updated weights for policy 1, policy_version 78851 (0.0009) +[2023-10-13 03:33:06,586][46663] Updated weights for policy 1, policy_version 78861 (0.0008) +[2023-10-13 03:33:06,946][46663] Updated weights for policy 1, policy_version 78871 (0.0007) +[2023-10-13 03:33:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161611776. Throughput: 0: 1692.1, 1: 1681.8. Samples: 40414400. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:33:08,607][45375] Avg episode reward: [(0, '66.390'), (1, '58.990')] +[2023-10-13 03:33:08,943][46662] Updated weights for policy 0, policy_version 78950 (0.0008) +[2023-10-13 03:33:09,318][46662] Updated weights for policy 0, policy_version 78960 (0.0008) +[2023-10-13 03:33:09,688][46662] Updated weights for policy 0, policy_version 78970 (0.0009) +[2023-10-13 03:33:11,168][46663] Updated weights for policy 1, policy_version 78881 (0.0009) +[2023-10-13 03:33:11,530][46663] Updated weights for policy 1, policy_version 78891 (0.0008) +[2023-10-13 03:33:11,897][46663] Updated weights for policy 1, policy_version 78901 (0.0007) +[2023-10-13 03:33:12,265][46663] Updated weights for policy 1, policy_version 78911 (0.0008) +[2023-10-13 03:33:13,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 161677312. Throughput: 0: 1686.6, 1: 1674.7. Samples: 40424596. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:33:13,608][45375] Avg episode reward: [(0, '66.280'), (1, '54.990')] +[2023-10-13 03:33:13,670][46662] Updated weights for policy 0, policy_version 78980 (0.0007) +[2023-10-13 03:33:14,043][46662] Updated weights for policy 0, policy_version 78990 (0.0009) +[2023-10-13 03:33:14,402][46662] Updated weights for policy 0, policy_version 79000 (0.0010) +[2023-10-13 03:33:16,394][46663] Updated weights for policy 1, policy_version 78921 (0.0008) +[2023-10-13 03:33:16,764][46663] Updated weights for policy 1, policy_version 78931 (0.0008) +[2023-10-13 03:33:17,133][46663] Updated weights for policy 1, policy_version 78941 (0.0009) +[2023-10-13 03:33:18,601][46662] Updated weights for policy 0, policy_version 79010 (0.0010) +[2023-10-13 03:33:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161742848. Throughput: 0: 1688.5, 1: 1663.6. Samples: 40444130. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) +[2023-10-13 03:33:18,607][45375] Avg episode reward: [(0, '65.570'), (1, '55.490')] +[2023-10-13 03:33:18,974][46662] Updated weights for policy 0, policy_version 79020 (0.0011) +[2023-10-13 03:33:19,351][46662] Updated weights for policy 0, policy_version 79030 (0.0009) +[2023-10-13 03:33:19,728][46662] Updated weights for policy 0, policy_version 79040 (0.0007) +[2023-10-13 03:33:21,194][46663] Updated weights for policy 1, policy_version 78951 (0.0009) +[2023-10-13 03:33:21,555][46663] Updated weights for policy 1, policy_version 78961 (0.0009) +[2023-10-13 03:33:21,920][46663] Updated weights for policy 1, policy_version 78971 (0.0010) +[2023-10-13 03:33:23,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161808384. Throughput: 0: 1687.6, 1: 1675.4. Samples: 40464818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:23,607][45375] Avg episode reward: [(0, '63.630'), (1, '54.360')] +[2023-10-13 03:33:23,647][46662] Updated weights for policy 0, policy_version 79050 (0.0008) +[2023-10-13 03:33:24,017][46662] Updated weights for policy 0, policy_version 79060 (0.0010) +[2023-10-13 03:33:24,375][46662] Updated weights for policy 0, policy_version 79070 (0.0010) +[2023-10-13 03:33:25,902][46663] Updated weights for policy 1, policy_version 78981 (0.0009) +[2023-10-13 03:33:26,276][46663] Updated weights for policy 1, policy_version 78991 (0.0009) +[2023-10-13 03:33:26,640][46663] Updated weights for policy 1, policy_version 79001 (0.0008) +[2023-10-13 03:33:28,603][46662] Updated weights for policy 0, policy_version 79080 (0.0009) +[2023-10-13 03:33:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161873920. Throughput: 0: 1689.8, 1: 1662.8. Samples: 40474594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:28,607][45375] Avg episode reward: [(0, '63.600'), (1, '53.330')] +[2023-10-13 03:33:28,974][46662] Updated weights for policy 0, policy_version 79090 (0.0007) +[2023-10-13 03:33:29,333][46662] Updated weights for policy 0, policy_version 79100 (0.0009) +[2023-10-13 03:33:30,851][46663] Updated weights for policy 1, policy_version 79011 (0.0007) +[2023-10-13 03:33:31,221][46663] Updated weights for policy 1, policy_version 79021 (0.0009) +[2023-10-13 03:33:31,589][46663] Updated weights for policy 1, policy_version 79031 (0.0009) +[2023-10-13 03:33:33,233][46662] Updated weights for policy 0, policy_version 79110 (0.0008) +[2023-10-13 03:33:33,603][46662] Updated weights for policy 0, policy_version 79120 (0.0007) +[2023-10-13 03:33:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 161939456. Throughput: 0: 1690.9, 1: 1665.8. Samples: 40494660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:33,607][45375] Avg episode reward: [(0, '62.540'), (1, '51.100')] +[2023-10-13 03:33:33,977][46662] Updated weights for policy 0, policy_version 79130 (0.0007) +[2023-10-13 03:33:35,739][46663] Updated weights for policy 1, policy_version 79041 (0.0007) +[2023-10-13 03:33:36,158][46663] Updated weights for policy 1, policy_version 79051 (0.0011) +[2023-10-13 03:33:36,529][46663] Updated weights for policy 1, policy_version 79061 (0.0008) +[2023-10-13 03:33:36,886][46663] Updated weights for policy 1, policy_version 79071 (0.0009) +[2023-10-13 03:33:37,920][46662] Updated weights for policy 0, policy_version 79140 (0.0008) +[2023-10-13 03:33:38,293][46662] Updated weights for policy 0, policy_version 79150 (0.0007) +[2023-10-13 03:33:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162004992. Throughput: 0: 1692.4, 1: 1674.9. Samples: 40515208. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:38,607][45375] Avg episode reward: [(0, '60.640'), (1, '50.050')] +[2023-10-13 03:33:38,668][46662] Updated weights for policy 0, policy_version 79160 (0.0008) +[2023-10-13 03:33:40,926][46663] Updated weights for policy 1, policy_version 79081 (0.0008) +[2023-10-13 03:33:41,291][46663] Updated weights for policy 1, policy_version 79091 (0.0008) +[2023-10-13 03:33:41,654][46663] Updated weights for policy 1, policy_version 79101 (0.0009) +[2023-10-13 03:33:42,746][46662] Updated weights for policy 0, policy_version 79170 (0.0008) +[2023-10-13 03:33:43,114][46662] Updated weights for policy 0, policy_version 79180 (0.0010) +[2023-10-13 03:33:43,477][46662] Updated weights for policy 0, policy_version 79190 (0.0011) +[2023-10-13 03:33:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 162070528. Throughput: 0: 1691.3, 1: 1658.3. Samples: 40524852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:43,607][45375] Avg episode reward: [(0, '59.230'), (1, '49.890')] +[2023-10-13 03:33:43,844][46662] Updated weights for policy 0, policy_version 79200 (0.0007) +[2023-10-13 03:33:45,754][46663] Updated weights for policy 1, policy_version 79111 (0.0008) +[2023-10-13 03:33:46,118][46663] Updated weights for policy 1, policy_version 79121 (0.0009) +[2023-10-13 03:33:46,486][46663] Updated weights for policy 1, policy_version 79131 (0.0008) +[2023-10-13 03:33:48,054][46662] Updated weights for policy 0, policy_version 79210 (0.0009) +[2023-10-13 03:33:48,416][46662] Updated weights for policy 0, policy_version 79220 (0.0011) +[2023-10-13 03:33:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 162136064. Throughput: 0: 1690.2, 1: 1674.9. Samples: 40545098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:48,607][45375] Avg episode reward: [(0, '56.380'), (1, '49.600')] +[2023-10-13 03:33:48,791][46662] Updated weights for policy 0, policy_version 79230 (0.0009) +[2023-10-13 03:33:50,506][46663] Updated weights for policy 1, policy_version 79141 (0.0008) +[2023-10-13 03:33:50,868][46663] Updated weights for policy 1, policy_version 79151 (0.0009) +[2023-10-13 03:33:51,241][46663] Updated weights for policy 1, policy_version 79161 (0.0011) +[2023-10-13 03:33:52,961][46662] Updated weights for policy 0, policy_version 79240 (0.0008) +[2023-10-13 03:33:53,328][46662] Updated weights for policy 0, policy_version 79250 (0.0007) +[2023-10-13 03:33:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 162201600. Throughput: 0: 1683.0, 1: 1680.0. Samples: 40565736. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:53,607][45375] Avg episode reward: [(0, '54.660'), (1, '49.470')] +[2023-10-13 03:33:53,613][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000079168_81068032.pth... +[2023-10-13 03:33:53,643][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000077600_79462400.pth +[2023-10-13 03:33:53,704][46662] Updated weights for policy 0, policy_version 79260 (0.0009) +[2023-10-13 03:33:53,848][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000079264_81166336.pth... 
+[2023-10-13 03:33:53,876][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000077696_79560704.pth +[2023-10-13 03:33:55,470][46663] Updated weights for policy 1, policy_version 79171 (0.0008) +[2023-10-13 03:33:55,849][46663] Updated weights for policy 1, policy_version 79181 (0.0010) +[2023-10-13 03:33:56,211][46663] Updated weights for policy 1, policy_version 79191 (0.0009) +[2023-10-13 03:33:57,953][46662] Updated weights for policy 0, policy_version 79270 (0.0007) +[2023-10-13 03:33:58,325][46662] Updated weights for policy 0, policy_version 79280 (0.0007) +[2023-10-13 03:33:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 162267136. Throughput: 0: 1683.5, 1: 1659.3. Samples: 40575022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:33:58,607][45375] Avg episode reward: [(0, '54.870'), (1, '49.480')] +[2023-10-13 03:33:58,685][46662] Updated weights for policy 0, policy_version 79290 (0.0008) +[2023-10-13 03:34:00,205][46663] Updated weights for policy 1, policy_version 79201 (0.0010) +[2023-10-13 03:34:00,573][46663] Updated weights for policy 1, policy_version 79211 (0.0010) +[2023-10-13 03:34:00,939][46663] Updated weights for policy 1, policy_version 79221 (0.0007) +[2023-10-13 03:34:01,313][46663] Updated weights for policy 1, policy_version 79231 (0.0008) +[2023-10-13 03:34:02,667][46662] Updated weights for policy 0, policy_version 79300 (0.0009) +[2023-10-13 03:34:03,034][46662] Updated weights for policy 0, policy_version 79310 (0.0008) +[2023-10-13 03:34:03,406][46662] Updated weights for policy 0, policy_version 79320 (0.0007) +[2023-10-13 03:34:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 162332672. Throughput: 0: 1687.5, 1: 1681.9. Samples: 40595756. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:34:03,608][45375] Avg episode reward: [(0, '53.430'), (1, '48.280')] +[2023-10-13 03:34:05,207][46663] Updated weights for policy 1, policy_version 79241 (0.0008) +[2023-10-13 03:34:05,581][46663] Updated weights for policy 1, policy_version 79251 (0.0009) +[2023-10-13 03:34:05,942][46663] Updated weights for policy 1, policy_version 79261 (0.0007) +[2023-10-13 03:34:07,469][46662] Updated weights for policy 0, policy_version 79330 (0.0008) +[2023-10-13 03:34:07,848][46662] Updated weights for policy 0, policy_version 79340 (0.0009) +[2023-10-13 03:34:08,209][46662] Updated weights for policy 0, policy_version 79350 (0.0008) +[2023-10-13 03:34:08,583][46662] Updated weights for policy 0, policy_version 79360 (0.0009) +[2023-10-13 03:34:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162430976. Throughput: 0: 1674.9, 1: 1682.1. Samples: 40615886. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:34:08,607][45375] Avg episode reward: [(0, '52.580'), (1, '49.050')] +[2023-10-13 03:34:09,835][46663] Updated weights for policy 1, policy_version 79271 (0.0009) +[2023-10-13 03:34:10,196][46663] Updated weights for policy 1, policy_version 79281 (0.0010) +[2023-10-13 03:34:10,563][46663] Updated weights for policy 1, policy_version 79291 (0.0010) +[2023-10-13 03:34:12,501][46662] Updated weights for policy 0, policy_version 79370 (0.0010) +[2023-10-13 03:34:12,869][46662] Updated weights for policy 0, policy_version 79380 (0.0008) +[2023-10-13 03:34:13,243][46662] Updated weights for policy 0, policy_version 79390 (0.0008) +[2023-10-13 03:34:13,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 162496512. Throughput: 0: 1689.6, 1: 1665.4. Samples: 40625570. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:13,608][45375] Avg episode reward: [(0, '49.920'), (1, '49.280')] +[2023-10-13 03:34:14,666][46663] Updated weights for policy 1, policy_version 79301 (0.0008) +[2023-10-13 03:34:15,033][46663] Updated weights for policy 1, policy_version 79311 (0.0009) +[2023-10-13 03:34:15,392][46663] Updated weights for policy 1, policy_version 79321 (0.0008) +[2023-10-13 03:34:17,394][46662] Updated weights for policy 0, policy_version 79400 (0.0011) +[2023-10-13 03:34:17,764][46662] Updated weights for policy 0, policy_version 79410 (0.0010) +[2023-10-13 03:34:18,142][46662] Updated weights for policy 0, policy_version 79420 (0.0009) +[2023-10-13 03:34:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162562048. Throughput: 0: 1690.6, 1: 1684.0. Samples: 40646518. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:18,607][45375] Avg episode reward: [(0, '50.230'), (1, '50.500')] +[2023-10-13 03:34:19,576][46663] Updated weights for policy 1, policy_version 79331 (0.0007) +[2023-10-13 03:34:19,952][46663] Updated weights for policy 1, policy_version 79341 (0.0007) +[2023-10-13 03:34:20,321][46663] Updated weights for policy 1, policy_version 79351 (0.0007) +[2023-10-13 03:34:22,131][46662] Updated weights for policy 0, policy_version 79430 (0.0008) +[2023-10-13 03:34:22,495][46662] Updated weights for policy 0, policy_version 79440 (0.0008) +[2023-10-13 03:34:22,861][46662] Updated weights for policy 0, policy_version 79450 (0.0007) +[2023-10-13 03:34:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162627584. Throughput: 0: 1667.8, 1: 1686.9. Samples: 40666170. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:23,608][45375] Avg episode reward: [(0, '50.870'), (1, '50.690')] +[2023-10-13 03:34:24,516][46663] Updated weights for policy 1, policy_version 79361 (0.0008) +[2023-10-13 03:34:24,936][46663] Updated weights for policy 1, policy_version 79371 (0.0007) +[2023-10-13 03:34:25,301][46663] Updated weights for policy 1, policy_version 79381 (0.0009) +[2023-10-13 03:34:25,676][46663] Updated weights for policy 1, policy_version 79391 (0.0009) +[2023-10-13 03:34:26,908][46662] Updated weights for policy 0, policy_version 79460 (0.0008) +[2023-10-13 03:34:27,282][46662] Updated weights for policy 0, policy_version 79470 (0.0009) +[2023-10-13 03:34:27,639][46662] Updated weights for policy 0, policy_version 79480 (0.0008) +[2023-10-13 03:34:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 162693120. Throughput: 0: 1683.8, 1: 1672.0. Samples: 40675864. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:28,607][45375] Avg episode reward: [(0, '51.490'), (1, '49.830')] +[2023-10-13 03:34:29,712][46663] Updated weights for policy 1, policy_version 79401 (0.0009) +[2023-10-13 03:34:30,074][46663] Updated weights for policy 1, policy_version 79411 (0.0008) +[2023-10-13 03:34:30,450][46663] Updated weights for policy 1, policy_version 79421 (0.0009) +[2023-10-13 03:34:31,572][46662] Updated weights for policy 0, policy_version 79490 (0.0009) +[2023-10-13 03:34:31,940][46662] Updated weights for policy 0, policy_version 79500 (0.0011) +[2023-10-13 03:34:32,307][46662] Updated weights for policy 0, policy_version 79510 (0.0010) +[2023-10-13 03:34:32,678][46662] Updated weights for policy 0, policy_version 79520 (0.0011) +[2023-10-13 03:34:33,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162758656. Throughput: 0: 1680.0, 1: 1679.1. Samples: 40696258. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:33,607][45375] Avg episode reward: [(0, '50.110'), (1, '49.350')] +[2023-10-13 03:34:34,478][46663] Updated weights for policy 1, policy_version 79431 (0.0009) +[2023-10-13 03:34:34,838][46663] Updated weights for policy 1, policy_version 79441 (0.0008) +[2023-10-13 03:34:35,214][46663] Updated weights for policy 1, policy_version 79451 (0.0010) +[2023-10-13 03:34:36,674][46662] Updated weights for policy 0, policy_version 79530 (0.0007) +[2023-10-13 03:34:37,041][46662] Updated weights for policy 0, policy_version 79540 (0.0007) +[2023-10-13 03:34:37,417][46662] Updated weights for policy 0, policy_version 79550 (0.0007) +[2023-10-13 03:34:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162824192. Throughput: 0: 1661.3, 1: 1680.1. Samples: 40716098. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:38,607][45375] Avg episode reward: [(0, '49.630'), (1, '51.320')] +[2023-10-13 03:34:39,229][46663] Updated weights for policy 1, policy_version 79461 (0.0010) +[2023-10-13 03:34:39,601][46663] Updated weights for policy 1, policy_version 79471 (0.0009) +[2023-10-13 03:34:39,973][46663] Updated weights for policy 1, policy_version 79481 (0.0009) +[2023-10-13 03:34:41,557][46662] Updated weights for policy 0, policy_version 79560 (0.0007) +[2023-10-13 03:34:41,927][46662] Updated weights for policy 0, policy_version 79570 (0.0007) +[2023-10-13 03:34:42,300][46662] Updated weights for policy 0, policy_version 79580 (0.0009) +[2023-10-13 03:34:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162889728. Throughput: 0: 1687.8, 1: 1679.2. Samples: 40726536. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:43,607][45375] Avg episode reward: [(0, '49.790'), (1, '53.580')] +[2023-10-13 03:34:44,116][46663] Updated weights for policy 1, policy_version 79491 (0.0009) +[2023-10-13 03:34:44,490][46663] Updated weights for policy 1, policy_version 79501 (0.0008) +[2023-10-13 03:34:44,870][46663] Updated weights for policy 1, policy_version 79511 (0.0010) +[2023-10-13 03:34:46,433][46662] Updated weights for policy 0, policy_version 79590 (0.0008) +[2023-10-13 03:34:46,799][46662] Updated weights for policy 0, policy_version 79600 (0.0007) +[2023-10-13 03:34:47,174][46662] Updated weights for policy 0, policy_version 79610 (0.0007) +[2023-10-13 03:34:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 162955264. Throughput: 0: 1672.0, 1: 1680.0. Samples: 40746594. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:48,607][45375] Avg episode reward: [(0, '50.900'), (1, '53.100')] +[2023-10-13 03:34:48,968][46663] Updated weights for policy 1, policy_version 79521 (0.0009) +[2023-10-13 03:34:49,331][46663] Updated weights for policy 1, policy_version 79531 (0.0007) +[2023-10-13 03:34:49,689][46663] Updated weights for policy 1, policy_version 79541 (0.0011) +[2023-10-13 03:34:50,063][46663] Updated weights for policy 1, policy_version 79551 (0.0010) +[2023-10-13 03:34:51,354][46662] Updated weights for policy 0, policy_version 79620 (0.0008) +[2023-10-13 03:34:51,720][46662] Updated weights for policy 0, policy_version 79630 (0.0007) +[2023-10-13 03:34:52,084][46662] Updated weights for policy 0, policy_version 79640 (0.0008) +[2023-10-13 03:34:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163020800. Throughput: 0: 1670.4, 1: 1686.7. Samples: 40766958. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:53,607][45375] Avg episode reward: [(0, '50.110'), (1, '54.200')] +[2023-10-13 03:34:54,139][46663] Updated weights for policy 1, policy_version 79561 (0.0010) +[2023-10-13 03:34:54,509][46663] Updated weights for policy 1, policy_version 79571 (0.0008) +[2023-10-13 03:34:54,872][46663] Updated weights for policy 1, policy_version 79581 (0.0008) +[2023-10-13 03:34:56,067][46662] Updated weights for policy 0, policy_version 79650 (0.0008) +[2023-10-13 03:34:56,441][46662] Updated weights for policy 0, policy_version 79660 (0.0008) +[2023-10-13 03:34:56,802][46662] Updated weights for policy 0, policy_version 79670 (0.0009) +[2023-10-13 03:34:57,177][46662] Updated weights for policy 0, policy_version 79680 (0.0009) +[2023-10-13 03:34:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163086336. Throughput: 0: 1690.5, 1: 1684.1. Samples: 40777428. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:34:58,607][45375] Avg episode reward: [(0, '49.850'), (1, '54.400')] +[2023-10-13 03:34:58,887][46663] Updated weights for policy 1, policy_version 79591 (0.0007) +[2023-10-13 03:34:59,252][46663] Updated weights for policy 1, policy_version 79601 (0.0008) +[2023-10-13 03:34:59,616][46663] Updated weights for policy 1, policy_version 79611 (0.0010) +[2023-10-13 03:35:01,295][46662] Updated weights for policy 0, policy_version 79690 (0.0008) +[2023-10-13 03:35:01,669][46662] Updated weights for policy 0, policy_version 79700 (0.0008) +[2023-10-13 03:35:02,036][46662] Updated weights for policy 0, policy_version 79710 (0.0008) +[2023-10-13 03:35:03,580][46663] Updated weights for policy 1, policy_version 79621 (0.0009) +[2023-10-13 03:35:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163151872. Throughput: 0: 1664.1, 1: 1684.2. Samples: 40797194. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:35:03,608][45375] Avg episode reward: [(0, '50.720'), (1, '55.570')] +[2023-10-13 03:35:03,941][46663] Updated weights for policy 1, policy_version 79631 (0.0009) +[2023-10-13 03:35:04,312][46663] Updated weights for policy 1, policy_version 79641 (0.0009) +[2023-10-13 03:35:06,117][46662] Updated weights for policy 0, policy_version 79720 (0.0007) +[2023-10-13 03:35:06,489][46662] Updated weights for policy 0, policy_version 79730 (0.0007) +[2023-10-13 03:35:06,858][46662] Updated weights for policy 0, policy_version 79740 (0.0011) +[2023-10-13 03:35:08,390][46663] Updated weights for policy 1, policy_version 79651 (0.0009) +[2023-10-13 03:35:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163217408. Throughput: 0: 1676.9, 1: 1683.5. Samples: 40817388. Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:08,607][45375] Avg episode reward: [(0, '52.080'), (1, '56.030')] +[2023-10-13 03:35:08,771][46663] Updated weights for policy 1, policy_version 79661 (0.0010) +[2023-10-13 03:35:09,136][46663] Updated weights for policy 1, policy_version 79671 (0.0010) +[2023-10-13 03:35:10,749][46662] Updated weights for policy 0, policy_version 79750 (0.0007) +[2023-10-13 03:35:11,117][46662] Updated weights for policy 0, policy_version 79760 (0.0007) +[2023-10-13 03:35:11,486][46662] Updated weights for policy 0, policy_version 79770 (0.0007) +[2023-10-13 03:35:13,358][46663] Updated weights for policy 1, policy_version 79681 (0.0009) +[2023-10-13 03:35:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163282944. Throughput: 0: 1682.6, 1: 1690.6. Samples: 40827656. 
Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:13,607][45375] Avg episode reward: [(0, '53.320'), (1, '55.170')] +[2023-10-13 03:35:13,787][46663] Updated weights for policy 1, policy_version 79691 (0.0011) +[2023-10-13 03:35:14,168][46663] Updated weights for policy 1, policy_version 79701 (0.0008) +[2023-10-13 03:35:14,533][46663] Updated weights for policy 1, policy_version 79711 (0.0007) +[2023-10-13 03:35:15,607][46662] Updated weights for policy 0, policy_version 79780 (0.0007) +[2023-10-13 03:35:15,972][46662] Updated weights for policy 0, policy_version 79790 (0.0008) +[2023-10-13 03:35:16,347][46662] Updated weights for policy 0, policy_version 79800 (0.0007) +[2023-10-13 03:35:18,593][46663] Updated weights for policy 1, policy_version 79721 (0.0008) +[2023-10-13 03:35:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163348480. Throughput: 0: 1667.3, 1: 1687.1. Samples: 40847208. Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:18,607][45375] Avg episode reward: [(0, '52.940'), (1, '55.850')] +[2023-10-13 03:35:18,958][46663] Updated weights for policy 1, policy_version 79731 (0.0008) +[2023-10-13 03:35:19,332][46663] Updated weights for policy 1, policy_version 79741 (0.0007) +[2023-10-13 03:35:20,415][46662] Updated weights for policy 0, policy_version 79810 (0.0009) +[2023-10-13 03:35:20,789][46662] Updated weights for policy 0, policy_version 79820 (0.0008) +[2023-10-13 03:35:21,147][46662] Updated weights for policy 0, policy_version 79830 (0.0009) +[2023-10-13 03:35:21,520][46662] Updated weights for policy 0, policy_version 79840 (0.0008) +[2023-10-13 03:35:23,385][46663] Updated weights for policy 1, policy_version 79751 (0.0009) +[2023-10-13 03:35:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 163414016. Throughput: 0: 1685.1, 1: 1679.7. Samples: 40867514. 
Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:23,607][45375] Avg episode reward: [(0, '52.560'), (1, '55.730')] +[2023-10-13 03:35:23,759][46663] Updated weights for policy 1, policy_version 79761 (0.0009) +[2023-10-13 03:35:24,130][46663] Updated weights for policy 1, policy_version 79771 (0.0009) +[2023-10-13 03:35:25,580][46662] Updated weights for policy 0, policy_version 79850 (0.0008) +[2023-10-13 03:35:25,953][46662] Updated weights for policy 0, policy_version 79860 (0.0008) +[2023-10-13 03:35:26,323][46662] Updated weights for policy 0, policy_version 79870 (0.0007) +[2023-10-13 03:35:28,407][46663] Updated weights for policy 1, policy_version 79781 (0.0008) +[2023-10-13 03:35:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163479552. Throughput: 0: 1673.4, 1: 1680.3. Samples: 40877454. Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:28,608][45375] Avg episode reward: [(0, '51.800'), (1, '56.790')] +[2023-10-13 03:35:28,770][46663] Updated weights for policy 1, policy_version 79791 (0.0008) +[2023-10-13 03:35:29,140][46663] Updated weights for policy 1, policy_version 79801 (0.0008) +[2023-10-13 03:35:30,266][46662] Updated weights for policy 0, policy_version 79880 (0.0008) +[2023-10-13 03:35:30,636][46662] Updated weights for policy 0, policy_version 79890 (0.0009) +[2023-10-13 03:35:31,002][46662] Updated weights for policy 0, policy_version 79900 (0.0010) +[2023-10-13 03:35:33,012][46663] Updated weights for policy 1, policy_version 79811 (0.0008) +[2023-10-13 03:35:33,381][46663] Updated weights for policy 1, policy_version 79821 (0.0007) +[2023-10-13 03:35:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 163545088. Throughput: 0: 1673.2, 1: 1682.9. Samples: 40897618. 
Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:33,608][45375] Avg episode reward: [(0, '51.750'), (1, '57.700')] +[2023-10-13 03:35:33,742][46663] Updated weights for policy 1, policy_version 79831 (0.0008) +[2023-10-13 03:35:35,248][46662] Updated weights for policy 0, policy_version 79910 (0.0009) +[2023-10-13 03:35:35,611][46662] Updated weights for policy 0, policy_version 79920 (0.0009) +[2023-10-13 03:35:35,977][46662] Updated weights for policy 0, policy_version 79930 (0.0008) +[2023-10-13 03:35:37,846][46663] Updated weights for policy 1, policy_version 79841 (0.0007) +[2023-10-13 03:35:38,216][46663] Updated weights for policy 1, policy_version 79851 (0.0009) +[2023-10-13 03:35:38,577][46663] Updated weights for policy 1, policy_version 79861 (0.0009) +[2023-10-13 03:35:38,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163610624. Throughput: 0: 1688.6, 1: 1664.4. Samples: 40917840. Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:38,607][45375] Avg episode reward: [(0, '53.380'), (1, '58.670')] +[2023-10-13 03:35:38,931][46663] Updated weights for policy 1, policy_version 79871 (0.0008) +[2023-10-13 03:35:39,974][46662] Updated weights for policy 0, policy_version 79940 (0.0008) +[2023-10-13 03:35:40,354][46662] Updated weights for policy 0, policy_version 79950 (0.0009) +[2023-10-13 03:35:40,726][46662] Updated weights for policy 0, policy_version 79960 (0.0009) +[2023-10-13 03:35:43,055][46663] Updated weights for policy 1, policy_version 79881 (0.0007) +[2023-10-13 03:35:43,426][46663] Updated weights for policy 1, policy_version 79891 (0.0009) +[2023-10-13 03:35:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163676160. Throughput: 0: 1660.5, 1: 1681.2. Samples: 40927808. 
Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:43,607][45375] Avg episode reward: [(0, '51.750'), (1, '60.430')] +[2023-10-13 03:35:43,788][46663] Updated weights for policy 1, policy_version 79901 (0.0009) +[2023-10-13 03:35:44,772][46662] Updated weights for policy 0, policy_version 79970 (0.0011) +[2023-10-13 03:35:45,138][46662] Updated weights for policy 0, policy_version 79980 (0.0010) +[2023-10-13 03:35:45,518][46662] Updated weights for policy 0, policy_version 79990 (0.0009) +[2023-10-13 03:35:45,877][46662] Updated weights for policy 0, policy_version 80000 (0.0008) +[2023-10-13 03:35:47,854][46663] Updated weights for policy 1, policy_version 79911 (0.0010) +[2023-10-13 03:35:48,217][46663] Updated weights for policy 1, policy_version 79921 (0.0009) +[2023-10-13 03:35:48,581][46663] Updated weights for policy 1, policy_version 79931 (0.0011) +[2023-10-13 03:35:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163741696. Throughput: 0: 1675.5, 1: 1677.3. Samples: 40948070. Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:48,607][45375] Avg episode reward: [(0, '51.880'), (1, '58.460')] +[2023-10-13 03:35:49,985][46662] Updated weights for policy 0, policy_version 80010 (0.0008) +[2023-10-13 03:35:50,367][46662] Updated weights for policy 0, policy_version 80020 (0.0008) +[2023-10-13 03:35:50,748][46662] Updated weights for policy 0, policy_version 80030 (0.0011) +[2023-10-13 03:35:52,689][46663] Updated weights for policy 1, policy_version 79941 (0.0009) +[2023-10-13 03:35:53,057][46663] Updated weights for policy 1, policy_version 79951 (0.0010) +[2023-10-13 03:35:53,428][46663] Updated weights for policy 1, policy_version 79961 (0.0009) +[2023-10-13 03:35:53,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 163807232. Throughput: 0: 1684.5, 1: 1659.1. Samples: 40967850. 
Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:53,608][45375] Avg episode reward: [(0, '52.580'), (1, '57.100')] +[2023-10-13 03:35:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000080032_81952768.pth... +[2023-10-13 03:35:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000078464_80347136.pth +[2023-10-13 03:35:53,684][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000079968_81887232.pth... +[2023-10-13 03:35:53,712][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000078400_80281600.pth +[2023-10-13 03:35:54,979][46662] Updated weights for policy 0, policy_version 80040 (0.0007) +[2023-10-13 03:35:55,351][46662] Updated weights for policy 0, policy_version 80050 (0.0009) +[2023-10-13 03:35:55,710][46662] Updated weights for policy 0, policy_version 80060 (0.0009) +[2023-10-13 03:35:57,320][46663] Updated weights for policy 1, policy_version 79971 (0.0009) +[2023-10-13 03:35:57,687][46663] Updated weights for policy 1, policy_version 79981 (0.0009) +[2023-10-13 03:35:58,056][46663] Updated weights for policy 1, policy_version 79991 (0.0009) +[2023-10-13 03:35:58,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163905536. Throughput: 0: 1659.9, 1: 1678.7. Samples: 40977890. 
Policy #0 lag: (min: 29.0, avg: 37.0, max: 61.0) +[2023-10-13 03:35:58,607][45375] Avg episode reward: [(0, '52.770'), (1, '57.100')] +[2023-10-13 03:35:59,651][46662] Updated weights for policy 0, policy_version 80070 (0.0008) +[2023-10-13 03:36:00,017][46662] Updated weights for policy 0, policy_version 80080 (0.0009) +[2023-10-13 03:36:00,376][46662] Updated weights for policy 0, policy_version 80090 (0.0008) +[2023-10-13 03:36:02,049][46663] Updated weights for policy 1, policy_version 80001 (0.0007) +[2023-10-13 03:36:02,415][46663] Updated weights for policy 1, policy_version 80011 (0.0008) +[2023-10-13 03:36:02,781][46663] Updated weights for policy 1, policy_version 80021 (0.0009) +[2023-10-13 03:36:03,155][46663] Updated weights for policy 1, policy_version 80031 (0.0009) +[2023-10-13 03:36:03,607][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163971072. Throughput: 0: 1679.8, 1: 1681.4. Samples: 40998460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:03,608][45375] Avg episode reward: [(0, '53.270'), (1, '56.510')] +[2023-10-13 03:36:04,422][46662] Updated weights for policy 0, policy_version 80100 (0.0009) +[2023-10-13 03:36:04,786][46662] Updated weights for policy 0, policy_version 80110 (0.0007) +[2023-10-13 03:36:05,160][46662] Updated weights for policy 0, policy_version 80120 (0.0008) +[2023-10-13 03:36:07,246][46663] Updated weights for policy 1, policy_version 80041 (0.0008) +[2023-10-13 03:36:07,611][46663] Updated weights for policy 1, policy_version 80051 (0.0010) +[2023-10-13 03:36:07,977][46663] Updated weights for policy 1, policy_version 80061 (0.0010) +[2023-10-13 03:36:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164036608. Throughput: 0: 1683.3, 1: 1668.6. Samples: 41018348. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:08,607][45375] Avg episode reward: [(0, '52.750'), (1, '58.280')] +[2023-10-13 03:36:09,241][46662] Updated weights for policy 0, policy_version 80130 (0.0010) +[2023-10-13 03:36:09,606][46662] Updated weights for policy 0, policy_version 80140 (0.0007) +[2023-10-13 03:36:09,988][46662] Updated weights for policy 0, policy_version 80150 (0.0007) +[2023-10-13 03:36:10,353][46662] Updated weights for policy 0, policy_version 80160 (0.0009) +[2023-10-13 03:36:12,045][46663] Updated weights for policy 1, policy_version 80071 (0.0010) +[2023-10-13 03:36:12,403][46663] Updated weights for policy 1, policy_version 80081 (0.0011) +[2023-10-13 03:36:12,762][46663] Updated weights for policy 1, policy_version 80091 (0.0009) +[2023-10-13 03:36:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164102144. Throughput: 0: 1667.4, 1: 1694.6. Samples: 41028746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:13,608][45375] Avg episode reward: [(0, '51.160'), (1, '57.230')] +[2023-10-13 03:36:14,454][46662] Updated weights for policy 0, policy_version 80170 (0.0009) +[2023-10-13 03:36:14,812][46662] Updated weights for policy 0, policy_version 80180 (0.0009) +[2023-10-13 03:36:15,182][46662] Updated weights for policy 0, policy_version 80190 (0.0008) +[2023-10-13 03:36:16,701][46663] Updated weights for policy 1, policy_version 80101 (0.0008) +[2023-10-13 03:36:17,073][46663] Updated weights for policy 1, policy_version 80111 (0.0011) +[2023-10-13 03:36:17,444][46663] Updated weights for policy 1, policy_version 80121 (0.0008) +[2023-10-13 03:36:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164167680. Throughput: 0: 1678.9, 1: 1673.5. Samples: 41048474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:18,607][45375] Avg episode reward: [(0, '51.590'), (1, '57.270')] +[2023-10-13 03:36:19,318][46662] Updated weights for policy 0, policy_version 80200 (0.0009) +[2023-10-13 03:36:19,678][46662] Updated weights for policy 0, policy_version 80210 (0.0009) +[2023-10-13 03:36:20,052][46662] Updated weights for policy 0, policy_version 80220 (0.0009) +[2023-10-13 03:36:21,575][46663] Updated weights for policy 1, policy_version 80131 (0.0010) +[2023-10-13 03:36:21,933][46663] Updated weights for policy 1, policy_version 80141 (0.0010) +[2023-10-13 03:36:22,309][46663] Updated weights for policy 1, policy_version 80151 (0.0008) +[2023-10-13 03:36:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164233216. Throughput: 0: 1677.1, 1: 1679.9. Samples: 41068906. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:23,607][45375] Avg episode reward: [(0, '52.320'), (1, '58.690')] +[2023-10-13 03:36:24,151][46662] Updated weights for policy 0, policy_version 80230 (0.0009) +[2023-10-13 03:36:24,522][46662] Updated weights for policy 0, policy_version 80240 (0.0009) +[2023-10-13 03:36:24,889][46662] Updated weights for policy 0, policy_version 80250 (0.0007) +[2023-10-13 03:36:26,372][46663] Updated weights for policy 1, policy_version 80161 (0.0009) +[2023-10-13 03:36:26,732][46663] Updated weights for policy 1, policy_version 80171 (0.0009) +[2023-10-13 03:36:27,098][46663] Updated weights for policy 1, policy_version 80181 (0.0011) +[2023-10-13 03:36:27,460][46663] Updated weights for policy 1, policy_version 80191 (0.0011) +[2023-10-13 03:36:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164298752. Throughput: 0: 1671.1, 1: 1694.5. Samples: 41079260. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:28,607][45375] Avg episode reward: [(0, '52.760'), (1, '58.400')] +[2023-10-13 03:36:28,933][46662] Updated weights for policy 0, policy_version 80260 (0.0010) +[2023-10-13 03:36:29,288][46662] Updated weights for policy 0, policy_version 80270 (0.0007) +[2023-10-13 03:36:29,655][46662] Updated weights for policy 0, policy_version 80280 (0.0007) +[2023-10-13 03:36:31,561][46663] Updated weights for policy 1, policy_version 80201 (0.0010) +[2023-10-13 03:36:31,942][46663] Updated weights for policy 1, policy_version 80211 (0.0010) +[2023-10-13 03:36:32,312][46663] Updated weights for policy 1, policy_version 80221 (0.0009) +[2023-10-13 03:36:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164364288. Throughput: 0: 1680.5, 1: 1669.7. Samples: 41098832. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:33,608][45375] Avg episode reward: [(0, '53.860'), (1, '58.300')] +[2023-10-13 03:36:33,857][46662] Updated weights for policy 0, policy_version 80290 (0.0009) +[2023-10-13 03:36:34,231][46662] Updated weights for policy 0, policy_version 80300 (0.0008) +[2023-10-13 03:36:34,600][46662] Updated weights for policy 0, policy_version 80310 (0.0007) +[2023-10-13 03:36:34,968][46662] Updated weights for policy 0, policy_version 80320 (0.0008) +[2023-10-13 03:36:36,325][46663] Updated weights for policy 1, policy_version 80231 (0.0010) +[2023-10-13 03:36:36,695][46663] Updated weights for policy 1, policy_version 80241 (0.0010) +[2023-10-13 03:36:37,059][46663] Updated weights for policy 1, policy_version 80251 (0.0011) +[2023-10-13 03:36:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164429824. Throughput: 0: 1686.8, 1: 1684.9. Samples: 41119572. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:38,607][45375] Avg episode reward: [(0, '54.080'), (1, '58.160')] +[2023-10-13 03:36:38,859][46662] Updated weights for policy 0, policy_version 80330 (0.0009) +[2023-10-13 03:36:39,235][46662] Updated weights for policy 0, policy_version 80340 (0.0008) +[2023-10-13 03:36:39,600][46662] Updated weights for policy 0, policy_version 80350 (0.0007) +[2023-10-13 03:36:41,138][46663] Updated weights for policy 1, policy_version 80261 (0.0008) +[2023-10-13 03:36:41,507][46663] Updated weights for policy 1, policy_version 80271 (0.0008) +[2023-10-13 03:36:41,881][46663] Updated weights for policy 1, policy_version 80281 (0.0008) +[2023-10-13 03:36:43,484][46662] Updated weights for policy 0, policy_version 80360 (0.0007) +[2023-10-13 03:36:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164495360. Throughput: 0: 1685.0, 1: 1681.1. Samples: 41129366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:43,607][45375] Avg episode reward: [(0, '55.160'), (1, '55.340')] +[2023-10-13 03:36:43,865][46662] Updated weights for policy 0, policy_version 80370 (0.0008) +[2023-10-13 03:36:44,234][46662] Updated weights for policy 0, policy_version 80380 (0.0007) +[2023-10-13 03:36:46,160][46663] Updated weights for policy 1, policy_version 80291 (0.0009) +[2023-10-13 03:36:46,574][46663] Updated weights for policy 1, policy_version 80301 (0.0007) +[2023-10-13 03:36:46,931][46663] Updated weights for policy 1, policy_version 80311 (0.0007) +[2023-10-13 03:36:48,378][46662] Updated weights for policy 0, policy_version 80390 (0.0010) +[2023-10-13 03:36:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164560896. Throughput: 0: 1688.4, 1: 1658.7. Samples: 41149076. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:48,607][45375] Avg episode reward: [(0, '54.710'), (1, '55.450')] +[2023-10-13 03:36:48,746][46662] Updated weights for policy 0, policy_version 80400 (0.0009) +[2023-10-13 03:36:49,123][46662] Updated weights for policy 0, policy_version 80410 (0.0009) +[2023-10-13 03:36:50,917][46663] Updated weights for policy 1, policy_version 80321 (0.0010) +[2023-10-13 03:36:51,272][46663] Updated weights for policy 1, policy_version 80331 (0.0009) +[2023-10-13 03:36:51,645][46663] Updated weights for policy 1, policy_version 80341 (0.0010) +[2023-10-13 03:36:52,017][46663] Updated weights for policy 1, policy_version 80351 (0.0010) +[2023-10-13 03:36:53,138][46662] Updated weights for policy 0, policy_version 80420 (0.0008) +[2023-10-13 03:36:53,511][46662] Updated weights for policy 0, policy_version 80430 (0.0009) +[2023-10-13 03:36:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164626432. Throughput: 0: 1689.5, 1: 1674.8. Samples: 41169738. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:36:53,607][45375] Avg episode reward: [(0, '55.730'), (1, '56.450')] +[2023-10-13 03:36:53,881][46662] Updated weights for policy 0, policy_version 80440 (0.0009) +[2023-10-13 03:36:56,122][46663] Updated weights for policy 1, policy_version 80361 (0.0009) +[2023-10-13 03:36:56,495][46663] Updated weights for policy 1, policy_version 80371 (0.0008) +[2023-10-13 03:36:56,867][46663] Updated weights for policy 1, policy_version 80381 (0.0007) +[2023-10-13 03:36:58,070][46662] Updated weights for policy 0, policy_version 80450 (0.0008) +[2023-10-13 03:36:58,435][46662] Updated weights for policy 0, policy_version 80460 (0.0008) +[2023-10-13 03:36:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164691968. Throughput: 0: 1684.6, 1: 1666.2. Samples: 41179534. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:36:58,608][45375] Avg episode reward: [(0, '54.380'), (1, '57.270')] +[2023-10-13 03:36:58,810][46662] Updated weights for policy 0, policy_version 80470 (0.0011) +[2023-10-13 03:36:59,181][46662] Updated weights for policy 0, policy_version 80480 (0.0010) +[2023-10-13 03:37:00,753][46663] Updated weights for policy 1, policy_version 80391 (0.0008) +[2023-10-13 03:37:01,111][46663] Updated weights for policy 1, policy_version 80401 (0.0008) +[2023-10-13 03:37:01,479][46663] Updated weights for policy 1, policy_version 80411 (0.0008) +[2023-10-13 03:37:03,273][46662] Updated weights for policy 0, policy_version 80490 (0.0009) +[2023-10-13 03:37:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164757504. Throughput: 0: 1685.2, 1: 1674.4. Samples: 41199656. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:03,607][45375] Avg episode reward: [(0, '55.010'), (1, '57.450')] +[2023-10-13 03:37:03,639][46662] Updated weights for policy 0, policy_version 80500 (0.0008) +[2023-10-13 03:37:04,005][46662] Updated weights for policy 0, policy_version 80510 (0.0008) +[2023-10-13 03:37:05,623][46663] Updated weights for policy 1, policy_version 80421 (0.0009) +[2023-10-13 03:37:05,988][46663] Updated weights for policy 1, policy_version 80431 (0.0009) +[2023-10-13 03:37:06,347][46663] Updated weights for policy 1, policy_version 80441 (0.0008) +[2023-10-13 03:37:08,135][46662] Updated weights for policy 0, policy_version 80520 (0.0008) +[2023-10-13 03:37:08,507][46662] Updated weights for policy 0, policy_version 80530 (0.0009) +[2023-10-13 03:37:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164823040. Throughput: 0: 1686.9, 1: 1676.4. Samples: 41220252. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:08,607][45375] Avg episode reward: [(0, '55.940'), (1, '59.500')] +[2023-10-13 03:37:08,868][46662] Updated weights for policy 0, policy_version 80540 (0.0008) +[2023-10-13 03:37:10,457][46663] Updated weights for policy 1, policy_version 80451 (0.0008) +[2023-10-13 03:37:10,815][46663] Updated weights for policy 1, policy_version 80461 (0.0009) +[2023-10-13 03:37:11,183][46663] Updated weights for policy 1, policy_version 80471 (0.0009) +[2023-10-13 03:37:12,943][46662] Updated weights for policy 0, policy_version 80550 (0.0007) +[2023-10-13 03:37:13,310][46662] Updated weights for policy 0, policy_version 80560 (0.0008) +[2023-10-13 03:37:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164888576. Throughput: 0: 1686.6, 1: 1652.9. Samples: 41229540. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:13,608][45375] Avg episode reward: [(0, '54.780'), (1, '58.160')] +[2023-10-13 03:37:13,693][46662] Updated weights for policy 0, policy_version 80570 (0.0009) +[2023-10-13 03:37:15,525][46663] Updated weights for policy 1, policy_version 80481 (0.0010) +[2023-10-13 03:37:15,893][46663] Updated weights for policy 1, policy_version 80491 (0.0009) +[2023-10-13 03:37:16,269][46663] Updated weights for policy 1, policy_version 80501 (0.0010) +[2023-10-13 03:37:16,630][46663] Updated weights for policy 1, policy_version 80511 (0.0008) +[2023-10-13 03:37:17,851][46662] Updated weights for policy 0, policy_version 80580 (0.0010) +[2023-10-13 03:37:18,218][46662] Updated weights for policy 0, policy_version 80590 (0.0009) +[2023-10-13 03:37:18,579][46662] Updated weights for policy 0, policy_version 80600 (0.0008) +[2023-10-13 03:37:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 164954112. Throughput: 0: 1686.9, 1: 1669.5. Samples: 41249872. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:18,607][45375] Avg episode reward: [(0, '55.980'), (1, '57.250')] +[2023-10-13 03:37:20,717][46663] Updated weights for policy 1, policy_version 80521 (0.0008) +[2023-10-13 03:37:21,092][46663] Updated weights for policy 1, policy_version 80531 (0.0009) +[2023-10-13 03:37:21,457][46663] Updated weights for policy 1, policy_version 80541 (0.0009) +[2023-10-13 03:37:22,626][46662] Updated weights for policy 0, policy_version 80610 (0.0008) +[2023-10-13 03:37:23,001][46662] Updated weights for policy 0, policy_version 80620 (0.0011) +[2023-10-13 03:37:23,377][46662] Updated weights for policy 0, policy_version 80630 (0.0009) +[2023-10-13 03:37:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165019648. Throughput: 0: 1675.6, 1: 1672.2. Samples: 41270220. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:23,607][45375] Avg episode reward: [(0, '55.990'), (1, '58.070')] +[2023-10-13 03:37:23,739][46662] Updated weights for policy 0, policy_version 80640 (0.0008) +[2023-10-13 03:37:25,458][46663] Updated weights for policy 1, policy_version 80551 (0.0010) +[2023-10-13 03:37:25,820][46663] Updated weights for policy 1, policy_version 80561 (0.0011) +[2023-10-13 03:37:26,193][46663] Updated weights for policy 1, policy_version 80571 (0.0010) +[2023-10-13 03:37:27,869][46662] Updated weights for policy 0, policy_version 80650 (0.0011) +[2023-10-13 03:37:28,242][46662] Updated weights for policy 0, policy_version 80660 (0.0010) +[2023-10-13 03:37:28,604][46662] Updated weights for policy 0, policy_version 80670 (0.0010) +[2023-10-13 03:37:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165085184. Throughput: 0: 1682.8, 1: 1654.4. Samples: 41279542. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:28,607][45375] Avg episode reward: [(0, '55.630'), (1, '57.300')] +[2023-10-13 03:37:30,150][46663] Updated weights for policy 1, policy_version 80581 (0.0009) +[2023-10-13 03:37:30,523][46663] Updated weights for policy 1, policy_version 80591 (0.0009) +[2023-10-13 03:37:30,891][46663] Updated weights for policy 1, policy_version 80601 (0.0008) +[2023-10-13 03:37:32,704][46662] Updated weights for policy 0, policy_version 80680 (0.0008) +[2023-10-13 03:37:33,091][46662] Updated weights for policy 0, policy_version 80690 (0.0008) +[2023-10-13 03:37:33,460][46662] Updated weights for policy 0, policy_version 80700 (0.0008) +[2023-10-13 03:37:33,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165183488. Throughput: 0: 1679.2, 1: 1676.3. Samples: 41300072. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:33,608][45375] Avg episode reward: [(0, '57.780'), (1, '56.560')] +[2023-10-13 03:37:35,135][46663] Updated weights for policy 1, policy_version 80611 (0.0008) +[2023-10-13 03:37:35,506][46663] Updated weights for policy 1, policy_version 80621 (0.0009) +[2023-10-13 03:37:35,875][46663] Updated weights for policy 1, policy_version 80631 (0.0008) +[2023-10-13 03:37:37,287][46662] Updated weights for policy 0, policy_version 80710 (0.0008) +[2023-10-13 03:37:37,651][46662] Updated weights for policy 0, policy_version 80720 (0.0009) +[2023-10-13 03:37:38,016][46662] Updated weights for policy 0, policy_version 80730 (0.0010) +[2023-10-13 03:37:38,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165249024. Throughput: 0: 1664.4, 1: 1680.8. Samples: 41320276. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:38,607][45375] Avg episode reward: [(0, '56.130'), (1, '57.150')] +[2023-10-13 03:37:39,896][46663] Updated weights for policy 1, policy_version 80641 (0.0010) +[2023-10-13 03:37:40,264][46663] Updated weights for policy 1, policy_version 80651 (0.0008) +[2023-10-13 03:37:40,623][46663] Updated weights for policy 1, policy_version 80661 (0.0008) +[2023-10-13 03:37:40,985][46663] Updated weights for policy 1, policy_version 80671 (0.0008) +[2023-10-13 03:37:41,930][46662] Updated weights for policy 0, policy_version 80740 (0.0008) +[2023-10-13 03:37:42,300][46662] Updated weights for policy 0, policy_version 80750 (0.0008) +[2023-10-13 03:37:42,668][46662] Updated weights for policy 0, policy_version 80760 (0.0008) +[2023-10-13 03:37:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165314560. Throughput: 0: 1687.9, 1: 1664.2. Samples: 41330380. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) +[2023-10-13 03:37:43,607][45375] Avg episode reward: [(0, '53.870'), (1, '57.910')] +[2023-10-13 03:37:44,923][46663] Updated weights for policy 1, policy_version 80681 (0.0009) +[2023-10-13 03:37:45,288][46663] Updated weights for policy 1, policy_version 80691 (0.0008) +[2023-10-13 03:37:45,663][46663] Updated weights for policy 1, policy_version 80701 (0.0010) +[2023-10-13 03:37:46,602][46662] Updated weights for policy 0, policy_version 80770 (0.0011) +[2023-10-13 03:37:46,968][46662] Updated weights for policy 0, policy_version 80780 (0.0008) +[2023-10-13 03:37:47,349][46662] Updated weights for policy 0, policy_version 80790 (0.0009) +[2023-10-13 03:37:47,715][46662] Updated weights for policy 0, policy_version 80800 (0.0009) +[2023-10-13 03:37:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165380096. Throughput: 0: 1682.9, 1: 1680.0. Samples: 41350988. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:37:48,607][45375] Avg episode reward: [(0, '53.220'), (1, '57.960')] +[2023-10-13 03:37:49,624][46663] Updated weights for policy 1, policy_version 80711 (0.0009) +[2023-10-13 03:37:49,990][46663] Updated weights for policy 1, policy_version 80721 (0.0008) +[2023-10-13 03:37:50,367][46663] Updated weights for policy 1, policy_version 80731 (0.0008) +[2023-10-13 03:37:51,922][46662] Updated weights for policy 0, policy_version 80810 (0.0009) +[2023-10-13 03:37:52,295][46662] Updated weights for policy 0, policy_version 80820 (0.0009) +[2023-10-13 03:37:52,668][46662] Updated weights for policy 0, policy_version 80830 (0.0009) +[2023-10-13 03:37:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165445632. Throughput: 0: 1657.2, 1: 1685.6. Samples: 41370676. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:37:53,607][45375] Avg episode reward: [(0, '52.060'), (1, '59.350')] +[2023-10-13 03:37:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000080832_82771968.pth... +[2023-10-13 03:37:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000080736_82673664.pth... 
+[2023-10-13 03:37:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000079168_81068032.pth +[2023-10-13 03:37:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000079264_81166336.pth +[2023-10-13 03:37:54,470][46663] Updated weights for policy 1, policy_version 80741 (0.0008) +[2023-10-13 03:37:54,832][46663] Updated weights for policy 1, policy_version 80751 (0.0009) +[2023-10-13 03:37:55,207][46663] Updated weights for policy 1, policy_version 80761 (0.0010) +[2023-10-13 03:37:56,688][46662] Updated weights for policy 0, policy_version 80840 (0.0009) +[2023-10-13 03:37:57,060][46662] Updated weights for policy 0, policy_version 80850 (0.0007) +[2023-10-13 03:37:57,429][46662] Updated weights for policy 0, policy_version 80860 (0.0009) +[2023-10-13 03:37:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165511168. Throughput: 0: 1690.2, 1: 1680.0. Samples: 41381202. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:37:58,607][45375] Avg episode reward: [(0, '52.230'), (1, '59.210')] +[2023-10-13 03:37:59,361][46663] Updated weights for policy 1, policy_version 80771 (0.0007) +[2023-10-13 03:37:59,736][46663] Updated weights for policy 1, policy_version 80781 (0.0008) +[2023-10-13 03:38:00,096][46663] Updated weights for policy 1, policy_version 80791 (0.0008) +[2023-10-13 03:38:01,451][46662] Updated weights for policy 0, policy_version 80870 (0.0008) +[2023-10-13 03:38:01,824][46662] Updated weights for policy 0, policy_version 80880 (0.0010) +[2023-10-13 03:38:02,192][46662] Updated weights for policy 0, policy_version 80890 (0.0010) +[2023-10-13 03:38:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165576704. Throughput: 0: 1679.3, 1: 1690.9. Samples: 41401532. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:03,607][45375] Avg episode reward: [(0, '52.670'), (1, '57.940')] +[2023-10-13 03:38:04,133][46663] Updated weights for policy 1, policy_version 80801 (0.0007) +[2023-10-13 03:38:04,502][46663] Updated weights for policy 1, policy_version 80811 (0.0008) +[2023-10-13 03:38:04,861][46663] Updated weights for policy 1, policy_version 80821 (0.0008) +[2023-10-13 03:38:05,232][46663] Updated weights for policy 1, policy_version 80831 (0.0008) +[2023-10-13 03:38:06,196][46662] Updated weights for policy 0, policy_version 80900 (0.0009) +[2023-10-13 03:38:06,558][46662] Updated weights for policy 0, policy_version 80910 (0.0008) +[2023-10-13 03:38:06,932][46662] Updated weights for policy 0, policy_version 80920 (0.0008) +[2023-10-13 03:38:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165642240. Throughput: 0: 1668.5, 1: 1698.6. Samples: 41421742. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:08,607][45375] Avg episode reward: [(0, '52.670'), (1, '58.860')] +[2023-10-13 03:38:09,112][46663] Updated weights for policy 1, policy_version 80841 (0.0008) +[2023-10-13 03:38:09,478][46663] Updated weights for policy 1, policy_version 80851 (0.0007) +[2023-10-13 03:38:09,840][46663] Updated weights for policy 1, policy_version 80861 (0.0008) +[2023-10-13 03:38:11,208][46662] Updated weights for policy 0, policy_version 80930 (0.0009) +[2023-10-13 03:38:11,580][46662] Updated weights for policy 0, policy_version 80940 (0.0008) +[2023-10-13 03:38:11,962][46662] Updated weights for policy 0, policy_version 80950 (0.0008) +[2023-10-13 03:38:12,335][46662] Updated weights for policy 0, policy_version 80960 (0.0007) +[2023-10-13 03:38:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 165707776. Throughput: 0: 1690.9, 1: 1699.5. Samples: 41432110. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:13,607][45375] Avg episode reward: [(0, '51.600'), (1, '59.030')] +[2023-10-13 03:38:13,989][46663] Updated weights for policy 1, policy_version 80871 (0.0009) +[2023-10-13 03:38:14,352][46663] Updated weights for policy 1, policy_version 80881 (0.0007) +[2023-10-13 03:38:14,720][46663] Updated weights for policy 1, policy_version 80891 (0.0007) +[2023-10-13 03:38:16,412][46662] Updated weights for policy 0, policy_version 80970 (0.0008) +[2023-10-13 03:38:16,791][46662] Updated weights for policy 0, policy_version 80980 (0.0008) +[2023-10-13 03:38:17,156][46662] Updated weights for policy 0, policy_version 80990 (0.0008) +[2023-10-13 03:38:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165773312. Throughput: 0: 1671.5, 1: 1703.5. Samples: 41451946. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:18,607][45375] Avg episode reward: [(0, '53.030'), (1, '58.880')] +[2023-10-13 03:38:18,694][46663] Updated weights for policy 1, policy_version 80901 (0.0008) +[2023-10-13 03:38:19,065][46663] Updated weights for policy 1, policy_version 80911 (0.0009) +[2023-10-13 03:38:19,434][46663] Updated weights for policy 1, policy_version 80921 (0.0008) +[2023-10-13 03:38:21,248][46662] Updated weights for policy 0, policy_version 81000 (0.0011) +[2023-10-13 03:38:21,634][46662] Updated weights for policy 0, policy_version 81010 (0.0010) +[2023-10-13 03:38:21,992][46662] Updated weights for policy 0, policy_version 81020 (0.0008) +[2023-10-13 03:38:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165838848. Throughput: 0: 1673.4, 1: 1698.0. Samples: 41471990. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:23,607][45375] Avg episode reward: [(0, '52.410'), (1, '59.730')] +[2023-10-13 03:38:23,635][46663] Updated weights for policy 1, policy_version 80931 (0.0009) +[2023-10-13 03:38:24,006][46663] Updated weights for policy 1, policy_version 80941 (0.0008) +[2023-10-13 03:38:24,369][46663] Updated weights for policy 1, policy_version 80951 (0.0007) +[2023-10-13 03:38:25,991][46662] Updated weights for policy 0, policy_version 81030 (0.0009) +[2023-10-13 03:38:26,364][46662] Updated weights for policy 0, policy_version 81040 (0.0008) +[2023-10-13 03:38:26,742][46662] Updated weights for policy 0, policy_version 81050 (0.0008) +[2023-10-13 03:38:28,464][46663] Updated weights for policy 1, policy_version 80961 (0.0009) +[2023-10-13 03:38:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 165904384. Throughput: 0: 1682.5, 1: 1693.5. Samples: 41482298. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:28,607][45375] Avg episode reward: [(0, '53.920'), (1, '59.740')] +[2023-10-13 03:38:28,838][46663] Updated weights for policy 1, policy_version 80971 (0.0010) +[2023-10-13 03:38:29,211][46663] Updated weights for policy 1, policy_version 80981 (0.0010) +[2023-10-13 03:38:29,584][46663] Updated weights for policy 1, policy_version 80991 (0.0009) +[2023-10-13 03:38:30,676][46662] Updated weights for policy 0, policy_version 81060 (0.0008) +[2023-10-13 03:38:31,050][46662] Updated weights for policy 0, policy_version 81070 (0.0011) +[2023-10-13 03:38:31,407][46662] Updated weights for policy 0, policy_version 81080 (0.0009) +[2023-10-13 03:38:33,488][46663] Updated weights for policy 1, policy_version 81001 (0.0008) +[2023-10-13 03:38:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165969920. Throughput: 0: 1662.4, 1: 1689.5. Samples: 41501822. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:33,607][45375] Avg episode reward: [(0, '54.930'), (1, '58.510')] +[2023-10-13 03:38:33,857][46663] Updated weights for policy 1, policy_version 81011 (0.0009) +[2023-10-13 03:38:34,226][46663] Updated weights for policy 1, policy_version 81021 (0.0007) +[2023-10-13 03:38:35,521][46662] Updated weights for policy 0, policy_version 81090 (0.0008) +[2023-10-13 03:38:35,886][46662] Updated weights for policy 0, policy_version 81100 (0.0007) +[2023-10-13 03:38:36,262][46662] Updated weights for policy 0, policy_version 81110 (0.0007) +[2023-10-13 03:38:36,635][46662] Updated weights for policy 0, policy_version 81120 (0.0007) +[2023-10-13 03:38:38,380][46663] Updated weights for policy 1, policy_version 81031 (0.0007) +[2023-10-13 03:38:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166035456. Throughput: 0: 1684.3, 1: 1683.7. Samples: 41522236. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-13 03:38:38,607][45375] Avg episode reward: [(0, '55.500'), (1, '58.970')] +[2023-10-13 03:38:38,742][46663] Updated weights for policy 1, policy_version 81041 (0.0010) +[2023-10-13 03:38:39,102][46663] Updated weights for policy 1, policy_version 81051 (0.0010) +[2023-10-13 03:38:40,469][46662] Updated weights for policy 0, policy_version 81130 (0.0010) +[2023-10-13 03:38:40,836][46662] Updated weights for policy 0, policy_version 81140 (0.0008) +[2023-10-13 03:38:41,194][46662] Updated weights for policy 0, policy_version 81150 (0.0009) +[2023-10-13 03:38:43,383][46663] Updated weights for policy 1, policy_version 81061 (0.0009) +[2023-10-13 03:38:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166100992. Throughput: 0: 1670.5, 1: 1687.7. Samples: 41532322. 
Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:38:43,607][45375] Avg episode reward: [(0, '54.590'), (1, '57.390')] +[2023-10-13 03:38:43,748][46663] Updated weights for policy 1, policy_version 81071 (0.0007) +[2023-10-13 03:38:44,113][46663] Updated weights for policy 1, policy_version 81081 (0.0010) +[2023-10-13 03:38:45,325][46662] Updated weights for policy 0, policy_version 81160 (0.0011) +[2023-10-13 03:38:45,699][46662] Updated weights for policy 0, policy_version 81170 (0.0010) +[2023-10-13 03:38:46,067][46662] Updated weights for policy 0, policy_version 81180 (0.0009) +[2023-10-13 03:38:48,152][46663] Updated weights for policy 1, policy_version 81091 (0.0008) +[2023-10-13 03:38:48,515][46663] Updated weights for policy 1, policy_version 81101 (0.0011) +[2023-10-13 03:38:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166166528. Throughput: 0: 1666.7, 1: 1685.1. Samples: 41552362. Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:38:48,607][45375] Avg episode reward: [(0, '54.450'), (1, '56.380')] +[2023-10-13 03:38:48,880][46663] Updated weights for policy 1, policy_version 81111 (0.0011) +[2023-10-13 03:38:50,406][46662] Updated weights for policy 0, policy_version 81190 (0.0008) +[2023-10-13 03:38:50,780][46662] Updated weights for policy 0, policy_version 81200 (0.0008) +[2023-10-13 03:38:51,151][46662] Updated weights for policy 0, policy_version 81210 (0.0010) +[2023-10-13 03:38:52,919][46663] Updated weights for policy 1, policy_version 81121 (0.0010) +[2023-10-13 03:38:53,275][46663] Updated weights for policy 1, policy_version 81131 (0.0011) +[2023-10-13 03:38:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166232064. Throughput: 0: 1679.2, 1: 1668.3. Samples: 41572382. 
Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:38:53,607][45375] Avg episode reward: [(0, '55.250'), (1, '56.790')] +[2023-10-13 03:38:53,639][46663] Updated weights for policy 1, policy_version 81141 (0.0010) +[2023-10-13 03:38:54,002][46663] Updated weights for policy 1, policy_version 81151 (0.0011) +[2023-10-13 03:38:55,241][46662] Updated weights for policy 0, policy_version 81220 (0.0010) +[2023-10-13 03:38:55,607][46662] Updated weights for policy 0, policy_version 81230 (0.0008) +[2023-10-13 03:38:55,974][46662] Updated weights for policy 0, policy_version 81240 (0.0009) +[2023-10-13 03:38:58,187][46663] Updated weights for policy 1, policy_version 81161 (0.0009) +[2023-10-13 03:38:58,554][46663] Updated weights for policy 1, policy_version 81171 (0.0008) +[2023-10-13 03:38:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166297600. Throughput: 0: 1669.6, 1: 1677.5. Samples: 41582726. Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:38:58,607][45375] Avg episode reward: [(0, '57.930'), (1, '56.770')] +[2023-10-13 03:38:58,932][46663] Updated weights for policy 1, policy_version 81181 (0.0007) +[2023-10-13 03:39:00,079][46662] Updated weights for policy 0, policy_version 81250 (0.0008) +[2023-10-13 03:39:00,446][46662] Updated weights for policy 0, policy_version 81260 (0.0008) +[2023-10-13 03:39:00,824][46662] Updated weights for policy 0, policy_version 81270 (0.0009) +[2023-10-13 03:39:01,184][46662] Updated weights for policy 0, policy_version 81280 (0.0009) +[2023-10-13 03:39:02,920][46663] Updated weights for policy 1, policy_version 81191 (0.0007) +[2023-10-13 03:39:03,298][46663] Updated weights for policy 1, policy_version 81201 (0.0007) +[2023-10-13 03:39:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166363136. Throughput: 0: 1676.1, 1: 1680.0. Samples: 41602970. 
Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:03,607][45375] Avg episode reward: [(0, '58.510'), (1, '55.820')] +[2023-10-13 03:39:03,670][46663] Updated weights for policy 1, policy_version 81211 (0.0009) +[2023-10-13 03:39:05,221][46662] Updated weights for policy 0, policy_version 81290 (0.0008) +[2023-10-13 03:39:05,592][46662] Updated weights for policy 0, policy_version 81300 (0.0007) +[2023-10-13 03:39:05,960][46662] Updated weights for policy 0, policy_version 81310 (0.0008) +[2023-10-13 03:39:07,580][46663] Updated weights for policy 1, policy_version 81221 (0.0009) +[2023-10-13 03:39:07,953][46663] Updated weights for policy 1, policy_version 81231 (0.0011) +[2023-10-13 03:39:08,315][46663] Updated weights for policy 1, policy_version 81241 (0.0009) +[2023-10-13 03:39:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166461440. Throughput: 0: 1687.5, 1: 1663.6. Samples: 41622786. Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:08,607][45375] Avg episode reward: [(0, '58.210'), (1, '55.310')] +[2023-10-13 03:39:10,055][46662] Updated weights for policy 0, policy_version 81320 (0.0009) +[2023-10-13 03:39:10,425][46662] Updated weights for policy 0, policy_version 81330 (0.0007) +[2023-10-13 03:39:10,800][46662] Updated weights for policy 0, policy_version 81340 (0.0010) +[2023-10-13 03:39:12,361][46663] Updated weights for policy 1, policy_version 81251 (0.0011) +[2023-10-13 03:39:12,722][46663] Updated weights for policy 1, policy_version 81261 (0.0009) +[2023-10-13 03:39:13,093][46663] Updated weights for policy 1, policy_version 81271 (0.0008) +[2023-10-13 03:39:13,607][45375] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166526976. Throughput: 0: 1664.3, 1: 1687.5. Samples: 41633132. 
Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:13,608][45375] Avg episode reward: [(0, '59.150'), (1, '54.640')] +[2023-10-13 03:39:14,780][46662] Updated weights for policy 0, policy_version 81350 (0.0009) +[2023-10-13 03:39:15,141][46662] Updated weights for policy 0, policy_version 81360 (0.0009) +[2023-10-13 03:39:15,521][46662] Updated weights for policy 0, policy_version 81370 (0.0009) +[2023-10-13 03:39:17,129][46663] Updated weights for policy 1, policy_version 81281 (0.0008) +[2023-10-13 03:39:17,500][46663] Updated weights for policy 1, policy_version 81291 (0.0008) +[2023-10-13 03:39:17,858][46663] Updated weights for policy 1, policy_version 81301 (0.0008) +[2023-10-13 03:39:18,219][46663] Updated weights for policy 1, policy_version 81311 (0.0007) +[2023-10-13 03:39:18,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166592512. Throughput: 0: 1690.6, 1: 1681.3. Samples: 41653560. Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:18,608][45375] Avg episode reward: [(0, '60.200'), (1, '53.160')] +[2023-10-13 03:39:19,708][46662] Updated weights for policy 0, policy_version 81380 (0.0008) +[2023-10-13 03:39:20,096][46662] Updated weights for policy 0, policy_version 81390 (0.0010) +[2023-10-13 03:39:20,455][46662] Updated weights for policy 0, policy_version 81400 (0.0008) +[2023-10-13 03:39:22,398][46663] Updated weights for policy 1, policy_version 81321 (0.0008) +[2023-10-13 03:39:22,763][46663] Updated weights for policy 1, policy_version 81331 (0.0009) +[2023-10-13 03:39:23,131][46663] Updated weights for policy 1, policy_version 81341 (0.0009) +[2023-10-13 03:39:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166658048. Throughput: 0: 1692.8, 1: 1662.2. Samples: 41673212. 
Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:23,607][45375] Avg episode reward: [(0, '60.460'), (1, '52.270')] +[2023-10-13 03:39:24,356][46662] Updated weights for policy 0, policy_version 81410 (0.0008) +[2023-10-13 03:39:24,724][46662] Updated weights for policy 0, policy_version 81420 (0.0009) +[2023-10-13 03:39:25,104][46662] Updated weights for policy 0, policy_version 81430 (0.0008) +[2023-10-13 03:39:25,474][46662] Updated weights for policy 0, policy_version 81440 (0.0009) +[2023-10-13 03:39:27,163][46663] Updated weights for policy 1, policy_version 81351 (0.0008) +[2023-10-13 03:39:27,529][46663] Updated weights for policy 1, policy_version 81361 (0.0008) +[2023-10-13 03:39:27,889][46663] Updated weights for policy 1, policy_version 81371 (0.0009) +[2023-10-13 03:39:28,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166723584. Throughput: 0: 1675.9, 1: 1688.0. Samples: 41683700. Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:28,607][45375] Avg episode reward: [(0, '59.450'), (1, '51.850')] +[2023-10-13 03:39:29,476][46662] Updated weights for policy 0, policy_version 81450 (0.0007) +[2023-10-13 03:39:29,836][46662] Updated weights for policy 0, policy_version 81460 (0.0008) +[2023-10-13 03:39:30,195][46662] Updated weights for policy 0, policy_version 81470 (0.0008) +[2023-10-13 03:39:32,162][46663] Updated weights for policy 1, policy_version 81381 (0.0009) +[2023-10-13 03:39:32,528][46663] Updated weights for policy 1, policy_version 81391 (0.0012) +[2023-10-13 03:39:32,891][46663] Updated weights for policy 1, policy_version 81401 (0.0007) +[2023-10-13 03:39:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166789120. Throughput: 0: 1692.9, 1: 1676.7. Samples: 41703994. 
Policy #0 lag: (min: 8.0, avg: 31.3, max: 40.0) +[2023-10-13 03:39:33,608][45375] Avg episode reward: [(0, '60.930'), (1, '52.850')] +[2023-10-13 03:39:34,287][46662] Updated weights for policy 0, policy_version 81480 (0.0009) +[2023-10-13 03:39:34,664][46662] Updated weights for policy 0, policy_version 81490 (0.0009) +[2023-10-13 03:39:35,030][46662] Updated weights for policy 0, policy_version 81500 (0.0007) +[2023-10-13 03:39:36,936][46663] Updated weights for policy 1, policy_version 81411 (0.0009) +[2023-10-13 03:39:37,305][46663] Updated weights for policy 1, policy_version 81421 (0.0011) +[2023-10-13 03:39:37,675][46663] Updated weights for policy 1, policy_version 81431 (0.0011) +[2023-10-13 03:39:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166854656. Throughput: 0: 1695.1, 1: 1667.6. Samples: 41723702. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:39:38,607][45375] Avg episode reward: [(0, '59.740'), (1, '52.090')] +[2023-10-13 03:39:39,217][46662] Updated weights for policy 0, policy_version 81510 (0.0009) +[2023-10-13 03:39:39,581][46662] Updated weights for policy 0, policy_version 81520 (0.0011) +[2023-10-13 03:39:39,938][46662] Updated weights for policy 0, policy_version 81530 (0.0007) +[2023-10-13 03:39:41,899][46663] Updated weights for policy 1, policy_version 81441 (0.0010) +[2023-10-13 03:39:42,261][46663] Updated weights for policy 1, policy_version 81451 (0.0007) +[2023-10-13 03:39:42,637][46663] Updated weights for policy 1, policy_version 81461 (0.0010) +[2023-10-13 03:39:43,013][46663] Updated weights for policy 1, policy_version 81471 (0.0008) +[2023-10-13 03:39:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166920192. Throughput: 0: 1675.1, 1: 1684.7. Samples: 41733918. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:39:43,608][45375] Avg episode reward: [(0, '58.340'), (1, '51.960')] +[2023-10-13 03:39:43,899][46662] Updated weights for policy 0, policy_version 81540 (0.0007) +[2023-10-13 03:39:44,276][46662] Updated weights for policy 0, policy_version 81550 (0.0008) +[2023-10-13 03:39:44,635][46662] Updated weights for policy 0, policy_version 81560 (0.0009) +[2023-10-13 03:39:46,998][46663] Updated weights for policy 1, policy_version 81481 (0.0007) +[2023-10-13 03:39:47,361][46663] Updated weights for policy 1, policy_version 81491 (0.0009) +[2023-10-13 03:39:47,730][46663] Updated weights for policy 1, policy_version 81501 (0.0009) +[2023-10-13 03:39:48,574][46662] Updated weights for policy 0, policy_version 81570 (0.0010) +[2023-10-13 03:39:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166985728. Throughput: 0: 1690.6, 1: 1667.2. Samples: 41754072. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:39:48,607][45375] Avg episode reward: [(0, '58.060'), (1, '51.600')] +[2023-10-13 03:39:48,941][46662] Updated weights for policy 0, policy_version 81580 (0.0011) +[2023-10-13 03:39:49,306][46662] Updated weights for policy 0, policy_version 81590 (0.0010) +[2023-10-13 03:39:49,676][46662] Updated weights for policy 0, policy_version 81600 (0.0009) +[2023-10-13 03:39:51,655][46663] Updated weights for policy 1, policy_version 81511 (0.0008) +[2023-10-13 03:39:52,020][46663] Updated weights for policy 1, policy_version 81521 (0.0009) +[2023-10-13 03:39:52,397][46663] Updated weights for policy 1, policy_version 81531 (0.0009) +[2023-10-13 03:39:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167051264. Throughput: 0: 1688.7, 1: 1676.5. Samples: 41774222. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:39:53,607][45375] Avg episode reward: [(0, '59.270'), (1, '53.110')] +[2023-10-13 03:39:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000081536_83492864.pth... +[2023-10-13 03:39:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000079968_81887232.pth +[2023-10-13 03:39:53,979][46662] Updated weights for policy 0, policy_version 81610 (0.0008) +[2023-10-13 03:39:54,337][46662] Updated weights for policy 0, policy_version 81620 (0.0009) +[2023-10-13 03:39:54,704][46662] Updated weights for policy 0, policy_version 81630 (0.0008) +[2023-10-13 03:39:54,777][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000081632_83591168.pth... +[2023-10-13 03:39:54,806][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000080032_81952768.pth +[2023-10-13 03:39:56,423][46663] Updated weights for policy 1, policy_version 81541 (0.0010) +[2023-10-13 03:39:56,793][46663] Updated weights for policy 1, policy_version 81551 (0.0009) +[2023-10-13 03:39:57,158][46663] Updated weights for policy 1, policy_version 81561 (0.0008) +[2023-10-13 03:39:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167116800. Throughput: 0: 1682.4, 1: 1681.7. Samples: 41784512. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:39:58,607][45375] Avg episode reward: [(0, '59.590'), (1, '53.820')] +[2023-10-13 03:39:58,888][46662] Updated weights for policy 0, policy_version 81640 (0.0010) +[2023-10-13 03:39:59,257][46662] Updated weights for policy 0, policy_version 81650 (0.0008) +[2023-10-13 03:39:59,619][46662] Updated weights for policy 0, policy_version 81660 (0.0009) +[2023-10-13 03:40:01,296][46663] Updated weights for policy 1, policy_version 81571 (0.0009) +[2023-10-13 03:40:01,690][46663] Updated weights for policy 1, policy_version 81581 (0.0008) +[2023-10-13 03:40:02,048][46663] Updated weights for policy 1, policy_version 81591 (0.0011) +[2023-10-13 03:40:03,554][46662] Updated weights for policy 0, policy_version 81670 (0.0007) +[2023-10-13 03:40:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167182336. Throughput: 0: 1682.6, 1: 1660.7. Samples: 41804008. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:40:03,608][45375] Avg episode reward: [(0, '61.100'), (1, '53.730')] +[2023-10-13 03:40:03,933][46662] Updated weights for policy 0, policy_version 81680 (0.0007) +[2023-10-13 03:40:04,296][46662] Updated weights for policy 0, policy_version 81690 (0.0009) +[2023-10-13 03:40:05,990][46663] Updated weights for policy 1, policy_version 81601 (0.0010) +[2023-10-13 03:40:06,366][46663] Updated weights for policy 1, policy_version 81611 (0.0008) +[2023-10-13 03:40:06,739][46663] Updated weights for policy 1, policy_version 81621 (0.0008) +[2023-10-13 03:40:07,117][46663] Updated weights for policy 1, policy_version 81631 (0.0008) +[2023-10-13 03:40:08,457][46662] Updated weights for policy 0, policy_version 81700 (0.0009) +[2023-10-13 03:40:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167247872. Throughput: 0: 1683.1, 1: 1682.9. Samples: 41824682. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:40:08,607][45375] Avg episode reward: [(0, '60.880'), (1, '53.440')] +[2023-10-13 03:40:08,826][46662] Updated weights for policy 0, policy_version 81710 (0.0011) +[2023-10-13 03:40:09,194][46662] Updated weights for policy 0, policy_version 81720 (0.0010) +[2023-10-13 03:40:11,237][46663] Updated weights for policy 1, policy_version 81641 (0.0008) +[2023-10-13 03:40:11,609][46663] Updated weights for policy 1, policy_version 81651 (0.0009) +[2023-10-13 03:40:11,971][46663] Updated weights for policy 1, policy_version 81661 (0.0009) +[2023-10-13 03:40:13,246][46662] Updated weights for policy 0, policy_version 81730 (0.0010) +[2023-10-13 03:40:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167313408. Throughput: 0: 1681.0, 1: 1670.5. Samples: 41834518. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:40:13,607][45375] Avg episode reward: [(0, '59.280'), (1, '55.000')] +[2023-10-13 03:40:13,612][46662] Updated weights for policy 0, policy_version 81740 (0.0008) +[2023-10-13 03:40:13,986][46662] Updated weights for policy 0, policy_version 81750 (0.0009) +[2023-10-13 03:40:14,358][46662] Updated weights for policy 0, policy_version 81760 (0.0008) +[2023-10-13 03:40:16,081][46663] Updated weights for policy 1, policy_version 81671 (0.0010) +[2023-10-13 03:40:16,456][46663] Updated weights for policy 1, policy_version 81681 (0.0010) +[2023-10-13 03:40:16,821][46663] Updated weights for policy 1, policy_version 81691 (0.0007) +[2023-10-13 03:40:18,518][46662] Updated weights for policy 0, policy_version 81770 (0.0008) +[2023-10-13 03:40:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167378944. Throughput: 0: 1679.9, 1: 1664.5. Samples: 41854490. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:40:18,607][45375] Avg episode reward: [(0, '59.170'), (1, '55.530')] +[2023-10-13 03:40:18,889][46662] Updated weights for policy 0, policy_version 81780 (0.0007) +[2023-10-13 03:40:19,261][46662] Updated weights for policy 0, policy_version 81790 (0.0009) +[2023-10-13 03:40:20,840][46663] Updated weights for policy 1, policy_version 81701 (0.0009) +[2023-10-13 03:40:21,204][46663] Updated weights for policy 1, policy_version 81711 (0.0009) +[2023-10-13 03:40:21,575][46663] Updated weights for policy 1, policy_version 81721 (0.0007) +[2023-10-13 03:40:23,342][46662] Updated weights for policy 0, policy_version 81800 (0.0009) +[2023-10-13 03:40:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 167444480. Throughput: 0: 1679.5, 1: 1685.6. Samples: 41875130. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:40:23,608][45375] Avg episode reward: [(0, '59.470'), (1, '56.340')] +[2023-10-13 03:40:23,710][46662] Updated weights for policy 0, policy_version 81810 (0.0009) +[2023-10-13 03:40:24,071][46662] Updated weights for policy 0, policy_version 81820 (0.0009) +[2023-10-13 03:40:25,675][46663] Updated weights for policy 1, policy_version 81731 (0.0009) +[2023-10-13 03:40:26,049][46663] Updated weights for policy 1, policy_version 81741 (0.0007) +[2023-10-13 03:40:26,415][46663] Updated weights for policy 1, policy_version 81751 (0.0008) +[2023-10-13 03:40:28,084][46662] Updated weights for policy 0, policy_version 81830 (0.0009) +[2023-10-13 03:40:28,453][46662] Updated weights for policy 0, policy_version 81840 (0.0008) +[2023-10-13 03:40:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167510016. Throughput: 0: 1680.5, 1: 1669.5. Samples: 41884668. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-13 03:40:28,607][45375] Avg episode reward: [(0, '58.600'), (1, '58.060')] +[2023-10-13 03:40:28,825][46662] Updated weights for policy 0, policy_version 81850 (0.0008) +[2023-10-13 03:40:30,521][46663] Updated weights for policy 1, policy_version 81761 (0.0007) +[2023-10-13 03:40:30,881][46663] Updated weights for policy 1, policy_version 81771 (0.0008) +[2023-10-13 03:40:31,254][46663] Updated weights for policy 1, policy_version 81781 (0.0010) +[2023-10-13 03:40:31,635][46663] Updated weights for policy 1, policy_version 81791 (0.0009) +[2023-10-13 03:40:32,987][46662] Updated weights for policy 0, policy_version 81860 (0.0009) +[2023-10-13 03:40:33,362][46662] Updated weights for policy 0, policy_version 81870 (0.0009) +[2023-10-13 03:40:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167575552. Throughput: 0: 1680.3, 1: 1672.9. Samples: 41904964. Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:40:33,607][45375] Avg episode reward: [(0, '57.740'), (1, '57.720')] +[2023-10-13 03:40:33,729][46662] Updated weights for policy 0, policy_version 81880 (0.0011) +[2023-10-13 03:40:35,688][46663] Updated weights for policy 1, policy_version 81801 (0.0007) +[2023-10-13 03:40:36,061][46663] Updated weights for policy 1, policy_version 81811 (0.0008) +[2023-10-13 03:40:36,429][46663] Updated weights for policy 1, policy_version 81821 (0.0008) +[2023-10-13 03:40:37,639][46662] Updated weights for policy 0, policy_version 81890 (0.0010) +[2023-10-13 03:40:38,018][46662] Updated weights for policy 0, policy_version 81900 (0.0009) +[2023-10-13 03:40:38,377][46662] Updated weights for policy 0, policy_version 81910 (0.0010) +[2023-10-13 03:40:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167641088. Throughput: 0: 1676.5, 1: 1682.8. Samples: 41925394. 
Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:40:38,607][45375] Avg episode reward: [(0, '58.550'), (1, '58.640')] +[2023-10-13 03:40:38,754][46662] Updated weights for policy 0, policy_version 81920 (0.0010) +[2023-10-13 03:40:40,522][46663] Updated weights for policy 1, policy_version 81831 (0.0009) +[2023-10-13 03:40:40,895][46663] Updated weights for policy 1, policy_version 81841 (0.0009) +[2023-10-13 03:40:41,256][46663] Updated weights for policy 1, policy_version 81851 (0.0009) +[2023-10-13 03:40:42,835][46662] Updated weights for policy 0, policy_version 81930 (0.0008) +[2023-10-13 03:40:43,206][46662] Updated weights for policy 0, policy_version 81940 (0.0007) +[2023-10-13 03:40:43,575][46662] Updated weights for policy 0, policy_version 81950 (0.0009) +[2023-10-13 03:40:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167706624. Throughput: 0: 1681.4, 1: 1657.9. Samples: 41934778. Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:40:43,607][45375] Avg episode reward: [(0, '57.500'), (1, '57.100')] +[2023-10-13 03:40:45,314][46663] Updated weights for policy 1, policy_version 81861 (0.0009) +[2023-10-13 03:40:45,678][46663] Updated weights for policy 1, policy_version 81871 (0.0011) +[2023-10-13 03:40:46,052][46663] Updated weights for policy 1, policy_version 81881 (0.0008) +[2023-10-13 03:40:47,626][46662] Updated weights for policy 0, policy_version 81960 (0.0009) +[2023-10-13 03:40:47,996][46662] Updated weights for policy 0, policy_version 81970 (0.0009) +[2023-10-13 03:40:48,374][46662] Updated weights for policy 0, policy_version 81980 (0.0008) +[2023-10-13 03:40:48,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 167804928. Throughput: 0: 1684.8, 1: 1682.5. Samples: 41955536. 
Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:40:48,607][45375] Avg episode reward: [(0, '57.750'), (1, '56.830')] +[2023-10-13 03:40:50,283][46663] Updated weights for policy 1, policy_version 81891 (0.0009) +[2023-10-13 03:40:50,687][46663] Updated weights for policy 1, policy_version 81901 (0.0009) +[2023-10-13 03:40:51,048][46663] Updated weights for policy 1, policy_version 81911 (0.0011) +[2023-10-13 03:40:52,520][46662] Updated weights for policy 0, policy_version 81990 (0.0008) +[2023-10-13 03:40:52,885][46662] Updated weights for policy 0, policy_version 82000 (0.0011) +[2023-10-13 03:40:53,257][46662] Updated weights for policy 0, policy_version 82010 (0.0010) +[2023-10-13 03:40:53,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167870464. Throughput: 0: 1668.9, 1: 1683.6. Samples: 41975544. Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:40:53,607][45375] Avg episode reward: [(0, '57.870'), (1, '56.420')] +[2023-10-13 03:40:54,960][46663] Updated weights for policy 1, policy_version 81921 (0.0008) +[2023-10-13 03:40:55,318][46663] Updated weights for policy 1, policy_version 81931 (0.0010) +[2023-10-13 03:40:55,688][46663] Updated weights for policy 1, policy_version 81941 (0.0009) +[2023-10-13 03:40:56,050][46663] Updated weights for policy 1, policy_version 81951 (0.0009) +[2023-10-13 03:40:57,329][46662] Updated weights for policy 0, policy_version 82020 (0.0007) +[2023-10-13 03:40:57,711][46662] Updated weights for policy 0, policy_version 82030 (0.0008) +[2023-10-13 03:40:58,076][46662] Updated weights for policy 0, policy_version 82040 (0.0008) +[2023-10-13 03:40:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167936000. Throughput: 0: 1683.6, 1: 1663.4. Samples: 41985136. 
Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:40:58,607][45375] Avg episode reward: [(0, '56.350'), (1, '56.930')] +[2023-10-13 03:41:00,205][46663] Updated weights for policy 1, policy_version 81961 (0.0010) +[2023-10-13 03:41:00,570][46663] Updated weights for policy 1, policy_version 81971 (0.0009) +[2023-10-13 03:41:00,938][46663] Updated weights for policy 1, policy_version 81981 (0.0009) +[2023-10-13 03:41:01,981][46662] Updated weights for policy 0, policy_version 82050 (0.0008) +[2023-10-13 03:41:02,357][46662] Updated weights for policy 0, policy_version 82060 (0.0008) +[2023-10-13 03:41:02,721][46662] Updated weights for policy 0, policy_version 82070 (0.0010) +[2023-10-13 03:41:03,089][46662] Updated weights for policy 0, policy_version 82080 (0.0011) +[2023-10-13 03:41:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 168001536. Throughput: 0: 1684.0, 1: 1682.2. Samples: 42005966. Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:41:03,607][45375] Avg episode reward: [(0, '55.480'), (1, '57.260')] +[2023-10-13 03:41:04,909][46663] Updated weights for policy 1, policy_version 81991 (0.0007) +[2023-10-13 03:41:05,275][46663] Updated weights for policy 1, policy_version 82001 (0.0009) +[2023-10-13 03:41:05,644][46663] Updated weights for policy 1, policy_version 82011 (0.0008) +[2023-10-13 03:41:07,190][46662] Updated weights for policy 0, policy_version 82090 (0.0008) +[2023-10-13 03:41:07,561][46662] Updated weights for policy 0, policy_version 82100 (0.0010) +[2023-10-13 03:41:07,921][46662] Updated weights for policy 0, policy_version 82110 (0.0007) +[2023-10-13 03:41:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168067072. Throughput: 0: 1654.9, 1: 1684.7. Samples: 42025412. 
Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:41:08,607][45375] Avg episode reward: [(0, '55.410'), (1, '58.280')] +[2023-10-13 03:41:09,538][46663] Updated weights for policy 1, policy_version 82021 (0.0008) +[2023-10-13 03:41:09,913][46663] Updated weights for policy 1, policy_version 82031 (0.0007) +[2023-10-13 03:41:10,283][46663] Updated weights for policy 1, policy_version 82041 (0.0009) +[2023-10-13 03:41:12,160][46662] Updated weights for policy 0, policy_version 82120 (0.0010) +[2023-10-13 03:41:12,529][46662] Updated weights for policy 0, policy_version 82130 (0.0009) +[2023-10-13 03:41:12,890][46662] Updated weights for policy 0, policy_version 82140 (0.0008) +[2023-10-13 03:41:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168132608. Throughput: 0: 1680.6, 1: 1670.8. Samples: 42035484. Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:41:13,607][45375] Avg episode reward: [(0, '54.740'), (1, '58.570')] +[2023-10-13 03:41:14,409][46663] Updated weights for policy 1, policy_version 82051 (0.0009) +[2023-10-13 03:41:14,779][46663] Updated weights for policy 1, policy_version 82061 (0.0010) +[2023-10-13 03:41:15,142][46663] Updated weights for policy 1, policy_version 82071 (0.0011) +[2023-10-13 03:41:17,019][46662] Updated weights for policy 0, policy_version 82150 (0.0011) +[2023-10-13 03:41:17,385][46662] Updated weights for policy 0, policy_version 82160 (0.0010) +[2023-10-13 03:41:17,753][46662] Updated weights for policy 0, policy_version 82170 (0.0011) +[2023-10-13 03:41:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168198144. Throughput: 0: 1681.6, 1: 1683.4. Samples: 42056388. 
Policy #0 lag: (min: 0.0, avg: 29.2, max: 32.0) +[2023-10-13 03:41:18,607][45375] Avg episode reward: [(0, '55.050'), (1, '57.530')] +[2023-10-13 03:41:18,979][46663] Updated weights for policy 1, policy_version 82081 (0.0008) +[2023-10-13 03:41:19,346][46663] Updated weights for policy 1, policy_version 82091 (0.0009) +[2023-10-13 03:41:19,710][46663] Updated weights for policy 1, policy_version 82101 (0.0007) +[2023-10-13 03:41:20,089][46663] Updated weights for policy 1, policy_version 82111 (0.0008) +[2023-10-13 03:41:21,621][46662] Updated weights for policy 0, policy_version 82180 (0.0009) +[2023-10-13 03:41:21,984][46662] Updated weights for policy 0, policy_version 82190 (0.0007) +[2023-10-13 03:41:22,353][46662] Updated weights for policy 0, policy_version 82200 (0.0009) +[2023-10-13 03:41:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 168263680. Throughput: 0: 1661.5, 1: 1686.7. Samples: 42076064. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:23,607][45375] Avg episode reward: [(0, '55.150'), (1, '56.980')] +[2023-10-13 03:41:24,198][46663] Updated weights for policy 1, policy_version 82121 (0.0012) +[2023-10-13 03:41:24,571][46663] Updated weights for policy 1, policy_version 82131 (0.0010) +[2023-10-13 03:41:24,941][46663] Updated weights for policy 1, policy_version 82141 (0.0012) +[2023-10-13 03:41:26,306][46662] Updated weights for policy 0, policy_version 82210 (0.0009) +[2023-10-13 03:41:26,678][46662] Updated weights for policy 0, policy_version 82220 (0.0009) +[2023-10-13 03:41:27,051][46662] Updated weights for policy 0, policy_version 82230 (0.0009) +[2023-10-13 03:41:27,423][46662] Updated weights for policy 0, policy_version 82240 (0.0010) +[2023-10-13 03:41:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168329216. Throughput: 0: 1689.8, 1: 1683.3. Samples: 42086568. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:28,608][45375] Avg episode reward: [(0, '56.220'), (1, '56.350')] +[2023-10-13 03:41:29,107][46663] Updated weights for policy 1, policy_version 82151 (0.0008) +[2023-10-13 03:41:29,477][46663] Updated weights for policy 1, policy_version 82161 (0.0009) +[2023-10-13 03:41:29,841][46663] Updated weights for policy 1, policy_version 82171 (0.0008) +[2023-10-13 03:41:31,566][46662] Updated weights for policy 0, policy_version 82250 (0.0008) +[2023-10-13 03:41:31,935][46662] Updated weights for policy 0, policy_version 82260 (0.0007) +[2023-10-13 03:41:32,315][46662] Updated weights for policy 0, policy_version 82270 (0.0008) +[2023-10-13 03:41:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168394752. Throughput: 0: 1670.8, 1: 1694.9. Samples: 42106990. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:33,607][45375] Avg episode reward: [(0, '56.760'), (1, '58.560')] +[2023-10-13 03:41:33,900][46663] Updated weights for policy 1, policy_version 82181 (0.0008) +[2023-10-13 03:41:34,275][46663] Updated weights for policy 1, policy_version 82191 (0.0007) +[2023-10-13 03:41:34,638][46663] Updated weights for policy 1, policy_version 82201 (0.0008) +[2023-10-13 03:41:36,511][46662] Updated weights for policy 0, policy_version 82280 (0.0009) +[2023-10-13 03:41:36,886][46662] Updated weights for policy 0, policy_version 82290 (0.0009) +[2023-10-13 03:41:37,260][46662] Updated weights for policy 0, policy_version 82300 (0.0007) +[2023-10-13 03:41:38,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168460288. Throughput: 0: 1669.1, 1: 1694.5. Samples: 42126904. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:38,607][45375] Avg episode reward: [(0, '56.210'), (1, '57.310')] +[2023-10-13 03:41:38,824][46663] Updated weights for policy 1, policy_version 82211 (0.0009) +[2023-10-13 03:41:39,225][46663] Updated weights for policy 1, policy_version 82221 (0.0011) +[2023-10-13 03:41:39,592][46663] Updated weights for policy 1, policy_version 82231 (0.0009) +[2023-10-13 03:41:41,223][46662] Updated weights for policy 0, policy_version 82310 (0.0009) +[2023-10-13 03:41:41,590][46662] Updated weights for policy 0, policy_version 82320 (0.0010) +[2023-10-13 03:41:41,971][46662] Updated weights for policy 0, policy_version 82330 (0.0010) +[2023-10-13 03:41:43,512][46663] Updated weights for policy 1, policy_version 82241 (0.0009) +[2023-10-13 03:41:43,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 168525824. Throughput: 0: 1687.3, 1: 1694.0. Samples: 42137296. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:43,607][45375] Avg episode reward: [(0, '57.870'), (1, '56.970')] +[2023-10-13 03:41:43,873][46663] Updated weights for policy 1, policy_version 82251 (0.0009) +[2023-10-13 03:41:44,238][46663] Updated weights for policy 1, policy_version 82261 (0.0010) +[2023-10-13 03:41:44,607][46663] Updated weights for policy 1, policy_version 82271 (0.0008) +[2023-10-13 03:41:46,099][46662] Updated weights for policy 0, policy_version 82340 (0.0008) +[2023-10-13 03:41:46,474][46662] Updated weights for policy 0, policy_version 82350 (0.0007) +[2023-10-13 03:41:46,843][46662] Updated weights for policy 0, policy_version 82360 (0.0007) +[2023-10-13 03:41:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168591360. Throughput: 0: 1666.9, 1: 1690.0. Samples: 42157024. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:48,607][45375] Avg episode reward: [(0, '58.010'), (1, '57.710')] +[2023-10-13 03:41:48,929][46663] Updated weights for policy 1, policy_version 82281 (0.0009) +[2023-10-13 03:41:49,280][46663] Updated weights for policy 1, policy_version 82291 (0.0009) +[2023-10-13 03:41:49,656][46663] Updated weights for policy 1, policy_version 82301 (0.0009) +[2023-10-13 03:41:50,869][46662] Updated weights for policy 0, policy_version 82370 (0.0008) +[2023-10-13 03:41:51,248][46662] Updated weights for policy 0, policy_version 82380 (0.0009) +[2023-10-13 03:41:51,611][46662] Updated weights for policy 0, policy_version 82390 (0.0007) +[2023-10-13 03:41:51,978][46662] Updated weights for policy 0, policy_version 82400 (0.0009) +[2023-10-13 03:41:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168656896. Throughput: 0: 1682.9, 1: 1683.4. Samples: 42176894. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:53,607][45375] Avg episode reward: [(0, '58.610'), (1, '57.780')] +[2023-10-13 03:41:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000082400_84377600.pth... +[2023-10-13 03:41:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000080832_82771968.pth +[2023-10-13 03:41:53,804][46663] Updated weights for policy 1, policy_version 82311 (0.0009) +[2023-10-13 03:41:54,167][46663] Updated weights for policy 1, policy_version 82321 (0.0009) +[2023-10-13 03:41:54,538][46663] Updated weights for policy 1, policy_version 82331 (0.0009) +[2023-10-13 03:41:54,723][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000082336_84312064.pth... 
+[2023-10-13 03:41:54,760][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000080736_82673664.pth +[2023-10-13 03:41:56,062][46662] Updated weights for policy 0, policy_version 82410 (0.0007) +[2023-10-13 03:41:56,437][46662] Updated weights for policy 0, policy_version 82420 (0.0009) +[2023-10-13 03:41:56,805][46662] Updated weights for policy 0, policy_version 82430 (0.0009) +[2023-10-13 03:41:58,587][46663] Updated weights for policy 1, policy_version 82341 (0.0008) +[2023-10-13 03:41:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168722432. Throughput: 0: 1690.7, 1: 1684.7. Samples: 42187376. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:41:58,607][45375] Avg episode reward: [(0, '58.810'), (1, '56.970')] +[2023-10-13 03:41:58,958][46663] Updated weights for policy 1, policy_version 82351 (0.0009) +[2023-10-13 03:41:59,317][46663] Updated weights for policy 1, policy_version 82361 (0.0009) +[2023-10-13 03:42:00,734][46662] Updated weights for policy 0, policy_version 82440 (0.0007) +[2023-10-13 03:42:01,101][46662] Updated weights for policy 0, policy_version 82450 (0.0007) +[2023-10-13 03:42:01,471][46662] Updated weights for policy 0, policy_version 82460 (0.0008) +[2023-10-13 03:42:03,348][46663] Updated weights for policy 1, policy_version 82371 (0.0010) +[2023-10-13 03:42:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168787968. Throughput: 0: 1665.4, 1: 1684.8. Samples: 42207148. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:42:03,608][45375] Avg episode reward: [(0, '58.810'), (1, '59.740')] +[2023-10-13 03:42:03,721][46663] Updated weights for policy 1, policy_version 82381 (0.0009) +[2023-10-13 03:42:04,089][46663] Updated weights for policy 1, policy_version 82391 (0.0010) +[2023-10-13 03:42:05,504][46662] Updated weights for policy 0, policy_version 82470 (0.0010) +[2023-10-13 03:42:05,879][46662] Updated weights for policy 0, policy_version 82480 (0.0007) +[2023-10-13 03:42:06,258][46662] Updated weights for policy 0, policy_version 82490 (0.0007) +[2023-10-13 03:42:08,402][46663] Updated weights for policy 1, policy_version 82401 (0.0009) +[2023-10-13 03:42:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168853504. Throughput: 0: 1689.7, 1: 1677.4. Samples: 42227586. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:42:08,607][45375] Avg episode reward: [(0, '58.380'), (1, '59.630')] +[2023-10-13 03:42:08,764][46663] Updated weights for policy 1, policy_version 82411 (0.0008) +[2023-10-13 03:42:09,128][46663] Updated weights for policy 1, policy_version 82421 (0.0009) +[2023-10-13 03:42:09,496][46663] Updated weights for policy 1, policy_version 82431 (0.0008) +[2023-10-13 03:42:10,271][46662] Updated weights for policy 0, policy_version 82500 (0.0007) +[2023-10-13 03:42:10,640][46662] Updated weights for policy 0, policy_version 82510 (0.0007) +[2023-10-13 03:42:11,014][46662] Updated weights for policy 0, policy_version 82520 (0.0008) +[2023-10-13 03:42:13,522][46663] Updated weights for policy 1, policy_version 82441 (0.0010) +[2023-10-13 03:42:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168919040. Throughput: 0: 1673.2, 1: 1680.7. Samples: 42237494. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-13 03:42:13,607][45375] Avg episode reward: [(0, '57.630'), (1, '58.910')] +[2023-10-13 03:42:13,885][46663] Updated weights for policy 1, policy_version 82451 (0.0007) +[2023-10-13 03:42:14,249][46663] Updated weights for policy 1, policy_version 82461 (0.0010) +[2023-10-13 03:42:15,121][46662] Updated weights for policy 0, policy_version 82530 (0.0009) +[2023-10-13 03:42:15,485][46662] Updated weights for policy 0, policy_version 82540 (0.0008) +[2023-10-13 03:42:15,850][46662] Updated weights for policy 0, policy_version 82550 (0.0009) +[2023-10-13 03:42:16,222][46662] Updated weights for policy 0, policy_version 82560 (0.0008) +[2023-10-13 03:42:18,189][46663] Updated weights for policy 1, policy_version 82471 (0.0010) +[2023-10-13 03:42:18,567][46663] Updated weights for policy 1, policy_version 82481 (0.0009) +[2023-10-13 03:42:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168984576. Throughput: 0: 1672.2, 1: 1671.7. Samples: 42257466. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:18,607][45375] Avg episode reward: [(0, '58.350'), (1, '58.390')] +[2023-10-13 03:42:18,935][46663] Updated weights for policy 1, policy_version 82491 (0.0007) +[2023-10-13 03:42:20,361][46662] Updated weights for policy 0, policy_version 82570 (0.0008) +[2023-10-13 03:42:20,721][46662] Updated weights for policy 0, policy_version 82580 (0.0007) +[2023-10-13 03:42:21,097][46662] Updated weights for policy 0, policy_version 82590 (0.0008) +[2023-10-13 03:42:22,926][46663] Updated weights for policy 1, policy_version 82501 (0.0008) +[2023-10-13 03:42:23,287][46663] Updated weights for policy 1, policy_version 82511 (0.0009) +[2023-10-13 03:42:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169050112. Throughput: 0: 1691.5, 1: 1660.8. Samples: 42277758. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:23,608][45375] Avg episode reward: [(0, '57.060'), (1, '58.250')] +[2023-10-13 03:42:23,645][46663] Updated weights for policy 1, policy_version 82521 (0.0008) +[2023-10-13 03:42:25,198][46662] Updated weights for policy 0, policy_version 82600 (0.0009) +[2023-10-13 03:42:25,573][46662] Updated weights for policy 0, policy_version 82610 (0.0009) +[2023-10-13 03:42:25,950][46662] Updated weights for policy 0, policy_version 82620 (0.0009) +[2023-10-13 03:42:27,852][46663] Updated weights for policy 1, policy_version 82531 (0.0009) +[2023-10-13 03:42:28,270][46663] Updated weights for policy 1, policy_version 82541 (0.0009) +[2023-10-13 03:42:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169115648. Throughput: 0: 1664.6, 1: 1677.0. Samples: 42287668. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:28,607][45375] Avg episode reward: [(0, '56.020'), (1, '58.240')] +[2023-10-13 03:42:28,636][46663] Updated weights for policy 1, policy_version 82551 (0.0010) +[2023-10-13 03:42:30,122][46662] Updated weights for policy 0, policy_version 82630 (0.0007) +[2023-10-13 03:42:30,483][46662] Updated weights for policy 0, policy_version 82640 (0.0008) +[2023-10-13 03:42:30,856][46662] Updated weights for policy 0, policy_version 82650 (0.0009) +[2023-10-13 03:42:32,678][46663] Updated weights for policy 1, policy_version 82561 (0.0010) +[2023-10-13 03:42:33,048][46663] Updated weights for policy 1, policy_version 82571 (0.0008) +[2023-10-13 03:42:33,410][46663] Updated weights for policy 1, policy_version 82581 (0.0008) +[2023-10-13 03:42:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169181184. Throughput: 0: 1673.6, 1: 1676.4. Samples: 42307776. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:33,607][45375] Avg episode reward: [(0, '57.630'), (1, '58.990')] +[2023-10-13 03:42:33,773][46663] Updated weights for policy 1, policy_version 82591 (0.0008) +[2023-10-13 03:42:34,656][46662] Updated weights for policy 0, policy_version 82660 (0.0008) +[2023-10-13 03:42:35,020][46662] Updated weights for policy 0, policy_version 82670 (0.0007) +[2023-10-13 03:42:35,388][46662] Updated weights for policy 0, policy_version 82680 (0.0009) +[2023-10-13 03:42:37,999][46663] Updated weights for policy 1, policy_version 82601 (0.0011) +[2023-10-13 03:42:38,365][46663] Updated weights for policy 1, policy_version 82611 (0.0010) +[2023-10-13 03:42:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169246720. Throughput: 0: 1691.6, 1: 1659.5. Samples: 42327692. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:38,607][45375] Avg episode reward: [(0, '57.320'), (1, '57.610')] +[2023-10-13 03:42:38,725][46663] Updated weights for policy 1, policy_version 82621 (0.0009) +[2023-10-13 03:42:39,538][46662] Updated weights for policy 0, policy_version 82690 (0.0011) +[2023-10-13 03:42:39,912][46662] Updated weights for policy 0, policy_version 82700 (0.0007) +[2023-10-13 03:42:40,278][46662] Updated weights for policy 0, policy_version 82710 (0.0008) +[2023-10-13 03:42:40,652][46662] Updated weights for policy 0, policy_version 82720 (0.0009) +[2023-10-13 03:42:42,754][46663] Updated weights for policy 1, policy_version 82631 (0.0008) +[2023-10-13 03:42:43,110][46663] Updated weights for policy 1, policy_version 82641 (0.0008) +[2023-10-13 03:42:43,478][46663] Updated weights for policy 1, policy_version 82651 (0.0009) +[2023-10-13 03:42:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 169312256. Throughput: 0: 1660.0, 1: 1673.3. Samples: 42337374. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:43,608][45375] Avg episode reward: [(0, '56.910'), (1, '55.580')] +[2023-10-13 03:42:44,647][46662] Updated weights for policy 0, policy_version 82730 (0.0010) +[2023-10-13 03:42:45,023][46662] Updated weights for policy 0, policy_version 82740 (0.0009) +[2023-10-13 03:42:45,395][46662] Updated weights for policy 0, policy_version 82750 (0.0009) +[2023-10-13 03:42:47,477][46663] Updated weights for policy 1, policy_version 82661 (0.0009) +[2023-10-13 03:42:47,851][46663] Updated weights for policy 1, policy_version 82671 (0.0009) +[2023-10-13 03:42:48,225][46663] Updated weights for policy 1, policy_version 82681 (0.0009) +[2023-10-13 03:42:48,607][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169410560. Throughput: 0: 1679.0, 1: 1676.9. Samples: 42358162. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:48,607][45375] Avg episode reward: [(0, '57.400'), (1, '55.530')] +[2023-10-13 03:42:49,612][46662] Updated weights for policy 0, policy_version 82760 (0.0008) +[2023-10-13 03:42:49,984][46662] Updated weights for policy 0, policy_version 82770 (0.0007) +[2023-10-13 03:42:50,358][46662] Updated weights for policy 0, policy_version 82780 (0.0008) +[2023-10-13 03:42:52,391][46663] Updated weights for policy 1, policy_version 82691 (0.0008) +[2023-10-13 03:42:52,758][46663] Updated weights for policy 1, policy_version 82701 (0.0008) +[2023-10-13 03:42:53,124][46663] Updated weights for policy 1, policy_version 82711 (0.0008) +[2023-10-13 03:42:53,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169476096. Throughput: 0: 1678.7, 1: 1655.0. Samples: 42377604. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:53,608][45375] Avg episode reward: [(0, '58.080'), (1, '56.950')] +[2023-10-13 03:42:54,415][46662] Updated weights for policy 0, policy_version 82790 (0.0008) +[2023-10-13 03:42:54,779][46662] Updated weights for policy 0, policy_version 82800 (0.0007) +[2023-10-13 03:42:55,155][46662] Updated weights for policy 0, policy_version 82810 (0.0007) +[2023-10-13 03:42:57,244][46663] Updated weights for policy 1, policy_version 82721 (0.0010) +[2023-10-13 03:42:57,604][46663] Updated weights for policy 1, policy_version 82731 (0.0008) +[2023-10-13 03:42:57,968][46663] Updated weights for policy 1, policy_version 82741 (0.0008) +[2023-10-13 03:42:58,328][46663] Updated weights for policy 1, policy_version 82751 (0.0007) +[2023-10-13 03:42:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169541632. Throughput: 0: 1661.2, 1: 1679.6. Samples: 42387828. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:42:58,607][45375] Avg episode reward: [(0, '58.020'), (1, '56.610')] +[2023-10-13 03:42:59,173][46662] Updated weights for policy 0, policy_version 82820 (0.0007) +[2023-10-13 03:42:59,546][46662] Updated weights for policy 0, policy_version 82830 (0.0009) +[2023-10-13 03:42:59,914][46662] Updated weights for policy 0, policy_version 82840 (0.0008) +[2023-10-13 03:43:02,344][46663] Updated weights for policy 1, policy_version 82761 (0.0009) +[2023-10-13 03:43:02,713][46663] Updated weights for policy 1, policy_version 82771 (0.0008) +[2023-10-13 03:43:03,078][46663] Updated weights for policy 1, policy_version 82781 (0.0007) +[2023-10-13 03:43:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169607168. Throughput: 0: 1676.6, 1: 1676.1. Samples: 42408336. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:43:03,608][45375] Avg episode reward: [(0, '56.340'), (1, '56.530')] +[2023-10-13 03:43:04,013][46662] Updated weights for policy 0, policy_version 82850 (0.0007) +[2023-10-13 03:43:04,397][46662] Updated weights for policy 0, policy_version 82860 (0.0008) +[2023-10-13 03:43:04,769][46662] Updated weights for policy 0, policy_version 82870 (0.0007) +[2023-10-13 03:43:05,138][46662] Updated weights for policy 0, policy_version 82880 (0.0008) +[2023-10-13 03:43:07,177][46663] Updated weights for policy 1, policy_version 82791 (0.0011) +[2023-10-13 03:43:07,544][46663] Updated weights for policy 1, policy_version 82801 (0.0010) +[2023-10-13 03:43:07,913][46663] Updated weights for policy 1, policy_version 82811 (0.0007) +[2023-10-13 03:43:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169672704. Throughput: 0: 1671.2, 1: 1661.8. Samples: 42427746. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-13 03:43:08,608][45375] Avg episode reward: [(0, '58.440'), (1, '56.460')] +[2023-10-13 03:43:09,265][46662] Updated weights for policy 0, policy_version 82890 (0.0010) +[2023-10-13 03:43:09,637][46662] Updated weights for policy 0, policy_version 82900 (0.0010) +[2023-10-13 03:43:10,006][46662] Updated weights for policy 0, policy_version 82910 (0.0010) +[2023-10-13 03:43:11,919][46663] Updated weights for policy 1, policy_version 82821 (0.0008) +[2023-10-13 03:43:12,294][46663] Updated weights for policy 1, policy_version 82831 (0.0009) +[2023-10-13 03:43:12,655][46663] Updated weights for policy 1, policy_version 82841 (0.0010) +[2023-10-13 03:43:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169738240. Throughput: 0: 1666.1, 1: 1675.7. Samples: 42438050. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:13,607][45375] Avg episode reward: [(0, '58.050'), (1, '55.810')] +[2023-10-13 03:43:14,130][46662] Updated weights for policy 0, policy_version 82920 (0.0008) +[2023-10-13 03:43:14,496][46662] Updated weights for policy 0, policy_version 82930 (0.0007) +[2023-10-13 03:43:14,871][46662] Updated weights for policy 0, policy_version 82940 (0.0008) +[2023-10-13 03:43:16,759][46663] Updated weights for policy 1, policy_version 82851 (0.0010) +[2023-10-13 03:43:17,177][46663] Updated weights for policy 1, policy_version 82861 (0.0008) +[2023-10-13 03:43:17,538][46663] Updated weights for policy 1, policy_version 82871 (0.0007) +[2023-10-13 03:43:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169803776. Throughput: 0: 1675.6, 1: 1659.0. Samples: 42457830. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:18,607][45375] Avg episode reward: [(0, '58.560'), (1, '54.900')] +[2023-10-13 03:43:18,796][46662] Updated weights for policy 0, policy_version 82950 (0.0008) +[2023-10-13 03:43:19,170][46662] Updated weights for policy 0, policy_version 82960 (0.0009) +[2023-10-13 03:43:19,550][46662] Updated weights for policy 0, policy_version 82970 (0.0007) +[2023-10-13 03:43:21,535][46663] Updated weights for policy 1, policy_version 82881 (0.0008) +[2023-10-13 03:43:21,905][46663] Updated weights for policy 1, policy_version 82891 (0.0007) +[2023-10-13 03:43:22,273][46663] Updated weights for policy 1, policy_version 82901 (0.0011) +[2023-10-13 03:43:22,644][46663] Updated weights for policy 1, policy_version 82911 (0.0012) +[2023-10-13 03:43:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 169869312. Throughput: 0: 1672.0, 1: 1662.8. Samples: 42477758. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:23,607][45375] Avg episode reward: [(0, '58.920'), (1, '53.830')] +[2023-10-13 03:43:23,722][46662] Updated weights for policy 0, policy_version 82980 (0.0010) +[2023-10-13 03:43:24,099][46662] Updated weights for policy 0, policy_version 82990 (0.0010) +[2023-10-13 03:43:24,467][46662] Updated weights for policy 0, policy_version 83000 (0.0008) +[2023-10-13 03:43:26,899][46663] Updated weights for policy 1, policy_version 82921 (0.0008) +[2023-10-13 03:43:27,268][46663] Updated weights for policy 1, policy_version 82931 (0.0007) +[2023-10-13 03:43:27,625][46663] Updated weights for policy 1, policy_version 82941 (0.0008) +[2023-10-13 03:43:28,574][46662] Updated weights for policy 0, policy_version 83010 (0.0009) +[2023-10-13 03:43:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169934848. Throughput: 0: 1671.4, 1: 1676.4. Samples: 42488024. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:28,608][45375] Avg episode reward: [(0, '59.450'), (1, '54.280')] +[2023-10-13 03:43:28,949][46662] Updated weights for policy 0, policy_version 83020 (0.0011) +[2023-10-13 03:43:29,322][46662] Updated weights for policy 0, policy_version 83030 (0.0012) +[2023-10-13 03:43:29,681][46662] Updated weights for policy 0, policy_version 83040 (0.0011) +[2023-10-13 03:43:31,810][46663] Updated weights for policy 1, policy_version 82951 (0.0010) +[2023-10-13 03:43:32,177][46663] Updated weights for policy 1, policy_version 82961 (0.0007) +[2023-10-13 03:43:32,546][46663] Updated weights for policy 1, policy_version 82971 (0.0007) +[2023-10-13 03:43:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170000384. Throughput: 0: 1673.8, 1: 1652.8. Samples: 42507856. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:33,607][45375] Avg episode reward: [(0, '59.790'), (1, '54.690')] +[2023-10-13 03:43:33,871][46662] Updated weights for policy 0, policy_version 83050 (0.0010) +[2023-10-13 03:43:34,242][46662] Updated weights for policy 0, policy_version 83060 (0.0009) +[2023-10-13 03:43:34,619][46662] Updated weights for policy 0, policy_version 83070 (0.0009) +[2023-10-13 03:43:36,540][46663] Updated weights for policy 1, policy_version 82981 (0.0007) +[2023-10-13 03:43:36,917][46663] Updated weights for policy 1, policy_version 82991 (0.0008) +[2023-10-13 03:43:37,283][46663] Updated weights for policy 1, policy_version 83001 (0.0007) +[2023-10-13 03:43:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170065920. Throughput: 0: 1671.0, 1: 1670.0. Samples: 42527948. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:38,608][45375] Avg episode reward: [(0, '59.640'), (1, '55.520')] +[2023-10-13 03:43:38,799][46662] Updated weights for policy 0, policy_version 83080 (0.0010) +[2023-10-13 03:43:39,170][46662] Updated weights for policy 0, policy_version 83090 (0.0010) +[2023-10-13 03:43:39,538][46662] Updated weights for policy 0, policy_version 83100 (0.0010) +[2023-10-13 03:43:41,439][46663] Updated weights for policy 1, policy_version 83011 (0.0008) +[2023-10-13 03:43:41,813][46663] Updated weights for policy 1, policy_version 83021 (0.0007) +[2023-10-13 03:43:42,169][46663] Updated weights for policy 1, policy_version 83031 (0.0010) +[2023-10-13 03:43:43,488][46662] Updated weights for policy 0, policy_version 83110 (0.0008) +[2023-10-13 03:43:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170131456. Throughput: 0: 1670.6, 1: 1671.6. Samples: 42538228. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:43,608][45375] Avg episode reward: [(0, '58.810'), (1, '54.590')] +[2023-10-13 03:43:43,856][46662] Updated weights for policy 0, policy_version 83120 (0.0008) +[2023-10-13 03:43:44,234][46662] Updated weights for policy 0, policy_version 83130 (0.0007) +[2023-10-13 03:43:46,449][46663] Updated weights for policy 1, policy_version 83041 (0.0009) +[2023-10-13 03:43:46,815][46663] Updated weights for policy 1, policy_version 83051 (0.0010) +[2023-10-13 03:43:47,186][46663] Updated weights for policy 1, policy_version 83061 (0.0008) +[2023-10-13 03:43:47,554][46663] Updated weights for policy 1, policy_version 83071 (0.0007) +[2023-10-13 03:43:48,231][46662] Updated weights for policy 0, policy_version 83140 (0.0009) +[2023-10-13 03:43:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170196992. Throughput: 0: 1673.0, 1: 1649.6. Samples: 42557854. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:48,607][45375] Avg episode reward: [(0, '59.380'), (1, '56.330')] +[2023-10-13 03:43:48,610][46662] Updated weights for policy 0, policy_version 83150 (0.0009) +[2023-10-13 03:43:48,982][46662] Updated weights for policy 0, policy_version 83160 (0.0009) +[2023-10-13 03:43:51,565][46663] Updated weights for policy 1, policy_version 83081 (0.0008) +[2023-10-13 03:43:51,936][46663] Updated weights for policy 1, policy_version 83091 (0.0009) +[2023-10-13 03:43:52,309][46663] Updated weights for policy 1, policy_version 83101 (0.0008) +[2023-10-13 03:43:53,133][46662] Updated weights for policy 0, policy_version 83170 (0.0010) +[2023-10-13 03:43:53,506][46662] Updated weights for policy 0, policy_version 83180 (0.0009) +[2023-10-13 03:43:53,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 170262528. Throughput: 0: 1675.0, 1: 1673.6. Samples: 42578434. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:53,607][45375] Avg episode reward: [(0, '58.790'), (1, '57.380')] +[2023-10-13 03:43:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000083104_85098496.pth... +[2023-10-13 03:43:53,647][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000081536_83492864.pth +[2023-10-13 03:43:53,881][46662] Updated weights for policy 0, policy_version 83190 (0.0009) +[2023-10-13 03:43:54,250][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000083200_85196800.pth... +[2023-10-13 03:43:54,251][46662] Updated weights for policy 0, policy_version 83200 (0.0007) +[2023-10-13 03:43:54,278][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000081632_83591168.pth +[2023-10-13 03:43:56,265][46663] Updated weights for policy 1, policy_version 83111 (0.0009) +[2023-10-13 03:43:56,630][46663] Updated weights for policy 1, policy_version 83121 (0.0008) +[2023-10-13 03:43:56,990][46663] Updated weights for policy 1, policy_version 83131 (0.0009) +[2023-10-13 03:43:58,288][46662] Updated weights for policy 0, policy_version 83210 (0.0009) +[2023-10-13 03:43:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170328064. Throughput: 0: 1674.1, 1: 1664.0. Samples: 42588266. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:43:58,607][45375] Avg episode reward: [(0, '58.660'), (1, '56.310')] +[2023-10-13 03:43:58,652][46662] Updated weights for policy 0, policy_version 83220 (0.0008) +[2023-10-13 03:43:59,013][46662] Updated weights for policy 0, policy_version 83230 (0.0007) +[2023-10-13 03:44:01,081][46663] Updated weights for policy 1, policy_version 83141 (0.0010) +[2023-10-13 03:44:01,457][46663] Updated weights for policy 1, policy_version 83151 (0.0010) +[2023-10-13 03:44:01,814][46663] Updated weights for policy 1, policy_version 83161 (0.0009) +[2023-10-13 03:44:03,162][46662] Updated weights for policy 0, policy_version 83240 (0.0009) +[2023-10-13 03:44:03,521][46662] Updated weights for policy 0, policy_version 83250 (0.0010) +[2023-10-13 03:44:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170393600. Throughput: 0: 1674.6, 1: 1663.4. Samples: 42608040. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-13 03:44:03,607][45375] Avg episode reward: [(0, '57.450'), (1, '55.200')] +[2023-10-13 03:44:03,897][46662] Updated weights for policy 0, policy_version 83260 (0.0007) +[2023-10-13 03:44:05,832][46663] Updated weights for policy 1, policy_version 83171 (0.0010) +[2023-10-13 03:44:06,201][46663] Updated weights for policy 1, policy_version 83181 (0.0010) +[2023-10-13 03:44:06,562][46663] Updated weights for policy 1, policy_version 83191 (0.0009) +[2023-10-13 03:44:07,813][46662] Updated weights for policy 0, policy_version 83270 (0.0009) +[2023-10-13 03:44:08,180][46662] Updated weights for policy 0, policy_version 83280 (0.0008) +[2023-10-13 03:44:08,555][46662] Updated weights for policy 0, policy_version 83290 (0.0007) +[2023-10-13 03:44:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170459136. Throughput: 0: 1676.2, 1: 1680.6. Samples: 42628816. 
Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:08,607][45375] Avg episode reward: [(0, '56.410'), (1, '54.590')] +[2023-10-13 03:44:10,642][46663] Updated weights for policy 1, policy_version 83201 (0.0010) +[2023-10-13 03:44:11,015][46663] Updated weights for policy 1, policy_version 83211 (0.0008) +[2023-10-13 03:44:11,387][46663] Updated weights for policy 1, policy_version 83221 (0.0008) +[2023-10-13 03:44:11,750][46663] Updated weights for policy 1, policy_version 83231 (0.0009) +[2023-10-13 03:44:12,556][46662] Updated weights for policy 0, policy_version 83300 (0.0009) +[2023-10-13 03:44:12,920][46662] Updated weights for policy 0, policy_version 83310 (0.0009) +[2023-10-13 03:44:13,290][46662] Updated weights for policy 0, policy_version 83320 (0.0007) +[2023-10-13 03:44:13,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 170557440. Throughput: 0: 1688.3, 1: 1663.8. Samples: 42638868. Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:13,607][45375] Avg episode reward: [(0, '57.450'), (1, '54.920')] +[2023-10-13 03:44:15,921][46663] Updated weights for policy 1, policy_version 83241 (0.0008) +[2023-10-13 03:44:16,281][46663] Updated weights for policy 1, policy_version 83251 (0.0009) +[2023-10-13 03:44:16,637][46663] Updated weights for policy 1, policy_version 83261 (0.0008) +[2023-10-13 03:44:17,375][46662] Updated weights for policy 0, policy_version 83330 (0.0007) +[2023-10-13 03:44:17,745][46662] Updated weights for policy 0, policy_version 83340 (0.0008) +[2023-10-13 03:44:18,117][46662] Updated weights for policy 0, policy_version 83350 (0.0008) +[2023-10-13 03:44:18,473][46662] Updated weights for policy 0, policy_version 83360 (0.0010) +[2023-10-13 03:44:18,606][45375] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 170622976. Throughput: 0: 1687.4, 1: 1669.3. Samples: 42658906. 
Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:18,607][45375] Avg episode reward: [(0, '57.230'), (1, '54.420')] +[2023-10-13 03:44:20,760][46663] Updated weights for policy 1, policy_version 83271 (0.0009) +[2023-10-13 03:44:21,133][46663] Updated weights for policy 1, policy_version 83281 (0.0010) +[2023-10-13 03:44:21,497][46663] Updated weights for policy 1, policy_version 83291 (0.0010) +[2023-10-13 03:44:22,530][46662] Updated weights for policy 0, policy_version 83370 (0.0008) +[2023-10-13 03:44:22,894][46662] Updated weights for policy 0, policy_version 83380 (0.0009) +[2023-10-13 03:44:23,279][46662] Updated weights for policy 0, policy_version 83390 (0.0008) +[2023-10-13 03:44:23,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170688512. Throughput: 0: 1679.6, 1: 1674.2. Samples: 42678870. Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:23,608][45375] Avg episode reward: [(0, '57.590'), (1, '53.510')] +[2023-10-13 03:44:25,650][46663] Updated weights for policy 1, policy_version 83301 (0.0009) +[2023-10-13 03:44:26,019][46663] Updated weights for policy 1, policy_version 83311 (0.0009) +[2023-10-13 03:44:26,388][46663] Updated weights for policy 1, policy_version 83321 (0.0007) +[2023-10-13 03:44:27,422][46662] Updated weights for policy 0, policy_version 83400 (0.0008) +[2023-10-13 03:44:27,794][46662] Updated weights for policy 0, policy_version 83410 (0.0008) +[2023-10-13 03:44:28,164][46662] Updated weights for policy 0, policy_version 83420 (0.0007) +[2023-10-13 03:44:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170754048. Throughput: 0: 1694.4, 1: 1653.6. Samples: 42688886. 
Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:28,607][45375] Avg episode reward: [(0, '57.330'), (1, '54.490')] +[2023-10-13 03:44:30,400][46663] Updated weights for policy 1, policy_version 83331 (0.0008) +[2023-10-13 03:44:30,777][46663] Updated weights for policy 1, policy_version 83341 (0.0008) +[2023-10-13 03:44:31,137][46663] Updated weights for policy 1, policy_version 83351 (0.0008) +[2023-10-13 03:44:32,271][46662] Updated weights for policy 0, policy_version 83430 (0.0009) +[2023-10-13 03:44:32,632][46662] Updated weights for policy 0, policy_version 83440 (0.0010) +[2023-10-13 03:44:33,006][46662] Updated weights for policy 0, policy_version 83450 (0.0007) +[2023-10-13 03:44:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170819584. Throughput: 0: 1693.0, 1: 1673.7. Samples: 42709358. Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:33,608][45375] Avg episode reward: [(0, '58.310'), (1, '55.410')] +[2023-10-13 03:44:35,158][46663] Updated weights for policy 1, policy_version 83361 (0.0007) +[2023-10-13 03:44:35,525][46663] Updated weights for policy 1, policy_version 83371 (0.0008) +[2023-10-13 03:44:35,892][46663] Updated weights for policy 1, policy_version 83381 (0.0007) +[2023-10-13 03:44:36,265][46663] Updated weights for policy 1, policy_version 83391 (0.0008) +[2023-10-13 03:44:36,813][46662] Updated weights for policy 0, policy_version 83460 (0.0008) +[2023-10-13 03:44:37,194][46662] Updated weights for policy 0, policy_version 83470 (0.0007) +[2023-10-13 03:44:37,564][46662] Updated weights for policy 0, policy_version 83480 (0.0007) +[2023-10-13 03:44:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 170885120. Throughput: 0: 1672.3, 1: 1681.0. Samples: 42729334. 
Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:38,607][45375] Avg episode reward: [(0, '57.460'), (1, '54.890')] +[2023-10-13 03:44:40,403][46663] Updated weights for policy 1, policy_version 83401 (0.0008) +[2023-10-13 03:44:40,763][46663] Updated weights for policy 1, policy_version 83411 (0.0010) +[2023-10-13 03:44:41,135][46663] Updated weights for policy 1, policy_version 83421 (0.0010) +[2023-10-13 03:44:41,617][46662] Updated weights for policy 0, policy_version 83490 (0.0009) +[2023-10-13 03:44:41,985][46662] Updated weights for policy 0, policy_version 83500 (0.0007) +[2023-10-13 03:44:42,353][46662] Updated weights for policy 0, policy_version 83510 (0.0008) +[2023-10-13 03:44:42,725][46662] Updated weights for policy 0, policy_version 83520 (0.0009) +[2023-10-13 03:44:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 170950656. Throughput: 0: 1699.6, 1: 1659.7. Samples: 42739434. Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:43,607][45375] Avg episode reward: [(0, '58.320'), (1, '54.190')] +[2023-10-13 03:44:45,069][46663] Updated weights for policy 1, policy_version 83431 (0.0009) +[2023-10-13 03:44:45,441][46663] Updated weights for policy 1, policy_version 83441 (0.0008) +[2023-10-13 03:44:45,806][46663] Updated weights for policy 1, policy_version 83451 (0.0007) +[2023-10-13 03:44:46,906][46662] Updated weights for policy 0, policy_version 83530 (0.0010) +[2023-10-13 03:44:47,278][46662] Updated weights for policy 0, policy_version 83540 (0.0008) +[2023-10-13 03:44:47,649][46662] Updated weights for policy 0, policy_version 83550 (0.0010) +[2023-10-13 03:44:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171016192. Throughput: 0: 1689.6, 1: 1684.5. Samples: 42759876. 
Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:48,607][45375] Avg episode reward: [(0, '59.760'), (1, '54.640')] +[2023-10-13 03:44:49,949][46663] Updated weights for policy 1, policy_version 83461 (0.0007) +[2023-10-13 03:44:50,344][46663] Updated weights for policy 1, policy_version 83471 (0.0008) +[2023-10-13 03:44:50,719][46663] Updated weights for policy 1, policy_version 83481 (0.0009) +[2023-10-13 03:44:51,734][46662] Updated weights for policy 0, policy_version 83560 (0.0008) +[2023-10-13 03:44:52,109][46662] Updated weights for policy 0, policy_version 83570 (0.0007) +[2023-10-13 03:44:52,479][46662] Updated weights for policy 0, policy_version 83580 (0.0009) +[2023-10-13 03:44:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171081728. Throughput: 0: 1666.5, 1: 1678.7. Samples: 42779348. Policy #0 lag: (min: 0.0, avg: 25.9, max: 32.0) +[2023-10-13 03:44:53,607][45375] Avg episode reward: [(0, '60.570'), (1, '52.860')] +[2023-10-13 03:44:54,990][46663] Updated weights for policy 1, policy_version 83491 (0.0010) +[2023-10-13 03:44:55,351][46663] Updated weights for policy 1, policy_version 83501 (0.0008) +[2023-10-13 03:44:55,717][46663] Updated weights for policy 1, policy_version 83511 (0.0008) +[2023-10-13 03:44:56,384][46662] Updated weights for policy 0, policy_version 83590 (0.0007) +[2023-10-13 03:44:56,757][46662] Updated weights for policy 0, policy_version 83600 (0.0007) +[2023-10-13 03:44:57,119][46662] Updated weights for policy 0, policy_version 83610 (0.0010) +[2023-10-13 03:44:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171147264. Throughput: 0: 1690.2, 1: 1666.7. Samples: 42789928. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:44:58,607][45375] Avg episode reward: [(0, '58.910'), (1, '53.480')] +[2023-10-13 03:44:59,718][46663] Updated weights for policy 1, policy_version 83521 (0.0008) +[2023-10-13 03:45:00,086][46663] Updated weights for policy 1, policy_version 83531 (0.0011) +[2023-10-13 03:45:00,444][46663] Updated weights for policy 1, policy_version 83541 (0.0009) +[2023-10-13 03:45:00,813][46663] Updated weights for policy 1, policy_version 83551 (0.0009) +[2023-10-13 03:45:01,232][46662] Updated weights for policy 0, policy_version 83620 (0.0010) +[2023-10-13 03:45:01,608][46662] Updated weights for policy 0, policy_version 83630 (0.0008) +[2023-10-13 03:45:01,979][46662] Updated weights for policy 0, policy_version 83640 (0.0008) +[2023-10-13 03:45:03,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 171212800. Throughput: 0: 1672.8, 1: 1680.1. Samples: 42809790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:03,608][45375] Avg episode reward: [(0, '59.830'), (1, '54.640')] +[2023-10-13 03:45:04,962][46663] Updated weights for policy 1, policy_version 83561 (0.0009) +[2023-10-13 03:45:05,328][46663] Updated weights for policy 1, policy_version 83571 (0.0009) +[2023-10-13 03:45:05,693][46663] Updated weights for policy 1, policy_version 83581 (0.0007) +[2023-10-13 03:45:05,953][46662] Updated weights for policy 0, policy_version 83650 (0.0008) +[2023-10-13 03:45:06,320][46662] Updated weights for policy 0, policy_version 83660 (0.0008) +[2023-10-13 03:45:06,687][46662] Updated weights for policy 0, policy_version 83670 (0.0007) +[2023-10-13 03:45:07,053][46662] Updated weights for policy 0, policy_version 83680 (0.0007) +[2023-10-13 03:45:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 171278336. Throughput: 0: 1677.9, 1: 1682.5. Samples: 42830086. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:08,607][45375] Avg episode reward: [(0, '59.370'), (1, '55.900')] +[2023-10-13 03:45:09,792][46663] Updated weights for policy 1, policy_version 83591 (0.0008) +[2023-10-13 03:45:10,158][46663] Updated weights for policy 1, policy_version 83601 (0.0007) +[2023-10-13 03:45:10,531][46663] Updated weights for policy 1, policy_version 83611 (0.0007) +[2023-10-13 03:45:11,079][46662] Updated weights for policy 0, policy_version 83690 (0.0008) +[2023-10-13 03:45:11,455][46662] Updated weights for policy 0, policy_version 83700 (0.0007) +[2023-10-13 03:45:11,827][46662] Updated weights for policy 0, policy_version 83710 (0.0008) +[2023-10-13 03:45:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171343872. Throughput: 0: 1689.3, 1: 1674.1. Samples: 42840238. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:13,608][45375] Avg episode reward: [(0, '59.890'), (1, '54.640')] +[2023-10-13 03:45:14,688][46663] Updated weights for policy 1, policy_version 83621 (0.0008) +[2023-10-13 03:45:15,058][46663] Updated weights for policy 1, policy_version 83631 (0.0010) +[2023-10-13 03:45:15,420][46663] Updated weights for policy 1, policy_version 83641 (0.0011) +[2023-10-13 03:45:15,874][46662] Updated weights for policy 0, policy_version 83720 (0.0009) +[2023-10-13 03:45:16,252][46662] Updated weights for policy 0, policy_version 83730 (0.0007) +[2023-10-13 03:45:16,623][46662] Updated weights for policy 0, policy_version 83740 (0.0007) +[2023-10-13 03:45:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171409408. Throughput: 0: 1665.7, 1: 1678.6. Samples: 42859854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:18,607][45375] Avg episode reward: [(0, '59.210'), (1, '53.800')] +[2023-10-13 03:45:19,526][46663] Updated weights for policy 1, policy_version 83651 (0.0008) +[2023-10-13 03:45:19,893][46663] Updated weights for policy 1, policy_version 83661 (0.0009) +[2023-10-13 03:45:20,259][46663] Updated weights for policy 1, policy_version 83671 (0.0010) +[2023-10-13 03:45:20,554][46662] Updated weights for policy 0, policy_version 83750 (0.0008) +[2023-10-13 03:45:20,926][46662] Updated weights for policy 0, policy_version 83760 (0.0009) +[2023-10-13 03:45:21,296][46662] Updated weights for policy 0, policy_version 83770 (0.0009) +[2023-10-13 03:45:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171474944. Throughput: 0: 1688.8, 1: 1672.7. Samples: 42880600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:23,608][45375] Avg episode reward: [(0, '58.760'), (1, '54.880')] +[2023-10-13 03:45:24,269][46663] Updated weights for policy 1, policy_version 83681 (0.0007) +[2023-10-13 03:45:24,638][46663] Updated weights for policy 1, policy_version 83691 (0.0007) +[2023-10-13 03:45:25,001][46663] Updated weights for policy 1, policy_version 83701 (0.0009) +[2023-10-13 03:45:25,333][46662] Updated weights for policy 0, policy_version 83780 (0.0008) +[2023-10-13 03:45:25,370][46663] Updated weights for policy 1, policy_version 83711 (0.0009) +[2023-10-13 03:45:25,697][46662] Updated weights for policy 0, policy_version 83790 (0.0007) +[2023-10-13 03:45:26,060][46662] Updated weights for policy 0, policy_version 83800 (0.0010) +[2023-10-13 03:45:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171540480. Throughput: 0: 1678.4, 1: 1677.8. Samples: 42890462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:28,607][45375] Avg episode reward: [(0, '57.130'), (1, '54.530')] +[2023-10-13 03:45:29,581][46663] Updated weights for policy 1, policy_version 83721 (0.0008) +[2023-10-13 03:45:29,941][46663] Updated weights for policy 1, policy_version 83731 (0.0010) +[2023-10-13 03:45:30,084][46662] Updated weights for policy 0, policy_version 83810 (0.0007) +[2023-10-13 03:45:30,318][46663] Updated weights for policy 1, policy_version 83741 (0.0009) +[2023-10-13 03:45:30,455][46662] Updated weights for policy 0, policy_version 83820 (0.0007) +[2023-10-13 03:45:30,819][46662] Updated weights for policy 0, policy_version 83830 (0.0008) +[2023-10-13 03:45:31,197][46662] Updated weights for policy 0, policy_version 83840 (0.0009) +[2023-10-13 03:45:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171606016. Throughput: 0: 1679.5, 1: 1666.5. Samples: 42910446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:33,607][45375] Avg episode reward: [(0, '56.710'), (1, '55.290')] +[2023-10-13 03:45:34,229][46663] Updated weights for policy 1, policy_version 83751 (0.0007) +[2023-10-13 03:45:34,601][46663] Updated weights for policy 1, policy_version 83761 (0.0007) +[2023-10-13 03:45:34,972][46663] Updated weights for policy 1, policy_version 83771 (0.0009) +[2023-10-13 03:45:35,260][46662] Updated weights for policy 0, policy_version 83850 (0.0008) +[2023-10-13 03:45:35,627][46662] Updated weights for policy 0, policy_version 83860 (0.0008) +[2023-10-13 03:45:36,001][46662] Updated weights for policy 0, policy_version 83870 (0.0007) +[2023-10-13 03:45:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171671552. Throughput: 0: 1703.6, 1: 1672.8. Samples: 42931286. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:38,607][45375] Avg episode reward: [(0, '56.310'), (1, '55.290')] +[2023-10-13 03:45:39,173][46663] Updated weights for policy 1, policy_version 83781 (0.0008) +[2023-10-13 03:45:39,573][46663] Updated weights for policy 1, policy_version 83791 (0.0009) +[2023-10-13 03:45:39,937][46663] Updated weights for policy 1, policy_version 83801 (0.0009) +[2023-10-13 03:45:40,124][46662] Updated weights for policy 0, policy_version 83880 (0.0008) +[2023-10-13 03:45:40,501][46662] Updated weights for policy 0, policy_version 83890 (0.0009) +[2023-10-13 03:45:40,875][46662] Updated weights for policy 0, policy_version 83900 (0.0008) +[2023-10-13 03:45:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 171737088. Throughput: 0: 1672.4, 1: 1670.2. Samples: 42940348. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:43,608][45375] Avg episode reward: [(0, '56.060'), (1, '54.620')] +[2023-10-13 03:45:43,883][46663] Updated weights for policy 1, policy_version 83811 (0.0009) +[2023-10-13 03:45:44,252][46663] Updated weights for policy 1, policy_version 83821 (0.0010) +[2023-10-13 03:45:44,626][46663] Updated weights for policy 1, policy_version 83831 (0.0009) +[2023-10-13 03:45:44,900][46662] Updated weights for policy 0, policy_version 83910 (0.0007) +[2023-10-13 03:45:45,270][46662] Updated weights for policy 0, policy_version 83920 (0.0009) +[2023-10-13 03:45:45,637][46662] Updated weights for policy 0, policy_version 83930 (0.0010) +[2023-10-13 03:45:48,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171802624. Throughput: 0: 1686.7, 1: 1669.9. Samples: 42960834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:45:48,608][45375] Avg episode reward: [(0, '55.870'), (1, '54.340')] +[2023-10-13 03:45:48,630][46663] Updated weights for policy 1, policy_version 83841 (0.0008) +[2023-10-13 03:45:48,994][46663] Updated weights for policy 1, policy_version 83851 (0.0008) +[2023-10-13 03:45:49,355][46663] Updated weights for policy 1, policy_version 83861 (0.0007) +[2023-10-13 03:45:49,725][46663] Updated weights for policy 1, policy_version 83871 (0.0010) +[2023-10-13 03:45:49,820][46662] Updated weights for policy 0, policy_version 83940 (0.0007) +[2023-10-13 03:45:50,188][46662] Updated weights for policy 0, policy_version 83950 (0.0007) +[2023-10-13 03:45:50,557][46662] Updated weights for policy 0, policy_version 83960 (0.0007) +[2023-10-13 03:45:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 171868160. Throughput: 0: 1689.0, 1: 1673.7. Samples: 42981408. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:45:53,607][45375] Avg episode reward: [(0, '55.260'), (1, '54.380')] +[2023-10-13 03:45:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000083968_85983232.pth... +[2023-10-13 03:45:53,652][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000082400_84377600.pth +[2023-10-13 03:45:53,946][46663] Updated weights for policy 1, policy_version 83881 (0.0008) +[2023-10-13 03:45:54,321][46663] Updated weights for policy 1, policy_version 83891 (0.0007) +[2023-10-13 03:45:54,685][46663] Updated weights for policy 1, policy_version 83901 (0.0009) +[2023-10-13 03:45:54,786][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000083904_85917696.pth... 
+[2023-10-13 03:45:54,816][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000082336_84312064.pth +[2023-10-13 03:45:54,818][46662] Updated weights for policy 0, policy_version 83970 (0.0007) +[2023-10-13 03:45:55,197][46662] Updated weights for policy 0, policy_version 83980 (0.0009) +[2023-10-13 03:45:55,557][46662] Updated weights for policy 0, policy_version 83990 (0.0008) +[2023-10-13 03:45:55,927][46662] Updated weights for policy 0, policy_version 84000 (0.0008) +[2023-10-13 03:45:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171933696. Throughput: 0: 1669.9, 1: 1675.8. Samples: 42990792. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:45:58,607][45375] Avg episode reward: [(0, '55.630'), (1, '54.630')] +[2023-10-13 03:45:58,772][46663] Updated weights for policy 1, policy_version 83911 (0.0007) +[2023-10-13 03:45:59,139][46663] Updated weights for policy 1, policy_version 83921 (0.0008) +[2023-10-13 03:45:59,503][46663] Updated weights for policy 1, policy_version 83931 (0.0008) +[2023-10-13 03:45:59,940][46662] Updated weights for policy 0, policy_version 84010 (0.0008) +[2023-10-13 03:46:00,309][46662] Updated weights for policy 0, policy_version 84020 (0.0009) +[2023-10-13 03:46:00,678][46662] Updated weights for policy 0, policy_version 84030 (0.0009) +[2023-10-13 03:46:03,509][46663] Updated weights for policy 1, policy_version 83941 (0.0008) +[2023-10-13 03:46:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 171999232. Throughput: 0: 1686.8, 1: 1680.2. Samples: 43011368. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:03,607][45375] Avg episode reward: [(0, '54.480'), (1, '54.790')] +[2023-10-13 03:46:03,880][46663] Updated weights for policy 1, policy_version 83951 (0.0008) +[2023-10-13 03:46:04,243][46663] Updated weights for policy 1, policy_version 83961 (0.0010) +[2023-10-13 03:46:04,760][46662] Updated weights for policy 0, policy_version 84040 (0.0009) +[2023-10-13 03:46:05,128][46662] Updated weights for policy 0, policy_version 84050 (0.0010) +[2023-10-13 03:46:05,508][46662] Updated weights for policy 0, policy_version 84060 (0.0007) +[2023-10-13 03:46:08,247][46663] Updated weights for policy 1, policy_version 83971 (0.0009) +[2023-10-13 03:46:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172064768. Throughput: 0: 1686.6, 1: 1676.1. Samples: 43031920. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:08,607][45375] Avg episode reward: [(0, '54.760'), (1, '52.650')] +[2023-10-13 03:46:08,614][46663] Updated weights for policy 1, policy_version 83981 (0.0010) +[2023-10-13 03:46:08,982][46663] Updated weights for policy 1, policy_version 83991 (0.0007) +[2023-10-13 03:46:09,593][46662] Updated weights for policy 0, policy_version 84070 (0.0008) +[2023-10-13 03:46:09,963][46662] Updated weights for policy 0, policy_version 84080 (0.0008) +[2023-10-13 03:46:10,341][46662] Updated weights for policy 0, policy_version 84090 (0.0007) +[2023-10-13 03:46:12,975][46663] Updated weights for policy 1, policy_version 84001 (0.0007) +[2023-10-13 03:46:13,352][46663] Updated weights for policy 1, policy_version 84011 (0.0007) +[2023-10-13 03:46:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172130304. Throughput: 0: 1672.0, 1: 1680.0. Samples: 43041300. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:13,607][45375] Avg episode reward: [(0, '55.290'), (1, '53.230')] +[2023-10-13 03:46:13,723][46663] Updated weights for policy 1, policy_version 84021 (0.0008) +[2023-10-13 03:46:14,086][46663] Updated weights for policy 1, policy_version 84031 (0.0010) +[2023-10-13 03:46:14,347][46662] Updated weights for policy 0, policy_version 84100 (0.0008) +[2023-10-13 03:46:14,729][46662] Updated weights for policy 0, policy_version 84110 (0.0008) +[2023-10-13 03:46:15,093][46662] Updated weights for policy 0, policy_version 84120 (0.0010) +[2023-10-13 03:46:18,226][46663] Updated weights for policy 1, policy_version 84041 (0.0009) +[2023-10-13 03:46:18,588][46663] Updated weights for policy 1, policy_version 84051 (0.0007) +[2023-10-13 03:46:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172195840. Throughput: 0: 1681.6, 1: 1689.3. Samples: 43062138. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:18,607][45375] Avg episode reward: [(0, '55.790'), (1, '55.540')] +[2023-10-13 03:46:18,960][46663] Updated weights for policy 1, policy_version 84061 (0.0007) +[2023-10-13 03:46:19,108][46662] Updated weights for policy 0, policy_version 84130 (0.0009) +[2023-10-13 03:46:19,480][46662] Updated weights for policy 0, policy_version 84140 (0.0009) +[2023-10-13 03:46:19,843][46662] Updated weights for policy 0, policy_version 84150 (0.0009) +[2023-10-13 03:46:20,207][46662] Updated weights for policy 0, policy_version 84160 (0.0008) +[2023-10-13 03:46:23,053][46663] Updated weights for policy 1, policy_version 84071 (0.0008) +[2023-10-13 03:46:23,413][46663] Updated weights for policy 1, policy_version 84081 (0.0011) +[2023-10-13 03:46:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172261376. Throughput: 0: 1678.0, 1: 1669.8. Samples: 43081934. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:23,607][45375] Avg episode reward: [(0, '56.840'), (1, '56.980')] +[2023-10-13 03:46:23,783][46663] Updated weights for policy 1, policy_version 84091 (0.0008) +[2023-10-13 03:46:24,303][46662] Updated weights for policy 0, policy_version 84170 (0.0010) +[2023-10-13 03:46:24,678][46662] Updated weights for policy 0, policy_version 84180 (0.0008) +[2023-10-13 03:46:25,039][46662] Updated weights for policy 0, policy_version 84190 (0.0008) +[2023-10-13 03:46:27,900][46663] Updated weights for policy 1, policy_version 84101 (0.0010) +[2023-10-13 03:46:28,301][46663] Updated weights for policy 1, policy_version 84111 (0.0010) +[2023-10-13 03:46:28,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172326912. Throughput: 0: 1670.6, 1: 1690.5. Samples: 43091598. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:28,607][45375] Avg episode reward: [(0, '57.110'), (1, '57.240')] +[2023-10-13 03:46:28,662][46663] Updated weights for policy 1, policy_version 84121 (0.0010) +[2023-10-13 03:46:29,310][46662] Updated weights for policy 0, policy_version 84200 (0.0009) +[2023-10-13 03:46:29,685][46662] Updated weights for policy 0, policy_version 84210 (0.0007) +[2023-10-13 03:46:30,049][46662] Updated weights for policy 0, policy_version 84220 (0.0010) +[2023-10-13 03:46:32,728][46663] Updated weights for policy 1, policy_version 84131 (0.0007) +[2023-10-13 03:46:33,094][46663] Updated weights for policy 1, policy_version 84141 (0.0009) +[2023-10-13 03:46:33,471][46663] Updated weights for policy 1, policy_version 84151 (0.0008) +[2023-10-13 03:46:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172392448. Throughput: 0: 1674.0, 1: 1690.4. Samples: 43112230. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:33,607][45375] Avg episode reward: [(0, '56.750'), (1, '57.260')] +[2023-10-13 03:46:34,099][46662] Updated weights for policy 0, policy_version 84230 (0.0011) +[2023-10-13 03:46:34,461][46662] Updated weights for policy 0, policy_version 84240 (0.0010) +[2023-10-13 03:46:34,833][46662] Updated weights for policy 0, policy_version 84250 (0.0011) +[2023-10-13 03:46:37,580][46663] Updated weights for policy 1, policy_version 84161 (0.0008) +[2023-10-13 03:46:37,948][46663] Updated weights for policy 1, policy_version 84171 (0.0008) +[2023-10-13 03:46:38,305][46663] Updated weights for policy 1, policy_version 84181 (0.0007) +[2023-10-13 03:46:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172457984. Throughput: 0: 1676.1, 1: 1669.6. Samples: 43131966. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:38,607][45375] Avg episode reward: [(0, '58.720'), (1, '57.500')] +[2023-10-13 03:46:38,669][46663] Updated weights for policy 1, policy_version 84191 (0.0008) +[2023-10-13 03:46:38,828][46662] Updated weights for policy 0, policy_version 84260 (0.0010) +[2023-10-13 03:46:39,197][46662] Updated weights for policy 0, policy_version 84270 (0.0008) +[2023-10-13 03:46:39,575][46662] Updated weights for policy 0, policy_version 84280 (0.0007) +[2023-10-13 03:46:42,578][46663] Updated weights for policy 1, policy_version 84201 (0.0009) +[2023-10-13 03:46:42,949][46663] Updated weights for policy 1, policy_version 84211 (0.0007) +[2023-10-13 03:46:43,311][46663] Updated weights for policy 1, policy_version 84221 (0.0007) +[2023-10-13 03:46:43,607][45375] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 172556288. Throughput: 0: 1673.2, 1: 1692.2. Samples: 43142234. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-13 03:46:43,608][45375] Avg episode reward: [(0, '58.020'), (1, '57.960')] +[2023-10-13 03:46:43,630][46662] Updated weights for policy 0, policy_version 84290 (0.0007) +[2023-10-13 03:46:44,006][46662] Updated weights for policy 0, policy_version 84300 (0.0007) +[2023-10-13 03:46:44,374][46662] Updated weights for policy 0, policy_version 84310 (0.0009) +[2023-10-13 03:46:44,744][46662] Updated weights for policy 0, policy_version 84320 (0.0008) +[2023-10-13 03:46:47,558][46663] Updated weights for policy 1, policy_version 84231 (0.0007) +[2023-10-13 03:46:47,931][46663] Updated weights for policy 1, policy_version 84241 (0.0008) +[2023-10-13 03:46:48,299][46663] Updated weights for policy 1, policy_version 84251 (0.0009) +[2023-10-13 03:46:48,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 172621824. Throughput: 0: 1681.1, 1: 1682.2. Samples: 43162716. Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:46:48,607][45375] Avg episode reward: [(0, '57.760'), (1, '56.670')] +[2023-10-13 03:46:48,834][46662] Updated weights for policy 0, policy_version 84330 (0.0010) +[2023-10-13 03:46:49,208][46662] Updated weights for policy 0, policy_version 84340 (0.0008) +[2023-10-13 03:46:49,578][46662] Updated weights for policy 0, policy_version 84350 (0.0007) +[2023-10-13 03:46:52,291][46663] Updated weights for policy 1, policy_version 84261 (0.0008) +[2023-10-13 03:46:52,660][46663] Updated weights for policy 1, policy_version 84271 (0.0008) +[2023-10-13 03:46:53,022][46663] Updated weights for policy 1, policy_version 84281 (0.0010) +[2023-10-13 03:46:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 172687360. Throughput: 0: 1680.2, 1: 1662.5. Samples: 43182344. 
Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:46:53,607][45375] Avg episode reward: [(0, '57.190'), (1, '56.460')] +[2023-10-13 03:46:53,694][46662] Updated weights for policy 0, policy_version 84360 (0.0009) +[2023-10-13 03:46:54,075][46662] Updated weights for policy 0, policy_version 84370 (0.0008) +[2023-10-13 03:46:54,433][46662] Updated weights for policy 0, policy_version 84380 (0.0009) +[2023-10-13 03:46:57,243][46663] Updated weights for policy 1, policy_version 84291 (0.0010) +[2023-10-13 03:46:57,611][46663] Updated weights for policy 1, policy_version 84301 (0.0009) +[2023-10-13 03:46:57,975][46663] Updated weights for policy 1, policy_version 84311 (0.0009) +[2023-10-13 03:46:58,525][46662] Updated weights for policy 0, policy_version 84390 (0.0008) +[2023-10-13 03:46:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 172752896. Throughput: 0: 1681.3, 1: 1686.8. Samples: 43192864. Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:46:58,607][45375] Avg episode reward: [(0, '57.040'), (1, '56.540')] +[2023-10-13 03:46:58,888][46662] Updated weights for policy 0, policy_version 84400 (0.0009) +[2023-10-13 03:46:59,270][46662] Updated weights for policy 0, policy_version 84410 (0.0011) +[2023-10-13 03:47:02,250][46663] Updated weights for policy 1, policy_version 84321 (0.0009) +[2023-10-13 03:47:02,614][46663] Updated weights for policy 1, policy_version 84331 (0.0011) +[2023-10-13 03:47:02,991][46663] Updated weights for policy 1, policy_version 84341 (0.0012) +[2023-10-13 03:47:03,276][46662] Updated weights for policy 0, policy_version 84420 (0.0010) +[2023-10-13 03:47:03,361][46663] Updated weights for policy 1, policy_version 84351 (0.0008) +[2023-10-13 03:47:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 172818432. Throughput: 0: 1681.8, 1: 1673.4. Samples: 43213122. 
Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:03,607][45375] Avg episode reward: [(0, '56.520'), (1, '55.160')] +[2023-10-13 03:47:03,635][46662] Updated weights for policy 0, policy_version 84430 (0.0009) +[2023-10-13 03:47:04,014][46662] Updated weights for policy 0, policy_version 84440 (0.0009) +[2023-10-13 03:47:07,447][46663] Updated weights for policy 1, policy_version 84361 (0.0011) +[2023-10-13 03:47:07,819][46663] Updated weights for policy 1, policy_version 84371 (0.0008) +[2023-10-13 03:47:07,952][46662] Updated weights for policy 0, policy_version 84450 (0.0008) +[2023-10-13 03:47:08,179][46663] Updated weights for policy 1, policy_version 84381 (0.0009) +[2023-10-13 03:47:08,326][46662] Updated weights for policy 0, policy_version 84460 (0.0007) +[2023-10-13 03:47:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 172883968. Throughput: 0: 1682.8, 1: 1665.7. Samples: 43232616. Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:08,607][45375] Avg episode reward: [(0, '55.940'), (1, '55.570')] +[2023-10-13 03:47:08,697][46662] Updated weights for policy 0, policy_version 84470 (0.0008) +[2023-10-13 03:47:09,057][46662] Updated weights for policy 0, policy_version 84480 (0.0009) +[2023-10-13 03:47:12,084][46663] Updated weights for policy 1, policy_version 84391 (0.0007) +[2023-10-13 03:47:12,447][46663] Updated weights for policy 1, policy_version 84401 (0.0008) +[2023-10-13 03:47:12,820][46663] Updated weights for policy 1, policy_version 84411 (0.0008) +[2023-10-13 03:47:13,252][46662] Updated weights for policy 0, policy_version 84490 (0.0007) +[2023-10-13 03:47:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 172949504. Throughput: 0: 1686.3, 1: 1680.0. Samples: 43243080. 
Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:13,607][45375] Avg episode reward: [(0, '56.200'), (1, '55.400')] +[2023-10-13 03:47:13,623][46662] Updated weights for policy 0, policy_version 84500 (0.0008) +[2023-10-13 03:47:13,992][46662] Updated weights for policy 0, policy_version 84510 (0.0007) +[2023-10-13 03:47:17,089][46663] Updated weights for policy 1, policy_version 84421 (0.0007) +[2023-10-13 03:47:17,481][46663] Updated weights for policy 1, policy_version 84431 (0.0009) +[2023-10-13 03:47:17,851][46663] Updated weights for policy 1, policy_version 84441 (0.0007) +[2023-10-13 03:47:18,109][46662] Updated weights for policy 0, policy_version 84520 (0.0007) +[2023-10-13 03:47:18,483][46662] Updated weights for policy 0, policy_version 84530 (0.0010) +[2023-10-13 03:47:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173015040. Throughput: 0: 1688.1, 1: 1668.1. Samples: 43263262. Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:18,607][45375] Avg episode reward: [(0, '56.830'), (1, '54.530')] +[2023-10-13 03:47:18,846][46662] Updated weights for policy 0, policy_version 84540 (0.0010) +[2023-10-13 03:47:21,805][46663] Updated weights for policy 1, policy_version 84451 (0.0010) +[2023-10-13 03:47:22,168][46663] Updated weights for policy 1, policy_version 84461 (0.0007) +[2023-10-13 03:47:22,536][46663] Updated weights for policy 1, policy_version 84471 (0.0008) +[2023-10-13 03:47:23,005][46662] Updated weights for policy 0, policy_version 84550 (0.0011) +[2023-10-13 03:47:23,377][46662] Updated weights for policy 0, policy_version 84560 (0.0007) +[2023-10-13 03:47:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173080576. Throughput: 0: 1682.7, 1: 1671.7. Samples: 43282914. 
Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:23,607][45375] Avg episode reward: [(0, '56.630'), (1, '53.410')] +[2023-10-13 03:47:23,752][46662] Updated weights for policy 0, policy_version 84570 (0.0007) +[2023-10-13 03:47:26,574][46663] Updated weights for policy 1, policy_version 84481 (0.0008) +[2023-10-13 03:47:26,934][46663] Updated weights for policy 1, policy_version 84491 (0.0007) +[2023-10-13 03:47:27,304][46663] Updated weights for policy 1, policy_version 84501 (0.0010) +[2023-10-13 03:47:27,669][46663] Updated weights for policy 1, policy_version 84511 (0.0010) +[2023-10-13 03:47:27,730][46662] Updated weights for policy 0, policy_version 84580 (0.0008) +[2023-10-13 03:47:28,097][46662] Updated weights for policy 0, policy_version 84590 (0.0008) +[2023-10-13 03:47:28,471][46662] Updated weights for policy 0, policy_version 84600 (0.0007) +[2023-10-13 03:47:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173146112. Throughput: 0: 1682.0, 1: 1672.7. Samples: 43293194. Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:28,608][45375] Avg episode reward: [(0, '55.680'), (1, '53.370')] +[2023-10-13 03:47:31,690][46663] Updated weights for policy 1, policy_version 84521 (0.0008) +[2023-10-13 03:47:32,056][46663] Updated weights for policy 1, policy_version 84531 (0.0008) +[2023-10-13 03:47:32,433][46663] Updated weights for policy 1, policy_version 84541 (0.0010) +[2023-10-13 03:47:32,537][46662] Updated weights for policy 0, policy_version 84610 (0.0009) +[2023-10-13 03:47:32,907][46662] Updated weights for policy 0, policy_version 84620 (0.0008) +[2023-10-13 03:47:33,274][46662] Updated weights for policy 0, policy_version 84630 (0.0007) +[2023-10-13 03:47:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173211648. Throughput: 0: 1683.6, 1: 1657.0. Samples: 43313042. 
Policy #0 lag: (min: 19.0, avg: 23.3, max: 51.0) +[2023-10-13 03:47:33,608][45375] Avg episode reward: [(0, '55.420'), (1, '52.580')] +[2023-10-13 03:47:33,640][46662] Updated weights for policy 0, policy_version 84640 (0.0007) +[2023-10-13 03:47:36,509][46663] Updated weights for policy 1, policy_version 84551 (0.0009) +[2023-10-13 03:47:36,865][46663] Updated weights for policy 1, policy_version 84561 (0.0010) +[2023-10-13 03:47:37,235][46663] Updated weights for policy 1, policy_version 84571 (0.0007) +[2023-10-13 03:47:37,580][46662] Updated weights for policy 0, policy_version 84650 (0.0007) +[2023-10-13 03:47:37,941][46662] Updated weights for policy 0, policy_version 84660 (0.0007) +[2023-10-13 03:47:38,305][46662] Updated weights for policy 0, policy_version 84670 (0.0007) +[2023-10-13 03:47:38,606][45375] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 173309952. Throughput: 0: 1678.8, 1: 1672.4. Samples: 43333148. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:47:38,607][45375] Avg episode reward: [(0, '55.880'), (1, '52.940')] +[2023-10-13 03:47:41,311][46663] Updated weights for policy 1, policy_version 84581 (0.0007) +[2023-10-13 03:47:41,670][46663] Updated weights for policy 1, policy_version 84591 (0.0008) +[2023-10-13 03:47:42,035][46663] Updated weights for policy 1, policy_version 84601 (0.0008) +[2023-10-13 03:47:42,333][46662] Updated weights for policy 0, policy_version 84680 (0.0008) +[2023-10-13 03:47:42,704][46662] Updated weights for policy 0, policy_version 84690 (0.0008) +[2023-10-13 03:47:43,075][46662] Updated weights for policy 0, policy_version 84700 (0.0009) +[2023-10-13 03:47:43,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173375488. Throughput: 0: 1689.3, 1: 1666.8. Samples: 43343886. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:47:43,608][45375] Avg episode reward: [(0, '56.530'), (1, '53.260')] +[2023-10-13 03:47:45,894][46663] Updated weights for policy 1, policy_version 84611 (0.0007) +[2023-10-13 03:47:46,259][46663] Updated weights for policy 1, policy_version 84621 (0.0007) +[2023-10-13 03:47:46,628][46663] Updated weights for policy 1, policy_version 84631 (0.0008) +[2023-10-13 03:47:47,216][46662] Updated weights for policy 0, policy_version 84710 (0.0008) +[2023-10-13 03:47:47,591][46662] Updated weights for policy 0, policy_version 84720 (0.0009) +[2023-10-13 03:47:47,964][46662] Updated weights for policy 0, policy_version 84730 (0.0007) +[2023-10-13 03:47:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173441024. Throughput: 0: 1686.8, 1: 1659.3. Samples: 43363700. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:47:48,607][45375] Avg episode reward: [(0, '56.120'), (1, '53.310')] +[2023-10-13 03:47:50,944][46663] Updated weights for policy 1, policy_version 84641 (0.0009) +[2023-10-13 03:47:51,307][46663] Updated weights for policy 1, policy_version 84651 (0.0008) +[2023-10-13 03:47:51,666][46663] Updated weights for policy 1, policy_version 84661 (0.0009) +[2023-10-13 03:47:51,890][46662] Updated weights for policy 0, policy_version 84740 (0.0008) +[2023-10-13 03:47:52,035][46663] Updated weights for policy 1, policy_version 84671 (0.0007) +[2023-10-13 03:47:52,258][46662] Updated weights for policy 0, policy_version 84750 (0.0009) +[2023-10-13 03:47:52,637][46662] Updated weights for policy 0, policy_version 84760 (0.0010) +[2023-10-13 03:47:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173506560. Throughput: 0: 1666.6, 1: 1684.5. Samples: 43383416. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:47:53,608][45375] Avg episode reward: [(0, '56.630'), (1, '54.280')] +[2023-10-13 03:47:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000084672_86704128.pth... +[2023-10-13 03:47:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000084768_86802432.pth... +[2023-10-13 03:47:53,650][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000083104_85098496.pth +[2023-10-13 03:47:53,653][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000083200_85196800.pth +[2023-10-13 03:47:56,018][46663] Updated weights for policy 1, policy_version 84681 (0.0010) +[2023-10-13 03:47:56,378][46663] Updated weights for policy 1, policy_version 84691 (0.0008) +[2023-10-13 03:47:56,707][46662] Updated weights for policy 0, policy_version 84770 (0.0008) +[2023-10-13 03:47:56,745][46663] Updated weights for policy 1, policy_version 84701 (0.0009) +[2023-10-13 03:47:57,083][46662] Updated weights for policy 0, policy_version 84780 (0.0009) +[2023-10-13 03:47:57,452][46662] Updated weights for policy 0, policy_version 84790 (0.0009) +[2023-10-13 03:47:57,824][46662] Updated weights for policy 0, policy_version 84800 (0.0009) +[2023-10-13 03:47:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173572096. Throughput: 0: 1693.1, 1: 1668.4. Samples: 43394344. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:47:58,607][45375] Avg episode reward: [(0, '57.260'), (1, '55.570')] +[2023-10-13 03:48:00,915][46663] Updated weights for policy 1, policy_version 84711 (0.0008) +[2023-10-13 03:48:01,279][46663] Updated weights for policy 1, policy_version 84721 (0.0007) +[2023-10-13 03:48:01,646][46663] Updated weights for policy 1, policy_version 84731 (0.0007) +[2023-10-13 03:48:01,848][46662] Updated weights for policy 0, policy_version 84810 (0.0009) +[2023-10-13 03:48:02,233][46662] Updated weights for policy 0, policy_version 84820 (0.0008) +[2023-10-13 03:48:02,601][46662] Updated weights for policy 0, policy_version 84830 (0.0010) +[2023-10-13 03:48:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173637632. Throughput: 0: 1684.5, 1: 1664.4. Samples: 43413964. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:48:03,608][45375] Avg episode reward: [(0, '58.390'), (1, '55.170')] +[2023-10-13 03:48:05,699][46663] Updated weights for policy 1, policy_version 84741 (0.0009) +[2023-10-13 03:48:06,065][46663] Updated weights for policy 1, policy_version 84751 (0.0011) +[2023-10-13 03:48:06,438][46663] Updated weights for policy 1, policy_version 84761 (0.0009) +[2023-10-13 03:48:06,589][46662] Updated weights for policy 0, policy_version 84840 (0.0008) +[2023-10-13 03:48:06,958][46662] Updated weights for policy 0, policy_version 84850 (0.0009) +[2023-10-13 03:48:07,324][46662] Updated weights for policy 0, policy_version 84860 (0.0009) +[2023-10-13 03:48:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173703168. Throughput: 0: 1671.9, 1: 1678.8. Samples: 43433696. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:48:08,607][45375] Avg episode reward: [(0, '60.850'), (1, '56.590')] +[2023-10-13 03:48:10,517][46663] Updated weights for policy 1, policy_version 84771 (0.0010) +[2023-10-13 03:48:10,885][46663] Updated weights for policy 1, policy_version 84781 (0.0008) +[2023-10-13 03:48:11,257][46663] Updated weights for policy 1, policy_version 84791 (0.0007) +[2023-10-13 03:48:11,353][46662] Updated weights for policy 0, policy_version 84870 (0.0009) +[2023-10-13 03:48:11,736][46662] Updated weights for policy 0, policy_version 84880 (0.0008) +[2023-10-13 03:48:12,102][46662] Updated weights for policy 0, policy_version 84890 (0.0010) +[2023-10-13 03:48:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173768704. Throughput: 0: 1698.4, 1: 1657.6. Samples: 43444210. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:48:13,607][45375] Avg episode reward: [(0, '60.210'), (1, '58.340')] +[2023-10-13 03:48:15,271][46663] Updated weights for policy 1, policy_version 84801 (0.0008) +[2023-10-13 03:48:15,639][46663] Updated weights for policy 1, policy_version 84811 (0.0010) +[2023-10-13 03:48:16,007][46663] Updated weights for policy 1, policy_version 84821 (0.0010) +[2023-10-13 03:48:16,341][46662] Updated weights for policy 0, policy_version 84900 (0.0008) +[2023-10-13 03:48:16,373][46663] Updated weights for policy 1, policy_version 84831 (0.0007) +[2023-10-13 03:48:16,712][46662] Updated weights for policy 0, policy_version 84910 (0.0007) +[2023-10-13 03:48:17,093][46662] Updated weights for policy 0, policy_version 84920 (0.0010) +[2023-10-13 03:48:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173834240. Throughput: 0: 1674.7, 1: 1674.6. Samples: 43463762. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:48:18,607][45375] Avg episode reward: [(0, '61.460'), (1, '57.460')] +[2023-10-13 03:48:20,554][46663] Updated weights for policy 1, policy_version 84841 (0.0007) +[2023-10-13 03:48:20,914][46663] Updated weights for policy 1, policy_version 84851 (0.0008) +[2023-10-13 03:48:21,090][46662] Updated weights for policy 0, policy_version 84930 (0.0009) +[2023-10-13 03:48:21,285][46663] Updated weights for policy 1, policy_version 84861 (0.0007) +[2023-10-13 03:48:21,472][46662] Updated weights for policy 0, policy_version 84940 (0.0008) +[2023-10-13 03:48:21,837][46662] Updated weights for policy 0, policy_version 84950 (0.0009) +[2023-10-13 03:48:22,208][46662] Updated weights for policy 0, policy_version 84960 (0.0009) +[2023-10-13 03:48:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173899776. Throughput: 0: 1665.5, 1: 1682.8. Samples: 43483822. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:48:23,607][45375] Avg episode reward: [(0, '61.080'), (1, '58.980')] +[2023-10-13 03:48:25,481][46663] Updated weights for policy 1, policy_version 84871 (0.0009) +[2023-10-13 03:48:25,847][46663] Updated weights for policy 1, policy_version 84881 (0.0010) +[2023-10-13 03:48:26,208][46663] Updated weights for policy 1, policy_version 84891 (0.0008) +[2023-10-13 03:48:26,332][46662] Updated weights for policy 0, policy_version 84970 (0.0008) +[2023-10-13 03:48:26,703][46662] Updated weights for policy 0, policy_version 84980 (0.0009) +[2023-10-13 03:48:27,083][46662] Updated weights for policy 0, policy_version 84990 (0.0009) +[2023-10-13 03:48:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173965312. Throughput: 0: 1681.1, 1: 1662.1. Samples: 43494330. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-13 03:48:28,607][45375] Avg episode reward: [(0, '60.520'), (1, '57.860')] +[2023-10-13 03:48:30,217][46663] Updated weights for policy 1, policy_version 84901 (0.0009) +[2023-10-13 03:48:30,584][46663] Updated weights for policy 1, policy_version 84911 (0.0007) +[2023-10-13 03:48:30,946][46663] Updated weights for policy 1, policy_version 84921 (0.0008) +[2023-10-13 03:48:30,995][46662] Updated weights for policy 0, policy_version 85000 (0.0008) +[2023-10-13 03:48:31,367][46662] Updated weights for policy 0, policy_version 85010 (0.0008) +[2023-10-13 03:48:31,744][46662] Updated weights for policy 0, policy_version 85020 (0.0009) +[2023-10-13 03:48:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 174030848. Throughput: 0: 1656.3, 1: 1680.5. Samples: 43513858. Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:48:33,607][45375] Avg episode reward: [(0, '60.910'), (1, '57.470')] +[2023-10-13 03:48:34,994][46663] Updated weights for policy 1, policy_version 84931 (0.0009) +[2023-10-13 03:48:35,366][46663] Updated weights for policy 1, policy_version 84941 (0.0008) +[2023-10-13 03:48:35,732][46663] Updated weights for policy 1, policy_version 84951 (0.0008) +[2023-10-13 03:48:35,875][46662] Updated weights for policy 0, policy_version 85030 (0.0009) +[2023-10-13 03:48:36,234][46662] Updated weights for policy 0, policy_version 85040 (0.0008) +[2023-10-13 03:48:36,611][46662] Updated weights for policy 0, policy_version 85050 (0.0011) +[2023-10-13 03:48:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174096384. Throughput: 0: 1676.4, 1: 1684.8. Samples: 43534668. 
Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:48:38,607][45375] Avg episode reward: [(0, '60.150'), (1, '56.550')] +[2023-10-13 03:48:39,820][46663] Updated weights for policy 1, policy_version 84961 (0.0008) +[2023-10-13 03:48:40,180][46663] Updated weights for policy 1, policy_version 84971 (0.0008) +[2023-10-13 03:48:40,482][46662] Updated weights for policy 0, policy_version 85060 (0.0009) +[2023-10-13 03:48:40,552][46663] Updated weights for policy 1, policy_version 84981 (0.0007) +[2023-10-13 03:48:40,857][46662] Updated weights for policy 0, policy_version 85070 (0.0007) +[2023-10-13 03:48:40,914][46663] Updated weights for policy 1, policy_version 84991 (0.0009) +[2023-10-13 03:48:41,230][46662] Updated weights for policy 0, policy_version 85080 (0.0008) +[2023-10-13 03:48:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174161920. Throughput: 0: 1672.4, 1: 1671.0. Samples: 43544800. Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:48:43,608][45375] Avg episode reward: [(0, '58.980'), (1, '55.520')] +[2023-10-13 03:48:44,737][46663] Updated weights for policy 1, policy_version 85001 (0.0011) +[2023-10-13 03:48:45,105][46663] Updated weights for policy 1, policy_version 85011 (0.0009) +[2023-10-13 03:48:45,430][46662] Updated weights for policy 0, policy_version 85090 (0.0009) +[2023-10-13 03:48:45,468][46663] Updated weights for policy 1, policy_version 85021 (0.0007) +[2023-10-13 03:48:45,803][46662] Updated weights for policy 0, policy_version 85100 (0.0008) +[2023-10-13 03:48:46,171][46662] Updated weights for policy 0, policy_version 85110 (0.0007) +[2023-10-13 03:48:46,547][46662] Updated weights for policy 0, policy_version 85120 (0.0007) +[2023-10-13 03:48:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174227456. Throughput: 0: 1662.4, 1: 1690.0. Samples: 43564822. 
Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:48:48,607][45375] Avg episode reward: [(0, '58.240'), (1, '53.970')] +[2023-10-13 03:48:49,488][46663] Updated weights for policy 1, policy_version 85031 (0.0007) +[2023-10-13 03:48:49,850][46663] Updated weights for policy 1, policy_version 85041 (0.0011) +[2023-10-13 03:48:50,215][46663] Updated weights for policy 1, policy_version 85051 (0.0010) +[2023-10-13 03:48:50,625][46662] Updated weights for policy 0, policy_version 85130 (0.0007) +[2023-10-13 03:48:50,992][46662] Updated weights for policy 0, policy_version 85140 (0.0008) +[2023-10-13 03:48:51,375][46662] Updated weights for policy 0, policy_version 85150 (0.0007) +[2023-10-13 03:48:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174292992. Throughput: 0: 1681.3, 1: 1693.9. Samples: 43585578. Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:48:53,608][45375] Avg episode reward: [(0, '58.850'), (1, '53.130')] +[2023-10-13 03:48:54,336][46663] Updated weights for policy 1, policy_version 85061 (0.0008) +[2023-10-13 03:48:54,739][46663] Updated weights for policy 1, policy_version 85071 (0.0009) +[2023-10-13 03:48:55,104][46663] Updated weights for policy 1, policy_version 85081 (0.0009) +[2023-10-13 03:48:55,480][46662] Updated weights for policy 0, policy_version 85160 (0.0009) +[2023-10-13 03:48:55,849][46662] Updated weights for policy 0, policy_version 85170 (0.0008) +[2023-10-13 03:48:56,214][46662] Updated weights for policy 0, policy_version 85180 (0.0010) +[2023-10-13 03:48:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174358528. Throughput: 0: 1667.0, 1: 1686.8. Samples: 43595128. 
Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:48:58,607][45375] Avg episode reward: [(0, '58.220'), (1, '53.230')] +[2023-10-13 03:48:59,332][46663] Updated weights for policy 1, policy_version 85091 (0.0008) +[2023-10-13 03:48:59,706][46663] Updated weights for policy 1, policy_version 85101 (0.0008) +[2023-10-13 03:49:00,077][46663] Updated weights for policy 1, policy_version 85111 (0.0010) +[2023-10-13 03:49:00,155][46662] Updated weights for policy 0, policy_version 85190 (0.0008) +[2023-10-13 03:49:00,532][46662] Updated weights for policy 0, policy_version 85200 (0.0008) +[2023-10-13 03:49:00,907][46662] Updated weights for policy 0, policy_version 85210 (0.0010) +[2023-10-13 03:49:03,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174424064. Throughput: 0: 1677.7, 1: 1693.4. Samples: 43615462. Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:49:03,607][45375] Avg episode reward: [(0, '57.640'), (1, '52.970')] +[2023-10-13 03:49:03,965][46663] Updated weights for policy 1, policy_version 85121 (0.0008) +[2023-10-13 03:49:04,329][46663] Updated weights for policy 1, policy_version 85131 (0.0010) +[2023-10-13 03:49:04,697][46663] Updated weights for policy 1, policy_version 85141 (0.0009) +[2023-10-13 03:49:04,976][46662] Updated weights for policy 0, policy_version 85220 (0.0009) +[2023-10-13 03:49:05,069][46663] Updated weights for policy 1, policy_version 85151 (0.0009) +[2023-10-13 03:49:05,356][46662] Updated weights for policy 0, policy_version 85230 (0.0007) +[2023-10-13 03:49:05,719][46662] Updated weights for policy 0, policy_version 85240 (0.0007) +[2023-10-13 03:49:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174489600. Throughput: 0: 1693.2, 1: 1698.4. Samples: 43636442. 
Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:49:08,607][45375] Avg episode reward: [(0, '57.060'), (1, '52.820')] +[2023-10-13 03:49:09,074][46663] Updated weights for policy 1, policy_version 85161 (0.0008) +[2023-10-13 03:49:09,441][46663] Updated weights for policy 1, policy_version 85171 (0.0009) +[2023-10-13 03:49:09,776][46662] Updated weights for policy 0, policy_version 85250 (0.0010) +[2023-10-13 03:49:09,813][46663] Updated weights for policy 1, policy_version 85181 (0.0008) +[2023-10-13 03:49:10,149][46662] Updated weights for policy 0, policy_version 85260 (0.0008) +[2023-10-13 03:49:10,521][46662] Updated weights for policy 0, policy_version 85270 (0.0007) +[2023-10-13 03:49:10,895][46662] Updated weights for policy 0, policy_version 85280 (0.0008) +[2023-10-13 03:49:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174555136. Throughput: 0: 1668.1, 1: 1695.0. Samples: 43645670. Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:49:13,607][45375] Avg episode reward: [(0, '55.310'), (1, '52.580')] +[2023-10-13 03:49:13,973][46663] Updated weights for policy 1, policy_version 85191 (0.0008) +[2023-10-13 03:49:14,336][46663] Updated weights for policy 1, policy_version 85201 (0.0011) +[2023-10-13 03:49:14,709][46663] Updated weights for policy 1, policy_version 85211 (0.0009) +[2023-10-13 03:49:14,978][46662] Updated weights for policy 0, policy_version 85290 (0.0009) +[2023-10-13 03:49:15,351][46662] Updated weights for policy 0, policy_version 85300 (0.0009) +[2023-10-13 03:49:15,721][46662] Updated weights for policy 0, policy_version 85310 (0.0009) +[2023-10-13 03:49:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174620672. Throughput: 0: 1690.9, 1: 1696.4. Samples: 43666286. 
Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:49:18,607][45375] Avg episode reward: [(0, '54.600'), (1, '52.900')] +[2023-10-13 03:49:18,709][46663] Updated weights for policy 1, policy_version 85221 (0.0008) +[2023-10-13 03:49:19,069][46663] Updated weights for policy 1, policy_version 85231 (0.0007) +[2023-10-13 03:49:19,432][46663] Updated weights for policy 1, policy_version 85241 (0.0008) +[2023-10-13 03:49:19,743][46662] Updated weights for policy 0, policy_version 85320 (0.0009) +[2023-10-13 03:49:20,119][46662] Updated weights for policy 0, policy_version 85330 (0.0008) +[2023-10-13 03:49:20,496][46662] Updated weights for policy 0, policy_version 85340 (0.0008) +[2023-10-13 03:49:23,554][46663] Updated weights for policy 1, policy_version 85251 (0.0008) +[2023-10-13 03:49:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174686208. Throughput: 0: 1691.2, 1: 1695.8. Samples: 43687086. Policy #0 lag: (min: 31.0, avg: 44.2, max: 63.0) +[2023-10-13 03:49:23,607][45375] Avg episode reward: [(0, '54.000'), (1, '52.130')] +[2023-10-13 03:49:23,930][46663] Updated weights for policy 1, policy_version 85261 (0.0008) +[2023-10-13 03:49:24,292][46663] Updated weights for policy 1, policy_version 85271 (0.0007) +[2023-10-13 03:49:24,457][46662] Updated weights for policy 0, policy_version 85350 (0.0008) +[2023-10-13 03:49:24,816][46662] Updated weights for policy 0, policy_version 85360 (0.0008) +[2023-10-13 03:49:25,190][46662] Updated weights for policy 0, policy_version 85370 (0.0008) +[2023-10-13 03:49:28,382][46663] Updated weights for policy 1, policy_version 85281 (0.0009) +[2023-10-13 03:49:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174751744. Throughput: 0: 1670.0, 1: 1698.9. Samples: 43696402. 
Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:28,607][45375] Avg episode reward: [(0, '54.250'), (1, '51.870')] +[2023-10-13 03:49:28,747][46663] Updated weights for policy 1, policy_version 85291 (0.0008) +[2023-10-13 03:49:29,115][46663] Updated weights for policy 1, policy_version 85301 (0.0009) +[2023-10-13 03:49:29,177][46662] Updated weights for policy 0, policy_version 85380 (0.0007) +[2023-10-13 03:49:29,484][46663] Updated weights for policy 1, policy_version 85311 (0.0009) +[2023-10-13 03:49:29,544][46662] Updated weights for policy 0, policy_version 85390 (0.0009) +[2023-10-13 03:49:29,915][46662] Updated weights for policy 0, policy_version 85400 (0.0009) +[2023-10-13 03:49:33,527][46663] Updated weights for policy 1, policy_version 85321 (0.0009) +[2023-10-13 03:49:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174817280. Throughput: 0: 1692.1, 1: 1692.8. Samples: 43717142. Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:33,607][45375] Avg episode reward: [(0, '54.200'), (1, '52.000')] +[2023-10-13 03:49:33,885][46662] Updated weights for policy 0, policy_version 85410 (0.0007) +[2023-10-13 03:49:33,889][46663] Updated weights for policy 1, policy_version 85331 (0.0010) +[2023-10-13 03:49:34,245][46662] Updated weights for policy 0, policy_version 85420 (0.0008) +[2023-10-13 03:49:34,264][46663] Updated weights for policy 1, policy_version 85341 (0.0007) +[2023-10-13 03:49:34,621][46662] Updated weights for policy 0, policy_version 85430 (0.0007) +[2023-10-13 03:49:34,989][46662] Updated weights for policy 0, policy_version 85440 (0.0009) +[2023-10-13 03:49:38,255][46663] Updated weights for policy 1, policy_version 85351 (0.0008) +[2023-10-13 03:49:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174882816. Throughput: 0: 1694.6, 1: 1681.3. Samples: 43737492. 
Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:38,607][45375] Avg episode reward: [(0, '55.060'), (1, '52.620')] +[2023-10-13 03:49:38,626][46663] Updated weights for policy 1, policy_version 85361 (0.0007) +[2023-10-13 03:49:38,987][46663] Updated weights for policy 1, policy_version 85371 (0.0008) +[2023-10-13 03:49:39,151][46662] Updated weights for policy 0, policy_version 85450 (0.0009) +[2023-10-13 03:49:39,517][46662] Updated weights for policy 0, policy_version 85460 (0.0009) +[2023-10-13 03:49:39,883][46662] Updated weights for policy 0, policy_version 85470 (0.0011) +[2023-10-13 03:49:42,939][46663] Updated weights for policy 1, policy_version 85381 (0.0008) +[2023-10-13 03:49:43,339][46663] Updated weights for policy 1, policy_version 85391 (0.0008) +[2023-10-13 03:49:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174948352. Throughput: 0: 1679.7, 1: 1693.7. Samples: 43746932. Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:43,608][45375] Avg episode reward: [(0, '55.910'), (1, '53.640')] +[2023-10-13 03:49:43,709][46663] Updated weights for policy 1, policy_version 85401 (0.0010) +[2023-10-13 03:49:44,005][46662] Updated weights for policy 0, policy_version 85480 (0.0010) +[2023-10-13 03:49:44,383][46662] Updated weights for policy 0, policy_version 85490 (0.0008) +[2023-10-13 03:49:44,748][46662] Updated weights for policy 0, policy_version 85500 (0.0007) +[2023-10-13 03:49:47,907][46663] Updated weights for policy 1, policy_version 85411 (0.0009) +[2023-10-13 03:49:48,268][46663] Updated weights for policy 1, policy_version 85421 (0.0009) +[2023-10-13 03:49:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175013888. Throughput: 0: 1686.4, 1: 1684.0. Samples: 43767130. 
Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:48,607][45375] Avg episode reward: [(0, '55.170'), (1, '54.040')] +[2023-10-13 03:49:48,647][46663] Updated weights for policy 1, policy_version 85431 (0.0008) +[2023-10-13 03:49:48,978][46662] Updated weights for policy 0, policy_version 85510 (0.0007) +[2023-10-13 03:49:49,355][46662] Updated weights for policy 0, policy_version 85520 (0.0008) +[2023-10-13 03:49:49,725][46662] Updated weights for policy 0, policy_version 85530 (0.0007) +[2023-10-13 03:49:52,709][46663] Updated weights for policy 1, policy_version 85441 (0.0010) +[2023-10-13 03:49:53,084][46663] Updated weights for policy 1, policy_version 85451 (0.0010) +[2023-10-13 03:49:53,455][46663] Updated weights for policy 1, policy_version 85461 (0.0008) +[2023-10-13 03:49:53,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 175079424. Throughput: 0: 1687.9, 1: 1666.3. Samples: 43787382. Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:53,607][45375] Avg episode reward: [(0, '54.780'), (1, '55.480')] +[2023-10-13 03:49:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000085536_87588864.pth... +[2023-10-13 03:49:53,646][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000083968_85983232.pth +[2023-10-13 03:49:53,827][46663] Updated weights for policy 1, policy_version 85471 (0.0009) +[2023-10-13 03:49:53,855][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000085472_87523328.pth... 
+[2023-10-13 03:49:53,888][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000083904_85917696.pth +[2023-10-13 03:49:53,937][46662] Updated weights for policy 0, policy_version 85540 (0.0008) +[2023-10-13 03:49:54,300][46662] Updated weights for policy 0, policy_version 85550 (0.0009) +[2023-10-13 03:49:54,673][46662] Updated weights for policy 0, policy_version 85560 (0.0009) +[2023-10-13 03:49:58,083][46663] Updated weights for policy 1, policy_version 85481 (0.0009) +[2023-10-13 03:49:58,457][46663] Updated weights for policy 1, policy_version 85491 (0.0007) +[2023-10-13 03:49:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175144960. Throughput: 0: 1683.3, 1: 1680.8. Samples: 43797054. Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:49:58,607][45375] Avg episode reward: [(0, '56.320'), (1, '56.410')] +[2023-10-13 03:49:58,738][46662] Updated weights for policy 0, policy_version 85570 (0.0011) +[2023-10-13 03:49:58,817][46663] Updated weights for policy 1, policy_version 85501 (0.0007) +[2023-10-13 03:49:59,100][46662] Updated weights for policy 0, policy_version 85580 (0.0009) +[2023-10-13 03:49:59,474][46662] Updated weights for policy 0, policy_version 85590 (0.0008) +[2023-10-13 03:49:59,847][46662] Updated weights for policy 0, policy_version 85600 (0.0010) +[2023-10-13 03:50:02,776][46663] Updated weights for policy 1, policy_version 85511 (0.0009) +[2023-10-13 03:50:03,134][46663] Updated weights for policy 1, policy_version 85521 (0.0008) +[2023-10-13 03:50:03,498][46663] Updated weights for policy 1, policy_version 85531 (0.0010) +[2023-10-13 03:50:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175210496. Throughput: 0: 1686.0, 1: 1681.0. Samples: 43817800. 
Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:50:03,607][45375] Avg episode reward: [(0, '56.530'), (1, '56.570')] +[2023-10-13 03:50:03,908][46662] Updated weights for policy 0, policy_version 85610 (0.0007) +[2023-10-13 03:50:04,271][46662] Updated weights for policy 0, policy_version 85620 (0.0008) +[2023-10-13 03:50:04,644][46662] Updated weights for policy 0, policy_version 85630 (0.0008) +[2023-10-13 03:50:07,636][46663] Updated weights for policy 1, policy_version 85541 (0.0007) +[2023-10-13 03:50:08,008][46663] Updated weights for policy 1, policy_version 85551 (0.0008) +[2023-10-13 03:50:08,362][46663] Updated weights for policy 1, policy_version 85561 (0.0009) +[2023-10-13 03:50:08,585][46662] Updated weights for policy 0, policy_version 85640 (0.0008) +[2023-10-13 03:50:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175276032. Throughput: 0: 1690.8, 1: 1658.3. Samples: 43837794. Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:50:08,607][45375] Avg episode reward: [(0, '56.260'), (1, '56.860')] +[2023-10-13 03:50:08,961][46662] Updated weights for policy 0, policy_version 85650 (0.0010) +[2023-10-13 03:50:09,336][46662] Updated weights for policy 0, policy_version 85660 (0.0009) +[2023-10-13 03:50:12,597][46663] Updated weights for policy 1, policy_version 85571 (0.0011) +[2023-10-13 03:50:12,963][46663] Updated weights for policy 1, policy_version 85581 (0.0010) +[2023-10-13 03:50:13,333][46663] Updated weights for policy 1, policy_version 85591 (0.0007) +[2023-10-13 03:50:13,402][46662] Updated weights for policy 0, policy_version 85670 (0.0008) +[2023-10-13 03:50:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175341568. Throughput: 0: 1687.6, 1: 1676.1. Samples: 43847768. 
Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:50:13,607][45375] Avg episode reward: [(0, '56.850'), (1, '55.060')] +[2023-10-13 03:50:13,784][46662] Updated weights for policy 0, policy_version 85680 (0.0009) +[2023-10-13 03:50:14,157][46662] Updated weights for policy 0, policy_version 85690 (0.0008) +[2023-10-13 03:50:17,373][46663] Updated weights for policy 1, policy_version 85601 (0.0008) +[2023-10-13 03:50:17,744][46663] Updated weights for policy 1, policy_version 85611 (0.0007) +[2023-10-13 03:50:18,105][46663] Updated weights for policy 1, policy_version 85621 (0.0009) +[2023-10-13 03:50:18,285][46662] Updated weights for policy 0, policy_version 85700 (0.0007) +[2023-10-13 03:50:18,478][46663] Updated weights for policy 1, policy_version 85631 (0.0009) +[2023-10-13 03:50:18,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175439872. Throughput: 0: 1680.9, 1: 1675.6. Samples: 43868188. Policy #0 lag: (min: 17.0, avg: 22.1, max: 49.0) +[2023-10-13 03:50:18,608][45375] Avg episode reward: [(0, '57.290'), (1, '54.980')] +[2023-10-13 03:50:18,652][46662] Updated weights for policy 0, policy_version 85710 (0.0008) +[2023-10-13 03:50:19,027][46662] Updated weights for policy 0, policy_version 85720 (0.0008) +[2023-10-13 03:50:22,489][46663] Updated weights for policy 1, policy_version 85641 (0.0010) +[2023-10-13 03:50:22,858][46663] Updated weights for policy 1, policy_version 85651 (0.0008) +[2023-10-13 03:50:23,008][46662] Updated weights for policy 0, policy_version 85730 (0.0010) +[2023-10-13 03:50:23,222][46663] Updated weights for policy 1, policy_version 85661 (0.0007) +[2023-10-13 03:50:23,375][46662] Updated weights for policy 0, policy_version 85740 (0.0007) +[2023-10-13 03:50:23,607][45375] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175505408. Throughput: 0: 1685.4, 1: 1661.0. Samples: 43888080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:23,608][45375] Avg episode reward: [(0, '57.530'), (1, '54.510')] +[2023-10-13 03:50:23,751][46662] Updated weights for policy 0, policy_version 85750 (0.0007) +[2023-10-13 03:50:24,123][46662] Updated weights for policy 0, policy_version 85760 (0.0007) +[2023-10-13 03:50:27,264][46663] Updated weights for policy 1, policy_version 85671 (0.0008) +[2023-10-13 03:50:27,634][46663] Updated weights for policy 1, policy_version 85681 (0.0007) +[2023-10-13 03:50:27,992][46663] Updated weights for policy 1, policy_version 85691 (0.0009) +[2023-10-13 03:50:28,157][46662] Updated weights for policy 0, policy_version 85770 (0.0008) +[2023-10-13 03:50:28,527][46662] Updated weights for policy 0, policy_version 85780 (0.0008) +[2023-10-13 03:50:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175570944. Throughput: 0: 1686.7, 1: 1678.7. Samples: 43898374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:28,607][45375] Avg episode reward: [(0, '57.370'), (1, '54.980')] +[2023-10-13 03:50:28,901][46662] Updated weights for policy 0, policy_version 85790 (0.0009) +[2023-10-13 03:50:32,158][46663] Updated weights for policy 1, policy_version 85701 (0.0008) +[2023-10-13 03:50:32,555][46663] Updated weights for policy 1, policy_version 85711 (0.0009) +[2023-10-13 03:50:32,924][46663] Updated weights for policy 1, policy_version 85721 (0.0009) +[2023-10-13 03:50:33,110][46662] Updated weights for policy 0, policy_version 85800 (0.0008) +[2023-10-13 03:50:33,482][46662] Updated weights for policy 0, policy_version 85810 (0.0008) +[2023-10-13 03:50:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175636480. Throughput: 0: 1683.5, 1: 1673.2. Samples: 43918182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:33,607][45375] Avg episode reward: [(0, '58.340'), (1, '56.200')] +[2023-10-13 03:50:33,857][46662] Updated weights for policy 0, policy_version 85820 (0.0010) +[2023-10-13 03:50:36,936][46663] Updated weights for policy 1, policy_version 85731 (0.0009) +[2023-10-13 03:50:37,306][46663] Updated weights for policy 1, policy_version 85741 (0.0011) +[2023-10-13 03:50:37,666][46663] Updated weights for policy 1, policy_version 85751 (0.0009) +[2023-10-13 03:50:37,923][46662] Updated weights for policy 0, policy_version 85830 (0.0008) +[2023-10-13 03:50:38,294][46662] Updated weights for policy 0, policy_version 85840 (0.0009) +[2023-10-13 03:50:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175702016. Throughput: 0: 1676.7, 1: 1667.9. Samples: 43937890. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:38,607][45375] Avg episode reward: [(0, '59.240'), (1, '56.110')] +[2023-10-13 03:50:38,661][46662] Updated weights for policy 0, policy_version 85850 (0.0010) +[2023-10-13 03:50:41,632][46663] Updated weights for policy 1, policy_version 85761 (0.0010) +[2023-10-13 03:50:42,000][46663] Updated weights for policy 1, policy_version 85771 (0.0008) +[2023-10-13 03:50:42,367][46663] Updated weights for policy 1, policy_version 85781 (0.0008) +[2023-10-13 03:50:42,632][46662] Updated weights for policy 0, policy_version 85860 (0.0009) +[2023-10-13 03:50:42,738][46663] Updated weights for policy 1, policy_version 85791 (0.0008) +[2023-10-13 03:50:43,001][46662] Updated weights for policy 0, policy_version 85870 (0.0009) +[2023-10-13 03:50:43,379][46662] Updated weights for policy 0, policy_version 85880 (0.0011) +[2023-10-13 03:50:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175767552. Throughput: 0: 1680.9, 1: 1685.0. Samples: 43948520. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:43,608][45375] Avg episode reward: [(0, '60.270'), (1, '56.410')] +[2023-10-13 03:50:46,668][46663] Updated weights for policy 1, policy_version 85801 (0.0008) +[2023-10-13 03:50:47,041][46663] Updated weights for policy 1, policy_version 85811 (0.0008) +[2023-10-13 03:50:47,410][46663] Updated weights for policy 1, policy_version 85821 (0.0009) +[2023-10-13 03:50:47,439][46662] Updated weights for policy 0, policy_version 85890 (0.0007) +[2023-10-13 03:50:47,808][46662] Updated weights for policy 0, policy_version 85900 (0.0008) +[2023-10-13 03:50:48,173][46662] Updated weights for policy 0, policy_version 85910 (0.0009) +[2023-10-13 03:50:48,539][46662] Updated weights for policy 0, policy_version 85920 (0.0009) +[2023-10-13 03:50:48,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 175865856. Throughput: 0: 1680.5, 1: 1664.0. Samples: 43968304. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:48,607][45375] Avg episode reward: [(0, '61.380'), (1, '56.310')] +[2023-10-13 03:50:51,568][46663] Updated weights for policy 1, policy_version 85831 (0.0007) +[2023-10-13 03:50:51,944][46663] Updated weights for policy 1, policy_version 85841 (0.0008) +[2023-10-13 03:50:52,299][46663] Updated weights for policy 1, policy_version 85851 (0.0008) +[2023-10-13 03:50:52,608][46662] Updated weights for policy 0, policy_version 85930 (0.0010) +[2023-10-13 03:50:52,989][46662] Updated weights for policy 0, policy_version 85940 (0.0010) +[2023-10-13 03:50:53,359][46662] Updated weights for policy 0, policy_version 85950 (0.0008) +[2023-10-13 03:50:53,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 175931392. Throughput: 0: 1665.0, 1: 1677.0. Samples: 43988186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:53,608][45375] Avg episode reward: [(0, '60.700'), (1, '56.850')] +[2023-10-13 03:50:56,379][46663] Updated weights for policy 1, policy_version 85861 (0.0008) +[2023-10-13 03:50:56,755][46663] Updated weights for policy 1, policy_version 85871 (0.0007) +[2023-10-13 03:50:57,125][46663] Updated weights for policy 1, policy_version 85881 (0.0007) +[2023-10-13 03:50:57,303][46662] Updated weights for policy 0, policy_version 85960 (0.0007) +[2023-10-13 03:50:57,684][46662] Updated weights for policy 0, policy_version 85970 (0.0009) +[2023-10-13 03:50:58,052][46662] Updated weights for policy 0, policy_version 85980 (0.0009) +[2023-10-13 03:50:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 175996928. Throughput: 0: 1682.5, 1: 1678.6. Samples: 43999018. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:50:58,607][45375] Avg episode reward: [(0, '60.180'), (1, '56.500')] +[2023-10-13 03:51:01,148][46663] Updated weights for policy 1, policy_version 85891 (0.0008) +[2023-10-13 03:51:01,529][46663] Updated weights for policy 1, policy_version 85901 (0.0009) +[2023-10-13 03:51:01,895][46663] Updated weights for policy 1, policy_version 85911 (0.0009) +[2023-10-13 03:51:01,932][46662] Updated weights for policy 0, policy_version 85990 (0.0010) +[2023-10-13 03:51:02,299][46662] Updated weights for policy 0, policy_version 86000 (0.0009) +[2023-10-13 03:51:02,666][46662] Updated weights for policy 0, policy_version 86010 (0.0009) +[2023-10-13 03:51:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176062464. Throughput: 0: 1687.3, 1: 1657.0. Samples: 44018680. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:51:03,607][45375] Avg episode reward: [(0, '61.140'), (1, '57.280')] +[2023-10-13 03:51:06,091][46663] Updated weights for policy 1, policy_version 85921 (0.0009) +[2023-10-13 03:51:06,455][46663] Updated weights for policy 1, policy_version 85931 (0.0008) +[2023-10-13 03:51:06,728][46662] Updated weights for policy 0, policy_version 86020 (0.0007) +[2023-10-13 03:51:06,819][46663] Updated weights for policy 1, policy_version 85941 (0.0009) +[2023-10-13 03:51:07,102][46662] Updated weights for policy 0, policy_version 86030 (0.0007) +[2023-10-13 03:51:07,184][46663] Updated weights for policy 1, policy_version 85951 (0.0010) +[2023-10-13 03:51:07,475][46662] Updated weights for policy 0, policy_version 86040 (0.0009) +[2023-10-13 03:51:08,606][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176128000. Throughput: 0: 1659.7, 1: 1679.4. Samples: 44038336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:51:08,607][45375] Avg episode reward: [(0, '60.790'), (1, '57.610')] +[2023-10-13 03:51:11,221][46663] Updated weights for policy 1, policy_version 85961 (0.0010) +[2023-10-13 03:51:11,592][46663] Updated weights for policy 1, policy_version 85971 (0.0011) +[2023-10-13 03:51:11,603][46662] Updated weights for policy 0, policy_version 86050 (0.0007) +[2023-10-13 03:51:11,954][46663] Updated weights for policy 1, policy_version 85981 (0.0007) +[2023-10-13 03:51:11,977][46662] Updated weights for policy 0, policy_version 86060 (0.0007) +[2023-10-13 03:51:12,340][46662] Updated weights for policy 0, policy_version 86070 (0.0010) +[2023-10-13 03:51:12,709][46662] Updated weights for policy 0, policy_version 86080 (0.0010) +[2023-10-13 03:51:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 176193536. Throughput: 0: 1685.4, 1: 1671.6. Samples: 44049442. 
Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:13,608][45375] Avg episode reward: [(0, '58.560'), (1, '58.010')] +[2023-10-13 03:51:16,197][46663] Updated weights for policy 1, policy_version 85991 (0.0008) +[2023-10-13 03:51:16,560][46663] Updated weights for policy 1, policy_version 86001 (0.0007) +[2023-10-13 03:51:16,808][46662] Updated weights for policy 0, policy_version 86090 (0.0008) +[2023-10-13 03:51:16,933][46663] Updated weights for policy 1, policy_version 86011 (0.0007) +[2023-10-13 03:51:17,173][46662] Updated weights for policy 0, policy_version 86100 (0.0007) +[2023-10-13 03:51:17,553][46662] Updated weights for policy 0, policy_version 86110 (0.0009) +[2023-10-13 03:51:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 176259072. Throughput: 0: 1682.8, 1: 1665.9. Samples: 44068872. Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:18,607][45375] Avg episode reward: [(0, '58.550'), (1, '58.460')] +[2023-10-13 03:51:21,075][46663] Updated weights for policy 1, policy_version 86021 (0.0007) +[2023-10-13 03:51:21,455][46663] Updated weights for policy 1, policy_version 86031 (0.0010) +[2023-10-13 03:51:21,786][46662] Updated weights for policy 0, policy_version 86120 (0.0009) +[2023-10-13 03:51:21,813][46663] Updated weights for policy 1, policy_version 86041 (0.0009) +[2023-10-13 03:51:22,159][46662] Updated weights for policy 0, policy_version 86130 (0.0009) +[2023-10-13 03:51:22,521][46662] Updated weights for policy 0, policy_version 86140 (0.0008) +[2023-10-13 03:51:23,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 176324608. Throughput: 0: 1666.8, 1: 1677.9. Samples: 44088400. 
Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:23,607][45375] Avg episode reward: [(0, '59.490'), (1, '59.370')] +[2023-10-13 03:51:25,800][46663] Updated weights for policy 1, policy_version 86051 (0.0007) +[2023-10-13 03:51:26,163][46663] Updated weights for policy 1, policy_version 86061 (0.0008) +[2023-10-13 03:51:26,523][46663] Updated weights for policy 1, policy_version 86071 (0.0007) +[2023-10-13 03:51:26,578][46662] Updated weights for policy 0, policy_version 86150 (0.0008) +[2023-10-13 03:51:26,942][46662] Updated weights for policy 0, policy_version 86160 (0.0009) +[2023-10-13 03:51:27,304][46662] Updated weights for policy 0, policy_version 86170 (0.0009) +[2023-10-13 03:51:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 176390144. Throughput: 0: 1692.8, 1: 1661.2. Samples: 44099446. Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:28,607][45375] Avg episode reward: [(0, '59.130'), (1, '59.840')] +[2023-10-13 03:51:30,412][46663] Updated weights for policy 1, policy_version 86081 (0.0009) +[2023-10-13 03:51:30,782][46663] Updated weights for policy 1, policy_version 86091 (0.0007) +[2023-10-13 03:51:31,139][46663] Updated weights for policy 1, policy_version 86101 (0.0009) +[2023-10-13 03:51:31,391][46662] Updated weights for policy 0, policy_version 86180 (0.0010) +[2023-10-13 03:51:31,512][46663] Updated weights for policy 1, policy_version 86111 (0.0008) +[2023-10-13 03:51:31,760][46662] Updated weights for policy 0, policy_version 86190 (0.0008) +[2023-10-13 03:51:32,131][46662] Updated weights for policy 0, policy_version 86200 (0.0008) +[2023-10-13 03:51:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 176455680. Throughput: 0: 1677.8, 1: 1672.8. Samples: 44119080. 
Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:33,607][45375] Avg episode reward: [(0, '59.600'), (1, '59.890')] +[2023-10-13 03:51:35,632][46663] Updated weights for policy 1, policy_version 86121 (0.0007) +[2023-10-13 03:51:36,004][46663] Updated weights for policy 1, policy_version 86131 (0.0008) +[2023-10-13 03:51:36,169][46662] Updated weights for policy 0, policy_version 86210 (0.0008) +[2023-10-13 03:51:36,365][46663] Updated weights for policy 1, policy_version 86141 (0.0008) +[2023-10-13 03:51:36,547][46662] Updated weights for policy 0, policy_version 86220 (0.0007) +[2023-10-13 03:51:36,927][46662] Updated weights for policy 0, policy_version 86230 (0.0008) +[2023-10-13 03:51:37,291][46662] Updated weights for policy 0, policy_version 86240 (0.0008) +[2023-10-13 03:51:38,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176521216. Throughput: 0: 1672.3, 1: 1683.1. Samples: 44139178. Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:38,607][45375] Avg episode reward: [(0, '60.530'), (1, '58.840')] +[2023-10-13 03:51:40,487][46663] Updated weights for policy 1, policy_version 86151 (0.0008) +[2023-10-13 03:51:40,860][46663] Updated weights for policy 1, policy_version 86161 (0.0009) +[2023-10-13 03:51:41,227][46663] Updated weights for policy 1, policy_version 86171 (0.0009) +[2023-10-13 03:51:41,230][46662] Updated weights for policy 0, policy_version 86250 (0.0008) +[2023-10-13 03:51:41,599][46662] Updated weights for policy 0, policy_version 86260 (0.0009) +[2023-10-13 03:51:41,967][46662] Updated weights for policy 0, policy_version 86270 (0.0007) +[2023-10-13 03:51:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 176586752. Throughput: 0: 1685.9, 1: 1659.4. Samples: 44149558. 
Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:43,607][45375] Avg episode reward: [(0, '58.830'), (1, '58.720')] +[2023-10-13 03:51:45,392][46663] Updated weights for policy 1, policy_version 86181 (0.0008) +[2023-10-13 03:51:45,757][46663] Updated weights for policy 1, policy_version 86191 (0.0007) +[2023-10-13 03:51:45,893][46662] Updated weights for policy 0, policy_version 86280 (0.0009) +[2023-10-13 03:51:46,128][46663] Updated weights for policy 1, policy_version 86201 (0.0009) +[2023-10-13 03:51:46,267][46662] Updated weights for policy 0, policy_version 86290 (0.0008) +[2023-10-13 03:51:46,637][46662] Updated weights for policy 0, policy_version 86300 (0.0010) +[2023-10-13 03:51:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176652288. Throughput: 0: 1658.5, 1: 1682.4. Samples: 44169022. Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:48,607][45375] Avg episode reward: [(0, '58.060'), (1, '58.060')] +[2023-10-13 03:51:50,238][46663] Updated weights for policy 1, policy_version 86211 (0.0008) +[2023-10-13 03:51:50,610][46663] Updated weights for policy 1, policy_version 86221 (0.0008) +[2023-10-13 03:51:50,822][46662] Updated weights for policy 0, policy_version 86310 (0.0008) +[2023-10-13 03:51:50,979][46663] Updated weights for policy 1, policy_version 86231 (0.0007) +[2023-10-13 03:51:51,191][46662] Updated weights for policy 0, policy_version 86320 (0.0007) +[2023-10-13 03:51:51,571][46662] Updated weights for policy 0, policy_version 86330 (0.0009) +[2023-10-13 03:51:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176717824. Throughput: 0: 1673.5, 1: 1681.2. Samples: 44189296. 
Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:53,607][45375] Avg episode reward: [(0, '57.190'), (1, '57.380')] +[2023-10-13 03:51:53,615][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000086336_88408064.pth... +[2023-10-13 03:51:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000086240_88309760.pth... +[2023-10-13 03:51:53,655][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000084672_86704128.pth +[2023-10-13 03:51:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000084768_86802432.pth +[2023-10-13 03:51:53,659][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000086240_88309760.pth +[2023-10-13 03:51:53,661][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000086336_88408064.pth +[2023-10-13 03:51:54,857][46663] Updated weights for policy 1, policy_version 86241 (0.0008) +[2023-10-13 03:51:55,223][46663] Updated weights for policy 1, policy_version 86251 (0.0007) +[2023-10-13 03:51:55,584][46662] Updated weights for policy 0, policy_version 86340 (0.0008) +[2023-10-13 03:51:55,596][46663] Updated weights for policy 1, policy_version 86261 (0.0009) +[2023-10-13 03:51:55,954][46662] Updated weights for policy 0, policy_version 86350 (0.0009) +[2023-10-13 03:51:55,969][46663] Updated weights for policy 1, policy_version 86271 (0.0008) +[2023-10-13 03:51:56,317][46662] Updated weights for policy 0, policy_version 86360 (0.0008) +[2023-10-13 03:51:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176783360. Throughput: 0: 1671.5, 1: 1664.3. Samples: 44199552. 
Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:51:58,607][45375] Avg episode reward: [(0, '56.050'), (1, '58.170')] +[2023-10-13 03:52:00,035][46663] Updated weights for policy 1, policy_version 86281 (0.0008) +[2023-10-13 03:52:00,397][46663] Updated weights for policy 1, policy_version 86291 (0.0009) +[2023-10-13 03:52:00,470][46662] Updated weights for policy 0, policy_version 86370 (0.0007) +[2023-10-13 03:52:00,757][46663] Updated weights for policy 1, policy_version 86301 (0.0007) +[2023-10-13 03:52:00,841][46662] Updated weights for policy 0, policy_version 86380 (0.0008) +[2023-10-13 03:52:01,202][46662] Updated weights for policy 0, policy_version 86390 (0.0008) +[2023-10-13 03:52:01,560][46662] Updated weights for policy 0, policy_version 86400 (0.0008) +[2023-10-13 03:52:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176848896. Throughput: 0: 1661.8, 1: 1681.7. Samples: 44219328. Policy #0 lag: (min: 30.0, avg: 32.2, max: 62.0) +[2023-10-13 03:52:03,607][45375] Avg episode reward: [(0, '56.330'), (1, '57.460')] +[2023-10-13 03:52:04,826][46663] Updated weights for policy 1, policy_version 86311 (0.0009) +[2023-10-13 03:52:05,185][46663] Updated weights for policy 1, policy_version 86321 (0.0008) +[2023-10-13 03:52:05,554][46663] Updated weights for policy 1, policy_version 86331 (0.0008) +[2023-10-13 03:52:05,600][46662] Updated weights for policy 0, policy_version 86410 (0.0009) +[2023-10-13 03:52:05,964][46662] Updated weights for policy 0, policy_version 86420 (0.0008) +[2023-10-13 03:52:06,330][46662] Updated weights for policy 0, policy_version 86430 (0.0009) +[2023-10-13 03:52:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176914432. Throughput: 0: 1684.1, 1: 1688.4. Samples: 44240162. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:08,607][45375] Avg episode reward: [(0, '56.380'), (1, '57.160')] +[2023-10-13 03:52:09,742][46663] Updated weights for policy 1, policy_version 86341 (0.0008) +[2023-10-13 03:52:10,144][46663] Updated weights for policy 1, policy_version 86351 (0.0008) +[2023-10-13 03:52:10,497][46662] Updated weights for policy 0, policy_version 86440 (0.0007) +[2023-10-13 03:52:10,517][46663] Updated weights for policy 1, policy_version 86361 (0.0009) +[2023-10-13 03:52:10,877][46662] Updated weights for policy 0, policy_version 86450 (0.0007) +[2023-10-13 03:52:11,255][46662] Updated weights for policy 0, policy_version 86460 (0.0007) +[2023-10-13 03:52:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 176979968. Throughput: 0: 1667.8, 1: 1668.8. Samples: 44249594. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:13,607][45375] Avg episode reward: [(0, '57.730'), (1, '56.670')] +[2023-10-13 03:52:14,437][46663] Updated weights for policy 1, policy_version 86371 (0.0008) +[2023-10-13 03:52:14,803][46663] Updated weights for policy 1, policy_version 86381 (0.0007) +[2023-10-13 03:52:15,171][46663] Updated weights for policy 1, policy_version 86391 (0.0008) +[2023-10-13 03:52:15,218][46662] Updated weights for policy 0, policy_version 86470 (0.0008) +[2023-10-13 03:52:15,591][46662] Updated weights for policy 0, policy_version 86480 (0.0008) +[2023-10-13 03:52:15,954][46662] Updated weights for policy 0, policy_version 86490 (0.0008) +[2023-10-13 03:52:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177045504. Throughput: 0: 1672.0, 1: 1679.6. Samples: 44269900. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:18,607][45375] Avg episode reward: [(0, '57.910'), (1, '55.770')] +[2023-10-13 03:52:19,370][46663] Updated weights for policy 1, policy_version 86401 (0.0007) +[2023-10-13 03:52:19,744][46663] Updated weights for policy 1, policy_version 86411 (0.0008) +[2023-10-13 03:52:20,113][46663] Updated weights for policy 1, policy_version 86421 (0.0010) +[2023-10-13 03:52:20,218][46662] Updated weights for policy 0, policy_version 86500 (0.0009) +[2023-10-13 03:52:20,485][46663] Updated weights for policy 1, policy_version 86431 (0.0007) +[2023-10-13 03:52:20,589][46662] Updated weights for policy 0, policy_version 86510 (0.0009) +[2023-10-13 03:52:20,951][46662] Updated weights for policy 0, policy_version 86520 (0.0010) +[2023-10-13 03:52:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177111040. Throughput: 0: 1687.9, 1: 1673.7. Samples: 44290450. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:23,607][45375] Avg episode reward: [(0, '57.630'), (1, '54.270')] +[2023-10-13 03:52:24,550][46663] Updated weights for policy 1, policy_version 86441 (0.0009) +[2023-10-13 03:52:24,908][46663] Updated weights for policy 1, policy_version 86451 (0.0008) +[2023-10-13 03:52:24,913][46662] Updated weights for policy 0, policy_version 86530 (0.0009) +[2023-10-13 03:52:25,274][46663] Updated weights for policy 1, policy_version 86461 (0.0008) +[2023-10-13 03:52:25,287][46662] Updated weights for policy 0, policy_version 86540 (0.0008) +[2023-10-13 03:52:25,656][46662] Updated weights for policy 0, policy_version 86550 (0.0009) +[2023-10-13 03:52:26,017][46662] Updated weights for policy 0, policy_version 86560 (0.0010) +[2023-10-13 03:52:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177176576. Throughput: 0: 1664.6, 1: 1673.9. Samples: 44299790. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:28,607][45375] Avg episode reward: [(0, '57.420'), (1, '53.460')] +[2023-10-13 03:52:29,455][46663] Updated weights for policy 1, policy_version 86471 (0.0008) +[2023-10-13 03:52:29,828][46663] Updated weights for policy 1, policy_version 86481 (0.0009) +[2023-10-13 03:52:30,179][46662] Updated weights for policy 0, policy_version 86570 (0.0008) +[2023-10-13 03:52:30,195][46663] Updated weights for policy 1, policy_version 86491 (0.0009) +[2023-10-13 03:52:30,548][46662] Updated weights for policy 0, policy_version 86580 (0.0009) +[2023-10-13 03:52:30,935][46662] Updated weights for policy 0, policy_version 86590 (0.0009) +[2023-10-13 03:52:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177242112. Throughput: 0: 1681.6, 1: 1673.8. Samples: 44320016. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:33,608][45375] Avg episode reward: [(0, '56.810'), (1, '54.740')] +[2023-10-13 03:52:34,328][46663] Updated weights for policy 1, policy_version 86501 (0.0008) +[2023-10-13 03:52:34,698][46663] Updated weights for policy 1, policy_version 86511 (0.0008) +[2023-10-13 03:52:35,015][46662] Updated weights for policy 0, policy_version 86600 (0.0008) +[2023-10-13 03:52:35,056][46663] Updated weights for policy 1, policy_version 86521 (0.0008) +[2023-10-13 03:52:35,393][46662] Updated weights for policy 0, policy_version 86610 (0.0009) +[2023-10-13 03:52:35,751][46662] Updated weights for policy 0, policy_version 86620 (0.0008) +[2023-10-13 03:52:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177307648. Throughput: 0: 1689.5, 1: 1675.4. Samples: 44340716. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:38,607][45375] Avg episode reward: [(0, '55.520'), (1, '54.870')] +[2023-10-13 03:52:38,959][46663] Updated weights for policy 1, policy_version 86531 (0.0007) +[2023-10-13 03:52:39,333][46663] Updated weights for policy 1, policy_version 86541 (0.0007) +[2023-10-13 03:52:39,705][46663] Updated weights for policy 1, policy_version 86551 (0.0008) +[2023-10-13 03:52:39,726][46662] Updated weights for policy 0, policy_version 86630 (0.0009) +[2023-10-13 03:52:40,097][46662] Updated weights for policy 0, policy_version 86640 (0.0009) +[2023-10-13 03:52:40,464][46662] Updated weights for policy 0, policy_version 86650 (0.0008) +[2023-10-13 03:52:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 177373184. Throughput: 0: 1667.4, 1: 1671.5. Samples: 44349802. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:43,608][45375] Avg episode reward: [(0, '55.750'), (1, '55.150')] +[2023-10-13 03:52:43,677][46663] Updated weights for policy 1, policy_version 86561 (0.0009) +[2023-10-13 03:52:44,030][46663] Updated weights for policy 1, policy_version 86571 (0.0009) +[2023-10-13 03:52:44,400][46663] Updated weights for policy 1, policy_version 86581 (0.0008) +[2023-10-13 03:52:44,536][46662] Updated weights for policy 0, policy_version 86660 (0.0010) +[2023-10-13 03:52:44,756][46663] Updated weights for policy 1, policy_version 86591 (0.0009) +[2023-10-13 03:52:44,905][46662] Updated weights for policy 0, policy_version 86670 (0.0009) +[2023-10-13 03:52:45,276][46662] Updated weights for policy 0, policy_version 86680 (0.0009) +[2023-10-13 03:52:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177438720. Throughput: 0: 1686.4, 1: 1675.3. Samples: 44370608. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:48,607][45375] Avg episode reward: [(0, '55.740'), (1, '52.820')] +[2023-10-13 03:52:48,988][46663] Updated weights for policy 1, policy_version 86601 (0.0010) +[2023-10-13 03:52:49,347][46663] Updated weights for policy 1, policy_version 86611 (0.0008) +[2023-10-13 03:52:49,349][46662] Updated weights for policy 0, policy_version 86690 (0.0010) +[2023-10-13 03:52:49,712][46663] Updated weights for policy 1, policy_version 86621 (0.0008) +[2023-10-13 03:52:49,714][46662] Updated weights for policy 0, policy_version 86700 (0.0009) +[2023-10-13 03:52:50,078][46662] Updated weights for policy 0, policy_version 86710 (0.0010) +[2023-10-13 03:52:50,441][46662] Updated weights for policy 0, policy_version 86720 (0.0011) +[2023-10-13 03:52:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 177504256. Throughput: 0: 1679.9, 1: 1669.2. Samples: 44390872. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:53,608][45375] Avg episode reward: [(0, '56.130'), (1, '52.190')] +[2023-10-13 03:52:53,967][46663] Updated weights for policy 1, policy_version 86631 (0.0009) +[2023-10-13 03:52:54,334][46663] Updated weights for policy 1, policy_version 86641 (0.0008) +[2023-10-13 03:52:54,650][46662] Updated weights for policy 0, policy_version 86730 (0.0007) +[2023-10-13 03:52:54,695][46663] Updated weights for policy 1, policy_version 86651 (0.0008) +[2023-10-13 03:52:55,032][46662] Updated weights for policy 0, policy_version 86740 (0.0008) +[2023-10-13 03:52:55,397][46662] Updated weights for policy 0, policy_version 86750 (0.0011) +[2023-10-13 03:52:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177569792. Throughput: 0: 1667.3, 1: 1676.2. Samples: 44400050. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:52:58,608][45375] Avg episode reward: [(0, '58.100'), (1, '51.980')] +[2023-10-13 03:52:58,994][46663] Updated weights for policy 1, policy_version 86661 (0.0007) +[2023-10-13 03:52:59,368][46663] Updated weights for policy 1, policy_version 86671 (0.0009) +[2023-10-13 03:52:59,594][46662] Updated weights for policy 0, policy_version 86760 (0.0009) +[2023-10-13 03:52:59,745][46663] Updated weights for policy 1, policy_version 86681 (0.0007) +[2023-10-13 03:52:59,959][46662] Updated weights for policy 0, policy_version 86770 (0.0008) +[2023-10-13 03:53:00,322][46662] Updated weights for policy 0, policy_version 86780 (0.0010) +[2023-10-13 03:53:03,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177635328. Throughput: 0: 1670.8, 1: 1670.7. Samples: 44420264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:03,607][45375] Avg episode reward: [(0, '59.420'), (1, '52.180')] +[2023-10-13 03:53:03,793][46663] Updated weights for policy 1, policy_version 86691 (0.0008) +[2023-10-13 03:53:04,159][46663] Updated weights for policy 1, policy_version 86701 (0.0007) +[2023-10-13 03:53:04,372][46662] Updated weights for policy 0, policy_version 86790 (0.0008) +[2023-10-13 03:53:04,522][46663] Updated weights for policy 1, policy_version 86711 (0.0008) +[2023-10-13 03:53:04,750][46662] Updated weights for policy 0, policy_version 86800 (0.0007) +[2023-10-13 03:53:05,107][46662] Updated weights for policy 0, policy_version 86810 (0.0010) +[2023-10-13 03:53:08,591][46663] Updated weights for policy 1, policy_version 86721 (0.0009) +[2023-10-13 03:53:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177700864. Throughput: 0: 1670.0, 1: 1672.7. Samples: 44440868. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:08,607][45375] Avg episode reward: [(0, '58.570'), (1, '52.130')] +[2023-10-13 03:53:08,959][46663] Updated weights for policy 1, policy_version 86731 (0.0008) +[2023-10-13 03:53:09,282][46662] Updated weights for policy 0, policy_version 86820 (0.0009) +[2023-10-13 03:53:09,322][46663] Updated weights for policy 1, policy_version 86741 (0.0008) +[2023-10-13 03:53:09,648][46662] Updated weights for policy 0, policy_version 86830 (0.0007) +[2023-10-13 03:53:09,689][46663] Updated weights for policy 1, policy_version 86751 (0.0010) +[2023-10-13 03:53:10,020][46662] Updated weights for policy 0, policy_version 86840 (0.0007) +[2023-10-13 03:53:13,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 177766400. Throughput: 0: 1662.9, 1: 1671.7. Samples: 44449850. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:13,608][45375] Avg episode reward: [(0, '59.420'), (1, '52.620')] +[2023-10-13 03:53:13,788][46663] Updated weights for policy 1, policy_version 86761 (0.0009) +[2023-10-13 03:53:14,061][46662] Updated weights for policy 0, policy_version 86850 (0.0007) +[2023-10-13 03:53:14,141][46663] Updated weights for policy 1, policy_version 86771 (0.0007) +[2023-10-13 03:53:14,437][46662] Updated weights for policy 0, policy_version 86860 (0.0009) +[2023-10-13 03:53:14,498][46663] Updated weights for policy 1, policy_version 86781 (0.0009) +[2023-10-13 03:53:14,805][46662] Updated weights for policy 0, policy_version 86870 (0.0007) +[2023-10-13 03:53:15,179][46662] Updated weights for policy 0, policy_version 86880 (0.0009) +[2023-10-13 03:53:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177831936. Throughput: 0: 1667.3, 1: 1673.1. Samples: 44470334. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:18,607][45375] Avg episode reward: [(0, '58.570'), (1, '51.990')] +[2023-10-13 03:53:18,725][46663] Updated weights for policy 1, policy_version 86791 (0.0009) +[2023-10-13 03:53:19,098][46663] Updated weights for policy 1, policy_version 86801 (0.0008) +[2023-10-13 03:53:19,332][46662] Updated weights for policy 0, policy_version 86890 (0.0007) +[2023-10-13 03:53:19,464][46663] Updated weights for policy 1, policy_version 86811 (0.0007) +[2023-10-13 03:53:19,691][46662] Updated weights for policy 0, policy_version 86900 (0.0007) +[2023-10-13 03:53:20,061][46662] Updated weights for policy 0, policy_version 86910 (0.0008) +[2023-10-13 03:53:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177897472. Throughput: 0: 1666.4, 1: 1670.5. Samples: 44490874. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:23,607][45375] Avg episode reward: [(0, '59.410'), (1, '53.200')] +[2023-10-13 03:53:23,672][46663] Updated weights for policy 1, policy_version 86821 (0.0007) +[2023-10-13 03:53:23,987][46662] Updated weights for policy 0, policy_version 86920 (0.0009) +[2023-10-13 03:53:24,029][46663] Updated weights for policy 1, policy_version 86831 (0.0008) +[2023-10-13 03:53:24,365][46662] Updated weights for policy 0, policy_version 86930 (0.0008) +[2023-10-13 03:53:24,397][46663] Updated weights for policy 1, policy_version 86841 (0.0007) +[2023-10-13 03:53:24,736][46662] Updated weights for policy 0, policy_version 86940 (0.0008) +[2023-10-13 03:53:28,375][46663] Updated weights for policy 1, policy_version 86851 (0.0008) +[2023-10-13 03:53:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177963008. Throughput: 0: 1666.5, 1: 1672.7. Samples: 44500066. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:28,607][45375] Avg episode reward: [(0, '58.490'), (1, '54.250')] +[2023-10-13 03:53:28,751][46663] Updated weights for policy 1, policy_version 86861 (0.0009) +[2023-10-13 03:53:28,869][46662] Updated weights for policy 0, policy_version 86950 (0.0008) +[2023-10-13 03:53:29,117][46663] Updated weights for policy 1, policy_version 86871 (0.0007) +[2023-10-13 03:53:29,242][46662] Updated weights for policy 0, policy_version 86960 (0.0008) +[2023-10-13 03:53:29,600][46662] Updated weights for policy 0, policy_version 86970 (0.0009) +[2023-10-13 03:53:33,067][46663] Updated weights for policy 1, policy_version 86881 (0.0009) +[2023-10-13 03:53:33,439][46663] Updated weights for policy 1, policy_version 86891 (0.0009) +[2023-10-13 03:53:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178028544. Throughput: 0: 1660.2, 1: 1677.9. Samples: 44520822. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:33,607][45375] Avg episode reward: [(0, '59.900'), (1, '54.160')] +[2023-10-13 03:53:33,778][46662] Updated weights for policy 0, policy_version 86980 (0.0008) +[2023-10-13 03:53:33,805][46663] Updated weights for policy 1, policy_version 86901 (0.0008) +[2023-10-13 03:53:34,149][46662] Updated weights for policy 0, policy_version 86990 (0.0008) +[2023-10-13 03:53:34,169][46663] Updated weights for policy 1, policy_version 86911 (0.0007) +[2023-10-13 03:53:34,525][46662] Updated weights for policy 0, policy_version 87000 (0.0008) +[2023-10-13 03:53:38,217][46663] Updated weights for policy 1, policy_version 86921 (0.0009) +[2023-10-13 03:53:38,513][46662] Updated weights for policy 0, policy_version 87010 (0.0009) +[2023-10-13 03:53:38,575][46663] Updated weights for policy 1, policy_version 86931 (0.0008) +[2023-10-13 03:53:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178094080. 
Throughput: 0: 1669.7, 1: 1670.9. Samples: 44541198. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:38,607][45375] Avg episode reward: [(0, '61.190'), (1, '52.890')] +[2023-10-13 03:53:38,881][46662] Updated weights for policy 0, policy_version 87020 (0.0008) +[2023-10-13 03:53:38,937][46663] Updated weights for policy 1, policy_version 86941 (0.0008) +[2023-10-13 03:53:39,252][46662] Updated weights for policy 0, policy_version 87030 (0.0010) +[2023-10-13 03:53:39,608][46662] Updated weights for policy 0, policy_version 87040 (0.0010) +[2023-10-13 03:53:43,009][46663] Updated weights for policy 1, policy_version 86951 (0.0009) +[2023-10-13 03:53:43,379][46663] Updated weights for policy 1, policy_version 86961 (0.0007) +[2023-10-13 03:53:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178159616. Throughput: 0: 1665.1, 1: 1683.2. Samples: 44550722. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:43,607][45375] Avg episode reward: [(0, '61.450'), (1, '51.370')] +[2023-10-13 03:53:43,753][46663] Updated weights for policy 1, policy_version 86971 (0.0008) +[2023-10-13 03:53:43,907][46662] Updated weights for policy 0, policy_version 87050 (0.0007) +[2023-10-13 03:53:44,282][46662] Updated weights for policy 0, policy_version 87060 (0.0007) +[2023-10-13 03:53:44,647][46662] Updated weights for policy 0, policy_version 87070 (0.0008) +[2023-10-13 03:53:47,941][46663] Updated weights for policy 1, policy_version 86981 (0.0007) +[2023-10-13 03:53:48,327][46663] Updated weights for policy 1, policy_version 86991 (0.0009) +[2023-10-13 03:53:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178225152. Throughput: 0: 1667.6, 1: 1686.9. Samples: 44571216. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:48,607][45375] Avg episode reward: [(0, '62.090'), (1, '51.930')] +[2023-10-13 03:53:48,684][46663] Updated weights for policy 1, policy_version 87001 (0.0008) +[2023-10-13 03:53:48,751][46662] Updated weights for policy 0, policy_version 87080 (0.0008) +[2023-10-13 03:53:49,123][46662] Updated weights for policy 0, policy_version 87090 (0.0009) +[2023-10-13 03:53:49,495][46662] Updated weights for policy 0, policy_version 87100 (0.0009) +[2023-10-13 03:53:52,596][46663] Updated weights for policy 1, policy_version 87011 (0.0009) +[2023-10-13 03:53:52,973][46663] Updated weights for policy 1, policy_version 87021 (0.0011) +[2023-10-13 03:53:53,333][46663] Updated weights for policy 1, policy_version 87031 (0.0008) +[2023-10-13 03:53:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178290688. Throughput: 0: 1667.4, 1: 1665.1. Samples: 44590832. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-13 03:53:53,607][45375] Avg episode reward: [(0, '62.940'), (1, '54.260')] +[2023-10-13 03:53:53,661][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000087040_89128960.pth... +[2023-10-13 03:53:53,690][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000085472_87523328.pth +[2023-10-13 03:53:53,747][46662] Updated weights for policy 0, policy_version 87110 (0.0008) +[2023-10-13 03:53:54,136][46662] Updated weights for policy 0, policy_version 87120 (0.0008) +[2023-10-13 03:53:54,494][46662] Updated weights for policy 0, policy_version 87130 (0.0011) +[2023-10-13 03:53:54,717][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000087136_89227264.pth... 
+[2023-10-13 03:53:54,745][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000085536_87588864.pth +[2023-10-13 03:53:57,285][46663] Updated weights for policy 1, policy_version 87041 (0.0007) +[2023-10-13 03:53:57,656][46663] Updated weights for policy 1, policy_version 87051 (0.0011) +[2023-10-13 03:53:58,027][46663] Updated weights for policy 1, policy_version 87061 (0.0009) +[2023-10-13 03:53:58,389][46663] Updated weights for policy 1, policy_version 87071 (0.0008) +[2023-10-13 03:53:58,587][46662] Updated weights for policy 0, policy_version 87140 (0.0009) +[2023-10-13 03:53:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 178388992. Throughput: 0: 1665.8, 1: 1688.6. Samples: 44600800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:53:58,607][45375] Avg episode reward: [(0, '63.380'), (1, '57.120')] +[2023-10-13 03:53:58,955][46662] Updated weights for policy 0, policy_version 87150 (0.0007) +[2023-10-13 03:53:59,336][46662] Updated weights for policy 0, policy_version 87160 (0.0008) +[2023-10-13 03:54:02,612][46663] Updated weights for policy 1, policy_version 87081 (0.0010) +[2023-10-13 03:54:02,973][46663] Updated weights for policy 1, policy_version 87091 (0.0010) +[2023-10-13 03:54:03,341][46663] Updated weights for policy 1, policy_version 87101 (0.0009) +[2023-10-13 03:54:03,343][46662] Updated weights for policy 0, policy_version 87170 (0.0010) +[2023-10-13 03:54:03,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178454528. Throughput: 0: 1668.8, 1: 1682.9. Samples: 44621162. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:03,607][45375] Avg episode reward: [(0, '63.860'), (1, '58.410')] +[2023-10-13 03:54:03,714][46662] Updated weights for policy 0, policy_version 87180 (0.0007) +[2023-10-13 03:54:04,088][46662] Updated weights for policy 0, policy_version 87190 (0.0010) +[2023-10-13 03:54:04,456][46662] Updated weights for policy 0, policy_version 87200 (0.0011) +[2023-10-13 03:54:07,420][46663] Updated weights for policy 1, policy_version 87111 (0.0008) +[2023-10-13 03:54:07,777][46663] Updated weights for policy 1, policy_version 87121 (0.0009) +[2023-10-13 03:54:08,143][46663] Updated weights for policy 1, policy_version 87131 (0.0008) +[2023-10-13 03:54:08,513][46662] Updated weights for policy 0, policy_version 87210 (0.0008) +[2023-10-13 03:54:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178520064. Throughput: 0: 1668.7, 1: 1661.8. Samples: 44640748. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:08,607][45375] Avg episode reward: [(0, '63.540'), (1, '56.610')] +[2023-10-13 03:54:08,880][46662] Updated weights for policy 0, policy_version 87220 (0.0009) +[2023-10-13 03:54:09,251][46662] Updated weights for policy 0, policy_version 87230 (0.0009) +[2023-10-13 03:54:12,173][46663] Updated weights for policy 1, policy_version 87141 (0.0008) +[2023-10-13 03:54:12,542][46663] Updated weights for policy 1, policy_version 87151 (0.0011) +[2023-10-13 03:54:12,914][46663] Updated weights for policy 1, policy_version 87161 (0.0009) +[2023-10-13 03:54:13,437][46662] Updated weights for policy 0, policy_version 87240 (0.0008) +[2023-10-13 03:54:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 178585600. Throughput: 0: 1664.5, 1: 1689.2. Samples: 44650982. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:13,607][45375] Avg episode reward: [(0, '63.330'), (1, '56.850')] +[2023-10-13 03:54:13,808][46662] Updated weights for policy 0, policy_version 87250 (0.0008) +[2023-10-13 03:54:14,180][46662] Updated weights for policy 0, policy_version 87260 (0.0007) +[2023-10-13 03:54:16,801][46663] Updated weights for policy 1, policy_version 87171 (0.0008) +[2023-10-13 03:54:17,174][46663] Updated weights for policy 1, policy_version 87181 (0.0009) +[2023-10-13 03:54:17,535][46663] Updated weights for policy 1, policy_version 87191 (0.0009) +[2023-10-13 03:54:18,274][46662] Updated weights for policy 0, policy_version 87270 (0.0008) +[2023-10-13 03:54:18,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178651136. Throughput: 0: 1668.0, 1: 1670.6. Samples: 44671058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:18,607][45375] Avg episode reward: [(0, '63.850'), (1, '56.520')] +[2023-10-13 03:54:18,655][46662] Updated weights for policy 0, policy_version 87280 (0.0010) +[2023-10-13 03:54:19,030][46662] Updated weights for policy 0, policy_version 87290 (0.0009) +[2023-10-13 03:54:21,698][46663] Updated weights for policy 1, policy_version 87201 (0.0008) +[2023-10-13 03:54:22,064][46663] Updated weights for policy 1, policy_version 87211 (0.0008) +[2023-10-13 03:54:22,438][46663] Updated weights for policy 1, policy_version 87221 (0.0008) +[2023-10-13 03:54:22,795][46663] Updated weights for policy 1, policy_version 87231 (0.0007) +[2023-10-13 03:54:23,215][46662] Updated weights for policy 0, policy_version 87300 (0.0010) +[2023-10-13 03:54:23,585][46662] Updated weights for policy 0, policy_version 87310 (0.0008) +[2023-10-13 03:54:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178716672. Throughput: 0: 1661.2, 1: 1668.6. Samples: 44691038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:23,607][45375] Avg episode reward: [(0, '64.230'), (1, '57.080')] +[2023-10-13 03:54:23,951][46662] Updated weights for policy 0, policy_version 87320 (0.0007) +[2023-10-13 03:54:27,097][46663] Updated weights for policy 1, policy_version 87241 (0.0010) +[2023-10-13 03:54:27,458][46663] Updated weights for policy 1, policy_version 87251 (0.0009) +[2023-10-13 03:54:27,826][46663] Updated weights for policy 1, policy_version 87261 (0.0009) +[2023-10-13 03:54:27,989][46662] Updated weights for policy 0, policy_version 87330 (0.0008) +[2023-10-13 03:54:28,357][46662] Updated weights for policy 0, policy_version 87340 (0.0008) +[2023-10-13 03:54:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178782208. Throughput: 0: 1667.3, 1: 1680.8. Samples: 44701390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:28,607][45375] Avg episode reward: [(0, '62.380'), (1, '55.970')] +[2023-10-13 03:54:28,741][46662] Updated weights for policy 0, policy_version 87350 (0.0009) +[2023-10-13 03:54:29,107][46662] Updated weights for policy 0, policy_version 87360 (0.0007) +[2023-10-13 03:54:31,947][46663] Updated weights for policy 1, policy_version 87271 (0.0009) +[2023-10-13 03:54:32,309][46663] Updated weights for policy 1, policy_version 87281 (0.0010) +[2023-10-13 03:54:32,676][46663] Updated weights for policy 1, policy_version 87291 (0.0010) +[2023-10-13 03:54:33,164][46662] Updated weights for policy 0, policy_version 87370 (0.0008) +[2023-10-13 03:54:33,527][46662] Updated weights for policy 0, policy_version 87380 (0.0009) +[2023-10-13 03:54:33,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178847744. Throughput: 0: 1672.8, 1: 1663.5. Samples: 44721348. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:33,607][45375] Avg episode reward: [(0, '62.710'), (1, '55.100')] +[2023-10-13 03:54:33,894][46662] Updated weights for policy 0, policy_version 87390 (0.0007) +[2023-10-13 03:54:36,590][46663] Updated weights for policy 1, policy_version 87301 (0.0007) +[2023-10-13 03:54:36,987][46663] Updated weights for policy 1, policy_version 87311 (0.0009) +[2023-10-13 03:54:37,352][46663] Updated weights for policy 1, policy_version 87321 (0.0008) +[2023-10-13 03:54:37,887][46662] Updated weights for policy 0, policy_version 87400 (0.0008) +[2023-10-13 03:54:38,254][46662] Updated weights for policy 0, policy_version 87410 (0.0007) +[2023-10-13 03:54:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178913280. Throughput: 0: 1672.8, 1: 1671.0. Samples: 44741304. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:38,607][45375] Avg episode reward: [(0, '63.040'), (1, '55.320')] +[2023-10-13 03:54:38,634][46662] Updated weights for policy 0, policy_version 87420 (0.0008) +[2023-10-13 03:54:41,292][46663] Updated weights for policy 1, policy_version 87331 (0.0010) +[2023-10-13 03:54:41,661][46663] Updated weights for policy 1, policy_version 87341 (0.0007) +[2023-10-13 03:54:42,020][46663] Updated weights for policy 1, policy_version 87351 (0.0008) +[2023-10-13 03:54:42,685][46662] Updated weights for policy 0, policy_version 87430 (0.0009) +[2023-10-13 03:54:43,054][46662] Updated weights for policy 0, policy_version 87440 (0.0007) +[2023-10-13 03:54:43,423][46662] Updated weights for policy 0, policy_version 87450 (0.0009) +[2023-10-13 03:54:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178978816. Throughput: 0: 1681.1, 1: 1674.2. Samples: 44751788. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:43,608][45375] Avg episode reward: [(0, '63.110'), (1, '57.360')] +[2023-10-13 03:54:46,135][46663] Updated weights for policy 1, policy_version 87361 (0.0008) +[2023-10-13 03:54:46,509][46663] Updated weights for policy 1, policy_version 87371 (0.0009) +[2023-10-13 03:54:46,887][46663] Updated weights for policy 1, policy_version 87381 (0.0008) +[2023-10-13 03:54:47,253][46663] Updated weights for policy 1, policy_version 87391 (0.0009) +[2023-10-13 03:54:47,315][46662] Updated weights for policy 0, policy_version 87460 (0.0009) +[2023-10-13 03:54:47,688][46662] Updated weights for policy 0, policy_version 87470 (0.0009) +[2023-10-13 03:54:48,057][46662] Updated weights for policy 0, policy_version 87480 (0.0009) +[2023-10-13 03:54:48,606][45375] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179077120. Throughput: 0: 1683.9, 1: 1655.6. Samples: 44771442. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 03:54:48,607][45375] Avg episode reward: [(0, '63.410'), (1, '58.170')] +[2023-10-13 03:54:51,430][46663] Updated weights for policy 1, policy_version 87401 (0.0007) +[2023-10-13 03:54:51,805][46663] Updated weights for policy 1, policy_version 87411 (0.0007) +[2023-10-13 03:54:52,117][46662] Updated weights for policy 0, policy_version 87490 (0.0009) +[2023-10-13 03:54:52,171][46663] Updated weights for policy 1, policy_version 87421 (0.0009) +[2023-10-13 03:54:52,486][46662] Updated weights for policy 0, policy_version 87500 (0.0009) +[2023-10-13 03:54:52,852][46662] Updated weights for policy 0, policy_version 87510 (0.0008) +[2023-10-13 03:54:53,226][46662] Updated weights for policy 0, policy_version 87520 (0.0007) +[2023-10-13 03:54:53,607][45375] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 179142656. Throughput: 0: 1669.0, 1: 1675.8. Samples: 44791264. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:54:53,607][45375] Avg episode reward: [(0, '62.990'), (1, '58.170')] +[2023-10-13 03:54:56,232][46663] Updated weights for policy 1, policy_version 87431 (0.0009) +[2023-10-13 03:54:56,595][46663] Updated weights for policy 1, policy_version 87441 (0.0007) +[2023-10-13 03:54:56,956][46663] Updated weights for policy 1, policy_version 87451 (0.0009) +[2023-10-13 03:54:57,265][46662] Updated weights for policy 0, policy_version 87530 (0.0010) +[2023-10-13 03:54:57,638][46662] Updated weights for policy 0, policy_version 87540 (0.0010) +[2023-10-13 03:54:58,009][46662] Updated weights for policy 0, policy_version 87550 (0.0009) +[2023-10-13 03:54:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179208192. Throughput: 0: 1686.5, 1: 1665.7. Samples: 44801832. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:54:58,607][45375] Avg episode reward: [(0, '61.510'), (1, '56.390')] +[2023-10-13 03:55:01,037][46663] Updated weights for policy 1, policy_version 87461 (0.0008) +[2023-10-13 03:55:01,416][46663] Updated weights for policy 1, policy_version 87471 (0.0010) +[2023-10-13 03:55:01,775][46663] Updated weights for policy 1, policy_version 87481 (0.0008) +[2023-10-13 03:55:01,998][46662] Updated weights for policy 0, policy_version 87560 (0.0008) +[2023-10-13 03:55:02,379][46662] Updated weights for policy 0, policy_version 87570 (0.0008) +[2023-10-13 03:55:02,747][46662] Updated weights for policy 0, policy_version 87580 (0.0008) +[2023-10-13 03:55:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179273728. Throughput: 0: 1692.6, 1: 1658.7. Samples: 44821868. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:03,608][45375] Avg episode reward: [(0, '61.460'), (1, '57.350')] +[2023-10-13 03:55:05,991][46663] Updated weights for policy 1, policy_version 87491 (0.0007) +[2023-10-13 03:55:06,357][46663] Updated weights for policy 1, policy_version 87501 (0.0008) +[2023-10-13 03:55:06,712][46663] Updated weights for policy 1, policy_version 87511 (0.0008) +[2023-10-13 03:55:06,759][46662] Updated weights for policy 0, policy_version 87590 (0.0009) +[2023-10-13 03:55:07,125][46662] Updated weights for policy 0, policy_version 87600 (0.0008) +[2023-10-13 03:55:07,500][46662] Updated weights for policy 0, policy_version 87610 (0.0009) +[2023-10-13 03:55:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179339264. Throughput: 0: 1668.1, 1: 1671.8. Samples: 44841334. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:08,607][45375] Avg episode reward: [(0, '60.840'), (1, '56.660')] +[2023-10-13 03:55:10,791][46663] Updated weights for policy 1, policy_version 87521 (0.0007) +[2023-10-13 03:55:11,166][46663] Updated weights for policy 1, policy_version 87531 (0.0009) +[2023-10-13 03:55:11,533][46663] Updated weights for policy 1, policy_version 87541 (0.0009) +[2023-10-13 03:55:11,694][46662] Updated weights for policy 0, policy_version 87620 (0.0008) +[2023-10-13 03:55:11,897][46663] Updated weights for policy 1, policy_version 87551 (0.0007) +[2023-10-13 03:55:12,067][46662] Updated weights for policy 0, policy_version 87630 (0.0009) +[2023-10-13 03:55:12,420][46662] Updated weights for policy 0, policy_version 87640 (0.0011) +[2023-10-13 03:55:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179404800. Throughput: 0: 1695.8, 1: 1656.0. Samples: 44852220. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:13,607][45375] Avg episode reward: [(0, '60.600'), (1, '58.550')] +[2023-10-13 03:55:16,085][46663] Updated weights for policy 1, policy_version 87561 (0.0007) +[2023-10-13 03:55:16,447][46662] Updated weights for policy 0, policy_version 87650 (0.0008) +[2023-10-13 03:55:16,453][46663] Updated weights for policy 1, policy_version 87571 (0.0007) +[2023-10-13 03:55:16,806][46663] Updated weights for policy 1, policy_version 87581 (0.0007) +[2023-10-13 03:55:16,817][46662] Updated weights for policy 0, policy_version 87660 (0.0008) +[2023-10-13 03:55:17,187][46662] Updated weights for policy 0, policy_version 87670 (0.0009) +[2023-10-13 03:55:17,553][46662] Updated weights for policy 0, policy_version 87680 (0.0007) +[2023-10-13 03:55:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179470336. Throughput: 0: 1690.8, 1: 1657.2. Samples: 44872008. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:18,607][45375] Avg episode reward: [(0, '60.370'), (1, '58.940')] +[2023-10-13 03:55:20,941][46663] Updated weights for policy 1, policy_version 87591 (0.0008) +[2023-10-13 03:55:21,305][46663] Updated weights for policy 1, policy_version 87601 (0.0008) +[2023-10-13 03:55:21,503][46662] Updated weights for policy 0, policy_version 87690 (0.0008) +[2023-10-13 03:55:21,676][46663] Updated weights for policy 1, policy_version 87611 (0.0007) +[2023-10-13 03:55:21,880][46662] Updated weights for policy 0, policy_version 87700 (0.0009) +[2023-10-13 03:55:22,243][46662] Updated weights for policy 0, policy_version 87710 (0.0009) +[2023-10-13 03:55:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179535872. Throughput: 0: 1674.1, 1: 1668.9. Samples: 44891740. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:23,607][45375] Avg episode reward: [(0, '60.290'), (1, '58.790')] +[2023-10-13 03:55:25,837][46663] Updated weights for policy 1, policy_version 87621 (0.0007) +[2023-10-13 03:55:26,234][46663] Updated weights for policy 1, policy_version 87631 (0.0008) +[2023-10-13 03:55:26,416][46662] Updated weights for policy 0, policy_version 87720 (0.0007) +[2023-10-13 03:55:26,608][46663] Updated weights for policy 1, policy_version 87641 (0.0008) +[2023-10-13 03:55:26,782][46662] Updated weights for policy 0, policy_version 87730 (0.0007) +[2023-10-13 03:55:27,160][46662] Updated weights for policy 0, policy_version 87740 (0.0008) +[2023-10-13 03:55:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179601408. Throughput: 0: 1695.4, 1: 1653.2. Samples: 44902472. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:28,607][45375] Avg episode reward: [(0, '59.640'), (1, '59.240')] +[2023-10-13 03:55:30,698][46663] Updated weights for policy 1, policy_version 87651 (0.0008) +[2023-10-13 03:55:31,063][46663] Updated weights for policy 1, policy_version 87661 (0.0009) +[2023-10-13 03:55:31,344][46662] Updated weights for policy 0, policy_version 87750 (0.0008) +[2023-10-13 03:55:31,420][46663] Updated weights for policy 1, policy_version 87671 (0.0009) +[2023-10-13 03:55:31,730][46662] Updated weights for policy 0, policy_version 87760 (0.0007) +[2023-10-13 03:55:32,098][46662] Updated weights for policy 0, policy_version 87770 (0.0007) +[2023-10-13 03:55:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179666944. Throughput: 0: 1674.6, 1: 1669.7. Samples: 44921936. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:33,607][45375] Avg episode reward: [(0, '59.810'), (1, '59.070')] +[2023-10-13 03:55:35,543][46663] Updated weights for policy 1, policy_version 87681 (0.0008) +[2023-10-13 03:55:35,911][46663] Updated weights for policy 1, policy_version 87691 (0.0009) +[2023-10-13 03:55:35,947][46662] Updated weights for policy 0, policy_version 87780 (0.0008) +[2023-10-13 03:55:36,277][46663] Updated weights for policy 1, policy_version 87701 (0.0007) +[2023-10-13 03:55:36,325][46662] Updated weights for policy 0, policy_version 87790 (0.0007) +[2023-10-13 03:55:36,647][46663] Updated weights for policy 1, policy_version 87711 (0.0007) +[2023-10-13 03:55:36,698][46662] Updated weights for policy 0, policy_version 87800 (0.0007) +[2023-10-13 03:55:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179732480. Throughput: 0: 1682.0, 1: 1671.9. Samples: 44942186. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:38,607][45375] Avg episode reward: [(0, '59.250'), (1, '58.460')] +[2023-10-13 03:55:40,731][46663] Updated weights for policy 1, policy_version 87721 (0.0009) +[2023-10-13 03:55:40,874][46662] Updated weights for policy 0, policy_version 87810 (0.0011) +[2023-10-13 03:55:41,088][46663] Updated weights for policy 1, policy_version 87731 (0.0008) +[2023-10-13 03:55:41,241][46662] Updated weights for policy 0, policy_version 87820 (0.0007) +[2023-10-13 03:55:41,458][46663] Updated weights for policy 1, policy_version 87741 (0.0007) +[2023-10-13 03:55:41,605][46662] Updated weights for policy 0, policy_version 87830 (0.0007) +[2023-10-13 03:55:41,975][46662] Updated weights for policy 0, policy_version 87840 (0.0008) +[2023-10-13 03:55:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 179798016. Throughput: 0: 1695.1, 1: 1660.6. Samples: 44952840. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:43,607][45375] Avg episode reward: [(0, '60.680'), (1, '59.880')] +[2023-10-13 03:55:45,415][46663] Updated weights for policy 1, policy_version 87751 (0.0009) +[2023-10-13 03:55:45,781][46663] Updated weights for policy 1, policy_version 87761 (0.0008) +[2023-10-13 03:55:45,945][46662] Updated weights for policy 0, policy_version 87850 (0.0009) +[2023-10-13 03:55:46,145][46663] Updated weights for policy 1, policy_version 87771 (0.0008) +[2023-10-13 03:55:46,313][46662] Updated weights for policy 0, policy_version 87860 (0.0008) +[2023-10-13 03:55:46,680][46662] Updated weights for policy 0, policy_version 87870 (0.0009) +[2023-10-13 03:55:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 179863552. Throughput: 0: 1663.3, 1: 1675.2. Samples: 44972102. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-13 03:55:48,607][45375] Avg episode reward: [(0, '59.980'), (1, '58.910')] +[2023-10-13 03:55:50,314][46663] Updated weights for policy 1, policy_version 87781 (0.0009) +[2023-10-13 03:55:50,683][46663] Updated weights for policy 1, policy_version 87791 (0.0009) +[2023-10-13 03:55:50,869][46662] Updated weights for policy 0, policy_version 87880 (0.0009) +[2023-10-13 03:55:51,056][46663] Updated weights for policy 1, policy_version 87801 (0.0009) +[2023-10-13 03:55:51,229][46662] Updated weights for policy 0, policy_version 87890 (0.0008) +[2023-10-13 03:55:51,610][46662] Updated weights for policy 0, policy_version 87900 (0.0008) +[2023-10-13 03:55:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 179929088. Throughput: 0: 1683.4, 1: 1673.2. Samples: 44992380. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:55:53,607][45375] Avg episode reward: [(0, '57.420'), (1, '57.810')] +[2023-10-13 03:55:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000087808_89915392.pth... +[2023-10-13 03:55:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000087904_90013696.pth... +[2023-10-13 03:55:53,657][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000086336_88408064.pth +[2023-10-13 03:55:53,658][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000086240_88309760.pth +[2023-10-13 03:55:55,260][46663] Updated weights for policy 1, policy_version 87811 (0.0009) +[2023-10-13 03:55:55,626][46663] Updated weights for policy 1, policy_version 87821 (0.0008) +[2023-10-13 03:55:55,709][46662] Updated weights for policy 0, policy_version 87910 (0.0007) +[2023-10-13 03:55:55,995][46663] Updated weights for policy 1, policy_version 87831 (0.0009) +[2023-10-13 03:55:56,078][46662] Updated weights for policy 0, policy_version 87920 (0.0008) +[2023-10-13 03:55:56,444][46662] Updated weights for policy 0, policy_version 87930 (0.0009) +[2023-10-13 03:55:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 179994624. Throughput: 0: 1677.3, 1: 1660.7. Samples: 45002430. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:55:58,607][45375] Avg episode reward: [(0, '56.400'), (1, '56.320')] +[2023-10-13 03:56:00,209][46663] Updated weights for policy 1, policy_version 87841 (0.0008) +[2023-10-13 03:56:00,575][46663] Updated weights for policy 1, policy_version 87851 (0.0009) +[2023-10-13 03:56:00,620][46662] Updated weights for policy 0, policy_version 87940 (0.0007) +[2023-10-13 03:56:00,938][46663] Updated weights for policy 1, policy_version 87861 (0.0009) +[2023-10-13 03:56:00,986][46662] Updated weights for policy 0, policy_version 87950 (0.0007) +[2023-10-13 03:56:01,315][46663] Updated weights for policy 1, policy_version 87871 (0.0007) +[2023-10-13 03:56:01,360][46662] Updated weights for policy 0, policy_version 87960 (0.0009) +[2023-10-13 03:56:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180060160. Throughput: 0: 1657.2, 1: 1676.0. Samples: 45022006. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:03,607][45375] Avg episode reward: [(0, '56.040'), (1, '57.190')] +[2023-10-13 03:56:05,255][46662] Updated weights for policy 0, policy_version 87970 (0.0010) +[2023-10-13 03:56:05,406][46663] Updated weights for policy 1, policy_version 87881 (0.0010) +[2023-10-13 03:56:05,621][46662] Updated weights for policy 0, policy_version 87980 (0.0008) +[2023-10-13 03:56:05,774][46663] Updated weights for policy 1, policy_version 87891 (0.0010) +[2023-10-13 03:56:05,988][46662] Updated weights for policy 0, policy_version 87990 (0.0007) +[2023-10-13 03:56:06,135][46663] Updated weights for policy 1, policy_version 87901 (0.0009) +[2023-10-13 03:56:06,361][46662] Updated weights for policy 0, policy_version 88000 (0.0007) +[2023-10-13 03:56:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180125696. Throughput: 0: 1677.5, 1: 1677.3. Samples: 45042708. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:08,607][45375] Avg episode reward: [(0, '56.500'), (1, '57.970')] +[2023-10-13 03:56:10,165][46663] Updated weights for policy 1, policy_version 87911 (0.0009) +[2023-10-13 03:56:10,480][46662] Updated weights for policy 0, policy_version 88010 (0.0007) +[2023-10-13 03:56:10,531][46663] Updated weights for policy 1, policy_version 87921 (0.0009) +[2023-10-13 03:56:10,852][46662] Updated weights for policy 0, policy_version 88020 (0.0008) +[2023-10-13 03:56:10,894][46663] Updated weights for policy 1, policy_version 87931 (0.0009) +[2023-10-13 03:56:11,221][46662] Updated weights for policy 0, policy_version 88030 (0.0007) +[2023-10-13 03:56:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180191232. Throughput: 0: 1662.6, 1: 1668.1. Samples: 45052356. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:13,607][45375] Avg episode reward: [(0, '55.400'), (1, '56.990')] +[2023-10-13 03:56:15,065][46663] Updated weights for policy 1, policy_version 87941 (0.0008) +[2023-10-13 03:56:15,266][46662] Updated weights for policy 0, policy_version 88040 (0.0008) +[2023-10-13 03:56:15,457][46663] Updated weights for policy 1, policy_version 87951 (0.0009) +[2023-10-13 03:56:15,635][46662] Updated weights for policy 0, policy_version 88050 (0.0008) +[2023-10-13 03:56:15,824][46663] Updated weights for policy 1, policy_version 87961 (0.0009) +[2023-10-13 03:56:16,005][46662] Updated weights for policy 0, policy_version 88060 (0.0008) +[2023-10-13 03:56:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180256768. Throughput: 0: 1667.3, 1: 1673.6. Samples: 45072278. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:18,607][45375] Avg episode reward: [(0, '55.820'), (1, '56.860')] +[2023-10-13 03:56:19,777][46663] Updated weights for policy 1, policy_version 87971 (0.0008) +[2023-10-13 03:56:20,125][46662] Updated weights for policy 0, policy_version 88070 (0.0007) +[2023-10-13 03:56:20,150][46663] Updated weights for policy 1, policy_version 87981 (0.0009) +[2023-10-13 03:56:20,506][46663] Updated weights for policy 1, policy_version 87991 (0.0009) +[2023-10-13 03:56:20,510][46662] Updated weights for policy 0, policy_version 88080 (0.0009) +[2023-10-13 03:56:20,879][46662] Updated weights for policy 0, policy_version 88090 (0.0008) +[2023-10-13 03:56:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 180322304. Throughput: 0: 1674.4, 1: 1673.9. Samples: 45092862. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:23,607][45375] Avg episode reward: [(0, '56.390'), (1, '56.240')] +[2023-10-13 03:56:24,371][46663] Updated weights for policy 1, policy_version 88001 (0.0008) +[2023-10-13 03:56:24,736][46663] Updated weights for policy 1, policy_version 88011 (0.0008) +[2023-10-13 03:56:24,817][46662] Updated weights for policy 0, policy_version 88100 (0.0008) +[2023-10-13 03:56:25,101][46663] Updated weights for policy 1, policy_version 88021 (0.0009) +[2023-10-13 03:56:25,187][46662] Updated weights for policy 0, policy_version 88110 (0.0007) +[2023-10-13 03:56:25,462][46663] Updated weights for policy 1, policy_version 88031 (0.0007) +[2023-10-13 03:56:25,546][46662] Updated weights for policy 0, policy_version 88120 (0.0007) +[2023-10-13 03:56:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180387840. Throughput: 0: 1650.7, 1: 1666.9. Samples: 45102134. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:28,607][45375] Avg episode reward: [(0, '55.600'), (1, '56.910')] +[2023-10-13 03:56:29,681][46662] Updated weights for policy 0, policy_version 88130 (0.0009) +[2023-10-13 03:56:29,734][46663] Updated weights for policy 1, policy_version 88041 (0.0008) +[2023-10-13 03:56:30,041][46662] Updated weights for policy 0, policy_version 88140 (0.0008) +[2023-10-13 03:56:30,110][46663] Updated weights for policy 1, policy_version 88051 (0.0008) +[2023-10-13 03:56:30,411][46662] Updated weights for policy 0, policy_version 88150 (0.0008) +[2023-10-13 03:56:30,472][46663] Updated weights for policy 1, policy_version 88061 (0.0007) +[2023-10-13 03:56:30,771][46662] Updated weights for policy 0, policy_version 88160 (0.0008) +[2023-10-13 03:56:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 180453376. Throughput: 0: 1680.5, 1: 1666.0. Samples: 45122692. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:33,608][45375] Avg episode reward: [(0, '56.380'), (1, '56.890')] +[2023-10-13 03:56:34,705][46663] Updated weights for policy 1, policy_version 88071 (0.0008) +[2023-10-13 03:56:34,889][46662] Updated weights for policy 0, policy_version 88170 (0.0007) +[2023-10-13 03:56:35,071][46663] Updated weights for policy 1, policy_version 88081 (0.0008) +[2023-10-13 03:56:35,249][46662] Updated weights for policy 0, policy_version 88180 (0.0008) +[2023-10-13 03:56:35,435][46663] Updated weights for policy 1, policy_version 88091 (0.0009) +[2023-10-13 03:56:35,628][46662] Updated weights for policy 0, policy_version 88190 (0.0008) +[2023-10-13 03:56:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 180518912. Throughput: 0: 1682.3, 1: 1663.9. Samples: 45142962. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:38,608][45375] Avg episode reward: [(0, '57.260'), (1, '56.930')] +[2023-10-13 03:56:39,591][46663] Updated weights for policy 1, policy_version 88101 (0.0008) +[2023-10-13 03:56:39,708][46662] Updated weights for policy 0, policy_version 88200 (0.0007) +[2023-10-13 03:56:39,962][46663] Updated weights for policy 1, policy_version 88111 (0.0007) +[2023-10-13 03:56:40,080][46662] Updated weights for policy 0, policy_version 88210 (0.0008) +[2023-10-13 03:56:40,328][46663] Updated weights for policy 1, policy_version 88121 (0.0008) +[2023-10-13 03:56:40,441][46662] Updated weights for policy 0, policy_version 88220 (0.0008) +[2023-10-13 03:56:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180584448. Throughput: 0: 1661.5, 1: 1663.6. Samples: 45152058. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:43,607][45375] Avg episode reward: [(0, '55.950'), (1, '57.770')] +[2023-10-13 03:56:44,432][46663] Updated weights for policy 1, policy_version 88131 (0.0009) +[2023-10-13 03:56:44,594][46662] Updated weights for policy 0, policy_version 88230 (0.0009) +[2023-10-13 03:56:44,793][46663] Updated weights for policy 1, policy_version 88141 (0.0010) +[2023-10-13 03:56:44,954][46662] Updated weights for policy 0, policy_version 88240 (0.0008) +[2023-10-13 03:56:45,158][46663] Updated weights for policy 1, policy_version 88151 (0.0010) +[2023-10-13 03:56:45,322][46662] Updated weights for policy 0, policy_version 88250 (0.0008) +[2023-10-13 03:56:48,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180649984. Throughput: 0: 1682.4, 1: 1664.0. Samples: 45172598. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-13 03:56:48,607][45375] Avg episode reward: [(0, '56.310'), (1, '57.100')] +[2023-10-13 03:56:49,479][46663] Updated weights for policy 1, policy_version 88161 (0.0007) +[2023-10-13 03:56:49,485][46662] Updated weights for policy 0, policy_version 88260 (0.0009) +[2023-10-13 03:56:49,841][46663] Updated weights for policy 1, policy_version 88171 (0.0010) +[2023-10-13 03:56:49,863][46662] Updated weights for policy 0, policy_version 88270 (0.0007) +[2023-10-13 03:56:50,205][46663] Updated weights for policy 1, policy_version 88181 (0.0009) +[2023-10-13 03:56:50,228][46662] Updated weights for policy 0, policy_version 88280 (0.0007) +[2023-10-13 03:56:50,574][46663] Updated weights for policy 1, policy_version 88191 (0.0008) +[2023-10-13 03:56:53,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180715520. Throughput: 0: 1681.5, 1: 1659.9. Samples: 45193070. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:56:53,607][45375] Avg episode reward: [(0, '55.390'), (1, '56.820')] +[2023-10-13 03:56:54,174][46662] Updated weights for policy 0, policy_version 88290 (0.0008) +[2023-10-13 03:56:54,542][46662] Updated weights for policy 0, policy_version 88300 (0.0008) +[2023-10-13 03:56:54,689][46663] Updated weights for policy 1, policy_version 88201 (0.0008) +[2023-10-13 03:56:54,922][46662] Updated weights for policy 0, policy_version 88310 (0.0008) +[2023-10-13 03:56:55,053][46663] Updated weights for policy 1, policy_version 88211 (0.0007) +[2023-10-13 03:56:55,278][46662] Updated weights for policy 0, policy_version 88320 (0.0009) +[2023-10-13 03:56:55,409][46663] Updated weights for policy 1, policy_version 88221 (0.0008) +[2023-10-13 03:56:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180781056. Throughput: 0: 1668.1, 1: 1657.9. Samples: 45202024. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:56:58,607][45375] Avg episode reward: [(0, '55.630'), (1, '57.370')] +[2023-10-13 03:56:59,423][46662] Updated weights for policy 0, policy_version 88330 (0.0009) +[2023-10-13 03:56:59,580][46663] Updated weights for policy 1, policy_version 88231 (0.0008) +[2023-10-13 03:56:59,794][46662] Updated weights for policy 0, policy_version 88340 (0.0009) +[2023-10-13 03:56:59,936][46663] Updated weights for policy 1, policy_version 88241 (0.0007) +[2023-10-13 03:57:00,163][46662] Updated weights for policy 0, policy_version 88350 (0.0008) +[2023-10-13 03:57:00,299][46663] Updated weights for policy 1, policy_version 88251 (0.0008) +[2023-10-13 03:57:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180846592. Throughput: 0: 1681.2, 1: 1658.9. Samples: 45222584. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:03,608][45375] Avg episode reward: [(0, '56.440'), (1, '56.010')] +[2023-10-13 03:57:04,212][46662] Updated weights for policy 0, policy_version 88360 (0.0008) +[2023-10-13 03:57:04,328][46663] Updated weights for policy 1, policy_version 88261 (0.0008) +[2023-10-13 03:57:04,579][46662] Updated weights for policy 0, policy_version 88370 (0.0007) +[2023-10-13 03:57:04,717][46663] Updated weights for policy 1, policy_version 88271 (0.0009) +[2023-10-13 03:57:04,949][46662] Updated weights for policy 0, policy_version 88380 (0.0007) +[2023-10-13 03:57:05,071][46663] Updated weights for policy 1, policy_version 88281 (0.0008) +[2023-10-13 03:57:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180912128. Throughput: 0: 1684.7, 1: 1658.7. Samples: 45243316. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:08,607][45375] Avg episode reward: [(0, '56.190'), (1, '56.570')] +[2023-10-13 03:57:09,097][46663] Updated weights for policy 1, policy_version 88291 (0.0009) +[2023-10-13 03:57:09,159][46662] Updated weights for policy 0, policy_version 88390 (0.0009) +[2023-10-13 03:57:09,453][46663] Updated weights for policy 1, policy_version 88301 (0.0009) +[2023-10-13 03:57:09,522][46662] Updated weights for policy 0, policy_version 88400 (0.0008) +[2023-10-13 03:57:09,818][46663] Updated weights for policy 1, policy_version 88311 (0.0008) +[2023-10-13 03:57:09,891][46662] Updated weights for policy 0, policy_version 88410 (0.0010) +[2023-10-13 03:57:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180977664. Throughput: 0: 1677.6, 1: 1658.8. Samples: 45252272. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:13,608][45375] Avg episode reward: [(0, '56.540'), (1, '56.880')] +[2023-10-13 03:57:13,886][46663] Updated weights for policy 1, policy_version 88321 (0.0009) +[2023-10-13 03:57:14,002][46662] Updated weights for policy 0, policy_version 88420 (0.0009) +[2023-10-13 03:57:14,253][46663] Updated weights for policy 1, policy_version 88331 (0.0007) +[2023-10-13 03:57:14,369][46662] Updated weights for policy 0, policy_version 88430 (0.0010) +[2023-10-13 03:57:14,627][46663] Updated weights for policy 1, policy_version 88341 (0.0008) +[2023-10-13 03:57:14,735][46662] Updated weights for policy 0, policy_version 88440 (0.0010) +[2023-10-13 03:57:14,997][46663] Updated weights for policy 1, policy_version 88351 (0.0008) +[2023-10-13 03:57:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181043200. Throughput: 0: 1670.9, 1: 1666.9. Samples: 45272890. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:18,607][45375] Avg episode reward: [(0, '56.690'), (1, '54.870')] +[2023-10-13 03:57:18,958][46662] Updated weights for policy 0, policy_version 88450 (0.0010) +[2023-10-13 03:57:19,001][46663] Updated weights for policy 1, policy_version 88361 (0.0009) +[2023-10-13 03:57:19,329][46662] Updated weights for policy 0, policy_version 88460 (0.0008) +[2023-10-13 03:57:19,363][46663] Updated weights for policy 1, policy_version 88371 (0.0009) +[2023-10-13 03:57:19,698][46662] Updated weights for policy 0, policy_version 88470 (0.0008) +[2023-10-13 03:57:19,724][46663] Updated weights for policy 1, policy_version 88381 (0.0007) +[2023-10-13 03:57:20,063][46662] Updated weights for policy 0, policy_version 88480 (0.0008) +[2023-10-13 03:57:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181108736. Throughput: 0: 1666.5, 1: 1667.1. Samples: 45292974. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:23,607][45375] Avg episode reward: [(0, '56.850'), (1, '54.300')] +[2023-10-13 03:57:24,022][46663] Updated weights for policy 1, policy_version 88391 (0.0009) +[2023-10-13 03:57:24,247][46662] Updated weights for policy 0, policy_version 88490 (0.0007) +[2023-10-13 03:57:24,387][46663] Updated weights for policy 1, policy_version 88401 (0.0009) +[2023-10-13 03:57:24,611][46662] Updated weights for policy 0, policy_version 88500 (0.0007) +[2023-10-13 03:57:24,750][46663] Updated weights for policy 1, policy_version 88411 (0.0009) +[2023-10-13 03:57:24,976][46662] Updated weights for policy 0, policy_version 88510 (0.0008) +[2023-10-13 03:57:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181174272. Throughput: 0: 1663.0, 1: 1668.7. Samples: 45301986. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:28,607][45375] Avg episode reward: [(0, '58.840'), (1, '53.650')] +[2023-10-13 03:57:28,989][46663] Updated weights for policy 1, policy_version 88421 (0.0009) +[2023-10-13 03:57:29,212][46662] Updated weights for policy 0, policy_version 88520 (0.0008) +[2023-10-13 03:57:29,359][46663] Updated weights for policy 1, policy_version 88431 (0.0007) +[2023-10-13 03:57:29,588][46662] Updated weights for policy 0, policy_version 88530 (0.0008) +[2023-10-13 03:57:29,721][46663] Updated weights for policy 1, policy_version 88441 (0.0008) +[2023-10-13 03:57:29,959][46662] Updated weights for policy 0, policy_version 88540 (0.0009) +[2023-10-13 03:57:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181239808. Throughput: 0: 1659.5, 1: 1665.7. Samples: 45322236. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:33,608][45375] Avg episode reward: [(0, '58.030'), (1, '51.720')] +[2023-10-13 03:57:33,822][46663] Updated weights for policy 1, policy_version 88451 (0.0008) +[2023-10-13 03:57:34,180][46663] Updated weights for policy 1, policy_version 88461 (0.0009) +[2023-10-13 03:57:34,287][46662] Updated weights for policy 0, policy_version 88550 (0.0008) +[2023-10-13 03:57:34,550][46663] Updated weights for policy 1, policy_version 88471 (0.0009) +[2023-10-13 03:57:34,656][46662] Updated weights for policy 0, policy_version 88560 (0.0008) +[2023-10-13 03:57:35,017][46662] Updated weights for policy 0, policy_version 88570 (0.0009) +[2023-10-13 03:57:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 181305344. Throughput: 0: 1650.4, 1: 1671.7. Samples: 45342564. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:38,607][45375] Avg episode reward: [(0, '58.230'), (1, '52.500')] +[2023-10-13 03:57:38,682][46663] Updated weights for policy 1, policy_version 88481 (0.0008) +[2023-10-13 03:57:39,041][46663] Updated weights for policy 1, policy_version 88491 (0.0007) +[2023-10-13 03:57:39,239][46662] Updated weights for policy 0, policy_version 88580 (0.0008) +[2023-10-13 03:57:39,409][46663] Updated weights for policy 1, policy_version 88501 (0.0007) +[2023-10-13 03:57:39,617][46662] Updated weights for policy 0, policy_version 88590 (0.0008) +[2023-10-13 03:57:39,782][46663] Updated weights for policy 1, policy_version 88511 (0.0007) +[2023-10-13 03:57:39,988][46662] Updated weights for policy 0, policy_version 88600 (0.0007) +[2023-10-13 03:57:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181370880. Throughput: 0: 1652.5, 1: 1672.7. Samples: 45351656. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-13 03:57:43,607][45375] Avg episode reward: [(0, '57.640'), (1, '52.390')] +[2023-10-13 03:57:43,806][46663] Updated weights for policy 1, policy_version 88521 (0.0010) +[2023-10-13 03:57:44,024][46662] Updated weights for policy 0, policy_version 88610 (0.0008) +[2023-10-13 03:57:44,178][46663] Updated weights for policy 1, policy_version 88531 (0.0008) +[2023-10-13 03:57:44,389][46662] Updated weights for policy 0, policy_version 88620 (0.0007) +[2023-10-13 03:57:44,533][46663] Updated weights for policy 1, policy_version 88541 (0.0009) +[2023-10-13 03:57:44,749][46662] Updated weights for policy 0, policy_version 88630 (0.0007) +[2023-10-13 03:57:45,120][46662] Updated weights for policy 0, policy_version 88640 (0.0008) +[2023-10-13 03:57:48,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181436416. Throughput: 0: 1657.1, 1: 1675.4. Samples: 45372546. 
Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:57:48,607][45375] Avg episode reward: [(0, '57.340'), (1, '52.350')] +[2023-10-13 03:57:48,650][46663] Updated weights for policy 1, policy_version 88551 (0.0008) +[2023-10-13 03:57:49,015][46663] Updated weights for policy 1, policy_version 88561 (0.0007) +[2023-10-13 03:57:49,152][46662] Updated weights for policy 0, policy_version 88650 (0.0008) +[2023-10-13 03:57:49,376][46663] Updated weights for policy 1, policy_version 88571 (0.0007) +[2023-10-13 03:57:49,520][46662] Updated weights for policy 0, policy_version 88660 (0.0009) +[2023-10-13 03:57:49,898][46662] Updated weights for policy 0, policy_version 88670 (0.0008) +[2023-10-13 03:57:53,523][46663] Updated weights for policy 1, policy_version 88581 (0.0007) +[2023-10-13 03:57:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181501952. Throughput: 0: 1650.5, 1: 1671.5. Samples: 45392806. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:57:53,607][45375] Avg episode reward: [(0, '59.070'), (1, '54.650')] +[2023-10-13 03:57:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000088672_90800128.pth... +[2023-10-13 03:57:53,651][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000087136_89227264.pth +[2023-10-13 03:57:53,901][46663] Updated weights for policy 1, policy_version 88591 (0.0008) +[2023-10-13 03:57:54,110][46662] Updated weights for policy 0, policy_version 88680 (0.0007) +[2023-10-13 03:57:54,257][46663] Updated weights for policy 1, policy_version 88601 (0.0007) +[2023-10-13 03:57:54,488][46662] Updated weights for policy 0, policy_version 88690 (0.0007) +[2023-10-13 03:57:54,501][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000088608_90734592.pth... 
+[2023-10-13 03:57:54,530][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000087040_89128960.pth +[2023-10-13 03:57:54,866][46662] Updated weights for policy 0, policy_version 88700 (0.0010) +[2023-10-13 03:57:58,359][46663] Updated weights for policy 1, policy_version 88611 (0.0008) +[2023-10-13 03:57:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 181567488. Throughput: 0: 1649.5, 1: 1673.3. Samples: 45401798. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:57:58,607][45375] Avg episode reward: [(0, '58.100'), (1, '54.240')] +[2023-10-13 03:57:58,707][46662] Updated weights for policy 0, policy_version 88710 (0.0009) +[2023-10-13 03:57:58,715][46663] Updated weights for policy 1, policy_version 88621 (0.0008) +[2023-10-13 03:57:59,074][46662] Updated weights for policy 0, policy_version 88720 (0.0008) +[2023-10-13 03:57:59,077][46663] Updated weights for policy 1, policy_version 88631 (0.0007) +[2023-10-13 03:57:59,436][46662] Updated weights for policy 0, policy_version 88730 (0.0009) +[2023-10-13 03:58:03,135][46663] Updated weights for policy 1, policy_version 88641 (0.0009) +[2023-10-13 03:58:03,377][46662] Updated weights for policy 0, policy_version 88740 (0.0008) +[2023-10-13 03:58:03,502][46663] Updated weights for policy 1, policy_version 88651 (0.0008) +[2023-10-13 03:58:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181633024. Throughput: 0: 1658.9, 1: 1663.4. Samples: 45422394. 
Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:03,607][45375] Avg episode reward: [(0, '59.300'), (1, '53.450')] +[2023-10-13 03:58:03,755][46662] Updated weights for policy 0, policy_version 88750 (0.0009) +[2023-10-13 03:58:03,869][46663] Updated weights for policy 1, policy_version 88661 (0.0009) +[2023-10-13 03:58:04,124][46662] Updated weights for policy 0, policy_version 88760 (0.0009) +[2023-10-13 03:58:04,227][46663] Updated weights for policy 1, policy_version 88671 (0.0009) +[2023-10-13 03:58:08,255][46662] Updated weights for policy 0, policy_version 88770 (0.0009) +[2023-10-13 03:58:08,353][46663] Updated weights for policy 1, policy_version 88681 (0.0009) +[2023-10-13 03:58:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181698560. Throughput: 0: 1667.5, 1: 1661.5. Samples: 45442784. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:08,608][45375] Avg episode reward: [(0, '59.720'), (1, '53.890')] +[2023-10-13 03:58:08,629][46662] Updated weights for policy 0, policy_version 88780 (0.0009) +[2023-10-13 03:58:08,713][46663] Updated weights for policy 1, policy_version 88691 (0.0008) +[2023-10-13 03:58:09,001][46662] Updated weights for policy 0, policy_version 88790 (0.0007) +[2023-10-13 03:58:09,085][46663] Updated weights for policy 1, policy_version 88701 (0.0007) +[2023-10-13 03:58:09,365][46662] Updated weights for policy 0, policy_version 88800 (0.0007) +[2023-10-13 03:58:13,257][46663] Updated weights for policy 1, policy_version 88711 (0.0008) +[2023-10-13 03:58:13,352][46662] Updated weights for policy 0, policy_version 88810 (0.0008) +[2023-10-13 03:58:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181764096. Throughput: 0: 1669.6, 1: 1670.9. Samples: 45452308. 
Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:13,607][45375] Avg episode reward: [(0, '60.260'), (1, '53.520')] +[2023-10-13 03:58:13,622][46663] Updated weights for policy 1, policy_version 88721 (0.0008) +[2023-10-13 03:58:13,718][46662] Updated weights for policy 0, policy_version 88820 (0.0007) +[2023-10-13 03:58:13,982][46663] Updated weights for policy 1, policy_version 88731 (0.0010) +[2023-10-13 03:58:14,089][46662] Updated weights for policy 0, policy_version 88830 (0.0007) +[2023-10-13 03:58:18,009][46663] Updated weights for policy 1, policy_version 88741 (0.0008) +[2023-10-13 03:58:18,212][46662] Updated weights for policy 0, policy_version 88840 (0.0010) +[2023-10-13 03:58:18,371][46663] Updated weights for policy 1, policy_version 88751 (0.0008) +[2023-10-13 03:58:18,589][46662] Updated weights for policy 0, policy_version 88850 (0.0007) +[2023-10-13 03:58:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181829632. Throughput: 0: 1680.8, 1: 1674.0. Samples: 45473204. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:18,607][45375] Avg episode reward: [(0, '61.220'), (1, '52.140')] +[2023-10-13 03:58:18,741][46663] Updated weights for policy 1, policy_version 88761 (0.0009) +[2023-10-13 03:58:18,956][46662] Updated weights for policy 0, policy_version 88860 (0.0009) +[2023-10-13 03:58:22,805][46663] Updated weights for policy 1, policy_version 88771 (0.0007) +[2023-10-13 03:58:22,927][46662] Updated weights for policy 0, policy_version 88870 (0.0010) +[2023-10-13 03:58:23,175][46663] Updated weights for policy 1, policy_version 88781 (0.0008) +[2023-10-13 03:58:23,293][46662] Updated weights for policy 0, policy_version 88880 (0.0008) +[2023-10-13 03:58:23,553][46663] Updated weights for policy 1, policy_version 88791 (0.0009) +[2023-10-13 03:58:23,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 181895168. 
Throughput: 0: 1683.6, 1: 1659.6. Samples: 45493008. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:23,608][45375] Avg episode reward: [(0, '60.800'), (1, '52.730')] +[2023-10-13 03:58:23,669][46662] Updated weights for policy 0, policy_version 88890 (0.0007) +[2023-10-13 03:58:27,721][46663] Updated weights for policy 1, policy_version 88801 (0.0008) +[2023-10-13 03:58:27,813][46662] Updated weights for policy 0, policy_version 88900 (0.0009) +[2023-10-13 03:58:28,081][46663] Updated weights for policy 1, policy_version 88811 (0.0007) +[2023-10-13 03:58:28,191][46662] Updated weights for policy 0, policy_version 88910 (0.0008) +[2023-10-13 03:58:28,455][46663] Updated weights for policy 1, policy_version 88821 (0.0008) +[2023-10-13 03:58:28,561][46662] Updated weights for policy 0, policy_version 88920 (0.0007) +[2023-10-13 03:58:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181960704. Throughput: 0: 1687.9, 1: 1674.6. Samples: 45502966. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:28,607][45375] Avg episode reward: [(0, '60.820'), (1, '52.930')] +[2023-10-13 03:58:28,811][46663] Updated weights for policy 1, policy_version 88831 (0.0008) +[2023-10-13 03:58:32,575][46662] Updated weights for policy 0, policy_version 88930 (0.0008) +[2023-10-13 03:58:32,859][46663] Updated weights for policy 1, policy_version 88841 (0.0008) +[2023-10-13 03:58:32,942][46662] Updated weights for policy 0, policy_version 88940 (0.0009) +[2023-10-13 03:58:33,219][46663] Updated weights for policy 1, policy_version 88851 (0.0009) +[2023-10-13 03:58:33,315][46662] Updated weights for policy 0, policy_version 88950 (0.0009) +[2023-10-13 03:58:33,583][46663] Updated weights for policy 1, policy_version 88861 (0.0007) +[2023-10-13 03:58:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 182026240. Throughput: 0: 1680.1, 1: 1672.6. Samples: 45523416. 
Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:33,607][45375] Avg episode reward: [(0, '60.120'), (1, '52.810')] +[2023-10-13 03:58:33,674][46662] Updated weights for policy 0, policy_version 88960 (0.0007) +[2023-10-13 03:58:37,720][46663] Updated weights for policy 1, policy_version 88871 (0.0008) +[2023-10-13 03:58:37,880][46662] Updated weights for policy 0, policy_version 88970 (0.0008) +[2023-10-13 03:58:38,078][46663] Updated weights for policy 1, policy_version 88881 (0.0007) +[2023-10-13 03:58:38,247][46662] Updated weights for policy 0, policy_version 88980 (0.0007) +[2023-10-13 03:58:38,451][46663] Updated weights for policy 1, policy_version 88891 (0.0007) +[2023-10-13 03:58:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 182091776. Throughput: 0: 1676.4, 1: 1655.3. Samples: 45542730. Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:38,607][45375] Avg episode reward: [(0, '61.530'), (1, '53.330')] +[2023-10-13 03:58:38,616][46662] Updated weights for policy 0, policy_version 88990 (0.0007) +[2023-10-13 03:58:42,743][46663] Updated weights for policy 1, policy_version 88901 (0.0008) +[2023-10-13 03:58:42,859][46662] Updated weights for policy 0, policy_version 89000 (0.0009) +[2023-10-13 03:58:43,137][46663] Updated weights for policy 1, policy_version 88911 (0.0011) +[2023-10-13 03:58:43,230][46662] Updated weights for policy 0, policy_version 89010 (0.0007) +[2023-10-13 03:58:43,496][46663] Updated weights for policy 1, policy_version 88921 (0.0007) +[2023-10-13 03:58:43,603][46662] Updated weights for policy 0, policy_version 89020 (0.0008) +[2023-10-13 03:58:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 182157312. Throughput: 0: 1684.2, 1: 1673.9. Samples: 45552910. 
Policy #0 lag: (min: 17.0, avg: 37.6, max: 49.0) +[2023-10-13 03:58:43,607][45375] Avg episode reward: [(0, '59.980'), (1, '52.190')] +[2023-10-13 03:58:47,412][46663] Updated weights for policy 1, policy_version 88931 (0.0009) +[2023-10-13 03:58:47,650][46662] Updated weights for policy 0, policy_version 89030 (0.0009) +[2023-10-13 03:58:47,781][46663] Updated weights for policy 1, policy_version 88941 (0.0008) +[2023-10-13 03:58:48,021][46662] Updated weights for policy 0, policy_version 89040 (0.0009) +[2023-10-13 03:58:48,149][46663] Updated weights for policy 1, policy_version 88951 (0.0009) +[2023-10-13 03:58:48,393][46662] Updated weights for policy 0, policy_version 89050 (0.0009) +[2023-10-13 03:58:48,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182255616. Throughput: 0: 1676.0, 1: 1672.5. Samples: 45573078. Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:58:48,607][45375] Avg episode reward: [(0, '60.200'), (1, '51.950')] +[2023-10-13 03:58:52,150][46663] Updated weights for policy 1, policy_version 88961 (0.0009) +[2023-10-13 03:58:52,492][46662] Updated weights for policy 0, policy_version 89060 (0.0008) +[2023-10-13 03:58:52,509][46663] Updated weights for policy 1, policy_version 88971 (0.0009) +[2023-10-13 03:58:52,863][46662] Updated weights for policy 0, policy_version 89070 (0.0009) +[2023-10-13 03:58:52,873][46663] Updated weights for policy 1, policy_version 88981 (0.0010) +[2023-10-13 03:58:53,234][46662] Updated weights for policy 0, policy_version 89080 (0.0008) +[2023-10-13 03:58:53,235][46663] Updated weights for policy 1, policy_version 88991 (0.0007) +[2023-10-13 03:58:53,607][45375] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182353920. Throughput: 0: 1662.8, 1: 1651.1. Samples: 45591908. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:58:53,607][45375] Avg episode reward: [(0, '60.290'), (1, '50.240')] +[2023-10-13 03:58:57,180][46663] Updated weights for policy 1, policy_version 89001 (0.0008) +[2023-10-13 03:58:57,358][46662] Updated weights for policy 0, policy_version 89090 (0.0008) +[2023-10-13 03:58:57,541][46663] Updated weights for policy 1, policy_version 89011 (0.0007) +[2023-10-13 03:58:57,729][46662] Updated weights for policy 0, policy_version 89100 (0.0007) +[2023-10-13 03:58:57,897][46663] Updated weights for policy 1, policy_version 89021 (0.0007) +[2023-10-13 03:58:58,100][46662] Updated weights for policy 0, policy_version 89110 (0.0008) +[2023-10-13 03:58:58,463][46662] Updated weights for policy 0, policy_version 89120 (0.0007) +[2023-10-13 03:58:58,606][45375] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182419456. Throughput: 0: 1672.7, 1: 1672.9. Samples: 45602860. Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:58:58,607][45375] Avg episode reward: [(0, '60.380'), (1, '50.580')] +[2023-10-13 03:59:02,129][46663] Updated weights for policy 1, policy_version 89031 (0.0008) +[2023-10-13 03:59:02,485][46663] Updated weights for policy 1, policy_version 89041 (0.0009) +[2023-10-13 03:59:02,528][46662] Updated weights for policy 0, policy_version 89130 (0.0008) +[2023-10-13 03:59:02,843][46663] Updated weights for policy 1, policy_version 89051 (0.0008) +[2023-10-13 03:59:02,891][46662] Updated weights for policy 0, policy_version 89140 (0.0008) +[2023-10-13 03:59:03,263][46662] Updated weights for policy 0, policy_version 89150 (0.0009) +[2023-10-13 03:59:03,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 182484992. Throughput: 0: 1663.9, 1: 1659.6. Samples: 45622766. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:03,608][45375] Avg episode reward: [(0, '60.470'), (1, '51.080')] +[2023-10-13 03:59:06,925][46663] Updated weights for policy 1, policy_version 89061 (0.0008) +[2023-10-13 03:59:07,287][46663] Updated weights for policy 1, policy_version 89071 (0.0009) +[2023-10-13 03:59:07,394][46662] Updated weights for policy 0, policy_version 89160 (0.0008) +[2023-10-13 03:59:07,654][46663] Updated weights for policy 1, policy_version 89081 (0.0009) +[2023-10-13 03:59:07,759][46662] Updated weights for policy 0, policy_version 89170 (0.0010) +[2023-10-13 03:59:08,128][46662] Updated weights for policy 0, policy_version 89180 (0.0007) +[2023-10-13 03:59:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182550528. Throughput: 0: 1649.8, 1: 1659.5. Samples: 45641926. Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:08,607][45375] Avg episode reward: [(0, '61.620'), (1, '50.540')] +[2023-10-13 03:59:11,760][46663] Updated weights for policy 1, policy_version 89091 (0.0010) +[2023-10-13 03:59:12,126][46663] Updated weights for policy 1, policy_version 89101 (0.0008) +[2023-10-13 03:59:12,217][46662] Updated weights for policy 0, policy_version 89190 (0.0008) +[2023-10-13 03:59:12,478][46663] Updated weights for policy 1, policy_version 89111 (0.0008) +[2023-10-13 03:59:12,589][46662] Updated weights for policy 0, policy_version 89200 (0.0007) +[2023-10-13 03:59:12,952][46662] Updated weights for policy 0, policy_version 89210 (0.0007) +[2023-10-13 03:59:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182616064. Throughput: 0: 1658.6, 1: 1674.0. Samples: 45652934. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:13,607][45375] Avg episode reward: [(0, '60.310'), (1, '50.160')] +[2023-10-13 03:59:16,425][46663] Updated weights for policy 1, policy_version 89121 (0.0010) +[2023-10-13 03:59:16,791][46663] Updated weights for policy 1, policy_version 89131 (0.0008) +[2023-10-13 03:59:17,162][46663] Updated weights for policy 1, policy_version 89141 (0.0008) +[2023-10-13 03:59:17,301][46662] Updated weights for policy 0, policy_version 89220 (0.0008) +[2023-10-13 03:59:17,524][46663] Updated weights for policy 1, policy_version 89151 (0.0009) +[2023-10-13 03:59:17,666][46662] Updated weights for policy 0, policy_version 89230 (0.0008) +[2023-10-13 03:59:18,043][46662] Updated weights for policy 0, policy_version 89240 (0.0009) +[2023-10-13 03:59:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182681600. Throughput: 0: 1662.6, 1: 1658.3. Samples: 45672856. Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:18,607][45375] Avg episode reward: [(0, '58.910'), (1, '50.400')] +[2023-10-13 03:59:21,499][46663] Updated weights for policy 1, policy_version 89161 (0.0008) +[2023-10-13 03:59:21,868][46663] Updated weights for policy 1, policy_version 89171 (0.0007) +[2023-10-13 03:59:21,913][46662] Updated weights for policy 0, policy_version 89250 (0.0009) +[2023-10-13 03:59:22,236][46663] Updated weights for policy 1, policy_version 89181 (0.0008) +[2023-10-13 03:59:22,283][46662] Updated weights for policy 0, policy_version 89260 (0.0009) +[2023-10-13 03:59:22,656][46662] Updated weights for policy 0, policy_version 89270 (0.0010) +[2023-10-13 03:59:23,024][46662] Updated weights for policy 0, policy_version 89280 (0.0009) +[2023-10-13 03:59:23,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182747136. Throughput: 0: 1646.4, 1: 1677.6. Samples: 45692314. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:23,607][45375] Avg episode reward: [(0, '58.690'), (1, '49.880')] +[2023-10-13 03:59:26,558][46663] Updated weights for policy 1, policy_version 89191 (0.0008) +[2023-10-13 03:59:26,913][46663] Updated weights for policy 1, policy_version 89201 (0.0007) +[2023-10-13 03:59:27,056][46662] Updated weights for policy 0, policy_version 89290 (0.0008) +[2023-10-13 03:59:27,278][46663] Updated weights for policy 1, policy_version 89211 (0.0009) +[2023-10-13 03:59:27,420][46662] Updated weights for policy 0, policy_version 89300 (0.0010) +[2023-10-13 03:59:27,787][46662] Updated weights for policy 0, policy_version 89310 (0.0011) +[2023-10-13 03:59:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182812672. Throughput: 0: 1665.8, 1: 1678.6. Samples: 45703410. Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:28,607][45375] Avg episode reward: [(0, '57.420'), (1, '49.440')] +[2023-10-13 03:59:31,469][46663] Updated weights for policy 1, policy_version 89221 (0.0009) +[2023-10-13 03:59:31,856][46663] Updated weights for policy 1, policy_version 89231 (0.0008) +[2023-10-13 03:59:32,102][46662] Updated weights for policy 0, policy_version 89320 (0.0008) +[2023-10-13 03:59:32,227][46663] Updated weights for policy 1, policy_version 89241 (0.0008) +[2023-10-13 03:59:32,474][46662] Updated weights for policy 0, policy_version 89330 (0.0009) +[2023-10-13 03:59:32,843][46662] Updated weights for policy 0, policy_version 89340 (0.0009) +[2023-10-13 03:59:33,607][45375] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 182878208. Throughput: 0: 1664.7, 1: 1664.2. Samples: 45722880. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 34.0) +[2023-10-13 03:59:33,608][45375] Avg episode reward: [(0, '58.370'), (1, '48.870')] +[2023-10-13 03:59:36,239][46663] Updated weights for policy 1, policy_version 89251 (0.0009) +[2023-10-13 03:59:36,610][46663] Updated weights for policy 1, policy_version 89261 (0.0010) +[2023-10-13 03:59:36,985][46663] Updated weights for policy 1, policy_version 89271 (0.0010) +[2023-10-13 03:59:37,014][46662] Updated weights for policy 0, policy_version 89350 (0.0008) +[2023-10-13 03:59:37,373][46662] Updated weights for policy 0, policy_version 89360 (0.0007) +[2023-10-13 03:59:37,759][46662] Updated weights for policy 0, policy_version 89370 (0.0007) +[2023-10-13 03:59:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 182943744. Throughput: 0: 1652.0, 1: 1687.6. Samples: 45742190. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:59:38,607][45375] Avg episode reward: [(0, '59.430'), (1, '49.920')] +[2023-10-13 03:59:41,183][46663] Updated weights for policy 1, policy_version 89281 (0.0008) +[2023-10-13 03:59:41,548][46663] Updated weights for policy 1, policy_version 89291 (0.0007) +[2023-10-13 03:59:41,809][46662] Updated weights for policy 0, policy_version 89380 (0.0008) +[2023-10-13 03:59:41,903][46663] Updated weights for policy 1, policy_version 89301 (0.0008) +[2023-10-13 03:59:42,183][46662] Updated weights for policy 0, policy_version 89390 (0.0008) +[2023-10-13 03:59:42,267][46663] Updated weights for policy 1, policy_version 89311 (0.0008) +[2023-10-13 03:59:42,549][46662] Updated weights for policy 0, policy_version 89400 (0.0009) +[2023-10-13 03:59:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 183009280. Throughput: 0: 1662.4, 1: 1676.5. Samples: 45753110. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:59:43,607][45375] Avg episode reward: [(0, '58.050'), (1, '50.940')] +[2023-10-13 03:59:46,425][46663] Updated weights for policy 1, policy_version 89321 (0.0008) +[2023-10-13 03:59:46,614][46662] Updated weights for policy 0, policy_version 89410 (0.0009) +[2023-10-13 03:59:46,788][46663] Updated weights for policy 1, policy_version 89331 (0.0008) +[2023-10-13 03:59:46,987][46662] Updated weights for policy 0, policy_version 89420 (0.0007) +[2023-10-13 03:59:47,154][46663] Updated weights for policy 1, policy_version 89341 (0.0008) +[2023-10-13 03:59:47,355][46662] Updated weights for policy 0, policy_version 89430 (0.0009) +[2023-10-13 03:59:47,732][46662] Updated weights for policy 0, policy_version 89440 (0.0009) +[2023-10-13 03:59:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 183074816. Throughput: 0: 1659.8, 1: 1663.5. Samples: 45772312. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:59:48,607][45375] Avg episode reward: [(0, '57.110'), (1, '50.170')] +[2023-10-13 03:59:51,169][46663] Updated weights for policy 1, policy_version 89351 (0.0008) +[2023-10-13 03:59:51,534][46663] Updated weights for policy 1, policy_version 89361 (0.0007) +[2023-10-13 03:59:51,819][46662] Updated weights for policy 0, policy_version 89450 (0.0008) +[2023-10-13 03:59:51,900][46663] Updated weights for policy 1, policy_version 89371 (0.0007) +[2023-10-13 03:59:52,192][46662] Updated weights for policy 0, policy_version 89460 (0.0007) +[2023-10-13 03:59:52,554][46662] Updated weights for policy 0, policy_version 89470 (0.0008) +[2023-10-13 03:59:53,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 183140352. Throughput: 0: 1655.9, 1: 1683.0. Samples: 45792178. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:59:53,608][45375] Avg episode reward: [(0, '57.970'), (1, '50.350')] +[2023-10-13 03:59:53,622][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000089376_91521024.pth... +[2023-10-13 03:59:53,622][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000089472_91619328.pth... +[2023-10-13 03:59:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000087808_89915392.pth +[2023-10-13 03:59:53,659][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000087904_90013696.pth +[2023-10-13 03:59:55,885][46663] Updated weights for policy 1, policy_version 89381 (0.0008) +[2023-10-13 03:59:56,256][46663] Updated weights for policy 1, policy_version 89391 (0.0010) +[2023-10-13 03:59:56,620][46662] Updated weights for policy 0, policy_version 89480 (0.0007) +[2023-10-13 03:59:56,630][46663] Updated weights for policy 1, policy_version 89401 (0.0007) +[2023-10-13 03:59:56,990][46662] Updated weights for policy 0, policy_version 89490 (0.0008) +[2023-10-13 03:59:57,372][46662] Updated weights for policy 0, policy_version 89500 (0.0011) +[2023-10-13 03:59:58,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183205888. Throughput: 0: 1669.0, 1: 1667.8. Samples: 45803088. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 03:59:58,607][45375] Avg episode reward: [(0, '59.070'), (1, '49.920')] +[2023-10-13 04:00:00,756][46663] Updated weights for policy 1, policy_version 89411 (0.0009) +[2023-10-13 04:00:01,127][46663] Updated weights for policy 1, policy_version 89421 (0.0010) +[2023-10-13 04:00:01,432][46662] Updated weights for policy 0, policy_version 89510 (0.0009) +[2023-10-13 04:00:01,492][46663] Updated weights for policy 1, policy_version 89431 (0.0007) +[2023-10-13 04:00:01,800][46662] Updated weights for policy 0, policy_version 89520 (0.0008) +[2023-10-13 04:00:02,166][46662] Updated weights for policy 0, policy_version 89530 (0.0009) +[2023-10-13 04:00:03,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 183271424. Throughput: 0: 1654.6, 1: 1670.7. Samples: 45822496. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:03,608][45375] Avg episode reward: [(0, '57.840'), (1, '50.350')] +[2023-10-13 04:00:05,674][46663] Updated weights for policy 1, policy_version 89441 (0.0009) +[2023-10-13 04:00:06,054][46663] Updated weights for policy 1, policy_version 89451 (0.0008) +[2023-10-13 04:00:06,172][46662] Updated weights for policy 0, policy_version 89540 (0.0009) +[2023-10-13 04:00:06,420][46663] Updated weights for policy 1, policy_version 89461 (0.0009) +[2023-10-13 04:00:06,530][46662] Updated weights for policy 0, policy_version 89550 (0.0008) +[2023-10-13 04:00:06,777][46663] Updated weights for policy 1, policy_version 89471 (0.0009) +[2023-10-13 04:00:06,897][46662] Updated weights for policy 0, policy_version 89560 (0.0009) +[2023-10-13 04:00:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183336960. Throughput: 0: 1667.0, 1: 1676.6. Samples: 45842778. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:08,607][45375] Avg episode reward: [(0, '56.550'), (1, '50.250')] +[2023-10-13 04:00:10,844][46663] Updated weights for policy 1, policy_version 89481 (0.0008) +[2023-10-13 04:00:11,034][46662] Updated weights for policy 0, policy_version 89570 (0.0009) +[2023-10-13 04:00:11,206][46663] Updated weights for policy 1, policy_version 89491 (0.0009) +[2023-10-13 04:00:11,407][46662] Updated weights for policy 0, policy_version 89580 (0.0008) +[2023-10-13 04:00:11,576][46663] Updated weights for policy 1, policy_version 89501 (0.0007) +[2023-10-13 04:00:11,775][46662] Updated weights for policy 0, policy_version 89590 (0.0007) +[2023-10-13 04:00:12,140][46662] Updated weights for policy 0, policy_version 89600 (0.0008) +[2023-10-13 04:00:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183402496. Throughput: 0: 1670.0, 1: 1664.8. Samples: 45853478. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:13,608][45375] Avg episode reward: [(0, '56.950'), (1, '51.330')] +[2023-10-13 04:00:15,610][46663] Updated weights for policy 1, policy_version 89511 (0.0008) +[2023-10-13 04:00:15,971][46663] Updated weights for policy 1, policy_version 89521 (0.0008) +[2023-10-13 04:00:16,189][46662] Updated weights for policy 0, policy_version 89610 (0.0007) +[2023-10-13 04:00:16,342][46663] Updated weights for policy 1, policy_version 89531 (0.0008) +[2023-10-13 04:00:16,556][46662] Updated weights for policy 0, policy_version 89620 (0.0010) +[2023-10-13 04:00:16,941][46662] Updated weights for policy 0, policy_version 89630 (0.0009) +[2023-10-13 04:00:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183468032. Throughput: 0: 1654.9, 1: 1675.8. Samples: 45872762. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:18,607][45375] Avg episode reward: [(0, '56.600'), (1, '51.630')] +[2023-10-13 04:00:20,673][46663] Updated weights for policy 1, policy_version 89541 (0.0009) +[2023-10-13 04:00:21,063][46663] Updated weights for policy 1, policy_version 89551 (0.0008) +[2023-10-13 04:00:21,098][46662] Updated weights for policy 0, policy_version 89640 (0.0008) +[2023-10-13 04:00:21,420][46663] Updated weights for policy 1, policy_version 89561 (0.0007) +[2023-10-13 04:00:21,461][46662] Updated weights for policy 0, policy_version 89650 (0.0007) +[2023-10-13 04:00:21,845][46662] Updated weights for policy 0, policy_version 89660 (0.0008) +[2023-10-13 04:00:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 183533568. Throughput: 0: 1672.8, 1: 1676.4. Samples: 45892904. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:23,608][45375] Avg episode reward: [(0, '57.150'), (1, '51.940')] +[2023-10-13 04:00:25,512][46663] Updated weights for policy 1, policy_version 89571 (0.0008) +[2023-10-13 04:00:25,847][46662] Updated weights for policy 0, policy_version 89670 (0.0008) +[2023-10-13 04:00:25,878][46663] Updated weights for policy 1, policy_version 89581 (0.0008) +[2023-10-13 04:00:26,214][46662] Updated weights for policy 0, policy_version 89680 (0.0010) +[2023-10-13 04:00:26,248][46663] Updated weights for policy 1, policy_version 89591 (0.0009) +[2023-10-13 04:00:26,585][46662] Updated weights for policy 0, policy_version 89690 (0.0010) +[2023-10-13 04:00:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183599104. Throughput: 0: 1675.9, 1: 1659.6. Samples: 45903208. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:28,607][45375] Avg episode reward: [(0, '58.240'), (1, '53.120')] +[2023-10-13 04:00:30,327][46663] Updated weights for policy 1, policy_version 89601 (0.0010) +[2023-10-13 04:00:30,683][46663] Updated weights for policy 1, policy_version 89611 (0.0010) +[2023-10-13 04:00:30,832][46662] Updated weights for policy 0, policy_version 89700 (0.0007) +[2023-10-13 04:00:31,056][46663] Updated weights for policy 1, policy_version 89621 (0.0009) +[2023-10-13 04:00:31,196][46662] Updated weights for policy 0, policy_version 89710 (0.0007) +[2023-10-13 04:00:31,428][46663] Updated weights for policy 1, policy_version 89631 (0.0007) +[2023-10-13 04:00:31,575][46662] Updated weights for policy 0, policy_version 89720 (0.0008) +[2023-10-13 04:00:33,606][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 183664640. Throughput: 0: 1657.8, 1: 1676.8. Samples: 45922368. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-13 04:00:33,607][45375] Avg episode reward: [(0, '58.750'), (1, '54.220')] +[2023-10-13 04:00:35,529][46663] Updated weights for policy 1, policy_version 89641 (0.0009) +[2023-10-13 04:00:35,714][46662] Updated weights for policy 0, policy_version 89730 (0.0010) +[2023-10-13 04:00:35,893][46663] Updated weights for policy 1, policy_version 89651 (0.0009) +[2023-10-13 04:00:36,089][46662] Updated weights for policy 0, policy_version 89740 (0.0010) +[2023-10-13 04:00:36,265][46663] Updated weights for policy 1, policy_version 89661 (0.0010) +[2023-10-13 04:00:36,447][46662] Updated weights for policy 0, policy_version 89750 (0.0009) +[2023-10-13 04:00:36,814][46662] Updated weights for policy 0, policy_version 89760 (0.0010) +[2023-10-13 04:00:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183730176. Throughput: 0: 1673.3, 1: 1665.9. Samples: 45942438. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:00:38,607][45375] Avg episode reward: [(0, '57.710'), (1, '54.670')] +[2023-10-13 04:00:40,427][46663] Updated weights for policy 1, policy_version 89671 (0.0008) +[2023-10-13 04:00:40,798][46663] Updated weights for policy 1, policy_version 89681 (0.0008) +[2023-10-13 04:00:40,873][46662] Updated weights for policy 0, policy_version 89770 (0.0009) +[2023-10-13 04:00:41,159][46663] Updated weights for policy 1, policy_version 89691 (0.0008) +[2023-10-13 04:00:41,244][46662] Updated weights for policy 0, policy_version 89780 (0.0009) +[2023-10-13 04:00:41,610][46662] Updated weights for policy 0, policy_version 89790 (0.0009) +[2023-10-13 04:00:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 183795712. Throughput: 0: 1668.7, 1: 1656.0. Samples: 45952700. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:00:43,608][45375] Avg episode reward: [(0, '59.970'), (1, '54.920')] +[2023-10-13 04:00:44,923][46663] Updated weights for policy 1, policy_version 89701 (0.0007) +[2023-10-13 04:00:45,283][46663] Updated weights for policy 1, policy_version 89711 (0.0010) +[2023-10-13 04:00:45,625][46662] Updated weights for policy 0, policy_version 89800 (0.0008) +[2023-10-13 04:00:45,652][46663] Updated weights for policy 1, policy_version 89721 (0.0007) +[2023-10-13 04:00:45,990][46662] Updated weights for policy 0, policy_version 89810 (0.0009) +[2023-10-13 04:00:46,363][46662] Updated weights for policy 0, policy_version 89820 (0.0010) +[2023-10-13 04:00:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183861248. Throughput: 0: 1660.7, 1: 1673.8. Samples: 45972548. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:00:48,607][45375] Avg episode reward: [(0, '58.980'), (1, '55.080')] +[2023-10-13 04:00:49,753][46663] Updated weights for policy 1, policy_version 89731 (0.0008) +[2023-10-13 04:00:50,118][46663] Updated weights for policy 1, policy_version 89741 (0.0009) +[2023-10-13 04:00:50,494][46663] Updated weights for policy 1, policy_version 89751 (0.0007) +[2023-10-13 04:00:50,581][46662] Updated weights for policy 0, policy_version 89830 (0.0009) +[2023-10-13 04:00:50,948][46662] Updated weights for policy 0, policy_version 89840 (0.0009) +[2023-10-13 04:00:51,315][46662] Updated weights for policy 0, policy_version 89850 (0.0010) +[2023-10-13 04:00:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 183926784. Throughput: 0: 1670.4, 1: 1670.4. Samples: 45993116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:00:53,608][45375] Avg episode reward: [(0, '58.110'), (1, '55.120')] +[2023-10-13 04:00:54,531][46663] Updated weights for policy 1, policy_version 89761 (0.0007) +[2023-10-13 04:00:54,893][46663] Updated weights for policy 1, policy_version 89771 (0.0007) +[2023-10-13 04:00:55,257][46663] Updated weights for policy 1, policy_version 89781 (0.0008) +[2023-10-13 04:00:55,320][46662] Updated weights for policy 0, policy_version 89860 (0.0008) +[2023-10-13 04:00:55,631][46663] Updated weights for policy 1, policy_version 89791 (0.0009) +[2023-10-13 04:00:55,690][46662] Updated weights for policy 0, policy_version 89870 (0.0008) +[2023-10-13 04:00:56,049][46662] Updated weights for policy 0, policy_version 89880 (0.0009) +[2023-10-13 04:00:58,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183992320. Throughput: 0: 1659.6, 1: 1662.5. Samples: 46002970. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:00:58,607][45375] Avg episode reward: [(0, '58.760'), (1, '56.270')] +[2023-10-13 04:00:59,740][46663] Updated weights for policy 1, policy_version 89801 (0.0007) +[2023-10-13 04:01:00,059][46662] Updated weights for policy 0, policy_version 89890 (0.0009) +[2023-10-13 04:01:00,105][46663] Updated weights for policy 1, policy_version 89811 (0.0008) +[2023-10-13 04:01:00,425][46662] Updated weights for policy 0, policy_version 89900 (0.0007) +[2023-10-13 04:01:00,460][46663] Updated weights for policy 1, policy_version 89821 (0.0010) +[2023-10-13 04:01:00,804][46662] Updated weights for policy 0, policy_version 89910 (0.0009) +[2023-10-13 04:01:01,158][46662] Updated weights for policy 0, policy_version 89920 (0.0010) +[2023-10-13 04:01:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 184057856. Throughput: 0: 1669.2, 1: 1675.4. Samples: 46023270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:03,608][45375] Avg episode reward: [(0, '59.850'), (1, '57.120')] +[2023-10-13 04:01:04,522][46663] Updated weights for policy 1, policy_version 89831 (0.0009) +[2023-10-13 04:01:04,890][46663] Updated weights for policy 1, policy_version 89841 (0.0009) +[2023-10-13 04:01:05,255][46663] Updated weights for policy 1, policy_version 89851 (0.0008) +[2023-10-13 04:01:05,275][46662] Updated weights for policy 0, policy_version 89930 (0.0008) +[2023-10-13 04:01:05,643][46662] Updated weights for policy 0, policy_version 89940 (0.0009) +[2023-10-13 04:01:06,016][46662] Updated weights for policy 0, policy_version 89950 (0.0009) +[2023-10-13 04:01:08,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184123392. Throughput: 0: 1672.3, 1: 1681.6. Samples: 46043828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:08,608][45375] Avg episode reward: [(0, '58.330'), (1, '57.440')] +[2023-10-13 04:01:09,366][46663] Updated weights for policy 1, policy_version 89861 (0.0008) +[2023-10-13 04:01:09,751][46663] Updated weights for policy 1, policy_version 89871 (0.0007) +[2023-10-13 04:01:10,121][46663] Updated weights for policy 1, policy_version 89881 (0.0008) +[2023-10-13 04:01:10,291][46662] Updated weights for policy 0, policy_version 89960 (0.0007) +[2023-10-13 04:01:10,661][46662] Updated weights for policy 0, policy_version 89970 (0.0008) +[2023-10-13 04:01:11,041][46662] Updated weights for policy 0, policy_version 89980 (0.0008) +[2023-10-13 04:01:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184188928. Throughput: 0: 1655.5, 1: 1675.0. Samples: 46053078. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:13,607][45375] Avg episode reward: [(0, '57.850'), (1, '58.080')] +[2023-10-13 04:01:14,087][46663] Updated weights for policy 1, policy_version 89891 (0.0007) +[2023-10-13 04:01:14,456][46663] Updated weights for policy 1, policy_version 89901 (0.0009) +[2023-10-13 04:01:14,822][46663] Updated weights for policy 1, policy_version 89911 (0.0009) +[2023-10-13 04:01:15,140][46662] Updated weights for policy 0, policy_version 89990 (0.0008) +[2023-10-13 04:01:15,520][46662] Updated weights for policy 0, policy_version 90000 (0.0009) +[2023-10-13 04:01:15,889][46662] Updated weights for policy 0, policy_version 90010 (0.0008) +[2023-10-13 04:01:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184254464. Throughput: 0: 1669.1, 1: 1685.2. Samples: 46073310. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:18,607][45375] Avg episode reward: [(0, '58.440'), (1, '58.780')] +[2023-10-13 04:01:19,042][46663] Updated weights for policy 1, policy_version 89921 (0.0009) +[2023-10-13 04:01:19,406][46663] Updated weights for policy 1, policy_version 89931 (0.0009) +[2023-10-13 04:01:19,769][46663] Updated weights for policy 1, policy_version 89941 (0.0009) +[2023-10-13 04:01:19,843][46662] Updated weights for policy 0, policy_version 90020 (0.0009) +[2023-10-13 04:01:20,137][46663] Updated weights for policy 1, policy_version 89951 (0.0008) +[2023-10-13 04:01:20,213][46662] Updated weights for policy 0, policy_version 90030 (0.0008) +[2023-10-13 04:01:20,577][46662] Updated weights for policy 0, policy_version 90040 (0.0008) +[2023-10-13 04:01:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 184320000. Throughput: 0: 1681.0, 1: 1687.0. Samples: 46093996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:23,608][45375] Avg episode reward: [(0, '58.600'), (1, '59.470')] +[2023-10-13 04:01:24,361][46663] Updated weights for policy 1, policy_version 89961 (0.0008) +[2023-10-13 04:01:24,634][46662] Updated weights for policy 0, policy_version 90050 (0.0009) +[2023-10-13 04:01:24,731][46663] Updated weights for policy 1, policy_version 89971 (0.0007) +[2023-10-13 04:01:25,010][46662] Updated weights for policy 0, policy_version 90060 (0.0007) +[2023-10-13 04:01:25,088][46663] Updated weights for policy 1, policy_version 89981 (0.0008) +[2023-10-13 04:01:25,374][46662] Updated weights for policy 0, policy_version 90070 (0.0008) +[2023-10-13 04:01:25,742][46662] Updated weights for policy 0, policy_version 90080 (0.0008) +[2023-10-13 04:01:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184385536. Throughput: 0: 1659.5, 1: 1686.0. Samples: 46103246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:28,607][45375] Avg episode reward: [(0, '57.430'), (1, '60.220')] +[2023-10-13 04:01:29,126][46663] Updated weights for policy 1, policy_version 89991 (0.0007) +[2023-10-13 04:01:29,495][46663] Updated weights for policy 1, policy_version 90001 (0.0007) +[2023-10-13 04:01:29,785][46662] Updated weights for policy 0, policy_version 90090 (0.0011) +[2023-10-13 04:01:29,857][46663] Updated weights for policy 1, policy_version 90011 (0.0009) +[2023-10-13 04:01:30,162][46662] Updated weights for policy 0, policy_version 90100 (0.0010) +[2023-10-13 04:01:30,532][46662] Updated weights for policy 0, policy_version 90110 (0.0007) +[2023-10-13 04:01:33,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184451072. Throughput: 0: 1683.4, 1: 1680.6. Samples: 46123928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:01:33,607][45375] Avg episode reward: [(0, '56.690'), (1, '59.090')] +[2023-10-13 04:01:33,761][46663] Updated weights for policy 1, policy_version 90021 (0.0008) +[2023-10-13 04:01:34,135][46663] Updated weights for policy 1, policy_version 90031 (0.0009) +[2023-10-13 04:01:34,499][46663] Updated weights for policy 1, policy_version 90041 (0.0008) +[2023-10-13 04:01:34,662][46662] Updated weights for policy 0, policy_version 90120 (0.0009) +[2023-10-13 04:01:35,037][46662] Updated weights for policy 0, policy_version 90130 (0.0008) +[2023-10-13 04:01:35,407][46662] Updated weights for policy 0, policy_version 90140 (0.0007) +[2023-10-13 04:01:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184516608. Throughput: 0: 1683.2, 1: 1682.9. Samples: 46144592. 
Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:01:38,607][45375] Avg episode reward: [(0, '56.940'), (1, '57.720')] +[2023-10-13 04:01:38,635][46663] Updated weights for policy 1, policy_version 90051 (0.0008) +[2023-10-13 04:01:39,002][46663] Updated weights for policy 1, policy_version 90061 (0.0008) +[2023-10-13 04:01:39,367][46663] Updated weights for policy 1, policy_version 90071 (0.0008) +[2023-10-13 04:01:39,415][46662] Updated weights for policy 0, policy_version 90150 (0.0008) +[2023-10-13 04:01:39,785][46662] Updated weights for policy 0, policy_version 90160 (0.0007) +[2023-10-13 04:01:40,152][46662] Updated weights for policy 0, policy_version 90170 (0.0008) +[2023-10-13 04:01:43,508][46663] Updated weights for policy 1, policy_version 90081 (0.0009) +[2023-10-13 04:01:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 184582144. Throughput: 0: 1668.9, 1: 1679.2. Samples: 46153636. Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:01:43,608][45375] Avg episode reward: [(0, '57.880'), (1, '57.210')] +[2023-10-13 04:01:43,872][46663] Updated weights for policy 1, policy_version 90091 (0.0011) +[2023-10-13 04:01:44,239][46663] Updated weights for policy 1, policy_version 90101 (0.0010) +[2023-10-13 04:01:44,276][46662] Updated weights for policy 0, policy_version 90180 (0.0008) +[2023-10-13 04:01:44,609][46663] Updated weights for policy 1, policy_version 90111 (0.0008) +[2023-10-13 04:01:44,653][46662] Updated weights for policy 0, policy_version 90190 (0.0009) +[2023-10-13 04:01:45,013][46662] Updated weights for policy 0, policy_version 90200 (0.0008) +[2023-10-13 04:01:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184647680. Throughput: 0: 1672.8, 1: 1665.2. Samples: 46173480. 
Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:01:48,607][45375] Avg episode reward: [(0, '57.740'), (1, '57.420')] +[2023-10-13 04:01:48,963][46663] Updated weights for policy 1, policy_version 90121 (0.0008) +[2023-10-13 04:01:49,318][46663] Updated weights for policy 1, policy_version 90131 (0.0009) +[2023-10-13 04:01:49,321][46662] Updated weights for policy 0, policy_version 90210 (0.0008) +[2023-10-13 04:01:49,684][46663] Updated weights for policy 1, policy_version 90141 (0.0009) +[2023-10-13 04:01:49,693][46662] Updated weights for policy 0, policy_version 90220 (0.0010) +[2023-10-13 04:01:50,064][46662] Updated weights for policy 0, policy_version 90230 (0.0010) +[2023-10-13 04:01:50,428][46662] Updated weights for policy 0, policy_version 90240 (0.0011) +[2023-10-13 04:01:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184713216. Throughput: 0: 1663.2, 1: 1656.2. Samples: 46193204. Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:01:53,607][45375] Avg episode reward: [(0, '55.630'), (1, '56.770')] +[2023-10-13 04:01:53,618][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000090240_92405760.pth... +[2023-10-13 04:01:53,618][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000090144_92307456.pth... 
+[2023-10-13 04:01:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000088608_90734592.pth +[2023-10-13 04:01:53,655][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000088672_90800128.pth +[2023-10-13 04:01:54,029][46663] Updated weights for policy 1, policy_version 90151 (0.0007) +[2023-10-13 04:01:54,388][46663] Updated weights for policy 1, policy_version 90161 (0.0007) +[2023-10-13 04:01:54,670][46662] Updated weights for policy 0, policy_version 90250 (0.0008) +[2023-10-13 04:01:54,762][46663] Updated weights for policy 1, policy_version 90171 (0.0010) +[2023-10-13 04:01:55,046][46662] Updated weights for policy 0, policy_version 90260 (0.0009) +[2023-10-13 04:01:55,414][46662] Updated weights for policy 0, policy_version 90270 (0.0008) +[2023-10-13 04:01:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184778752. Throughput: 0: 1653.3, 1: 1658.4. Samples: 46202106. Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:01:58,607][45375] Avg episode reward: [(0, '56.140'), (1, '56.100')] +[2023-10-13 04:01:58,890][46663] Updated weights for policy 1, policy_version 90181 (0.0009) +[2023-10-13 04:01:59,265][46663] Updated weights for policy 1, policy_version 90191 (0.0011) +[2023-10-13 04:01:59,635][46663] Updated weights for policy 1, policy_version 90201 (0.0008) +[2023-10-13 04:01:59,670][46662] Updated weights for policy 0, policy_version 90280 (0.0008) +[2023-10-13 04:02:00,030][46662] Updated weights for policy 0, policy_version 90290 (0.0009) +[2023-10-13 04:02:00,394][46662] Updated weights for policy 0, policy_version 90300 (0.0010) +[2023-10-13 04:02:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184844288. Throughput: 0: 1659.9, 1: 1646.5. Samples: 46222098. 
Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:03,608][45375] Avg episode reward: [(0, '56.630'), (1, '54.850')] +[2023-10-13 04:02:03,875][46663] Updated weights for policy 1, policy_version 90211 (0.0008) +[2023-10-13 04:02:04,232][46663] Updated weights for policy 1, policy_version 90221 (0.0008) +[2023-10-13 04:02:04,600][46663] Updated weights for policy 1, policy_version 90231 (0.0010) +[2023-10-13 04:02:04,742][46662] Updated weights for policy 0, policy_version 90310 (0.0009) +[2023-10-13 04:02:05,122][46662] Updated weights for policy 0, policy_version 90320 (0.0009) +[2023-10-13 04:02:05,496][46662] Updated weights for policy 0, policy_version 90330 (0.0008) +[2023-10-13 04:02:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184909824. Throughput: 0: 1648.4, 1: 1642.2. Samples: 46242072. Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:08,607][45375] Avg episode reward: [(0, '57.130'), (1, '55.090')] +[2023-10-13 04:02:08,866][46663] Updated weights for policy 1, policy_version 90241 (0.0009) +[2023-10-13 04:02:09,228][46663] Updated weights for policy 1, policy_version 90251 (0.0008) +[2023-10-13 04:02:09,554][46662] Updated weights for policy 0, policy_version 90340 (0.0008) +[2023-10-13 04:02:09,596][46663] Updated weights for policy 1, policy_version 90261 (0.0009) +[2023-10-13 04:02:09,928][46662] Updated weights for policy 0, policy_version 90350 (0.0008) +[2023-10-13 04:02:09,957][46663] Updated weights for policy 1, policy_version 90271 (0.0010) +[2023-10-13 04:02:10,288][46662] Updated weights for policy 0, policy_version 90360 (0.0009) +[2023-10-13 04:02:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 184975360. Throughput: 0: 1645.1, 1: 1637.6. Samples: 46250968. 
Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:13,607][45375] Avg episode reward: [(0, '55.750'), (1, '55.020')] +[2023-10-13 04:02:14,145][46663] Updated weights for policy 1, policy_version 90281 (0.0008) +[2023-10-13 04:02:14,445][46662] Updated weights for policy 0, policy_version 90370 (0.0008) +[2023-10-13 04:02:14,501][46663] Updated weights for policy 1, policy_version 90291 (0.0010) +[2023-10-13 04:02:14,806][46662] Updated weights for policy 0, policy_version 90380 (0.0009) +[2023-10-13 04:02:14,866][46663] Updated weights for policy 1, policy_version 90301 (0.0007) +[2023-10-13 04:02:15,178][46662] Updated weights for policy 0, policy_version 90390 (0.0009) +[2023-10-13 04:02:15,547][46662] Updated weights for policy 0, policy_version 90400 (0.0009) +[2023-10-13 04:02:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 185040896. Throughput: 0: 1641.1, 1: 1631.6. Samples: 46271196. Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:18,607][45375] Avg episode reward: [(0, '56.660'), (1, '55.850')] +[2023-10-13 04:02:19,104][46663] Updated weights for policy 1, policy_version 90311 (0.0008) +[2023-10-13 04:02:19,467][46663] Updated weights for policy 1, policy_version 90321 (0.0009) +[2023-10-13 04:02:19,681][46662] Updated weights for policy 0, policy_version 90410 (0.0008) +[2023-10-13 04:02:19,825][46663] Updated weights for policy 1, policy_version 90331 (0.0009) +[2023-10-13 04:02:20,049][46662] Updated weights for policy 0, policy_version 90420 (0.0008) +[2023-10-13 04:02:20,420][46662] Updated weights for policy 0, policy_version 90430 (0.0010) +[2023-10-13 04:02:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185106432. Throughput: 0: 1634.3, 1: 1619.8. Samples: 46291028. 
Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:23,607][45375] Avg episode reward: [(0, '58.390'), (1, '55.460')] +[2023-10-13 04:02:24,255][46663] Updated weights for policy 1, policy_version 90341 (0.0010) +[2023-10-13 04:02:24,615][46662] Updated weights for policy 0, policy_version 90440 (0.0009) +[2023-10-13 04:02:24,616][46663] Updated weights for policy 1, policy_version 90351 (0.0007) +[2023-10-13 04:02:24,984][46663] Updated weights for policy 1, policy_version 90361 (0.0009) +[2023-10-13 04:02:24,986][46662] Updated weights for policy 0, policy_version 90450 (0.0010) +[2023-10-13 04:02:25,355][46662] Updated weights for policy 0, policy_version 90460 (0.0009) +[2023-10-13 04:02:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185171968. Throughput: 0: 1625.7, 1: 1617.2. Samples: 46299566. Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:28,607][45375] Avg episode reward: [(0, '60.530'), (1, '54.240')] +[2023-10-13 04:02:29,204][46663] Updated weights for policy 1, policy_version 90371 (0.0008) +[2023-10-13 04:02:29,518][46662] Updated weights for policy 0, policy_version 90470 (0.0010) +[2023-10-13 04:02:29,569][46663] Updated weights for policy 1, policy_version 90381 (0.0008) +[2023-10-13 04:02:29,887][46662] Updated weights for policy 0, policy_version 90480 (0.0008) +[2023-10-13 04:02:29,941][46663] Updated weights for policy 1, policy_version 90391 (0.0009) +[2023-10-13 04:02:30,245][46662] Updated weights for policy 0, policy_version 90490 (0.0008) +[2023-10-13 04:02:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185237504. Throughput: 0: 1630.0, 1: 1630.1. Samples: 46320184. 
Policy #0 lag: (min: 16.0, avg: 37.9, max: 48.0) +[2023-10-13 04:02:33,607][45375] Avg episode reward: [(0, '60.230'), (1, '54.140')] +[2023-10-13 04:02:33,821][46663] Updated weights for policy 1, policy_version 90401 (0.0008) +[2023-10-13 04:02:34,192][46663] Updated weights for policy 1, policy_version 90411 (0.0008) +[2023-10-13 04:02:34,390][46662] Updated weights for policy 0, policy_version 90500 (0.0008) +[2023-10-13 04:02:34,552][46663] Updated weights for policy 1, policy_version 90421 (0.0008) +[2023-10-13 04:02:34,762][46662] Updated weights for policy 0, policy_version 90510 (0.0008) +[2023-10-13 04:02:34,933][46663] Updated weights for policy 1, policy_version 90431 (0.0008) +[2023-10-13 04:02:35,128][46662] Updated weights for policy 0, policy_version 90520 (0.0007) +[2023-10-13 04:02:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185303040. Throughput: 0: 1646.4, 1: 1633.5. Samples: 46340798. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:02:38,607][45375] Avg episode reward: [(0, '61.350'), (1, '54.500')] +[2023-10-13 04:02:39,065][46663] Updated weights for policy 1, policy_version 90441 (0.0008) +[2023-10-13 04:02:39,186][46662] Updated weights for policy 0, policy_version 90530 (0.0008) +[2023-10-13 04:02:39,427][46663] Updated weights for policy 1, policy_version 90451 (0.0008) +[2023-10-13 04:02:39,549][46662] Updated weights for policy 0, policy_version 90540 (0.0009) +[2023-10-13 04:02:39,782][46663] Updated weights for policy 1, policy_version 90461 (0.0007) +[2023-10-13 04:02:39,925][46662] Updated weights for policy 0, policy_version 90550 (0.0009) +[2023-10-13 04:02:40,300][46662] Updated weights for policy 0, policy_version 90560 (0.0010) +[2023-10-13 04:02:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185368576. Throughput: 0: 1644.3, 1: 1636.2. Samples: 46349728. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:02:43,607][45375] Avg episode reward: [(0, '61.550'), (1, '55.030')] +[2023-10-13 04:02:44,046][46663] Updated weights for policy 1, policy_version 90471 (0.0009) +[2023-10-13 04:02:44,413][46662] Updated weights for policy 0, policy_version 90570 (0.0007) +[2023-10-13 04:02:44,423][46663] Updated weights for policy 1, policy_version 90481 (0.0007) +[2023-10-13 04:02:44,777][46662] Updated weights for policy 0, policy_version 90580 (0.0007) +[2023-10-13 04:02:44,792][46663] Updated weights for policy 1, policy_version 90491 (0.0010) +[2023-10-13 04:02:45,150][46662] Updated weights for policy 0, policy_version 90590 (0.0007) +[2023-10-13 04:02:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185434112. Throughput: 0: 1646.8, 1: 1642.3. Samples: 46370106. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:02:48,607][45375] Avg episode reward: [(0, '62.180'), (1, '56.330')] +[2023-10-13 04:02:48,978][46663] Updated weights for policy 1, policy_version 90501 (0.0009) +[2023-10-13 04:02:49,336][46663] Updated weights for policy 1, policy_version 90511 (0.0008) +[2023-10-13 04:02:49,572][46662] Updated weights for policy 0, policy_version 90600 (0.0008) +[2023-10-13 04:02:49,705][46663] Updated weights for policy 1, policy_version 90521 (0.0010) +[2023-10-13 04:02:49,933][46662] Updated weights for policy 0, policy_version 90610 (0.0008) +[2023-10-13 04:02:50,306][46662] Updated weights for policy 0, policy_version 90620 (0.0008) +[2023-10-13 04:02:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185499648. Throughput: 0: 1646.4, 1: 1645.3. Samples: 46390196. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:02:53,607][45375] Avg episode reward: [(0, '63.540'), (1, '56.100')] +[2023-10-13 04:02:53,847][46663] Updated weights for policy 1, policy_version 90531 (0.0008) +[2023-10-13 04:02:54,214][46663] Updated weights for policy 1, policy_version 90541 (0.0009) +[2023-10-13 04:02:54,576][46663] Updated weights for policy 1, policy_version 90551 (0.0009) +[2023-10-13 04:02:54,662][46662] Updated weights for policy 0, policy_version 90630 (0.0009) +[2023-10-13 04:02:55,029][46662] Updated weights for policy 0, policy_version 90640 (0.0008) +[2023-10-13 04:02:55,399][46662] Updated weights for policy 0, policy_version 90650 (0.0009) +[2023-10-13 04:02:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185565184. Throughput: 0: 1643.2, 1: 1645.2. Samples: 46398944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:02:58,607][45375] Avg episode reward: [(0, '61.920'), (1, '56.190')] +[2023-10-13 04:02:58,785][46663] Updated weights for policy 1, policy_version 90561 (0.0007) +[2023-10-13 04:02:59,143][46663] Updated weights for policy 1, policy_version 90571 (0.0011) +[2023-10-13 04:02:59,447][46662] Updated weights for policy 0, policy_version 90660 (0.0009) +[2023-10-13 04:02:59,511][46663] Updated weights for policy 1, policy_version 90581 (0.0009) +[2023-10-13 04:02:59,806][46662] Updated weights for policy 0, policy_version 90670 (0.0009) +[2023-10-13 04:02:59,870][46663] Updated weights for policy 1, policy_version 90591 (0.0009) +[2023-10-13 04:03:00,176][46662] Updated weights for policy 0, policy_version 90680 (0.0008) +[2023-10-13 04:03:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185630720. Throughput: 0: 1638.3, 1: 1643.2. Samples: 46418862. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:03,607][45375] Avg episode reward: [(0, '61.050'), (1, '57.200')] +[2023-10-13 04:03:04,204][46663] Updated weights for policy 1, policy_version 90601 (0.0009) +[2023-10-13 04:03:04,350][46662] Updated weights for policy 0, policy_version 90690 (0.0010) +[2023-10-13 04:03:04,564][46663] Updated weights for policy 1, policy_version 90611 (0.0010) +[2023-10-13 04:03:04,718][46662] Updated weights for policy 0, policy_version 90700 (0.0008) +[2023-10-13 04:03:04,931][46663] Updated weights for policy 1, policy_version 90621 (0.0009) +[2023-10-13 04:03:05,085][46662] Updated weights for policy 0, policy_version 90710 (0.0009) +[2023-10-13 04:03:05,454][46662] Updated weights for policy 0, policy_version 90720 (0.0009) +[2023-10-13 04:03:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185696256. Throughput: 0: 1642.4, 1: 1640.1. Samples: 46438742. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:08,607][45375] Avg episode reward: [(0, '63.240'), (1, '56.140')] +[2023-10-13 04:03:09,232][46663] Updated weights for policy 1, policy_version 90631 (0.0009) +[2023-10-13 04:03:09,602][46663] Updated weights for policy 1, policy_version 90641 (0.0007) +[2023-10-13 04:03:09,742][46662] Updated weights for policy 0, policy_version 90730 (0.0009) +[2023-10-13 04:03:09,962][46663] Updated weights for policy 1, policy_version 90651 (0.0007) +[2023-10-13 04:03:10,109][46662] Updated weights for policy 0, policy_version 90740 (0.0009) +[2023-10-13 04:03:10,476][46662] Updated weights for policy 0, policy_version 90750 (0.0009) +[2023-10-13 04:03:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185761792. Throughput: 0: 1645.8, 1: 1642.3. Samples: 46447530. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:13,607][45375] Avg episode reward: [(0, '62.700'), (1, '56.480')] +[2023-10-13 04:03:14,036][46663] Updated weights for policy 1, policy_version 90661 (0.0008) +[2023-10-13 04:03:14,402][46663] Updated weights for policy 1, policy_version 90671 (0.0009) +[2023-10-13 04:03:14,538][46662] Updated weights for policy 0, policy_version 90760 (0.0008) +[2023-10-13 04:03:14,766][46663] Updated weights for policy 1, policy_version 90681 (0.0007) +[2023-10-13 04:03:14,908][46662] Updated weights for policy 0, policy_version 90770 (0.0009) +[2023-10-13 04:03:15,279][46662] Updated weights for policy 0, policy_version 90780 (0.0009) +[2023-10-13 04:03:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185827328. Throughput: 0: 1643.8, 1: 1643.9. Samples: 46468128. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:18,607][45375] Avg episode reward: [(0, '62.120'), (1, '55.730')] +[2023-10-13 04:03:18,819][46663] Updated weights for policy 1, policy_version 90691 (0.0009) +[2023-10-13 04:03:19,188][46663] Updated weights for policy 1, policy_version 90701 (0.0011) +[2023-10-13 04:03:19,422][46662] Updated weights for policy 0, policy_version 90790 (0.0007) +[2023-10-13 04:03:19,542][46663] Updated weights for policy 1, policy_version 90711 (0.0009) +[2023-10-13 04:03:19,793][46662] Updated weights for policy 0, policy_version 90800 (0.0007) +[2023-10-13 04:03:20,165][46662] Updated weights for policy 0, policy_version 90810 (0.0009) +[2023-10-13 04:03:23,561][46663] Updated weights for policy 1, policy_version 90721 (0.0010) +[2023-10-13 04:03:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185892864. Throughput: 0: 1640.4, 1: 1651.9. Samples: 46488950. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:23,607][45375] Avg episode reward: [(0, '61.500'), (1, '55.210')] +[2023-10-13 04:03:23,925][46663] Updated weights for policy 1, policy_version 90731 (0.0008) +[2023-10-13 04:03:24,080][46662] Updated weights for policy 0, policy_version 90820 (0.0007) +[2023-10-13 04:03:24,294][46663] Updated weights for policy 1, policy_version 90741 (0.0009) +[2023-10-13 04:03:24,454][46662] Updated weights for policy 0, policy_version 90830 (0.0008) +[2023-10-13 04:03:24,660][46663] Updated weights for policy 1, policy_version 90751 (0.0009) +[2023-10-13 04:03:24,835][46662] Updated weights for policy 0, policy_version 90840 (0.0007) +[2023-10-13 04:03:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 185958400. Throughput: 0: 1648.7, 1: 1650.3. Samples: 46498184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:28,607][45375] Avg episode reward: [(0, '61.470'), (1, '54.420')] +[2023-10-13 04:03:28,702][46663] Updated weights for policy 1, policy_version 90761 (0.0008) +[2023-10-13 04:03:28,936][46662] Updated weights for policy 0, policy_version 90850 (0.0008) +[2023-10-13 04:03:29,082][46663] Updated weights for policy 1, policy_version 90771 (0.0008) +[2023-10-13 04:03:29,303][46662] Updated weights for policy 0, policy_version 90860 (0.0009) +[2023-10-13 04:03:29,441][46663] Updated weights for policy 1, policy_version 90781 (0.0010) +[2023-10-13 04:03:29,678][46662] Updated weights for policy 0, policy_version 90870 (0.0008) +[2023-10-13 04:03:30,051][46662] Updated weights for policy 0, policy_version 90880 (0.0007) +[2023-10-13 04:03:33,589][46663] Updated weights for policy 1, policy_version 90791 (0.0009) +[2023-10-13 04:03:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 186023936. Throughput: 0: 1648.3, 1: 1657.5. Samples: 46518866. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-13 04:03:33,608][45375] Avg episode reward: [(0, '59.700'), (1, '53.920')] +[2023-10-13 04:03:33,969][46663] Updated weights for policy 1, policy_version 90801 (0.0007) +[2023-10-13 04:03:34,134][46662] Updated weights for policy 0, policy_version 90890 (0.0009) +[2023-10-13 04:03:34,345][46663] Updated weights for policy 1, policy_version 90811 (0.0007) +[2023-10-13 04:03:34,504][46662] Updated weights for policy 0, policy_version 90900 (0.0009) +[2023-10-13 04:03:34,873][46662] Updated weights for policy 0, policy_version 90910 (0.0011) +[2023-10-13 04:03:38,437][46663] Updated weights for policy 1, policy_version 90821 (0.0008) +[2023-10-13 04:03:38,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 186089472. Throughput: 0: 1653.0, 1: 1654.6. Samples: 46539040. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:03:38,608][45375] Avg episode reward: [(0, '59.760'), (1, '54.250')] +[2023-10-13 04:03:38,794][46663] Updated weights for policy 1, policy_version 90831 (0.0007) +[2023-10-13 04:03:39,021][46662] Updated weights for policy 0, policy_version 90920 (0.0010) +[2023-10-13 04:03:39,154][46663] Updated weights for policy 1, policy_version 90841 (0.0009) +[2023-10-13 04:03:39,394][46662] Updated weights for policy 0, policy_version 90930 (0.0007) +[2023-10-13 04:03:39,766][46662] Updated weights for policy 0, policy_version 90940 (0.0007) +[2023-10-13 04:03:43,150][46663] Updated weights for policy 1, policy_version 90851 (0.0007) +[2023-10-13 04:03:43,514][46663] Updated weights for policy 1, policy_version 90861 (0.0008) +[2023-10-13 04:03:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 186155008. Throughput: 0: 1656.3, 1: 1663.0. Samples: 46548312. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:03:43,607][45375] Avg episode reward: [(0, '60.680'), (1, '52.280')] +[2023-10-13 04:03:43,764][46662] Updated weights for policy 0, policy_version 90950 (0.0009) +[2023-10-13 04:03:43,878][46663] Updated weights for policy 1, policy_version 90871 (0.0008) +[2023-10-13 04:03:44,131][46662] Updated weights for policy 0, policy_version 90960 (0.0009) +[2023-10-13 04:03:44,503][46662] Updated weights for policy 0, policy_version 90970 (0.0007) +[2023-10-13 04:03:47,931][46663] Updated weights for policy 1, policy_version 90881 (0.0008) +[2023-10-13 04:03:48,283][46663] Updated weights for policy 1, policy_version 90891 (0.0009) +[2023-10-13 04:03:48,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 186220544. Throughput: 0: 1665.3, 1: 1676.4. Samples: 46569240. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:03:48,607][45375] Avg episode reward: [(0, '61.400'), (1, '51.580')] +[2023-10-13 04:03:48,628][46662] Updated weights for policy 0, policy_version 90980 (0.0009) +[2023-10-13 04:03:48,652][46663] Updated weights for policy 1, policy_version 90901 (0.0007) +[2023-10-13 04:03:48,994][46662] Updated weights for policy 0, policy_version 90990 (0.0010) +[2023-10-13 04:03:49,018][46663] Updated weights for policy 1, policy_version 90911 (0.0007) +[2023-10-13 04:03:49,367][46662] Updated weights for policy 0, policy_version 91000 (0.0011) +[2023-10-13 04:03:52,968][46663] Updated weights for policy 1, policy_version 90921 (0.0009) +[2023-10-13 04:03:53,344][46663] Updated weights for policy 1, policy_version 90931 (0.0008) +[2023-10-13 04:03:53,507][46662] Updated weights for policy 0, policy_version 91010 (0.0010) +[2023-10-13 04:03:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 186286080. Throughput: 0: 1665.3, 1: 1674.1. Samples: 46589016. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:03:53,608][45375] Avg episode reward: [(0, '59.570'), (1, '51.360')] +[2023-10-13 04:03:53,702][46663] Updated weights for policy 1, policy_version 90941 (0.0008) +[2023-10-13 04:03:53,813][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000090944_93126656.pth... +[2023-10-13 04:03:53,843][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000089376_91521024.pth +[2023-10-13 04:03:53,877][46662] Updated weights for policy 0, policy_version 91020 (0.0009) +[2023-10-13 04:03:54,256][46662] Updated weights for policy 0, policy_version 91030 (0.0009) +[2023-10-13 04:03:54,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000091040_93224960.pth... +[2023-10-13 04:03:54,619][46662] Updated weights for policy 0, policy_version 91040 (0.0007) +[2023-10-13 04:03:54,646][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000089472_91619328.pth +[2023-10-13 04:03:57,613][46663] Updated weights for policy 1, policy_version 90951 (0.0009) +[2023-10-13 04:03:57,987][46663] Updated weights for policy 1, policy_version 90961 (0.0011) +[2023-10-13 04:03:58,354][46663] Updated weights for policy 1, policy_version 90971 (0.0009) +[2023-10-13 04:03:58,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 186384384. Throughput: 0: 1668.9, 1: 1698.2. Samples: 46599048. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:03:58,607][45375] Avg episode reward: [(0, '59.930'), (1, '51.340')] +[2023-10-13 04:03:58,683][46662] Updated weights for policy 0, policy_version 91050 (0.0008) +[2023-10-13 04:03:59,050][46662] Updated weights for policy 0, policy_version 91060 (0.0011) +[2023-10-13 04:03:59,427][46662] Updated weights for policy 0, policy_version 91070 (0.0010) +[2023-10-13 04:04:02,392][46663] Updated weights for policy 1, policy_version 90981 (0.0008) +[2023-10-13 04:04:02,747][46663] Updated weights for policy 1, policy_version 90991 (0.0008) +[2023-10-13 04:04:03,122][46663] Updated weights for policy 1, policy_version 91001 (0.0007) +[2023-10-13 04:04:03,590][46662] Updated weights for policy 0, policy_version 91080 (0.0010) +[2023-10-13 04:04:03,607][45375] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 186449920. Throughput: 0: 1676.6, 1: 1691.3. Samples: 46619686. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:04:03,607][45375] Avg episode reward: [(0, '58.080'), (1, '51.740')] +[2023-10-13 04:04:03,953][46662] Updated weights for policy 0, policy_version 91090 (0.0011) +[2023-10-13 04:04:04,315][46662] Updated weights for policy 0, policy_version 91100 (0.0010) +[2023-10-13 04:04:07,256][46663] Updated weights for policy 1, policy_version 91011 (0.0008) +[2023-10-13 04:04:07,627][46663] Updated weights for policy 1, policy_version 91021 (0.0007) +[2023-10-13 04:04:07,996][46663] Updated weights for policy 1, policy_version 91031 (0.0008) +[2023-10-13 04:04:08,394][46662] Updated weights for policy 0, policy_version 91110 (0.0010) +[2023-10-13 04:04:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 186515456. Throughput: 0: 1679.3, 1: 1661.4. Samples: 46639284. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:04:08,607][45375] Avg episode reward: [(0, '59.070'), (1, '51.620')] +[2023-10-13 04:04:08,757][46662] Updated weights for policy 0, policy_version 91120 (0.0009) +[2023-10-13 04:04:09,141][46662] Updated weights for policy 0, policy_version 91130 (0.0010) +[2023-10-13 04:04:12,289][46663] Updated weights for policy 1, policy_version 91041 (0.0008) +[2023-10-13 04:04:12,657][46663] Updated weights for policy 1, policy_version 91051 (0.0007) +[2023-10-13 04:04:13,026][46663] Updated weights for policy 1, policy_version 91061 (0.0007) +[2023-10-13 04:04:13,220][46662] Updated weights for policy 0, policy_version 91140 (0.0009) +[2023-10-13 04:04:13,387][46663] Updated weights for policy 1, policy_version 91071 (0.0009) +[2023-10-13 04:04:13,587][46662] Updated weights for policy 0, policy_version 91150 (0.0007) +[2023-10-13 04:04:13,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 186580992. Throughput: 0: 1673.1, 1: 1690.6. Samples: 46649552. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:04:13,608][45375] Avg episode reward: [(0, '58.550'), (1, '53.510')] +[2023-10-13 04:04:13,950][46662] Updated weights for policy 0, policy_version 91160 (0.0007) +[2023-10-13 04:04:17,340][46663] Updated weights for policy 1, policy_version 91081 (0.0008) +[2023-10-13 04:04:17,715][46663] Updated weights for policy 1, policy_version 91091 (0.0007) +[2023-10-13 04:04:18,005][46662] Updated weights for policy 0, policy_version 91170 (0.0009) +[2023-10-13 04:04:18,074][46663] Updated weights for policy 1, policy_version 91101 (0.0009) +[2023-10-13 04:04:18,385][46662] Updated weights for policy 0, policy_version 91180 (0.0008) +[2023-10-13 04:04:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 186646528. Throughput: 0: 1678.0, 1: 1683.7. Samples: 46670140. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:04:18,607][45375] Avg episode reward: [(0, '58.370'), (1, '53.330')] +[2023-10-13 04:04:18,756][46662] Updated weights for policy 0, policy_version 91190 (0.0009) +[2023-10-13 04:04:19,128][46662] Updated weights for policy 0, policy_version 91200 (0.0008) +[2023-10-13 04:04:22,417][46663] Updated weights for policy 1, policy_version 91111 (0.0009) +[2023-10-13 04:04:22,799][46663] Updated weights for policy 1, policy_version 91121 (0.0008) +[2023-10-13 04:04:23,086][46662] Updated weights for policy 0, policy_version 91210 (0.0008) +[2023-10-13 04:04:23,163][46663] Updated weights for policy 1, policy_version 91131 (0.0009) +[2023-10-13 04:04:23,460][46662] Updated weights for policy 0, policy_version 91220 (0.0009) +[2023-10-13 04:04:23,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13218.3). Total num frames: 186712064. Throughput: 0: 1682.2, 1: 1670.0. Samples: 46689890. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:04:23,607][45375] Avg episode reward: [(0, '57.990'), (1, '53.800')] +[2023-10-13 04:04:23,824][46662] Updated weights for policy 0, policy_version 91230 (0.0007) +[2023-10-13 04:04:27,204][46663] Updated weights for policy 1, policy_version 91141 (0.0007) +[2023-10-13 04:04:27,574][46663] Updated weights for policy 1, policy_version 91151 (0.0009) +[2023-10-13 04:04:27,941][46663] Updated weights for policy 1, policy_version 91161 (0.0009) +[2023-10-13 04:04:27,949][46662] Updated weights for policy 0, policy_version 91240 (0.0009) +[2023-10-13 04:04:28,323][46662] Updated weights for policy 0, policy_version 91250 (0.0009) +[2023-10-13 04:04:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 186777600. Throughput: 0: 1687.3, 1: 1689.8. Samples: 46700282. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-13 04:04:28,607][45375] Avg episode reward: [(0, '58.170'), (1, '54.910')] +[2023-10-13 04:04:28,691][46662] Updated weights for policy 0, policy_version 91260 (0.0008) +[2023-10-13 04:04:32,023][46663] Updated weights for policy 1, policy_version 91171 (0.0010) +[2023-10-13 04:04:32,391][46663] Updated weights for policy 1, policy_version 91181 (0.0010) +[2023-10-13 04:04:32,612][46662] Updated weights for policy 0, policy_version 91270 (0.0009) +[2023-10-13 04:04:32,754][46663] Updated weights for policy 1, policy_version 91191 (0.0009) +[2023-10-13 04:04:32,981][46662] Updated weights for policy 0, policy_version 91280 (0.0010) +[2023-10-13 04:04:33,345][46662] Updated weights for policy 0, policy_version 91290 (0.0007) +[2023-10-13 04:04:33,606][45375] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 186875904. Throughput: 0: 1683.5, 1: 1676.0. Samples: 46720414. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:04:33,607][45375] Avg episode reward: [(0, '58.630'), (1, '53.990')] +[2023-10-13 04:04:36,703][46663] Updated weights for policy 1, policy_version 91201 (0.0008) +[2023-10-13 04:04:37,077][46663] Updated weights for policy 1, policy_version 91211 (0.0009) +[2023-10-13 04:04:37,449][46663] Updated weights for policy 1, policy_version 91221 (0.0009) +[2023-10-13 04:04:37,554][46662] Updated weights for policy 0, policy_version 91300 (0.0009) +[2023-10-13 04:04:37,812][46663] Updated weights for policy 1, policy_version 91231 (0.0009) +[2023-10-13 04:04:37,917][46662] Updated weights for policy 0, policy_version 91310 (0.0007) +[2023-10-13 04:04:38,291][46662] Updated weights for policy 0, policy_version 91320 (0.0007) +[2023-10-13 04:04:38,607][45375] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 186941440. Throughput: 0: 1679.6, 1: 1676.9. Samples: 46740058. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:04:38,607][45375] Avg episode reward: [(0, '59.020'), (1, '53.700')] +[2023-10-13 04:04:41,784][46663] Updated weights for policy 1, policy_version 91241 (0.0007) +[2023-10-13 04:04:42,158][46663] Updated weights for policy 1, policy_version 91251 (0.0010) +[2023-10-13 04:04:42,334][46662] Updated weights for policy 0, policy_version 91330 (0.0007) +[2023-10-13 04:04:42,528][46663] Updated weights for policy 1, policy_version 91261 (0.0009) +[2023-10-13 04:04:42,693][46662] Updated weights for policy 0, policy_version 91340 (0.0008) +[2023-10-13 04:04:43,059][46662] Updated weights for policy 0, policy_version 91350 (0.0007) +[2023-10-13 04:04:43,427][46662] Updated weights for policy 0, policy_version 91360 (0.0007) +[2023-10-13 04:04:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 187006976. Throughput: 0: 1685.6, 1: 1681.6. Samples: 46750572. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:04:43,608][45375] Avg episode reward: [(0, '59.530'), (1, '54.210')] +[2023-10-13 04:04:46,464][46663] Updated weights for policy 1, policy_version 91271 (0.0008) +[2023-10-13 04:04:46,833][46663] Updated weights for policy 1, policy_version 91281 (0.0008) +[2023-10-13 04:04:47,205][46663] Updated weights for policy 1, policy_version 91291 (0.0007) +[2023-10-13 04:04:47,522][46662] Updated weights for policy 0, policy_version 91370 (0.0007) +[2023-10-13 04:04:47,904][46662] Updated weights for policy 0, policy_version 91380 (0.0010) +[2023-10-13 04:04:48,267][46662] Updated weights for policy 0, policy_version 91390 (0.0011) +[2023-10-13 04:04:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 187072512. Throughput: 0: 1684.5, 1: 1662.6. Samples: 46770306. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:04:48,607][45375] Avg episode reward: [(0, '59.900'), (1, '53.250')] +[2023-10-13 04:04:51,423][46663] Updated weights for policy 1, policy_version 91301 (0.0010) +[2023-10-13 04:04:51,792][46663] Updated weights for policy 1, policy_version 91311 (0.0008) +[2023-10-13 04:04:52,157][46663] Updated weights for policy 1, policy_version 91321 (0.0009) +[2023-10-13 04:04:52,354][46662] Updated weights for policy 0, policy_version 91400 (0.0010) +[2023-10-13 04:04:52,733][46662] Updated weights for policy 0, policy_version 91410 (0.0011) +[2023-10-13 04:04:53,101][46662] Updated weights for policy 0, policy_version 91420 (0.0011) +[2023-10-13 04:04:53,606][45375] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 187138048. Throughput: 0: 1666.6, 1: 1688.4. Samples: 46790260. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:04:53,607][45375] Avg episode reward: [(0, '57.570'), (1, '53.610')] +[2023-10-13 04:04:56,194][46663] Updated weights for policy 1, policy_version 91331 (0.0008) +[2023-10-13 04:04:56,564][46663] Updated weights for policy 1, policy_version 91341 (0.0007) +[2023-10-13 04:04:56,944][46663] Updated weights for policy 1, policy_version 91351 (0.0009) +[2023-10-13 04:04:57,225][46662] Updated weights for policy 0, policy_version 91430 (0.0009) +[2023-10-13 04:04:57,598][46662] Updated weights for policy 0, policy_version 91440 (0.0007) +[2023-10-13 04:04:57,980][46662] Updated weights for policy 0, policy_version 91450 (0.0008) +[2023-10-13 04:04:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 187203584. Throughput: 0: 1686.3, 1: 1682.8. Samples: 46801164. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:04:58,607][45375] Avg episode reward: [(0, '57.700'), (1, '53.300')] +[2023-10-13 04:05:01,044][46663] Updated weights for policy 1, policy_version 91361 (0.0007) +[2023-10-13 04:05:01,405][46663] Updated weights for policy 1, policy_version 91371 (0.0009) +[2023-10-13 04:05:01,775][46663] Updated weights for policy 1, policy_version 91381 (0.0009) +[2023-10-13 04:05:02,156][46663] Updated weights for policy 1, policy_version 91391 (0.0009) +[2023-10-13 04:05:02,233][46662] Updated weights for policy 0, policy_version 91460 (0.0008) +[2023-10-13 04:05:02,603][46662] Updated weights for policy 0, policy_version 91470 (0.0010) +[2023-10-13 04:05:02,978][46662] Updated weights for policy 0, policy_version 91480 (0.0008) +[2023-10-13 04:05:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 187269120. Throughput: 0: 1682.0, 1: 1664.7. Samples: 46820742. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:05:03,607][45375] Avg episode reward: [(0, '57.380'), (1, '52.570')] +[2023-10-13 04:05:06,096][46663] Updated weights for policy 1, policy_version 91401 (0.0010) +[2023-10-13 04:05:06,467][46663] Updated weights for policy 1, policy_version 91411 (0.0009) +[2023-10-13 04:05:06,824][46663] Updated weights for policy 1, policy_version 91421 (0.0007) +[2023-10-13 04:05:06,918][46662] Updated weights for policy 0, policy_version 91490 (0.0009) +[2023-10-13 04:05:07,291][46662] Updated weights for policy 0, policy_version 91500 (0.0009) +[2023-10-13 04:05:07,658][46662] Updated weights for policy 0, policy_version 91510 (0.0011) +[2023-10-13 04:05:08,029][46662] Updated weights for policy 0, policy_version 91520 (0.0009) +[2023-10-13 04:05:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 187334656. Throughput: 0: 1663.6, 1: 1686.9. Samples: 46840664. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:05:08,607][45375] Avg episode reward: [(0, '57.280'), (1, '52.900')] +[2023-10-13 04:05:11,017][46663] Updated weights for policy 1, policy_version 91431 (0.0009) +[2023-10-13 04:05:11,399][46663] Updated weights for policy 1, policy_version 91441 (0.0007) +[2023-10-13 04:05:11,765][46663] Updated weights for policy 1, policy_version 91451 (0.0007) +[2023-10-13 04:05:11,981][46662] Updated weights for policy 0, policy_version 91530 (0.0007) +[2023-10-13 04:05:12,343][46662] Updated weights for policy 0, policy_version 91540 (0.0008) +[2023-10-13 04:05:12,721][46662] Updated weights for policy 0, policy_version 91550 (0.0008) +[2023-10-13 04:05:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 187400192. Throughput: 0: 1684.8, 1: 1670.3. Samples: 46851262. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:05:13,607][45375] Avg episode reward: [(0, '58.640'), (1, '50.880')] +[2023-10-13 04:05:15,748][46663] Updated weights for policy 1, policy_version 91461 (0.0009) +[2023-10-13 04:05:16,126][46663] Updated weights for policy 1, policy_version 91471 (0.0009) +[2023-10-13 04:05:16,491][46663] Updated weights for policy 1, policy_version 91481 (0.0009) +[2023-10-13 04:05:16,846][46662] Updated weights for policy 0, policy_version 91560 (0.0007) +[2023-10-13 04:05:17,227][46662] Updated weights for policy 0, policy_version 91570 (0.0010) +[2023-10-13 04:05:17,589][46662] Updated weights for policy 0, policy_version 91580 (0.0009) +[2023-10-13 04:05:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 187465728. Throughput: 0: 1683.2, 1: 1662.8. Samples: 46870988. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:05:18,607][45375] Avg episode reward: [(0, '57.610'), (1, '50.000')] +[2023-10-13 04:05:20,512][46663] Updated weights for policy 1, policy_version 91491 (0.0008) +[2023-10-13 04:05:20,876][46663] Updated weights for policy 1, policy_version 91501 (0.0009) +[2023-10-13 04:05:21,236][46663] Updated weights for policy 1, policy_version 91511 (0.0010) +[2023-10-13 04:05:21,580][46662] Updated weights for policy 0, policy_version 91590 (0.0008) +[2023-10-13 04:05:21,950][46662] Updated weights for policy 0, policy_version 91600 (0.0011) +[2023-10-13 04:05:22,317][46662] Updated weights for policy 0, policy_version 91610 (0.0009) +[2023-10-13 04:05:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 187531264. Throughput: 0: 1669.7, 1: 1677.0. Samples: 46890660. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:05:23,608][45375] Avg episode reward: [(0, '58.900'), (1, '51.340')] +[2023-10-13 04:05:25,489][46663] Updated weights for policy 1, policy_version 91521 (0.0008) +[2023-10-13 04:05:25,843][46663] Updated weights for policy 1, policy_version 91531 (0.0009) +[2023-10-13 04:05:26,219][46663] Updated weights for policy 1, policy_version 91541 (0.0008) +[2023-10-13 04:05:26,297][46662] Updated weights for policy 0, policy_version 91620 (0.0009) +[2023-10-13 04:05:26,577][46663] Updated weights for policy 1, policy_version 91551 (0.0009) +[2023-10-13 04:05:26,662][46662] Updated weights for policy 0, policy_version 91630 (0.0007) +[2023-10-13 04:05:27,042][46662] Updated weights for policy 0, policy_version 91640 (0.0009) +[2023-10-13 04:05:28,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 187596800. Throughput: 0: 1692.7, 1: 1655.5. Samples: 46901242. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-13 04:05:28,607][45375] Avg episode reward: [(0, '59.580'), (1, '49.590')] +[2023-10-13 04:05:30,561][46663] Updated weights for policy 1, policy_version 91561 (0.0011) +[2023-10-13 04:05:30,934][46663] Updated weights for policy 1, policy_version 91571 (0.0009) +[2023-10-13 04:05:31,191][46662] Updated weights for policy 0, policy_version 91650 (0.0007) +[2023-10-13 04:05:31,299][46663] Updated weights for policy 1, policy_version 91581 (0.0009) +[2023-10-13 04:05:31,572][46662] Updated weights for policy 0, policy_version 91660 (0.0010) +[2023-10-13 04:05:31,934][46662] Updated weights for policy 0, policy_version 91670 (0.0008) +[2023-10-13 04:05:32,312][46662] Updated weights for policy 0, policy_version 91680 (0.0008) +[2023-10-13 04:05:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 187662336. Throughput: 0: 1673.4, 1: 1673.1. Samples: 46920896. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:05:33,607][45375] Avg episode reward: [(0, '58.400'), (1, '49.430')] +[2023-10-13 04:05:35,673][46663] Updated weights for policy 1, policy_version 91591 (0.0008) +[2023-10-13 04:05:36,048][46663] Updated weights for policy 1, policy_version 91601 (0.0007) +[2023-10-13 04:05:36,393][46662] Updated weights for policy 0, policy_version 91690 (0.0010) +[2023-10-13 04:05:36,407][46663] Updated weights for policy 1, policy_version 91611 (0.0007) +[2023-10-13 04:05:36,753][46662] Updated weights for policy 0, policy_version 91700 (0.0009) +[2023-10-13 04:05:37,115][46662] Updated weights for policy 0, policy_version 91710 (0.0008) +[2023-10-13 04:05:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 187727872. Throughput: 0: 1676.8, 1: 1670.5. Samples: 46940886. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:05:38,607][45375] Avg episode reward: [(0, '59.310'), (1, '48.400')] +[2023-10-13 04:05:40,641][46663] Updated weights for policy 1, policy_version 91621 (0.0008) +[2023-10-13 04:05:41,008][46663] Updated weights for policy 1, policy_version 91631 (0.0010) +[2023-10-13 04:05:41,177][46662] Updated weights for policy 0, policy_version 91720 (0.0007) +[2023-10-13 04:05:41,378][46663] Updated weights for policy 1, policy_version 91641 (0.0007) +[2023-10-13 04:05:41,555][46662] Updated weights for policy 0, policy_version 91730 (0.0007) +[2023-10-13 04:05:41,929][46662] Updated weights for policy 0, policy_version 91740 (0.0009) +[2023-10-13 04:05:43,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 187793408. Throughput: 0: 1687.8, 1: 1652.3. Samples: 46951468. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:05:43,607][45375] Avg episode reward: [(0, '60.780'), (1, '49.830')] +[2023-10-13 04:05:45,455][46663] Updated weights for policy 1, policy_version 91651 (0.0007) +[2023-10-13 04:05:45,817][46663] Updated weights for policy 1, policy_version 91661 (0.0007) +[2023-10-13 04:05:46,020][46662] Updated weights for policy 0, policy_version 91750 (0.0008) +[2023-10-13 04:05:46,180][46663] Updated weights for policy 1, policy_version 91671 (0.0009) +[2023-10-13 04:05:46,390][46662] Updated weights for policy 0, policy_version 91760 (0.0009) +[2023-10-13 04:05:46,762][46662] Updated weights for policy 0, policy_version 91770 (0.0008) +[2023-10-13 04:05:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 187858944. Throughput: 0: 1664.7, 1: 1668.8. Samples: 46970748. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:05:48,608][45375] Avg episode reward: [(0, '63.360'), (1, '49.900')] +[2023-10-13 04:05:50,313][46663] Updated weights for policy 1, policy_version 91681 (0.0008) +[2023-10-13 04:05:50,624][46662] Updated weights for policy 0, policy_version 91780 (0.0008) +[2023-10-13 04:05:50,678][46663] Updated weights for policy 1, policy_version 91691 (0.0010) +[2023-10-13 04:05:51,001][46662] Updated weights for policy 0, policy_version 91790 (0.0008) +[2023-10-13 04:05:51,035][46663] Updated weights for policy 1, policy_version 91701 (0.0007) +[2023-10-13 04:05:51,369][46662] Updated weights for policy 0, policy_version 91800 (0.0009) +[2023-10-13 04:05:51,396][46663] Updated weights for policy 1, policy_version 91711 (0.0007) +[2023-10-13 04:05:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 187924480. Throughput: 0: 1682.9, 1: 1664.3. Samples: 46991288. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:05:53,608][45375] Avg episode reward: [(0, '63.420'), (1, '49.440')] +[2023-10-13 04:05:53,620][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000091712_93913088.pth... +[2023-10-13 04:05:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000091808_94011392.pth... 
+[2023-10-13 04:05:53,649][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000090144_92307456.pth +[2023-10-13 04:05:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000090240_92405760.pth +[2023-10-13 04:05:55,410][46663] Updated weights for policy 1, policy_version 91721 (0.0008) +[2023-10-13 04:05:55,468][46662] Updated weights for policy 0, policy_version 91810 (0.0007) +[2023-10-13 04:05:55,778][46663] Updated weights for policy 1, policy_version 91731 (0.0007) +[2023-10-13 04:05:55,827][46662] Updated weights for policy 0, policy_version 91820 (0.0008) +[2023-10-13 04:05:56,145][46663] Updated weights for policy 1, policy_version 91741 (0.0007) +[2023-10-13 04:05:56,193][46662] Updated weights for policy 0, policy_version 91830 (0.0007) +[2023-10-13 04:05:56,563][46662] Updated weights for policy 0, policy_version 91840 (0.0009) +[2023-10-13 04:05:58,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 187990016. Throughput: 0: 1683.8, 1: 1653.7. Samples: 47001452. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:05:58,607][45375] Avg episode reward: [(0, '62.520'), (1, '50.280')] +[2023-10-13 04:06:00,231][46663] Updated weights for policy 1, policy_version 91751 (0.0009) +[2023-10-13 04:06:00,596][46663] Updated weights for policy 1, policy_version 91761 (0.0010) +[2023-10-13 04:06:00,640][46662] Updated weights for policy 0, policy_version 91850 (0.0010) +[2023-10-13 04:06:00,956][46663] Updated weights for policy 1, policy_version 91771 (0.0011) +[2023-10-13 04:06:01,004][46662] Updated weights for policy 0, policy_version 91860 (0.0008) +[2023-10-13 04:06:01,379][46662] Updated weights for policy 0, policy_version 91870 (0.0007) +[2023-10-13 04:06:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188055552. Throughput: 0: 1668.8, 1: 1671.8. Samples: 47021318. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:06:03,607][45375] Avg episode reward: [(0, '63.870'), (1, '51.000')] +[2023-10-13 04:06:04,928][46663] Updated weights for policy 1, policy_version 91781 (0.0009) +[2023-10-13 04:06:05,325][46663] Updated weights for policy 1, policy_version 91791 (0.0009) +[2023-10-13 04:06:05,395][46662] Updated weights for policy 0, policy_version 91880 (0.0008) +[2023-10-13 04:06:05,699][46663] Updated weights for policy 1, policy_version 91801 (0.0008) +[2023-10-13 04:06:05,761][46662] Updated weights for policy 0, policy_version 91890 (0.0008) +[2023-10-13 04:06:06,132][46662] Updated weights for policy 0, policy_version 91900 (0.0009) +[2023-10-13 04:06:08,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188121088. Throughput: 0: 1692.4, 1: 1674.2. Samples: 47042158. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:06:08,607][45375] Avg episode reward: [(0, '62.870'), (1, '51.830')] +[2023-10-13 04:06:09,740][46663] Updated weights for policy 1, policy_version 91811 (0.0009) +[2023-10-13 04:06:10,101][46663] Updated weights for policy 1, policy_version 91821 (0.0008) +[2023-10-13 04:06:10,158][46662] Updated weights for policy 0, policy_version 91910 (0.0007) +[2023-10-13 04:06:10,468][46663] Updated weights for policy 1, policy_version 91831 (0.0008) +[2023-10-13 04:06:10,523][46662] Updated weights for policy 0, policy_version 91920 (0.0007) +[2023-10-13 04:06:10,888][46662] Updated weights for policy 0, policy_version 91930 (0.0008) +[2023-10-13 04:06:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188186624. Throughput: 0: 1673.3, 1: 1670.3. Samples: 47051702. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:06:13,607][45375] Avg episode reward: [(0, '62.650'), (1, '52.230')] +[2023-10-13 04:06:14,646][46663] Updated weights for policy 1, policy_version 91841 (0.0008) +[2023-10-13 04:06:15,017][46663] Updated weights for policy 1, policy_version 91851 (0.0008) +[2023-10-13 04:06:15,075][46662] Updated weights for policy 0, policy_version 91940 (0.0008) +[2023-10-13 04:06:15,387][46663] Updated weights for policy 1, policy_version 91861 (0.0008) +[2023-10-13 04:06:15,446][46662] Updated weights for policy 0, policy_version 91950 (0.0008) +[2023-10-13 04:06:15,748][46663] Updated weights for policy 1, policy_version 91871 (0.0009) +[2023-10-13 04:06:15,807][46662] Updated weights for policy 0, policy_version 91960 (0.0009) +[2023-10-13 04:06:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188252160. Throughput: 0: 1676.6, 1: 1670.4. Samples: 47071508. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:06:18,607][45375] Avg episode reward: [(0, '63.190'), (1, '52.310')] +[2023-10-13 04:06:19,770][46663] Updated weights for policy 1, policy_version 91881 (0.0010) +[2023-10-13 04:06:19,953][46662] Updated weights for policy 0, policy_version 91970 (0.0009) +[2023-10-13 04:06:20,126][46663] Updated weights for policy 1, policy_version 91891 (0.0007) +[2023-10-13 04:06:20,317][46662] Updated weights for policy 0, policy_version 91980 (0.0008) +[2023-10-13 04:06:20,493][46663] Updated weights for policy 1, policy_version 91901 (0.0007) +[2023-10-13 04:06:20,680][46662] Updated weights for policy 0, policy_version 91990 (0.0009) +[2023-10-13 04:06:21,048][46662] Updated weights for policy 0, policy_version 92000 (0.0008) +[2023-10-13 04:06:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 188317696. Throughput: 0: 1683.2, 1: 1674.5. Samples: 47091986. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:06:23,607][45375] Avg episode reward: [(0, '64.670'), (1, '53.940')] +[2023-10-13 04:06:24,543][46663] Updated weights for policy 1, policy_version 91911 (0.0008) +[2023-10-13 04:06:24,908][46663] Updated weights for policy 1, policy_version 91921 (0.0008) +[2023-10-13 04:06:25,103][46662] Updated weights for policy 0, policy_version 92010 (0.0008) +[2023-10-13 04:06:25,277][46663] Updated weights for policy 1, policy_version 91931 (0.0008) +[2023-10-13 04:06:25,479][46662] Updated weights for policy 0, policy_version 92020 (0.0008) +[2023-10-13 04:06:25,854][46662] Updated weights for policy 0, policy_version 92030 (0.0010) +[2023-10-13 04:06:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188383232. Throughput: 0: 1658.2, 1: 1671.7. Samples: 47101316. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-13 04:06:28,607][45375] Avg episode reward: [(0, '62.950'), (1, '54.680')] +[2023-10-13 04:06:29,293][46663] Updated weights for policy 1, policy_version 91941 (0.0009) +[2023-10-13 04:06:29,664][46663] Updated weights for policy 1, policy_version 91951 (0.0007) +[2023-10-13 04:06:30,024][46663] Updated weights for policy 1, policy_version 91961 (0.0009) +[2023-10-13 04:06:30,079][46662] Updated weights for policy 0, policy_version 92040 (0.0011) +[2023-10-13 04:06:30,444][46662] Updated weights for policy 0, policy_version 92050 (0.0009) +[2023-10-13 04:06:30,820][46662] Updated weights for policy 0, policy_version 92060 (0.0011) +[2023-10-13 04:06:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188448768. Throughput: 0: 1673.1, 1: 1678.1. Samples: 47121552. 
Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:06:33,607][45375] Avg episode reward: [(0, '63.630'), (1, '53.960')] +[2023-10-13 04:06:34,085][46663] Updated weights for policy 1, policy_version 91971 (0.0009) +[2023-10-13 04:06:34,460][46663] Updated weights for policy 1, policy_version 91981 (0.0010) +[2023-10-13 04:06:34,826][46663] Updated weights for policy 1, policy_version 91991 (0.0007) +[2023-10-13 04:06:34,852][46662] Updated weights for policy 0, policy_version 92070 (0.0008) +[2023-10-13 04:06:35,231][46662] Updated weights for policy 0, policy_version 92080 (0.0008) +[2023-10-13 04:06:35,603][46662] Updated weights for policy 0, policy_version 92090 (0.0008) +[2023-10-13 04:06:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188514304. Throughput: 0: 1671.7, 1: 1682.1. Samples: 47142212. Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:06:38,607][45375] Avg episode reward: [(0, '62.720'), (1, '56.150')] +[2023-10-13 04:06:38,961][46663] Updated weights for policy 1, policy_version 92001 (0.0009) +[2023-10-13 04:06:39,325][46663] Updated weights for policy 1, policy_version 92011 (0.0011) +[2023-10-13 04:06:39,693][46662] Updated weights for policy 0, policy_version 92100 (0.0010) +[2023-10-13 04:06:39,697][46663] Updated weights for policy 1, policy_version 92021 (0.0011) +[2023-10-13 04:06:40,070][46662] Updated weights for policy 0, policy_version 92110 (0.0010) +[2023-10-13 04:06:40,071][46663] Updated weights for policy 1, policy_version 92031 (0.0007) +[2023-10-13 04:06:40,428][46662] Updated weights for policy 0, policy_version 92120 (0.0010) +[2023-10-13 04:06:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 188579840. Throughput: 0: 1643.8, 1: 1681.9. Samples: 47151110. 
Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:06:43,608][45375] Avg episode reward: [(0, '60.550'), (1, '58.010')] +[2023-10-13 04:06:44,253][46663] Updated weights for policy 1, policy_version 92041 (0.0008) +[2023-10-13 04:06:44,581][46662] Updated weights for policy 0, policy_version 92130 (0.0009) +[2023-10-13 04:06:44,627][46663] Updated weights for policy 1, policy_version 92051 (0.0007) +[2023-10-13 04:06:44,948][46662] Updated weights for policy 0, policy_version 92140 (0.0008) +[2023-10-13 04:06:44,988][46663] Updated weights for policy 1, policy_version 92061 (0.0007) +[2023-10-13 04:06:45,314][46662] Updated weights for policy 0, policy_version 92150 (0.0008) +[2023-10-13 04:06:45,684][46662] Updated weights for policy 0, policy_version 92160 (0.0007) +[2023-10-13 04:06:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188645376. Throughput: 0: 1662.0, 1: 1678.3. Samples: 47171632. Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:06:48,607][45375] Avg episode reward: [(0, '61.620'), (1, '59.680')] +[2023-10-13 04:06:49,262][46663] Updated weights for policy 1, policy_version 92071 (0.0008) +[2023-10-13 04:06:49,627][46663] Updated weights for policy 1, policy_version 92081 (0.0007) +[2023-10-13 04:06:49,897][46662] Updated weights for policy 0, policy_version 92170 (0.0007) +[2023-10-13 04:06:49,998][46663] Updated weights for policy 1, policy_version 92091 (0.0007) +[2023-10-13 04:06:50,263][46662] Updated weights for policy 0, policy_version 92180 (0.0007) +[2023-10-13 04:06:50,639][46662] Updated weights for policy 0, policy_version 92190 (0.0007) +[2023-10-13 04:06:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 188710912. Throughput: 0: 1653.2, 1: 1673.5. Samples: 47191862. 
Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:06:53,608][45375] Avg episode reward: [(0, '61.390'), (1, '60.340')] +[2023-10-13 04:06:54,165][46663] Updated weights for policy 1, policy_version 92101 (0.0007) +[2023-10-13 04:06:54,557][46663] Updated weights for policy 1, policy_version 92111 (0.0007) +[2023-10-13 04:06:54,747][46662] Updated weights for policy 0, policy_version 92200 (0.0009) +[2023-10-13 04:06:54,920][46663] Updated weights for policy 1, policy_version 92121 (0.0010) +[2023-10-13 04:06:55,120][46662] Updated weights for policy 0, policy_version 92210 (0.0008) +[2023-10-13 04:06:55,481][46662] Updated weights for policy 0, policy_version 92220 (0.0008) +[2023-10-13 04:06:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188776448. Throughput: 0: 1642.0, 1: 1671.0. Samples: 47200786. Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:06:58,607][45375] Avg episode reward: [(0, '60.770'), (1, '59.900')] +[2023-10-13 04:06:59,063][46663] Updated weights for policy 1, policy_version 92131 (0.0008) +[2023-10-13 04:06:59,423][46663] Updated weights for policy 1, policy_version 92141 (0.0009) +[2023-10-13 04:06:59,532][46662] Updated weights for policy 0, policy_version 92230 (0.0008) +[2023-10-13 04:06:59,790][46663] Updated weights for policy 1, policy_version 92151 (0.0009) +[2023-10-13 04:06:59,901][46662] Updated weights for policy 0, policy_version 92240 (0.0008) +[2023-10-13 04:07:00,270][46662] Updated weights for policy 0, policy_version 92250 (0.0007) +[2023-10-13 04:07:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188841984. Throughput: 0: 1656.3, 1: 1673.1. Samples: 47221334. 
Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:07:03,608][45375] Avg episode reward: [(0, '59.750'), (1, '58.980')] +[2023-10-13 04:07:03,799][46663] Updated weights for policy 1, policy_version 92161 (0.0008) +[2023-10-13 04:07:04,168][46663] Updated weights for policy 1, policy_version 92171 (0.0008) +[2023-10-13 04:07:04,431][46662] Updated weights for policy 0, policy_version 92260 (0.0010) +[2023-10-13 04:07:04,530][46663] Updated weights for policy 1, policy_version 92181 (0.0007) +[2023-10-13 04:07:04,795][46662] Updated weights for policy 0, policy_version 92270 (0.0009) +[2023-10-13 04:07:04,892][46663] Updated weights for policy 1, policy_version 92191 (0.0007) +[2023-10-13 04:07:05,166][46662] Updated weights for policy 0, policy_version 92280 (0.0008) +[2023-10-13 04:07:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188907520. Throughput: 0: 1663.2, 1: 1673.5. Samples: 47242140. Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:07:08,607][45375] Avg episode reward: [(0, '60.350'), (1, '58.200')] +[2023-10-13 04:07:08,897][46663] Updated weights for policy 1, policy_version 92201 (0.0008) +[2023-10-13 04:07:09,265][46663] Updated weights for policy 1, policy_version 92211 (0.0009) +[2023-10-13 04:07:09,300][46662] Updated weights for policy 0, policy_version 92290 (0.0009) +[2023-10-13 04:07:09,628][46663] Updated weights for policy 1, policy_version 92221 (0.0007) +[2023-10-13 04:07:09,669][46662] Updated weights for policy 0, policy_version 92300 (0.0007) +[2023-10-13 04:07:10,034][46662] Updated weights for policy 0, policy_version 92310 (0.0009) +[2023-10-13 04:07:10,401][46662] Updated weights for policy 0, policy_version 92320 (0.0008) +[2023-10-13 04:07:13,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188973056. Throughput: 0: 1659.0, 1: 1672.0. Samples: 47251210. 
Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:07:13,607][45375] Avg episode reward: [(0, '60.400'), (1, '58.180')] +[2023-10-13 04:07:13,735][46663] Updated weights for policy 1, policy_version 92231 (0.0009) +[2023-10-13 04:07:14,114][46663] Updated weights for policy 1, policy_version 92241 (0.0009) +[2023-10-13 04:07:14,460][46662] Updated weights for policy 0, policy_version 92330 (0.0009) +[2023-10-13 04:07:14,484][46663] Updated weights for policy 1, policy_version 92251 (0.0010) +[2023-10-13 04:07:14,830][46662] Updated weights for policy 0, policy_version 92340 (0.0009) +[2023-10-13 04:07:15,201][46662] Updated weights for policy 0, policy_version 92350 (0.0011) +[2023-10-13 04:07:18,582][46663] Updated weights for policy 1, policy_version 92261 (0.0010) +[2023-10-13 04:07:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 189038592. Throughput: 0: 1673.5, 1: 1667.4. Samples: 47271892. Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:07:18,607][45375] Avg episode reward: [(0, '60.460'), (1, '56.690')] +[2023-10-13 04:07:18,943][46663] Updated weights for policy 1, policy_version 92271 (0.0007) +[2023-10-13 04:07:19,311][46663] Updated weights for policy 1, policy_version 92281 (0.0008) +[2023-10-13 04:07:19,314][46662] Updated weights for policy 0, policy_version 92360 (0.0009) +[2023-10-13 04:07:19,671][46662] Updated weights for policy 0, policy_version 92370 (0.0008) +[2023-10-13 04:07:20,042][46662] Updated weights for policy 0, policy_version 92380 (0.0007) +[2023-10-13 04:07:23,267][46663] Updated weights for policy 1, policy_version 92291 (0.0008) +[2023-10-13 04:07:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 189104128. Throughput: 0: 1671.9, 1: 1668.5. Samples: 47292530. 
Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:07:23,608][45375] Avg episode reward: [(0, '60.960'), (1, '56.460')] +[2023-10-13 04:07:23,627][46663] Updated weights for policy 1, policy_version 92301 (0.0009) +[2023-10-13 04:07:24,003][46663] Updated weights for policy 1, policy_version 92311 (0.0010) +[2023-10-13 04:07:24,250][46662] Updated weights for policy 0, policy_version 92390 (0.0008) +[2023-10-13 04:07:24,613][46662] Updated weights for policy 0, policy_version 92400 (0.0009) +[2023-10-13 04:07:24,983][46662] Updated weights for policy 0, policy_version 92410 (0.0009) +[2023-10-13 04:07:28,078][46663] Updated weights for policy 1, policy_version 92321 (0.0007) +[2023-10-13 04:07:28,433][46663] Updated weights for policy 1, policy_version 92331 (0.0009) +[2023-10-13 04:07:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 189169664. Throughput: 0: 1674.8, 1: 1675.3. Samples: 47301864. Policy #0 lag: (min: 23.0, avg: 31.0, max: 55.0) +[2023-10-13 04:07:28,608][45375] Avg episode reward: [(0, '59.890'), (1, '56.470')] +[2023-10-13 04:07:28,809][46663] Updated weights for policy 1, policy_version 92341 (0.0007) +[2023-10-13 04:07:29,155][46662] Updated weights for policy 0, policy_version 92420 (0.0008) +[2023-10-13 04:07:29,182][46663] Updated weights for policy 1, policy_version 92351 (0.0007) +[2023-10-13 04:07:29,527][46662] Updated weights for policy 0, policy_version 92430 (0.0009) +[2023-10-13 04:07:29,905][46662] Updated weights for policy 0, policy_version 92440 (0.0011) +[2023-10-13 04:07:33,314][46663] Updated weights for policy 1, policy_version 92361 (0.0008) +[2023-10-13 04:07:33,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 189235200. Throughput: 0: 1673.2, 1: 1677.6. Samples: 47322422. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:07:33,607][45375] Avg episode reward: [(0, '59.140'), (1, '56.050')] +[2023-10-13 04:07:33,672][46663] Updated weights for policy 1, policy_version 92371 (0.0008) +[2023-10-13 04:07:33,989][46662] Updated weights for policy 0, policy_version 92450 (0.0009) +[2023-10-13 04:07:34,035][46663] Updated weights for policy 1, policy_version 92381 (0.0008) +[2023-10-13 04:07:34,350][46662] Updated weights for policy 0, policy_version 92460 (0.0010) +[2023-10-13 04:07:34,731][46662] Updated weights for policy 0, policy_version 92470 (0.0009) +[2023-10-13 04:07:35,099][46662] Updated weights for policy 0, policy_version 92480 (0.0008) +[2023-10-13 04:07:38,105][46663] Updated weights for policy 1, policy_version 92391 (0.0008) +[2023-10-13 04:07:38,478][46663] Updated weights for policy 1, policy_version 92401 (0.0008) +[2023-10-13 04:07:38,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 189300736. Throughput: 0: 1675.7, 1: 1673.9. Samples: 47342592. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:07:38,607][45375] Avg episode reward: [(0, '58.370'), (1, '55.690')] +[2023-10-13 04:07:38,838][46663] Updated weights for policy 1, policy_version 92411 (0.0008) +[2023-10-13 04:07:39,272][46662] Updated weights for policy 0, policy_version 92490 (0.0009) +[2023-10-13 04:07:39,649][46662] Updated weights for policy 0, policy_version 92500 (0.0008) +[2023-10-13 04:07:40,018][46662] Updated weights for policy 0, policy_version 92510 (0.0008) +[2023-10-13 04:07:42,980][46663] Updated weights for policy 1, policy_version 92421 (0.0010) +[2023-10-13 04:07:43,362][46663] Updated weights for policy 1, policy_version 92431 (0.0011) +[2023-10-13 04:07:43,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 189366272. Throughput: 0: 1673.6, 1: 1689.9. Samples: 47352144. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:07:43,607][45375] Avg episode reward: [(0, '59.380'), (1, '55.080')] +[2023-10-13 04:07:43,726][46663] Updated weights for policy 1, policy_version 92441 (0.0011) +[2023-10-13 04:07:44,081][46662] Updated weights for policy 0, policy_version 92520 (0.0009) +[2023-10-13 04:07:44,450][46662] Updated weights for policy 0, policy_version 92530 (0.0010) +[2023-10-13 04:07:44,823][46662] Updated weights for policy 0, policy_version 92540 (0.0009) +[2023-10-13 04:07:47,795][46663] Updated weights for policy 1, policy_version 92451 (0.0007) +[2023-10-13 04:07:48,170][46663] Updated weights for policy 1, policy_version 92461 (0.0008) +[2023-10-13 04:07:48,536][46663] Updated weights for policy 1, policy_version 92471 (0.0009) +[2023-10-13 04:07:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 189431808. Throughput: 0: 1675.2, 1: 1686.0. Samples: 47372588. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:07:48,607][45375] Avg episode reward: [(0, '58.440'), (1, '54.040')] +[2023-10-13 04:07:48,929][46662] Updated weights for policy 0, policy_version 92550 (0.0010) +[2023-10-13 04:07:49,314][46662] Updated weights for policy 0, policy_version 92560 (0.0008) +[2023-10-13 04:07:49,692][46662] Updated weights for policy 0, policy_version 92570 (0.0008) +[2023-10-13 04:07:52,572][46663] Updated weights for policy 1, policy_version 92481 (0.0008) +[2023-10-13 04:07:52,945][46663] Updated weights for policy 1, policy_version 92491 (0.0007) +[2023-10-13 04:07:53,314][46663] Updated weights for policy 1, policy_version 92501 (0.0007) +[2023-10-13 04:07:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 189497344. Throughput: 0: 1668.7, 1: 1669.5. Samples: 47392360. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:07:53,608][45375] Avg episode reward: [(0, '58.210'), (1, '54.700')] +[2023-10-13 04:07:53,675][46663] Updated weights for policy 1, policy_version 92511 (0.0007) +[2023-10-13 04:07:53,703][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000092512_94732288.pth... +[2023-10-13 04:07:53,732][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000090944_93126656.pth +[2023-10-13 04:07:53,767][46662] Updated weights for policy 0, policy_version 92580 (0.0008) +[2023-10-13 04:07:54,134][46662] Updated weights for policy 0, policy_version 92590 (0.0007) +[2023-10-13 04:07:54,506][46662] Updated weights for policy 0, policy_version 92600 (0.0007) +[2023-10-13 04:07:54,799][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000092608_94830592.pth... +[2023-10-13 04:07:54,828][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000091040_93224960.pth +[2023-10-13 04:07:57,717][46663] Updated weights for policy 1, policy_version 92521 (0.0008) +[2023-10-13 04:07:58,082][46663] Updated weights for policy 1, policy_version 92531 (0.0009) +[2023-10-13 04:07:58,446][46663] Updated weights for policy 1, policy_version 92541 (0.0008) +[2023-10-13 04:07:58,521][46662] Updated weights for policy 0, policy_version 92610 (0.0007) +[2023-10-13 04:07:58,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 189595648. Throughput: 0: 1672.7, 1: 1686.2. Samples: 47402360. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:07:58,607][45375] Avg episode reward: [(0, '58.720'), (1, '53.350')] +[2023-10-13 04:07:58,886][46662] Updated weights for policy 0, policy_version 92620 (0.0011) +[2023-10-13 04:07:59,271][46662] Updated weights for policy 0, policy_version 92630 (0.0010) +[2023-10-13 04:07:59,653][46662] Updated weights for policy 0, policy_version 92640 (0.0008) +[2023-10-13 04:08:02,567][46663] Updated weights for policy 1, policy_version 92551 (0.0007) +[2023-10-13 04:08:02,929][46663] Updated weights for policy 1, policy_version 92561 (0.0008) +[2023-10-13 04:08:03,299][46663] Updated weights for policy 1, policy_version 92571 (0.0010) +[2023-10-13 04:08:03,606][45375] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 189661184. Throughput: 0: 1669.8, 1: 1689.8. Samples: 47423074. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:08:03,607][45375] Avg episode reward: [(0, '57.580'), (1, '52.970')] +[2023-10-13 04:08:03,717][46662] Updated weights for policy 0, policy_version 92650 (0.0009) +[2023-10-13 04:08:04,085][46662] Updated weights for policy 0, policy_version 92660 (0.0009) +[2023-10-13 04:08:04,453][46662] Updated weights for policy 0, policy_version 92670 (0.0008) +[2023-10-13 04:08:07,230][46663] Updated weights for policy 1, policy_version 92581 (0.0009) +[2023-10-13 04:08:07,597][46663] Updated weights for policy 1, policy_version 92591 (0.0007) +[2023-10-13 04:08:07,953][46663] Updated weights for policy 1, policy_version 92601 (0.0009) +[2023-10-13 04:08:08,428][46662] Updated weights for policy 0, policy_version 92680 (0.0008) +[2023-10-13 04:08:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189726720. Throughput: 0: 1673.8, 1: 1667.2. Samples: 47442876. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:08:08,607][45375] Avg episode reward: [(0, '56.100'), (1, '53.300')] +[2023-10-13 04:08:08,793][46662] Updated weights for policy 0, policy_version 92690 (0.0009) +[2023-10-13 04:08:09,170][46662] Updated weights for policy 0, policy_version 92700 (0.0011) +[2023-10-13 04:08:12,054][46663] Updated weights for policy 1, policy_version 92611 (0.0009) +[2023-10-13 04:08:12,423][46663] Updated weights for policy 1, policy_version 92621 (0.0009) +[2023-10-13 04:08:12,789][46663] Updated weights for policy 1, policy_version 92631 (0.0009) +[2023-10-13 04:08:13,060][46662] Updated weights for policy 0, policy_version 92710 (0.0009) +[2023-10-13 04:08:13,423][46662] Updated weights for policy 0, policy_version 92720 (0.0010) +[2023-10-13 04:08:13,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 189792256. Throughput: 0: 1672.8, 1: 1689.5. Samples: 47453164. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:08:13,607][45375] Avg episode reward: [(0, '57.470'), (1, '53.880')] +[2023-10-13 04:08:13,806][46662] Updated weights for policy 0, policy_version 92730 (0.0011) +[2023-10-13 04:08:17,005][46663] Updated weights for policy 1, policy_version 92641 (0.0009) +[2023-10-13 04:08:17,366][46663] Updated weights for policy 1, policy_version 92651 (0.0010) +[2023-10-13 04:08:17,735][46663] Updated weights for policy 1, policy_version 92661 (0.0007) +[2023-10-13 04:08:17,947][46662] Updated weights for policy 0, policy_version 92740 (0.0009) +[2023-10-13 04:08:18,096][46663] Updated weights for policy 1, policy_version 92671 (0.0007) +[2023-10-13 04:08:18,320][46662] Updated weights for policy 0, policy_version 92750 (0.0010) +[2023-10-13 04:08:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189857792. Throughput: 0: 1676.2, 1: 1681.6. Samples: 47473524. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:08:18,607][45375] Avg episode reward: [(0, '56.790'), (1, '54.110')] +[2023-10-13 04:08:18,686][46662] Updated weights for policy 0, policy_version 92760 (0.0010) +[2023-10-13 04:08:22,251][46663] Updated weights for policy 1, policy_version 92681 (0.0008) +[2023-10-13 04:08:22,621][46663] Updated weights for policy 1, policy_version 92691 (0.0008) +[2023-10-13 04:08:22,853][46662] Updated weights for policy 0, policy_version 92770 (0.0011) +[2023-10-13 04:08:22,989][46663] Updated weights for policy 1, policy_version 92701 (0.0008) +[2023-10-13 04:08:23,226][46662] Updated weights for policy 0, policy_version 92780 (0.0008) +[2023-10-13 04:08:23,600][46662] Updated weights for policy 0, policy_version 92790 (0.0009) +[2023-10-13 04:08:23,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189923328. Throughput: 0: 1678.4, 1: 1667.8. Samples: 47493170. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:08:23,608][45375] Avg episode reward: [(0, '54.760'), (1, '54.890')] +[2023-10-13 04:08:23,973][46662] Updated weights for policy 0, policy_version 92800 (0.0009) +[2023-10-13 04:08:27,075][46663] Updated weights for policy 1, policy_version 92711 (0.0009) +[2023-10-13 04:08:27,442][46663] Updated weights for policy 1, policy_version 92721 (0.0009) +[2023-10-13 04:08:27,808][46663] Updated weights for policy 1, policy_version 92731 (0.0007) +[2023-10-13 04:08:28,015][46662] Updated weights for policy 0, policy_version 92810 (0.0008) +[2023-10-13 04:08:28,395][46662] Updated weights for policy 0, policy_version 92820 (0.0008) +[2023-10-13 04:08:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 189988864. Throughput: 0: 1685.0, 1: 1679.8. Samples: 47503560. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-13 04:08:28,607][45375] Avg episode reward: [(0, '55.340'), (1, '55.760')] +[2023-10-13 04:08:28,769][46662] Updated weights for policy 0, policy_version 92830 (0.0009) +[2023-10-13 04:08:31,939][46663] Updated weights for policy 1, policy_version 92741 (0.0008) +[2023-10-13 04:08:32,338][46663] Updated weights for policy 1, policy_version 92751 (0.0010) +[2023-10-13 04:08:32,700][46663] Updated weights for policy 1, policy_version 92761 (0.0009) +[2023-10-13 04:08:32,863][46662] Updated weights for policy 0, policy_version 92840 (0.0009) +[2023-10-13 04:08:33,232][46662] Updated weights for policy 0, policy_version 92850 (0.0010) +[2023-10-13 04:08:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190054400. Throughput: 0: 1678.2, 1: 1669.5. Samples: 47523236. Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:08:33,608][45375] Avg episode reward: [(0, '55.120'), (1, '55.830')] +[2023-10-13 04:08:33,609][46662] Updated weights for policy 0, policy_version 92860 (0.0007) +[2023-10-13 04:08:36,754][46663] Updated weights for policy 1, policy_version 92771 (0.0010) +[2023-10-13 04:08:37,123][46663] Updated weights for policy 1, policy_version 92781 (0.0009) +[2023-10-13 04:08:37,490][46662] Updated weights for policy 0, policy_version 92870 (0.0009) +[2023-10-13 04:08:37,491][46663] Updated weights for policy 1, policy_version 92791 (0.0008) +[2023-10-13 04:08:37,865][46662] Updated weights for policy 0, policy_version 92880 (0.0008) +[2023-10-13 04:08:38,240][46662] Updated weights for policy 0, policy_version 92890 (0.0008) +[2023-10-13 04:08:38,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 190152704. Throughput: 0: 1676.5, 1: 1670.0. Samples: 47542952. 
Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:08:38,607][45375] Avg episode reward: [(0, '54.910'), (1, '56.640')] +[2023-10-13 04:08:41,422][46663] Updated weights for policy 1, policy_version 92801 (0.0008) +[2023-10-13 04:08:41,788][46663] Updated weights for policy 1, policy_version 92811 (0.0009) +[2023-10-13 04:08:42,156][46663] Updated weights for policy 1, policy_version 92821 (0.0010) +[2023-10-13 04:08:42,195][46662] Updated weights for policy 0, policy_version 92900 (0.0008) +[2023-10-13 04:08:42,512][46663] Updated weights for policy 1, policy_version 92831 (0.0008) +[2023-10-13 04:08:42,561][46662] Updated weights for policy 0, policy_version 92910 (0.0009) +[2023-10-13 04:08:42,934][46662] Updated weights for policy 0, policy_version 92920 (0.0011) +[2023-10-13 04:08:43,607][45375] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 190218240. Throughput: 0: 1682.4, 1: 1682.0. Samples: 47553760. Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:08:43,607][45375] Avg episode reward: [(0, '55.830'), (1, '55.750')] +[2023-10-13 04:08:46,660][46663] Updated weights for policy 1, policy_version 92841 (0.0011) +[2023-10-13 04:08:47,033][46663] Updated weights for policy 1, policy_version 92851 (0.0010) +[2023-10-13 04:08:47,126][46662] Updated weights for policy 0, policy_version 92930 (0.0008) +[2023-10-13 04:08:47,398][46663] Updated weights for policy 1, policy_version 92861 (0.0009) +[2023-10-13 04:08:47,495][46662] Updated weights for policy 0, policy_version 92940 (0.0008) +[2023-10-13 04:08:47,866][46662] Updated weights for policy 0, policy_version 92950 (0.0010) +[2023-10-13 04:08:48,229][46662] Updated weights for policy 0, policy_version 92960 (0.0009) +[2023-10-13 04:08:48,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 190283776. Throughput: 0: 1682.3, 1: 1660.2. Samples: 47573486. 
Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:08:48,608][45375] Avg episode reward: [(0, '55.260'), (1, '55.540')] +[2023-10-13 04:08:51,502][46663] Updated weights for policy 1, policy_version 92871 (0.0009) +[2023-10-13 04:08:51,868][46663] Updated weights for policy 1, policy_version 92881 (0.0009) +[2023-10-13 04:08:52,241][46663] Updated weights for policy 1, policy_version 92891 (0.0008) +[2023-10-13 04:08:52,415][46662] Updated weights for policy 0, policy_version 92970 (0.0008) +[2023-10-13 04:08:52,775][46662] Updated weights for policy 0, policy_version 92980 (0.0009) +[2023-10-13 04:08:53,148][46662] Updated weights for policy 0, policy_version 92990 (0.0011) +[2023-10-13 04:08:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 190349312. Throughput: 0: 1662.7, 1: 1677.1. Samples: 47593166. Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:08:53,607][45375] Avg episode reward: [(0, '54.960'), (1, '56.810')] +[2023-10-13 04:08:56,334][46663] Updated weights for policy 1, policy_version 92901 (0.0009) +[2023-10-13 04:08:56,704][46663] Updated weights for policy 1, policy_version 92911 (0.0008) +[2023-10-13 04:08:57,057][46663] Updated weights for policy 1, policy_version 92921 (0.0009) +[2023-10-13 04:08:57,182][46662] Updated weights for policy 0, policy_version 93000 (0.0009) +[2023-10-13 04:08:57,547][46662] Updated weights for policy 0, policy_version 93010 (0.0007) +[2023-10-13 04:08:57,919][46662] Updated weights for policy 0, policy_version 93020 (0.0007) +[2023-10-13 04:08:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190414848. Throughput: 0: 1682.6, 1: 1670.8. Samples: 47604066. 
Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:08:58,608][45375] Avg episode reward: [(0, '55.110'), (1, '56.980')] +[2023-10-13 04:09:01,059][46663] Updated weights for policy 1, policy_version 92931 (0.0008) +[2023-10-13 04:09:01,423][46663] Updated weights for policy 1, policy_version 92941 (0.0009) +[2023-10-13 04:09:01,794][46663] Updated weights for policy 1, policy_version 92951 (0.0008) +[2023-10-13 04:09:02,127][46662] Updated weights for policy 0, policy_version 93030 (0.0008) +[2023-10-13 04:09:02,505][46662] Updated weights for policy 0, policy_version 93040 (0.0010) +[2023-10-13 04:09:02,878][46662] Updated weights for policy 0, policy_version 93050 (0.0009) +[2023-10-13 04:09:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190480384. Throughput: 0: 1677.8, 1: 1658.1. Samples: 47623642. Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:09:03,608][45375] Avg episode reward: [(0, '53.930'), (1, '56.900')] +[2023-10-13 04:09:05,798][46663] Updated weights for policy 1, policy_version 92961 (0.0009) +[2023-10-13 04:09:06,156][46663] Updated weights for policy 1, policy_version 92971 (0.0010) +[2023-10-13 04:09:06,523][46663] Updated weights for policy 1, policy_version 92981 (0.0011) +[2023-10-13 04:09:06,768][46662] Updated weights for policy 0, policy_version 93060 (0.0007) +[2023-10-13 04:09:06,890][46663] Updated weights for policy 1, policy_version 92991 (0.0010) +[2023-10-13 04:09:07,142][46662] Updated weights for policy 0, policy_version 93070 (0.0008) +[2023-10-13 04:09:07,508][46662] Updated weights for policy 0, policy_version 93080 (0.0008) +[2023-10-13 04:09:08,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190545920. Throughput: 0: 1652.2, 1: 1680.9. Samples: 47643156. 
Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:09:08,607][45375] Avg episode reward: [(0, '55.350'), (1, '56.650')] +[2023-10-13 04:09:11,095][46663] Updated weights for policy 1, policy_version 93001 (0.0009) +[2023-10-13 04:09:11,469][46663] Updated weights for policy 1, policy_version 93011 (0.0010) +[2023-10-13 04:09:11,660][46662] Updated weights for policy 0, policy_version 93090 (0.0008) +[2023-10-13 04:09:11,824][46663] Updated weights for policy 1, policy_version 93021 (0.0010) +[2023-10-13 04:09:12,038][46662] Updated weights for policy 0, policy_version 93100 (0.0009) +[2023-10-13 04:09:12,414][46662] Updated weights for policy 0, policy_version 93110 (0.0010) +[2023-10-13 04:09:12,775][46662] Updated weights for policy 0, policy_version 93120 (0.0009) +[2023-10-13 04:09:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190611456. Throughput: 0: 1677.5, 1: 1666.3. Samples: 47654032. Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:09:13,608][45375] Avg episode reward: [(0, '55.720'), (1, '58.950')] +[2023-10-13 04:09:15,773][46663] Updated weights for policy 1, policy_version 93031 (0.0009) +[2023-10-13 04:09:16,141][46663] Updated weights for policy 1, policy_version 93041 (0.0010) +[2023-10-13 04:09:16,514][46663] Updated weights for policy 1, policy_version 93051 (0.0009) +[2023-10-13 04:09:16,929][46662] Updated weights for policy 0, policy_version 93130 (0.0009) +[2023-10-13 04:09:17,297][46662] Updated weights for policy 0, policy_version 93140 (0.0010) +[2023-10-13 04:09:17,669][46662] Updated weights for policy 0, policy_version 93150 (0.0010) +[2023-10-13 04:09:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190676992. Throughput: 0: 1675.4, 1: 1668.7. Samples: 47673722. 
Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:09:18,607][45375] Avg episode reward: [(0, '56.350'), (1, '60.560')] +[2023-10-13 04:09:20,639][46663] Updated weights for policy 1, policy_version 93061 (0.0009) +[2023-10-13 04:09:21,033][46663] Updated weights for policy 1, policy_version 93071 (0.0009) +[2023-10-13 04:09:21,390][46663] Updated weights for policy 1, policy_version 93081 (0.0007) +[2023-10-13 04:09:21,714][46662] Updated weights for policy 0, policy_version 93160 (0.0008) +[2023-10-13 04:09:22,090][46662] Updated weights for policy 0, policy_version 93170 (0.0009) +[2023-10-13 04:09:22,462][46662] Updated weights for policy 0, policy_version 93180 (0.0009) +[2023-10-13 04:09:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190742528. Throughput: 0: 1655.4, 1: 1684.6. Samples: 47693254. Policy #0 lag: (min: 21.0, avg: 22.0, max: 41.0) +[2023-10-13 04:09:23,608][45375] Avg episode reward: [(0, '56.310'), (1, '59.930')] +[2023-10-13 04:09:25,298][46663] Updated weights for policy 1, policy_version 93091 (0.0010) +[2023-10-13 04:09:25,661][46663] Updated weights for policy 1, policy_version 93101 (0.0010) +[2023-10-13 04:09:26,026][46663] Updated weights for policy 1, policy_version 93111 (0.0009) +[2023-10-13 04:09:26,522][46662] Updated weights for policy 0, policy_version 93190 (0.0007) +[2023-10-13 04:09:26,902][46662] Updated weights for policy 0, policy_version 93200 (0.0009) +[2023-10-13 04:09:27,272][46662] Updated weights for policy 0, policy_version 93210 (0.0009) +[2023-10-13 04:09:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 190808064. Throughput: 0: 1673.1, 1: 1659.5. Samples: 47703726. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:28,607][45375] Avg episode reward: [(0, '56.260'), (1, '58.900')] +[2023-10-13 04:09:30,074][46663] Updated weights for policy 1, policy_version 93121 (0.0008) +[2023-10-13 04:09:30,441][46663] Updated weights for policy 1, policy_version 93131 (0.0008) +[2023-10-13 04:09:30,802][46663] Updated weights for policy 1, policy_version 93141 (0.0007) +[2023-10-13 04:09:31,160][46663] Updated weights for policy 1, policy_version 93151 (0.0009) +[2023-10-13 04:09:31,410][46662] Updated weights for policy 0, policy_version 93220 (0.0009) +[2023-10-13 04:09:31,785][46662] Updated weights for policy 0, policy_version 93230 (0.0010) +[2023-10-13 04:09:32,162][46662] Updated weights for policy 0, policy_version 93240 (0.0010) +[2023-10-13 04:09:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 190873600. Throughput: 0: 1659.5, 1: 1686.4. Samples: 47724048. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:33,608][45375] Avg episode reward: [(0, '58.360'), (1, '58.700')] +[2023-10-13 04:09:35,333][46663] Updated weights for policy 1, policy_version 93161 (0.0008) +[2023-10-13 04:09:35,705][46663] Updated weights for policy 1, policy_version 93171 (0.0010) +[2023-10-13 04:09:36,060][46663] Updated weights for policy 1, policy_version 93181 (0.0010) +[2023-10-13 04:09:36,225][46662] Updated weights for policy 0, policy_version 93250 (0.0010) +[2023-10-13 04:09:36,588][46662] Updated weights for policy 0, policy_version 93260 (0.0009) +[2023-10-13 04:09:36,962][46662] Updated weights for policy 0, policy_version 93270 (0.0007) +[2023-10-13 04:09:37,335][46662] Updated weights for policy 0, policy_version 93280 (0.0009) +[2023-10-13 04:09:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190939136. Throughput: 0: 1663.6, 1: 1687.8. Samples: 47743978. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:38,608][45375] Avg episode reward: [(0, '58.960'), (1, '58.520')] +[2023-10-13 04:09:40,081][46663] Updated weights for policy 1, policy_version 93191 (0.0007) +[2023-10-13 04:09:40,443][46663] Updated weights for policy 1, policy_version 93201 (0.0009) +[2023-10-13 04:09:40,810][46663] Updated weights for policy 1, policy_version 93211 (0.0009) +[2023-10-13 04:09:41,412][46662] Updated weights for policy 0, policy_version 93290 (0.0007) +[2023-10-13 04:09:41,795][46662] Updated weights for policy 0, policy_version 93300 (0.0008) +[2023-10-13 04:09:42,156][46662] Updated weights for policy 0, policy_version 93310 (0.0010) +[2023-10-13 04:09:43,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191004672. Throughput: 0: 1673.9, 1: 1665.0. Samples: 47754316. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:43,607][45375] Avg episode reward: [(0, '58.280'), (1, '57.720')] +[2023-10-13 04:09:44,821][46663] Updated weights for policy 1, policy_version 93221 (0.0009) +[2023-10-13 04:09:45,188][46663] Updated weights for policy 1, policy_version 93231 (0.0008) +[2023-10-13 04:09:45,569][46663] Updated weights for policy 1, policy_version 93241 (0.0007) +[2023-10-13 04:09:46,337][46662] Updated weights for policy 0, policy_version 93320 (0.0010) +[2023-10-13 04:09:46,719][46662] Updated weights for policy 0, policy_version 93330 (0.0009) +[2023-10-13 04:09:47,078][46662] Updated weights for policy 0, policy_version 93340 (0.0008) +[2023-10-13 04:09:48,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191070208. Throughput: 0: 1658.7, 1: 1690.3. Samples: 47774346. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:48,607][45375] Avg episode reward: [(0, '60.000'), (1, '58.880')] +[2023-10-13 04:09:49,807][46663] Updated weights for policy 1, policy_version 93251 (0.0008) +[2023-10-13 04:09:50,173][46663] Updated weights for policy 1, policy_version 93261 (0.0009) +[2023-10-13 04:09:50,555][46663] Updated weights for policy 1, policy_version 93271 (0.0010) +[2023-10-13 04:09:51,294][46662] Updated weights for policy 0, policy_version 93350 (0.0009) +[2023-10-13 04:09:51,663][46662] Updated weights for policy 0, policy_version 93360 (0.0008) +[2023-10-13 04:09:52,038][46662] Updated weights for policy 0, policy_version 93370 (0.0008) +[2023-10-13 04:09:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191135744. Throughput: 0: 1672.1, 1: 1685.8. Samples: 47794264. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:53,607][45375] Avg episode reward: [(0, '60.410'), (1, '57.810')] +[2023-10-13 04:09:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000093280_95518720.pth... +[2023-10-13 04:09:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000093376_95617024.pth... 
+[2023-10-13 04:09:53,654][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000091808_94011392.pth +[2023-10-13 04:09:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000091712_93913088.pth +[2023-10-13 04:09:54,572][46663] Updated weights for policy 1, policy_version 93281 (0.0011) +[2023-10-13 04:09:54,940][46663] Updated weights for policy 1, policy_version 93291 (0.0010) +[2023-10-13 04:09:55,303][46663] Updated weights for policy 1, policy_version 93301 (0.0010) +[2023-10-13 04:09:55,668][46663] Updated weights for policy 1, policy_version 93311 (0.0010) +[2023-10-13 04:09:56,108][46662] Updated weights for policy 0, policy_version 93380 (0.0009) +[2023-10-13 04:09:56,471][46662] Updated weights for policy 0, policy_version 93390 (0.0011) +[2023-10-13 04:09:56,840][46662] Updated weights for policy 0, policy_version 93400 (0.0010) +[2023-10-13 04:09:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191201280. Throughput: 0: 1673.7, 1: 1674.1. Samples: 47804684. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:09:58,607][45375] Avg episode reward: [(0, '60.780'), (1, '57.810')] +[2023-10-13 04:09:59,815][46663] Updated weights for policy 1, policy_version 93321 (0.0008) +[2023-10-13 04:10:00,185][46663] Updated weights for policy 1, policy_version 93331 (0.0009) +[2023-10-13 04:10:00,551][46663] Updated weights for policy 1, policy_version 93341 (0.0009) +[2023-10-13 04:10:00,902][46662] Updated weights for policy 0, policy_version 93410 (0.0008) +[2023-10-13 04:10:01,275][46662] Updated weights for policy 0, policy_version 93420 (0.0010) +[2023-10-13 04:10:01,636][46662] Updated weights for policy 0, policy_version 93430 (0.0009) +[2023-10-13 04:10:01,991][46662] Updated weights for policy 0, policy_version 93440 (0.0007) +[2023-10-13 04:10:03,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). 
Total num frames: 191266816. Throughput: 0: 1659.0, 1: 1689.5. Samples: 47824404. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:10:03,608][45375] Avg episode reward: [(0, '60.150'), (1, '56.360')] +[2023-10-13 04:10:04,626][46663] Updated weights for policy 1, policy_version 93351 (0.0011) +[2023-10-13 04:10:05,004][46663] Updated weights for policy 1, policy_version 93361 (0.0011) +[2023-10-13 04:10:05,377][46663] Updated weights for policy 1, policy_version 93371 (0.0010) +[2023-10-13 04:10:06,137][46662] Updated weights for policy 0, policy_version 93450 (0.0011) +[2023-10-13 04:10:06,502][46662] Updated weights for policy 0, policy_version 93460 (0.0008) +[2023-10-13 04:10:06,884][46662] Updated weights for policy 0, policy_version 93470 (0.0007) +[2023-10-13 04:10:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191332352. Throughput: 0: 1680.0, 1: 1688.3. Samples: 47844830. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:10:08,608][45375] Avg episode reward: [(0, '61.370'), (1, '55.050')] +[2023-10-13 04:10:09,629][46663] Updated weights for policy 1, policy_version 93381 (0.0009) +[2023-10-13 04:10:10,016][46663] Updated weights for policy 1, policy_version 93391 (0.0010) +[2023-10-13 04:10:10,382][46663] Updated weights for policy 1, policy_version 93401 (0.0009) +[2023-10-13 04:10:10,920][46662] Updated weights for policy 0, policy_version 93480 (0.0010) +[2023-10-13 04:10:11,289][46662] Updated weights for policy 0, policy_version 93490 (0.0007) +[2023-10-13 04:10:11,651][46662] Updated weights for policy 0, policy_version 93500 (0.0007) +[2023-10-13 04:10:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191397888. Throughput: 0: 1677.6, 1: 1682.3. Samples: 47854922. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:10:13,608][45375] Avg episode reward: [(0, '62.690'), (1, '55.010')] +[2023-10-13 04:10:14,232][46663] Updated weights for policy 1, policy_version 93411 (0.0007) +[2023-10-13 04:10:14,588][46663] Updated weights for policy 1, policy_version 93421 (0.0009) +[2023-10-13 04:10:14,970][46663] Updated weights for policy 1, policy_version 93431 (0.0009) +[2023-10-13 04:10:15,574][46662] Updated weights for policy 0, policy_version 93510 (0.0008) +[2023-10-13 04:10:15,944][46662] Updated weights for policy 0, policy_version 93520 (0.0009) +[2023-10-13 04:10:16,306][46662] Updated weights for policy 0, policy_version 93530 (0.0009) +[2023-10-13 04:10:18,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191463424. Throughput: 0: 1664.5, 1: 1677.7. Samples: 47874446. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:10:18,607][45375] Avg episode reward: [(0, '62.850'), (1, '53.940')] +[2023-10-13 04:10:19,040][46663] Updated weights for policy 1, policy_version 93441 (0.0009) +[2023-10-13 04:10:19,405][46663] Updated weights for policy 1, policy_version 93451 (0.0007) +[2023-10-13 04:10:19,775][46663] Updated weights for policy 1, policy_version 93461 (0.0010) +[2023-10-13 04:10:20,135][46663] Updated weights for policy 1, policy_version 93471 (0.0007) +[2023-10-13 04:10:20,305][46662] Updated weights for policy 0, policy_version 93540 (0.0008) +[2023-10-13 04:10:20,667][46662] Updated weights for policy 0, policy_version 93550 (0.0009) +[2023-10-13 04:10:21,046][46662] Updated weights for policy 0, policy_version 93560 (0.0007) +[2023-10-13 04:10:23,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 191528960. Throughput: 0: 1680.4, 1: 1676.9. Samples: 47895058. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-13 04:10:23,608][45375] Avg episode reward: [(0, '63.820'), (1, '53.150')] +[2023-10-13 04:10:24,225][46663] Updated weights for policy 1, policy_version 93481 (0.0011) +[2023-10-13 04:10:24,603][46663] Updated weights for policy 1, policy_version 93491 (0.0011) +[2023-10-13 04:10:24,964][46663] Updated weights for policy 1, policy_version 93501 (0.0010) +[2023-10-13 04:10:25,071][46662] Updated weights for policy 0, policy_version 93570 (0.0008) +[2023-10-13 04:10:25,443][46662] Updated weights for policy 0, policy_version 93580 (0.0011) +[2023-10-13 04:10:25,823][46662] Updated weights for policy 0, policy_version 93590 (0.0009) +[2023-10-13 04:10:26,185][46662] Updated weights for policy 0, policy_version 93600 (0.0010) +[2023-10-13 04:10:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191594496. Throughput: 0: 1667.1, 1: 1681.4. Samples: 47904996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:28,607][45375] Avg episode reward: [(0, '63.050'), (1, '54.500')] +[2023-10-13 04:10:28,932][46663] Updated weights for policy 1, policy_version 93511 (0.0009) +[2023-10-13 04:10:29,293][46663] Updated weights for policy 1, policy_version 93521 (0.0010) +[2023-10-13 04:10:29,666][46663] Updated weights for policy 1, policy_version 93531 (0.0008) +[2023-10-13 04:10:30,116][46662] Updated weights for policy 0, policy_version 93610 (0.0007) +[2023-10-13 04:10:30,474][46662] Updated weights for policy 0, policy_version 93620 (0.0009) +[2023-10-13 04:10:30,847][46662] Updated weights for policy 0, policy_version 93630 (0.0007) +[2023-10-13 04:10:33,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191660032. Throughput: 0: 1675.2, 1: 1677.5. Samples: 47925218. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:33,608][45375] Avg episode reward: [(0, '63.710'), (1, '55.120')] +[2023-10-13 04:10:33,724][46663] Updated weights for policy 1, policy_version 93541 (0.0008) +[2023-10-13 04:10:34,086][46663] Updated weights for policy 1, policy_version 93551 (0.0009) +[2023-10-13 04:10:34,448][46663] Updated weights for policy 1, policy_version 93561 (0.0008) +[2023-10-13 04:10:34,953][46662] Updated weights for policy 0, policy_version 93640 (0.0009) +[2023-10-13 04:10:35,319][46662] Updated weights for policy 0, policy_version 93650 (0.0008) +[2023-10-13 04:10:35,680][46662] Updated weights for policy 0, policy_version 93660 (0.0010) +[2023-10-13 04:10:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191725568. Throughput: 0: 1694.1, 1: 1672.9. Samples: 47945778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:38,607][45375] Avg episode reward: [(0, '63.550'), (1, '55.320')] +[2023-10-13 04:10:38,692][46663] Updated weights for policy 1, policy_version 93571 (0.0008) +[2023-10-13 04:10:39,056][46663] Updated weights for policy 1, policy_version 93581 (0.0008) +[2023-10-13 04:10:39,426][46663] Updated weights for policy 1, policy_version 93591 (0.0009) +[2023-10-13 04:10:39,776][46662] Updated weights for policy 0, policy_version 93670 (0.0008) +[2023-10-13 04:10:40,150][46662] Updated weights for policy 0, policy_version 93680 (0.0008) +[2023-10-13 04:10:40,526][46662] Updated weights for policy 0, policy_version 93690 (0.0007) +[2023-10-13 04:10:43,474][46663] Updated weights for policy 1, policy_version 93601 (0.0007) +[2023-10-13 04:10:43,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191791104. Throughput: 0: 1664.5, 1: 1673.8. Samples: 47954908. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:43,607][45375] Avg episode reward: [(0, '62.770'), (1, '56.120')] +[2023-10-13 04:10:43,851][46663] Updated weights for policy 1, policy_version 93611 (0.0008) +[2023-10-13 04:10:44,223][46663] Updated weights for policy 1, policy_version 93621 (0.0008) +[2023-10-13 04:10:44,592][46663] Updated weights for policy 1, policy_version 93631 (0.0008) +[2023-10-13 04:10:44,605][46662] Updated weights for policy 0, policy_version 93700 (0.0008) +[2023-10-13 04:10:44,970][46662] Updated weights for policy 0, policy_version 93710 (0.0008) +[2023-10-13 04:10:45,341][46662] Updated weights for policy 0, policy_version 93720 (0.0009) +[2023-10-13 04:10:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191856640. Throughput: 0: 1688.0, 1: 1673.7. Samples: 47975684. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:48,607][45375] Avg episode reward: [(0, '61.250'), (1, '55.550')] +[2023-10-13 04:10:48,754][46663] Updated weights for policy 1, policy_version 93641 (0.0011) +[2023-10-13 04:10:49,140][46663] Updated weights for policy 1, policy_version 93651 (0.0008) +[2023-10-13 04:10:49,437][46662] Updated weights for policy 0, policy_version 93730 (0.0007) +[2023-10-13 04:10:49,506][46663] Updated weights for policy 1, policy_version 93661 (0.0007) +[2023-10-13 04:10:49,806][46662] Updated weights for policy 0, policy_version 93740 (0.0008) +[2023-10-13 04:10:50,173][46662] Updated weights for policy 0, policy_version 93750 (0.0010) +[2023-10-13 04:10:50,542][46662] Updated weights for policy 0, policy_version 93760 (0.0011) +[2023-10-13 04:10:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 191922176. Throughput: 0: 1692.0, 1: 1668.8. Samples: 47996064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:53,607][45375] Avg episode reward: [(0, '60.630'), (1, '55.150')] +[2023-10-13 04:10:53,719][46663] Updated weights for policy 1, policy_version 93671 (0.0009) +[2023-10-13 04:10:54,078][46663] Updated weights for policy 1, policy_version 93681 (0.0008) +[2023-10-13 04:10:54,458][46663] Updated weights for policy 1, policy_version 93691 (0.0008) +[2023-10-13 04:10:54,655][46662] Updated weights for policy 0, policy_version 93770 (0.0007) +[2023-10-13 04:10:55,036][46662] Updated weights for policy 0, policy_version 93780 (0.0008) +[2023-10-13 04:10:55,408][46662] Updated weights for policy 0, policy_version 93790 (0.0009) +[2023-10-13 04:10:58,539][46663] Updated weights for policy 1, policy_version 93701 (0.0007) +[2023-10-13 04:10:58,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 191987712. Throughput: 0: 1665.2, 1: 1672.4. Samples: 48005110. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:10:58,608][45375] Avg episode reward: [(0, '61.820'), (1, '55.290')] +[2023-10-13 04:10:58,918][46663] Updated weights for policy 1, policy_version 93711 (0.0008) +[2023-10-13 04:10:59,290][46663] Updated weights for policy 1, policy_version 93721 (0.0008) +[2023-10-13 04:10:59,433][46662] Updated weights for policy 0, policy_version 93800 (0.0007) +[2023-10-13 04:10:59,807][46662] Updated weights for policy 0, policy_version 93810 (0.0009) +[2023-10-13 04:11:00,171][46662] Updated weights for policy 0, policy_version 93820 (0.0009) +[2023-10-13 04:11:03,218][46663] Updated weights for policy 1, policy_version 93731 (0.0008) +[2023-10-13 04:11:03,583][46663] Updated weights for policy 1, policy_version 93741 (0.0009) +[2023-10-13 04:11:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 192053248. Throughput: 0: 1688.2, 1: 1672.1. Samples: 48025660. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:11:03,607][45375] Avg episode reward: [(0, '61.510'), (1, '55.490')] +[2023-10-13 04:11:03,956][46663] Updated weights for policy 1, policy_version 93751 (0.0010) +[2023-10-13 04:11:04,282][46662] Updated weights for policy 0, policy_version 93830 (0.0008) +[2023-10-13 04:11:04,653][46662] Updated weights for policy 0, policy_version 93840 (0.0008) +[2023-10-13 04:11:05,014][46662] Updated weights for policy 0, policy_version 93850 (0.0009) +[2023-10-13 04:11:08,055][46663] Updated weights for policy 1, policy_version 93761 (0.0008) +[2023-10-13 04:11:08,417][46663] Updated weights for policy 1, policy_version 93771 (0.0007) +[2023-10-13 04:11:08,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 192118784. Throughput: 0: 1687.9, 1: 1667.4. Samples: 48046044. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:11:08,607][45375] Avg episode reward: [(0, '61.240'), (1, '56.370')] +[2023-10-13 04:11:08,786][46663] Updated weights for policy 1, policy_version 93781 (0.0011) +[2023-10-13 04:11:09,113][46662] Updated weights for policy 0, policy_version 93860 (0.0008) +[2023-10-13 04:11:09,152][46663] Updated weights for policy 1, policy_version 93791 (0.0009) +[2023-10-13 04:11:09,490][46662] Updated weights for policy 0, policy_version 93870 (0.0008) +[2023-10-13 04:11:09,860][46662] Updated weights for policy 0, policy_version 93880 (0.0009) +[2023-10-13 04:11:13,359][46663] Updated weights for policy 1, policy_version 93801 (0.0007) +[2023-10-13 04:11:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 192184320. Throughput: 0: 1671.1, 1: 1675.2. Samples: 48055584. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:11:13,608][45375] Avg episode reward: [(0, '60.190'), (1, '54.950')] +[2023-10-13 04:11:13,726][46663] Updated weights for policy 1, policy_version 93811 (0.0009) +[2023-10-13 04:11:13,923][46662] Updated weights for policy 0, policy_version 93890 (0.0009) +[2023-10-13 04:11:14,095][46663] Updated weights for policy 1, policy_version 93821 (0.0009) +[2023-10-13 04:11:14,290][46662] Updated weights for policy 0, policy_version 93900 (0.0007) +[2023-10-13 04:11:14,656][46662] Updated weights for policy 0, policy_version 93910 (0.0009) +[2023-10-13 04:11:15,022][46662] Updated weights for policy 0, policy_version 93920 (0.0008) +[2023-10-13 04:11:18,225][46663] Updated weights for policy 1, policy_version 93831 (0.0008) +[2023-10-13 04:11:18,596][46663] Updated weights for policy 1, policy_version 93841 (0.0007) +[2023-10-13 04:11:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 192249856. Throughput: 0: 1682.8, 1: 1674.3. Samples: 48076288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:11:18,607][45375] Avg episode reward: [(0, '60.180'), (1, '55.540')] +[2023-10-13 04:11:18,952][46663] Updated weights for policy 1, policy_version 93851 (0.0008) +[2023-10-13 04:11:19,251][46662] Updated weights for policy 0, policy_version 93930 (0.0008) +[2023-10-13 04:11:19,628][46662] Updated weights for policy 0, policy_version 93940 (0.0009) +[2023-10-13 04:11:20,003][46662] Updated weights for policy 0, policy_version 93950 (0.0008) +[2023-10-13 04:11:22,793][46663] Updated weights for policy 1, policy_version 93861 (0.0009) +[2023-10-13 04:11:23,166][46663] Updated weights for policy 1, policy_version 93871 (0.0007) +[2023-10-13 04:11:23,540][46663] Updated weights for policy 1, policy_version 93881 (0.0009) +[2023-10-13 04:11:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 192315392. 
Throughput: 0: 1677.9, 1: 1666.5. Samples: 48096274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:11:23,608][45375] Avg episode reward: [(0, '59.690'), (1, '57.550')] +[2023-10-13 04:11:23,921][46662] Updated weights for policy 0, policy_version 93960 (0.0009) +[2023-10-13 04:11:24,293][46662] Updated weights for policy 0, policy_version 93970 (0.0009) +[2023-10-13 04:11:24,671][46662] Updated weights for policy 0, policy_version 93980 (0.0009) +[2023-10-13 04:11:27,370][46663] Updated weights for policy 1, policy_version 93891 (0.0009) +[2023-10-13 04:11:27,737][46663] Updated weights for policy 1, policy_version 93901 (0.0011) +[2023-10-13 04:11:28,105][46663] Updated weights for policy 1, policy_version 93911 (0.0008) +[2023-10-13 04:11:28,606][45375] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192413696. Throughput: 0: 1679.3, 1: 1683.6. Samples: 48106242. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:28,607][45375] Avg episode reward: [(0, '59.750'), (1, '57.020')] +[2023-10-13 04:11:28,738][46662] Updated weights for policy 0, policy_version 93990 (0.0008) +[2023-10-13 04:11:29,118][46662] Updated weights for policy 0, policy_version 94000 (0.0007) +[2023-10-13 04:11:29,482][46662] Updated weights for policy 0, policy_version 94010 (0.0008) +[2023-10-13 04:11:32,355][46663] Updated weights for policy 1, policy_version 93921 (0.0008) +[2023-10-13 04:11:32,723][46663] Updated weights for policy 1, policy_version 93931 (0.0008) +[2023-10-13 04:11:33,086][46663] Updated weights for policy 1, policy_version 93941 (0.0008) +[2023-10-13 04:11:33,450][46663] Updated weights for policy 1, policy_version 93951 (0.0007) +[2023-10-13 04:11:33,509][46662] Updated weights for policy 0, policy_version 94020 (0.0007) +[2023-10-13 04:11:33,606][45375] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192479232. Throughput: 0: 1678.8, 1: 1683.2. Samples: 48126978. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:33,607][45375] Avg episode reward: [(0, '58.610'), (1, '56.790')] +[2023-10-13 04:11:33,877][46662] Updated weights for policy 0, policy_version 94030 (0.0007) +[2023-10-13 04:11:34,248][46662] Updated weights for policy 0, policy_version 94040 (0.0008) +[2023-10-13 04:11:37,757][46663] Updated weights for policy 1, policy_version 93961 (0.0010) +[2023-10-13 04:11:38,125][46663] Updated weights for policy 1, policy_version 93971 (0.0008) +[2023-10-13 04:11:38,313][46662] Updated weights for policy 0, policy_version 94050 (0.0008) +[2023-10-13 04:11:38,491][46663] Updated weights for policy 1, policy_version 93981 (0.0008) +[2023-10-13 04:11:38,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192544768. Throughput: 0: 1680.6, 1: 1668.5. Samples: 48146774. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:38,608][45375] Avg episode reward: [(0, '59.420'), (1, '56.900')] +[2023-10-13 04:11:38,683][46662] Updated weights for policy 0, policy_version 94060 (0.0008) +[2023-10-13 04:11:39,055][46662] Updated weights for policy 0, policy_version 94070 (0.0007) +[2023-10-13 04:11:39,427][46662] Updated weights for policy 0, policy_version 94080 (0.0009) +[2023-10-13 04:11:42,441][46663] Updated weights for policy 1, policy_version 93991 (0.0010) +[2023-10-13 04:11:42,808][46663] Updated weights for policy 1, policy_version 94001 (0.0007) +[2023-10-13 04:11:43,169][46663] Updated weights for policy 1, policy_version 94011 (0.0010) +[2023-10-13 04:11:43,561][46662] Updated weights for policy 0, policy_version 94090 (0.0009) +[2023-10-13 04:11:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192610304. Throughput: 0: 1685.7, 1: 1691.0. Samples: 48157062. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:43,607][45375] Avg episode reward: [(0, '57.700'), (1, '56.300')] +[2023-10-13 04:11:43,941][46662] Updated weights for policy 0, policy_version 94100 (0.0011) +[2023-10-13 04:11:44,308][46662] Updated weights for policy 0, policy_version 94110 (0.0007) +[2023-10-13 04:11:47,317][46663] Updated weights for policy 1, policy_version 94021 (0.0009) +[2023-10-13 04:11:47,714][46663] Updated weights for policy 1, policy_version 94031 (0.0010) +[2023-10-13 04:11:48,088][46663] Updated weights for policy 1, policy_version 94041 (0.0008) +[2023-10-13 04:11:48,290][46662] Updated weights for policy 0, policy_version 94120 (0.0009) +[2023-10-13 04:11:48,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192675840. Throughput: 0: 1688.3, 1: 1684.4. Samples: 48177430. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:48,608][45375] Avg episode reward: [(0, '58.030'), (1, '55.710')] +[2023-10-13 04:11:48,664][46662] Updated weights for policy 0, policy_version 94130 (0.0009) +[2023-10-13 04:11:49,030][46662] Updated weights for policy 0, policy_version 94140 (0.0007) +[2023-10-13 04:11:51,951][46663] Updated weights for policy 1, policy_version 94051 (0.0009) +[2023-10-13 04:11:52,318][46663] Updated weights for policy 1, policy_version 94061 (0.0008) +[2023-10-13 04:11:52,674][46663] Updated weights for policy 1, policy_version 94071 (0.0008) +[2023-10-13 04:11:53,052][46662] Updated weights for policy 0, policy_version 94150 (0.0007) +[2023-10-13 04:11:53,422][46662] Updated weights for policy 0, policy_version 94160 (0.0007) +[2023-10-13 04:11:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192741376. Throughput: 0: 1688.1, 1: 1673.4. Samples: 48197310. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:53,607][45375] Avg episode reward: [(0, '58.190'), (1, '55.210')] +[2023-10-13 04:11:53,614][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000094080_96337920.pth... +[2023-10-13 04:11:53,644][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000092512_94732288.pth +[2023-10-13 04:11:53,648][46384] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/milestones/checkpoint_000094080_96337920.pth +[2023-10-13 04:11:53,797][46662] Updated weights for policy 0, policy_version 94170 (0.0008) +[2023-10-13 04:11:54,023][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000094176_96436224.pth... +[2023-10-13 04:11:54,051][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000092608_94830592.pth +[2023-10-13 04:11:54,055][46091] Saving a milestone ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/milestones/checkpoint_000094176_96436224.pth +[2023-10-13 04:11:56,831][46663] Updated weights for policy 1, policy_version 94081 (0.0008) +[2023-10-13 04:11:57,200][46663] Updated weights for policy 1, policy_version 94091 (0.0008) +[2023-10-13 04:11:57,563][46663] Updated weights for policy 1, policy_version 94101 (0.0007) +[2023-10-13 04:11:57,866][46662] Updated weights for policy 0, policy_version 94180 (0.0009) +[2023-10-13 04:11:57,939][46663] Updated weights for policy 1, policy_version 94111 (0.0010) +[2023-10-13 04:11:58,224][46662] Updated weights for policy 0, policy_version 94190 (0.0010) +[2023-10-13 04:11:58,594][46662] Updated weights for policy 0, policy_version 94200 (0.0009) +[2023-10-13 04:11:58,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192806912. Throughput: 0: 1688.9, 1: 1690.9. Samples: 48207674. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:11:58,607][45375] Avg episode reward: [(0, '59.880'), (1, '54.640')] +[2023-10-13 04:12:01,971][46663] Updated weights for policy 1, policy_version 94121 (0.0009) +[2023-10-13 04:12:02,333][46663] Updated weights for policy 1, policy_version 94131 (0.0007) +[2023-10-13 04:12:02,693][46663] Updated weights for policy 1, policy_version 94141 (0.0008) +[2023-10-13 04:12:02,711][46662] Updated weights for policy 0, policy_version 94210 (0.0008) +[2023-10-13 04:12:03,085][46662] Updated weights for policy 0, policy_version 94220 (0.0009) +[2023-10-13 04:12:03,458][46662] Updated weights for policy 0, policy_version 94230 (0.0008) +[2023-10-13 04:12:03,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192872448. Throughput: 0: 1685.2, 1: 1676.3. Samples: 48227554. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:12:03,608][45375] Avg episode reward: [(0, '59.170'), (1, '54.060')] +[2023-10-13 04:12:03,831][46662] Updated weights for policy 0, policy_version 94240 (0.0010) +[2023-10-13 04:12:06,614][46663] Updated weights for policy 1, policy_version 94151 (0.0009) +[2023-10-13 04:12:06,989][46663] Updated weights for policy 1, policy_version 94161 (0.0011) +[2023-10-13 04:12:07,354][46663] Updated weights for policy 1, policy_version 94171 (0.0010) +[2023-10-13 04:12:07,964][46662] Updated weights for policy 0, policy_version 94250 (0.0010) +[2023-10-13 04:12:08,343][46662] Updated weights for policy 0, policy_version 94260 (0.0008) +[2023-10-13 04:12:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192937984. Throughput: 0: 1679.8, 1: 1684.7. Samples: 48247674. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:12:08,607][45375] Avg episode reward: [(0, '58.320'), (1, '54.510')] +[2023-10-13 04:12:08,713][46662] Updated weights for policy 0, policy_version 94270 (0.0007) +[2023-10-13 04:12:11,455][46663] Updated weights for policy 1, policy_version 94181 (0.0010) +[2023-10-13 04:12:11,825][46663] Updated weights for policy 1, policy_version 94191 (0.0009) +[2023-10-13 04:12:12,196][46663] Updated weights for policy 1, policy_version 94201 (0.0009) +[2023-10-13 04:12:12,858][46662] Updated weights for policy 0, policy_version 94280 (0.0011) +[2023-10-13 04:12:13,235][46662] Updated weights for policy 0, policy_version 94290 (0.0008) +[2023-10-13 04:12:13,605][46662] Updated weights for policy 0, policy_version 94300 (0.0009) +[2023-10-13 04:12:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 193003520. Throughput: 0: 1680.0, 1: 1691.1. Samples: 48257942. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:12:13,607][45375] Avg episode reward: [(0, '58.240'), (1, '54.460')] +[2023-10-13 04:12:16,264][46663] Updated weights for policy 1, policy_version 94211 (0.0009) +[2023-10-13 04:12:16,635][46663] Updated weights for policy 1, policy_version 94221 (0.0010) +[2023-10-13 04:12:17,000][46663] Updated weights for policy 1, policy_version 94231 (0.0008) +[2023-10-13 04:12:17,678][46662] Updated weights for policy 0, policy_version 94310 (0.0007) +[2023-10-13 04:12:18,058][46662] Updated weights for policy 0, policy_version 94320 (0.0011) +[2023-10-13 04:12:18,424][46662] Updated weights for policy 0, policy_version 94330 (0.0011) +[2023-10-13 04:12:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193069056. Throughput: 0: 1681.6, 1: 1669.8. Samples: 48277788. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-13 04:12:18,607][45375] Avg episode reward: [(0, '61.320'), (1, '56.340')] +[2023-10-13 04:12:21,082][46663] Updated weights for policy 1, policy_version 94241 (0.0009) +[2023-10-13 04:12:21,449][46663] Updated weights for policy 1, policy_version 94251 (0.0009) +[2023-10-13 04:12:21,815][46663] Updated weights for policy 1, policy_version 94261 (0.0007) +[2023-10-13 04:12:22,176][46663] Updated weights for policy 1, policy_version 94271 (0.0010) +[2023-10-13 04:12:22,462][46662] Updated weights for policy 0, policy_version 94340 (0.0008) +[2023-10-13 04:12:22,823][46662] Updated weights for policy 0, policy_version 94350 (0.0010) +[2023-10-13 04:12:23,195][46662] Updated weights for policy 0, policy_version 94360 (0.0010) +[2023-10-13 04:12:23,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 193167360. Throughput: 0: 1670.1, 1: 1688.0. Samples: 48297892. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:23,607][45375] Avg episode reward: [(0, '62.240'), (1, '56.950')] +[2023-10-13 04:12:26,274][46663] Updated weights for policy 1, policy_version 94281 (0.0008) +[2023-10-13 04:12:26,646][46663] Updated weights for policy 1, policy_version 94291 (0.0010) +[2023-10-13 04:12:27,007][46663] Updated weights for policy 1, policy_version 94301 (0.0007) +[2023-10-13 04:12:27,296][46662] Updated weights for policy 0, policy_version 94370 (0.0007) +[2023-10-13 04:12:27,670][46662] Updated weights for policy 0, policy_version 94380 (0.0007) +[2023-10-13 04:12:28,044][46662] Updated weights for policy 0, policy_version 94390 (0.0008) +[2023-10-13 04:12:28,407][46662] Updated weights for policy 0, policy_version 94400 (0.0007) +[2023-10-13 04:12:28,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 193232896. Throughput: 0: 1675.5, 1: 1681.5. Samples: 48308128. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:28,607][45375] Avg episode reward: [(0, '62.400'), (1, '55.090')] +[2023-10-13 04:12:31,053][46663] Updated weights for policy 1, policy_version 94311 (0.0009) +[2023-10-13 04:12:31,413][46663] Updated weights for policy 1, policy_version 94321 (0.0009) +[2023-10-13 04:12:31,775][46663] Updated weights for policy 1, policy_version 94331 (0.0010) +[2023-10-13 04:12:32,473][46662] Updated weights for policy 0, policy_version 94410 (0.0011) +[2023-10-13 04:12:32,847][46662] Updated weights for policy 0, policy_version 94420 (0.0010) +[2023-10-13 04:12:33,221][46662] Updated weights for policy 0, policy_version 94430 (0.0010) +[2023-10-13 04:12:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 193298432. Throughput: 0: 1682.3, 1: 1667.3. Samples: 48328160. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:33,608][45375] Avg episode reward: [(0, '64.180'), (1, '55.140')] +[2023-10-13 04:12:36,128][46663] Updated weights for policy 1, policy_version 94341 (0.0010) +[2023-10-13 04:12:36,522][46663] Updated weights for policy 1, policy_version 94351 (0.0010) +[2023-10-13 04:12:36,887][46663] Updated weights for policy 1, policy_version 94361 (0.0007) +[2023-10-13 04:12:37,290][46662] Updated weights for policy 0, policy_version 94440 (0.0008) +[2023-10-13 04:12:37,658][46662] Updated weights for policy 0, policy_version 94450 (0.0010) +[2023-10-13 04:12:38,038][46662] Updated weights for policy 0, policy_version 94460 (0.0007) +[2023-10-13 04:12:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 193363968. Throughput: 0: 1658.8, 1: 1687.3. Samples: 48347884. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:38,608][45375] Avg episode reward: [(0, '63.930'), (1, '55.460')] +[2023-10-13 04:12:40,897][46663] Updated weights for policy 1, policy_version 94371 (0.0008) +[2023-10-13 04:12:41,274][46663] Updated weights for policy 1, policy_version 94381 (0.0008) +[2023-10-13 04:12:41,630][46663] Updated weights for policy 1, policy_version 94391 (0.0008) +[2023-10-13 04:12:42,058][46662] Updated weights for policy 0, policy_version 94470 (0.0007) +[2023-10-13 04:12:42,421][46662] Updated weights for policy 0, policy_version 94480 (0.0007) +[2023-10-13 04:12:42,785][46662] Updated weights for policy 0, policy_version 94490 (0.0008) +[2023-10-13 04:12:43,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 193429504. Throughput: 0: 1671.8, 1: 1673.6. Samples: 48358218. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:43,607][45375] Avg episode reward: [(0, '64.420'), (1, '55.200')] +[2023-10-13 04:12:45,643][46663] Updated weights for policy 1, policy_version 94401 (0.0008) +[2023-10-13 04:12:46,010][46663] Updated weights for policy 1, policy_version 94411 (0.0008) +[2023-10-13 04:12:46,375][46663] Updated weights for policy 1, policy_version 94421 (0.0008) +[2023-10-13 04:12:46,750][46663] Updated weights for policy 1, policy_version 94431 (0.0007) +[2023-10-13 04:12:46,801][46662] Updated weights for policy 0, policy_version 94500 (0.0010) +[2023-10-13 04:12:47,177][46662] Updated weights for policy 0, policy_version 94510 (0.0010) +[2023-10-13 04:12:47,542][46662] Updated weights for policy 0, policy_version 94520 (0.0008) +[2023-10-13 04:12:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 193495040. Throughput: 0: 1676.5, 1: 1678.9. Samples: 48378548. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:48,607][45375] Avg episode reward: [(0, '63.250'), (1, '54.380')] +[2023-10-13 04:12:50,821][46663] Updated weights for policy 1, policy_version 94441 (0.0009) +[2023-10-13 04:12:51,190][46663] Updated weights for policy 1, policy_version 94451 (0.0008) +[2023-10-13 04:12:51,564][46663] Updated weights for policy 1, policy_version 94461 (0.0008) +[2023-10-13 04:12:51,644][46662] Updated weights for policy 0, policy_version 94530 (0.0010) +[2023-10-13 04:12:52,024][46662] Updated weights for policy 0, policy_version 94540 (0.0008) +[2023-10-13 04:12:52,392][46662] Updated weights for policy 0, policy_version 94550 (0.0008) +[2023-10-13 04:12:52,767][46662] Updated weights for policy 0, policy_version 94560 (0.0009) +[2023-10-13 04:12:53,607][45375] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193560576. Throughput: 0: 1654.8, 1: 1684.8. Samples: 48397952. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:53,608][45375] Avg episode reward: [(0, '62.450'), (1, '53.400')] +[2023-10-13 04:12:55,439][46663] Updated weights for policy 1, policy_version 94471 (0.0007) +[2023-10-13 04:12:55,806][46663] Updated weights for policy 1, policy_version 94481 (0.0007) +[2023-10-13 04:12:56,165][46663] Updated weights for policy 1, policy_version 94491 (0.0008) +[2023-10-13 04:12:56,845][46662] Updated weights for policy 0, policy_version 94570 (0.0007) +[2023-10-13 04:12:57,209][46662] Updated weights for policy 0, policy_version 94580 (0.0007) +[2023-10-13 04:12:57,575][46662] Updated weights for policy 0, policy_version 94590 (0.0007) +[2023-10-13 04:12:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193626112. Throughput: 0: 1681.8, 1: 1659.6. Samples: 48408306. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:12:58,607][45375] Avg episode reward: [(0, '62.850'), (1, '54.380')] +[2023-10-13 04:13:00,233][46663] Updated weights for policy 1, policy_version 94501 (0.0009) +[2023-10-13 04:13:00,598][46663] Updated weights for policy 1, policy_version 94511 (0.0007) +[2023-10-13 04:13:00,976][46663] Updated weights for policy 1, policy_version 94521 (0.0008) +[2023-10-13 04:13:01,588][46662] Updated weights for policy 0, policy_version 94600 (0.0009) +[2023-10-13 04:13:01,961][46662] Updated weights for policy 0, policy_version 94610 (0.0007) +[2023-10-13 04:13:02,326][46662] Updated weights for policy 0, policy_version 94620 (0.0007) +[2023-10-13 04:13:03,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193691648. Throughput: 0: 1669.3, 1: 1680.0. Samples: 48428508. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:13:03,608][45375] Avg episode reward: [(0, '62.090'), (1, '54.650')] +[2023-10-13 04:13:04,982][46663] Updated weights for policy 1, policy_version 94531 (0.0009) +[2023-10-13 04:13:05,352][46663] Updated weights for policy 1, policy_version 94541 (0.0009) +[2023-10-13 04:13:05,719][46663] Updated weights for policy 1, policy_version 94551 (0.0008) +[2023-10-13 04:13:06,511][46662] Updated weights for policy 0, policy_version 94630 (0.0009) +[2023-10-13 04:13:06,870][46662] Updated weights for policy 0, policy_version 94640 (0.0008) +[2023-10-13 04:13:07,240][46662] Updated weights for policy 0, policy_version 94650 (0.0009) +[2023-10-13 04:13:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 193757184. Throughput: 0: 1665.0, 1: 1681.3. Samples: 48448476. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:13:08,607][45375] Avg episode reward: [(0, '62.020'), (1, '53.940')] +[2023-10-13 04:13:09,690][46663] Updated weights for policy 1, policy_version 94561 (0.0009) +[2023-10-13 04:13:10,059][46663] Updated weights for policy 1, policy_version 94571 (0.0010) +[2023-10-13 04:13:10,436][46663] Updated weights for policy 1, policy_version 94581 (0.0007) +[2023-10-13 04:13:10,812][46663] Updated weights for policy 1, policy_version 94591 (0.0009) +[2023-10-13 04:13:11,322][46662] Updated weights for policy 0, policy_version 94660 (0.0008) +[2023-10-13 04:13:11,689][46662] Updated weights for policy 0, policy_version 94670 (0.0007) +[2023-10-13 04:13:12,053][46662] Updated weights for policy 0, policy_version 94680 (0.0008) +[2023-10-13 04:13:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193822720. Throughput: 0: 1684.2, 1: 1664.2. Samples: 48458808. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:13:13,608][45375] Avg episode reward: [(0, '61.470'), (1, '51.890')] +[2023-10-13 04:13:14,924][46663] Updated weights for policy 1, policy_version 94601 (0.0008) +[2023-10-13 04:13:15,287][46663] Updated weights for policy 1, policy_version 94611 (0.0007) +[2023-10-13 04:13:15,656][46663] Updated weights for policy 1, policy_version 94621 (0.0007) +[2023-10-13 04:13:16,186][46662] Updated weights for policy 0, policy_version 94690 (0.0008) +[2023-10-13 04:13:16,561][46662] Updated weights for policy 0, policy_version 94700 (0.0007) +[2023-10-13 04:13:16,925][46662] Updated weights for policy 0, policy_version 94710 (0.0010) +[2023-10-13 04:13:17,306][46662] Updated weights for policy 0, policy_version 94720 (0.0009) +[2023-10-13 04:13:18,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193888256. Throughput: 0: 1664.1, 1: 1687.7. Samples: 48478992. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-13 04:13:18,607][45375] Avg episode reward: [(0, '62.010'), (1, '50.360')] +[2023-10-13 04:13:19,905][46663] Updated weights for policy 1, policy_version 94631 (0.0009) +[2023-10-13 04:13:20,272][46663] Updated weights for policy 1, policy_version 94641 (0.0009) +[2023-10-13 04:13:20,641][46663] Updated weights for policy 1, policy_version 94651 (0.0007) +[2023-10-13 04:13:21,399][46662] Updated weights for policy 0, policy_version 94730 (0.0010) +[2023-10-13 04:13:21,767][46662] Updated weights for policy 0, policy_version 94740 (0.0008) +[2023-10-13 04:13:22,142][46662] Updated weights for policy 0, policy_version 94750 (0.0009) +[2023-10-13 04:13:23,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193953792. Throughput: 0: 1671.4, 1: 1686.0. Samples: 48498964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:23,607][45375] Avg episode reward: [(0, '61.550'), (1, '50.140')] +[2023-10-13 04:13:24,765][46663] Updated weights for policy 1, policy_version 94661 (0.0008) +[2023-10-13 04:13:25,162][46663] Updated weights for policy 1, policy_version 94671 (0.0008) +[2023-10-13 04:13:25,523][46663] Updated weights for policy 1, policy_version 94681 (0.0009) +[2023-10-13 04:13:25,989][46662] Updated weights for policy 0, policy_version 94760 (0.0009) +[2023-10-13 04:13:26,366][46662] Updated weights for policy 0, policy_version 94770 (0.0010) +[2023-10-13 04:13:26,736][46662] Updated weights for policy 0, policy_version 94780 (0.0011) +[2023-10-13 04:13:28,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194019328. Throughput: 0: 1684.4, 1: 1665.3. Samples: 48508956. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:28,607][45375] Avg episode reward: [(0, '60.360'), (1, '52.940')] +[2023-10-13 04:13:29,646][46663] Updated weights for policy 1, policy_version 94691 (0.0008) +[2023-10-13 04:13:30,004][46663] Updated weights for policy 1, policy_version 94701 (0.0008) +[2023-10-13 04:13:30,368][46663] Updated weights for policy 1, policy_version 94711 (0.0010) +[2023-10-13 04:13:30,860][46662] Updated weights for policy 0, policy_version 94790 (0.0009) +[2023-10-13 04:13:31,229][46662] Updated weights for policy 0, policy_version 94800 (0.0008) +[2023-10-13 04:13:31,608][46662] Updated weights for policy 0, policy_version 94810 (0.0007) +[2023-10-13 04:13:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194084864. Throughput: 0: 1661.4, 1: 1674.7. Samples: 48528672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:33,607][45375] Avg episode reward: [(0, '60.110'), (1, '51.930')] +[2023-10-13 04:13:34,460][46663] Updated weights for policy 1, policy_version 94721 (0.0007) +[2023-10-13 04:13:34,827][46663] Updated weights for policy 1, policy_version 94731 (0.0008) +[2023-10-13 04:13:35,196][46663] Updated weights for policy 1, policy_version 94741 (0.0008) +[2023-10-13 04:13:35,560][46663] Updated weights for policy 1, policy_version 94751 (0.0009) +[2023-10-13 04:13:35,819][46662] Updated weights for policy 0, policy_version 94820 (0.0008) +[2023-10-13 04:13:36,190][46662] Updated weights for policy 0, policy_version 94830 (0.0010) +[2023-10-13 04:13:36,571][46662] Updated weights for policy 0, policy_version 94840 (0.0009) +[2023-10-13 04:13:38,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194150400. Throughput: 0: 1684.1, 1: 1675.5. Samples: 48549132. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:38,607][45375] Avg episode reward: [(0, '58.780'), (1, '52.350')] +[2023-10-13 04:13:39,559][46663] Updated weights for policy 1, policy_version 94761 (0.0009) +[2023-10-13 04:13:39,921][46663] Updated weights for policy 1, policy_version 94771 (0.0009) +[2023-10-13 04:13:40,285][46663] Updated weights for policy 1, policy_version 94781 (0.0009) +[2023-10-13 04:13:40,660][46662] Updated weights for policy 0, policy_version 94850 (0.0008) +[2023-10-13 04:13:41,024][46662] Updated weights for policy 0, policy_version 94860 (0.0010) +[2023-10-13 04:13:41,398][46662] Updated weights for policy 0, policy_version 94870 (0.0008) +[2023-10-13 04:13:41,767][46662] Updated weights for policy 0, policy_version 94880 (0.0007) +[2023-10-13 04:13:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 194215936. Throughput: 0: 1676.6, 1: 1676.6. Samples: 48559198. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:43,608][45375] Avg episode reward: [(0, '57.360'), (1, '53.150')] +[2023-10-13 04:13:44,548][46663] Updated weights for policy 1, policy_version 94791 (0.0008) +[2023-10-13 04:13:44,908][46663] Updated weights for policy 1, policy_version 94801 (0.0009) +[2023-10-13 04:13:45,279][46663] Updated weights for policy 1, policy_version 94811 (0.0010) +[2023-10-13 04:13:45,642][46662] Updated weights for policy 0, policy_version 94890 (0.0008) +[2023-10-13 04:13:46,024][46662] Updated weights for policy 0, policy_version 94900 (0.0010) +[2023-10-13 04:13:46,395][46662] Updated weights for policy 0, policy_version 94910 (0.0009) +[2023-10-13 04:13:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194281472. Throughput: 0: 1666.6, 1: 1673.1. Samples: 48578794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:48,607][45375] Avg episode reward: [(0, '57.560'), (1, '52.060')] +[2023-10-13 04:13:49,550][46663] Updated weights for policy 1, policy_version 94821 (0.0008) +[2023-10-13 04:13:49,914][46663] Updated weights for policy 1, policy_version 94831 (0.0009) +[2023-10-13 04:13:50,281][46663] Updated weights for policy 1, policy_version 94841 (0.0009) +[2023-10-13 04:13:50,571][46662] Updated weights for policy 0, policy_version 94920 (0.0007) +[2023-10-13 04:13:50,939][46662] Updated weights for policy 0, policy_version 94930 (0.0007) +[2023-10-13 04:13:51,311][46662] Updated weights for policy 0, policy_version 94940 (0.0011) +[2023-10-13 04:13:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 194347008. Throughput: 0: 1680.8, 1: 1669.2. Samples: 48599228. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:53,608][45375] Avg episode reward: [(0, '56.620'), (1, '51.360')] +[2023-10-13 04:13:53,620][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000094944_97222656.pth... +[2023-10-13 04:13:53,621][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000094848_97124352.pth... 
+[2023-10-13 04:13:53,665][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000093280_95518720.pth +[2023-10-13 04:13:53,668][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000093376_95617024.pth +[2023-10-13 04:13:54,304][46663] Updated weights for policy 1, policy_version 94851 (0.0009) +[2023-10-13 04:13:54,671][46663] Updated weights for policy 1, policy_version 94861 (0.0007) +[2023-10-13 04:13:55,037][46663] Updated weights for policy 1, policy_version 94871 (0.0007) +[2023-10-13 04:13:55,433][46662] Updated weights for policy 0, policy_version 94950 (0.0008) +[2023-10-13 04:13:55,802][46662] Updated weights for policy 0, policy_version 94960 (0.0007) +[2023-10-13 04:13:56,183][46662] Updated weights for policy 0, policy_version 94970 (0.0007) +[2023-10-13 04:13:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194412544. Throughput: 0: 1667.8, 1: 1670.3. Samples: 48609022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:13:58,607][45375] Avg episode reward: [(0, '55.820'), (1, '52.730')] +[2023-10-13 04:13:59,086][46663] Updated weights for policy 1, policy_version 94881 (0.0007) +[2023-10-13 04:13:59,449][46663] Updated weights for policy 1, policy_version 94891 (0.0008) +[2023-10-13 04:13:59,808][46663] Updated weights for policy 1, policy_version 94901 (0.0008) +[2023-10-13 04:14:00,177][46663] Updated weights for policy 1, policy_version 94911 (0.0008) +[2023-10-13 04:14:00,230][46662] Updated weights for policy 0, policy_version 94980 (0.0008) +[2023-10-13 04:14:00,604][46662] Updated weights for policy 0, policy_version 94990 (0.0007) +[2023-10-13 04:14:00,971][46662] Updated weights for policy 0, policy_version 95000 (0.0010) +[2023-10-13 04:14:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194478080. Throughput: 0: 1664.9, 1: 1666.3. Samples: 48628898. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:14:03,608][45375] Avg episode reward: [(0, '56.830'), (1, '53.680')] +[2023-10-13 04:14:04,299][46663] Updated weights for policy 1, policy_version 94921 (0.0008) +[2023-10-13 04:14:04,655][46663] Updated weights for policy 1, policy_version 94931 (0.0007) +[2023-10-13 04:14:04,934][46662] Updated weights for policy 0, policy_version 95010 (0.0008) +[2023-10-13 04:14:05,033][46663] Updated weights for policy 1, policy_version 94941 (0.0008) +[2023-10-13 04:14:05,299][46662] Updated weights for policy 0, policy_version 95020 (0.0008) +[2023-10-13 04:14:05,671][46662] Updated weights for policy 0, policy_version 95030 (0.0007) +[2023-10-13 04:14:06,041][46662] Updated weights for policy 0, policy_version 95040 (0.0008) +[2023-10-13 04:14:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194543616. Throughput: 0: 1683.2, 1: 1663.3. Samples: 48649556. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:14:08,607][45375] Avg episode reward: [(0, '56.450'), (1, '53.390')] +[2023-10-13 04:14:09,213][46663] Updated weights for policy 1, policy_version 94951 (0.0008) +[2023-10-13 04:14:09,568][46663] Updated weights for policy 1, policy_version 94961 (0.0007) +[2023-10-13 04:14:09,946][46663] Updated weights for policy 1, policy_version 94971 (0.0007) +[2023-10-13 04:14:10,238][46662] Updated weights for policy 0, policy_version 95050 (0.0009) +[2023-10-13 04:14:10,597][46662] Updated weights for policy 0, policy_version 95060 (0.0008) +[2023-10-13 04:14:10,973][46662] Updated weights for policy 0, policy_version 95070 (0.0010) +[2023-10-13 04:14:13,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194609152. Throughput: 0: 1659.1, 1: 1669.0. Samples: 48658722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:14:13,607][45375] Avg episode reward: [(0, '54.790'), (1, '54.110')] +[2023-10-13 04:14:14,171][46663] Updated weights for policy 1, policy_version 94981 (0.0009) +[2023-10-13 04:14:14,567][46663] Updated weights for policy 1, policy_version 94991 (0.0009) +[2023-10-13 04:14:14,925][46663] Updated weights for policy 1, policy_version 95001 (0.0009) +[2023-10-13 04:14:15,017][46662] Updated weights for policy 0, policy_version 95080 (0.0008) +[2023-10-13 04:14:15,387][46662] Updated weights for policy 0, policy_version 95090 (0.0007) +[2023-10-13 04:14:15,757][46662] Updated weights for policy 0, policy_version 95100 (0.0007) +[2023-10-13 04:14:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194674688. Throughput: 0: 1676.5, 1: 1664.0. Samples: 48678992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:14:18,607][45375] Avg episode reward: [(0, '54.540'), (1, '53.710')] +[2023-10-13 04:14:18,958][46663] Updated weights for policy 1, policy_version 95011 (0.0008) +[2023-10-13 04:14:19,313][46663] Updated weights for policy 1, policy_version 95021 (0.0011) +[2023-10-13 04:14:19,660][46662] Updated weights for policy 0, policy_version 95110 (0.0009) +[2023-10-13 04:14:19,682][46663] Updated weights for policy 1, policy_version 95031 (0.0009) +[2023-10-13 04:14:20,022][46662] Updated weights for policy 0, policy_version 95120 (0.0007) +[2023-10-13 04:14:20,397][46662] Updated weights for policy 0, policy_version 95130 (0.0010) +[2023-10-13 04:14:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194740224. Throughput: 0: 1682.6, 1: 1657.3. Samples: 48699430. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:23,607][45375] Avg episode reward: [(0, '55.600'), (1, '55.690')] +[2023-10-13 04:14:23,713][46663] Updated weights for policy 1, policy_version 95041 (0.0008) +[2023-10-13 04:14:24,073][46663] Updated weights for policy 1, policy_version 95051 (0.0008) +[2023-10-13 04:14:24,432][46662] Updated weights for policy 0, policy_version 95140 (0.0009) +[2023-10-13 04:14:24,446][46663] Updated weights for policy 1, policy_version 95061 (0.0008) +[2023-10-13 04:14:24,805][46662] Updated weights for policy 0, policy_version 95150 (0.0007) +[2023-10-13 04:14:24,806][46663] Updated weights for policy 1, policy_version 95071 (0.0008) +[2023-10-13 04:14:25,165][46662] Updated weights for policy 0, policy_version 95160 (0.0007) +[2023-10-13 04:14:28,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194805760. Throughput: 0: 1664.3, 1: 1657.0. Samples: 48708656. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:28,608][45375] Avg episode reward: [(0, '55.170'), (1, '57.910')] +[2023-10-13 04:14:28,859][46663] Updated weights for policy 1, policy_version 95081 (0.0007) +[2023-10-13 04:14:29,192][46662] Updated weights for policy 0, policy_version 95170 (0.0007) +[2023-10-13 04:14:29,224][46663] Updated weights for policy 1, policy_version 95091 (0.0007) +[2023-10-13 04:14:29,556][46662] Updated weights for policy 0, policy_version 95180 (0.0008) +[2023-10-13 04:14:29,589][46663] Updated weights for policy 1, policy_version 95101 (0.0009) +[2023-10-13 04:14:29,932][46662] Updated weights for policy 0, policy_version 95190 (0.0008) +[2023-10-13 04:14:30,299][46662] Updated weights for policy 0, policy_version 95200 (0.0009) +[2023-10-13 04:14:33,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194871296. Throughput: 0: 1679.1, 1: 1662.9. Samples: 48729186. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:33,607][45375] Avg episode reward: [(0, '55.890'), (1, '56.450')] +[2023-10-13 04:14:33,680][46663] Updated weights for policy 1, policy_version 95111 (0.0010) +[2023-10-13 04:14:34,046][46663] Updated weights for policy 1, policy_version 95121 (0.0008) +[2023-10-13 04:14:34,417][46663] Updated weights for policy 1, policy_version 95131 (0.0007) +[2023-10-13 04:14:34,579][46662] Updated weights for policy 0, policy_version 95210 (0.0009) +[2023-10-13 04:14:34,941][46662] Updated weights for policy 0, policy_version 95220 (0.0007) +[2023-10-13 04:14:35,314][46662] Updated weights for policy 0, policy_version 95230 (0.0011) +[2023-10-13 04:14:38,569][46663] Updated weights for policy 1, policy_version 95141 (0.0007) +[2023-10-13 04:14:38,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 194936832. Throughput: 0: 1676.0, 1: 1663.1. Samples: 48749486. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:38,607][45375] Avg episode reward: [(0, '55.290'), (1, '55.630')] +[2023-10-13 04:14:38,928][46663] Updated weights for policy 1, policy_version 95151 (0.0007) +[2023-10-13 04:14:39,289][46663] Updated weights for policy 1, policy_version 95161 (0.0009) +[2023-10-13 04:14:39,421][46662] Updated weights for policy 0, policy_version 95240 (0.0009) +[2023-10-13 04:14:39,796][46662] Updated weights for policy 0, policy_version 95250 (0.0009) +[2023-10-13 04:14:40,175][46662] Updated weights for policy 0, policy_version 95260 (0.0008) +[2023-10-13 04:14:43,418][46663] Updated weights for policy 1, policy_version 95171 (0.0009) +[2023-10-13 04:14:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195002368. Throughput: 0: 1660.8, 1: 1662.8. Samples: 48758584. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:43,607][45375] Avg episode reward: [(0, '56.270'), (1, '54.220')] +[2023-10-13 04:14:43,795][46663] Updated weights for policy 1, policy_version 95181 (0.0010) +[2023-10-13 04:14:44,166][46663] Updated weights for policy 1, policy_version 95191 (0.0009) +[2023-10-13 04:14:44,364][46662] Updated weights for policy 0, policy_version 95270 (0.0008) +[2023-10-13 04:14:44,737][46662] Updated weights for policy 0, policy_version 95280 (0.0008) +[2023-10-13 04:14:45,113][46662] Updated weights for policy 0, policy_version 95290 (0.0009) +[2023-10-13 04:14:48,206][46663] Updated weights for policy 1, policy_version 95201 (0.0008) +[2023-10-13 04:14:48,575][46663] Updated weights for policy 1, policy_version 95211 (0.0010) +[2023-10-13 04:14:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195067904. Throughput: 0: 1676.5, 1: 1666.2. Samples: 48779322. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:48,608][45375] Avg episode reward: [(0, '55.590'), (1, '53.220')] +[2023-10-13 04:14:48,941][46663] Updated weights for policy 1, policy_version 95221 (0.0009) +[2023-10-13 04:14:49,153][46662] Updated weights for policy 0, policy_version 95300 (0.0008) +[2023-10-13 04:14:49,311][46663] Updated weights for policy 1, policy_version 95231 (0.0007) +[2023-10-13 04:14:49,524][46662] Updated weights for policy 0, policy_version 95310 (0.0009) +[2023-10-13 04:14:49,905][46662] Updated weights for policy 0, policy_version 95320 (0.0007) +[2023-10-13 04:14:53,489][46663] Updated weights for policy 1, policy_version 95241 (0.0008) +[2023-10-13 04:14:53,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 195133440. Throughput: 0: 1672.4, 1: 1666.5. Samples: 48799810. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:53,608][45375] Avg episode reward: [(0, '56.020'), (1, '52.970')] +[2023-10-13 04:14:53,865][46663] Updated weights for policy 1, policy_version 95251 (0.0008) +[2023-10-13 04:14:53,931][46662] Updated weights for policy 0, policy_version 95330 (0.0007) +[2023-10-13 04:14:54,237][46663] Updated weights for policy 1, policy_version 95261 (0.0007) +[2023-10-13 04:14:54,301][46662] Updated weights for policy 0, policy_version 95340 (0.0008) +[2023-10-13 04:14:54,672][46662] Updated weights for policy 0, policy_version 95350 (0.0007) +[2023-10-13 04:14:55,042][46662] Updated weights for policy 0, policy_version 95360 (0.0008) +[2023-10-13 04:14:58,523][46663] Updated weights for policy 1, policy_version 95271 (0.0010) +[2023-10-13 04:14:58,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195198976. Throughput: 0: 1666.9, 1: 1672.8. Samples: 48809012. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:14:58,607][45375] Avg episode reward: [(0, '58.810'), (1, '54.080')] +[2023-10-13 04:14:58,896][46663] Updated weights for policy 1, policy_version 95281 (0.0009) +[2023-10-13 04:14:59,265][46662] Updated weights for policy 0, policy_version 95370 (0.0009) +[2023-10-13 04:14:59,271][46663] Updated weights for policy 1, policy_version 95291 (0.0008) +[2023-10-13 04:14:59,631][46662] Updated weights for policy 0, policy_version 95380 (0.0008) +[2023-10-13 04:15:00,002][46662] Updated weights for policy 0, policy_version 95390 (0.0008) +[2023-10-13 04:15:03,272][46663] Updated weights for policy 1, policy_version 95301 (0.0009) +[2023-10-13 04:15:03,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195264512. Throughput: 0: 1669.2, 1: 1676.8. Samples: 48829562. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:15:03,607][45375] Avg episode reward: [(0, '58.610'), (1, '53.290')] +[2023-10-13 04:15:03,643][46663] Updated weights for policy 1, policy_version 95311 (0.0007) +[2023-10-13 04:15:04,017][46663] Updated weights for policy 1, policy_version 95321 (0.0008) +[2023-10-13 04:15:04,204][46662] Updated weights for policy 0, policy_version 95400 (0.0008) +[2023-10-13 04:15:04,568][46662] Updated weights for policy 0, policy_version 95410 (0.0008) +[2023-10-13 04:15:04,941][46662] Updated weights for policy 0, policy_version 95420 (0.0011) +[2023-10-13 04:15:08,128][46663] Updated weights for policy 1, policy_version 95331 (0.0009) +[2023-10-13 04:15:08,502][46663] Updated weights for policy 1, policy_version 95341 (0.0009) +[2023-10-13 04:15:08,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195330048. Throughput: 0: 1665.9, 1: 1674.1. Samples: 48849728. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:15:08,607][45375] Avg episode reward: [(0, '58.880'), (1, '53.620')] +[2023-10-13 04:15:08,875][46663] Updated weights for policy 1, policy_version 95351 (0.0009) +[2023-10-13 04:15:09,012][46662] Updated weights for policy 0, policy_version 95430 (0.0008) +[2023-10-13 04:15:09,380][46662] Updated weights for policy 0, policy_version 95440 (0.0008) +[2023-10-13 04:15:09,752][46662] Updated weights for policy 0, policy_version 95450 (0.0008) +[2023-10-13 04:15:13,132][46663] Updated weights for policy 1, policy_version 95361 (0.0008) +[2023-10-13 04:15:13,495][46663] Updated weights for policy 1, policy_version 95371 (0.0011) +[2023-10-13 04:15:13,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195395584. Throughput: 0: 1662.7, 1: 1680.5. Samples: 48859098. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:15:13,607][45375] Avg episode reward: [(0, '58.750'), (1, '54.510')] +[2023-10-13 04:15:13,774][46662] Updated weights for policy 0, policy_version 95460 (0.0008) +[2023-10-13 04:15:13,867][46663] Updated weights for policy 1, policy_version 95381 (0.0009) +[2023-10-13 04:15:14,146][46662] Updated weights for policy 0, policy_version 95470 (0.0007) +[2023-10-13 04:15:14,238][46663] Updated weights for policy 1, policy_version 95391 (0.0008) +[2023-10-13 04:15:14,515][46662] Updated weights for policy 0, policy_version 95480 (0.0007) +[2023-10-13 04:15:18,366][46663] Updated weights for policy 1, policy_version 95401 (0.0010) +[2023-10-13 04:15:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195461120. Throughput: 0: 1670.0, 1: 1677.2. Samples: 48879810. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-13 04:15:18,607][45375] Avg episode reward: [(0, '59.670'), (1, '56.180')] +[2023-10-13 04:15:18,620][46662] Updated weights for policy 0, policy_version 95490 (0.0007) +[2023-10-13 04:15:18,731][46663] Updated weights for policy 1, policy_version 95411 (0.0009) +[2023-10-13 04:15:18,985][46662] Updated weights for policy 0, policy_version 95500 (0.0008) +[2023-10-13 04:15:19,089][46663] Updated weights for policy 1, policy_version 95421 (0.0008) +[2023-10-13 04:15:19,350][46662] Updated weights for policy 0, policy_version 95510 (0.0010) +[2023-10-13 04:15:19,724][46662] Updated weights for policy 0, policy_version 95520 (0.0009) +[2023-10-13 04:15:23,138][46663] Updated weights for policy 1, policy_version 95431 (0.0008) +[2023-10-13 04:15:23,500][46663] Updated weights for policy 1, policy_version 95441 (0.0009) +[2023-10-13 04:15:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195526656. Throughput: 0: 1672.5, 1: 1666.3. Samples: 48899734. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:23,607][45375] Avg episode reward: [(0, '59.780'), (1, '56.370')] +[2023-10-13 04:15:23,850][46662] Updated weights for policy 0, policy_version 95530 (0.0009) +[2023-10-13 04:15:23,873][46663] Updated weights for policy 1, policy_version 95451 (0.0007) +[2023-10-13 04:15:24,219][46662] Updated weights for policy 0, policy_version 95540 (0.0009) +[2023-10-13 04:15:24,587][46662] Updated weights for policy 0, policy_version 95550 (0.0007) +[2023-10-13 04:15:27,916][46663] Updated weights for policy 1, policy_version 95461 (0.0008) +[2023-10-13 04:15:28,290][46663] Updated weights for policy 1, policy_version 95471 (0.0010) +[2023-10-13 04:15:28,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195592192. Throughput: 0: 1675.0, 1: 1674.7. Samples: 48909320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:28,607][45375] Avg episode reward: [(0, '60.830'), (1, '56.810')] +[2023-10-13 04:15:28,651][46663] Updated weights for policy 1, policy_version 95481 (0.0009) +[2023-10-13 04:15:28,657][46662] Updated weights for policy 0, policy_version 95560 (0.0010) +[2023-10-13 04:15:29,029][46662] Updated weights for policy 0, policy_version 95570 (0.0007) +[2023-10-13 04:15:29,390][46662] Updated weights for policy 0, policy_version 95580 (0.0008) +[2023-10-13 04:15:32,440][46663] Updated weights for policy 1, policy_version 95491 (0.0009) +[2023-10-13 04:15:32,801][46663] Updated weights for policy 1, policy_version 95501 (0.0008) +[2023-10-13 04:15:33,159][46663] Updated weights for policy 1, policy_version 95511 (0.0008) +[2023-10-13 04:15:33,536][46662] Updated weights for policy 0, policy_version 95590 (0.0007) +[2023-10-13 04:15:33,607][45375] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195690496. Throughput: 0: 1679.2, 1: 1681.8. Samples: 48930564. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:33,607][45375] Avg episode reward: [(0, '61.800'), (1, '53.410')] +[2023-10-13 04:15:33,905][46662] Updated weights for policy 0, policy_version 95600 (0.0009) +[2023-10-13 04:15:34,264][46662] Updated weights for policy 0, policy_version 95610 (0.0008) +[2023-10-13 04:15:37,163][46663] Updated weights for policy 1, policy_version 95521 (0.0008) +[2023-10-13 04:15:37,521][46663] Updated weights for policy 1, policy_version 95531 (0.0011) +[2023-10-13 04:15:37,885][46663] Updated weights for policy 1, policy_version 95541 (0.0010) +[2023-10-13 04:15:38,254][46663] Updated weights for policy 1, policy_version 95551 (0.0009) +[2023-10-13 04:15:38,282][46662] Updated weights for policy 0, policy_version 95620 (0.0010) +[2023-10-13 04:15:38,607][45375] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195756032. Throughput: 0: 1681.7, 1: 1657.8. Samples: 48950088. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:38,607][45375] Avg episode reward: [(0, '61.840'), (1, '54.510')] +[2023-10-13 04:15:38,649][46662] Updated weights for policy 0, policy_version 95630 (0.0008) +[2023-10-13 04:15:39,018][46662] Updated weights for policy 0, policy_version 95640 (0.0009) +[2023-10-13 04:15:42,357][46663] Updated weights for policy 1, policy_version 95561 (0.0010) +[2023-10-13 04:15:42,719][46663] Updated weights for policy 1, policy_version 95571 (0.0009) +[2023-10-13 04:15:43,082][46663] Updated weights for policy 1, policy_version 95581 (0.0007) +[2023-10-13 04:15:43,175][46662] Updated weights for policy 0, policy_version 95650 (0.0008) +[2023-10-13 04:15:43,546][46662] Updated weights for policy 0, policy_version 95660 (0.0007) +[2023-10-13 04:15:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195821568. Throughput: 0: 1687.7, 1: 1677.9. Samples: 48960464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:43,607][45375] Avg episode reward: [(0, '62.090'), (1, '53.790')] +[2023-10-13 04:15:43,913][46662] Updated weights for policy 0, policy_version 95670 (0.0008) +[2023-10-13 04:15:44,283][46662] Updated weights for policy 0, policy_version 95680 (0.0009) +[2023-10-13 04:15:47,395][46663] Updated weights for policy 1, policy_version 95591 (0.0007) +[2023-10-13 04:15:47,755][46663] Updated weights for policy 1, policy_version 95601 (0.0010) +[2023-10-13 04:15:48,117][46663] Updated weights for policy 1, policy_version 95611 (0.0007) +[2023-10-13 04:15:48,187][46662] Updated weights for policy 0, policy_version 95690 (0.0008) +[2023-10-13 04:15:48,560][46662] Updated weights for policy 0, policy_version 95700 (0.0008) +[2023-10-13 04:15:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 195887104. Throughput: 0: 1691.9, 1: 1668.3. Samples: 48980770. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:48,607][45375] Avg episode reward: [(0, '61.680'), (1, '53.150')] +[2023-10-13 04:15:48,935][46662] Updated weights for policy 0, policy_version 95710 (0.0008) +[2023-10-13 04:15:52,268][46663] Updated weights for policy 1, policy_version 95621 (0.0008) +[2023-10-13 04:15:52,665][46663] Updated weights for policy 1, policy_version 95631 (0.0011) +[2023-10-13 04:15:53,022][46662] Updated weights for policy 0, policy_version 95720 (0.0008) +[2023-10-13 04:15:53,032][46663] Updated weights for policy 1, policy_version 95641 (0.0007) +[2023-10-13 04:15:53,388][46662] Updated weights for policy 0, policy_version 95730 (0.0009) +[2023-10-13 04:15:53,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 195952640. Throughput: 0: 1688.8, 1: 1652.4. Samples: 49000080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:53,607][45375] Avg episode reward: [(0, '60.240'), (1, '55.530')] +[2023-10-13 04:15:53,615][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000095648_97943552.pth... +[2023-10-13 04:15:53,649][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000094080_96337920.pth +[2023-10-13 04:15:53,750][46662] Updated weights for policy 0, policy_version 95740 (0.0009) +[2023-10-13 04:15:53,897][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000095744_98041856.pth... +[2023-10-13 04:15:53,926][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000094176_96436224.pth +[2023-10-13 04:15:57,049][46663] Updated weights for policy 1, policy_version 95651 (0.0009) +[2023-10-13 04:15:57,401][46663] Updated weights for policy 1, policy_version 95661 (0.0009) +[2023-10-13 04:15:57,763][46662] Updated weights for policy 0, policy_version 95750 (0.0009) +[2023-10-13 04:15:57,771][46663] Updated weights for policy 1, policy_version 95671 (0.0009) +[2023-10-13 04:15:58,133][46662] Updated weights for policy 0, policy_version 95760 (0.0007) +[2023-10-13 04:15:58,501][46662] Updated weights for policy 0, policy_version 95770 (0.0007) +[2023-10-13 04:15:58,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196018176. Throughput: 0: 1690.9, 1: 1679.9. Samples: 49010782. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:15:58,607][45375] Avg episode reward: [(0, '59.290'), (1, '56.060')] +[2023-10-13 04:16:01,806][46663] Updated weights for policy 1, policy_version 95681 (0.0009) +[2023-10-13 04:16:02,158][46663] Updated weights for policy 1, policy_version 95691 (0.0008) +[2023-10-13 04:16:02,527][46663] Updated weights for policy 1, policy_version 95701 (0.0009) +[2023-10-13 04:16:02,638][46662] Updated weights for policy 0, policy_version 95780 (0.0008) +[2023-10-13 04:16:02,896][46663] Updated weights for policy 1, policy_version 95711 (0.0007) +[2023-10-13 04:16:03,006][46662] Updated weights for policy 0, policy_version 95790 (0.0008) +[2023-10-13 04:16:03,380][46662] Updated weights for policy 0, policy_version 95800 (0.0009) +[2023-10-13 04:16:03,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196083712. Throughput: 0: 1685.9, 1: 1666.9. Samples: 49030684. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:16:03,607][45375] Avg episode reward: [(0, '59.660'), (1, '56.680')] +[2023-10-13 04:16:06,919][46663] Updated weights for policy 1, policy_version 95721 (0.0007) +[2023-10-13 04:16:07,284][46663] Updated weights for policy 1, policy_version 95731 (0.0007) +[2023-10-13 04:16:07,458][46662] Updated weights for policy 0, policy_version 95810 (0.0008) +[2023-10-13 04:16:07,649][46663] Updated weights for policy 1, policy_version 95741 (0.0009) +[2023-10-13 04:16:07,824][46662] Updated weights for policy 0, policy_version 95820 (0.0007) +[2023-10-13 04:16:08,187][46662] Updated weights for policy 0, policy_version 95830 (0.0007) +[2023-10-13 04:16:08,564][46662] Updated weights for policy 0, policy_version 95840 (0.0009) +[2023-10-13 04:16:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 196182016. Throughput: 0: 1683.3, 1: 1668.9. Samples: 49050582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:16:08,607][45375] Avg episode reward: [(0, '57.660'), (1, '56.130')] +[2023-10-13 04:16:11,671][46663] Updated weights for policy 1, policy_version 95751 (0.0008) +[2023-10-13 04:16:12,044][46663] Updated weights for policy 1, policy_version 95761 (0.0010) +[2023-10-13 04:16:12,397][46663] Updated weights for policy 1, policy_version 95771 (0.0009) +[2023-10-13 04:16:12,505][46662] Updated weights for policy 0, policy_version 95850 (0.0008) +[2023-10-13 04:16:12,879][46662] Updated weights for policy 0, policy_version 95860 (0.0008) +[2023-10-13 04:16:13,251][46662] Updated weights for policy 0, policy_version 95870 (0.0007) +[2023-10-13 04:16:13,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 196247552. Throughput: 0: 1692.7, 1: 1686.3. Samples: 49061374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:16:13,607][45375] Avg episode reward: [(0, '58.410'), (1, '54.980')] +[2023-10-13 04:16:16,516][46663] Updated weights for policy 1, policy_version 95781 (0.0008) +[2023-10-13 04:16:16,890][46663] Updated weights for policy 1, policy_version 95791 (0.0009) +[2023-10-13 04:16:17,256][46663] Updated weights for policy 1, policy_version 95801 (0.0008) +[2023-10-13 04:16:17,404][46662] Updated weights for policy 0, policy_version 95880 (0.0009) +[2023-10-13 04:16:17,774][46662] Updated weights for policy 0, policy_version 95890 (0.0009) +[2023-10-13 04:16:18,152][46662] Updated weights for policy 0, policy_version 95900 (0.0010) +[2023-10-13 04:16:18,606][45375] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 196313088. Throughput: 0: 1691.5, 1: 1655.6. Samples: 49081182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:16:18,607][45375] Avg episode reward: [(0, '57.860'), (1, '54.090')] +[2023-10-13 04:16:21,307][46663] Updated weights for policy 1, policy_version 95811 (0.0009) +[2023-10-13 04:16:21,667][46663] Updated weights for policy 1, policy_version 95821 (0.0009) +[2023-10-13 04:16:22,027][46663] Updated weights for policy 1, policy_version 95831 (0.0009) +[2023-10-13 04:16:22,040][46662] Updated weights for policy 0, policy_version 95910 (0.0008) +[2023-10-13 04:16:22,403][46662] Updated weights for policy 0, policy_version 95920 (0.0008) +[2023-10-13 04:16:22,770][46662] Updated weights for policy 0, policy_version 95930 (0.0010) +[2023-10-13 04:16:23,607][45375] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 196378624. Throughput: 0: 1672.4, 1: 1678.8. Samples: 49100888. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:23,608][45375] Avg episode reward: [(0, '57.290'), (1, '52.600')] +[2023-10-13 04:16:26,162][46663] Updated weights for policy 1, policy_version 95841 (0.0008) +[2023-10-13 04:16:26,536][46663] Updated weights for policy 1, policy_version 95851 (0.0008) +[2023-10-13 04:16:26,732][46662] Updated weights for policy 0, policy_version 95940 (0.0009) +[2023-10-13 04:16:26,904][46663] Updated weights for policy 1, policy_version 95861 (0.0007) +[2023-10-13 04:16:27,095][46662] Updated weights for policy 0, policy_version 95950 (0.0010) +[2023-10-13 04:16:27,260][46663] Updated weights for policy 1, policy_version 95871 (0.0009) +[2023-10-13 04:16:27,460][46662] Updated weights for policy 0, policy_version 95960 (0.0010) +[2023-10-13 04:16:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 196444160. Throughput: 0: 1693.4, 1: 1674.4. Samples: 49112012. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:28,607][45375] Avg episode reward: [(0, '56.280'), (1, '52.770')] +[2023-10-13 04:16:31,158][46663] Updated weights for policy 1, policy_version 95881 (0.0010) +[2023-10-13 04:16:31,529][46663] Updated weights for policy 1, policy_version 95891 (0.0010) +[2023-10-13 04:16:31,632][46662] Updated weights for policy 0, policy_version 95970 (0.0010) +[2023-10-13 04:16:31,901][46663] Updated weights for policy 1, policy_version 95901 (0.0007) +[2023-10-13 04:16:32,003][46662] Updated weights for policy 0, policy_version 95980 (0.0007) +[2023-10-13 04:16:32,366][46662] Updated weights for policy 0, policy_version 95990 (0.0008) +[2023-10-13 04:16:32,729][46662] Updated weights for policy 0, policy_version 96000 (0.0008) +[2023-10-13 04:16:33,607][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196509696. Throughput: 0: 1680.0, 1: 1664.9. Samples: 49131290. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:33,607][45375] Avg episode reward: [(0, '55.550'), (1, '52.870')] +[2023-10-13 04:16:36,090][46663] Updated weights for policy 1, policy_version 95911 (0.0010) +[2023-10-13 04:16:36,454][46663] Updated weights for policy 1, policy_version 95921 (0.0010) +[2023-10-13 04:16:36,818][46662] Updated weights for policy 0, policy_version 96010 (0.0009) +[2023-10-13 04:16:36,818][46663] Updated weights for policy 1, policy_version 95931 (0.0009) +[2023-10-13 04:16:37,198][46662] Updated weights for policy 0, policy_version 96020 (0.0009) +[2023-10-13 04:16:37,559][46662] Updated weights for policy 0, policy_version 96030 (0.0008) +[2023-10-13 04:16:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196575232. Throughput: 0: 1663.4, 1: 1683.8. Samples: 49150702. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:38,607][45375] Avg episode reward: [(0, '55.450'), (1, '53.680')] +[2023-10-13 04:16:41,179][46663] Updated weights for policy 1, policy_version 95941 (0.0008) +[2023-10-13 04:16:41,521][46662] Updated weights for policy 0, policy_version 96040 (0.0009) +[2023-10-13 04:16:41,577][46663] Updated weights for policy 1, policy_version 95951 (0.0009) +[2023-10-13 04:16:41,886][46662] Updated weights for policy 0, policy_version 96050 (0.0008) +[2023-10-13 04:16:41,934][46663] Updated weights for policy 1, policy_version 95961 (0.0007) +[2023-10-13 04:16:42,254][46662] Updated weights for policy 0, policy_version 96060 (0.0007) +[2023-10-13 04:16:43,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196640768. Throughput: 0: 1692.4, 1: 1668.0. Samples: 49162000. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:43,607][45375] Avg episode reward: [(0, '56.010'), (1, '53.030')] +[2023-10-13 04:16:45,999][46663] Updated weights for policy 1, policy_version 95971 (0.0007) +[2023-10-13 04:16:46,364][46663] Updated weights for policy 1, policy_version 95981 (0.0009) +[2023-10-13 04:16:46,410][46662] Updated weights for policy 0, policy_version 96070 (0.0007) +[2023-10-13 04:16:46,722][46663] Updated weights for policy 1, policy_version 95991 (0.0007) +[2023-10-13 04:16:46,772][46662] Updated weights for policy 0, policy_version 96080 (0.0009) +[2023-10-13 04:16:47,136][46662] Updated weights for policy 0, policy_version 96090 (0.0008) +[2023-10-13 04:16:48,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196706304. Throughput: 0: 1679.2, 1: 1664.1. Samples: 49181130. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:48,607][45375] Avg episode reward: [(0, '57.020'), (1, '53.110')] +[2023-10-13 04:16:50,652][46663] Updated weights for policy 1, policy_version 96001 (0.0008) +[2023-10-13 04:16:51,013][46663] Updated weights for policy 1, policy_version 96011 (0.0008) +[2023-10-13 04:16:51,289][46662] Updated weights for policy 0, policy_version 96100 (0.0009) +[2023-10-13 04:16:51,379][46663] Updated weights for policy 1, policy_version 96021 (0.0008) +[2023-10-13 04:16:51,660][46662] Updated weights for policy 0, policy_version 96110 (0.0008) +[2023-10-13 04:16:51,748][46663] Updated weights for policy 1, policy_version 96031 (0.0008) +[2023-10-13 04:16:52,033][46662] Updated weights for policy 0, policy_version 96120 (0.0009) +[2023-10-13 04:16:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196771840. Throughput: 0: 1668.0, 1: 1683.7. Samples: 49201408. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:53,607][45375] Avg episode reward: [(0, '54.900'), (1, '54.410')] +[2023-10-13 04:16:55,736][46663] Updated weights for policy 1, policy_version 96041 (0.0008) +[2023-10-13 04:16:56,109][46663] Updated weights for policy 1, policy_version 96051 (0.0010) +[2023-10-13 04:16:56,132][46662] Updated weights for policy 0, policy_version 96130 (0.0008) +[2023-10-13 04:16:56,478][46663] Updated weights for policy 1, policy_version 96061 (0.0008) +[2023-10-13 04:16:56,509][46662] Updated weights for policy 0, policy_version 96140 (0.0008) +[2023-10-13 04:16:56,881][46662] Updated weights for policy 0, policy_version 96150 (0.0007) +[2023-10-13 04:16:57,252][46662] Updated weights for policy 0, policy_version 96160 (0.0007) +[2023-10-13 04:16:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196837376. Throughput: 0: 1685.5, 1: 1666.2. Samples: 49212198. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:16:58,607][45375] Avg episode reward: [(0, '55.410'), (1, '53.920')] +[2023-10-13 04:17:00,469][46663] Updated weights for policy 1, policy_version 96071 (0.0009) +[2023-10-13 04:17:00,842][46663] Updated weights for policy 1, policy_version 96081 (0.0008) +[2023-10-13 04:17:01,207][46663] Updated weights for policy 1, policy_version 96091 (0.0008) +[2023-10-13 04:17:01,314][46662] Updated weights for policy 0, policy_version 96170 (0.0008) +[2023-10-13 04:17:01,688][46662] Updated weights for policy 0, policy_version 96180 (0.0009) +[2023-10-13 04:17:02,059][46662] Updated weights for policy 0, policy_version 96190 (0.0008) +[2023-10-13 04:17:03,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196902912. Throughput: 0: 1659.6, 1: 1679.8. Samples: 49231458. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:17:03,607][45375] Avg episode reward: [(0, '57.510'), (1, '53.430')] +[2023-10-13 04:17:05,318][46663] Updated weights for policy 1, policy_version 96101 (0.0008) +[2023-10-13 04:17:05,681][46663] Updated weights for policy 1, policy_version 96111 (0.0007) +[2023-10-13 04:17:06,049][46663] Updated weights for policy 1, policy_version 96121 (0.0009) +[2023-10-13 04:17:06,077][46662] Updated weights for policy 0, policy_version 96200 (0.0010) +[2023-10-13 04:17:06,444][46662] Updated weights for policy 0, policy_version 96210 (0.0007) +[2023-10-13 04:17:06,810][46662] Updated weights for policy 0, policy_version 96220 (0.0008) +[2023-10-13 04:17:08,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 196968448. Throughput: 0: 1666.7, 1: 1680.8. Samples: 49251528. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:17:08,607][45375] Avg episode reward: [(0, '57.960'), (1, '54.620')] +[2023-10-13 04:17:10,234][46663] Updated weights for policy 1, policy_version 96131 (0.0008) +[2023-10-13 04:17:10,604][46663] Updated weights for policy 1, policy_version 96141 (0.0008) +[2023-10-13 04:17:10,781][46662] Updated weights for policy 0, policy_version 96230 (0.0008) +[2023-10-13 04:17:10,970][46663] Updated weights for policy 1, policy_version 96151 (0.0008) +[2023-10-13 04:17:11,155][46662] Updated weights for policy 0, policy_version 96240 (0.0007) +[2023-10-13 04:17:11,524][46662] Updated weights for policy 0, policy_version 96250 (0.0007) +[2023-10-13 04:17:13,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197033984. Throughput: 0: 1666.1, 1: 1658.2. Samples: 49261604. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:17:13,608][45375] Avg episode reward: [(0, '57.650'), (1, '55.430')] +[2023-10-13 04:17:14,977][46663] Updated weights for policy 1, policy_version 96161 (0.0009) +[2023-10-13 04:17:15,347][46663] Updated weights for policy 1, policy_version 96171 (0.0009) +[2023-10-13 04:17:15,434][46662] Updated weights for policy 0, policy_version 96260 (0.0007) +[2023-10-13 04:17:15,715][46663] Updated weights for policy 1, policy_version 96181 (0.0007) +[2023-10-13 04:17:15,814][46662] Updated weights for policy 0, policy_version 96270 (0.0008) +[2023-10-13 04:17:16,082][46663] Updated weights for policy 1, policy_version 96191 (0.0008) +[2023-10-13 04:17:16,180][46662] Updated weights for policy 0, policy_version 96280 (0.0010) +[2023-10-13 04:17:18,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197099520. Throughput: 0: 1656.4, 1: 1684.5. Samples: 49281630. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:17:18,607][45375] Avg episode reward: [(0, '60.360'), (1, '56.810')] +[2023-10-13 04:17:20,152][46663] Updated weights for policy 1, policy_version 96201 (0.0008) +[2023-10-13 04:17:20,363][46662] Updated weights for policy 0, policy_version 96290 (0.0010) +[2023-10-13 04:17:20,523][46663] Updated weights for policy 1, policy_version 96211 (0.0008) +[2023-10-13 04:17:20,735][46662] Updated weights for policy 0, policy_version 96300 (0.0008) +[2023-10-13 04:17:20,889][46663] Updated weights for policy 1, policy_version 96221 (0.0009) +[2023-10-13 04:17:21,103][46662] Updated weights for policy 0, policy_version 96310 (0.0008) +[2023-10-13 04:17:21,475][46662] Updated weights for policy 0, policy_version 96320 (0.0008) +[2023-10-13 04:17:23,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 197165056. Throughput: 0: 1678.5, 1: 1689.3. Samples: 49302256. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) +[2023-10-13 04:17:23,607][45375] Avg episode reward: [(0, '61.310'), (1, '56.910')] +[2023-10-13 04:17:24,939][46663] Updated weights for policy 1, policy_version 96231 (0.0007) +[2023-10-13 04:17:25,310][46663] Updated weights for policy 1, policy_version 96241 (0.0008) +[2023-10-13 04:17:25,671][46663] Updated weights for policy 1, policy_version 96251 (0.0009) +[2023-10-13 04:17:25,803][46662] Updated weights for policy 0, policy_version 96330 (0.0009) +[2023-10-13 04:17:26,169][46662] Updated weights for policy 0, policy_version 96340 (0.0008) +[2023-10-13 04:17:26,545][46662] Updated weights for policy 0, policy_version 96350 (0.0010) +[2023-10-13 04:17:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197230592. Throughput: 0: 1661.9, 1: 1670.7. Samples: 49311968. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:28,607][45375] Avg episode reward: [(0, '62.150'), (1, '57.770')] +[2023-10-13 04:17:29,734][46663] Updated weights for policy 1, policy_version 96261 (0.0010) +[2023-10-13 04:17:30,086][46663] Updated weights for policy 1, policy_version 96271 (0.0009) +[2023-10-13 04:17:30,451][46663] Updated weights for policy 1, policy_version 96281 (0.0008) +[2023-10-13 04:17:30,627][46662] Updated weights for policy 0, policy_version 96360 (0.0009) +[2023-10-13 04:17:31,004][46662] Updated weights for policy 0, policy_version 96370 (0.0010) +[2023-10-13 04:17:31,365][46662] Updated weights for policy 0, policy_version 96380 (0.0011) +[2023-10-13 04:17:33,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197296128. Throughput: 0: 1659.6, 1: 1688.4. Samples: 49331794. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:33,607][45375] Avg episode reward: [(0, '61.730'), (1, '58.790')] +[2023-10-13 04:17:34,660][46663] Updated weights for policy 1, policy_version 96291 (0.0009) +[2023-10-13 04:17:35,063][46663] Updated weights for policy 1, policy_version 96301 (0.0010) +[2023-10-13 04:17:35,321][46662] Updated weights for policy 0, policy_version 96390 (0.0008) +[2023-10-13 04:17:35,435][46663] Updated weights for policy 1, policy_version 96311 (0.0007) +[2023-10-13 04:17:35,686][46662] Updated weights for policy 0, policy_version 96400 (0.0008) +[2023-10-13 04:17:36,057][46662] Updated weights for policy 0, policy_version 96410 (0.0010) +[2023-10-13 04:17:38,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197361664. Throughput: 0: 1679.8, 1: 1672.4. Samples: 49352254. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:38,607][45375] Avg episode reward: [(0, '61.870'), (1, '58.210')] +[2023-10-13 04:17:39,584][46663] Updated weights for policy 1, policy_version 96321 (0.0007) +[2023-10-13 04:17:39,951][46663] Updated weights for policy 1, policy_version 96331 (0.0007) +[2023-10-13 04:17:40,231][46662] Updated weights for policy 0, policy_version 96420 (0.0010) +[2023-10-13 04:17:40,317][46663] Updated weights for policy 1, policy_version 96341 (0.0008) +[2023-10-13 04:17:40,600][46662] Updated weights for policy 0, policy_version 96430 (0.0008) +[2023-10-13 04:17:40,686][46663] Updated weights for policy 1, policy_version 96351 (0.0008) +[2023-10-13 04:17:40,971][46662] Updated weights for policy 0, policy_version 96440 (0.0009) +[2023-10-13 04:17:43,607][45375] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 197427200. Throughput: 0: 1660.8, 1: 1668.2. Samples: 49362006. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:43,608][45375] Avg episode reward: [(0, '62.150'), (1, '59.390')] +[2023-10-13 04:17:44,687][46663] Updated weights for policy 1, policy_version 96361 (0.0007) +[2023-10-13 04:17:45,000][46662] Updated weights for policy 0, policy_version 96450 (0.0008) +[2023-10-13 04:17:45,048][46663] Updated weights for policy 1, policy_version 96371 (0.0008) +[2023-10-13 04:17:45,358][46662] Updated weights for policy 0, policy_version 96460 (0.0007) +[2023-10-13 04:17:45,412][46663] Updated weights for policy 1, policy_version 96381 (0.0007) +[2023-10-13 04:17:45,726][46662] Updated weights for policy 0, policy_version 96470 (0.0008) +[2023-10-13 04:17:46,102][46662] Updated weights for policy 0, policy_version 96480 (0.0008) +[2023-10-13 04:17:48,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197492736. Throughput: 0: 1673.1, 1: 1679.5. Samples: 49382324. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:48,607][45375] Avg episode reward: [(0, '61.190'), (1, '60.690')] +[2023-10-13 04:17:49,416][46663] Updated weights for policy 1, policy_version 96391 (0.0008) +[2023-10-13 04:17:49,782][46663] Updated weights for policy 1, policy_version 96401 (0.0007) +[2023-10-13 04:17:50,148][46663] Updated weights for policy 1, policy_version 96411 (0.0008) +[2023-10-13 04:17:50,226][46662] Updated weights for policy 0, policy_version 96490 (0.0009) +[2023-10-13 04:17:50,593][46662] Updated weights for policy 0, policy_version 96500 (0.0008) +[2023-10-13 04:17:50,961][46662] Updated weights for policy 0, policy_version 96510 (0.0010) +[2023-10-13 04:17:53,607][45375] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197558272. Throughput: 0: 1684.6, 1: 1681.4. Samples: 49402996. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:53,607][45375] Avg episode reward: [(0, '60.190'), (1, '61.890')] +[2023-10-13 04:17:53,617][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000096416_98729984.pth... +[2023-10-13 04:17:53,617][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000096512_98828288.pth... 
+[2023-10-13 04:17:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000094944_97222656.pth +[2023-10-13 04:17:53,657][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000094848_97124352.pth +[2023-10-13 04:17:54,262][46663] Updated weights for policy 1, policy_version 96421 (0.0008) +[2023-10-13 04:17:54,635][46663] Updated weights for policy 1, policy_version 96431 (0.0007) +[2023-10-13 04:17:54,903][46662] Updated weights for policy 0, policy_version 96520 (0.0009) +[2023-10-13 04:17:54,996][46663] Updated weights for policy 1, policy_version 96441 (0.0008) +[2023-10-13 04:17:55,283][46662] Updated weights for policy 0, policy_version 96530 (0.0009) +[2023-10-13 04:17:55,655][46662] Updated weights for policy 0, policy_version 96540 (0.0007) +[2023-10-13 04:17:58,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197623808. Throughput: 0: 1661.6, 1: 1680.8. Samples: 49412012. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:17:58,607][45375] Avg episode reward: [(0, '59.520'), (1, '62.100')] +[2023-10-13 04:17:58,608][46384] Saving new best policy, reward=62.100! +[2023-10-13 04:17:59,115][46663] Updated weights for policy 1, policy_version 96451 (0.0009) +[2023-10-13 04:17:59,474][46663] Updated weights for policy 1, policy_version 96461 (0.0011) +[2023-10-13 04:17:59,708][46662] Updated weights for policy 0, policy_version 96550 (0.0008) +[2023-10-13 04:17:59,849][46663] Updated weights for policy 1, policy_version 96471 (0.0009) +[2023-10-13 04:18:00,071][46662] Updated weights for policy 0, policy_version 96560 (0.0008) +[2023-10-13 04:18:00,445][46662] Updated weights for policy 0, policy_version 96570 (0.0011) +[2023-10-13 04:18:03,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197689344. Throughput: 0: 1682.2, 1: 1674.5. Samples: 49432682. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:18:03,607][45375] Avg episode reward: [(0, '60.400'), (1, '61.810')] +[2023-10-13 04:18:04,019][46663] Updated weights for policy 1, policy_version 96481 (0.0010) +[2023-10-13 04:18:04,384][46663] Updated weights for policy 1, policy_version 96491 (0.0008) +[2023-10-13 04:18:04,627][46662] Updated weights for policy 0, policy_version 96580 (0.0008) +[2023-10-13 04:18:04,743][46663] Updated weights for policy 1, policy_version 96501 (0.0007) +[2023-10-13 04:18:05,009][46662] Updated weights for policy 0, policy_version 96590 (0.0010) +[2023-10-13 04:18:05,121][46663] Updated weights for policy 1, policy_version 96511 (0.0008) +[2023-10-13 04:18:05,378][46662] Updated weights for policy 0, policy_version 96600 (0.0007) +[2023-10-13 04:18:08,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197754880. Throughput: 0: 1681.7, 1: 1674.6. Samples: 49453288. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:18:08,608][45375] Avg episode reward: [(0, '59.400'), (1, '62.420')] +[2023-10-13 04:18:08,619][46384] Saving new best policy, reward=62.420! +[2023-10-13 04:18:09,117][46663] Updated weights for policy 1, policy_version 96521 (0.0010) +[2023-10-13 04:18:09,450][46662] Updated weights for policy 0, policy_version 96610 (0.0010) +[2023-10-13 04:18:09,485][46663] Updated weights for policy 1, policy_version 96531 (0.0007) +[2023-10-13 04:18:09,860][46663] Updated weights for policy 1, policy_version 96541 (0.0007) +[2023-10-13 04:18:09,869][46662] Updated weights for policy 0, policy_version 96620 (0.0009) +[2023-10-13 04:18:10,246][46662] Updated weights for policy 0, policy_version 96630 (0.0008) +[2023-10-13 04:18:10,614][46662] Updated weights for policy 0, policy_version 96640 (0.0008) +[2023-10-13 04:18:13,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197820416. Throughput: 0: 1662.7, 1: 1675.8. 
Samples: 49462200. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:18:13,607][45375] Avg episode reward: [(0, '59.280'), (1, '61.870')] +[2023-10-13 04:18:13,855][46663] Updated weights for policy 1, policy_version 96551 (0.0008) +[2023-10-13 04:18:14,220][46663] Updated weights for policy 1, policy_version 96561 (0.0007) +[2023-10-13 04:18:14,585][46663] Updated weights for policy 1, policy_version 96571 (0.0008) +[2023-10-13 04:18:14,659][46662] Updated weights for policy 0, policy_version 96650 (0.0010) +[2023-10-13 04:18:15,029][46662] Updated weights for policy 0, policy_version 96660 (0.0010) +[2023-10-13 04:18:15,415][46662] Updated weights for policy 0, policy_version 96670 (0.0009) +[2023-10-13 04:18:18,586][46663] Updated weights for policy 1, policy_version 96581 (0.0007) +[2023-10-13 04:18:18,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197885952. Throughput: 0: 1679.9, 1: 1681.8. Samples: 49483072. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:18:18,607][45375] Avg episode reward: [(0, '59.380'), (1, '62.460')] +[2023-10-13 04:18:18,956][46663] Updated weights for policy 1, policy_version 96591 (0.0010) +[2023-10-13 04:18:19,331][46663] Updated weights for policy 1, policy_version 96601 (0.0008) +[2023-10-13 04:18:19,567][46662] Updated weights for policy 0, policy_version 96680 (0.0007) +[2023-10-13 04:18:19,582][46384] Saving new best policy, reward=62.460! +[2023-10-13 04:18:19,944][46662] Updated weights for policy 0, policy_version 96690 (0.0010) +[2023-10-13 04:18:20,313][46662] Updated weights for policy 0, policy_version 96700 (0.0009) +[2023-10-13 04:18:23,538][46663] Updated weights for policy 1, policy_version 96611 (0.0009) +[2023-10-13 04:18:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197951488. Throughput: 0: 1671.6, 1: 1688.9. Samples: 49503476. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:18:23,607][45375] Avg episode reward: [(0, '58.980'), (1, '62.390')] +[2023-10-13 04:18:23,936][46663] Updated weights for policy 1, policy_version 96621 (0.0009) +[2023-10-13 04:18:24,310][46663] Updated weights for policy 1, policy_version 96631 (0.0010) +[2023-10-13 04:18:24,561][46662] Updated weights for policy 0, policy_version 96710 (0.0008) +[2023-10-13 04:18:24,933][46662] Updated weights for policy 0, policy_version 96720 (0.0008) +[2023-10-13 04:18:25,299][46662] Updated weights for policy 0, policy_version 96730 (0.0009) +[2023-10-13 04:18:28,127][46663] Updated weights for policy 1, policy_version 96641 (0.0007) +[2023-10-13 04:18:28,497][46663] Updated weights for policy 1, policy_version 96651 (0.0008) +[2023-10-13 04:18:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198017024. Throughput: 0: 1661.6, 1: 1683.2. Samples: 49512518. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-13 04:18:28,607][45375] Avg episode reward: [(0, '57.310'), (1, '61.490')] +[2023-10-13 04:18:28,858][46663] Updated weights for policy 1, policy_version 96661 (0.0009) +[2023-10-13 04:18:29,223][46663] Updated weights for policy 1, policy_version 96671 (0.0010) +[2023-10-13 04:18:29,340][46662] Updated weights for policy 0, policy_version 96740 (0.0007) +[2023-10-13 04:18:29,704][46662] Updated weights for policy 0, policy_version 96750 (0.0008) +[2023-10-13 04:18:30,077][46662] Updated weights for policy 0, policy_version 96760 (0.0007) +[2023-10-13 04:18:33,258][46663] Updated weights for policy 1, policy_version 96681 (0.0009) +[2023-10-13 04:18:33,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198082560. Throughput: 0: 1670.8, 1: 1687.6. Samples: 49533452. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:18:33,608][45375] Avg episode reward: [(0, '56.250'), (1, '61.810')] +[2023-10-13 04:18:33,628][46663] Updated weights for policy 1, policy_version 96691 (0.0008) +[2023-10-13 04:18:33,988][46663] Updated weights for policy 1, policy_version 96701 (0.0007) +[2023-10-13 04:18:34,149][46662] Updated weights for policy 0, policy_version 96770 (0.0008) +[2023-10-13 04:18:34,521][46662] Updated weights for policy 0, policy_version 96780 (0.0009) +[2023-10-13 04:18:34,901][46662] Updated weights for policy 0, policy_version 96790 (0.0007) +[2023-10-13 04:18:35,274][46662] Updated weights for policy 0, policy_version 96800 (0.0009) +[2023-10-13 04:18:38,150][46663] Updated weights for policy 1, policy_version 96711 (0.0007) +[2023-10-13 04:18:38,512][46663] Updated weights for policy 1, policy_version 96721 (0.0007) +[2023-10-13 04:18:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198148096. Throughput: 0: 1671.5, 1: 1676.8. Samples: 49553670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:18:38,607][45375] Avg episode reward: [(0, '56.920'), (1, '63.070')] +[2023-10-13 04:18:38,884][46663] Updated weights for policy 1, policy_version 96731 (0.0008) +[2023-10-13 04:18:39,071][46384] Saving new best policy, reward=63.070! +[2023-10-13 04:18:39,329][46662] Updated weights for policy 0, policy_version 96810 (0.0007) +[2023-10-13 04:18:39,693][46662] Updated weights for policy 0, policy_version 96820 (0.0010) +[2023-10-13 04:18:40,073][46662] Updated weights for policy 0, policy_version 96830 (0.0008) +[2023-10-13 04:18:42,896][46663] Updated weights for policy 1, policy_version 96741 (0.0007) +[2023-10-13 04:18:43,267][46663] Updated weights for policy 1, policy_version 96751 (0.0007) +[2023-10-13 04:18:43,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198213632. Throughput: 0: 1672.5, 1: 1689.4. 
Samples: 49563300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:18:43,608][45375] Avg episode reward: [(0, '57.360'), (1, '63.280')] +[2023-10-13 04:18:43,626][46663] Updated weights for policy 1, policy_version 96761 (0.0008) +[2023-10-13 04:18:43,885][46384] Saving new best policy, reward=63.280! +[2023-10-13 04:18:44,225][46662] Updated weights for policy 0, policy_version 96840 (0.0007) +[2023-10-13 04:18:44,592][46662] Updated weights for policy 0, policy_version 96850 (0.0007) +[2023-10-13 04:18:44,975][46662] Updated weights for policy 0, policy_version 96860 (0.0008) +[2023-10-13 04:18:47,782][46663] Updated weights for policy 1, policy_version 96771 (0.0010) +[2023-10-13 04:18:48,147][46663] Updated weights for policy 1, policy_version 96781 (0.0009) +[2023-10-13 04:18:48,520][46663] Updated weights for policy 1, policy_version 96791 (0.0009) +[2023-10-13 04:18:48,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198279168. Throughput: 0: 1666.4, 1: 1688.8. Samples: 49583668. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:18:48,607][45375] Avg episode reward: [(0, '57.570'), (1, '62.600')] +[2023-10-13 04:18:49,081][46662] Updated weights for policy 0, policy_version 96870 (0.0010) +[2023-10-13 04:18:49,448][46662] Updated weights for policy 0, policy_version 96880 (0.0008) +[2023-10-13 04:18:49,821][46662] Updated weights for policy 0, policy_version 96890 (0.0009) +[2023-10-13 04:18:52,546][46663] Updated weights for policy 1, policy_version 96801 (0.0009) +[2023-10-13 04:18:52,915][46663] Updated weights for policy 1, policy_version 96811 (0.0011) +[2023-10-13 04:18:53,280][46663] Updated weights for policy 1, policy_version 96821 (0.0008) +[2023-10-13 04:18:53,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 198344704. Throughput: 0: 1668.4, 1: 1670.7. Samples: 49603550. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:18:53,608][45375] Avg episode reward: [(0, '58.380'), (1, '60.890')] +[2023-10-13 04:18:53,656][46663] Updated weights for policy 1, policy_version 96831 (0.0009) +[2023-10-13 04:18:53,957][46662] Updated weights for policy 0, policy_version 96900 (0.0008) +[2023-10-13 04:18:54,335][46662] Updated weights for policy 0, policy_version 96910 (0.0010) +[2023-10-13 04:18:54,706][46662] Updated weights for policy 0, policy_version 96920 (0.0008) +[2023-10-13 04:18:57,913][46663] Updated weights for policy 1, policy_version 96841 (0.0010) +[2023-10-13 04:18:58,287][46663] Updated weights for policy 1, policy_version 96851 (0.0009) +[2023-10-13 04:18:58,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198410240. Throughput: 0: 1672.8, 1: 1687.6. Samples: 49613420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:18:58,607][45375] Avg episode reward: [(0, '58.610'), (1, '62.020')] +[2023-10-13 04:18:58,651][46663] Updated weights for policy 1, policy_version 96861 (0.0010) +[2023-10-13 04:18:58,866][46662] Updated weights for policy 0, policy_version 96930 (0.0008) +[2023-10-13 04:18:59,239][46662] Updated weights for policy 0, policy_version 96940 (0.0008) +[2023-10-13 04:18:59,610][46662] Updated weights for policy 0, policy_version 96950 (0.0008) +[2023-10-13 04:18:59,980][46662] Updated weights for policy 0, policy_version 96960 (0.0007) +[2023-10-13 04:19:02,602][46663] Updated weights for policy 1, policy_version 96871 (0.0010) +[2023-10-13 04:19:02,965][46663] Updated weights for policy 1, policy_version 96881 (0.0011) +[2023-10-13 04:19:03,329][46663] Updated weights for policy 1, policy_version 96891 (0.0008) +[2023-10-13 04:19:03,607][45375] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198508544. Throughput: 0: 1671.0, 1: 1680.4. Samples: 49633886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:19:03,607][45375] Avg episode reward: [(0, '58.290'), (1, '62.610')] +[2023-10-13 04:19:03,918][46662] Updated weights for policy 0, policy_version 96970 (0.0008) +[2023-10-13 04:19:04,291][46662] Updated weights for policy 0, policy_version 96980 (0.0009) +[2023-10-13 04:19:04,664][46662] Updated weights for policy 0, policy_version 96990 (0.0009) +[2023-10-13 04:19:07,419][46663] Updated weights for policy 1, policy_version 96901 (0.0009) +[2023-10-13 04:19:07,784][46663] Updated weights for policy 1, policy_version 96911 (0.0008) +[2023-10-13 04:19:08,156][46663] Updated weights for policy 1, policy_version 96921 (0.0007) +[2023-10-13 04:19:08,607][45375] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 198574080. Throughput: 0: 1672.0, 1: 1658.9. Samples: 49653368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:19:08,607][45375] Avg episode reward: [(0, '57.280'), (1, '62.160')] +[2023-10-13 04:19:08,807][46662] Updated weights for policy 0, policy_version 97000 (0.0008) +[2023-10-13 04:19:09,178][46662] Updated weights for policy 0, policy_version 97010 (0.0010) +[2023-10-13 04:19:09,560][46662] Updated weights for policy 0, policy_version 97020 (0.0010) +[2023-10-13 04:19:12,298][46663] Updated weights for policy 1, policy_version 96931 (0.0009) +[2023-10-13 04:19:12,691][46663] Updated weights for policy 1, policy_version 96941 (0.0009) +[2023-10-13 04:19:13,062][46663] Updated weights for policy 1, policy_version 96951 (0.0009) +[2023-10-13 04:19:13,512][46662] Updated weights for policy 0, policy_version 97030 (0.0009) +[2023-10-13 04:19:13,607][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198639616. Throughput: 0: 1672.3, 1: 1686.8. Samples: 49663678. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:19:13,607][45375] Avg episode reward: [(0, '56.890'), (1, '62.330')] +[2023-10-13 04:19:13,875][46662] Updated weights for policy 0, policy_version 97040 (0.0008) +[2023-10-13 04:19:14,247][46662] Updated weights for policy 0, policy_version 97050 (0.0009) +[2023-10-13 04:19:17,117][46663] Updated weights for policy 1, policy_version 96961 (0.0009) +[2023-10-13 04:19:17,479][46663] Updated weights for policy 1, policy_version 96971 (0.0010) +[2023-10-13 04:19:17,857][46663] Updated weights for policy 1, policy_version 96981 (0.0008) +[2023-10-13 04:19:18,118][46662] Updated weights for policy 0, policy_version 97060 (0.0010) +[2023-10-13 04:19:18,224][46663] Updated weights for policy 1, policy_version 96991 (0.0008) +[2023-10-13 04:19:18,492][46662] Updated weights for policy 0, policy_version 97070 (0.0009) +[2023-10-13 04:19:18,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198705152. Throughput: 0: 1680.0, 1: 1668.9. Samples: 49684150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-13 04:19:18,607][45375] Avg episode reward: [(0, '55.830'), (1, '62.500')] +[2023-10-13 04:19:18,866][46662] Updated weights for policy 0, policy_version 97080 (0.0011) +[2023-10-13 04:19:22,288][46663] Updated weights for policy 1, policy_version 97001 (0.0008) +[2023-10-13 04:19:22,655][46663] Updated weights for policy 1, policy_version 97011 (0.0011) +[2023-10-13 04:19:22,954][46662] Updated weights for policy 0, policy_version 97090 (0.0010) +[2023-10-13 04:19:23,019][46663] Updated weights for policy 1, policy_version 97021 (0.0009) +[2023-10-13 04:19:23,324][46662] Updated weights for policy 0, policy_version 97100 (0.0009) +[2023-10-13 04:19:23,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198770688. Throughput: 0: 1678.5, 1: 1658.9. Samples: 49703854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:19:23,607][45375] Avg episode reward: [(0, '56.910'), (1, '62.490')]
+[2023-10-13 04:19:23,686][46662] Updated weights for policy 0, policy_version 97110 (0.0010)
+[2023-10-13 04:19:24,057][46662] Updated weights for policy 0, policy_version 97120 (0.0008)
+[2023-10-13 04:19:27,131][46663] Updated weights for policy 1, policy_version 97031 (0.0009)
+[2023-10-13 04:19:27,495][46663] Updated weights for policy 1, policy_version 97041 (0.0011)
+[2023-10-13 04:19:27,867][46663] Updated weights for policy 1, policy_version 97051 (0.0009)
+[2023-10-13 04:19:28,098][46662] Updated weights for policy 0, policy_version 97130 (0.0010)
+[2023-10-13 04:19:28,464][46662] Updated weights for policy 0, policy_version 97140 (0.0010)
+[2023-10-13 04:19:28,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198836224. Throughput: 0: 1676.6, 1: 1678.0. Samples: 49714258. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:19:28,607][45375] Avg episode reward: [(0, '55.420'), (1, '63.050')]
+[2023-10-13 04:19:28,829][46662] Updated weights for policy 0, policy_version 97150 (0.0008)
+[2023-10-13 04:19:32,111][46663] Updated weights for policy 1, policy_version 97061 (0.0007)
+[2023-10-13 04:19:32,475][46663] Updated weights for policy 1, policy_version 97071 (0.0008)
+[2023-10-13 04:19:32,849][46663] Updated weights for policy 1, policy_version 97081 (0.0010)
+[2023-10-13 04:19:32,906][46662] Updated weights for policy 0, policy_version 97160 (0.0008)
+[2023-10-13 04:19:33,268][46662] Updated weights for policy 0, policy_version 97170 (0.0010)
+[2023-10-13 04:19:33,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198901760. Throughput: 0: 1688.7, 1: 1668.4. Samples: 49734740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:19:33,608][45375] Avg episode reward: [(0, '55.190'), (1, '62.980')]
+[2023-10-13 04:19:33,637][46662] Updated weights for policy 0, policy_version 97180 (0.0010)
+[2023-10-13 04:19:36,947][46663] Updated weights for policy 1, policy_version 97091 (0.0008)
+[2023-10-13 04:19:37,319][46663] Updated weights for policy 1, policy_version 97101 (0.0008)
+[2023-10-13 04:19:37,508][46662] Updated weights for policy 0, policy_version 97190 (0.0010)
+[2023-10-13 04:19:37,688][46663] Updated weights for policy 1, policy_version 97111 (0.0010)
+[2023-10-13 04:19:37,873][46662] Updated weights for policy 0, policy_version 97200 (0.0008)
+[2023-10-13 04:19:38,239][46662] Updated weights for policy 0, policy_version 97210 (0.0009)
+[2023-10-13 04:19:38,606][45375] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199000064. Throughput: 0: 1675.5, 1: 1666.0. Samples: 49753916. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:19:38,607][45375] Avg episode reward: [(0, '54.600'), (1, '61.090')]
+[2023-10-13 04:19:41,628][46663] Updated weights for policy 1, policy_version 97121 (0.0009)
+[2023-10-13 04:19:42,003][46663] Updated weights for policy 1, policy_version 97131 (0.0010)
+[2023-10-13 04:19:42,366][46663] Updated weights for policy 1, policy_version 97141 (0.0010)
+[2023-10-13 04:19:42,541][46662] Updated weights for policy 0, policy_version 97220 (0.0009)
+[2023-10-13 04:19:42,734][46663] Updated weights for policy 1, policy_version 97151 (0.0009)
+[2023-10-13 04:19:42,911][46662] Updated weights for policy 0, policy_version 97230 (0.0009)
+[2023-10-13 04:19:43,284][46662] Updated weights for policy 0, policy_version 97240 (0.0009)
+[2023-10-13 04:19:43,606][45375] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199065600. Throughput: 0: 1683.7, 1: 1676.0. Samples: 49764606. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:19:43,607][45375] Avg episode reward: [(0, '54.210'), (1, '62.870')]
+[2023-10-13 04:19:46,762][46663] Updated weights for policy 1, policy_version 97161 (0.0008)
+[2023-10-13 04:19:47,130][46663] Updated weights for policy 1, policy_version 97171 (0.0008)
+[2023-10-13 04:19:47,482][46662] Updated weights for policy 0, policy_version 97250 (0.0008)
+[2023-10-13 04:19:47,501][46663] Updated weights for policy 1, policy_version 97181 (0.0009)
+[2023-10-13 04:19:47,872][46662] Updated weights for policy 0, policy_version 97260 (0.0009)
+[2023-10-13 04:19:48,235][46662] Updated weights for policy 0, policy_version 97270 (0.0009)
+[2023-10-13 04:19:48,606][45375] Fps is (10 sec: 9830.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199098368. Throughput: 0: 1688.1, 1: 1658.9. Samples: 49784502. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:19:48,607][45375] Avg episode reward: [(0, '54.110'), (1, '61.790')]
+[2023-10-13 04:19:48,615][46662] Updated weights for policy 0, policy_version 97280 (0.0008)
+[2023-10-13 04:19:51,603][46663] Updated weights for policy 1, policy_version 97191 (0.0008)
+[2023-10-13 04:19:51,966][46663] Updated weights for policy 1, policy_version 97201 (0.0007)
+[2023-10-13 04:19:52,334][46663] Updated weights for policy 1, policy_version 97211 (0.0008)
+[2023-10-13 04:19:52,533][46662] Updated weights for policy 0, policy_version 97290 (0.0009)
+[2023-10-13 04:19:52,897][46662] Updated weights for policy 0, policy_version 97300 (0.0008)
+[2023-10-13 04:19:53,262][46662] Updated weights for policy 0, policy_version 97310 (0.0009)
+[2023-10-13 04:19:53,607][45375] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199196672. Throughput: 0: 1678.7, 1: 1678.8. Samples: 49804456. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:19:53,607][45375] Avg episode reward: [(0, '53.860'), (1, '61.330')]
+[2023-10-13 04:19:53,616][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000097312_99647488.pth...
+[2023-10-13 04:19:53,616][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000097216_99549184.pth...
+[2023-10-13 04:19:53,654][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000095648_97943552.pth
+[2023-10-13 04:19:53,656][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000095744_98041856.pth
+[2023-10-13 04:19:56,372][46663] Updated weights for policy 1, policy_version 97221 (0.0009)
+[2023-10-13 04:19:56,749][46663] Updated weights for policy 1, policy_version 97231 (0.0009)
+[2023-10-13 04:19:57,106][46663] Updated weights for policy 1, policy_version 97241 (0.0007)
+[2023-10-13 04:19:57,348][46662] Updated weights for policy 0, policy_version 97320 (0.0009)
+[2023-10-13 04:19:57,715][46662] Updated weights for policy 0, policy_version 97330 (0.0007)
+[2023-10-13 04:19:58,075][46662] Updated weights for policy 0, policy_version 97340 (0.0008)
+[2023-10-13 04:19:58,607][45375] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199262208. Throughput: 0: 1692.6, 1: 1675.1. Samples: 49815226. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:19:58,607][45375] Avg episode reward: [(0, '56.140'), (1, '62.510')]
+[2023-10-13 04:20:01,223][46663] Updated weights for policy 1, policy_version 97251 (0.0008)
+[2023-10-13 04:20:01,638][46663] Updated weights for policy 1, policy_version 97261 (0.0007)
+[2023-10-13 04:20:02,017][46663] Updated weights for policy 1, policy_version 97271 (0.0007)
+[2023-10-13 04:20:02,123][46662] Updated weights for policy 0, policy_version 97350 (0.0008)
+[2023-10-13 04:20:02,489][46662] Updated weights for policy 0, policy_version 97360 (0.0010)
+[2023-10-13 04:20:02,857][46662] Updated weights for policy 0, policy_version 97370 (0.0008)
+[2023-10-13 04:20:03,607][45375] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 199327744. Throughput: 0: 1690.9, 1: 1657.4. Samples: 49834824. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:03,608][45375] Avg episode reward: [(0, '55.290'), (1, '63.360')]
+[2023-10-13 04:20:03,609][46384] Saving new best policy, reward=63.360!
+[2023-10-13 04:20:06,026][46663] Updated weights for policy 1, policy_version 97281 (0.0009)
+[2023-10-13 04:20:06,395][46663] Updated weights for policy 1, policy_version 97291 (0.0011)
+[2023-10-13 04:20:06,759][46663] Updated weights for policy 1, policy_version 97301 (0.0009)
+[2023-10-13 04:20:06,785][46662] Updated weights for policy 0, policy_version 97380 (0.0008)
+[2023-10-13 04:20:07,118][46663] Updated weights for policy 1, policy_version 97311 (0.0010)
+[2023-10-13 04:20:07,147][46662] Updated weights for policy 0, policy_version 97390 (0.0007)
+[2023-10-13 04:20:07,523][46662] Updated weights for policy 0, policy_version 97400 (0.0010)
+[2023-10-13 04:20:08,606][45375] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 199393280. Throughput: 0: 1663.1, 1: 1680.0. Samples: 49854296. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:08,607][45375] Avg episode reward: [(0, '55.280'), (1, '61.230')]
+[2023-10-13 04:20:11,379][46663] Updated weights for policy 1, policy_version 97321 (0.0008)
+[2023-10-13 04:20:11,653][46662] Updated weights for policy 0, policy_version 97410 (0.0008)
+[2023-10-13 04:20:11,738][46663] Updated weights for policy 1, policy_version 97331 (0.0008)
+[2023-10-13 04:20:12,025][46662] Updated weights for policy 0, policy_version 97420 (0.0008)
+[2023-10-13 04:20:12,111][46663] Updated weights for policy 1, policy_version 97341 (0.0010)
+[2023-10-13 04:20:12,397][46662] Updated weights for policy 0, policy_version 97430 (0.0009)
+[2023-10-13 04:20:12,766][46662] Updated weights for policy 0, policy_version 97440 (0.0009)
+[2023-10-13 04:20:13,606][45375] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 199458816. Throughput: 0: 1687.3, 1: 1668.4. Samples: 49865264. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:13,607][45375] Avg episode reward: [(0, '56.010'), (1, '58.820')]
+[2023-10-13 04:20:16,135][46663] Updated weights for policy 1, policy_version 97351 (0.0008)
+[2023-10-13 04:20:16,501][46663] Updated weights for policy 1, policy_version 97361 (0.0007)
+[2023-10-13 04:20:16,872][46663] Updated weights for policy 1, policy_version 97371 (0.0008)
+[2023-10-13 04:20:16,917][46662] Updated weights for policy 0, policy_version 97450 (0.0009)
+[2023-10-13 04:20:17,284][46662] Updated weights for policy 0, policy_version 97460 (0.0009)
+[2023-10-13 04:20:17,648][46662] Updated weights for policy 0, policy_version 97470 (0.0009)
+[2023-10-13 04:20:18,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 199524352. Throughput: 0: 1673.7, 1: 1663.1. Samples: 49884892. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:18,607][45375] Avg episode reward: [(0, '56.550'), (1, '59.320')]
+[2023-10-13 04:20:20,998][46663] Updated weights for policy 1, policy_version 97381 (0.0008)
+[2023-10-13 04:20:21,359][46663] Updated weights for policy 1, policy_version 97391 (0.0009)
+[2023-10-13 04:20:21,676][46662] Updated weights for policy 0, policy_version 97480 (0.0010)
+[2023-10-13 04:20:21,728][46663] Updated weights for policy 1, policy_version 97401 (0.0007)
+[2023-10-13 04:20:22,045][46662] Updated weights for policy 0, policy_version 97490 (0.0009)
+[2023-10-13 04:20:22,420][46662] Updated weights for policy 0, policy_version 97500 (0.0010)
+[2023-10-13 04:20:23,606][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 199589888. Throughput: 0: 1667.4, 1: 1683.3. Samples: 49904698. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:23,607][45375] Avg episode reward: [(0, '56.210'), (1, '60.400')]
+[2023-10-13 04:20:25,779][46663] Updated weights for policy 1, policy_version 97411 (0.0009)
+[2023-10-13 04:20:26,137][46663] Updated weights for policy 1, policy_version 97421 (0.0009)
+[2023-10-13 04:20:26,500][46663] Updated weights for policy 1, policy_version 97431 (0.0007)
+[2023-10-13 04:20:26,543][46662] Updated weights for policy 0, policy_version 97510 (0.0008)
+[2023-10-13 04:20:26,899][46662] Updated weights for policy 0, policy_version 97520 (0.0008)
+[2023-10-13 04:20:27,283][46662] Updated weights for policy 0, policy_version 97530 (0.0008)
+[2023-10-13 04:20:28,607][45375] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199655424. Throughput: 0: 1692.3, 1: 1667.9. Samples: 49915814. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:28,607][45375] Avg episode reward: [(0, '56.550'), (1, '60.670')]
+[2023-10-13 04:20:30,590][46663] Updated weights for policy 1, policy_version 97441 (0.0009)
+[2023-10-13 04:20:30,953][46663] Updated weights for policy 1, policy_version 97451 (0.0009)
+[2023-10-13 04:20:31,323][46663] Updated weights for policy 1, policy_version 97461 (0.0009)
+[2023-10-13 04:20:31,575][46662] Updated weights for policy 0, policy_version 97540 (0.0008)
+[2023-10-13 04:20:31,686][46663] Updated weights for policy 1, policy_version 97471 (0.0009)
+[2023-10-13 04:20:31,958][46662] Updated weights for policy 0, policy_version 97550 (0.0009)
+[2023-10-13 04:20:32,321][46662] Updated weights for policy 0, policy_version 97560 (0.0009)
+[2023-10-13 04:20:33,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 199720960. Throughput: 0: 1676.9, 1: 1672.1. Samples: 49935208. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0)
+[2023-10-13 04:20:33,607][45375] Avg episode reward: [(0, '55.640'), (1, '57.610')]
+[2023-10-13 04:20:35,843][46663] Updated weights for policy 1, policy_version 97481 (0.0010)
+[2023-10-13 04:20:36,206][46663] Updated weights for policy 1, policy_version 97491 (0.0007)
+[2023-10-13 04:20:36,223][46662] Updated weights for policy 0, policy_version 97570 (0.0009)
+[2023-10-13 04:20:36,575][46663] Updated weights for policy 1, policy_version 97501 (0.0007)
+[2023-10-13 04:20:36,589][46662] Updated weights for policy 0, policy_version 97580 (0.0009)
+[2023-10-13 04:20:36,957][46662] Updated weights for policy 0, policy_version 97590 (0.0009)
+[2023-10-13 04:20:37,331][46662] Updated weights for policy 0, policy_version 97600 (0.0010)
+[2023-10-13 04:20:38,607][45375] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199786496. Throughput: 0: 1672.8, 1: 1674.3. Samples: 49955074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:20:38,607][45375] Avg episode reward: [(0, '54.900'), (1, '57.420')]
+[2023-10-13 04:20:40,669][46663] Updated weights for policy 1, policy_version 97511 (0.0008)
+[2023-10-13 04:20:41,037][46663] Updated weights for policy 1, policy_version 97521 (0.0008)
+[2023-10-13 04:20:41,298][46662] Updated weights for policy 0, policy_version 97610 (0.0008)
+[2023-10-13 04:20:41,405][46663] Updated weights for policy 1, policy_version 97531 (0.0007)
+[2023-10-13 04:20:41,663][46662] Updated weights for policy 0, policy_version 97620 (0.0007)
+[2023-10-13 04:20:42,030][46662] Updated weights for policy 0, policy_version 97630 (0.0008)
+[2023-10-13 04:20:43,607][45375] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 199852032. Throughput: 0: 1685.3, 1: 1657.2. Samples: 49965638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:20:43,608][45375] Avg episode reward: [(0, '55.120'), (1, '57.990')]
+[2023-10-13 04:20:45,508][46663] Updated weights for policy 1, policy_version 97541 (0.0009)
+[2023-10-13 04:20:45,882][46663] Updated weights for policy 1, policy_version 97551 (0.0009)
+[2023-10-13 04:20:46,245][46663] Updated weights for policy 1, policy_version 97561 (0.0009)
+[2023-10-13 04:20:46,365][46662] Updated weights for policy 0, policy_version 97640 (0.0009)
+[2023-10-13 04:20:46,728][46662] Updated weights for policy 0, policy_version 97650 (0.0008)
+[2023-10-13 04:20:47,098][46662] Updated weights for policy 0, policy_version 97660 (0.0007)
+[2023-10-13 04:20:48,606][45375] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 199917568. Throughput: 0: 1662.8, 1: 1679.4. Samples: 49985222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:20:48,607][45375] Avg episode reward: [(0, '56.030'), (1, '56.210')]
+[2023-10-13 04:20:50,495][46663] Updated weights for policy 1, policy_version 97571 (0.0007)
+[2023-10-13 04:20:50,901][46663] Updated weights for policy 1, policy_version 97581 (0.0009)
+[2023-10-13 04:20:51,107][46662] Updated weights for policy 0, policy_version 97670 (0.0008)
+[2023-10-13 04:20:51,271][46663] Updated weights for policy 1, policy_version 97591 (0.0007)
+[2023-10-13 04:20:51,464][46662] Updated weights for policy 0, policy_version 97680 (0.0009)
+[2023-10-13 04:20:51,835][46662] Updated weights for policy 0, policy_version 97690 (0.0008)
+[2023-10-13 04:20:53,607][45375] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199983104. Throughput: 0: 1681.6, 1: 1669.3. Samples: 50005088. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:20:53,607][45375] Avg episode reward: [(0, '55.670'), (1, '56.320')]
+[2023-10-13 04:20:55,357][46663] Updated weights for policy 1, policy_version 97601 (0.0007)
+[2023-10-13 04:20:55,719][46663] Updated weights for policy 1, policy_version 97611 (0.0008)
+[2023-10-13 04:20:55,745][46662] Updated weights for policy 0, policy_version 97700 (0.0008)
+[2023-10-13 04:20:56,077][46663] Updated weights for policy 1, policy_version 97621 (0.0007)
+[2023-10-13 04:20:56,112][46662] Updated weights for policy 0, policy_version 97710 (0.0007)
+[2023-10-13 04:20:56,440][46663] Updated weights for policy 1, policy_version 97631 (0.0007)
+[2023-10-13 04:20:56,477][46662] Updated weights for policy 0, policy_version 97720 (0.0008)
+[2023-10-13 04:20:58,607][45375] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 200048640. Throughput: 0: 1683.1, 1: 1656.8. Samples: 50015562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-13 04:20:58,608][45375] Avg episode reward: [(0, '55.410'), (1, '58.440')]
+[2023-10-13 04:21:00,423][46662] Updated weights for policy 0, policy_version 97730 (0.0008)
+[2023-10-13 04:21:00,571][46663] Updated weights for policy 1, policy_version 97641 (0.0008)
+[2023-10-13 04:21:00,801][46662] Updated weights for policy 0, policy_version 97740 (0.0009)
+[2023-10-13 04:21:00,933][46663] Updated weights for policy 1, policy_version 97651 (0.0007)
+[2023-10-13 04:21:01,165][46662] Updated weights for policy 0, policy_version 97750 (0.0009)
+[2023-10-13 04:21:01,301][46663] Updated weights for policy 1, policy_version 97661 (0.0009)
+[2023-10-13 04:21:01,523][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000097664_100007936.pth...
+[2023-10-13 04:21:01,523][46707] Stopping RolloutWorker_w10...
+[2023-10-13 04:21:01,523][46700] Stopping RolloutWorker_w4...
+[2023-10-13 04:21:01,523][46699] Stopping RolloutWorker_w2...
+[2023-10-13 04:21:01,523][46705] Stopping RolloutWorker_w8...
+[2023-10-13 04:21:01,523][46699] Loop rollout_proc2_evt_loop terminating...
+[2023-10-13 04:21:01,523][46700] Loop rollout_proc4_evt_loop terminating...
+[2023-10-13 04:21:01,523][46707] Loop rollout_proc10_evt_loop terminating...
+[2023-10-13 04:21:01,523][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000097760_100106240.pth...
+[2023-10-13 04:21:01,523][46696] Stopping RolloutWorker_w0...
+[2023-10-13 04:21:01,523][46704] Stopping RolloutWorker_w7...
+[2023-10-13 04:21:01,523][46705] Loop rollout_proc8_evt_loop terminating...
+[2023-10-13 04:21:01,524][46704] Loop rollout_proc7_evt_loop terminating...
+[2023-10-13 04:21:01,524][46696] Loop rollout_proc0_evt_loop terminating...
+[2023-10-13 04:21:01,523][46662] Updated weights for policy 0, policy_version 97760 (0.0010)
+[2023-10-13 04:21:01,523][45375] Component RolloutWorker_w8 stopped!
+[2023-10-13 04:21:01,524][46710] Stopping RolloutWorker_w13...
+[2023-10-13 04:21:01,524][45375] Component RolloutWorker_w10 stopped!
+[2023-10-13 04:21:01,524][46710] Loop rollout_proc13_evt_loop terminating...
+[2023-10-13 04:21:01,525][45375] Component RolloutWorker_w4 stopped!
+[2023-10-13 04:21:01,525][45375] Component RolloutWorker_w2 stopped!
+[2023-10-13 04:21:01,526][45375] Component RolloutWorker_w0 stopped!
+[2023-10-13 04:21:01,526][46701] Stopping RolloutWorker_w3...
+[2023-10-13 04:21:01,526][46701] Loop rollout_proc3_evt_loop terminating...
+[2023-10-13 04:21:01,526][45375] Component RolloutWorker_w7 stopped!
+[2023-10-13 04:21:01,526][46706] Stopping RolloutWorker_w9...
+[2023-10-13 04:21:01,527][46706] Loop rollout_proc9_evt_loop terminating...
+[2023-10-13 04:21:01,527][45375] Component Batcher_0 stopped!
+[2023-10-13 04:21:01,527][45375] Component RolloutWorker_w13 stopped!
+[2023-10-13 04:21:01,528][45375] Component RolloutWorker_w3 stopped!
+[2023-10-13 04:21:01,528][46697] Stopping RolloutWorker_w1...
+[2023-10-13 04:21:01,528][46703] Stopping RolloutWorker_w6...
+[2023-10-13 04:21:01,528][46697] Loop rollout_proc1_evt_loop terminating...
+[2023-10-13 04:21:01,528][46709] Stopping RolloutWorker_w11...
+[2023-10-13 04:21:01,528][45375] Component RolloutWorker_w9 stopped!
+[2023-10-13 04:21:01,528][46703] Loop rollout_proc6_evt_loop terminating...
+[2023-10-13 04:21:01,529][46709] Loop rollout_proc11_evt_loop terminating...
+[2023-10-13 04:21:01,529][46708] Stopping RolloutWorker_w12...
+[2023-10-13 04:21:01,529][45375] Component RolloutWorker_w6 stopped!
+[2023-10-13 04:21:01,529][46708] Loop rollout_proc12_evt_loop terminating...
+[2023-10-13 04:21:01,529][45375] Component RolloutWorker_w1 stopped!
+[2023-10-13 04:21:01,530][45375] Component RolloutWorker_w11 stopped!
+[2023-10-13 04:21:01,530][45375] Component RolloutWorker_w12 stopped!
+[2023-10-13 04:21:01,531][45375] Component RolloutWorker_w5 stopped!
+[2023-10-13 04:21:01,531][46702] Stopping RolloutWorker_w5...
+[2023-10-13 04:21:01,531][47476] Stopping RolloutWorker_w14...
+[2023-10-13 04:21:01,531][45375] Component RolloutWorker_w14 stopped!
+[2023-10-13 04:21:01,532][47476] Loop rollout_proc14_evt_loop terminating...
+[2023-10-13 04:21:01,532][46702] Loop rollout_proc5_evt_loop terminating...
+[2023-10-13 04:21:01,532][47477] Stopping RolloutWorker_w15...
+[2023-10-13 04:21:01,532][45375] Component RolloutWorker_w15 stopped!
+[2023-10-13 04:21:01,532][47477] Loop rollout_proc15_evt_loop terminating...
+[2023-10-13 04:21:01,523][46091] Stopping Batcher_0...
+[2023-10-13 04:21:01,536][45375] Component Batcher_1 stopped!
+[2023-10-13 04:21:01,550][46663] Weights refcount: 2 0
+[2023-10-13 04:21:01,552][46663] Stopping InferenceWorker_p1-w0...
+[2023-10-13 04:21:01,552][45375] Component InferenceWorker_p1-w0 stopped!
+[2023-10-13 04:21:01,536][46384] Stopping Batcher_1...
+[2023-10-13 04:21:01,553][46663] Loop inference_proc1-0_evt_loop terminating...
+[2023-10-13 04:21:01,547][46091] Loop batcher_evt_loop terminating...
+[2023-10-13 04:21:01,556][46662] Weights refcount: 2 0
+[2023-10-13 04:21:01,557][46091] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000096512_98828288.pth
+[2023-10-13 04:21:01,558][46662] Stopping InferenceWorker_p0-w0...
+[2023-10-13 04:21:01,558][45375] Component InferenceWorker_p0-w0 stopped!
+[2023-10-13 04:21:01,558][46662] Loop inference_proc0-0_evt_loop terminating...
+[2023-10-13 04:21:01,562][46091] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p0/checkpoint_000097760_100106240.pth...
+[2023-10-13 04:21:01,564][46384] Loop batcher_evt_loop terminating...
+[2023-10-13 04:21:01,567][46384] Removing ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000096416_98729984.pth
+[2023-10-13 04:21:01,573][46384] Saving ./train_atari/atari_kongfumaster_APPO/checkpoint_p1/checkpoint_000097664_100007936.pth...
+[2023-10-13 04:21:01,600][46091] Stopping LearnerWorker_p0...
+[2023-10-13 04:21:01,600][46091] Loop learner_proc0_evt_loop terminating...
+[2023-10-13 04:21:01,600][45375] Component LearnerWorker_p0 stopped!
+[2023-10-13 04:21:01,628][46384] Stopping LearnerWorker_p1...
+[2023-10-13 04:21:01,629][46384] Loop learner_proc1_evt_loop terminating...
+[2023-10-13 04:21:01,629][45375] Component LearnerWorker_p1 stopped!
+[2023-10-13 04:21:01,630][45375] Waiting for process learner_proc0 to stop...
+[2023-10-13 04:21:02,394][45375] Waiting for process learner_proc1 to stop...
+[2023-10-13 04:21:02,570][45375] Waiting for process inference_proc0-0 to join...
+[2023-10-13 04:21:02,571][45375] Waiting for process inference_proc1-0 to join...
+[2023-10-13 04:21:02,572][45375] Waiting for process rollout_proc0 to join...
+[2023-10-13 04:21:02,573][45375] Waiting for process rollout_proc1 to join...
+[2023-10-13 04:21:02,574][45375] Waiting for process rollout_proc2 to join...
+[2023-10-13 04:21:02,575][45375] Waiting for process rollout_proc3 to join...
+[2023-10-13 04:21:02,575][45375] Waiting for process rollout_proc4 to join...
+[2023-10-13 04:21:02,576][45375] Waiting for process rollout_proc5 to join...
+[2023-10-13 04:21:02,577][45375] Waiting for process rollout_proc6 to join...
+[2023-10-13 04:21:02,577][45375] Waiting for process rollout_proc7 to join...
+[2023-10-13 04:21:02,578][45375] Waiting for process rollout_proc8 to join...
+[2023-10-13 04:21:02,578][45375] Waiting for process rollout_proc9 to join...
+[2023-10-13 04:21:02,582][45375] Waiting for process rollout_proc10 to join...
+[2023-10-13 04:21:02,583][45375] Waiting for process rollout_proc11 to join...
+[2023-10-13 04:21:02,584][45375] Waiting for process rollout_proc12 to join...
+[2023-10-13 04:21:02,584][45375] Waiting for process rollout_proc13 to join...
+[2023-10-13 04:21:02,585][45375] Waiting for process rollout_proc14 to join...
+[2023-10-13 04:21:02,585][45375] Waiting for process rollout_proc15 to join...
+[2023-10-13 04:21:02,586][45375] Batcher 0 profile tree view:
+batching: 170.4705, releasing_batches: 0.0907
+[2023-10-13 04:21:02,586][45375] Batcher 1 profile tree view:
+batching: 171.4064, releasing_batches: 0.0892
+[2023-10-13 04:21:02,586][45375] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0006
+  wait_policy_total: 2618.5743
+update_model: 212.4874
+  weight_update: 0.0010
+one_step: 0.0026
+  handle_policy_step: 11418.5827
+    deserialize: 65.2337, stack: 190.0612, obs_to_device_normalize: 2545.6749, forward: 5181.4458, prepare_outputs: 2480.5709, send_messages: 463.5732
+[2023-10-13 04:21:02,586][45375] InferenceWorker_p1-w0 profile tree view:
+wait_policy: 0.0009
+  wait_policy_total: 2591.0630
+update_model: 209.1796
+  weight_update: 0.0010
+one_step: 0.0023
+  handle_policy_step: 11445.5427
+    deserialize: 65.1579, stack: 194.0790, obs_to_device_normalize: 2568.6460, forward: 5166.6228, prepare_outputs: 2490.7134, send_messages: 469.2933
+[2023-10-13 04:21:02,587][45375] Learner 0 profile tree view:
+misc: 0.0190, prepare_batch: 269.4691
+train: 3633.4727
+  epoch_init: 0.1886, minibatch_init: 13.1407, losses_postprocess: 896.3717, kl_divergence: 32.5002, update: 381.6553, after_optimizer: 2123.3892
+  calculate_losses: 169.9903
+    losses_init: 0.3984, forward_head: 59.5656, bptt_initial: 1.4238, bptt: 1.8516, tail: 38.0710, advantages_returns: 11.1356, losses: 43.7074
+[2023-10-13 04:21:02,587][45375] Learner 1 profile tree view:
+misc: 0.0191, prepare_batch: 269.6882
+train: 3606.3003
+  epoch_init: 0.1888, minibatch_init: 12.9215, losses_postprocess: 885.6027, kl_divergence: 31.8461, update: 386.0471, after_optimizer: 2104.6569
+  calculate_losses: 168.2066
+    losses_init: 0.3960, forward_head: 56.0941, bptt_initial: 1.4588, bptt: 2.0244, tail: 38.4161, advantages_returns: 11.2767, losses: 44.8445
+[2023-10-13 04:21:02,587][45375] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 1.2462, enqueue_policy_requests: 412.5194, process_policy_outputs: 192.2199, env_step: 7676.0058, finalize_trajectories: 3.5139, complete_rollouts: 2.9928
+post_env_step: 378.2736
+  process_env_step: 85.7010
+[2023-10-13 04:21:02,587][45375] RolloutWorker_w15 profile tree view:
+wait_for_trajectories: 1.2412, enqueue_policy_requests: 408.7930, process_policy_outputs: 190.0709, env_step: 7631.2909, finalize_trajectories: 3.4855, complete_rollouts: 2.9355
+post_env_step: 379.8862
+  process_env_step: 84.3236
+[2023-10-13 04:21:02,588][45375] Loop Runner_EvtLoop terminating...
+[2023-10-13 04:21:02,588][45375] Runner profile tree view:
+main_loop: 14944.4640
+[2023-10-13 04:21:02,588][45375] Collected {0: 100106240, 1: 100007936}, FPS: 13390.5