[2023-10-12 15:44:22,389][61643] Saving configuration to ./train_atari/atari_kangaroo_APPO/config.json...
[2023-10-12 15:44:22,706][61643] Rollout worker 0 uses device cpu
[2023-10-12 15:44:22,707][61643] Rollout worker 1 uses device cpu
[2023-10-12 15:44:22,707][61643] Rollout worker 2 uses device cpu
[2023-10-12 15:44:22,708][61643] Rollout worker 3 uses device cpu
[2023-10-12 15:44:22,708][61643] Rollout worker 4 uses device cpu
[2023-10-12 15:44:22,709][61643] Rollout worker 5 uses device cpu
[2023-10-12 15:44:22,709][61643] Rollout worker 6 uses device cpu
[2023-10-12 15:44:22,710][61643] Rollout worker 7 uses device cpu
[2023-10-12 15:44:22,710][61643] Rollout worker 8 uses device cpu
[2023-10-12 15:44:22,710][61643] Rollout worker 9 uses device cpu
[2023-10-12 15:44:22,711][61643] Rollout worker 10 uses device cpu
[2023-10-12 15:44:22,711][61643] Rollout worker 11 uses device cpu
[2023-10-12 15:44:22,712][61643] Rollout worker 12 uses device cpu
[2023-10-12 15:44:22,712][61643] Rollout worker 13 uses device cpu
[2023-10-12 15:44:22,712][61643] Rollout worker 14 uses device cpu
[2023-10-12 15:44:22,713][61643] Rollout worker 15 uses device cpu
[2023-10-12 15:44:22,993][61643] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-12 15:44:22,994][61643] InferenceWorker_p0-w0: min num requests: 2
[2023-10-12 15:44:22,997][61643] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-10-12 15:44:22,997][61643] InferenceWorker_p1-w0: min num requests: 2
[2023-10-12 15:44:23,044][61643] Starting all processes...
[2023-10-12 15:44:23,045][61643] Starting process learner_proc0
[2023-10-12 15:44:24,721][61643] Starting process learner_proc1
[2023-10-12 15:44:24,725][62354] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-12 15:44:24,725][62354] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-10-12 15:44:24,744][62354] Num visible devices: 1
[2023-10-12 15:44:24,764][62354] Setting fixed seed 1234
[2023-10-12 15:44:24,766][62354] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-12 15:44:24,766][62354] Initializing actor-critic model on device cuda:0
[2023-10-12 15:44:24,766][62354] RunningMeanStd input shape: (4, 84, 84)
[2023-10-12 15:44:24,767][62354] RunningMeanStd input shape: (1,)
[2023-10-12 15:44:24,785][62354] ConvEncoder: input_channels=4
[2023-10-12 15:44:24,971][62354] Conv encoder output size: 512
[2023-10-12 15:44:24,973][62354] Created Actor Critic model with architecture:
[2023-10-12 15:44:24,973][62354] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=18, bias=True)
  )
)
[2023-10-12 15:44:25,537][62354] Using optimizer
[2023-10-12 15:44:25,538][62354] No checkpoints found
[2023-10-12 15:44:25,538][62354] Did not load from checkpoint, starting from scratch!
[2023-10-12 15:44:25,538][62354] Initialized policy 0 weights for model version 0
[2023-10-12 15:44:25,539][62354] LearnerWorker_p0 finished initialization!
[2023-10-12 15:44:25,540][62354] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-12 15:44:26,492][61643] Starting all processes...
[2023-10-12 15:44:26,495][62495] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-10-12 15:44:26,496][62495] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
[2023-10-12 15:44:26,501][61643] Starting process inference_proc0-0
[2023-10-12 15:44:26,501][61643] Starting process inference_proc1-0
[2023-10-12 15:44:26,514][62495] Num visible devices: 1
[2023-10-12 15:44:26,501][61643] Starting process rollout_proc0
[2023-10-12 15:44:26,502][61643] Starting process rollout_proc1
[2023-10-12 15:44:26,502][61643] Starting process rollout_proc2
[2023-10-12 15:44:26,505][61643] Starting process rollout_proc3
[2023-10-12 15:44:26,507][61643] Starting process rollout_proc4
[2023-10-12 15:44:26,539][62495] Setting fixed seed 1234
[2023-10-12 15:44:26,540][62495] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-10-12 15:44:26,541][62495] Initializing actor-critic model on device cuda:0
[2023-10-12 15:44:26,541][62495] RunningMeanStd input shape: (4, 84, 84)
[2023-10-12 15:44:26,542][62495] RunningMeanStd input shape: (1,)
[2023-10-12 15:44:26,509][61643] Starting process rollout_proc5
[2023-10-12 15:44:26,513][61643] Starting process rollout_proc6
[2023-10-12 15:44:26,514][61643] Starting process rollout_proc7
[2023-10-12 15:44:26,515][61643] Starting process rollout_proc8
[2023-10-12 15:44:26,515][61643] Starting process rollout_proc9
[2023-10-12 15:44:26,553][62495] ConvEncoder: input_channels=4
[2023-10-12 15:44:26,517][61643] Starting process rollout_proc10
[2023-10-12 15:44:26,518][61643] Starting process rollout_proc11
[2023-10-12 15:44:26,521][61643] Starting process rollout_proc12
[2023-10-12 15:44:26,521][61643] Starting process rollout_proc13
[2023-10-12 15:44:26,906][62495] Conv encoder output size: 512
[2023-10-12 15:44:26,909][62495] Created Actor Critic model with architecture:
[2023-10-12 15:44:26,910][62495] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=18, bias=True)
  )
)
[2023-10-12 15:44:27,784][62495] Using optimizer
[2023-10-12 15:44:27,785][62495] No checkpoints found
[2023-10-12 15:44:27,785][62495] Did not load from checkpoint, starting from scratch!
[2023-10-12 15:44:27,785][62495] Initialized policy 1 weights for model version 0 [2023-10-12 15:44:27,787][62495] LearnerWorker_p1 finished initialization! [2023-10-12 15:44:27,787][62495] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-10-12 15:44:28,741][61643] Starting process rollout_proc14 [2023-10-12 15:44:28,746][62673] Worker 5 uses CPU cores [10, 11] [2023-10-12 15:44:28,755][61643] Starting process rollout_proc15 [2023-10-12 15:44:28,762][62672] Worker 3 uses CPU cores [6, 7] [2023-10-12 15:44:28,790][62681] Worker 13 uses CPU cores [26, 27] [2023-10-12 15:44:28,814][62671] Worker 4 uses CPU cores [8, 9] [2023-10-12 15:44:28,817][62680] Worker 12 uses CPU cores [24, 25] [2023-10-12 15:44:28,838][62675] Worker 7 uses CPU cores [14, 15] [2023-10-12 15:44:29,026][62676] Worker 8 uses CPU cores [16, 17] [2023-10-12 15:44:29,055][62674] Worker 6 uses CPU cores [12, 13] [2023-10-12 15:44:29,101][62667] Worker 2 uses CPU cores [4, 5] [2023-10-12 15:44:29,111][62634] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-12 15:44:29,111][62634] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-10-12 15:44:29,130][62634] Num visible devices: 1 [2023-10-12 15:44:29,213][62635] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-10-12 15:44:29,213][62635] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 [2023-10-12 15:44:29,215][62670] Worker 1 uses CPU cores [2, 3] [2023-10-12 15:44:29,231][62635] Num visible devices: 1 [2023-10-12 15:44:29,233][62677] Worker 9 uses CPU cores [18, 19] [2023-10-12 15:44:29,290][62668] Worker 0 uses CPU cores [0, 1] [2023-10-12 15:44:29,465][62679] Worker 11 uses CPU cores [22, 23] [2023-10-12 15:44:29,568][62678] Worker 10 uses CPU cores [20, 21] [2023-10-12 15:44:29,744][62634] RunningMeanStd input shape: (4, 84, 84) [2023-10-12 15:44:29,745][62634] RunningMeanStd input shape: (1,) [2023-10-12 
15:44:29,757][62634] ConvEncoder: input_channels=4 [2023-10-12 15:44:29,826][62635] RunningMeanStd input shape: (4, 84, 84) [2023-10-12 15:44:29,826][62635] RunningMeanStd input shape: (1,) [2023-10-12 15:44:29,838][62635] ConvEncoder: input_channels=4 [2023-10-12 15:44:29,864][62634] Conv encoder output size: 512 [2023-10-12 15:44:29,942][62635] Conv encoder output size: 512 [2023-10-12 15:44:30,702][63415] Worker 15 uses CPU cores [30, 31] [2023-10-12 15:44:30,733][61643] Inference worker 0-0 is ready! [2023-10-12 15:44:30,734][61643] Inference worker 1-0 is ready! [2023-10-12 15:44:30,735][63383] Worker 14 uses CPU cores [28, 29] [2023-10-12 15:44:30,735][61643] All inference workers are ready! Signal rollout workers to start! [2023-10-12 15:44:30,737][62677] EnvRunner 9-0 uses policy 1 [2023-10-12 15:44:30,737][62675] EnvRunner 7-0 uses policy 1 [2023-10-12 15:44:30,737][62676] EnvRunner 8-0 uses policy 0 [2023-10-12 15:44:30,737][62681] EnvRunner 13-0 uses policy 1 [2023-10-12 15:44:30,737][62674] EnvRunner 6-0 uses policy 0 [2023-10-12 15:44:30,737][62680] EnvRunner 12-0 uses policy 0 [2023-10-12 15:44:30,737][62673] EnvRunner 5-0 uses policy 1 [2023-10-12 15:44:30,737][62671] EnvRunner 4-0 uses policy 0 [2023-10-12 15:44:30,737][62679] EnvRunner 11-0 uses policy 1 [2023-10-12 15:44:30,737][62670] EnvRunner 1-0 uses policy 1 [2023-10-12 15:44:30,737][62672] EnvRunner 3-0 uses policy 1 [2023-10-12 15:44:30,737][62668] EnvRunner 0-0 uses policy 0 [2023-10-12 15:44:30,737][61643] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-12 15:44:30,737][62667] EnvRunner 2-0 uses policy 0 [2023-10-12 15:44:30,737][62678] EnvRunner 10-0 uses policy 0 [2023-10-12 15:44:30,863][63383] EnvRunner 14-0 uses policy 0 [2023-10-12 15:44:30,871][63415] EnvRunner 15-0 uses policy 1 [2023-10-12 15:44:32,981][61643] Heartbeat connected on Batcher_0 [2023-10-12 15:44:32,984][61643] Heartbeat connected on LearnerWorker_p0 [2023-10-12 15:44:32,987][61643] Heartbeat connected on Batcher_1 [2023-10-12 15:44:32,990][61643] Heartbeat connected on LearnerWorker_p1 [2023-10-12 15:44:32,996][61643] Heartbeat connected on InferenceWorker_p0-w0 [2023-10-12 15:44:32,999][61643] Heartbeat connected on InferenceWorker_p1-w0 [2023-10-12 15:44:33,001][61643] Heartbeat connected on RolloutWorker_w0 [2023-10-12 15:44:33,007][61643] Heartbeat connected on RolloutWorker_w2 [2023-10-12 15:44:33,009][61643] Heartbeat connected on RolloutWorker_w3 [2023-10-12 15:44:33,010][61643] Heartbeat connected on RolloutWorker_w1 [2023-10-12 15:44:33,012][61643] Heartbeat connected on RolloutWorker_w4 [2023-10-12 15:44:33,019][61643] Heartbeat connected on RolloutWorker_w6 [2023-10-12 15:44:33,020][61643] Heartbeat connected on RolloutWorker_w5 [2023-10-12 15:44:33,021][61643] Heartbeat connected on RolloutWorker_w7 [2023-10-12 15:44:33,025][61643] Heartbeat connected on RolloutWorker_w8 [2023-10-12 15:44:33,028][61643] Heartbeat connected on RolloutWorker_w9 [2023-10-12 15:44:33,030][61643] Heartbeat connected on RolloutWorker_w10 [2023-10-12 15:44:33,035][61643] Heartbeat connected on RolloutWorker_w11 [2023-10-12 15:44:33,036][61643] Heartbeat connected on RolloutWorker_w12 [2023-10-12 15:44:33,038][61643] Heartbeat connected on RolloutWorker_w13 [2023-10-12 15:44:33,040][61643] Heartbeat connected on RolloutWorker_w14 [2023-10-12 15:44:33,047][61643] Heartbeat connected on RolloutWorker_w15 [2023-10-12 15:44:33,435][61643] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). 
Total num frames: 0. Throughput: 0: 513.7, 1: 562.7. Samples: 2904. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-12 15:44:33,436][61643] Avg episode reward: [(0, '0.000'), (1, '0.000')] [2023-10-12 15:44:38,435][61643] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 918.9, 1: 939.5. Samples: 14306. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-12 15:44:38,436][61643] Avg episode reward: [(0, '0.103'), (1, '0.138')] [2023-10-12 15:44:40,616][62635] Updated weights for policy 1, policy_version 10 (0.0009) [2023-10-12 15:44:40,808][62634] Updated weights for policy 0, policy_version 10 (0.0009) [2023-10-12 15:44:40,987][62635] Updated weights for policy 1, policy_version 20 (0.0009) [2023-10-12 15:44:41,186][62634] Updated weights for policy 0, policy_version 20 (0.0009) [2023-10-12 15:44:41,351][62635] Updated weights for policy 1, policy_version 30 (0.0007) [2023-10-12 15:44:41,564][62634] Updated weights for policy 0, policy_version 30 (0.0009) [2023-10-12 15:44:43,435][61643] Fps is (10 sec: 6553.7, 60 sec: 5161.2, 300 sec: 5161.2). Total num frames: 65536. Throughput: 0: 1196.6, 1: 1224.5. Samples: 30742. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 15:44:43,435][61643] Avg episode reward: [(0, '0.110'), (1, '0.131')] [2023-10-12 15:44:44,270][62635] Updated weights for policy 1, policy_version 40 (0.0007) [2023-10-12 15:44:44,297][62634] Updated weights for policy 0, policy_version 40 (0.0009) [2023-10-12 15:44:44,632][62635] Updated weights for policy 1, policy_version 50 (0.0009) [2023-10-12 15:44:44,680][62634] Updated weights for policy 0, policy_version 50 (0.0008) [2023-10-12 15:44:44,991][62635] Updated weights for policy 1, policy_version 60 (0.0007) [2023-10-12 15:44:45,051][62634] Updated weights for policy 0, policy_version 60 (0.0007) [2023-10-12 15:44:48,314][62635] Updated weights for policy 1, policy_version 70 (0.0008) [2023-10-12 15:44:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 7406.1, 300 sec: 7406.1). Total num frames: 131072. Throughput: 0: 1434.6, 1: 1465.9. Samples: 51334. Policy #0 lag: (min: 33.0, avg: 33.0, max: 33.0) [2023-10-12 15:44:48,435][61643] Avg episode reward: [(0, '0.080'), (1, '0.090')] [2023-10-12 15:44:48,580][62634] Updated weights for policy 0, policy_version 70 (0.0008) [2023-10-12 15:44:48,668][62635] Updated weights for policy 1, policy_version 80 (0.0008) [2023-10-12 15:44:48,951][62634] Updated weights for policy 0, policy_version 80 (0.0010) [2023-10-12 15:44:49,037][62635] Updated weights for policy 1, policy_version 90 (0.0009) [2023-10-12 15:44:49,324][62634] Updated weights for policy 0, policy_version 90 (0.0008) [2023-10-12 15:44:52,698][62635] Updated weights for policy 1, policy_version 100 (0.0008) [2023-10-12 15:44:53,064][62635] Updated weights for policy 1, policy_version 110 (0.0007) [2023-10-12 15:44:53,195][62634] Updated weights for policy 0, policy_version 100 (0.0008) [2023-10-12 15:44:53,420][62635] Updated weights for policy 1, policy_version 120 (0.0008) [2023-10-12 15:44:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 8662.0, 300 sec: 8662.0). Total num frames: 196608. 
Throughput: 0: 1315.0, 1: 1349.1. Samples: 60470. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 15:44:53,435][61643] Avg episode reward: [(0, '0.070'), (1, '0.040')] [2023-10-12 15:44:53,566][62634] Updated weights for policy 0, policy_version 110 (0.0010) [2023-10-12 15:44:53,944][62634] Updated weights for policy 0, policy_version 120 (0.0008) [2023-10-12 15:44:57,602][62635] Updated weights for policy 1, policy_version 130 (0.0010) [2023-10-12 15:44:57,962][62635] Updated weights for policy 1, policy_version 140 (0.0009) [2023-10-12 15:44:57,995][62634] Updated weights for policy 0, policy_version 130 (0.0009) [2023-10-12 15:44:58,330][62635] Updated weights for policy 1, policy_version 150 (0.0008) [2023-10-12 15:44:58,371][62634] Updated weights for policy 0, policy_version 140 (0.0008) [2023-10-12 15:44:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 9464.4, 300 sec: 9464.4). Total num frames: 262144. Throughput: 0: 1443.1, 1: 1487.5. Samples: 81170. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 15:44:58,435][61643] Avg episode reward: [(0, '0.060'), (1, '0.040')] [2023-10-12 15:44:58,686][62495] Saving new best policy, reward=0.040! [2023-10-12 15:44:58,688][62635] Updated weights for policy 1, policy_version 160 (0.0008) [2023-10-12 15:44:58,743][62634] Updated weights for policy 0, policy_version 150 (0.0008) [2023-10-12 15:44:59,121][62354] Saving new best policy, reward=0.060! [2023-10-12 15:44:59,121][62634] Updated weights for policy 0, policy_version 160 (0.0008) [2023-10-12 15:45:02,891][62635] Updated weights for policy 1, policy_version 170 (0.0007) [2023-10-12 15:45:03,196][62634] Updated weights for policy 0, policy_version 170 (0.0009) [2023-10-12 15:45:03,257][62635] Updated weights for policy 1, policy_version 180 (0.0007) [2023-10-12 15:45:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 10021.5, 300 sec: 10021.5). Total num frames: 327680. Throughput: 0: 1537.4, 1: 1557.5. Samples: 101196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:45:03,435][61643] Avg episode reward: [(0, '0.040'), (1, '0.060')] [2023-10-12 15:45:03,564][62634] Updated weights for policy 0, policy_version 180 (0.0007) [2023-10-12 15:45:03,611][62635] Updated weights for policy 1, policy_version 190 (0.0008) [2023-10-12 15:45:03,685][62495] Saving new best policy, reward=0.060! [2023-10-12 15:45:03,948][62634] Updated weights for policy 0, policy_version 190 (0.0007) [2023-10-12 15:45:07,632][62635] Updated weights for policy 1, policy_version 200 (0.0008) [2023-10-12 15:45:07,989][62635] Updated weights for policy 1, policy_version 210 (0.0008) [2023-10-12 15:45:08,134][62634] Updated weights for policy 0, policy_version 200 (0.0008) [2023-10-12 15:45:08,355][62635] Updated weights for policy 1, policy_version 220 (0.0008) [2023-10-12 15:45:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 10430.7, 300 sec: 10430.7). Total num frames: 393216. Throughput: 0: 1452.4, 1: 1485.8. Samples: 110762. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:45:08,436][61643] Avg episode reward: [(0, '0.090'), (1, '0.030')] [2023-10-12 15:45:08,509][62634] Updated weights for policy 0, policy_version 210 (0.0009) [2023-10-12 15:45:08,883][62634] Updated weights for policy 0, policy_version 220 (0.0009) [2023-10-12 15:45:09,034][62354] Saving new best policy, reward=0.090! [2023-10-12 15:45:12,285][62635] Updated weights for policy 1, policy_version 230 (0.0008) [2023-10-12 15:45:12,643][62635] Updated weights for policy 1, policy_version 240 (0.0008) [2023-10-12 15:45:12,900][62634] Updated weights for policy 0, policy_version 230 (0.0007) [2023-10-12 15:45:13,020][62635] Updated weights for policy 1, policy_version 250 (0.0007) [2023-10-12 15:45:13,284][62634] Updated weights for policy 0, policy_version 240 (0.0009) [2023-10-12 15:45:13,435][61643] Fps is (10 sec: 16383.9, 60 sec: 11511.6, 300 sec: 11511.6). Total num frames: 491520. 
Throughput: 0: 1525.7, 1: 1559.1. Samples: 131710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:45:13,435][61643] Avg episode reward: [(0, '0.090'), (1, '0.050')] [2023-10-12 15:45:13,660][62634] Updated weights for policy 0, policy_version 250 (0.0009) [2023-10-12 15:45:17,210][62635] Updated weights for policy 1, policy_version 260 (0.0009) [2023-10-12 15:45:17,575][62635] Updated weights for policy 1, policy_version 270 (0.0009) [2023-10-12 15:45:17,689][62634] Updated weights for policy 0, policy_version 260 (0.0009) [2023-10-12 15:45:17,937][62635] Updated weights for policy 1, policy_version 280 (0.0009) [2023-10-12 15:45:18,056][62634] Updated weights for policy 0, policy_version 270 (0.0008) [2023-10-12 15:45:18,432][62634] Updated weights for policy 0, policy_version 280 (0.0009) [2023-10-12 15:45:18,435][61643] Fps is (10 sec: 16383.4, 60 sec: 11678.8, 300 sec: 11678.8). Total num frames: 557056. Throughput: 0: 1638.3, 1: 1649.8. Samples: 150870. Policy #0 lag: (min: 4.0, avg: 6.1, max: 35.0) [2023-10-12 15:45:18,436][61643] Avg episode reward: [(0, '0.050'), (1, '0.080')] [2023-10-12 15:45:18,444][62495] Saving new best policy, reward=0.080! [2023-10-12 15:45:21,937][62635] Updated weights for policy 1, policy_version 290 (0.0008) [2023-10-12 15:45:22,307][62635] Updated weights for policy 1, policy_version 300 (0.0009) [2023-10-12 15:45:22,668][62635] Updated weights for policy 1, policy_version 310 (0.0009) [2023-10-12 15:45:22,759][62634] Updated weights for policy 0, policy_version 290 (0.0009) [2023-10-12 15:45:23,027][62635] Updated weights for policy 1, policy_version 320 (0.0008) [2023-10-12 15:45:23,167][62634] Updated weights for policy 0, policy_version 300 (0.0009) [2023-10-12 15:45:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 11814.4, 300 sec: 11814.4). Total num frames: 622592. Throughput: 0: 1621.8, 1: 1646.5. Samples: 161378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:45:23,435][61643] Avg episode reward: [(0, '0.050'), (1, '0.060')] [2023-10-12 15:45:23,535][62634] Updated weights for policy 0, policy_version 310 (0.0008) [2023-10-12 15:45:23,914][62634] Updated weights for policy 0, policy_version 320 (0.0009) [2023-10-12 15:45:27,039][62635] Updated weights for policy 1, policy_version 330 (0.0008) [2023-10-12 15:45:27,405][62635] Updated weights for policy 1, policy_version 340 (0.0010) [2023-10-12 15:45:27,768][62635] Updated weights for policy 1, policy_version 350 (0.0008) [2023-10-12 15:45:27,879][62634] Updated weights for policy 0, policy_version 330 (0.0007) [2023-10-12 15:45:28,264][62634] Updated weights for policy 0, policy_version 340 (0.0007) [2023-10-12 15:45:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 11926.4, 300 sec: 11926.4). Total num frames: 688128. Throughput: 0: 1665.4, 1: 1681.0. Samples: 181332. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 15:45:28,436][61643] Avg episode reward: [(0, '0.050'), (1, '0.080')] [2023-10-12 15:45:28,645][62634] Updated weights for policy 0, policy_version 350 (0.0008) [2023-10-12 15:45:31,641][62635] Updated weights for policy 1, policy_version 360 (0.0009) [2023-10-12 15:45:31,999][62635] Updated weights for policy 1, policy_version 370 (0.0008) [2023-10-12 15:45:32,374][62635] Updated weights for policy 1, policy_version 380 (0.0007) [2023-10-12 15:45:32,598][62634] Updated weights for policy 0, policy_version 360 (0.0007) [2023-10-12 15:45:32,983][62634] Updated weights for policy 0, policy_version 370 (0.0007) [2023-10-12 15:45:33,351][62634] Updated weights for policy 0, policy_version 380 (0.0007) [2023-10-12 15:45:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 12561.1, 300 sec: 12020.6). Total num frames: 753664. Throughput: 0: 1656.1, 1: 1664.8. Samples: 200774. 
Policy #0 lag: (min: 26.0, avg: 27.1, max: 47.0) [2023-10-12 15:45:33,435][61643] Avg episode reward: [(0, '0.020'), (1, '0.090')] [2023-10-12 15:45:33,444][62495] Saving new best policy, reward=0.090! [2023-10-12 15:45:36,372][62635] Updated weights for policy 1, policy_version 390 (0.0008) [2023-10-12 15:45:36,742][62635] Updated weights for policy 1, policy_version 400 (0.0008) [2023-10-12 15:45:37,114][62635] Updated weights for policy 1, policy_version 410 (0.0007) [2023-10-12 15:45:37,380][62634] Updated weights for policy 0, policy_version 390 (0.0008) [2023-10-12 15:45:37,759][62634] Updated weights for policy 0, policy_version 400 (0.0007) [2023-10-12 15:45:38,133][62634] Updated weights for policy 0, policy_version 410 (0.0007) [2023-10-12 15:45:38,435][61643] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 12584.9). Total num frames: 851968. Throughput: 0: 1674.7, 1: 1691.6. Samples: 211954. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 15:45:38,435][61643] Avg episode reward: [(0, '0.060'), (1, '0.050')] [2023-10-12 15:45:41,123][62635] Updated weights for policy 1, policy_version 420 (0.0007) [2023-10-12 15:45:41,497][62635] Updated weights for policy 1, policy_version 430 (0.0008) [2023-10-12 15:45:41,861][62635] Updated weights for policy 1, policy_version 440 (0.0008) [2023-10-12 15:45:42,141][62634] Updated weights for policy 0, policy_version 420 (0.0008) [2023-10-12 15:45:42,527][62634] Updated weights for policy 0, policy_version 430 (0.0010) [2023-10-12 15:45:42,897][62634] Updated weights for policy 0, policy_version 440 (0.0008) [2023-10-12 15:45:43,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 12620.8). Total num frames: 917504. Throughput: 0: 1678.0, 1: 1667.6. Samples: 231722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:45:43,436][61643] Avg episode reward: [(0, '0.080'), (1, '0.060')] [2023-10-12 15:45:45,869][62635] Updated weights for policy 1, policy_version 450 (0.0007) [2023-10-12 15:45:46,225][62635] Updated weights for policy 1, policy_version 460 (0.0008) [2023-10-12 15:45:46,607][62635] Updated weights for policy 1, policy_version 470 (0.0008) [2023-10-12 15:45:46,961][62635] Updated weights for policy 1, policy_version 480 (0.0008) [2023-10-12 15:45:46,994][62634] Updated weights for policy 0, policy_version 450 (0.0009) [2023-10-12 15:45:47,369][62634] Updated weights for policy 0, policy_version 460 (0.0009) [2023-10-12 15:45:47,735][62634] Updated weights for policy 0, policy_version 470 (0.0007) [2023-10-12 15:45:48,116][62634] Updated weights for policy 0, policy_version 480 (0.0008) [2023-10-12 15:45:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 12652.1). Total num frames: 983040. Throughput: 0: 1658.9, 1: 1682.5. Samples: 251560. Policy #0 lag: (min: 1.0, avg: 7.3, max: 33.0) [2023-10-12 15:45:48,436][61643] Avg episode reward: [(0, '0.050'), (1, '0.030')] [2023-10-12 15:45:51,119][62635] Updated weights for policy 1, policy_version 490 (0.0007) [2023-10-12 15:45:51,491][62635] Updated weights for policy 1, policy_version 500 (0.0007) [2023-10-12 15:45:51,852][62635] Updated weights for policy 1, policy_version 510 (0.0007) [2023-10-12 15:45:52,073][62634] Updated weights for policy 0, policy_version 490 (0.0009) [2023-10-12 15:45:52,458][62634] Updated weights for policy 0, policy_version 500 (0.0009) [2023-10-12 15:45:52,842][62634] Updated weights for policy 0, policy_version 510 (0.0009) [2023-10-12 15:45:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 12679.6). Total num frames: 1048576. Throughput: 0: 1684.9, 1: 1689.9. Samples: 262630. 
Policy #0 lag: (min: 26.0, avg: 33.9, max: 58.0) [2023-10-12 15:45:53,435][61643] Avg episode reward: [(0, '0.090'), (1, '0.040')] [2023-10-12 15:45:55,812][62635] Updated weights for policy 1, policy_version 520 (0.0008) [2023-10-12 15:45:56,177][62635] Updated weights for policy 1, policy_version 530 (0.0010) [2023-10-12 15:45:56,545][62635] Updated weights for policy 1, policy_version 540 (0.0009) [2023-10-12 15:45:56,825][62634] Updated weights for policy 0, policy_version 520 (0.0010) [2023-10-12 15:45:57,186][62634] Updated weights for policy 0, policy_version 530 (0.0007) [2023-10-12 15:45:57,567][62634] Updated weights for policy 0, policy_version 540 (0.0008) [2023-10-12 15:45:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 12704.0). Total num frames: 1114112. Throughput: 0: 1676.5, 1: 1667.6. Samples: 282198. Policy #0 lag: (min: 31.0, avg: 32.6, max: 57.0) [2023-10-12 15:45:58,436][61643] Avg episode reward: [(0, '0.100'), (1, '0.040')] [2023-10-12 15:45:58,437][62354] Saving new best policy, reward=0.100! [2023-10-12 15:46:00,683][62635] Updated weights for policy 1, policy_version 550 (0.0008) [2023-10-12 15:46:01,054][62635] Updated weights for policy 1, policy_version 560 (0.0008) [2023-10-12 15:46:01,415][62635] Updated weights for policy 1, policy_version 570 (0.0009) [2023-10-12 15:46:01,629][62634] Updated weights for policy 0, policy_version 550 (0.0008) [2023-10-12 15:46:02,006][62634] Updated weights for policy 0, policy_version 560 (0.0010) [2023-10-12 15:46:02,374][62634] Updated weights for policy 0, policy_version 570 (0.0010) [2023-10-12 15:46:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 12725.7). Total num frames: 1179648. Throughput: 0: 1662.3, 1: 1696.6. Samples: 302018. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:46:03,435][61643] Avg episode reward: [(0, '0.070'), (1, '0.060')] [2023-10-12 15:46:05,513][62635] Updated weights for policy 1, policy_version 580 (0.0009) [2023-10-12 15:46:05,884][62635] Updated weights for policy 1, policy_version 590 (0.0007) [2023-10-12 15:46:06,255][62635] Updated weights for policy 1, policy_version 600 (0.0009) [2023-10-12 15:46:06,524][62634] Updated weights for policy 0, policy_version 580 (0.0008) [2023-10-12 15:46:06,898][62634] Updated weights for policy 0, policy_version 590 (0.0007) [2023-10-12 15:46:07,267][62634] Updated weights for policy 0, policy_version 600 (0.0009) [2023-10-12 15:46:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 12745.3). Total num frames: 1245184. Throughput: 0: 1681.2, 1: 1687.3. Samples: 312958. Policy #0 lag: (min: 17.0, avg: 19.6, max: 47.0) [2023-10-12 15:46:08,436][61643] Avg episode reward: [(0, '0.030'), (1, '0.080')] [2023-10-12 15:46:10,490][62635] Updated weights for policy 1, policy_version 610 (0.0008) [2023-10-12 15:46:10,919][62635] Updated weights for policy 1, policy_version 620 (0.0009) [2023-10-12 15:46:11,290][62635] Updated weights for policy 1, policy_version 630 (0.0009) [2023-10-12 15:46:11,458][62634] Updated weights for policy 0, policy_version 610 (0.0009) [2023-10-12 15:46:11,660][62635] Updated weights for policy 1, policy_version 640 (0.0009) [2023-10-12 15:46:11,859][62634] Updated weights for policy 0, policy_version 620 (0.0009) [2023-10-12 15:46:12,234][62634] Updated weights for policy 0, policy_version 630 (0.0010) [2023-10-12 15:46:12,619][62634] Updated weights for policy 0, policy_version 640 (0.0008) [2023-10-12 15:46:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 12762.9). Total num frames: 1310720. Throughput: 0: 1669.8, 1: 1675.4. Samples: 331864. 
Policy #0 lag: (min: 28.0, avg: 28.2, max: 36.0) [2023-10-12 15:46:13,435][61643] Avg episode reward: [(0, '0.060'), (1, '0.070')] [2023-10-12 15:46:15,772][62635] Updated weights for policy 1, policy_version 650 (0.0007) [2023-10-12 15:46:16,132][62635] Updated weights for policy 1, policy_version 660 (0.0009) [2023-10-12 15:46:16,487][62635] Updated weights for policy 1, policy_version 670 (0.0008) [2023-10-12 15:46:16,534][62634] Updated weights for policy 0, policy_version 650 (0.0009) [2023-10-12 15:46:16,905][62634] Updated weights for policy 0, policy_version 660 (0.0010) [2023-10-12 15:46:17,289][62634] Updated weights for policy 0, policy_version 670 (0.0007) [2023-10-12 15:46:18,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 12778.8). Total num frames: 1376256. Throughput: 0: 1671.2, 1: 1686.8. Samples: 351884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:46:18,436][61643] Avg episode reward: [(0, '0.110'), (1, '0.060')] [2023-10-12 15:46:18,449][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000000672_688128.pth... [2023-10-12 15:46:18,449][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000000672_688128.pth... [2023-10-12 15:46:18,491][62354] Saving new best policy, reward=0.110! [2023-10-12 15:46:20,715][62635] Updated weights for policy 1, policy_version 680 (0.0010) [2023-10-12 15:46:21,084][62635] Updated weights for policy 1, policy_version 690 (0.0009) [2023-10-12 15:46:21,447][62635] Updated weights for policy 1, policy_version 700 (0.0007) [2023-10-12 15:46:21,452][62634] Updated weights for policy 0, policy_version 680 (0.0007) [2023-10-12 15:46:21,829][62634] Updated weights for policy 0, policy_version 690 (0.0009) [2023-10-12 15:46:22,196][62634] Updated weights for policy 0, policy_version 700 (0.0008) [2023-10-12 15:46:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 12793.4). Total num frames: 1441792. Throughput: 0: 1679.8, 1: 1669.4. 
Samples: 362668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 15:46:23,435][61643] Avg episode reward: [(0, '0.120'), (1, '0.100')] [2023-10-12 15:46:23,436][62354] Saving new best policy, reward=0.120! [2023-10-12 15:46:23,436][62495] Saving new best policy, reward=0.100! [2023-10-12 15:46:25,492][62635] Updated weights for policy 1, policy_version 710 (0.0008) [2023-10-12 15:46:25,859][62635] Updated weights for policy 1, policy_version 720 (0.0009) [2023-10-12 15:46:26,176][62634] Updated weights for policy 0, policy_version 710 (0.0008) [2023-10-12 15:46:26,219][62635] Updated weights for policy 1, policy_version 730 (0.0008) [2023-10-12 15:46:26,552][62634] Updated weights for policy 0, policy_version 720 (0.0009) [2023-10-12 15:46:26,935][62634] Updated weights for policy 0, policy_version 730 (0.0010) [2023-10-12 15:46:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 12806.8). Total num frames: 1507328. Throughput: 0: 1661.7, 1: 1678.0. Samples: 382008. Policy #0 lag: (min: 4.0, avg: 12.6, max: 36.0) [2023-10-12 15:46:28,436][61643] Avg episode reward: [(0, '0.050'), (1, '0.120')] [2023-10-12 15:46:28,437][62495] Saving new best policy, reward=0.120! [2023-10-12 15:46:30,285][62635] Updated weights for policy 1, policy_version 740 (0.0008) [2023-10-12 15:46:30,655][62635] Updated weights for policy 1, policy_version 750 (0.0007) [2023-10-12 15:46:31,022][62635] Updated weights for policy 1, policy_version 760 (0.0008) [2023-10-12 15:46:31,043][62634] Updated weights for policy 0, policy_version 740 (0.0008) [2023-10-12 15:46:31,421][62634] Updated weights for policy 0, policy_version 750 (0.0008) [2023-10-12 15:46:31,797][62634] Updated weights for policy 0, policy_version 760 (0.0009) [2023-10-12 15:46:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 12819.0). Total num frames: 1572864. Throughput: 0: 1677.3, 1: 1673.2. Samples: 402334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:46:33,435][61643] Avg episode reward: [(0, '0.110'), (1, '0.130')] [2023-10-12 15:46:33,443][62495] Saving new best policy, reward=0.130! [2023-10-12 15:46:35,012][62635] Updated weights for policy 1, policy_version 770 (0.0007) [2023-10-12 15:46:35,375][62635] Updated weights for policy 1, policy_version 780 (0.0009) [2023-10-12 15:46:35,753][62635] Updated weights for policy 1, policy_version 790 (0.0010) [2023-10-12 15:46:35,814][62634] Updated weights for policy 0, policy_version 770 (0.0008) [2023-10-12 15:46:36,112][62635] Updated weights for policy 1, policy_version 800 (0.0007) [2023-10-12 15:46:36,187][62634] Updated weights for policy 0, policy_version 780 (0.0008) [2023-10-12 15:46:36,553][62634] Updated weights for policy 0, policy_version 790 (0.0007) [2023-10-12 15:46:36,927][62634] Updated weights for policy 0, policy_version 800 (0.0007) [2023-10-12 15:46:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12830.3). Total num frames: 1638400. Throughput: 0: 1674.7, 1: 1659.8. Samples: 412686. Policy #0 lag: (min: 15.0, avg: 29.0, max: 47.0) [2023-10-12 15:46:38,436][61643] Avg episode reward: [(0, '0.150'), (1, '0.090')] [2023-10-12 15:46:38,437][62354] Saving new best policy, reward=0.150! [2023-10-12 15:46:40,133][62635] Updated weights for policy 1, policy_version 810 (0.0010) [2023-10-12 15:46:40,506][62635] Updated weights for policy 1, policy_version 820 (0.0007) [2023-10-12 15:46:40,870][62635] Updated weights for policy 1, policy_version 830 (0.0009) [2023-10-12 15:46:41,124][62634] Updated weights for policy 0, policy_version 810 (0.0009) [2023-10-12 15:46:41,496][62634] Updated weights for policy 0, policy_version 820 (0.0009) [2023-10-12 15:46:41,871][62634] Updated weights for policy 0, policy_version 830 (0.0009) [2023-10-12 15:46:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12840.7). Total num frames: 1703936. 
Throughput: 0: 1656.6, 1: 1677.4. Samples: 432228. Policy #0 lag: (min: 12.0, avg: 12.8, max: 33.0) [2023-10-12 15:46:43,435][61643] Avg episode reward: [(0, '0.080'), (1, '0.100')] [2023-10-12 15:46:44,865][62635] Updated weights for policy 1, policy_version 840 (0.0007) [2023-10-12 15:46:45,233][62635] Updated weights for policy 1, policy_version 850 (0.0008) [2023-10-12 15:46:45,599][62635] Updated weights for policy 1, policy_version 860 (0.0007) [2023-10-12 15:46:46,007][62634] Updated weights for policy 0, policy_version 840 (0.0008) [2023-10-12 15:46:46,380][62634] Updated weights for policy 0, policy_version 850 (0.0011) [2023-10-12 15:46:46,765][62634] Updated weights for policy 0, policy_version 860 (0.0008) [2023-10-12 15:46:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12850.4). Total num frames: 1769472. Throughput: 0: 1675.7, 1: 1679.7. Samples: 453012. Policy #0 lag: (min: 8.0, avg: 30.1, max: 40.0) [2023-10-12 15:46:48,435][61643] Avg episode reward: [(0, '0.160'), (1, '0.100')] [2023-10-12 15:46:48,444][62354] Saving new best policy, reward=0.160! [2023-10-12 15:46:49,667][62635] Updated weights for policy 1, policy_version 870 (0.0009) [2023-10-12 15:46:50,037][62635] Updated weights for policy 1, policy_version 880 (0.0007) [2023-10-12 15:46:50,403][62635] Updated weights for policy 1, policy_version 890 (0.0010) [2023-10-12 15:46:50,848][62634] Updated weights for policy 0, policy_version 870 (0.0009) [2023-10-12 15:46:51,224][62634] Updated weights for policy 0, policy_version 880 (0.0010) [2023-10-12 15:46:51,601][62634] Updated weights for policy 0, policy_version 890 (0.0009) [2023-10-12 15:46:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12859.4). Total num frames: 1835008. Throughput: 0: 1672.4, 1: 1663.4. Samples: 463068. 
Policy #0 lag: (min: 21.0, avg: 45.9, max: 48.0) [2023-10-12 15:46:53,435][61643] Avg episode reward: [(0, '0.150'), (1, '0.110')] [2023-10-12 15:46:54,453][62635] Updated weights for policy 1, policy_version 900 (0.0008) [2023-10-12 15:46:54,814][62635] Updated weights for policy 1, policy_version 910 (0.0008) [2023-10-12 15:46:55,184][62635] Updated weights for policy 1, policy_version 920 (0.0008) [2023-10-12 15:46:55,614][62634] Updated weights for policy 0, policy_version 900 (0.0008) [2023-10-12 15:46:55,986][62634] Updated weights for policy 0, policy_version 910 (0.0009) [2023-10-12 15:46:56,362][62634] Updated weights for policy 0, policy_version 920 (0.0009) [2023-10-12 15:46:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12867.8). Total num frames: 1900544. Throughput: 0: 1666.4, 1: 1690.0. Samples: 482898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:46:58,436][61643] Avg episode reward: [(0, '0.110'), (1, '0.130')] [2023-10-12 15:46:59,305][62635] Updated weights for policy 1, policy_version 930 (0.0009) [2023-10-12 15:46:59,706][62635] Updated weights for policy 1, policy_version 940 (0.0008) [2023-10-12 15:47:00,080][62635] Updated weights for policy 1, policy_version 950 (0.0008) [2023-10-12 15:47:00,445][62635] Updated weights for policy 1, policy_version 960 (0.0010) [2023-10-12 15:47:00,493][62634] Updated weights for policy 0, policy_version 930 (0.0008) [2023-10-12 15:47:00,910][62634] Updated weights for policy 0, policy_version 940 (0.0010) [2023-10-12 15:47:01,296][62634] Updated weights for policy 0, policy_version 950 (0.0009) [2023-10-12 15:47:01,679][62634] Updated weights for policy 0, policy_version 960 (0.0010) [2023-10-12 15:47:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12875.6). Total num frames: 1966080. Throughput: 0: 1678.2, 1: 1691.3. Samples: 503512. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:47:03,436][61643] Avg episode reward: [(0, '0.070'), (1, '0.110')] [2023-10-12 15:47:04,496][62635] Updated weights for policy 1, policy_version 970 (0.0007) [2023-10-12 15:47:04,862][62635] Updated weights for policy 1, policy_version 980 (0.0007) [2023-10-12 15:47:05,234][62635] Updated weights for policy 1, policy_version 990 (0.0008) [2023-10-12 15:47:05,542][62634] Updated weights for policy 0, policy_version 970 (0.0007) [2023-10-12 15:47:05,908][62634] Updated weights for policy 0, policy_version 980 (0.0008) [2023-10-12 15:47:06,295][62634] Updated weights for policy 0, policy_version 990 (0.0008) [2023-10-12 15:47:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12883.0). Total num frames: 2031616. Throughput: 0: 1664.7, 1: 1678.0. Samples: 513092. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 15:47:08,436][61643] Avg episode reward: [(0, '0.110'), (1, '0.110')] [2023-10-12 15:47:09,381][62635] Updated weights for policy 1, policy_version 1000 (0.0010) [2023-10-12 15:47:09,751][62635] Updated weights for policy 1, policy_version 1010 (0.0007) [2023-10-12 15:47:10,122][62635] Updated weights for policy 1, policy_version 1020 (0.0008) [2023-10-12 15:47:10,296][62634] Updated weights for policy 0, policy_version 1000 (0.0007) [2023-10-12 15:47:10,682][62634] Updated weights for policy 0, policy_version 1010 (0.0010) [2023-10-12 15:47:11,057][62634] Updated weights for policy 0, policy_version 1020 (0.0008) [2023-10-12 15:47:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12889.8). Total num frames: 2097152. Throughput: 0: 1671.9, 1: 1688.8. Samples: 533242. 
Policy #0 lag: (min: 13.0, avg: 13.3, max: 25.0) [2023-10-12 15:47:13,436][61643] Avg episode reward: [(0, '0.110'), (1, '0.120')] [2023-10-12 15:47:14,169][62635] Updated weights for policy 1, policy_version 1030 (0.0009) [2023-10-12 15:47:14,549][62635] Updated weights for policy 1, policy_version 1040 (0.0008) [2023-10-12 15:47:14,917][62635] Updated weights for policy 1, policy_version 1050 (0.0008) [2023-10-12 15:47:15,226][62634] Updated weights for policy 0, policy_version 1030 (0.0008) [2023-10-12 15:47:15,603][62634] Updated weights for policy 0, policy_version 1040 (0.0007) [2023-10-12 15:47:15,980][62634] Updated weights for policy 0, policy_version 1050 (0.0009) [2023-10-12 15:47:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12896.3). Total num frames: 2162688. Throughput: 0: 1677.1, 1: 1694.3. Samples: 554046. Policy #0 lag: (min: 3.0, avg: 10.6, max: 35.0) [2023-10-12 15:47:18,436][61643] Avg episode reward: [(0, '0.130'), (1, '0.090')] [2023-10-12 15:47:18,840][62635] Updated weights for policy 1, policy_version 1060 (0.0009) [2023-10-12 15:47:19,207][62635] Updated weights for policy 1, policy_version 1070 (0.0009) [2023-10-12 15:47:19,577][62635] Updated weights for policy 1, policy_version 1080 (0.0008) [2023-10-12 15:47:19,962][62634] Updated weights for policy 0, policy_version 1060 (0.0009) [2023-10-12 15:47:20,329][62634] Updated weights for policy 0, policy_version 1070 (0.0008) [2023-10-12 15:47:20,709][62634] Updated weights for policy 0, policy_version 1080 (0.0009) [2023-10-12 15:47:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12902.4). Total num frames: 2228224. Throughput: 0: 1660.5, 1: 1688.9. Samples: 563408. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-12 15:47:23,436][61643] Avg episode reward: [(0, '0.180'), (1, '0.080')] [2023-10-12 15:47:23,436][62354] Saving new best policy, reward=0.180! 
[2023-10-12 15:47:23,627][62635] Updated weights for policy 1, policy_version 1090 (0.0008) [2023-10-12 15:47:23,997][62635] Updated weights for policy 1, policy_version 1100 (0.0008) [2023-10-12 15:47:24,371][62635] Updated weights for policy 1, policy_version 1110 (0.0007) [2023-10-12 15:47:24,736][62635] Updated weights for policy 1, policy_version 1120 (0.0007) [2023-10-12 15:47:24,846][62634] Updated weights for policy 0, policy_version 1090 (0.0008) [2023-10-12 15:47:25,212][62634] Updated weights for policy 0, policy_version 1100 (0.0009) [2023-10-12 15:47:25,591][62634] Updated weights for policy 0, policy_version 1110 (0.0010) [2023-10-12 15:47:25,963][62634] Updated weights for policy 0, policy_version 1120 (0.0008) [2023-10-12 15:47:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12908.2). Total num frames: 2293760. Throughput: 0: 1677.3, 1: 1688.1. Samples: 583674. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 15:47:28,436][61643] Avg episode reward: [(0, '0.140'), (1, '0.100')] [2023-10-12 15:47:28,943][62635] Updated weights for policy 1, policy_version 1130 (0.0008) [2023-10-12 15:47:29,312][62635] Updated weights for policy 1, policy_version 1140 (0.0008) [2023-10-12 15:47:29,683][62635] Updated weights for policy 1, policy_version 1150 (0.0008) [2023-10-12 15:47:30,022][62634] Updated weights for policy 0, policy_version 1130 (0.0011) [2023-10-12 15:47:30,404][62634] Updated weights for policy 0, policy_version 1140 (0.0009) [2023-10-12 15:47:30,781][62634] Updated weights for policy 0, policy_version 1150 (0.0008) [2023-10-12 15:47:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12913.7). Total num frames: 2359296. Throughput: 0: 1683.1, 1: 1685.7. Samples: 604608. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 15:47:33,436][61643] Avg episode reward: [(0, '0.110'), (1, '0.110')] [2023-10-12 15:47:33,827][62635] Updated weights for policy 1, policy_version 1160 (0.0011) [2023-10-12 15:47:34,196][62635] Updated weights for policy 1, policy_version 1170 (0.0008) [2023-10-12 15:47:34,566][62635] Updated weights for policy 1, policy_version 1180 (0.0009) [2023-10-12 15:47:34,867][62634] Updated weights for policy 0, policy_version 1160 (0.0007) [2023-10-12 15:47:35,249][62634] Updated weights for policy 0, policy_version 1170 (0.0008) [2023-10-12 15:47:35,623][62634] Updated weights for policy 0, policy_version 1180 (0.0007) [2023-10-12 15:47:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12918.8). Total num frames: 2424832. Throughput: 0: 1657.8, 1: 1686.8. Samples: 613574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:47:38,436][61643] Avg episode reward: [(0, '0.100'), (1, '0.100')] [2023-10-12 15:47:38,557][62635] Updated weights for policy 1, policy_version 1190 (0.0007) [2023-10-12 15:47:38,925][62635] Updated weights for policy 1, policy_version 1200 (0.0008) [2023-10-12 15:47:39,288][62635] Updated weights for policy 1, policy_version 1210 (0.0007) [2023-10-12 15:47:39,625][62634] Updated weights for policy 0, policy_version 1190 (0.0007) [2023-10-12 15:47:40,004][62634] Updated weights for policy 0, policy_version 1200 (0.0007) [2023-10-12 15:47:40,383][62634] Updated weights for policy 0, policy_version 1210 (0.0010) [2023-10-12 15:47:43,322][62635] Updated weights for policy 1, policy_version 1220 (0.0007) [2023-10-12 15:47:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12923.7). Total num frames: 2490368. Throughput: 0: 1683.1, 1: 1681.7. Samples: 634314. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:47:43,435][61643] Avg episode reward: [(0, '0.140'), (1, '0.120')] [2023-10-12 15:47:43,684][62635] Updated weights for policy 1, policy_version 1230 (0.0009) [2023-10-12 15:47:44,052][62635] Updated weights for policy 1, policy_version 1240 (0.0009) [2023-10-12 15:47:44,463][62634] Updated weights for policy 0, policy_version 1220 (0.0007) [2023-10-12 15:47:44,836][62634] Updated weights for policy 0, policy_version 1230 (0.0010) [2023-10-12 15:47:45,208][62634] Updated weights for policy 0, policy_version 1240 (0.0007) [2023-10-12 15:47:48,189][62635] Updated weights for policy 1, policy_version 1250 (0.0008) [2023-10-12 15:47:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12928.3). Total num frames: 2555904. Throughput: 0: 1679.8, 1: 1685.4. Samples: 654946. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-12 15:47:48,435][61643] Avg episode reward: [(0, '0.170'), (1, '0.180')] [2023-10-12 15:47:48,598][62635] Updated weights for policy 1, policy_version 1260 (0.0009) [2023-10-12 15:47:48,961][62635] Updated weights for policy 1, policy_version 1270 (0.0011) [2023-10-12 15:47:49,325][62495] Saving new best policy, reward=0.180! [2023-10-12 15:47:49,327][62635] Updated weights for policy 1, policy_version 1280 (0.0008) [2023-10-12 15:47:49,393][62634] Updated weights for policy 0, policy_version 1250 (0.0008) [2023-10-12 15:47:49,802][62634] Updated weights for policy 0, policy_version 1260 (0.0010) [2023-10-12 15:47:50,175][62634] Updated weights for policy 0, policy_version 1270 (0.0011) [2023-10-12 15:47:50,554][62634] Updated weights for policy 0, policy_version 1280 (0.0010) [2023-10-12 15:47:53,377][62635] Updated weights for policy 1, policy_version 1290 (0.0009) [2023-10-12 15:47:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12932.8). Total num frames: 2621440. Throughput: 0: 1663.8, 1: 1683.5. Samples: 663722. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) [2023-10-12 15:47:53,435][61643] Avg episode reward: [(0, '0.160'), (1, '0.170')] [2023-10-12 15:47:53,740][62635] Updated weights for policy 1, policy_version 1300 (0.0007) [2023-10-12 15:47:54,110][62635] Updated weights for policy 1, policy_version 1310 (0.0007) [2023-10-12 15:47:54,609][62634] Updated weights for policy 0, policy_version 1290 (0.0010) [2023-10-12 15:47:54,988][62634] Updated weights for policy 0, policy_version 1300 (0.0010) [2023-10-12 15:47:55,376][62634] Updated weights for policy 0, policy_version 1310 (0.0010) [2023-10-12 15:47:58,301][62635] Updated weights for policy 1, policy_version 1320 (0.0008) [2023-10-12 15:47:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12937.0). Total num frames: 2686976. Throughput: 0: 1677.8, 1: 1683.4. Samples: 684496. Policy #0 lag: (min: 26.0, avg: 26.2, max: 36.0) [2023-10-12 15:47:58,435][61643] Avg episode reward: [(0, '0.130'), (1, '0.140')] [2023-10-12 15:47:58,667][62635] Updated weights for policy 1, policy_version 1330 (0.0011) [2023-10-12 15:47:59,037][62635] Updated weights for policy 1, policy_version 1340 (0.0008) [2023-10-12 15:47:59,209][62634] Updated weights for policy 0, policy_version 1320 (0.0009) [2023-10-12 15:47:59,578][62634] Updated weights for policy 0, policy_version 1330 (0.0010) [2023-10-12 15:47:59,961][62634] Updated weights for policy 0, policy_version 1340 (0.0009) [2023-10-12 15:48:03,193][62635] Updated weights for policy 1, policy_version 1350 (0.0008) [2023-10-12 15:48:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12941.0). Total num frames: 2752512. Throughput: 0: 1688.3, 1: 1677.6. Samples: 705512. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 15:48:03,435][61643] Avg episode reward: [(0, '0.160'), (1, '0.070')] [2023-10-12 15:48:03,567][62635] Updated weights for policy 1, policy_version 1360 (0.0008) [2023-10-12 15:48:03,785][62634] Updated weights for policy 0, policy_version 1350 (0.0008) [2023-10-12 15:48:03,922][62635] Updated weights for policy 1, policy_version 1370 (0.0007) [2023-10-12 15:48:04,164][62634] Updated weights for policy 0, policy_version 1360 (0.0007) [2023-10-12 15:48:04,541][62634] Updated weights for policy 0, policy_version 1370 (0.0009) [2023-10-12 15:48:08,084][62635] Updated weights for policy 1, policy_version 1380 (0.0008) [2023-10-12 15:48:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 12944.8). Total num frames: 2818048. Throughput: 0: 1678.0, 1: 1679.5. Samples: 714494. Policy #0 lag: (min: 10.0, avg: 17.3, max: 42.0) [2023-10-12 15:48:08,435][61643] Avg episode reward: [(0, '0.200'), (1, '0.070')] [2023-10-12 15:48:08,458][62635] Updated weights for policy 1, policy_version 1390 (0.0008) [2023-10-12 15:48:08,600][62634] Updated weights for policy 0, policy_version 1380 (0.0009) [2023-10-12 15:48:08,816][62635] Updated weights for policy 1, policy_version 1400 (0.0009) [2023-10-12 15:48:08,976][62634] Updated weights for policy 0, policy_version 1390 (0.0007) [2023-10-12 15:48:09,350][62634] Updated weights for policy 0, policy_version 1400 (0.0008) [2023-10-12 15:48:09,655][62354] Saving new best policy, reward=0.200! [2023-10-12 15:48:12,692][62635] Updated weights for policy 1, policy_version 1410 (0.0009) [2023-10-12 15:48:13,063][62635] Updated weights for policy 1, policy_version 1420 (0.0008) [2023-10-12 15:48:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12948.4). Total num frames: 2883584. Throughput: 0: 1686.0, 1: 1683.2. Samples: 735290. 
Policy #0 lag: (min: 10.0, avg: 20.4, max: 42.0) [2023-10-12 15:48:13,436][61643] Avg episode reward: [(0, '0.160'), (1, '0.100')] [2023-10-12 15:48:13,440][62635] Updated weights for policy 1, policy_version 1430 (0.0008) [2023-10-12 15:48:13,472][62634] Updated weights for policy 0, policy_version 1410 (0.0011) [2023-10-12 15:48:13,803][62635] Updated weights for policy 1, policy_version 1440 (0.0009) [2023-10-12 15:48:13,849][62634] Updated weights for policy 0, policy_version 1420 (0.0007) [2023-10-12 15:48:14,238][62634] Updated weights for policy 0, policy_version 1430 (0.0007) [2023-10-12 15:48:14,607][62634] Updated weights for policy 0, policy_version 1440 (0.0007) [2023-10-12 15:48:17,726][62635] Updated weights for policy 1, policy_version 1450 (0.0008) [2023-10-12 15:48:18,093][62635] Updated weights for policy 1, policy_version 1460 (0.0008) [2023-10-12 15:48:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12951.9). Total num frames: 2949120. Throughput: 0: 1686.1, 1: 1672.2. Samples: 755732. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 15:48:18,436][61643] Avg episode reward: [(0, '0.140'), (1, '0.110')] [2023-10-12 15:48:18,452][62635] Updated weights for policy 1, policy_version 1470 (0.0009) [2023-10-12 15:48:18,526][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000001472_1507328.pth... [2023-10-12 15:48:18,743][62634] Updated weights for policy 0, policy_version 1450 (0.0008) [2023-10-12 15:48:19,121][62634] Updated weights for policy 0, policy_version 1460 (0.0009) [2023-10-12 15:48:19,491][62634] Updated weights for policy 0, policy_version 1470 (0.0011) [2023-10-12 15:48:19,569][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000001472_1507328.pth... 
[2023-10-12 15:48:22,283][62635] Updated weights for policy 1, policy_version 1480 (0.0009) [2023-10-12 15:48:22,656][62635] Updated weights for policy 1, policy_version 1490 (0.0011) [2023-10-12 15:48:23,024][62635] Updated weights for policy 1, policy_version 1500 (0.0011) [2023-10-12 15:48:23,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13096.1). Total num frames: 3047424. Throughput: 0: 1685.7, 1: 1691.5. Samples: 765548. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:48:23,435][61643] Avg episode reward: [(0, '0.200'), (1, '0.120')] [2023-10-12 15:48:23,524][62634] Updated weights for policy 0, policy_version 1480 (0.0009) [2023-10-12 15:48:23,903][62634] Updated weights for policy 0, policy_version 1490 (0.0010) [2023-10-12 15:48:24,275][62634] Updated weights for policy 0, policy_version 1500 (0.0011) [2023-10-12 15:48:27,140][62635] Updated weights for policy 1, policy_version 1510 (0.0011) [2023-10-12 15:48:27,516][62635] Updated weights for policy 1, policy_version 1520 (0.0008) [2023-10-12 15:48:27,879][62635] Updated weights for policy 1, policy_version 1530 (0.0007) [2023-10-12 15:48:28,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13096.3). Total num frames: 3112960. Throughput: 0: 1682.3, 1: 1690.5. Samples: 786090. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:48:28,435][61643] Avg episode reward: [(0, '0.190'), (1, '0.090')] [2023-10-12 15:48:28,489][62634] Updated weights for policy 0, policy_version 1510 (0.0010) [2023-10-12 15:48:28,863][62634] Updated weights for policy 0, policy_version 1520 (0.0007) [2023-10-12 15:48:29,248][62634] Updated weights for policy 0, policy_version 1530 (0.0007) [2023-10-12 15:48:31,966][62635] Updated weights for policy 1, policy_version 1540 (0.0009) [2023-10-12 15:48:32,335][62635] Updated weights for policy 1, policy_version 1550 (0.0010) [2023-10-12 15:48:32,706][62635] Updated weights for policy 1, policy_version 1560 (0.0009) [2023-10-12 15:48:33,174][62634] Updated weights for policy 0, policy_version 1540 (0.0009) [2023-10-12 15:48:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13096.5). Total num frames: 3178496. Throughput: 0: 1689.1, 1: 1666.2. Samples: 805934. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 15:48:33,436][61643] Avg episode reward: [(0, '0.200'), (1, '0.180')] [2023-10-12 15:48:33,542][62634] Updated weights for policy 0, policy_version 1550 (0.0009) [2023-10-12 15:48:33,925][62634] Updated weights for policy 0, policy_version 1560 (0.0009) [2023-10-12 15:48:36,780][62635] Updated weights for policy 1, policy_version 1570 (0.0008) [2023-10-12 15:48:37,198][62635] Updated weights for policy 1, policy_version 1580 (0.0009) [2023-10-12 15:48:37,559][62635] Updated weights for policy 1, policy_version 1590 (0.0009) [2023-10-12 15:48:37,934][62635] Updated weights for policy 1, policy_version 1600 (0.0009) [2023-10-12 15:48:37,999][62634] Updated weights for policy 0, policy_version 1570 (0.0009) [2023-10-12 15:48:38,392][62634] Updated weights for policy 0, policy_version 1580 (0.0009) [2023-10-12 15:48:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13096.7). Total num frames: 3244032. Throughput: 0: 1692.0, 1: 1700.5. Samples: 816388. 
Policy #0 lag: (min: 17.0, avg: 28.9, max: 49.0) [2023-10-12 15:48:38,435][61643] Avg episode reward: [(0, '0.240'), (1, '0.190')] [2023-10-12 15:48:38,436][62495] Saving new best policy, reward=0.190! [2023-10-12 15:48:38,782][62634] Updated weights for policy 0, policy_version 1590 (0.0010) [2023-10-12 15:48:39,154][62354] Saving new best policy, reward=0.240! [2023-10-12 15:48:39,154][62634] Updated weights for policy 0, policy_version 1600 (0.0010) [2023-10-12 15:48:41,849][62635] Updated weights for policy 1, policy_version 1610 (0.0007) [2023-10-12 15:48:42,214][62635] Updated weights for policy 1, policy_version 1620 (0.0008) [2023-10-12 15:48:42,583][62635] Updated weights for policy 1, policy_version 1630 (0.0007) [2023-10-12 15:48:43,227][62634] Updated weights for policy 0, policy_version 1610 (0.0010) [2023-10-12 15:48:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13096.9). Total num frames: 3309568. Throughput: 0: 1687.6, 1: 1687.7. Samples: 836386. Policy #0 lag: (min: 30.0, avg: 36.3, max: 62.0) [2023-10-12 15:48:43,436][61643] Avg episode reward: [(0, '0.350'), (1, '0.150')] [2023-10-12 15:48:43,600][62634] Updated weights for policy 0, policy_version 1620 (0.0009) [2023-10-12 15:48:43,982][62634] Updated weights for policy 0, policy_version 1630 (0.0007) [2023-10-12 15:48:44,058][62354] Saving new best policy, reward=0.350! [2023-10-12 15:48:46,604][62635] Updated weights for policy 1, policy_version 1640 (0.0008) [2023-10-12 15:48:46,968][62635] Updated weights for policy 1, policy_version 1650 (0.0009) [2023-10-12 15:48:47,348][62635] Updated weights for policy 1, policy_version 1660 (0.0008) [2023-10-12 15:48:48,109][62634] Updated weights for policy 0, policy_version 1640 (0.0008) [2023-10-12 15:48:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13097.1). Total num frames: 3375104. Throughput: 0: 1675.4, 1: 1675.4. Samples: 856298. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 15:48:48,435][61643] Avg episode reward: [(0, '0.360'), (1, '0.180')] [2023-10-12 15:48:48,484][62634] Updated weights for policy 0, policy_version 1650 (0.0007) [2023-10-12 15:48:48,869][62634] Updated weights for policy 0, policy_version 1660 (0.0008) [2023-10-12 15:48:49,016][62354] Saving new best policy, reward=0.360! [2023-10-12 15:48:51,546][62635] Updated weights for policy 1, policy_version 1670 (0.0007) [2023-10-12 15:48:51,921][62635] Updated weights for policy 1, policy_version 1680 (0.0009) [2023-10-12 15:48:52,292][62635] Updated weights for policy 1, policy_version 1690 (0.0009) [2023-10-12 15:48:52,867][62634] Updated weights for policy 0, policy_version 1670 (0.0008) [2023-10-12 15:48:53,247][62634] Updated weights for policy 0, policy_version 1680 (0.0010) [2023-10-12 15:48:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13097.3). Total num frames: 3440640. Throughput: 0: 1684.9, 1: 1700.8. Samples: 866848. Policy #0 lag: (min: 4.0, avg: 9.3, max: 36.0) [2023-10-12 15:48:53,436][61643] Avg episode reward: [(0, '0.370'), (1, '0.210')] [2023-10-12 15:48:53,436][62495] Saving new best policy, reward=0.210! [2023-10-12 15:48:53,614][62634] Updated weights for policy 0, policy_version 1690 (0.0008) [2023-10-12 15:48:53,845][62354] Saving new best policy, reward=0.370! 
[2023-10-12 15:48:56,216][62635] Updated weights for policy 1, policy_version 1700 (0.0009) [2023-10-12 15:48:56,584][62635] Updated weights for policy 1, policy_version 1710 (0.0011) [2023-10-12 15:48:56,955][62635] Updated weights for policy 1, policy_version 1720 (0.0009) [2023-10-12 15:48:57,644][62634] Updated weights for policy 0, policy_version 1700 (0.0010) [2023-10-12 15:48:58,028][62634] Updated weights for policy 0, policy_version 1710 (0.0009) [2023-10-12 15:48:58,409][62634] Updated weights for policy 0, policy_version 1720 (0.0008) [2023-10-12 15:48:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13097.5). Total num frames: 3506176. Throughput: 0: 1692.4, 1: 1680.4. Samples: 887066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:48:58,436][61643] Avg episode reward: [(0, '0.490'), (1, '0.240')] [2023-10-12 15:48:58,437][62495] Saving new best policy, reward=0.240! [2023-10-12 15:48:58,706][62354] Saving new best policy, reward=0.490! [2023-10-12 15:49:01,017][62635] Updated weights for policy 1, policy_version 1730 (0.0010) [2023-10-12 15:49:01,389][62635] Updated weights for policy 1, policy_version 1740 (0.0009) [2023-10-12 15:49:01,752][62635] Updated weights for policy 1, policy_version 1750 (0.0007) [2023-10-12 15:49:02,123][62635] Updated weights for policy 1, policy_version 1760 (0.0009) [2023-10-12 15:49:02,314][62634] Updated weights for policy 0, policy_version 1730 (0.0007) [2023-10-12 15:49:02,684][62634] Updated weights for policy 0, policy_version 1740 (0.0008) [2023-10-12 15:49:03,058][62634] Updated weights for policy 0, policy_version 1750 (0.0008) [2023-10-12 15:49:03,436][62634] Updated weights for policy 0, policy_version 1760 (0.0010) [2023-10-12 15:49:03,442][61643] Fps is (10 sec: 16372.7, 60 sec: 14197.8, 300 sec: 13217.5). Total num frames: 3604480. Throughput: 0: 1676.0, 1: 1684.4. Samples: 906974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:49:03,443][61643] Avg episode reward: [(0, '0.480'), (1, '0.270')] [2023-10-12 15:49:03,452][62495] Saving new best policy, reward=0.270! [2023-10-12 15:49:06,085][62635] Updated weights for policy 1, policy_version 1770 (0.0010) [2023-10-12 15:49:06,461][62635] Updated weights for policy 1, policy_version 1780 (0.0010) [2023-10-12 15:49:06,829][62635] Updated weights for policy 1, policy_version 1790 (0.0008) [2023-10-12 15:49:07,462][62634] Updated weights for policy 0, policy_version 1770 (0.0011) [2023-10-12 15:49:07,840][62634] Updated weights for policy 0, policy_version 1780 (0.0009) [2023-10-12 15:49:08,215][62634] Updated weights for policy 0, policy_version 1790 (0.0008) [2023-10-12 15:49:08,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13215.9). Total num frames: 3670016. Throughput: 0: 1693.5, 1: 1689.2. Samples: 917774. Policy #0 lag: (min: 8.0, avg: 36.9, max: 40.0) [2023-10-12 15:49:08,436][61643] Avg episode reward: [(0, '0.400'), (1, '0.190')] [2023-10-12 15:49:10,715][62635] Updated weights for policy 1, policy_version 1800 (0.0009) [2023-10-12 15:49:11,086][62635] Updated weights for policy 1, policy_version 1810 (0.0009) [2023-10-12 15:49:11,450][62635] Updated weights for policy 1, policy_version 1820 (0.0008) [2023-10-12 15:49:12,018][62634] Updated weights for policy 0, policy_version 1800 (0.0010) [2023-10-12 15:49:12,396][62634] Updated weights for policy 0, policy_version 1810 (0.0008) [2023-10-12 15:49:12,777][62634] Updated weights for policy 0, policy_version 1820 (0.0009) [2023-10-12 15:49:13,435][61643] Fps is (10 sec: 13116.2, 60 sec: 14199.5, 300 sec: 13213.9). Total num frames: 3735552. Throughput: 0: 1694.8, 1: 1674.5. Samples: 937710. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 15:49:13,436][61643] Avg episode reward: [(0, '0.410'), (1, '0.170')] [2023-10-12 15:49:15,524][62635] Updated weights for policy 1, policy_version 1830 (0.0008) [2023-10-12 15:49:15,890][62635] Updated weights for policy 1, policy_version 1840 (0.0007) [2023-10-12 15:49:16,251][62635] Updated weights for policy 1, policy_version 1850 (0.0008) [2023-10-12 15:49:16,893][62634] Updated weights for policy 0, policy_version 1830 (0.0008) [2023-10-12 15:49:17,270][62634] Updated weights for policy 0, policy_version 1840 (0.0007) [2023-10-12 15:49:17,648][62634] Updated weights for policy 0, policy_version 1850 (0.0007) [2023-10-12 15:49:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13212.1). Total num frames: 3801088. Throughput: 0: 1667.3, 1: 1699.4. Samples: 957436. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 15:49:18,436][61643] Avg episode reward: [(0, '0.350'), (1, '0.250')] [2023-10-12 15:49:20,246][62635] Updated weights for policy 1, policy_version 1860 (0.0007) [2023-10-12 15:49:20,616][62635] Updated weights for policy 1, policy_version 1870 (0.0007) [2023-10-12 15:49:20,978][62635] Updated weights for policy 1, policy_version 1880 (0.0007) [2023-10-12 15:49:21,877][62634] Updated weights for policy 0, policy_version 1860 (0.0007) [2023-10-12 15:49:22,243][62634] Updated weights for policy 0, policy_version 1870 (0.0007) [2023-10-12 15:49:22,616][62634] Updated weights for policy 0, policy_version 1880 (0.0008) [2023-10-12 15:49:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13210.3). Total num frames: 3866624. Throughput: 0: 1695.3, 1: 1673.3. Samples: 967976. 
Policy #0 lag: (min: 26.0, avg: 41.4, max: 58.0) [2023-10-12 15:49:23,436][61643] Avg episode reward: [(0, '0.250'), (1, '0.250')] [2023-10-12 15:49:24,925][62635] Updated weights for policy 1, policy_version 1890 (0.0007) [2023-10-12 15:49:25,301][62635] Updated weights for policy 1, policy_version 1900 (0.0011) [2023-10-12 15:49:25,673][62635] Updated weights for policy 1, policy_version 1910 (0.0009) [2023-10-12 15:49:26,049][62635] Updated weights for policy 1, policy_version 1920 (0.0008) [2023-10-12 15:49:26,629][62634] Updated weights for policy 0, policy_version 1890 (0.0009) [2023-10-12 15:49:27,036][62634] Updated weights for policy 0, policy_version 1900 (0.0008) [2023-10-12 15:49:27,415][62634] Updated weights for policy 0, policy_version 1910 (0.0008) [2023-10-12 15:49:27,794][62634] Updated weights for policy 0, policy_version 1920 (0.0009) [2023-10-12 15:49:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 3932160. Throughput: 0: 1685.9, 1: 1682.8. Samples: 987978. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:49:28,435][61643] Avg episode reward: [(0, '0.350'), (1, '0.190')] [2023-10-12 15:49:30,275][62635] Updated weights for policy 1, policy_version 1930 (0.0008) [2023-10-12 15:49:30,649][62635] Updated weights for policy 1, policy_version 1940 (0.0008) [2023-10-12 15:49:31,023][62635] Updated weights for policy 1, policy_version 1950 (0.0007) [2023-10-12 15:49:31,748][62634] Updated weights for policy 0, policy_version 1930 (0.0007) [2023-10-12 15:49:32,117][62634] Updated weights for policy 0, policy_version 1940 (0.0007) [2023-10-12 15:49:32,492][62634] Updated weights for policy 0, policy_version 1950 (0.0007) [2023-10-12 15:49:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 3997696. Throughput: 0: 1669.3, 1: 1695.8. Samples: 1007728. 
Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) [2023-10-12 15:49:33,436][61643] Avg episode reward: [(0, '0.460'), (1, '0.240')] [2023-10-12 15:49:35,140][62635] Updated weights for policy 1, policy_version 1960 (0.0009) [2023-10-12 15:49:35,506][62635] Updated weights for policy 1, policy_version 1970 (0.0010) [2023-10-12 15:49:35,873][62635] Updated weights for policy 1, policy_version 1980 (0.0007) [2023-10-12 15:49:36,610][62634] Updated weights for policy 0, policy_version 1960 (0.0010) [2023-10-12 15:49:36,979][62634] Updated weights for policy 0, policy_version 1970 (0.0007) [2023-10-12 15:49:37,353][62634] Updated weights for policy 0, policy_version 1980 (0.0008) [2023-10-12 15:49:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4063232. Throughput: 0: 1695.2, 1: 1669.1. Samples: 1018244. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) [2023-10-12 15:49:38,435][61643] Avg episode reward: [(0, '0.540'), (1, '0.240')] [2023-10-12 15:49:38,436][62354] Saving new best policy, reward=0.540! [2023-10-12 15:49:40,039][62635] Updated weights for policy 1, policy_version 1990 (0.0009) [2023-10-12 15:49:40,405][62635] Updated weights for policy 1, policy_version 2000 (0.0009) [2023-10-12 15:49:40,773][62635] Updated weights for policy 1, policy_version 2010 (0.0008) [2023-10-12 15:49:41,299][62634] Updated weights for policy 0, policy_version 1990 (0.0008) [2023-10-12 15:49:41,665][62634] Updated weights for policy 0, policy_version 2000 (0.0009) [2023-10-12 15:49:42,036][62634] Updated weights for policy 0, policy_version 2010 (0.0009) [2023-10-12 15:49:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 4128768. Throughput: 0: 1670.4, 1: 1685.5. Samples: 1038080. Policy #0 lag: (min: 1.0, avg: 11.6, max: 33.0) [2023-10-12 15:49:43,435][61643] Avg episode reward: [(0, '0.550'), (1, '0.300')] [2023-10-12 15:49:43,436][62495] Saving new best policy, reward=0.300! 
[2023-10-12 15:49:43,437][62354] Saving new best policy, reward=0.550! [2023-10-12 15:49:44,928][62635] Updated weights for policy 1, policy_version 2020 (0.0007) [2023-10-12 15:49:45,310][62635] Updated weights for policy 1, policy_version 2030 (0.0009) [2023-10-12 15:49:45,683][62635] Updated weights for policy 1, policy_version 2040 (0.0007) [2023-10-12 15:49:46,216][62634] Updated weights for policy 0, policy_version 2020 (0.0010) [2023-10-12 15:49:46,599][62634] Updated weights for policy 0, policy_version 2030 (0.0008) [2023-10-12 15:49:46,974][62634] Updated weights for policy 0, policy_version 2040 (0.0009) [2023-10-12 15:49:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 4194304. Throughput: 0: 1671.1, 1: 1689.0. Samples: 1058154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:49:48,435][61643] Avg episode reward: [(0, '0.530'), (1, '0.340')] [2023-10-12 15:49:48,445][62495] Saving new best policy, reward=0.340! [2023-10-12 15:49:49,606][62635] Updated weights for policy 1, policy_version 2050 (0.0007) [2023-10-12 15:49:49,974][62635] Updated weights for policy 1, policy_version 2060 (0.0008) [2023-10-12 15:49:50,336][62635] Updated weights for policy 1, policy_version 2070 (0.0007) [2023-10-12 15:49:50,710][62635] Updated weights for policy 1, policy_version 2080 (0.0008) [2023-10-12 15:49:51,025][62634] Updated weights for policy 0, policy_version 2050 (0.0008) [2023-10-12 15:49:51,397][62634] Updated weights for policy 0, policy_version 2060 (0.0009) [2023-10-12 15:49:51,774][62634] Updated weights for policy 0, policy_version 2070 (0.0009) [2023-10-12 15:49:52,149][62634] Updated weights for policy 0, policy_version 2080 (0.0008) [2023-10-12 15:49:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4259840. Throughput: 0: 1685.2, 1: 1667.2. Samples: 1068628. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:49:53,436][61643] Avg episode reward: [(0, '0.590'), (1, '0.350')] [2023-10-12 15:49:53,437][62354] Saving new best policy, reward=0.590! [2023-10-12 15:49:53,437][62495] Saving new best policy, reward=0.350! [2023-10-12 15:49:54,807][62635] Updated weights for policy 1, policy_version 2090 (0.0007) [2023-10-12 15:49:55,184][62635] Updated weights for policy 1, policy_version 2100 (0.0007) [2023-10-12 15:49:55,551][62635] Updated weights for policy 1, policy_version 2110 (0.0008) [2023-10-12 15:49:56,222][62634] Updated weights for policy 0, policy_version 2090 (0.0009) [2023-10-12 15:49:56,604][62634] Updated weights for policy 0, policy_version 2100 (0.0009) [2023-10-12 15:49:56,983][62634] Updated weights for policy 0, policy_version 2110 (0.0007) [2023-10-12 15:49:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 4325376. Throughput: 0: 1660.9, 1: 1681.1. Samples: 1088102. Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) [2023-10-12 15:49:58,436][61643] Avg episode reward: [(0, '0.630'), (1, '0.350')] [2023-10-12 15:49:58,436][62354] Saving new best policy, reward=0.630! [2023-10-12 15:49:59,598][62635] Updated weights for policy 1, policy_version 2120 (0.0009) [2023-10-12 15:49:59,972][62635] Updated weights for policy 1, policy_version 2130 (0.0009) [2023-10-12 15:50:00,341][62635] Updated weights for policy 1, policy_version 2140 (0.0009) [2023-10-12 15:50:01,065][62634] Updated weights for policy 0, policy_version 2120 (0.0007) [2023-10-12 15:50:01,450][62634] Updated weights for policy 0, policy_version 2130 (0.0007) [2023-10-12 15:50:01,829][62634] Updated weights for policy 0, policy_version 2140 (0.0008) [2023-10-12 15:50:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13108.7, 300 sec: 13551.5). Total num frames: 4390912. Throughput: 0: 1680.4, 1: 1679.3. Samples: 1108618. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:50:03,435][61643] Avg episode reward: [(0, '0.630'), (1, '0.320')] [2023-10-12 15:50:04,350][62635] Updated weights for policy 1, policy_version 2150 (0.0007) [2023-10-12 15:50:04,720][62635] Updated weights for policy 1, policy_version 2160 (0.0007) [2023-10-12 15:50:05,084][62635] Updated weights for policy 1, policy_version 2170 (0.0007) [2023-10-12 15:50:05,829][62634] Updated weights for policy 0, policy_version 2150 (0.0009) [2023-10-12 15:50:06,209][62634] Updated weights for policy 0, policy_version 2160 (0.0008) [2023-10-12 15:50:06,598][62634] Updated weights for policy 0, policy_version 2170 (0.0010) [2023-10-12 15:50:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 4456448. Throughput: 0: 1675.1, 1: 1674.9. Samples: 1118726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:50:08,435][61643] Avg episode reward: [(0, '0.610'), (1, '0.290')] [2023-10-12 15:50:09,190][62635] Updated weights for policy 1, policy_version 2180 (0.0008) [2023-10-12 15:50:09,553][62635] Updated weights for policy 1, policy_version 2190 (0.0009) [2023-10-12 15:50:09,915][62635] Updated weights for policy 1, policy_version 2200 (0.0010) [2023-10-12 15:50:10,643][62634] Updated weights for policy 0, policy_version 2180 (0.0010) [2023-10-12 15:50:11,018][62634] Updated weights for policy 0, policy_version 2190 (0.0009) [2023-10-12 15:50:11,399][62634] Updated weights for policy 0, policy_version 2200 (0.0009) [2023-10-12 15:50:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 4521984. Throughput: 0: 1667.8, 1: 1679.1. Samples: 1138590. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-12 15:50:13,435][61643] Avg episode reward: [(0, '0.640'), (1, '0.350')] [2023-10-12 15:50:13,436][62354] Saving new best policy, reward=0.640! 
[2023-10-12 15:50:13,924][62635] Updated weights for policy 1, policy_version 2210 (0.0010) [2023-10-12 15:50:14,296][62635] Updated weights for policy 1, policy_version 2220 (0.0007) [2023-10-12 15:50:14,664][62635] Updated weights for policy 1, policy_version 2230 (0.0007) [2023-10-12 15:50:15,030][62635] Updated weights for policy 1, policy_version 2240 (0.0008) [2023-10-12 15:50:15,271][62634] Updated weights for policy 0, policy_version 2210 (0.0010) [2023-10-12 15:50:15,654][62634] Updated weights for policy 0, policy_version 2220 (0.0010) [2023-10-12 15:50:16,037][62634] Updated weights for policy 0, policy_version 2230 (0.0011) [2023-10-12 15:50:16,400][62634] Updated weights for policy 0, policy_version 2240 (0.0009) [2023-10-12 15:50:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4587520. Throughput: 0: 1686.0, 1: 1681.4. Samples: 1159262. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-12 15:50:18,436][61643] Avg episode reward: [(0, '0.740'), (1, '0.420')] [2023-10-12 15:50:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000002240_2293760.pth... [2023-10-12 15:50:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth... [2023-10-12 15:50:18,475][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000000672_688128.pth [2023-10-12 15:50:18,479][62354] Saving new best policy, reward=0.740! [2023-10-12 15:50:18,480][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000000672_688128.pth [2023-10-12 15:50:18,483][62495] Saving new best policy, reward=0.420! 
[2023-10-12 15:50:19,150][62635] Updated weights for policy 1, policy_version 2250 (0.0008) [2023-10-12 15:50:19,531][62635] Updated weights for policy 1, policy_version 2260 (0.0008) [2023-10-12 15:50:19,896][62635] Updated weights for policy 1, policy_version 2270 (0.0009) [2023-10-12 15:50:20,512][62634] Updated weights for policy 0, policy_version 2250 (0.0010) [2023-10-12 15:50:20,889][62634] Updated weights for policy 0, policy_version 2260 (0.0010) [2023-10-12 15:50:21,270][62634] Updated weights for policy 0, policy_version 2270 (0.0008) [2023-10-12 15:50:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4653056. Throughput: 0: 1666.1, 1: 1680.5. Samples: 1168842. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:50:23,435][61643] Avg episode reward: [(0, '0.720'), (1, '0.380')] [2023-10-12 15:50:23,977][62635] Updated weights for policy 1, policy_version 2280 (0.0010) [2023-10-12 15:50:24,356][62635] Updated weights for policy 1, policy_version 2290 (0.0008) [2023-10-12 15:50:24,722][62635] Updated weights for policy 1, policy_version 2300 (0.0008) [2023-10-12 15:50:25,515][62634] Updated weights for policy 0, policy_version 2280 (0.0007) [2023-10-12 15:50:25,901][62634] Updated weights for policy 0, policy_version 2290 (0.0008) [2023-10-12 15:50:26,275][62634] Updated weights for policy 0, policy_version 2300 (0.0008) [2023-10-12 15:50:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4718592. Throughput: 0: 1673.2, 1: 1681.6. Samples: 1189046. Policy #0 lag: (min: 9.0, avg: 14.0, max: 41.0) [2023-10-12 15:50:28,435][61643] Avg episode reward: [(0, '0.750'), (1, '0.390')] [2023-10-12 15:50:28,436][62354] Saving new best policy, reward=0.750! 
[2023-10-12 15:50:28,821][62635] Updated weights for policy 1, policy_version 2310 (0.0009) [2023-10-12 15:50:29,189][62635] Updated weights for policy 1, policy_version 2320 (0.0008) [2023-10-12 15:50:29,566][62635] Updated weights for policy 1, policy_version 2330 (0.0008) [2023-10-12 15:50:30,531][62634] Updated weights for policy 0, policy_version 2310 (0.0010) [2023-10-12 15:50:30,909][62634] Updated weights for policy 0, policy_version 2320 (0.0009) [2023-10-12 15:50:31,284][62634] Updated weights for policy 0, policy_version 2330 (0.0009) [2023-10-12 15:50:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4784128. Throughput: 0: 1683.7, 1: 1687.0. Samples: 1209838. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 15:50:33,435][61643] Avg episode reward: [(0, '0.880'), (1, '0.410')] [2023-10-12 15:50:33,441][62354] Saving new best policy, reward=0.880! [2023-10-12 15:50:33,533][62635] Updated weights for policy 1, policy_version 2340 (0.0009) [2023-10-12 15:50:33,907][62635] Updated weights for policy 1, policy_version 2350 (0.0009) [2023-10-12 15:50:34,276][62635] Updated weights for policy 1, policy_version 2360 (0.0007) [2023-10-12 15:50:35,286][62634] Updated weights for policy 0, policy_version 2340 (0.0009) [2023-10-12 15:50:35,661][62634] Updated weights for policy 0, policy_version 2350 (0.0009) [2023-10-12 15:50:36,046][62634] Updated weights for policy 0, policy_version 2360 (0.0007) [2023-10-12 15:50:38,319][62635] Updated weights for policy 1, policy_version 2370 (0.0008) [2023-10-12 15:50:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 4849664. Throughput: 0: 1663.4, 1: 1689.6. Samples: 1219512. 
Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-12 15:50:38,436][61643] Avg episode reward: [(0, '0.880'), (1, '0.380')] [2023-10-12 15:50:38,693][62635] Updated weights for policy 1, policy_version 2380 (0.0007) [2023-10-12 15:50:39,054][62635] Updated weights for policy 1, policy_version 2390 (0.0009) [2023-10-12 15:50:39,425][62635] Updated weights for policy 1, policy_version 2400 (0.0010) [2023-10-12 15:50:40,028][62634] Updated weights for policy 0, policy_version 2370 (0.0008) [2023-10-12 15:50:40,406][62634] Updated weights for policy 0, policy_version 2380 (0.0009) [2023-10-12 15:50:40,789][62634] Updated weights for policy 0, policy_version 2390 (0.0008) [2023-10-12 15:50:41,167][62634] Updated weights for policy 0, policy_version 2400 (0.0008) [2023-10-12 15:50:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4915200. Throughput: 0: 1677.2, 1: 1695.4. Samples: 1239870. Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) [2023-10-12 15:50:43,436][61643] Avg episode reward: [(0, '0.860'), (1, '0.490')] [2023-10-12 15:50:43,460][62635] Updated weights for policy 1, policy_version 2410 (0.0010) [2023-10-12 15:50:43,826][62635] Updated weights for policy 1, policy_version 2420 (0.0009) [2023-10-12 15:50:44,201][62635] Updated weights for policy 1, policy_version 2430 (0.0008) [2023-10-12 15:50:44,269][62495] Saving new best policy, reward=0.490! [2023-10-12 15:50:45,368][62634] Updated weights for policy 0, policy_version 2410 (0.0007) [2023-10-12 15:50:45,733][62634] Updated weights for policy 0, policy_version 2420 (0.0010) [2023-10-12 15:50:46,116][62634] Updated weights for policy 0, policy_version 2430 (0.0007) [2023-10-12 15:50:48,162][62635] Updated weights for policy 1, policy_version 2440 (0.0008) [2023-10-12 15:50:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4980736. Throughput: 0: 1680.2, 1: 1692.4. Samples: 1260388. 
Policy #0 lag: (min: 4.0, avg: 10.9, max: 36.0) [2023-10-12 15:50:48,435][61643] Avg episode reward: [(0, '1.020'), (1, '0.570')] [2023-10-12 15:50:48,445][62354] Saving new best policy, reward=1.020! [2023-10-12 15:50:48,521][62635] Updated weights for policy 1, policy_version 2450 (0.0009) [2023-10-12 15:50:48,895][62635] Updated weights for policy 1, policy_version 2460 (0.0008) [2023-10-12 15:50:49,040][62495] Saving new best policy, reward=0.570! [2023-10-12 15:50:50,287][62634] Updated weights for policy 0, policy_version 2440 (0.0007) [2023-10-12 15:50:50,661][62634] Updated weights for policy 0, policy_version 2450 (0.0008) [2023-10-12 15:50:51,043][62634] Updated weights for policy 0, policy_version 2460 (0.0008) [2023-10-12 15:50:53,029][62635] Updated weights for policy 1, policy_version 2470 (0.0009) [2023-10-12 15:50:53,395][62635] Updated weights for policy 1, policy_version 2480 (0.0008) [2023-10-12 15:50:53,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 5046272. Throughput: 0: 1663.5, 1: 1695.1. Samples: 1269864. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-12 15:50:53,436][61643] Avg episode reward: [(0, '1.170'), (1, '0.550')] [2023-10-12 15:50:53,437][62354] Saving new best policy, reward=1.170! 
[2023-10-12 15:50:53,764][62635] Updated weights for policy 1, policy_version 2490 (0.0008) [2023-10-12 15:50:54,952][62634] Updated weights for policy 0, policy_version 2470 (0.0010) [2023-10-12 15:50:55,338][62634] Updated weights for policy 0, policy_version 2480 (0.0011) [2023-10-12 15:50:55,715][62634] Updated weights for policy 0, policy_version 2490 (0.0007) [2023-10-12 15:50:57,665][62635] Updated weights for policy 1, policy_version 2500 (0.0008) [2023-10-12 15:50:58,036][62635] Updated weights for policy 1, policy_version 2510 (0.0009) [2023-10-12 15:50:58,399][62635] Updated weights for policy 1, policy_version 2520 (0.0008) [2023-10-12 15:50:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5111808. Throughput: 0: 1674.1, 1: 1696.1. Samples: 1290250. Policy #0 lag: (min: 21.0, avg: 25.8, max: 53.0) [2023-10-12 15:50:58,436][61643] Avg episode reward: [(0, '1.220'), (1, '0.560')] [2023-10-12 15:50:58,436][62354] Saving new best policy, reward=1.220! [2023-10-12 15:50:59,761][62634] Updated weights for policy 0, policy_version 2500 (0.0010) [2023-10-12 15:51:00,149][62634] Updated weights for policy 0, policy_version 2510 (0.0009) [2023-10-12 15:51:00,535][62634] Updated weights for policy 0, policy_version 2520 (0.0007) [2023-10-12 15:51:02,459][62635] Updated weights for policy 1, policy_version 2530 (0.0008) [2023-10-12 15:51:02,837][62635] Updated weights for policy 1, policy_version 2540 (0.0008) [2023-10-12 15:51:03,205][62635] Updated weights for policy 1, policy_version 2550 (0.0009) [2023-10-12 15:51:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 5177344. Throughput: 0: 1675.8, 1: 1684.5. Samples: 1310474. 
Policy #0 lag: (min: 18.0, avg: 19.7, max: 40.0) [2023-10-12 15:51:03,436][61643] Avg episode reward: [(0, '1.220'), (1, '0.560')] [2023-10-12 15:51:03,577][62635] Updated weights for policy 1, policy_version 2560 (0.0008) [2023-10-12 15:51:04,550][62634] Updated weights for policy 0, policy_version 2530 (0.0010) [2023-10-12 15:51:04,942][62634] Updated weights for policy 0, policy_version 2540 (0.0010) [2023-10-12 15:51:05,310][62634] Updated weights for policy 0, policy_version 2550 (0.0011) [2023-10-12 15:51:05,692][62634] Updated weights for policy 0, policy_version 2560 (0.0010) [2023-10-12 15:51:07,550][62635] Updated weights for policy 1, policy_version 2570 (0.0008) [2023-10-12 15:51:07,915][62635] Updated weights for policy 1, policy_version 2580 (0.0009) [2023-10-12 15:51:08,288][62635] Updated weights for policy 1, policy_version 2590 (0.0008) [2023-10-12 15:51:08,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 5275648. Throughput: 0: 1661.6, 1: 1701.6. Samples: 1320188. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 15:51:08,435][61643] Avg episode reward: [(0, '1.330'), (1, '0.640')] [2023-10-12 15:51:08,436][62354] Saving new best policy, reward=1.330! [2023-10-12 15:51:08,436][62495] Saving new best policy, reward=0.640! [2023-10-12 15:51:09,698][62634] Updated weights for policy 0, policy_version 2570 (0.0009) [2023-10-12 15:51:10,074][62634] Updated weights for policy 0, policy_version 2580 (0.0009) [2023-10-12 15:51:10,455][62634] Updated weights for policy 0, policy_version 2590 (0.0011) [2023-10-12 15:51:12,326][62635] Updated weights for policy 1, policy_version 2600 (0.0009) [2023-10-12 15:51:12,689][62635] Updated weights for policy 1, policy_version 2610 (0.0007) [2023-10-12 15:51:13,065][62635] Updated weights for policy 1, policy_version 2620 (0.0007) [2023-10-12 15:51:13,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 5341184. 
Throughput: 0: 1672.7, 1: 1703.4. Samples: 1340968. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 15:51:13,435][61643] Avg episode reward: [(0, '1.440'), (1, '0.720')] [2023-10-12 15:51:13,436][62495] Saving new best policy, reward=0.720! [2023-10-12 15:51:13,436][62354] Saving new best policy, reward=1.440! [2023-10-12 15:51:14,612][62634] Updated weights for policy 0, policy_version 2600 (0.0007) [2023-10-12 15:51:14,978][62634] Updated weights for policy 0, policy_version 2610 (0.0007) [2023-10-12 15:51:15,359][62634] Updated weights for policy 0, policy_version 2620 (0.0008) [2023-10-12 15:51:17,223][62635] Updated weights for policy 1, policy_version 2630 (0.0007) [2023-10-12 15:51:17,601][62635] Updated weights for policy 1, policy_version 2640 (0.0007) [2023-10-12 15:51:17,961][62635] Updated weights for policy 1, policy_version 2650 (0.0008) [2023-10-12 15:51:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 5406720. Throughput: 0: 1676.5, 1: 1675.7. Samples: 1360686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-12 15:51:18,436][61643] Avg episode reward: [(0, '1.480'), (1, '0.660')] [2023-10-12 15:51:18,449][62354] Saving new best policy, reward=1.480! [2023-10-12 15:51:19,382][62634] Updated weights for policy 0, policy_version 2630 (0.0009) [2023-10-12 15:51:19,757][62634] Updated weights for policy 0, policy_version 2640 (0.0008) [2023-10-12 15:51:20,141][62634] Updated weights for policy 0, policy_version 2650 (0.0009) [2023-10-12 15:51:22,076][62635] Updated weights for policy 1, policy_version 2660 (0.0009) [2023-10-12 15:51:22,450][62635] Updated weights for policy 1, policy_version 2670 (0.0009) [2023-10-12 15:51:22,828][62635] Updated weights for policy 1, policy_version 2680 (0.0009) [2023-10-12 15:51:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 5472256. Throughput: 0: 1665.8, 1: 1696.3. Samples: 1370806. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 15:51:23,435][61643] Avg episode reward: [(0, '1.510'), (1, '0.630')] [2023-10-12 15:51:23,436][62354] Saving new best policy, reward=1.510! [2023-10-12 15:51:24,179][62634] Updated weights for policy 0, policy_version 2660 (0.0009) [2023-10-12 15:51:24,565][62634] Updated weights for policy 0, policy_version 2670 (0.0008) [2023-10-12 15:51:24,946][62634] Updated weights for policy 0, policy_version 2680 (0.0009) [2023-10-12 15:51:26,934][62635] Updated weights for policy 1, policy_version 2690 (0.0008) [2023-10-12 15:51:27,301][62635] Updated weights for policy 1, policy_version 2700 (0.0008) [2023-10-12 15:51:27,674][62635] Updated weights for policy 1, policy_version 2710 (0.0008) [2023-10-12 15:51:28,042][62635] Updated weights for policy 1, policy_version 2720 (0.0008) [2023-10-12 15:51:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 5537792. Throughput: 0: 1677.0, 1: 1688.2. Samples: 1391302. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 15:51:28,436][61643] Avg episode reward: [(0, '1.540'), (1, '0.720')] [2023-10-12 15:51:28,437][62354] Saving new best policy, reward=1.540! [2023-10-12 15:51:28,941][62634] Updated weights for policy 0, policy_version 2690 (0.0007) [2023-10-12 15:51:29,318][62634] Updated weights for policy 0, policy_version 2700 (0.0009) [2023-10-12 15:51:29,692][62634] Updated weights for policy 0, policy_version 2710 (0.0008) [2023-10-12 15:51:30,070][62634] Updated weights for policy 0, policy_version 2720 (0.0009) [2023-10-12 15:51:32,282][62635] Updated weights for policy 1, policy_version 2730 (0.0008) [2023-10-12 15:51:32,648][62635] Updated weights for policy 1, policy_version 2740 (0.0009) [2023-10-12 15:51:33,010][62635] Updated weights for policy 1, policy_version 2750 (0.0010) [2023-10-12 15:51:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 5603328. 
Throughput: 0: 1681.6, 1: 1671.2. Samples: 1411266. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 15:51:33,436][61643] Avg episode reward: [(0, '1.560'), (1, '0.810')] [2023-10-12 15:51:33,443][62495] Saving new best policy, reward=0.810! [2023-10-12 15:51:33,443][62354] Saving new best policy, reward=1.560! [2023-10-12 15:51:34,012][62634] Updated weights for policy 0, policy_version 2730 (0.0011) [2023-10-12 15:51:34,403][62634] Updated weights for policy 0, policy_version 2740 (0.0010) [2023-10-12 15:51:34,774][62634] Updated weights for policy 0, policy_version 2750 (0.0010) [2023-10-12 15:51:37,039][62635] Updated weights for policy 1, policy_version 2760 (0.0010) [2023-10-12 15:51:37,408][62635] Updated weights for policy 1, policy_version 2770 (0.0009) [2023-10-12 15:51:37,781][62635] Updated weights for policy 1, policy_version 2780 (0.0007) [2023-10-12 15:51:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 5668864. Throughput: 0: 1675.2, 1: 1695.3. Samples: 1421534. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 15:51:38,435][61643] Avg episode reward: [(0, '1.580'), (1, '0.800')] [2023-10-12 15:51:38,721][62634] Updated weights for policy 0, policy_version 2760 (0.0008) [2023-10-12 15:51:39,098][62634] Updated weights for policy 0, policy_version 2770 (0.0007) [2023-10-12 15:51:39,472][62634] Updated weights for policy 0, policy_version 2780 (0.0007) [2023-10-12 15:51:39,625][62354] Saving new best policy, reward=1.580! [2023-10-12 15:51:41,799][62635] Updated weights for policy 1, policy_version 2790 (0.0009) [2023-10-12 15:51:42,173][62635] Updated weights for policy 1, policy_version 2800 (0.0008) [2023-10-12 15:51:42,550][62635] Updated weights for policy 1, policy_version 2810 (0.0007) [2023-10-12 15:51:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 5734400. Throughput: 0: 1691.6, 1: 1680.1. Samples: 1441976. 
Policy #0 lag: (min: 25.0, avg: 29.0, max: 57.0) [2023-10-12 15:51:43,436][61643] Avg episode reward: [(0, '1.730'), (1, '0.760')] [2023-10-12 15:51:43,548][62634] Updated weights for policy 0, policy_version 2790 (0.0009) [2023-10-12 15:51:43,928][62634] Updated weights for policy 0, policy_version 2800 (0.0008) [2023-10-12 15:51:44,309][62634] Updated weights for policy 0, policy_version 2810 (0.0007) [2023-10-12 15:51:44,533][62354] Saving new best policy, reward=1.730! [2023-10-12 15:51:46,579][62635] Updated weights for policy 1, policy_version 2820 (0.0007) [2023-10-12 15:51:46,950][62635] Updated weights for policy 1, policy_version 2830 (0.0009) [2023-10-12 15:51:47,315][62635] Updated weights for policy 1, policy_version 2840 (0.0007) [2023-10-12 15:51:48,214][62634] Updated weights for policy 0, policy_version 2820 (0.0008) [2023-10-12 15:51:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 5799936. Throughput: 0: 1691.4, 1: 1675.8. Samples: 1461996. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 15:51:48,435][61643] Avg episode reward: [(0, '1.610'), (1, '0.840')] [2023-10-12 15:51:48,444][62495] Saving new best policy, reward=0.840! 
[2023-10-12 15:51:48,594][62634] Updated weights for policy 0, policy_version 2830 (0.0008) [2023-10-12 15:51:48,964][62634] Updated weights for policy 0, policy_version 2840 (0.0008) [2023-10-12 15:51:51,364][62635] Updated weights for policy 1, policy_version 2850 (0.0007) [2023-10-12 15:51:51,740][62635] Updated weights for policy 1, policy_version 2860 (0.0009) [2023-10-12 15:51:52,115][62635] Updated weights for policy 1, policy_version 2870 (0.0008) [2023-10-12 15:51:52,491][62635] Updated weights for policy 1, policy_version 2880 (0.0010) [2023-10-12 15:51:52,984][62634] Updated weights for policy 0, policy_version 2850 (0.0009) [2023-10-12 15:51:53,378][62634] Updated weights for policy 0, policy_version 2860 (0.0010) [2023-10-12 15:51:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 5865472. Throughput: 0: 1697.0, 1: 1685.5. Samples: 1472402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:51:53,435][61643] Avg episode reward: [(0, '1.590'), (1, '0.830')] [2023-10-12 15:51:53,760][62634] Updated weights for policy 0, policy_version 2870 (0.0008) [2023-10-12 15:51:54,146][62634] Updated weights for policy 0, policy_version 2880 (0.0008) [2023-10-12 15:51:56,523][62635] Updated weights for policy 1, policy_version 2890 (0.0008) [2023-10-12 15:51:56,893][62635] Updated weights for policy 1, policy_version 2900 (0.0007) [2023-10-12 15:51:57,268][62635] Updated weights for policy 1, policy_version 2910 (0.0009) [2023-10-12 15:51:58,313][62634] Updated weights for policy 0, policy_version 2890 (0.0008) [2023-10-12 15:51:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 5931008. Throughput: 0: 1694.6, 1: 1657.5. Samples: 1491810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:51:58,435][61643] Avg episode reward: [(0, '1.610'), (1, '1.040')] [2023-10-12 15:51:58,436][62495] Saving new best policy, reward=1.040! 
[2023-10-12 15:51:58,689][62634] Updated weights for policy 0, policy_version 2900 (0.0010) [2023-10-12 15:51:59,074][62634] Updated weights for policy 0, policy_version 2910 (0.0009) [2023-10-12 15:52:01,421][62635] Updated weights for policy 1, policy_version 2920 (0.0008) [2023-10-12 15:52:01,789][62635] Updated weights for policy 1, policy_version 2930 (0.0007) [2023-10-12 15:52:02,156][62635] Updated weights for policy 1, policy_version 2940 (0.0007) [2023-10-12 15:52:03,160][62634] Updated weights for policy 0, policy_version 2920 (0.0008) [2023-10-12 15:52:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 5996544. Throughput: 0: 1691.7, 1: 1671.9. Samples: 1512048. Policy #0 lag: (min: 15.0, avg: 16.2, max: 39.0) [2023-10-12 15:52:03,435][61643] Avg episode reward: [(0, '1.710'), (1, '1.180')] [2023-10-12 15:52:03,442][62495] Saving new best policy, reward=1.180! [2023-10-12 15:52:03,540][62634] Updated weights for policy 0, policy_version 2930 (0.0007) [2023-10-12 15:52:03,921][62634] Updated weights for policy 0, policy_version 2940 (0.0007) [2023-10-12 15:52:06,341][62635] Updated weights for policy 1, policy_version 2950 (0.0009) [2023-10-12 15:52:06,718][62635] Updated weights for policy 1, policy_version 2960 (0.0009) [2023-10-12 15:52:07,091][62635] Updated weights for policy 1, policy_version 2970 (0.0011) [2023-10-12 15:52:07,940][62634] Updated weights for policy 0, policy_version 2950 (0.0009) [2023-10-12 15:52:08,317][62634] Updated weights for policy 0, policy_version 2960 (0.0009) [2023-10-12 15:52:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 6062080. Throughput: 0: 1693.2, 1: 1674.1. Samples: 1522334. 
Policy #0 lag: (min: 3.0, avg: 4.6, max: 30.0) [2023-10-12 15:52:08,436][61643] Avg episode reward: [(0, '1.820'), (1, '1.120')] [2023-10-12 15:52:08,701][62634] Updated weights for policy 0, policy_version 2970 (0.0008) [2023-10-12 15:52:08,922][62354] Saving new best policy, reward=1.820! [2023-10-12 15:52:11,190][62635] Updated weights for policy 1, policy_version 2980 (0.0010) [2023-10-12 15:52:11,550][62635] Updated weights for policy 1, policy_version 2990 (0.0010) [2023-10-12 15:52:11,925][62635] Updated weights for policy 1, policy_version 3000 (0.0009) [2023-10-12 15:52:12,839][62634] Updated weights for policy 0, policy_version 2980 (0.0009) [2023-10-12 15:52:13,228][62634] Updated weights for policy 0, policy_version 2990 (0.0008) [2023-10-12 15:52:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 6127616. Throughput: 0: 1690.0, 1: 1652.7. Samples: 1541722. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:52:13,435][61643] Avg episode reward: [(0, '1.960'), (1, '1.100')] [2023-10-12 15:52:13,599][62634] Updated weights for policy 0, policy_version 3000 (0.0009) [2023-10-12 15:52:13,908][62354] Saving new best policy, reward=1.960! [2023-10-12 15:52:16,165][62635] Updated weights for policy 1, policy_version 3010 (0.0007) [2023-10-12 15:52:16,545][62635] Updated weights for policy 1, policy_version 3020 (0.0009) [2023-10-12 15:52:16,903][62635] Updated weights for policy 1, policy_version 3030 (0.0009) [2023-10-12 15:52:17,271][62635] Updated weights for policy 1, policy_version 3040 (0.0008) [2023-10-12 15:52:17,493][62634] Updated weights for policy 0, policy_version 3010 (0.0008) [2023-10-12 15:52:17,862][62634] Updated weights for policy 0, policy_version 3020 (0.0009) [2023-10-12 15:52:18,237][62634] Updated weights for policy 0, policy_version 3030 (0.0008) [2023-10-12 15:52:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 6193152. 
Throughput: 0: 1680.4, 1: 1663.2. Samples: 1561730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:52:18,436][61643] Avg episode reward: [(0, '1.830'), (1, '1.130')] [2023-10-12 15:52:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000003040_3112960.pth... [2023-10-12 15:52:18,487][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000001472_1507328.pth [2023-10-12 15:52:18,619][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000003040_3112960.pth... [2023-10-12 15:52:18,623][62634] Updated weights for policy 0, policy_version 3040 (0.0008) [2023-10-12 15:52:18,651][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000001472_1507328.pth [2023-10-12 15:52:21,247][62635] Updated weights for policy 1, policy_version 3050 (0.0010) [2023-10-12 15:52:21,617][62635] Updated weights for policy 1, policy_version 3060 (0.0009) [2023-10-12 15:52:21,994][62635] Updated weights for policy 1, policy_version 3070 (0.0009) [2023-10-12 15:52:22,659][62634] Updated weights for policy 0, policy_version 3050 (0.0009) [2023-10-12 15:52:23,040][62634] Updated weights for policy 0, policy_version 3060 (0.0008) [2023-10-12 15:52:23,424][62634] Updated weights for policy 0, policy_version 3070 (0.0009) [2023-10-12 15:52:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 6258688. Throughput: 0: 1690.9, 1: 1663.0. Samples: 1572460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:52:23,435][61643] Avg episode reward: [(0, '1.860'), (1, '1.250')] [2023-10-12 15:52:23,436][62495] Saving new best policy, reward=1.250! 
[2023-10-12 15:52:26,109][62635] Updated weights for policy 1, policy_version 3080 (0.0008) [2023-10-12 15:52:26,486][62635] Updated weights for policy 1, policy_version 3090 (0.0007) [2023-10-12 15:52:26,851][62635] Updated weights for policy 1, policy_version 3100 (0.0009) [2023-10-12 15:52:27,340][62634] Updated weights for policy 0, policy_version 3080 (0.0008) [2023-10-12 15:52:27,715][62634] Updated weights for policy 0, policy_version 3090 (0.0009) [2023-10-12 15:52:28,101][62634] Updated weights for policy 0, policy_version 3100 (0.0007) [2023-10-12 15:52:28,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6356992. Throughput: 0: 1693.7, 1: 1649.6. Samples: 1592424. Policy #0 lag: (min: 31.0, avg: 31.8, max: 50.0) [2023-10-12 15:52:28,436][61643] Avg episode reward: [(0, '1.970'), (1, '1.300')] [2023-10-12 15:52:28,437][62354] Saving new best policy, reward=1.970! [2023-10-12 15:52:28,438][62495] Saving new best policy, reward=1.300! [2023-10-12 15:52:30,864][62635] Updated weights for policy 1, policy_version 3110 (0.0009) [2023-10-12 15:52:31,230][62635] Updated weights for policy 1, policy_version 3120 (0.0007) [2023-10-12 15:52:31,602][62635] Updated weights for policy 1, policy_version 3130 (0.0007) [2023-10-12 15:52:31,957][62634] Updated weights for policy 0, policy_version 3110 (0.0008) [2023-10-12 15:52:32,335][62634] Updated weights for policy 0, policy_version 3120 (0.0008) [2023-10-12 15:52:32,719][62634] Updated weights for policy 0, policy_version 3130 (0.0010) [2023-10-12 15:52:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 6422528. Throughput: 0: 1669.0, 1: 1665.6. Samples: 1612050. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) [2023-10-12 15:52:33,435][61643] Avg episode reward: [(0, '2.070'), (1, '1.320')] [2023-10-12 15:52:33,443][62495] Saving new best policy, reward=1.320! [2023-10-12 15:52:33,443][62354] Saving new best policy, reward=2.070! 
[2023-10-12 15:52:35,610][62635] Updated weights for policy 1, policy_version 3140 (0.0008) [2023-10-12 15:52:35,973][62635] Updated weights for policy 1, policy_version 3150 (0.0009) [2023-10-12 15:52:36,339][62635] Updated weights for policy 1, policy_version 3160 (0.0009) [2023-10-12 15:52:36,867][62634] Updated weights for policy 0, policy_version 3140 (0.0009) [2023-10-12 15:52:37,245][62634] Updated weights for policy 0, policy_version 3150 (0.0010) [2023-10-12 15:52:37,628][62634] Updated weights for policy 0, policy_version 3160 (0.0008) [2023-10-12 15:52:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6488064. Throughput: 0: 1695.2, 1: 1651.7. Samples: 1623014. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) [2023-10-12 15:52:38,436][61643] Avg episode reward: [(0, '2.150'), (1, '1.350')] [2023-10-12 15:52:38,436][62354] Saving new best policy, reward=2.150! [2023-10-12 15:52:38,436][62495] Saving new best policy, reward=1.350! [2023-10-12 15:52:40,377][62635] Updated weights for policy 1, policy_version 3170 (0.0010) [2023-10-12 15:52:40,733][62635] Updated weights for policy 1, policy_version 3180 (0.0007) [2023-10-12 15:52:41,110][62635] Updated weights for policy 1, policy_version 3190 (0.0009) [2023-10-12 15:52:41,477][62635] Updated weights for policy 1, policy_version 3200 (0.0007) [2023-10-12 15:52:41,738][62634] Updated weights for policy 0, policy_version 3170 (0.0008) [2023-10-12 15:52:42,113][62634] Updated weights for policy 0, policy_version 3180 (0.0009) [2023-10-12 15:52:42,492][62634] Updated weights for policy 0, policy_version 3190 (0.0008) [2023-10-12 15:52:42,874][62634] Updated weights for policy 0, policy_version 3200 (0.0009) [2023-10-12 15:52:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6553600. Throughput: 0: 1688.3, 1: 1665.9. Samples: 1642754. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 15:52:43,436][61643] Avg episode reward: [(0, '2.180'), (1, '1.300')] [2023-10-12 15:52:43,438][62354] Saving new best policy, reward=2.180! [2023-10-12 15:52:45,642][62635] Updated weights for policy 1, policy_version 3210 (0.0008) [2023-10-12 15:52:46,016][62635] Updated weights for policy 1, policy_version 3220 (0.0011) [2023-10-12 15:52:46,385][62635] Updated weights for policy 1, policy_version 3230 (0.0009) [2023-10-12 15:52:47,163][62634] Updated weights for policy 0, policy_version 3210 (0.0009) [2023-10-12 15:52:47,542][62634] Updated weights for policy 0, policy_version 3220 (0.0009) [2023-10-12 15:52:47,919][62634] Updated weights for policy 0, policy_version 3230 (0.0009) [2023-10-12 15:52:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6619136. Throughput: 0: 1663.0, 1: 1674.2. Samples: 1662220. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:52:48,435][61643] Avg episode reward: [(0, '2.240'), (1, '1.370')] [2023-10-12 15:52:48,445][62495] Saving new best policy, reward=1.370! [2023-10-12 15:52:48,445][62354] Saving new best policy, reward=2.240! [2023-10-12 15:52:50,443][62635] Updated weights for policy 1, policy_version 3240 (0.0011) [2023-10-12 15:52:50,814][62635] Updated weights for policy 1, policy_version 3250 (0.0007) [2023-10-12 15:52:51,182][62635] Updated weights for policy 1, policy_version 3260 (0.0008) [2023-10-12 15:52:52,004][62634] Updated weights for policy 0, policy_version 3240 (0.0010) [2023-10-12 15:52:52,382][62634] Updated weights for policy 0, policy_version 3250 (0.0009) [2023-10-12 15:52:52,759][62634] Updated weights for policy 0, policy_version 3260 (0.0007) [2023-10-12 15:52:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6684672. Throughput: 0: 1688.4, 1: 1653.4. Samples: 1672712. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:52:53,435][61643] Avg episode reward: [(0, '2.190'), (1, '1.480')] [2023-10-12 15:52:53,436][62495] Saving new best policy, reward=1.480! [2023-10-12 15:52:55,412][62635] Updated weights for policy 1, policy_version 3270 (0.0008) [2023-10-12 15:52:55,772][62635] Updated weights for policy 1, policy_version 3280 (0.0008) [2023-10-12 15:52:56,140][62635] Updated weights for policy 1, policy_version 3290 (0.0008) [2023-10-12 15:52:56,709][62634] Updated weights for policy 0, policy_version 3270 (0.0007) [2023-10-12 15:52:57,086][62634] Updated weights for policy 0, policy_version 3280 (0.0007) [2023-10-12 15:52:57,472][62634] Updated weights for policy 0, policy_version 3290 (0.0007) [2023-10-12 15:52:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6750208. Throughput: 0: 1683.9, 1: 1668.5. Samples: 1692582. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:52:58,436][61643] Avg episode reward: [(0, '2.150'), (1, '1.570')] [2023-10-12 15:52:58,437][62495] Saving new best policy, reward=1.570! [2023-10-12 15:53:00,136][62635] Updated weights for policy 1, policy_version 3300 (0.0009) [2023-10-12 15:53:00,501][62635] Updated weights for policy 1, policy_version 3310 (0.0009) [2023-10-12 15:53:00,879][62635] Updated weights for policy 1, policy_version 3320 (0.0007) [2023-10-12 15:53:01,431][62634] Updated weights for policy 0, policy_version 3300 (0.0007) [2023-10-12 15:53:01,804][62634] Updated weights for policy 0, policy_version 3310 (0.0009) [2023-10-12 15:53:02,181][62634] Updated weights for policy 0, policy_version 3320 (0.0009) [2023-10-12 15:53:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6815744. Throughput: 0: 1675.3, 1: 1676.8. Samples: 1712574. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:53:03,436][61643] Avg episode reward: [(0, '2.250'), (1, '1.650')] [2023-10-12 15:53:03,447][62354] Saving new best policy, reward=2.250! [2023-10-12 15:53:03,447][62495] Saving new best policy, reward=1.650! [2023-10-12 15:53:05,032][62635] Updated weights for policy 1, policy_version 3330 (0.0007) [2023-10-12 15:53:05,398][62635] Updated weights for policy 1, policy_version 3340 (0.0010) [2023-10-12 15:53:05,764][62635] Updated weights for policy 1, policy_version 3350 (0.0009) [2023-10-12 15:53:06,041][62634] Updated weights for policy 0, policy_version 3330 (0.0008) [2023-10-12 15:53:06,135][62635] Updated weights for policy 1, policy_version 3360 (0.0010) [2023-10-12 15:53:06,420][62634] Updated weights for policy 0, policy_version 3340 (0.0011) [2023-10-12 15:53:06,807][62634] Updated weights for policy 0, policy_version 3350 (0.0008) [2023-10-12 15:53:07,179][62634] Updated weights for policy 0, policy_version 3360 (0.0007) [2023-10-12 15:53:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 6881280. Throughput: 0: 1696.1, 1: 1649.6. Samples: 1723016. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:53:08,435][61643] Avg episode reward: [(0, '2.270'), (1, '1.720')] [2023-10-12 15:53:08,436][62354] Saving new best policy, reward=2.270! [2023-10-12 15:53:08,436][62495] Saving new best policy, reward=1.720! 
[2023-10-12 15:53:10,403][62635] Updated weights for policy 1, policy_version 3370 (0.0008) [2023-10-12 15:53:10,769][62635] Updated weights for policy 1, policy_version 3380 (0.0008) [2023-10-12 15:53:11,125][62635] Updated weights for policy 1, policy_version 3390 (0.0008) [2023-10-12 15:53:11,282][62634] Updated weights for policy 0, policy_version 3370 (0.0008) [2023-10-12 15:53:11,669][62634] Updated weights for policy 0, policy_version 3380 (0.0009) [2023-10-12 15:53:12,047][62634] Updated weights for policy 0, policy_version 3390 (0.0009) [2023-10-12 15:53:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 6946816. Throughput: 0: 1662.0, 1: 1673.5. Samples: 1742522. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:53:13,435][61643] Avg episode reward: [(0, '2.260'), (1, '1.820')] [2023-10-12 15:53:13,436][62495] Saving new best policy, reward=1.820! [2023-10-12 15:53:15,140][62635] Updated weights for policy 1, policy_version 3400 (0.0008) [2023-10-12 15:53:15,508][62635] Updated weights for policy 1, policy_version 3410 (0.0008) [2023-10-12 15:53:15,878][62635] Updated weights for policy 1, policy_version 3420 (0.0007) [2023-10-12 15:53:16,006][62634] Updated weights for policy 0, policy_version 3400 (0.0007) [2023-10-12 15:53:16,391][62634] Updated weights for policy 0, policy_version 3410 (0.0007) [2023-10-12 15:53:16,780][62634] Updated weights for policy 0, policy_version 3420 (0.0009) [2023-10-12 15:53:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 7012352. Throughput: 0: 1678.4, 1: 1676.5. Samples: 1763022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:53:18,435][61643] Avg episode reward: [(0, '2.180'), (1, '1.820')] [2023-10-12 15:53:19,971][62635] Updated weights for policy 1, policy_version 3430 (0.0007) [2023-10-12 15:53:20,336][62635] Updated weights for policy 1, policy_version 3440 (0.0007) [2023-10-12 15:53:20,716][62635] Updated weights for policy 1, policy_version 3450 (0.0008) [2023-10-12 15:53:20,856][62634] Updated weights for policy 0, policy_version 3430 (0.0009) [2023-10-12 15:53:21,241][62634] Updated weights for policy 0, policy_version 3440 (0.0009) [2023-10-12 15:53:21,624][62634] Updated weights for policy 0, policy_version 3450 (0.0009) [2023-10-12 15:53:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 7077888. Throughput: 0: 1670.9, 1: 1662.0. Samples: 1772992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:53:23,435][61643] Avg episode reward: [(0, '2.220'), (1, '1.920')] [2023-10-12 15:53:23,436][62495] Saving new best policy, reward=1.920! [2023-10-12 15:53:24,782][62635] Updated weights for policy 1, policy_version 3460 (0.0007) [2023-10-12 15:53:25,159][62635] Updated weights for policy 1, policy_version 3470 (0.0007) [2023-10-12 15:53:25,521][62635] Updated weights for policy 1, policy_version 3480 (0.0007) [2023-10-12 15:53:25,642][62634] Updated weights for policy 0, policy_version 3460 (0.0009) [2023-10-12 15:53:26,028][62634] Updated weights for policy 0, policy_version 3470 (0.0007) [2023-10-12 15:53:26,403][62634] Updated weights for policy 0, policy_version 3480 (0.0008) [2023-10-12 15:53:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 7143424. Throughput: 0: 1662.1, 1: 1674.1. Samples: 1792884. Policy #0 lag: (min: 7.0, avg: 17.8, max: 39.0) [2023-10-12 15:53:28,435][61643] Avg episode reward: [(0, '2.390'), (1, '2.040')] [2023-10-12 15:53:28,436][62495] Saving new best policy, reward=2.040! 
[2023-10-12 15:53:28,436][62354] Saving new best policy, reward=2.390! [2023-10-12 15:53:29,481][62635] Updated weights for policy 1, policy_version 3490 (0.0008) [2023-10-12 15:53:29,846][62635] Updated weights for policy 1, policy_version 3500 (0.0008) [2023-10-12 15:53:30,218][62635] Updated weights for policy 1, policy_version 3510 (0.0008) [2023-10-12 15:53:30,581][62635] Updated weights for policy 1, policy_version 3520 (0.0008) [2023-10-12 15:53:30,627][62634] Updated weights for policy 0, policy_version 3490 (0.0007) [2023-10-12 15:53:31,018][62634] Updated weights for policy 0, policy_version 3500 (0.0008) [2023-10-12 15:53:31,401][62634] Updated weights for policy 0, policy_version 3510 (0.0009) [2023-10-12 15:53:31,790][62634] Updated weights for policy 0, policy_version 3520 (0.0009) [2023-10-12 15:53:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 7208960. Throughput: 0: 1685.5, 1: 1676.8. Samples: 1813526. Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-12 15:53:33,436][61643] Avg episode reward: [(0, '2.480'), (1, '1.920')] [2023-10-12 15:53:33,446][62354] Saving new best policy, reward=2.480! [2023-10-12 15:53:34,536][62635] Updated weights for policy 1, policy_version 3530 (0.0007) [2023-10-12 15:53:34,898][62635] Updated weights for policy 1, policy_version 3540 (0.0007) [2023-10-12 15:53:35,269][62635] Updated weights for policy 1, policy_version 3550 (0.0007) [2023-10-12 15:53:35,781][62634] Updated weights for policy 0, policy_version 3530 (0.0009) [2023-10-12 15:53:36,160][62634] Updated weights for policy 0, policy_version 3540 (0.0007) [2023-10-12 15:53:36,531][62634] Updated weights for policy 0, policy_version 3550 (0.0011) [2023-10-12 15:53:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 7274496. Throughput: 0: 1677.3, 1: 1673.9. Samples: 1823518. 
Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-12 15:53:38,435][61643] Avg episode reward: [(0, '2.460'), (1, '1.930')] [2023-10-12 15:53:39,286][62635] Updated weights for policy 1, policy_version 3560 (0.0009) [2023-10-12 15:53:39,655][62635] Updated weights for policy 1, policy_version 3570 (0.0010) [2023-10-12 15:53:40,025][62635] Updated weights for policy 1, policy_version 3580 (0.0010) [2023-10-12 15:53:40,684][62634] Updated weights for policy 0, policy_version 3560 (0.0007) [2023-10-12 15:53:41,060][62634] Updated weights for policy 0, policy_version 3570 (0.0007) [2023-10-12 15:53:41,433][62634] Updated weights for policy 0, policy_version 3580 (0.0007) [2023-10-12 15:53:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 7340032. Throughput: 0: 1666.0, 1: 1684.8. Samples: 1843366. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 15:53:43,435][61643] Avg episode reward: [(0, '2.410'), (1, '1.990')] [2023-10-12 15:53:43,991][62635] Updated weights for policy 1, policy_version 3590 (0.0008) [2023-10-12 15:53:44,360][62635] Updated weights for policy 1, policy_version 3600 (0.0008) [2023-10-12 15:53:44,735][62635] Updated weights for policy 1, policy_version 3610 (0.0008) [2023-10-12 15:53:45,260][62634] Updated weights for policy 0, policy_version 3590 (0.0009) [2023-10-12 15:53:45,628][62634] Updated weights for policy 0, policy_version 3600 (0.0007) [2023-10-12 15:53:46,002][62634] Updated weights for policy 0, policy_version 3610 (0.0009) [2023-10-12 15:53:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 7405568. Throughput: 0: 1683.5, 1: 1686.0. Samples: 1864200. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 15:53:48,435][61643] Avg episode reward: [(0, '2.390'), (1, '2.110')] [2023-10-12 15:53:48,441][62495] Saving new best policy, reward=2.110! 
[2023-10-12 15:53:48,779][62635] Updated weights for policy 1, policy_version 3620 (0.0009) [2023-10-12 15:53:49,144][62635] Updated weights for policy 1, policy_version 3630 (0.0010) [2023-10-12 15:53:49,513][62635] Updated weights for policy 1, policy_version 3640 (0.0011) [2023-10-12 15:53:50,026][62634] Updated weights for policy 0, policy_version 3620 (0.0009) [2023-10-12 15:53:50,406][62634] Updated weights for policy 0, policy_version 3630 (0.0010) [2023-10-12 15:53:50,785][62634] Updated weights for policy 0, policy_version 3640 (0.0011) [2023-10-12 15:53:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 7471104. Throughput: 0: 1661.6, 1: 1686.1. Samples: 1873664. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 15:53:53,435][61643] Avg episode reward: [(0, '2.470'), (1, '2.150')] [2023-10-12 15:53:53,533][62635] Updated weights for policy 1, policy_version 3650 (0.0010) [2023-10-12 15:53:53,900][62635] Updated weights for policy 1, policy_version 3660 (0.0008) [2023-10-12 15:53:54,262][62635] Updated weights for policy 1, policy_version 3670 (0.0007) [2023-10-12 15:53:54,634][62495] Saving new best policy, reward=2.150! [2023-10-12 15:53:54,639][62635] Updated weights for policy 1, policy_version 3680 (0.0007) [2023-10-12 15:53:54,995][62634] Updated weights for policy 0, policy_version 3650 (0.0010) [2023-10-12 15:53:55,381][62634] Updated weights for policy 0, policy_version 3660 (0.0007) [2023-10-12 15:53:55,748][62634] Updated weights for policy 0, policy_version 3670 (0.0011) [2023-10-12 15:53:56,124][62634] Updated weights for policy 0, policy_version 3680 (0.0010) [2023-10-12 15:53:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.7). Total num frames: 7536640. Throughput: 0: 1678.2, 1: 1691.3. Samples: 1894150. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 15:53:58,436][61643] Avg episode reward: [(0, '2.560'), (1, '2.150')] [2023-10-12 15:53:58,437][62354] Saving new best policy, reward=2.560! [2023-10-12 15:53:58,745][62635] Updated weights for policy 1, policy_version 3690 (0.0008) [2023-10-12 15:53:59,113][62635] Updated weights for policy 1, policy_version 3700 (0.0007) [2023-10-12 15:53:59,476][62635] Updated weights for policy 1, policy_version 3710 (0.0007) [2023-10-12 15:54:00,176][62634] Updated weights for policy 0, policy_version 3690 (0.0009) [2023-10-12 15:54:00,559][62634] Updated weights for policy 0, policy_version 3700 (0.0009) [2023-10-12 15:54:00,932][62634] Updated weights for policy 0, policy_version 3710 (0.0007) [2023-10-12 15:54:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7602176. Throughput: 0: 1685.3, 1: 1692.5. Samples: 1915024. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-12 15:54:03,436][61643] Avg episode reward: [(0, '2.570'), (1, '2.240')] [2023-10-12 15:54:03,444][62354] Saving new best policy, reward=2.570! [2023-10-12 15:54:03,566][62635] Updated weights for policy 1, policy_version 3720 (0.0009) [2023-10-12 15:54:03,935][62635] Updated weights for policy 1, policy_version 3730 (0.0009) [2023-10-12 15:54:04,310][62635] Updated weights for policy 1, policy_version 3740 (0.0010) [2023-10-12 15:54:04,458][62495] Saving new best policy, reward=2.240! [2023-10-12 15:54:04,938][62634] Updated weights for policy 0, policy_version 3720 (0.0008) [2023-10-12 15:54:05,320][62634] Updated weights for policy 0, policy_version 3730 (0.0009) [2023-10-12 15:54:05,702][62634] Updated weights for policy 0, policy_version 3740 (0.0009) [2023-10-12 15:54:08,417][62635] Updated weights for policy 1, policy_version 3750 (0.0008) [2023-10-12 15:54:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7667712. Throughput: 0: 1665.5, 1: 1692.1. 
Samples: 1924082. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-12 15:54:08,435][61643] Avg episode reward: [(0, '2.490'), (1, '2.340')] [2023-10-12 15:54:08,791][62635] Updated weights for policy 1, policy_version 3760 (0.0008) [2023-10-12 15:54:09,161][62635] Updated weights for policy 1, policy_version 3770 (0.0007) [2023-10-12 15:54:09,379][62495] Saving new best policy, reward=2.340! [2023-10-12 15:54:09,815][62634] Updated weights for policy 0, policy_version 3750 (0.0008) [2023-10-12 15:54:10,196][62634] Updated weights for policy 0, policy_version 3760 (0.0010) [2023-10-12 15:54:10,576][62634] Updated weights for policy 0, policy_version 3770 (0.0008) [2023-10-12 15:54:13,337][62635] Updated weights for policy 1, policy_version 3780 (0.0007) [2023-10-12 15:54:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7733248. Throughput: 0: 1684.1, 1: 1690.2. Samples: 1944728. Policy #0 lag: (min: 23.0, avg: 46.0, max: 48.0) [2023-10-12 15:54:13,435][61643] Avg episode reward: [(0, '2.490'), (1, '2.420')] [2023-10-12 15:54:13,701][62635] Updated weights for policy 1, policy_version 3790 (0.0008) [2023-10-12 15:54:14,083][62635] Updated weights for policy 1, policy_version 3800 (0.0007) [2023-10-12 15:54:14,371][62495] Saving new best policy, reward=2.420! [2023-10-12 15:54:14,613][62634] Updated weights for policy 0, policy_version 3780 (0.0007) [2023-10-12 15:54:14,993][62634] Updated weights for policy 0, policy_version 3790 (0.0009) [2023-10-12 15:54:15,383][62634] Updated weights for policy 0, policy_version 3800 (0.0009) [2023-10-12 15:54:18,230][62635] Updated weights for policy 1, policy_version 3810 (0.0007) [2023-10-12 15:54:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7798784. Throughput: 0: 1696.5, 1: 1690.1. Samples: 1965920. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:54:18,435][61643] Avg episode reward: [(0, '2.560'), (1, '2.340')] [2023-10-12 15:54:18,442][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000003808_3899392.pth... [2023-10-12 15:54:18,478][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000002240_2293760.pth [2023-10-12 15:54:18,593][62635] Updated weights for policy 1, policy_version 3820 (0.0007) [2023-10-12 15:54:18,961][62635] Updated weights for policy 1, policy_version 3830 (0.0007) [2023-10-12 15:54:19,321][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000003840_3932160.pth... [2023-10-12 15:54:19,327][62635] Updated weights for policy 1, policy_version 3840 (0.0007) [2023-10-12 15:54:19,355][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth [2023-10-12 15:54:19,401][62634] Updated weights for policy 0, policy_version 3810 (0.0008) [2023-10-12 15:54:19,810][62634] Updated weights for policy 0, policy_version 3820 (0.0008) [2023-10-12 15:54:20,191][62634] Updated weights for policy 0, policy_version 3830 (0.0007) [2023-10-12 15:54:20,561][62634] Updated weights for policy 0, policy_version 3840 (0.0007) [2023-10-12 15:54:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7864320. Throughput: 0: 1676.0, 1: 1690.4. Samples: 1975004. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:54:23,435][61643] Avg episode reward: [(0, '2.590'), (1, '2.270')] [2023-10-12 15:54:23,436][62354] Saving new best policy, reward=2.590! 
[2023-10-12 15:54:23,504][62635] Updated weights for policy 1, policy_version 3850 (0.0009) [2023-10-12 15:54:23,884][62635] Updated weights for policy 1, policy_version 3860 (0.0008) [2023-10-12 15:54:24,250][62635] Updated weights for policy 1, policy_version 3870 (0.0009) [2023-10-12 15:54:24,508][62634] Updated weights for policy 0, policy_version 3850 (0.0007) [2023-10-12 15:54:24,886][62634] Updated weights for policy 0, policy_version 3860 (0.0007) [2023-10-12 15:54:25,268][62634] Updated weights for policy 0, policy_version 3870 (0.0008) [2023-10-12 15:54:28,045][62635] Updated weights for policy 1, policy_version 3880 (0.0010) [2023-10-12 15:54:28,421][62635] Updated weights for policy 1, policy_version 3890 (0.0009) [2023-10-12 15:54:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7929856. Throughput: 0: 1695.7, 1: 1695.1. Samples: 1995954. Policy #0 lag: (min: 17.0, avg: 28.1, max: 49.0) [2023-10-12 15:54:28,436][61643] Avg episode reward: [(0, '2.600'), (1, '2.250')] [2023-10-12 15:54:28,437][62354] Saving new best policy, reward=2.600! [2023-10-12 15:54:28,784][62635] Updated weights for policy 1, policy_version 3900 (0.0008) [2023-10-12 15:54:29,212][62634] Updated weights for policy 0, policy_version 3880 (0.0010) [2023-10-12 15:54:29,605][62634] Updated weights for policy 0, policy_version 3890 (0.0010) [2023-10-12 15:54:29,978][62634] Updated weights for policy 0, policy_version 3900 (0.0009) [2023-10-12 15:54:32,706][62635] Updated weights for policy 1, policy_version 3910 (0.0010) [2023-10-12 15:54:33,080][62635] Updated weights for policy 1, policy_version 3920 (0.0009) [2023-10-12 15:54:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 7995392. Throughput: 0: 1693.2, 1: 1683.7. Samples: 2016162. 
Policy #0 lag: (min: 17.0, avg: 28.1, max: 49.0) [2023-10-12 15:54:33,435][61643] Avg episode reward: [(0, '2.590'), (1, '2.310')] [2023-10-12 15:54:33,453][62635] Updated weights for policy 1, policy_version 3930 (0.0010) [2023-10-12 15:54:34,028][62634] Updated weights for policy 0, policy_version 3910 (0.0008) [2023-10-12 15:54:34,400][62634] Updated weights for policy 0, policy_version 3920 (0.0008) [2023-10-12 15:54:34,776][62634] Updated weights for policy 0, policy_version 3930 (0.0007) [2023-10-12 15:54:37,622][62635] Updated weights for policy 1, policy_version 3940 (0.0007) [2023-10-12 15:54:37,989][62635] Updated weights for policy 1, policy_version 3950 (0.0009) [2023-10-12 15:54:38,360][62635] Updated weights for policy 1, policy_version 3960 (0.0009) [2023-10-12 15:54:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8060928. Throughput: 0: 1689.7, 1: 1693.2. Samples: 2025896. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 15:54:38,436][61643] Avg episode reward: [(0, '2.560'), (1, '2.450')] [2023-10-12 15:54:38,611][62634] Updated weights for policy 0, policy_version 3940 (0.0007) [2023-10-12 15:54:38,657][62495] Saving new best policy, reward=2.450! [2023-10-12 15:54:38,995][62634] Updated weights for policy 0, policy_version 3950 (0.0011) [2023-10-12 15:54:39,381][62634] Updated weights for policy 0, policy_version 3960 (0.0009) [2023-10-12 15:54:42,293][62635] Updated weights for policy 1, policy_version 3970 (0.0009) [2023-10-12 15:54:42,655][62635] Updated weights for policy 1, policy_version 3980 (0.0008) [2023-10-12 15:54:43,022][62635] Updated weights for policy 1, policy_version 3990 (0.0008) [2023-10-12 15:54:43,367][62634] Updated weights for policy 0, policy_version 3970 (0.0008) [2023-10-12 15:54:43,386][62635] Updated weights for policy 1, policy_version 4000 (0.0009) [2023-10-12 15:54:43,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 8159232. Throughput: 0: 1699.9, 1: 1692.1. Samples: 2046792. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 15:54:43,436][61643] Avg episode reward: [(0, '2.510'), (1, '2.490')] [2023-10-12 15:54:43,436][62495] Saving new best policy, reward=2.490! [2023-10-12 15:54:43,755][62634] Updated weights for policy 0, policy_version 3980 (0.0009) [2023-10-12 15:54:44,131][62634] Updated weights for policy 0, policy_version 3990 (0.0007) [2023-10-12 15:54:44,508][62634] Updated weights for policy 0, policy_version 4000 (0.0009) [2023-10-12 15:54:47,501][62635] Updated weights for policy 1, policy_version 4010 (0.0007) [2023-10-12 15:54:47,871][62635] Updated weights for policy 1, policy_version 4020 (0.0008) [2023-10-12 15:54:48,234][62635] Updated weights for policy 1, policy_version 4030 (0.0007) [2023-10-12 15:54:48,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8224768. Throughput: 0: 1700.4, 1: 1671.8. Samples: 2066772. Policy #0 lag: (min: 21.0, avg: 28.9, max: 53.0) [2023-10-12 15:54:48,436][61643] Avg episode reward: [(0, '2.550'), (1, '2.470')] [2023-10-12 15:54:48,676][62634] Updated weights for policy 0, policy_version 4010 (0.0007) [2023-10-12 15:54:49,058][62634] Updated weights for policy 0, policy_version 4020 (0.0009) [2023-10-12 15:54:49,431][62634] Updated weights for policy 0, policy_version 4030 (0.0008) [2023-10-12 15:54:52,359][62635] Updated weights for policy 1, policy_version 4040 (0.0008) [2023-10-12 15:54:52,726][62635] Updated weights for policy 1, policy_version 4050 (0.0007) [2023-10-12 15:54:53,101][62635] Updated weights for policy 1, policy_version 4060 (0.0007) [2023-10-12 15:54:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8290304. Throughput: 0: 1697.7, 1: 1692.4. Samples: 2076638. 
Policy #0 lag: (min: 21.0, avg: 28.9, max: 53.0) [2023-10-12 15:54:53,435][61643] Avg episode reward: [(0, '2.630'), (1, '2.450')] [2023-10-12 15:54:53,480][62634] Updated weights for policy 0, policy_version 4040 (0.0008) [2023-10-12 15:54:53,856][62634] Updated weights for policy 0, policy_version 4050 (0.0010) [2023-10-12 15:54:54,232][62634] Updated weights for policy 0, policy_version 4060 (0.0008) [2023-10-12 15:54:54,374][62354] Saving new best policy, reward=2.630! [2023-10-12 15:54:57,047][62635] Updated weights for policy 1, policy_version 4070 (0.0009) [2023-10-12 15:54:57,424][62635] Updated weights for policy 1, policy_version 4080 (0.0007) [2023-10-12 15:54:57,792][62635] Updated weights for policy 1, policy_version 4090 (0.0007) [2023-10-12 15:54:58,232][62634] Updated weights for policy 0, policy_version 4070 (0.0007) [2023-10-12 15:54:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8355840. Throughput: 0: 1700.6, 1: 1690.8. Samples: 2097338. Policy #0 lag: (min: 9.0, avg: 27.5, max: 41.0) [2023-10-12 15:54:58,436][61643] Avg episode reward: [(0, '2.690'), (1, '2.510')] [2023-10-12 15:54:58,437][62495] Saving new best policy, reward=2.510! [2023-10-12 15:54:58,611][62634] Updated weights for policy 0, policy_version 4080 (0.0008) [2023-10-12 15:54:58,983][62634] Updated weights for policy 0, policy_version 4090 (0.0010) [2023-10-12 15:54:59,209][62354] Saving new best policy, reward=2.690! 
[2023-10-12 15:55:01,993][62635] Updated weights for policy 1, policy_version 4100 (0.0008) [2023-10-12 15:55:02,368][62635] Updated weights for policy 1, policy_version 4110 (0.0009) [2023-10-12 15:55:02,751][62635] Updated weights for policy 1, policy_version 4120 (0.0009) [2023-10-12 15:55:02,959][62634] Updated weights for policy 0, policy_version 4100 (0.0008) [2023-10-12 15:55:03,340][62634] Updated weights for policy 0, policy_version 4110 (0.0008) [2023-10-12 15:55:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 8421376. Throughput: 0: 1692.2, 1: 1665.3. Samples: 2117008. Policy #0 lag: (min: 2.0, avg: 10.9, max: 34.0) [2023-10-12 15:55:03,435][61643] Avg episode reward: [(0, '2.670'), (1, '2.560')] [2023-10-12 15:55:03,443][62495] Saving new best policy, reward=2.560! [2023-10-12 15:55:03,721][62634] Updated weights for policy 0, policy_version 4120 (0.0009) [2023-10-12 15:55:06,534][62635] Updated weights for policy 1, policy_version 4130 (0.0009) [2023-10-12 15:55:06,903][62635] Updated weights for policy 1, policy_version 4140 (0.0009) [2023-10-12 15:55:07,281][62635] Updated weights for policy 1, policy_version 4150 (0.0007) [2023-10-12 15:55:07,649][62635] Updated weights for policy 1, policy_version 4160 (0.0007) [2023-10-12 15:55:07,785][62634] Updated weights for policy 0, policy_version 4130 (0.0008) [2023-10-12 15:55:08,198][62634] Updated weights for policy 0, policy_version 4140 (0.0008) [2023-10-12 15:55:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8486912. Throughput: 0: 1696.6, 1: 1692.8. Samples: 2127526. Policy #0 lag: (min: 7.0, avg: 14.9, max: 39.0) [2023-10-12 15:55:08,436][61643] Avg episode reward: [(0, '2.700'), (1, '2.610')] [2023-10-12 15:55:08,437][62495] Saving new best policy, reward=2.610! 
[2023-10-12 15:55:08,577][62634] Updated weights for policy 0, policy_version 4150 (0.0010) [2023-10-12 15:55:08,952][62354] Saving new best policy, reward=2.700! [2023-10-12 15:55:08,953][62634] Updated weights for policy 0, policy_version 4160 (0.0008) [2023-10-12 15:55:11,826][62635] Updated weights for policy 1, policy_version 4170 (0.0010) [2023-10-12 15:55:12,192][62635] Updated weights for policy 1, policy_version 4180 (0.0010) [2023-10-12 15:55:12,562][62635] Updated weights for policy 1, policy_version 4190 (0.0007) [2023-10-12 15:55:12,989][62634] Updated weights for policy 0, policy_version 4170 (0.0010) [2023-10-12 15:55:13,360][62634] Updated weights for policy 0, policy_version 4180 (0.0010) [2023-10-12 15:55:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8552448. Throughput: 0: 1692.4, 1: 1676.9. Samples: 2147572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:55:13,435][61643] Avg episode reward: [(0, '2.750'), (1, '2.680')] [2023-10-12 15:55:13,436][62495] Saving new best policy, reward=2.680! [2023-10-12 15:55:13,734][62634] Updated weights for policy 0, policy_version 4190 (0.0009) [2023-10-12 15:55:13,809][62354] Saving new best policy, reward=2.750! [2023-10-12 15:55:16,632][62635] Updated weights for policy 1, policy_version 4200 (0.0009) [2023-10-12 15:55:16,990][62635] Updated weights for policy 1, policy_version 4210 (0.0008) [2023-10-12 15:55:17,368][62635] Updated weights for policy 1, policy_version 4220 (0.0009) [2023-10-12 15:55:17,742][62634] Updated weights for policy 0, policy_version 4200 (0.0009) [2023-10-12 15:55:18,114][62634] Updated weights for policy 0, policy_version 4210 (0.0010) [2023-10-12 15:55:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8617984. Throughput: 0: 1684.2, 1: 1671.7. Samples: 2167180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:55:18,436][61643] Avg episode reward: [(0, '2.720'), (1, '2.680')] [2023-10-12 15:55:18,496][62634] Updated weights for policy 0, policy_version 4220 (0.0008) [2023-10-12 15:55:21,444][62635] Updated weights for policy 1, policy_version 4230 (0.0007) [2023-10-12 15:55:21,816][62635] Updated weights for policy 1, policy_version 4240 (0.0009) [2023-10-12 15:55:22,186][62635] Updated weights for policy 1, policy_version 4250 (0.0007) [2023-10-12 15:55:22,526][62634] Updated weights for policy 0, policy_version 4230 (0.0009) [2023-10-12 15:55:22,909][62634] Updated weights for policy 0, policy_version 4240 (0.0008) [2023-10-12 15:55:23,280][62634] Updated weights for policy 0, policy_version 4250 (0.0008) [2023-10-12 15:55:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8683520. Throughput: 0: 1691.3, 1: 1691.6. Samples: 2178126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:55:23,435][61643] Avg episode reward: [(0, '2.700'), (1, '2.590')] [2023-10-12 15:55:26,244][62635] Updated weights for policy 1, policy_version 4260 (0.0008) [2023-10-12 15:55:26,611][62635] Updated weights for policy 1, policy_version 4270 (0.0009) [2023-10-12 15:55:26,988][62635] Updated weights for policy 1, policy_version 4280 (0.0010) [2023-10-12 15:55:27,349][62634] Updated weights for policy 0, policy_version 4260 (0.0009) [2023-10-12 15:55:27,735][62634] Updated weights for policy 0, policy_version 4270 (0.0008) [2023-10-12 15:55:28,116][62634] Updated weights for policy 0, policy_version 4280 (0.0008) [2023-10-12 15:55:28,435][61643] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 8781824. Throughput: 0: 1692.0, 1: 1672.3. Samples: 2198184. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 15:55:28,435][61643] Avg episode reward: [(0, '2.720'), (1, '2.610')] [2023-10-12 15:55:31,097][62635] Updated weights for policy 1, policy_version 4290 (0.0010) [2023-10-12 15:55:31,475][62635] Updated weights for policy 1, policy_version 4300 (0.0008) [2023-10-12 15:55:31,836][62635] Updated weights for policy 1, policy_version 4310 (0.0008) [2023-10-12 15:55:32,206][62635] Updated weights for policy 1, policy_version 4320 (0.0008) [2023-10-12 15:55:32,232][62634] Updated weights for policy 0, policy_version 4290 (0.0010) [2023-10-12 15:55:32,609][62634] Updated weights for policy 0, policy_version 4300 (0.0007) [2023-10-12 15:55:32,988][62634] Updated weights for policy 0, policy_version 4310 (0.0009) [2023-10-12 15:55:33,362][62634] Updated weights for policy 0, policy_version 4320 (0.0009) [2023-10-12 15:55:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 8847360. Throughput: 0: 1672.5, 1: 1681.4. Samples: 2217696. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 15:55:33,435][61643] Avg episode reward: [(0, '2.660'), (1, '2.560')] [2023-10-12 15:55:36,110][62635] Updated weights for policy 1, policy_version 4330 (0.0008) [2023-10-12 15:55:36,481][62635] Updated weights for policy 1, policy_version 4340 (0.0009) [2023-10-12 15:55:36,852][62635] Updated weights for policy 1, policy_version 4350 (0.0007) [2023-10-12 15:55:37,383][62634] Updated weights for policy 0, policy_version 4330 (0.0010) [2023-10-12 15:55:37,760][62634] Updated weights for policy 0, policy_version 4340 (0.0011) [2023-10-12 15:55:38,136][62634] Updated weights for policy 0, policy_version 4350 (0.0010) [2023-10-12 15:55:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 8912896. Throughput: 0: 1689.8, 1: 1688.2. Samples: 2228648. 
Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) [2023-10-12 15:55:38,435][61643] Avg episode reward: [(0, '2.720'), (1, '2.620')] [2023-10-12 15:55:41,014][62635] Updated weights for policy 1, policy_version 4360 (0.0010) [2023-10-12 15:55:41,388][62635] Updated weights for policy 1, policy_version 4370 (0.0009) [2023-10-12 15:55:41,750][62635] Updated weights for policy 1, policy_version 4380 (0.0011) [2023-10-12 15:55:42,170][62634] Updated weights for policy 0, policy_version 4360 (0.0008) [2023-10-12 15:55:42,557][62634] Updated weights for policy 0, policy_version 4370 (0.0010) [2023-10-12 15:55:42,944][62634] Updated weights for policy 0, policy_version 4380 (0.0010) [2023-10-12 15:55:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 8978432. Throughput: 0: 1684.5, 1: 1668.2. Samples: 2248212. Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) [2023-10-12 15:55:43,435][61643] Avg episode reward: [(0, '2.780'), (1, '2.710')] [2023-10-12 15:55:43,436][62354] Saving new best policy, reward=2.780! [2023-10-12 15:55:43,436][62495] Saving new best policy, reward=2.710! [2023-10-12 15:55:45,929][62635] Updated weights for policy 1, policy_version 4390 (0.0008) [2023-10-12 15:55:46,308][62635] Updated weights for policy 1, policy_version 4400 (0.0010) [2023-10-12 15:55:46,672][62635] Updated weights for policy 1, policy_version 4410 (0.0011) [2023-10-12 15:55:46,941][62634] Updated weights for policy 0, policy_version 4390 (0.0009) [2023-10-12 15:55:47,314][62634] Updated weights for policy 0, policy_version 4400 (0.0007) [2023-10-12 15:55:47,693][62634] Updated weights for policy 0, policy_version 4410 (0.0009) [2023-10-12 15:55:48,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 9043968. Throughput: 0: 1660.9, 1: 1689.5. Samples: 2267778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:55:48,436][61643] Avg episode reward: [(0, '2.840'), (1, '2.790')] [2023-10-12 15:55:48,445][62354] Saving new best policy, reward=2.840! [2023-10-12 15:55:48,445][62495] Saving new best policy, reward=2.790! [2023-10-12 15:55:50,800][62635] Updated weights for policy 1, policy_version 4420 (0.0008) [2023-10-12 15:55:51,166][62635] Updated weights for policy 1, policy_version 4430 (0.0008) [2023-10-12 15:55:51,534][62635] Updated weights for policy 1, policy_version 4440 (0.0007) [2023-10-12 15:55:51,703][62634] Updated weights for policy 0, policy_version 4420 (0.0010) [2023-10-12 15:55:52,079][62634] Updated weights for policy 0, policy_version 4430 (0.0007) [2023-10-12 15:55:52,465][62634] Updated weights for policy 0, policy_version 4440 (0.0007) [2023-10-12 15:55:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 9109504. Throughput: 0: 1684.8, 1: 1679.8. Samples: 2278932. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:55:53,436][61643] Avg episode reward: [(0, '2.760'), (1, '2.760')] [2023-10-12 15:55:55,546][62635] Updated weights for policy 1, policy_version 4450 (0.0009) [2023-10-12 15:55:55,924][62635] Updated weights for policy 1, policy_version 4460 (0.0008) [2023-10-12 15:55:56,293][62635] Updated weights for policy 1, policy_version 4470 (0.0009) [2023-10-12 15:55:56,599][62634] Updated weights for policy 0, policy_version 4450 (0.0007) [2023-10-12 15:55:56,668][62635] Updated weights for policy 1, policy_version 4480 (0.0008) [2023-10-12 15:55:56,991][62634] Updated weights for policy 0, policy_version 4460 (0.0007) [2023-10-12 15:55:57,369][62634] Updated weights for policy 0, policy_version 4470 (0.0007) [2023-10-12 15:55:57,752][62634] Updated weights for policy 0, policy_version 4480 (0.0007) [2023-10-12 15:55:58,435][61643] Fps is (10 sec: 13107.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 9175040. 
Throughput: 0: 1676.7, 1: 1673.0. Samples: 2298308. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:55:58,435][61643] Avg episode reward: [(0, '2.700'), (1, '2.830')] [2023-10-12 15:55:58,436][62495] Saving new best policy, reward=2.830! [2023-10-12 15:56:00,800][62635] Updated weights for policy 1, policy_version 4490 (0.0010) [2023-10-12 15:56:01,182][62635] Updated weights for policy 1, policy_version 4500 (0.0008) [2023-10-12 15:56:01,546][62635] Updated weights for policy 1, policy_version 4510 (0.0007) [2023-10-12 15:56:01,691][62634] Updated weights for policy 0, policy_version 4490 (0.0009) [2023-10-12 15:56:02,070][62634] Updated weights for policy 0, policy_version 4500 (0.0008) [2023-10-12 15:56:02,463][62634] Updated weights for policy 0, policy_version 4510 (0.0007) [2023-10-12 15:56:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 9240576. Throughput: 0: 1662.7, 1: 1687.3. Samples: 2317930. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:56:03,435][61643] Avg episode reward: [(0, '2.700'), (1, '2.840')] [2023-10-12 15:56:03,444][62495] Saving new best policy, reward=2.840! [2023-10-12 15:56:05,596][62635] Updated weights for policy 1, policy_version 4520 (0.0009) [2023-10-12 15:56:05,969][62635] Updated weights for policy 1, policy_version 4530 (0.0009) [2023-10-12 15:56:06,336][62635] Updated weights for policy 1, policy_version 4540 (0.0007) [2023-10-12 15:56:06,485][62634] Updated weights for policy 0, policy_version 4520 (0.0007) [2023-10-12 15:56:06,878][62634] Updated weights for policy 0, policy_version 4530 (0.0008) [2023-10-12 15:56:07,254][62634] Updated weights for policy 0, policy_version 4540 (0.0007) [2023-10-12 15:56:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 9306112. Throughput: 0: 1680.0, 1: 1668.1. Samples: 2328792. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 15:56:08,435][61643] Avg episode reward: [(0, '2.690'), (1, '2.750')] [2023-10-12 15:56:10,404][62635] Updated weights for policy 1, policy_version 4550 (0.0008) [2023-10-12 15:56:10,765][62635] Updated weights for policy 1, policy_version 4560 (0.0009) [2023-10-12 15:56:11,128][62635] Updated weights for policy 1, policy_version 4570 (0.0007) [2023-10-12 15:56:11,338][62634] Updated weights for policy 0, policy_version 4550 (0.0007) [2023-10-12 15:56:11,714][62634] Updated weights for policy 0, policy_version 4560 (0.0011) [2023-10-12 15:56:12,100][62634] Updated weights for policy 0, policy_version 4570 (0.0010) [2023-10-12 15:56:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 9371648. Throughput: 0: 1659.1, 1: 1675.0. Samples: 2348216. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 15:56:13,436][61643] Avg episode reward: [(0, '2.720'), (1, '2.700')] [2023-10-12 15:56:15,105][62635] Updated weights for policy 1, policy_version 4580 (0.0007) [2023-10-12 15:56:15,470][62635] Updated weights for policy 1, policy_version 4590 (0.0009) [2023-10-12 15:56:15,838][62635] Updated weights for policy 1, policy_version 4600 (0.0009) [2023-10-12 15:56:16,087][62634] Updated weights for policy 0, policy_version 4580 (0.0010) [2023-10-12 15:56:16,464][62634] Updated weights for policy 0, policy_version 4590 (0.0008) [2023-10-12 15:56:16,839][62634] Updated weights for policy 0, policy_version 4600 (0.0007) [2023-10-12 15:56:18,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 9437184. Throughput: 0: 1670.7, 1: 1680.9. Samples: 2368522. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 15:56:18,436][61643] Avg episode reward: [(0, '2.700'), (1, '2.740')] [2023-10-12 15:56:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000004608_4718592.pth... 
[2023-10-12 15:56:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000004608_4718592.pth... [2023-10-12 15:56:18,476][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000003040_3112960.pth [2023-10-12 15:56:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000003040_3112960.pth [2023-10-12 15:56:19,951][62635] Updated weights for policy 1, policy_version 4610 (0.0008) [2023-10-12 15:56:20,321][62635] Updated weights for policy 1, policy_version 4620 (0.0009) [2023-10-12 15:56:20,694][62635] Updated weights for policy 1, policy_version 4630 (0.0008) [2023-10-12 15:56:20,784][62634] Updated weights for policy 0, policy_version 4610 (0.0010) [2023-10-12 15:56:21,075][62635] Updated weights for policy 1, policy_version 4640 (0.0009) [2023-10-12 15:56:21,153][62634] Updated weights for policy 0, policy_version 4620 (0.0008) [2023-10-12 15:56:21,527][62634] Updated weights for policy 0, policy_version 4630 (0.0008) [2023-10-12 15:56:21,903][62634] Updated weights for policy 0, policy_version 4640 (0.0009) [2023-10-12 15:56:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 9502720. Throughput: 0: 1680.9, 1: 1656.0. Samples: 2378810. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 15:56:23,436][61643] Avg episode reward: [(0, '2.780'), (1, '2.800')] [2023-10-12 15:56:25,187][62635] Updated weights for policy 1, policy_version 4650 (0.0008) [2023-10-12 15:56:25,557][62635] Updated weights for policy 1, policy_version 4660 (0.0007) [2023-10-12 15:56:25,850][62634] Updated weights for policy 0, policy_version 4650 (0.0007) [2023-10-12 15:56:25,917][62635] Updated weights for policy 1, policy_version 4670 (0.0007) [2023-10-12 15:56:26,233][62634] Updated weights for policy 0, policy_version 4660 (0.0009) [2023-10-12 15:56:26,607][62634] Updated weights for policy 0, policy_version 4670 (0.0010) [2023-10-12 15:56:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9568256. Throughput: 0: 1661.9, 1: 1674.8. Samples: 2398364. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:56:28,436][61643] Avg episode reward: [(0, '2.840'), (1, '2.630')] [2023-10-12 15:56:29,965][62635] Updated weights for policy 1, policy_version 4680 (0.0009) [2023-10-12 15:56:30,330][62635] Updated weights for policy 1, policy_version 4690 (0.0007) [2023-10-12 15:56:30,703][62635] Updated weights for policy 1, policy_version 4700 (0.0007) [2023-10-12 15:56:30,723][62634] Updated weights for policy 0, policy_version 4680 (0.0009) [2023-10-12 15:56:31,105][62634] Updated weights for policy 0, policy_version 4690 (0.0009) [2023-10-12 15:56:31,493][62634] Updated weights for policy 0, policy_version 4700 (0.0008) [2023-10-12 15:56:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 9633792. Throughput: 0: 1686.8, 1: 1677.7. Samples: 2419180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:56:33,436][61643] Avg episode reward: [(0, '2.810'), (1, '2.610')] [2023-10-12 15:56:34,849][62635] Updated weights for policy 1, policy_version 4710 (0.0008) [2023-10-12 15:56:35,221][62635] Updated weights for policy 1, policy_version 4720 (0.0008) [2023-10-12 15:56:35,486][62634] Updated weights for policy 0, policy_version 4710 (0.0008) [2023-10-12 15:56:35,586][62635] Updated weights for policy 1, policy_version 4730 (0.0008) [2023-10-12 15:56:35,866][62634] Updated weights for policy 0, policy_version 4720 (0.0007) [2023-10-12 15:56:36,236][62634] Updated weights for policy 0, policy_version 4730 (0.0007) [2023-10-12 15:56:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9699328. Throughput: 0: 1674.7, 1: 1655.5. Samples: 2428788. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:56:38,436][61643] Avg episode reward: [(0, '2.810'), (1, '2.790')] [2023-10-12 15:56:39,536][62635] Updated weights for policy 1, policy_version 4740 (0.0008) [2023-10-12 15:56:39,908][62635] Updated weights for policy 1, policy_version 4750 (0.0008) [2023-10-12 15:56:40,282][62635] Updated weights for policy 1, policy_version 4760 (0.0009) [2023-10-12 15:56:40,477][62634] Updated weights for policy 0, policy_version 4740 (0.0008) [2023-10-12 15:56:40,847][62634] Updated weights for policy 0, policy_version 4750 (0.0007) [2023-10-12 15:56:41,232][62634] Updated weights for policy 0, policy_version 4760 (0.0008) [2023-10-12 15:56:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 9764864. Throughput: 0: 1666.3, 1: 1680.9. Samples: 2448934. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:56:43,436][61643] Avg episode reward: [(0, '2.870'), (1, '2.890')] [2023-10-12 15:56:43,437][62354] Saving new best policy, reward=2.870! [2023-10-12 15:56:43,437][62495] Saving new best policy, reward=2.890! 
[2023-10-12 15:56:44,278][62635] Updated weights for policy 1, policy_version 4770 (0.0008) [2023-10-12 15:56:44,657][62635] Updated weights for policy 1, policy_version 4780 (0.0008) [2023-10-12 15:56:45,020][62635] Updated weights for policy 1, policy_version 4790 (0.0008) [2023-10-12 15:56:45,056][62634] Updated weights for policy 0, policy_version 4770 (0.0008) [2023-10-12 15:56:45,388][62635] Updated weights for policy 1, policy_version 4800 (0.0007) [2023-10-12 15:56:45,468][62634] Updated weights for policy 0, policy_version 4780 (0.0009) [2023-10-12 15:56:45,843][62634] Updated weights for policy 0, policy_version 4790 (0.0007) [2023-10-12 15:56:46,219][62634] Updated weights for policy 0, policy_version 4800 (0.0010) [2023-10-12 15:56:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9830400. Throughput: 0: 1690.9, 1: 1685.5. Samples: 2469868. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-12 15:56:48,436][61643] Avg episode reward: [(0, '2.840'), (1, '2.920')] [2023-10-12 15:56:48,448][62495] Saving new best policy, reward=2.920! [2023-10-12 15:56:49,651][62635] Updated weights for policy 1, policy_version 4810 (0.0007) [2023-10-12 15:56:50,036][62635] Updated weights for policy 1, policy_version 4820 (0.0008) [2023-10-12 15:56:50,406][62635] Updated weights for policy 1, policy_version 4830 (0.0009) [2023-10-12 15:56:50,435][62634] Updated weights for policy 0, policy_version 4810 (0.0010) [2023-10-12 15:56:50,817][62634] Updated weights for policy 0, policy_version 4820 (0.0010) [2023-10-12 15:56:51,191][62634] Updated weights for policy 0, policy_version 4830 (0.0007) [2023-10-12 15:56:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9895936. Throughput: 0: 1670.0, 1: 1670.6. Samples: 2479122. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-12 15:56:53,435][61643] Avg episode reward: [(0, '2.760'), (1, '2.850')] [2023-10-12 15:56:54,302][62635] Updated weights for policy 1, policy_version 4840 (0.0008) [2023-10-12 15:56:54,673][62635] Updated weights for policy 1, policy_version 4850 (0.0007) [2023-10-12 15:56:55,033][62635] Updated weights for policy 1, policy_version 4860 (0.0008) [2023-10-12 15:56:55,252][62634] Updated weights for policy 0, policy_version 4840 (0.0009) [2023-10-12 15:56:55,635][62634] Updated weights for policy 0, policy_version 4850 (0.0009) [2023-10-12 15:56:56,024][62634] Updated weights for policy 0, policy_version 4860 (0.0009) [2023-10-12 15:56:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 9961472. Throughput: 0: 1678.3, 1: 1685.3. Samples: 2499578. Policy #0 lag: (min: 12.0, avg: 13.3, max: 37.0) [2023-10-12 15:56:58,436][61643] Avg episode reward: [(0, '2.780'), (1, '2.860')] [2023-10-12 15:56:59,098][62635] Updated weights for policy 1, policy_version 4870 (0.0009) [2023-10-12 15:56:59,462][62635] Updated weights for policy 1, policy_version 4880 (0.0008) [2023-10-12 15:56:59,830][62635] Updated weights for policy 1, policy_version 4890 (0.0007) [2023-10-12 15:57:00,044][62634] Updated weights for policy 0, policy_version 4870 (0.0009) [2023-10-12 15:57:00,426][62634] Updated weights for policy 0, policy_version 4880 (0.0009) [2023-10-12 15:57:00,802][62634] Updated weights for policy 0, policy_version 4890 (0.0008) [2023-10-12 15:57:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 10027008. Throughput: 0: 1685.3, 1: 1685.3. Samples: 2520194. 
Policy #0 lag: (min: 12.0, avg: 13.3, max: 37.0) [2023-10-12 15:57:03,435][61643] Avg episode reward: [(0, '2.870'), (1, '2.900')] [2023-10-12 15:57:03,965][62635] Updated weights for policy 1, policy_version 4900 (0.0008) [2023-10-12 15:57:04,342][62635] Updated weights for policy 1, policy_version 4910 (0.0009) [2023-10-12 15:57:04,714][62635] Updated weights for policy 1, policy_version 4920 (0.0007) [2023-10-12 15:57:04,781][62634] Updated weights for policy 0, policy_version 4900 (0.0008) [2023-10-12 15:57:05,155][62634] Updated weights for policy 0, policy_version 4910 (0.0008) [2023-10-12 15:57:05,530][62634] Updated weights for policy 0, policy_version 4920 (0.0011) [2023-10-12 15:57:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 10092544. Throughput: 0: 1658.4, 1: 1685.3. Samples: 2529276. Policy #0 lag: (min: 31.0, avg: 41.6, max: 63.0) [2023-10-12 15:57:08,436][61643] Avg episode reward: [(0, '2.860'), (1, '2.910')] [2023-10-12 15:57:08,655][62635] Updated weights for policy 1, policy_version 4930 (0.0007) [2023-10-12 15:57:09,025][62635] Updated weights for policy 1, policy_version 4940 (0.0011) [2023-10-12 15:57:09,402][62635] Updated weights for policy 1, policy_version 4950 (0.0010) [2023-10-12 15:57:09,593][62634] Updated weights for policy 0, policy_version 4930 (0.0010) [2023-10-12 15:57:09,768][62635] Updated weights for policy 1, policy_version 4960 (0.0007) [2023-10-12 15:57:09,978][62634] Updated weights for policy 0, policy_version 4940 (0.0010) [2023-10-12 15:57:10,347][62634] Updated weights for policy 0, policy_version 4950 (0.0011) [2023-10-12 15:57:10,724][62634] Updated weights for policy 0, policy_version 4960 (0.0008) [2023-10-12 15:57:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 10158080. Throughput: 0: 1683.6, 1: 1691.0. Samples: 2550222. 
Policy #0 lag: (min: 31.0, avg: 41.6, max: 63.0) [2023-10-12 15:57:13,436][61643] Avg episode reward: [(0, '2.840'), (1, '2.850')] [2023-10-12 15:57:13,738][62635] Updated weights for policy 1, policy_version 4970 (0.0009) [2023-10-12 15:57:14,098][62635] Updated weights for policy 1, policy_version 4980 (0.0007) [2023-10-12 15:57:14,468][62635] Updated weights for policy 1, policy_version 4990 (0.0009) [2023-10-12 15:57:14,713][62634] Updated weights for policy 0, policy_version 4970 (0.0009) [2023-10-12 15:57:15,090][62634] Updated weights for policy 0, policy_version 4980 (0.0008) [2023-10-12 15:57:15,475][62634] Updated weights for policy 0, policy_version 4990 (0.0010) [2023-10-12 15:57:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 10223616. Throughput: 0: 1681.3, 1: 1693.5. Samples: 2571046. Policy #0 lag: (min: 8.0, avg: 25.5, max: 40.0) [2023-10-12 15:57:18,436][61643] Avg episode reward: [(0, '2.800'), (1, '2.740')] [2023-10-12 15:57:18,551][62635] Updated weights for policy 1, policy_version 5000 (0.0009) [2023-10-12 15:57:18,922][62635] Updated weights for policy 1, policy_version 5010 (0.0008) [2023-10-12 15:57:19,304][62635] Updated weights for policy 1, policy_version 5020 (0.0007) [2023-10-12 15:57:19,542][62634] Updated weights for policy 0, policy_version 5000 (0.0007) [2023-10-12 15:57:19,913][62634] Updated weights for policy 0, policy_version 5010 (0.0009) [2023-10-12 15:57:20,304][62634] Updated weights for policy 0, policy_version 5020 (0.0009) [2023-10-12 15:57:23,222][62635] Updated weights for policy 1, policy_version 5030 (0.0008) [2023-10-12 15:57:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10289152. Throughput: 0: 1665.7, 1: 1699.3. Samples: 2580208. 
Policy #0 lag: (min: 8.0, avg: 25.5, max: 40.0) [2023-10-12 15:57:23,435][61643] Avg episode reward: [(0, '2.840'), (1, '2.740')] [2023-10-12 15:57:23,589][62635] Updated weights for policy 1, policy_version 5040 (0.0009) [2023-10-12 15:57:23,958][62635] Updated weights for policy 1, policy_version 5050 (0.0010) [2023-10-12 15:57:24,357][62634] Updated weights for policy 0, policy_version 5030 (0.0009) [2023-10-12 15:57:24,734][62634] Updated weights for policy 0, policy_version 5040 (0.0008) [2023-10-12 15:57:25,113][62634] Updated weights for policy 0, policy_version 5050 (0.0009) [2023-10-12 15:57:28,112][62635] Updated weights for policy 1, policy_version 5060 (0.0008) [2023-10-12 15:57:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 10354688. Throughput: 0: 1682.9, 1: 1694.6. Samples: 2600920. Policy #0 lag: (min: 27.0, avg: 30.0, max: 59.0) [2023-10-12 15:57:28,436][61643] Avg episode reward: [(0, '2.930'), (1, '2.750')] [2023-10-12 15:57:28,437][62354] Saving new best policy, reward=2.930! [2023-10-12 15:57:28,480][62635] Updated weights for policy 1, policy_version 5070 (0.0008) [2023-10-12 15:57:28,852][62635] Updated weights for policy 1, policy_version 5080 (0.0008) [2023-10-12 15:57:29,098][62634] Updated weights for policy 0, policy_version 5060 (0.0011) [2023-10-12 15:57:29,481][62634] Updated weights for policy 0, policy_version 5070 (0.0008) [2023-10-12 15:57:29,848][62634] Updated weights for policy 0, policy_version 5080 (0.0009) [2023-10-12 15:57:32,608][62635] Updated weights for policy 1, policy_version 5090 (0.0008) [2023-10-12 15:57:32,972][62635] Updated weights for policy 1, policy_version 5100 (0.0007) [2023-10-12 15:57:33,332][62635] Updated weights for policy 1, policy_version 5110 (0.0007) [2023-10-12 15:57:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10420224. Throughput: 0: 1682.7, 1: 1685.8. Samples: 2621450. 
Policy #0 lag: (min: 27.0, avg: 30.0, max: 59.0) [2023-10-12 15:57:33,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.890')] [2023-10-12 15:57:33,701][62635] Updated weights for policy 1, policy_version 5120 (0.0008) [2023-10-12 15:57:33,968][62634] Updated weights for policy 0, policy_version 5090 (0.0008) [2023-10-12 15:57:34,373][62634] Updated weights for policy 0, policy_version 5100 (0.0007) [2023-10-12 15:57:34,751][62634] Updated weights for policy 0, policy_version 5110 (0.0008) [2023-10-12 15:57:35,133][62634] Updated weights for policy 0, policy_version 5120 (0.0007) [2023-10-12 15:57:37,913][62635] Updated weights for policy 1, policy_version 5130 (0.0008) [2023-10-12 15:57:38,283][62635] Updated weights for policy 1, policy_version 5140 (0.0009) [2023-10-12 15:57:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10485760. Throughput: 0: 1672.4, 1: 1701.9. Samples: 2630968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:57:38,436][61643] Avg episode reward: [(0, '2.860'), (1, '2.870')] [2023-10-12 15:57:38,653][62635] Updated weights for policy 1, policy_version 5150 (0.0010) [2023-10-12 15:57:39,161][62634] Updated weights for policy 0, policy_version 5130 (0.0007) [2023-10-12 15:57:39,543][62634] Updated weights for policy 0, policy_version 5140 (0.0009) [2023-10-12 15:57:39,915][62634] Updated weights for policy 0, policy_version 5150 (0.0009) [2023-10-12 15:57:42,651][62635] Updated weights for policy 1, policy_version 5160 (0.0009) [2023-10-12 15:57:43,022][62635] Updated weights for policy 1, policy_version 5170 (0.0007) [2023-10-12 15:57:43,385][62635] Updated weights for policy 1, policy_version 5180 (0.0011) [2023-10-12 15:57:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10551296. Throughput: 0: 1684.0, 1: 1698.5. Samples: 2651788. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:57:43,435][61643] Avg episode reward: [(0, '2.880'), (1, '2.860')] [2023-10-12 15:57:44,035][62634] Updated weights for policy 0, policy_version 5160 (0.0010) [2023-10-12 15:57:44,413][62634] Updated weights for policy 0, policy_version 5170 (0.0009) [2023-10-12 15:57:44,792][62634] Updated weights for policy 0, policy_version 5180 (0.0008) [2023-10-12 15:57:47,572][62635] Updated weights for policy 1, policy_version 5190 (0.0008) [2023-10-12 15:57:47,942][62635] Updated weights for policy 1, policy_version 5200 (0.0007) [2023-10-12 15:57:48,306][62635] Updated weights for policy 1, policy_version 5210 (0.0008) [2023-10-12 15:57:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 10616832. Throughput: 0: 1691.5, 1: 1682.6. Samples: 2672028. Policy #0 lag: (min: 26.0, avg: 26.6, max: 43.0) [2023-10-12 15:57:48,435][61643] Avg episode reward: [(0, '2.940'), (1, '2.840')] [2023-10-12 15:57:48,442][62354] Saving new best policy, reward=2.940! [2023-10-12 15:57:48,817][62634] Updated weights for policy 0, policy_version 5190 (0.0010) [2023-10-12 15:57:49,194][62634] Updated weights for policy 0, policy_version 5200 (0.0008) [2023-10-12 15:57:49,573][62634] Updated weights for policy 0, policy_version 5210 (0.0009) [2023-10-12 15:57:52,408][62635] Updated weights for policy 1, policy_version 5220 (0.0009) [2023-10-12 15:57:52,774][62635] Updated weights for policy 1, policy_version 5230 (0.0009) [2023-10-12 15:57:53,145][62635] Updated weights for policy 1, policy_version 5240 (0.0008) [2023-10-12 15:57:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10682368. Throughput: 0: 1691.3, 1: 1696.3. Samples: 2681718. 
Policy #0 lag: (min: 26.0, avg: 26.6, max: 43.0) [2023-10-12 15:57:53,436][61643] Avg episode reward: [(0, '2.930'), (1, '2.900')] [2023-10-12 15:57:53,697][62634] Updated weights for policy 0, policy_version 5220 (0.0011) [2023-10-12 15:57:54,071][62634] Updated weights for policy 0, policy_version 5230 (0.0010) [2023-10-12 15:57:54,451][62634] Updated weights for policy 0, policy_version 5240 (0.0009) [2023-10-12 15:57:57,253][62635] Updated weights for policy 1, policy_version 5250 (0.0008) [2023-10-12 15:57:57,622][62635] Updated weights for policy 1, policy_version 5260 (0.0011) [2023-10-12 15:57:57,992][62635] Updated weights for policy 1, policy_version 5270 (0.0011) [2023-10-12 15:57:58,358][62635] Updated weights for policy 1, policy_version 5280 (0.0011) [2023-10-12 15:57:58,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10780672. Throughput: 0: 1685.4, 1: 1696.4. Samples: 2702402. Policy #0 lag: (min: 1.0, avg: 11.5, max: 33.0) [2023-10-12 15:57:58,435][61643] Avg episode reward: [(0, '2.860'), (1, '2.910')] [2023-10-12 15:57:58,490][62634] Updated weights for policy 0, policy_version 5250 (0.0010) [2023-10-12 15:57:58,872][62634] Updated weights for policy 0, policy_version 5260 (0.0010) [2023-10-12 15:57:59,249][62634] Updated weights for policy 0, policy_version 5270 (0.0010) [2023-10-12 15:57:59,622][62634] Updated weights for policy 0, policy_version 5280 (0.0009) [2023-10-12 15:58:02,433][62635] Updated weights for policy 1, policy_version 5290 (0.0007) [2023-10-12 15:58:02,805][62635] Updated weights for policy 1, policy_version 5300 (0.0008) [2023-10-12 15:58:03,175][62635] Updated weights for policy 1, policy_version 5310 (0.0007) [2023-10-12 15:58:03,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10846208. Throughput: 0: 1689.0, 1: 1667.4. Samples: 2722084. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 33.0) [2023-10-12 15:58:03,436][61643] Avg episode reward: [(0, '2.780'), (1, '2.880')] [2023-10-12 15:58:03,750][62634] Updated weights for policy 0, policy_version 5290 (0.0009) [2023-10-12 15:58:04,124][62634] Updated weights for policy 0, policy_version 5300 (0.0007) [2023-10-12 15:58:04,501][62634] Updated weights for policy 0, policy_version 5310 (0.0009) [2023-10-12 15:58:07,386][62635] Updated weights for policy 1, policy_version 5320 (0.0008) [2023-10-12 15:58:07,760][62635] Updated weights for policy 1, policy_version 5330 (0.0008) [2023-10-12 15:58:08,129][62635] Updated weights for policy 1, policy_version 5340 (0.0007) [2023-10-12 15:58:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10911744. Throughput: 0: 1686.5, 1: 1684.4. Samples: 2731900. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 15:58:08,436][61643] Avg episode reward: [(0, '2.750'), (1, '2.880')] [2023-10-12 15:58:08,611][62634] Updated weights for policy 0, policy_version 5320 (0.0007) [2023-10-12 15:58:09,000][62634] Updated weights for policy 0, policy_version 5330 (0.0008) [2023-10-12 15:58:09,373][62634] Updated weights for policy 0, policy_version 5340 (0.0009) [2023-10-12 15:58:12,129][62635] Updated weights for policy 1, policy_version 5350 (0.0010) [2023-10-12 15:58:12,498][62635] Updated weights for policy 1, policy_version 5360 (0.0011) [2023-10-12 15:58:12,881][62635] Updated weights for policy 1, policy_version 5370 (0.0007) [2023-10-12 15:58:13,351][62634] Updated weights for policy 0, policy_version 5350 (0.0010) [2023-10-12 15:58:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10977280. Throughput: 0: 1686.6, 1: 1677.9. Samples: 2752320. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 15:58:13,436][61643] Avg episode reward: [(0, '2.760'), (1, '2.920')] [2023-10-12 15:58:13,730][62634] Updated weights for policy 0, policy_version 5360 (0.0010) [2023-10-12 15:58:14,116][62634] Updated weights for policy 0, policy_version 5370 (0.0009) [2023-10-12 15:58:17,066][62635] Updated weights for policy 1, policy_version 5380 (0.0010) [2023-10-12 15:58:17,442][62635] Updated weights for policy 1, policy_version 5390 (0.0008) [2023-10-12 15:58:17,805][62635] Updated weights for policy 1, policy_version 5400 (0.0009) [2023-10-12 15:58:18,093][62634] Updated weights for policy 0, policy_version 5380 (0.0008) [2023-10-12 15:58:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11042816. Throughput: 0: 1685.5, 1: 1661.7. Samples: 2772076. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-12 15:58:18,435][61643] Avg episode reward: [(0, '2.810'), (1, '2.920')] [2023-10-12 15:58:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000005408_5537792.pth... [2023-10-12 15:58:18,474][62634] Updated weights for policy 0, policy_version 5390 (0.0009) [2023-10-12 15:58:18,479][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000003840_3932160.pth [2023-10-12 15:58:18,857][62634] Updated weights for policy 0, policy_version 5400 (0.0008) [2023-10-12 15:58:19,148][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000005408_5537792.pth... 
[2023-10-12 15:58:19,177][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000003808_3899392.pth [2023-10-12 15:58:21,967][62635] Updated weights for policy 1, policy_version 5410 (0.0007) [2023-10-12 15:58:22,375][62635] Updated weights for policy 1, policy_version 5420 (0.0011) [2023-10-12 15:58:22,752][62635] Updated weights for policy 1, policy_version 5430 (0.0010) [2023-10-12 15:58:22,801][62634] Updated weights for policy 0, policy_version 5410 (0.0008) [2023-10-12 15:58:23,118][62635] Updated weights for policy 1, policy_version 5440 (0.0008) [2023-10-12 15:58:23,209][62634] Updated weights for policy 0, policy_version 5420 (0.0008) [2023-10-12 15:58:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11108352. Throughput: 0: 1688.0, 1: 1676.3. Samples: 2782358. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-12 15:58:23,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.910')] [2023-10-12 15:58:23,592][62634] Updated weights for policy 0, policy_version 5430 (0.0009) [2023-10-12 15:58:23,964][62634] Updated weights for policy 0, policy_version 5440 (0.0009) [2023-10-12 15:58:26,983][62635] Updated weights for policy 1, policy_version 5450 (0.0009) [2023-10-12 15:58:27,360][62635] Updated weights for policy 1, policy_version 5460 (0.0009) [2023-10-12 15:58:27,728][62635] Updated weights for policy 1, policy_version 5470 (0.0008) [2023-10-12 15:58:27,909][62634] Updated weights for policy 0, policy_version 5450 (0.0009) [2023-10-12 15:58:28,287][62634] Updated weights for policy 0, policy_version 5460 (0.0010) [2023-10-12 15:58:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11173888. Throughput: 0: 1689.7, 1: 1661.8. Samples: 2802604. Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 15:58:28,435][61643] Avg episode reward: [(0, '2.870'), (1, '2.970')] [2023-10-12 15:58:28,436][62495] Saving new best policy, reward=2.970! 
[2023-10-12 15:58:28,665][62634] Updated weights for policy 0, policy_version 5470 (0.0007) [2023-10-12 15:58:31,791][62635] Updated weights for policy 1, policy_version 5480 (0.0009) [2023-10-12 15:58:32,159][62635] Updated weights for policy 1, policy_version 5490 (0.0009) [2023-10-12 15:58:32,531][62635] Updated weights for policy 1, policy_version 5500 (0.0007) [2023-10-12 15:58:32,681][62634] Updated weights for policy 0, policy_version 5480 (0.0009) [2023-10-12 15:58:33,065][62634] Updated weights for policy 0, policy_version 5490 (0.0009) [2023-10-12 15:58:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11239424. Throughput: 0: 1671.7, 1: 1667.3. Samples: 2822284. Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 15:58:33,435][61643] Avg episode reward: [(0, '2.890'), (1, '2.960')] [2023-10-12 15:58:33,441][62634] Updated weights for policy 0, policy_version 5500 (0.0010) [2023-10-12 15:58:36,575][62635] Updated weights for policy 1, policy_version 5510 (0.0008) [2023-10-12 15:58:36,940][62635] Updated weights for policy 1, policy_version 5520 (0.0007) [2023-10-12 15:58:37,304][62635] Updated weights for policy 1, policy_version 5530 (0.0007) [2023-10-12 15:58:37,616][62634] Updated weights for policy 0, policy_version 5510 (0.0010) [2023-10-12 15:58:37,992][62634] Updated weights for policy 0, policy_version 5520 (0.0010) [2023-10-12 15:58:38,371][62634] Updated weights for policy 0, policy_version 5530 (0.0010) [2023-10-12 15:58:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11304960. Throughput: 0: 1684.7, 1: 1682.1. Samples: 2833222. 
Policy #0 lag: (min: 17.0, avg: 21.3, max: 49.0) [2023-10-12 15:58:38,436][61643] Avg episode reward: [(0, '2.870'), (1, '2.940')] [2023-10-12 15:58:41,259][62635] Updated weights for policy 1, policy_version 5540 (0.0007) [2023-10-12 15:58:41,629][62635] Updated weights for policy 1, policy_version 5550 (0.0008) [2023-10-12 15:58:41,991][62635] Updated weights for policy 1, policy_version 5560 (0.0008) [2023-10-12 15:58:42,491][62634] Updated weights for policy 0, policy_version 5540 (0.0008) [2023-10-12 15:58:42,866][62634] Updated weights for policy 0, policy_version 5550 (0.0009) [2023-10-12 15:58:43,238][62634] Updated weights for policy 0, policy_version 5560 (0.0010) [2023-10-12 15:58:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11370496. Throughput: 0: 1685.4, 1: 1662.0. Samples: 2853038. Policy #0 lag: (min: 17.0, avg: 21.3, max: 49.0) [2023-10-12 15:58:43,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.880')] [2023-10-12 15:58:46,117][62635] Updated weights for policy 1, policy_version 5570 (0.0009) [2023-10-12 15:58:46,493][62635] Updated weights for policy 1, policy_version 5580 (0.0011) [2023-10-12 15:58:46,856][62635] Updated weights for policy 1, policy_version 5590 (0.0008) [2023-10-12 15:58:47,223][62635] Updated weights for policy 1, policy_version 5600 (0.0007) [2023-10-12 15:58:47,238][62634] Updated weights for policy 0, policy_version 5570 (0.0009) [2023-10-12 15:58:47,625][62634] Updated weights for policy 0, policy_version 5580 (0.0010) [2023-10-12 15:58:48,004][62634] Updated weights for policy 0, policy_version 5590 (0.0009) [2023-10-12 15:58:48,383][62634] Updated weights for policy 0, policy_version 5600 (0.0008) [2023-10-12 15:58:48,435][61643] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 11468800. Throughput: 0: 1670.5, 1: 1679.5. Samples: 2872832. 
Policy #0 lag: (min: 17.0, avg: 26.4, max: 49.0) [2023-10-12 15:58:48,436][61643] Avg episode reward: [(0, '2.940'), (1, '2.880')] [2023-10-12 15:58:51,286][62635] Updated weights for policy 1, policy_version 5610 (0.0009) [2023-10-12 15:58:51,654][62635] Updated weights for policy 1, policy_version 5620 (0.0009) [2023-10-12 15:58:52,027][62635] Updated weights for policy 1, policy_version 5630 (0.0007) [2023-10-12 15:58:52,382][62634] Updated weights for policy 0, policy_version 5610 (0.0007) [2023-10-12 15:58:52,753][62634] Updated weights for policy 0, policy_version 5620 (0.0007) [2023-10-12 15:58:53,139][62634] Updated weights for policy 0, policy_version 5630 (0.0008) [2023-10-12 15:58:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 11534336. Throughput: 0: 1689.8, 1: 1685.5. Samples: 2883790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:58:53,435][61643] Avg episode reward: [(0, '2.970'), (1, '2.880')] [2023-10-12 15:58:53,436][62354] Saving new best policy, reward=2.970! [2023-10-12 15:58:55,979][62635] Updated weights for policy 1, policy_version 5640 (0.0007) [2023-10-12 15:58:56,343][62635] Updated weights for policy 1, policy_version 5650 (0.0007) [2023-10-12 15:58:56,712][62635] Updated weights for policy 1, policy_version 5660 (0.0008) [2023-10-12 15:58:57,219][62634] Updated weights for policy 0, policy_version 5640 (0.0009) [2023-10-12 15:58:57,602][62634] Updated weights for policy 0, policy_version 5650 (0.0008) [2023-10-12 15:58:57,983][62634] Updated weights for policy 0, policy_version 5660 (0.0007) [2023-10-12 15:58:58,435][61643] Fps is (10 sec: 13107.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11599872. Throughput: 0: 1692.5, 1: 1664.4. Samples: 2903378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:58:58,435][61643] Avg episode reward: [(0, '2.920'), (1, '2.940')] [2023-10-12 15:59:00,815][62635] Updated weights for policy 1, policy_version 5670 (0.0011) [2023-10-12 15:59:01,192][62635] Updated weights for policy 1, policy_version 5680 (0.0008) [2023-10-12 15:59:01,560][62635] Updated weights for policy 1, policy_version 5690 (0.0009) [2023-10-12 15:59:01,925][62634] Updated weights for policy 0, policy_version 5670 (0.0009) [2023-10-12 15:59:02,305][62634] Updated weights for policy 0, policy_version 5680 (0.0010) [2023-10-12 15:59:02,694][62634] Updated weights for policy 0, policy_version 5690 (0.0010) [2023-10-12 15:59:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11665408. Throughput: 0: 1664.7, 1: 1685.9. Samples: 2922854. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 15:59:03,436][61643] Avg episode reward: [(0, '2.900'), (1, '2.960')] [2023-10-12 15:59:05,664][62635] Updated weights for policy 1, policy_version 5700 (0.0010) [2023-10-12 15:59:06,041][62635] Updated weights for policy 1, policy_version 5710 (0.0010) [2023-10-12 15:59:06,414][62635] Updated weights for policy 1, policy_version 5720 (0.0008) [2023-10-12 15:59:06,786][62634] Updated weights for policy 0, policy_version 5700 (0.0009) [2023-10-12 15:59:07,167][62634] Updated weights for policy 0, policy_version 5710 (0.0007) [2023-10-12 15:59:07,556][62634] Updated weights for policy 0, policy_version 5720 (0.0007) [2023-10-12 15:59:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11730944. Throughput: 0: 1692.0, 1: 1678.1. Samples: 2934012. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 15:59:08,436][61643] Avg episode reward: [(0, '2.900'), (1, '2.950')] [2023-10-12 15:59:10,500][62635] Updated weights for policy 1, policy_version 5730 (0.0008) [2023-10-12 15:59:10,903][62635] Updated weights for policy 1, policy_version 5740 (0.0010) [2023-10-12 15:59:11,271][62635] Updated weights for policy 1, policy_version 5750 (0.0008) [2023-10-12 15:59:11,624][62634] Updated weights for policy 0, policy_version 5730 (0.0008) [2023-10-12 15:59:11,642][62635] Updated weights for policy 1, policy_version 5760 (0.0008) [2023-10-12 15:59:12,007][62634] Updated weights for policy 0, policy_version 5740 (0.0008) [2023-10-12 15:59:12,392][62634] Updated weights for policy 0, policy_version 5750 (0.0010) [2023-10-12 15:59:12,776][62634] Updated weights for policy 0, policy_version 5760 (0.0009) [2023-10-12 15:59:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11796480. Throughput: 0: 1677.5, 1: 1675.3. Samples: 2953482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:59:13,436][61643] Avg episode reward: [(0, '2.880'), (1, '2.900')] [2023-10-12 15:59:15,762][62635] Updated weights for policy 1, policy_version 5770 (0.0009) [2023-10-12 15:59:16,127][62635] Updated weights for policy 1, policy_version 5780 (0.0010) [2023-10-12 15:59:16,492][62635] Updated weights for policy 1, policy_version 5790 (0.0007) [2023-10-12 15:59:16,688][62634] Updated weights for policy 0, policy_version 5770 (0.0008) [2023-10-12 15:59:17,060][62634] Updated weights for policy 0, policy_version 5780 (0.0008) [2023-10-12 15:59:17,444][62634] Updated weights for policy 0, policy_version 5790 (0.0009) [2023-10-12 15:59:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11862016. Throughput: 0: 1669.7, 1: 1691.1. Samples: 2973522. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:59:18,436][61643] Avg episode reward: [(0, '2.890'), (1, '2.870')] [2023-10-12 15:59:20,445][62635] Updated weights for policy 1, policy_version 5800 (0.0007) [2023-10-12 15:59:20,810][62635] Updated weights for policy 1, policy_version 5810 (0.0007) [2023-10-12 15:59:21,179][62635] Updated weights for policy 1, policy_version 5820 (0.0007) [2023-10-12 15:59:21,613][62634] Updated weights for policy 0, policy_version 5800 (0.0008) [2023-10-12 15:59:21,994][62634] Updated weights for policy 0, policy_version 5810 (0.0007) [2023-10-12 15:59:22,367][62634] Updated weights for policy 0, policy_version 5820 (0.0007) [2023-10-12 15:59:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11927552. Throughput: 0: 1687.5, 1: 1668.1. Samples: 2984220. Policy #0 lag: (min: 31.0, avg: 34.9, max: 63.0) [2023-10-12 15:59:23,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.860')] [2023-10-12 15:59:25,214][62635] Updated weights for policy 1, policy_version 5830 (0.0008) [2023-10-12 15:59:25,588][62635] Updated weights for policy 1, policy_version 5840 (0.0010) [2023-10-12 15:59:25,957][62635] Updated weights for policy 1, policy_version 5850 (0.0009) [2023-10-12 15:59:26,357][62634] Updated weights for policy 0, policy_version 5830 (0.0008) [2023-10-12 15:59:26,743][62634] Updated weights for policy 0, policy_version 5840 (0.0010) [2023-10-12 15:59:27,116][62634] Updated weights for policy 0, policy_version 5850 (0.0008) [2023-10-12 15:59:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 11993088. Throughput: 0: 1670.2, 1: 1681.5. Samples: 3003864. 
Policy #0 lag: (min: 31.0, avg: 34.9, max: 63.0) [2023-10-12 15:59:28,436][61643] Avg episode reward: [(0, '2.950'), (1, '2.920')] [2023-10-12 15:59:30,036][62635] Updated weights for policy 1, policy_version 5860 (0.0008) [2023-10-12 15:59:30,405][62635] Updated weights for policy 1, policy_version 5870 (0.0011) [2023-10-12 15:59:30,766][62635] Updated weights for policy 1, policy_version 5880 (0.0010) [2023-10-12 15:59:31,190][62634] Updated weights for policy 0, policy_version 5860 (0.0009) [2023-10-12 15:59:31,565][62634] Updated weights for policy 0, policy_version 5870 (0.0008) [2023-10-12 15:59:31,943][62634] Updated weights for policy 0, policy_version 5880 (0.0009) [2023-10-12 15:59:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 12058624. Throughput: 0: 1673.0, 1: 1691.0. Samples: 3024210. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 15:59:33,436][61643] Avg episode reward: [(0, '2.990'), (1, '2.910')] [2023-10-12 15:59:33,449][62354] Saving new best policy, reward=2.990! [2023-10-12 15:59:34,870][62635] Updated weights for policy 1, policy_version 5890 (0.0010) [2023-10-12 15:59:35,240][62635] Updated weights for policy 1, policy_version 5900 (0.0009) [2023-10-12 15:59:35,619][62635] Updated weights for policy 1, policy_version 5910 (0.0008) [2023-10-12 15:59:35,988][62635] Updated weights for policy 1, policy_version 5920 (0.0008) [2023-10-12 15:59:36,020][62634] Updated weights for policy 0, policy_version 5890 (0.0009) [2023-10-12 15:59:36,399][62634] Updated weights for policy 0, policy_version 5900 (0.0009) [2023-10-12 15:59:36,773][62634] Updated weights for policy 0, policy_version 5910 (0.0008) [2023-10-12 15:59:37,140][62634] Updated weights for policy 0, policy_version 5920 (0.0009) [2023-10-12 15:59:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12124160. Throughput: 0: 1685.0, 1: 1665.4. Samples: 3034560. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 15:59:38,436][61643] Avg episode reward: [(0, '3.000'), (1, '2.960')] [2023-10-12 15:59:38,438][62354] Saving new best policy, reward=3.000! [2023-10-12 15:59:40,004][62635] Updated weights for policy 1, policy_version 5930 (0.0009) [2023-10-12 15:59:40,371][62635] Updated weights for policy 1, policy_version 5940 (0.0008) [2023-10-12 15:59:40,743][62635] Updated weights for policy 1, policy_version 5950 (0.0007) [2023-10-12 15:59:41,177][62634] Updated weights for policy 0, policy_version 5930 (0.0010) [2023-10-12 15:59:41,559][62634] Updated weights for policy 0, policy_version 5940 (0.0007) [2023-10-12 15:59:41,934][62634] Updated weights for policy 0, policy_version 5950 (0.0009) [2023-10-12 15:59:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12189696. Throughput: 0: 1660.8, 1: 1694.4. Samples: 3054362. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 15:59:43,436][61643] Avg episode reward: [(0, '2.970'), (1, '2.970')] [2023-10-12 15:59:44,673][62635] Updated weights for policy 1, policy_version 5960 (0.0008) [2023-10-12 15:59:45,040][62635] Updated weights for policy 1, policy_version 5970 (0.0009) [2023-10-12 15:59:45,411][62635] Updated weights for policy 1, policy_version 5980 (0.0009) [2023-10-12 15:59:45,846][62634] Updated weights for policy 0, policy_version 5960 (0.0007) [2023-10-12 15:59:46,222][62634] Updated weights for policy 0, policy_version 5970 (0.0008) [2023-10-12 15:59:46,603][62634] Updated weights for policy 0, policy_version 5980 (0.0011) [2023-10-12 15:59:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 12255232. Throughput: 0: 1690.3, 1: 1693.9. Samples: 3075142. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 15:59:48,435][61643] Avg episode reward: [(0, '2.960'), (1, '3.000')] [2023-10-12 15:59:48,445][62495] Saving new best policy, reward=3.000! 
[2023-10-12 15:59:49,543][62635] Updated weights for policy 1, policy_version 5990 (0.0007) [2023-10-12 15:59:49,914][62635] Updated weights for policy 1, policy_version 6000 (0.0008) [2023-10-12 15:59:50,293][62635] Updated weights for policy 1, policy_version 6010 (0.0008) [2023-10-12 15:59:50,697][62634] Updated weights for policy 0, policy_version 5990 (0.0010) [2023-10-12 15:59:51,071][62634] Updated weights for policy 0, policy_version 6000 (0.0007) [2023-10-12 15:59:51,453][62634] Updated weights for policy 0, policy_version 6010 (0.0007) [2023-10-12 15:59:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 12320768. Throughput: 0: 1681.2, 1: 1675.1. Samples: 3085048. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:59:53,436][61643] Avg episode reward: [(0, '2.920'), (1, '3.010')] [2023-10-12 15:59:53,437][62495] Saving new best policy, reward=3.010! [2023-10-12 15:59:54,234][62635] Updated weights for policy 1, policy_version 6020 (0.0009) [2023-10-12 15:59:54,594][62635] Updated weights for policy 1, policy_version 6030 (0.0008) [2023-10-12 15:59:54,964][62635] Updated weights for policy 1, policy_version 6040 (0.0007) [2023-10-12 15:59:55,376][62634] Updated weights for policy 0, policy_version 6020 (0.0007) [2023-10-12 15:59:55,750][62634] Updated weights for policy 0, policy_version 6030 (0.0007) [2023-10-12 15:59:56,122][62634] Updated weights for policy 0, policy_version 6040 (0.0008) [2023-10-12 15:59:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 12386304. Throughput: 0: 1676.0, 1: 1691.3. Samples: 3105012. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 15:59:58,435][61643] Avg episode reward: [(0, '2.960'), (1, '2.970')] [2023-10-12 15:59:59,041][62635] Updated weights for policy 1, policy_version 6050 (0.0009) [2023-10-12 15:59:59,458][62635] Updated weights for policy 1, policy_version 6060 (0.0009) [2023-10-12 15:59:59,825][62635] Updated weights for policy 1, policy_version 6070 (0.0008) [2023-10-12 16:00:00,201][62635] Updated weights for policy 1, policy_version 6080 (0.0009) [2023-10-12 16:00:00,263][62634] Updated weights for policy 0, policy_version 6050 (0.0008) [2023-10-12 16:00:00,656][62634] Updated weights for policy 0, policy_version 6060 (0.0009) [2023-10-12 16:00:01,032][62634] Updated weights for policy 0, policy_version 6070 (0.0011) [2023-10-12 16:00:01,413][62634] Updated weights for policy 0, policy_version 6080 (0.0010) [2023-10-12 16:00:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 12451840. Throughput: 0: 1694.8, 1: 1686.3. Samples: 3125668. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-12 16:00:03,436][61643] Avg episode reward: [(0, '2.890'), (1, '2.940')] [2023-10-12 16:00:04,321][62635] Updated weights for policy 1, policy_version 6090 (0.0008) [2023-10-12 16:00:04,689][62635] Updated weights for policy 1, policy_version 6100 (0.0008) [2023-10-12 16:00:05,063][62635] Updated weights for policy 1, policy_version 6110 (0.0007) [2023-10-12 16:00:05,425][62634] Updated weights for policy 0, policy_version 6090 (0.0009) [2023-10-12 16:00:05,804][62634] Updated weights for policy 0, policy_version 6100 (0.0007) [2023-10-12 16:00:06,181][62634] Updated weights for policy 0, policy_version 6110 (0.0008) [2023-10-12 16:00:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 12517376. Throughput: 0: 1675.3, 1: 1680.4. Samples: 3135228. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-12 16:00:08,436][61643] Avg episode reward: [(0, '2.900'), (1, '2.910')] [2023-10-12 16:00:08,966][62635] Updated weights for policy 1, policy_version 6120 (0.0007) [2023-10-12 16:00:09,336][62635] Updated weights for policy 1, policy_version 6130 (0.0008) [2023-10-12 16:00:09,703][62635] Updated weights for policy 1, policy_version 6140 (0.0009) [2023-10-12 16:00:10,219][62634] Updated weights for policy 0, policy_version 6120 (0.0010) [2023-10-12 16:00:10,591][62634] Updated weights for policy 0, policy_version 6130 (0.0008) [2023-10-12 16:00:10,968][62634] Updated weights for policy 0, policy_version 6140 (0.0007) [2023-10-12 16:00:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 12582912. Throughput: 0: 1685.8, 1: 1692.9. Samples: 3155908. Policy #0 lag: (min: 10.0, avg: 13.2, max: 40.0) [2023-10-12 16:00:13,436][61643] Avg episode reward: [(0, '2.940'), (1, '2.940')] [2023-10-12 16:00:13,758][62635] Updated weights for policy 1, policy_version 6150 (0.0008) [2023-10-12 16:00:14,132][62635] Updated weights for policy 1, policy_version 6160 (0.0009) [2023-10-12 16:00:14,508][62635] Updated weights for policy 1, policy_version 6170 (0.0008) [2023-10-12 16:00:14,870][62634] Updated weights for policy 0, policy_version 6150 (0.0007) [2023-10-12 16:00:15,261][62634] Updated weights for policy 0, policy_version 6160 (0.0007) [2023-10-12 16:00:15,647][62634] Updated weights for policy 0, policy_version 6170 (0.0007) [2023-10-12 16:00:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 12648448. Throughput: 0: 1697.4, 1: 1691.3. Samples: 3176700. Policy #0 lag: (min: 10.0, avg: 13.2, max: 40.0) [2023-10-12 16:00:18,435][61643] Avg episode reward: [(0, '2.970'), (1, '2.960')] [2023-10-12 16:00:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000006176_6324224.pth... 
[2023-10-12 16:00:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000004608_4718592.pth [2023-10-12 16:00:18,628][62635] Updated weights for policy 1, policy_version 6180 (0.0009) [2023-10-12 16:00:18,991][62635] Updated weights for policy 1, policy_version 6190 (0.0009) [2023-10-12 16:00:19,361][62635] Updated weights for policy 1, policy_version 6200 (0.0007) [2023-10-12 16:00:19,663][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000006208_6356992.pth... [2023-10-12 16:00:19,681][62634] Updated weights for policy 0, policy_version 6180 (0.0007) [2023-10-12 16:00:19,695][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000004608_4718592.pth [2023-10-12 16:00:20,059][62634] Updated weights for policy 0, policy_version 6190 (0.0008) [2023-10-12 16:00:20,438][62634] Updated weights for policy 0, policy_version 6200 (0.0007) [2023-10-12 16:00:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12713984. Throughput: 0: 1668.2, 1: 1691.8. Samples: 3185758. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 16:00:23,435][61643] Avg episode reward: [(0, '2.970'), (1, '2.950')] [2023-10-12 16:00:23,565][62635] Updated weights for policy 1, policy_version 6210 (0.0009) [2023-10-12 16:00:23,933][62635] Updated weights for policy 1, policy_version 6220 (0.0011) [2023-10-12 16:00:24,303][62635] Updated weights for policy 1, policy_version 6230 (0.0009) [2023-10-12 16:00:24,569][62634] Updated weights for policy 0, policy_version 6210 (0.0007) [2023-10-12 16:00:24,679][62635] Updated weights for policy 1, policy_version 6240 (0.0009) [2023-10-12 16:00:24,939][62634] Updated weights for policy 0, policy_version 6220 (0.0009) [2023-10-12 16:00:25,320][62634] Updated weights for policy 0, policy_version 6230 (0.0007) [2023-10-12 16:00:25,703][62634] Updated weights for policy 0, policy_version 6240 (0.0009) [2023-10-12 16:00:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 12779520. Throughput: 0: 1688.7, 1: 1689.6. Samples: 3206388. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 16:00:28,436][61643] Avg episode reward: [(0, '2.960'), (1, '2.940')] [2023-10-12 16:00:28,611][62635] Updated weights for policy 1, policy_version 6250 (0.0007) [2023-10-12 16:00:28,979][62635] Updated weights for policy 1, policy_version 6260 (0.0008) [2023-10-12 16:00:29,344][62635] Updated weights for policy 1, policy_version 6270 (0.0008) [2023-10-12 16:00:29,753][62634] Updated weights for policy 0, policy_version 6250 (0.0010) [2023-10-12 16:00:30,139][62634] Updated weights for policy 0, policy_version 6260 (0.0007) [2023-10-12 16:00:30,515][62634] Updated weights for policy 0, policy_version 6270 (0.0007) [2023-10-12 16:00:33,318][62635] Updated weights for policy 1, policy_version 6280 (0.0008) [2023-10-12 16:00:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 12845056. Throughput: 0: 1687.0, 1: 1691.9. Samples: 3227190. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-12 16:00:33,435][61643] Avg episode reward: [(0, '2.950'), (1, '2.940')] [2023-10-12 16:00:33,679][62635] Updated weights for policy 1, policy_version 6290 (0.0008) [2023-10-12 16:00:34,048][62635] Updated weights for policy 1, policy_version 6300 (0.0008) [2023-10-12 16:00:34,525][62634] Updated weights for policy 0, policy_version 6280 (0.0010) [2023-10-12 16:00:34,910][62634] Updated weights for policy 0, policy_version 6290 (0.0008) [2023-10-12 16:00:35,289][62634] Updated weights for policy 0, policy_version 6300 (0.0010) [2023-10-12 16:00:38,063][62635] Updated weights for policy 1, policy_version 6310 (0.0009) [2023-10-12 16:00:38,428][62635] Updated weights for policy 1, policy_version 6320 (0.0009) [2023-10-12 16:00:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12910592. Throughput: 0: 1666.3, 1: 1694.5. Samples: 3236284. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-12 16:00:38,435][61643] Avg episode reward: [(0, '2.950'), (1, '2.910')] [2023-10-12 16:00:38,803][62635] Updated weights for policy 1, policy_version 6330 (0.0007) [2023-10-12 16:00:39,368][62634] Updated weights for policy 0, policy_version 6310 (0.0008) [2023-10-12 16:00:39,739][62634] Updated weights for policy 0, policy_version 6320 (0.0010) [2023-10-12 16:00:40,121][62634] Updated weights for policy 0, policy_version 6330 (0.0009) [2023-10-12 16:00:42,706][62635] Updated weights for policy 1, policy_version 6340 (0.0008) [2023-10-12 16:00:43,084][62635] Updated weights for policy 1, policy_version 6350 (0.0009) [2023-10-12 16:00:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12976128. Throughput: 0: 1686.4, 1: 1698.8. Samples: 3257350. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:00:43,435][61643] Avg episode reward: [(0, '2.890'), (1, '2.860')] [2023-10-12 16:00:43,441][62635] Updated weights for policy 1, policy_version 6360 (0.0010) [2023-10-12 16:00:43,906][62634] Updated weights for policy 0, policy_version 6340 (0.0008) [2023-10-12 16:00:44,286][62634] Updated weights for policy 0, policy_version 6350 (0.0007) [2023-10-12 16:00:44,660][62634] Updated weights for policy 0, policy_version 6360 (0.0008) [2023-10-12 16:00:47,463][62635] Updated weights for policy 1, policy_version 6370 (0.0008) [2023-10-12 16:00:47,843][62635] Updated weights for policy 1, policy_version 6380 (0.0009) [2023-10-12 16:00:48,207][62635] Updated weights for policy 1, policy_version 6390 (0.0009) [2023-10-12 16:00:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 13041664. Throughput: 0: 1687.9, 1: 1691.2. Samples: 3277730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:00:48,436][61643] Avg episode reward: [(0, '2.900'), (1, '2.850')] [2023-10-12 16:00:48,583][62635] Updated weights for policy 1, policy_version 6400 (0.0011) [2023-10-12 16:00:48,857][62634] Updated weights for policy 0, policy_version 6370 (0.0007) [2023-10-12 16:00:49,231][62634] Updated weights for policy 0, policy_version 6380 (0.0008) [2023-10-12 16:00:49,603][62634] Updated weights for policy 0, policy_version 6390 (0.0007) [2023-10-12 16:00:49,986][62634] Updated weights for policy 0, policy_version 6400 (0.0007) [2023-10-12 16:00:52,605][62635] Updated weights for policy 1, policy_version 6410 (0.0008) [2023-10-12 16:00:52,977][62635] Updated weights for policy 1, policy_version 6420 (0.0008) [2023-10-12 16:00:53,343][62635] Updated weights for policy 1, policy_version 6430 (0.0008) [2023-10-12 16:00:53,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13139968. Throughput: 0: 1680.0, 1: 1707.8. Samples: 3287678. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 16:00:53,435][61643] Avg episode reward: [(0, '2.820'), (1, '2.910')] [2023-10-12 16:00:53,888][62634] Updated weights for policy 0, policy_version 6410 (0.0007) [2023-10-12 16:00:54,271][62634] Updated weights for policy 0, policy_version 6420 (0.0008) [2023-10-12 16:00:54,653][62634] Updated weights for policy 0, policy_version 6430 (0.0007) [2023-10-12 16:00:57,391][62635] Updated weights for policy 1, policy_version 6440 (0.0009) [2023-10-12 16:00:57,770][62635] Updated weights for policy 1, policy_version 6450 (0.0010) [2023-10-12 16:00:58,137][62635] Updated weights for policy 1, policy_version 6460 (0.0011) [2023-10-12 16:00:58,435][61643] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13205504. Throughput: 0: 1691.8, 1: 1704.1. Samples: 3308724. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 16:00:58,435][61643] Avg episode reward: [(0, '2.880'), (1, '2.970')] [2023-10-12 16:00:58,547][62634] Updated weights for policy 0, policy_version 6440 (0.0010) [2023-10-12 16:00:58,915][62634] Updated weights for policy 0, policy_version 6450 (0.0008) [2023-10-12 16:00:59,290][62634] Updated weights for policy 0, policy_version 6460 (0.0010) [2023-10-12 16:01:02,245][62635] Updated weights for policy 1, policy_version 6470 (0.0009) [2023-10-12 16:01:02,613][62635] Updated weights for policy 1, policy_version 6480 (0.0007) [2023-10-12 16:01:02,987][62635] Updated weights for policy 1, policy_version 6490 (0.0007) [2023-10-12 16:01:03,220][62634] Updated weights for policy 0, policy_version 6470 (0.0008) [2023-10-12 16:01:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13271040. Throughput: 0: 1693.2, 1: 1682.8. Samples: 3328618. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 16:01:03,436][61643] Avg episode reward: [(0, '2.900'), (1, '2.960')] [2023-10-12 16:01:03,589][62634] Updated weights for policy 0, policy_version 6480 (0.0009) [2023-10-12 16:01:03,974][62634] Updated weights for policy 0, policy_version 6490 (0.0007) [2023-10-12 16:01:07,078][62635] Updated weights for policy 1, policy_version 6500 (0.0008) [2023-10-12 16:01:07,436][62635] Updated weights for policy 1, policy_version 6510 (0.0009) [2023-10-12 16:01:07,801][62635] Updated weights for policy 1, policy_version 6520 (0.0008) [2023-10-12 16:01:08,061][62634] Updated weights for policy 0, policy_version 6500 (0.0008) [2023-10-12 16:01:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13336576. Throughput: 0: 1694.8, 1: 1703.3. Samples: 3338674. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 16:01:08,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.000')] [2023-10-12 16:01:08,441][62634] Updated weights for policy 0, policy_version 6510 (0.0010) [2023-10-12 16:01:08,814][62634] Updated weights for policy 0, policy_version 6520 (0.0010) [2023-10-12 16:01:11,711][62635] Updated weights for policy 1, policy_version 6530 (0.0007) [2023-10-12 16:01:12,079][62635] Updated weights for policy 1, policy_version 6540 (0.0007) [2023-10-12 16:01:12,442][62635] Updated weights for policy 1, policy_version 6550 (0.0008) [2023-10-12 16:01:12,811][62635] Updated weights for policy 1, policy_version 6560 (0.0009) [2023-10-12 16:01:12,895][62634] Updated weights for policy 0, policy_version 6530 (0.0008) [2023-10-12 16:01:13,278][62634] Updated weights for policy 0, policy_version 6540 (0.0010) [2023-10-12 16:01:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 13402112. Throughput: 0: 1696.7, 1: 1692.4. Samples: 3358898. 
Policy #0 lag: (min: 0.0, avg: 20.7, max: 32.0) [2023-10-12 16:01:13,435][61643] Avg episode reward: [(0, '2.940'), (1, '3.010')] [2023-10-12 16:01:13,653][62634] Updated weights for policy 0, policy_version 6550 (0.0008) [2023-10-12 16:01:14,029][62634] Updated weights for policy 0, policy_version 6560 (0.0009) [2023-10-12 16:01:16,757][62635] Updated weights for policy 1, policy_version 6570 (0.0009) [2023-10-12 16:01:17,128][62635] Updated weights for policy 1, policy_version 6580 (0.0008) [2023-10-12 16:01:17,501][62635] Updated weights for policy 1, policy_version 6590 (0.0007) [2023-10-12 16:01:18,002][62634] Updated weights for policy 0, policy_version 6570 (0.0010) [2023-10-12 16:01:18,389][62634] Updated weights for policy 0, policy_version 6580 (0.0008) [2023-10-12 16:01:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13467648. Throughput: 0: 1691.5, 1: 1679.3. Samples: 3378878. Policy #0 lag: (min: 0.0, avg: 20.7, max: 32.0) [2023-10-12 16:01:18,435][61643] Avg episode reward: [(0, '2.980'), (1, '2.980')] [2023-10-12 16:01:18,770][62634] Updated weights for policy 0, policy_version 6590 (0.0007) [2023-10-12 16:01:21,377][62635] Updated weights for policy 1, policy_version 6600 (0.0009) [2023-10-12 16:01:21,739][62635] Updated weights for policy 1, policy_version 6610 (0.0009) [2023-10-12 16:01:22,114][62635] Updated weights for policy 1, policy_version 6620 (0.0009) [2023-10-12 16:01:23,017][62634] Updated weights for policy 0, policy_version 6600 (0.0007) [2023-10-12 16:01:23,393][62634] Updated weights for policy 0, policy_version 6610 (0.0007) [2023-10-12 16:01:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13533184. Throughput: 0: 1701.1, 1: 1711.3. Samples: 3389840. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 16:01:23,435][61643] Avg episode reward: [(0, '2.960'), (1, '2.940')] [2023-10-12 16:01:23,767][62634] Updated weights for policy 0, policy_version 6620 (0.0010) [2023-10-12 16:01:26,279][62635] Updated weights for policy 1, policy_version 6630 (0.0010) [2023-10-12 16:01:26,650][62635] Updated weights for policy 1, policy_version 6640 (0.0007) [2023-10-12 16:01:27,020][62635] Updated weights for policy 1, policy_version 6650 (0.0007) [2023-10-12 16:01:27,792][62634] Updated weights for policy 0, policy_version 6630 (0.0009) [2023-10-12 16:01:28,173][62634] Updated weights for policy 0, policy_version 6640 (0.0008) [2023-10-12 16:01:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13598720. Throughput: 0: 1697.8, 1: 1681.9. Samples: 3409438. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 16:01:28,435][61643] Avg episode reward: [(0, '2.990'), (1, '2.910')] [2023-10-12 16:01:28,562][62634] Updated weights for policy 0, policy_version 6650 (0.0010) [2023-10-12 16:01:31,131][62635] Updated weights for policy 1, policy_version 6660 (0.0009) [2023-10-12 16:01:31,501][62635] Updated weights for policy 1, policy_version 6670 (0.0009) [2023-10-12 16:01:31,865][62635] Updated weights for policy 1, policy_version 6680 (0.0007) [2023-10-12 16:01:32,560][62634] Updated weights for policy 0, policy_version 6660 (0.0008) [2023-10-12 16:01:32,945][62634] Updated weights for policy 0, policy_version 6670 (0.0008) [2023-10-12 16:01:33,326][62634] Updated weights for policy 0, policy_version 6680 (0.0008) [2023-10-12 16:01:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13664256. Throughput: 0: 1683.0, 1: 1682.2. Samples: 3429164. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 16:01:33,435][61643] Avg episode reward: [(0, '2.950'), (1, '2.950')] [2023-10-12 16:01:36,084][62635] Updated weights for policy 1, policy_version 6690 (0.0008) [2023-10-12 16:01:36,506][62635] Updated weights for policy 1, policy_version 6700 (0.0007) [2023-10-12 16:01:36,879][62635] Updated weights for policy 1, policy_version 6710 (0.0007) [2023-10-12 16:01:37,246][62635] Updated weights for policy 1, policy_version 6720 (0.0007) [2023-10-12 16:01:37,490][62634] Updated weights for policy 0, policy_version 6690 (0.0010) [2023-10-12 16:01:37,900][62634] Updated weights for policy 0, policy_version 6700 (0.0009) [2023-10-12 16:01:38,281][62634] Updated weights for policy 0, policy_version 6710 (0.0009) [2023-10-12 16:01:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13729792. Throughput: 0: 1698.0, 1: 1693.4. Samples: 3440290. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 16:01:38,435][61643] Avg episode reward: [(0, '2.930'), (1, '2.970')] [2023-10-12 16:01:38,670][62634] Updated weights for policy 0, policy_version 6720 (0.0008) [2023-10-12 16:01:41,170][62635] Updated weights for policy 1, policy_version 6730 (0.0007) [2023-10-12 16:01:41,545][62635] Updated weights for policy 1, policy_version 6740 (0.0008) [2023-10-12 16:01:41,907][62635] Updated weights for policy 1, policy_version 6750 (0.0010) [2023-10-12 16:01:42,645][62634] Updated weights for policy 0, policy_version 6730 (0.0010) [2023-10-12 16:01:43,025][62634] Updated weights for policy 0, policy_version 6740 (0.0009) [2023-10-12 16:01:43,400][62634] Updated weights for policy 0, policy_version 6750 (0.0009) [2023-10-12 16:01:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13795328. Throughput: 0: 1692.2, 1: 1661.6. Samples: 3459646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:01:43,436][61643] Avg episode reward: [(0, '2.940'), (1, '2.930')] [2023-10-12 16:01:46,007][62635] Updated weights for policy 1, policy_version 6760 (0.0010) [2023-10-12 16:01:46,367][62635] Updated weights for policy 1, policy_version 6770 (0.0008) [2023-10-12 16:01:46,743][62635] Updated weights for policy 1, policy_version 6780 (0.0009) [2023-10-12 16:01:47,385][62634] Updated weights for policy 0, policy_version 6760 (0.0009) [2023-10-12 16:01:47,766][62634] Updated weights for policy 0, policy_version 6770 (0.0009) [2023-10-12 16:01:48,154][62634] Updated weights for policy 0, policy_version 6780 (0.0008) [2023-10-12 16:01:48,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 13893632. Throughput: 0: 1671.6, 1: 1683.4. Samples: 3479592. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) [2023-10-12 16:01:48,436][61643] Avg episode reward: [(0, '2.960'), (1, '2.930')] [2023-10-12 16:01:50,797][62635] Updated weights for policy 1, policy_version 6790 (0.0008) [2023-10-12 16:01:51,165][62635] Updated weights for policy 1, policy_version 6800 (0.0007) [2023-10-12 16:01:51,537][62635] Updated weights for policy 1, policy_version 6810 (0.0010) [2023-10-12 16:01:52,070][62634] Updated weights for policy 0, policy_version 6790 (0.0009) [2023-10-12 16:01:52,446][62634] Updated weights for policy 0, policy_version 6800 (0.0010) [2023-10-12 16:01:52,825][62634] Updated weights for policy 0, policy_version 6810 (0.0010) [2023-10-12 16:01:53,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 13959168. Throughput: 0: 1690.8, 1: 1679.9. Samples: 3490356. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) [2023-10-12 16:01:53,435][61643] Avg episode reward: [(0, '2.940'), (1, '2.900')] [2023-10-12 16:01:55,481][62635] Updated weights for policy 1, policy_version 6820 (0.0009) [2023-10-12 16:01:55,852][62635] Updated weights for policy 1, policy_version 6830 (0.0009) [2023-10-12 16:01:56,220][62635] Updated weights for policy 1, policy_version 6840 (0.0008) [2023-10-12 16:01:57,000][62634] Updated weights for policy 0, policy_version 6820 (0.0008) [2023-10-12 16:01:57,374][62634] Updated weights for policy 0, policy_version 6830 (0.0008) [2023-10-12 16:01:57,759][62634] Updated weights for policy 0, policy_version 6840 (0.0010) [2023-10-12 16:01:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14024704. Throughput: 0: 1691.3, 1: 1676.1. Samples: 3510434. Policy #0 lag: (min: 14.0, avg: 14.2, max: 22.0) [2023-10-12 16:01:58,436][61643] Avg episode reward: [(0, '2.920'), (1, '2.930')] [2023-10-12 16:02:00,161][62635] Updated weights for policy 1, policy_version 6850 (0.0007) [2023-10-12 16:02:00,527][62635] Updated weights for policy 1, policy_version 6860 (0.0009) [2023-10-12 16:02:00,904][62635] Updated weights for policy 1, policy_version 6870 (0.0008) [2023-10-12 16:02:01,269][62635] Updated weights for policy 1, policy_version 6880 (0.0009) [2023-10-12 16:02:01,812][62634] Updated weights for policy 0, policy_version 6850 (0.0010) [2023-10-12 16:02:02,193][62634] Updated weights for policy 0, policy_version 6860 (0.0007) [2023-10-12 16:02:02,583][62634] Updated weights for policy 0, policy_version 6870 (0.0009) [2023-10-12 16:02:02,959][62634] Updated weights for policy 0, policy_version 6880 (0.0008) [2023-10-12 16:02:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 14090240. Throughput: 0: 1670.0, 1: 1692.4. Samples: 3530188. 
Policy #0 lag: (min: 14.0, avg: 14.2, max: 22.0) [2023-10-12 16:02:03,435][61643] Avg episode reward: [(0, '2.870'), (1, '2.950')] [2023-10-12 16:02:05,391][62635] Updated weights for policy 1, policy_version 6890 (0.0007) [2023-10-12 16:02:05,750][62635] Updated weights for policy 1, policy_version 6900 (0.0007) [2023-10-12 16:02:06,125][62635] Updated weights for policy 1, policy_version 6910 (0.0007) [2023-10-12 16:02:07,051][62634] Updated weights for policy 0, policy_version 6890 (0.0009) [2023-10-12 16:02:07,429][62634] Updated weights for policy 0, policy_version 6900 (0.0008) [2023-10-12 16:02:07,815][62634] Updated weights for policy 0, policy_version 6910 (0.0008) [2023-10-12 16:02:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 14155776. Throughput: 0: 1693.9, 1: 1663.3. Samples: 3540914. Policy #0 lag: (min: 27.0, avg: 52.4, max: 56.0) [2023-10-12 16:02:08,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.960')] [2023-10-12 16:02:10,082][62635] Updated weights for policy 1, policy_version 6920 (0.0007) [2023-10-12 16:02:10,446][62635] Updated weights for policy 1, policy_version 6930 (0.0008) [2023-10-12 16:02:10,817][62635] Updated weights for policy 1, policy_version 6940 (0.0008) [2023-10-12 16:02:11,699][62634] Updated weights for policy 0, policy_version 6920 (0.0007) [2023-10-12 16:02:12,076][62634] Updated weights for policy 0, policy_version 6930 (0.0007) [2023-10-12 16:02:12,466][62634] Updated weights for policy 0, policy_version 6940 (0.0009) [2023-10-12 16:02:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14221312. Throughput: 0: 1683.5, 1: 1685.1. Samples: 3561022. 
Policy #0 lag: (min: 27.0, avg: 52.4, max: 56.0) [2023-10-12 16:02:13,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.990')] [2023-10-12 16:02:14,974][62635] Updated weights for policy 1, policy_version 6950 (0.0008) [2023-10-12 16:02:15,339][62635] Updated weights for policy 1, policy_version 6960 (0.0008) [2023-10-12 16:02:15,709][62635] Updated weights for policy 1, policy_version 6970 (0.0010) [2023-10-12 16:02:16,527][62634] Updated weights for policy 0, policy_version 6950 (0.0007) [2023-10-12 16:02:16,914][62634] Updated weights for policy 0, policy_version 6960 (0.0008) [2023-10-12 16:02:17,284][62634] Updated weights for policy 0, policy_version 6970 (0.0008) [2023-10-12 16:02:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 14286848. Throughput: 0: 1677.6, 1: 1697.2. Samples: 3581026. Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) [2023-10-12 16:02:18,435][61643] Avg episode reward: [(0, '2.900'), (1, '2.980')] [2023-10-12 16:02:18,442][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000006976_7143424.pth... [2023-10-12 16:02:18,442][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000006976_7143424.pth... 
[2023-10-12 16:02:18,490][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000005408_5537792.pth [2023-10-12 16:02:18,491][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000005408_5537792.pth [2023-10-12 16:02:19,670][62635] Updated weights for policy 1, policy_version 6980 (0.0009) [2023-10-12 16:02:20,037][62635] Updated weights for policy 1, policy_version 6990 (0.0008) [2023-10-12 16:02:20,401][62635] Updated weights for policy 1, policy_version 7000 (0.0008) [2023-10-12 16:02:21,346][62634] Updated weights for policy 0, policy_version 6980 (0.0009) [2023-10-12 16:02:21,726][62634] Updated weights for policy 0, policy_version 6990 (0.0007) [2023-10-12 16:02:22,106][62634] Updated weights for policy 0, policy_version 7000 (0.0009) [2023-10-12 16:02:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14352384. Throughput: 0: 1691.8, 1: 1669.3. Samples: 3591540. Policy #0 lag: (min: 31.0, avg: 40.4, max: 63.0) [2023-10-12 16:02:23,436][61643] Avg episode reward: [(0, '2.880'), (1, '2.990')] [2023-10-12 16:02:24,207][62635] Updated weights for policy 1, policy_version 7010 (0.0008) [2023-10-12 16:02:24,574][62635] Updated weights for policy 1, policy_version 7020 (0.0008) [2023-10-12 16:02:24,940][62635] Updated weights for policy 1, policy_version 7030 (0.0007) [2023-10-12 16:02:25,318][62635] Updated weights for policy 1, policy_version 7040 (0.0007) [2023-10-12 16:02:26,272][62634] Updated weights for policy 0, policy_version 7010 (0.0010) [2023-10-12 16:02:26,661][62634] Updated weights for policy 0, policy_version 7020 (0.0010) [2023-10-12 16:02:27,038][62634] Updated weights for policy 0, policy_version 7030 (0.0010) [2023-10-12 16:02:27,414][62634] Updated weights for policy 0, policy_version 7040 (0.0009) [2023-10-12 16:02:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14417920. Throughput: 0: 1667.5, 1: 1699.0. 
Samples: 3611138. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:02:28,435][61643] Avg episode reward: [(0, '2.940'), (1, '2.980')] [2023-10-12 16:02:29,563][62635] Updated weights for policy 1, policy_version 7050 (0.0009) [2023-10-12 16:02:29,933][62635] Updated weights for policy 1, policy_version 7060 (0.0007) [2023-10-12 16:02:30,293][62635] Updated weights for policy 1, policy_version 7070 (0.0009) [2023-10-12 16:02:31,452][62634] Updated weights for policy 0, policy_version 7050 (0.0010) [2023-10-12 16:02:31,839][62634] Updated weights for policy 0, policy_version 7060 (0.0009) [2023-10-12 16:02:32,219][62634] Updated weights for policy 0, policy_version 7070 (0.0008) [2023-10-12 16:02:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14483456. Throughput: 0: 1667.7, 1: 1694.9. Samples: 3630910. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:02:33,436][61643] Avg episode reward: [(0, '2.970'), (1, '2.950')] [2023-10-12 16:02:34,615][62635] Updated weights for policy 1, policy_version 7080 (0.0009) [2023-10-12 16:02:34,980][62635] Updated weights for policy 1, policy_version 7090 (0.0008) [2023-10-12 16:02:35,348][62635] Updated weights for policy 1, policy_version 7100 (0.0010) [2023-10-12 16:02:36,349][62634] Updated weights for policy 0, policy_version 7080 (0.0009) [2023-10-12 16:02:36,727][62634] Updated weights for policy 0, policy_version 7090 (0.0007) [2023-10-12 16:02:37,107][62634] Updated weights for policy 0, policy_version 7100 (0.0007) [2023-10-12 16:02:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 14548992. Throughput: 0: 1680.9, 1: 1673.2. Samples: 3641292. 
Policy #0 lag: (min: 31.0, avg: 46.5, max: 63.0) [2023-10-12 16:02:38,435][61643] Avg episode reward: [(0, '2.960'), (1, '2.950')] [2023-10-12 16:02:39,406][62635] Updated weights for policy 1, policy_version 7110 (0.0009) [2023-10-12 16:02:39,775][62635] Updated weights for policy 1, policy_version 7120 (0.0007) [2023-10-12 16:02:40,143][62635] Updated weights for policy 1, policy_version 7130 (0.0008) [2023-10-12 16:02:41,018][62634] Updated weights for policy 0, policy_version 7110 (0.0009) [2023-10-12 16:02:41,394][62634] Updated weights for policy 0, policy_version 7120 (0.0009) [2023-10-12 16:02:41,768][62634] Updated weights for policy 0, policy_version 7130 (0.0008) [2023-10-12 16:02:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 14614528. Throughput: 0: 1660.1, 1: 1695.4. Samples: 3661432. Policy #0 lag: (min: 31.0, avg: 46.5, max: 63.0) [2023-10-12 16:02:43,436][61643] Avg episode reward: [(0, '3.000'), (1, '2.950')] [2023-10-12 16:02:44,055][62635] Updated weights for policy 1, policy_version 7140 (0.0008) [2023-10-12 16:02:44,422][62635] Updated weights for policy 1, policy_version 7150 (0.0007) [2023-10-12 16:02:44,795][62635] Updated weights for policy 1, policy_version 7160 (0.0008) [2023-10-12 16:02:45,660][62634] Updated weights for policy 0, policy_version 7140 (0.0008) [2023-10-12 16:02:46,047][62634] Updated weights for policy 0, policy_version 7150 (0.0007) [2023-10-12 16:02:46,425][62634] Updated weights for policy 0, policy_version 7160 (0.0007) [2023-10-12 16:02:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 14680064. Throughput: 0: 1682.8, 1: 1691.9. Samples: 3682050. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-12 16:02:48,436][61643] Avg episode reward: [(0, '2.980'), (1, '2.970')] [2023-10-12 16:02:48,949][62635] Updated weights for policy 1, policy_version 7170 (0.0009) [2023-10-12 16:02:49,305][62635] Updated weights for policy 1, policy_version 7180 (0.0010) [2023-10-12 16:02:49,678][62635] Updated weights for policy 1, policy_version 7190 (0.0012) [2023-10-12 16:02:50,036][62635] Updated weights for policy 1, policy_version 7200 (0.0011) [2023-10-12 16:02:50,597][62634] Updated weights for policy 0, policy_version 7170 (0.0008) [2023-10-12 16:02:50,967][62634] Updated weights for policy 0, policy_version 7180 (0.0007) [2023-10-12 16:02:51,352][62634] Updated weights for policy 0, policy_version 7190 (0.0007) [2023-10-12 16:02:51,724][62634] Updated weights for policy 0, policy_version 7200 (0.0008) [2023-10-12 16:02:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14745600. Throughput: 0: 1672.0, 1: 1683.6. Samples: 3691916. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-12 16:02:53,436][61643] Avg episode reward: [(0, '2.950'), (1, '2.980')] [2023-10-12 16:02:53,916][62635] Updated weights for policy 1, policy_version 7210 (0.0010) [2023-10-12 16:02:54,280][62635] Updated weights for policy 1, policy_version 7220 (0.0010) [2023-10-12 16:02:54,652][62635] Updated weights for policy 1, policy_version 7230 (0.0009) [2023-10-12 16:02:55,846][62634] Updated weights for policy 0, policy_version 7210 (0.0007) [2023-10-12 16:02:56,223][62634] Updated weights for policy 0, policy_version 7220 (0.0009) [2023-10-12 16:02:56,610][62634] Updated weights for policy 0, policy_version 7230 (0.0007) [2023-10-12 16:02:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14811136. Throughput: 0: 1666.4, 1: 1693.5. Samples: 3712218. 
Policy #0 lag: (min: 29.0, avg: 35.9, max: 61.0) [2023-10-12 16:02:58,436][61643] Avg episode reward: [(0, '2.870'), (1, '3.000')] [2023-10-12 16:02:58,760][62635] Updated weights for policy 1, policy_version 7240 (0.0011) [2023-10-12 16:02:59,127][62635] Updated weights for policy 1, policy_version 7250 (0.0011) [2023-10-12 16:02:59,497][62635] Updated weights for policy 1, policy_version 7260 (0.0011) [2023-10-12 16:03:00,440][62634] Updated weights for policy 0, policy_version 7240 (0.0010) [2023-10-12 16:03:00,813][62634] Updated weights for policy 0, policy_version 7250 (0.0010) [2023-10-12 16:03:01,192][62634] Updated weights for policy 0, policy_version 7260 (0.0009) [2023-10-12 16:03:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14876672. Throughput: 0: 1691.9, 1: 1683.8. Samples: 3732934. Policy #0 lag: (min: 29.0, avg: 35.9, max: 61.0) [2023-10-12 16:03:03,435][61643] Avg episode reward: [(0, '2.910'), (1, '2.980')] [2023-10-12 16:03:03,684][62635] Updated weights for policy 1, policy_version 7270 (0.0010) [2023-10-12 16:03:04,049][62635] Updated weights for policy 1, policy_version 7280 (0.0010) [2023-10-12 16:03:04,418][62635] Updated weights for policy 1, policy_version 7290 (0.0010) [2023-10-12 16:03:05,214][62634] Updated weights for policy 0, policy_version 7270 (0.0008) [2023-10-12 16:03:05,589][62634] Updated weights for policy 0, policy_version 7280 (0.0008) [2023-10-12 16:03:05,972][62634] Updated weights for policy 0, policy_version 7290 (0.0007) [2023-10-12 16:03:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14942208. Throughput: 0: 1666.1, 1: 1683.5. Samples: 3742272. 
Policy #0 lag: (min: 1.0, avg: 15.4, max: 33.0) [2023-10-12 16:03:08,435][61643] Avg episode reward: [(0, '2.930'), (1, '2.960')] [2023-10-12 16:03:08,507][62635] Updated weights for policy 1, policy_version 7300 (0.0009) [2023-10-12 16:03:08,873][62635] Updated weights for policy 1, policy_version 7310 (0.0009) [2023-10-12 16:03:09,251][62635] Updated weights for policy 1, policy_version 7320 (0.0009) [2023-10-12 16:03:10,057][62634] Updated weights for policy 0, policy_version 7300 (0.0008) [2023-10-12 16:03:10,426][62634] Updated weights for policy 0, policy_version 7310 (0.0008) [2023-10-12 16:03:10,809][62634] Updated weights for policy 0, policy_version 7320 (0.0009) [2023-10-12 16:03:13,053][62635] Updated weights for policy 1, policy_version 7330 (0.0008) [2023-10-12 16:03:13,429][62635] Updated weights for policy 1, policy_version 7340 (0.0008) [2023-10-12 16:03:13,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 15007744. Throughput: 0: 1682.4, 1: 1688.0. Samples: 3762808. Policy #0 lag: (min: 1.0, avg: 15.4, max: 33.0) [2023-10-12 16:03:13,436][61643] Avg episode reward: [(0, '2.950'), (1, '2.970')] [2023-10-12 16:03:13,795][62635] Updated weights for policy 1, policy_version 7350 (0.0010) [2023-10-12 16:03:14,159][62635] Updated weights for policy 1, policy_version 7360 (0.0010) [2023-10-12 16:03:14,704][62634] Updated weights for policy 0, policy_version 7330 (0.0008) [2023-10-12 16:03:15,084][62634] Updated weights for policy 0, policy_version 7340 (0.0011) [2023-10-12 16:03:15,468][62634] Updated weights for policy 0, policy_version 7350 (0.0008) [2023-10-12 16:03:15,843][62634] Updated weights for policy 0, policy_version 7360 (0.0009) [2023-10-12 16:03:18,304][62635] Updated weights for policy 1, policy_version 7370 (0.0009) [2023-10-12 16:03:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 15073280. Throughput: 0: 1698.9, 1: 1691.6. Samples: 3783484. 
Policy #0 lag: (min: 13.0, avg: 22.3, max: 45.0) [2023-10-12 16:03:18,436][61643] Avg episode reward: [(0, '2.950'), (1, '2.970')] [2023-10-12 16:03:18,671][62635] Updated weights for policy 1, policy_version 7380 (0.0010) [2023-10-12 16:03:19,034][62635] Updated weights for policy 1, policy_version 7390 (0.0010) [2023-10-12 16:03:19,977][62634] Updated weights for policy 0, policy_version 7370 (0.0008) [2023-10-12 16:03:20,356][62634] Updated weights for policy 0, policy_version 7380 (0.0008) [2023-10-12 16:03:20,737][62634] Updated weights for policy 0, policy_version 7390 (0.0007) [2023-10-12 16:03:23,186][62635] Updated weights for policy 1, policy_version 7400 (0.0007) [2023-10-12 16:03:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 15138816. Throughput: 0: 1666.4, 1: 1699.0. Samples: 3792734. Policy #0 lag: (min: 13.0, avg: 22.3, max: 45.0) [2023-10-12 16:03:23,435][61643] Avg episode reward: [(0, '2.980'), (1, '2.990')] [2023-10-12 16:03:23,551][62635] Updated weights for policy 1, policy_version 7410 (0.0007) [2023-10-12 16:03:23,916][62635] Updated weights for policy 1, policy_version 7420 (0.0007) [2023-10-12 16:03:24,862][62634] Updated weights for policy 0, policy_version 7400 (0.0009) [2023-10-12 16:03:25,239][62634] Updated weights for policy 0, policy_version 7410 (0.0011) [2023-10-12 16:03:25,630][62634] Updated weights for policy 0, policy_version 7420 (0.0008) [2023-10-12 16:03:27,899][62635] Updated weights for policy 1, policy_version 7430 (0.0009) [2023-10-12 16:03:28,266][62635] Updated weights for policy 1, policy_version 7440 (0.0010) [2023-10-12 16:03:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 15204352. Throughput: 0: 1685.6, 1: 1695.6. Samples: 3813590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:03:28,435][61643] Avg episode reward: [(0, '3.000'), (1, '2.990')] [2023-10-12 16:03:28,630][62635] Updated weights for policy 1, policy_version 7450 (0.0007) [2023-10-12 16:03:29,774][62634] Updated weights for policy 0, policy_version 7430 (0.0009) [2023-10-12 16:03:30,158][62634] Updated weights for policy 0, policy_version 7440 (0.0009) [2023-10-12 16:03:30,544][62634] Updated weights for policy 0, policy_version 7450 (0.0010) [2023-10-12 16:03:32,798][62635] Updated weights for policy 1, policy_version 7460 (0.0008) [2023-10-12 16:03:33,166][62635] Updated weights for policy 1, policy_version 7470 (0.0008) [2023-10-12 16:03:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 15269888. Throughput: 0: 1688.2, 1: 1684.6. Samples: 3833826. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:03:33,435][61643] Avg episode reward: [(0, '3.040'), (1, '2.980')] [2023-10-12 16:03:33,444][62354] Saving new best policy, reward=3.040! [2023-10-12 16:03:33,526][62635] Updated weights for policy 1, policy_version 7480 (0.0011) [2023-10-12 16:03:34,464][62634] Updated weights for policy 0, policy_version 7460 (0.0008) [2023-10-12 16:03:34,838][62634] Updated weights for policy 0, policy_version 7470 (0.0011) [2023-10-12 16:03:35,223][62634] Updated weights for policy 0, policy_version 7480 (0.0010) [2023-10-12 16:03:37,310][62635] Updated weights for policy 1, policy_version 7490 (0.0009) [2023-10-12 16:03:37,682][62635] Updated weights for policy 1, policy_version 7500 (0.0007) [2023-10-12 16:03:38,052][62635] Updated weights for policy 1, policy_version 7510 (0.0007) [2023-10-12 16:03:38,422][62635] Updated weights for policy 1, policy_version 7520 (0.0007) [2023-10-12 16:03:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 15368192. Throughput: 0: 1665.2, 1: 1703.6. Samples: 3843512. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:03:38,435][61643] Avg episode reward: [(0, '3.020'), (1, '2.980')] [2023-10-12 16:03:39,191][62634] Updated weights for policy 0, policy_version 7490 (0.0010) [2023-10-12 16:03:39,560][62634] Updated weights for policy 0, policy_version 7500 (0.0007) [2023-10-12 16:03:39,941][62634] Updated weights for policy 0, policy_version 7510 (0.0009) [2023-10-12 16:03:40,316][62634] Updated weights for policy 0, policy_version 7520 (0.0008) [2023-10-12 16:03:42,559][62635] Updated weights for policy 1, policy_version 7530 (0.0007) [2023-10-12 16:03:42,928][62635] Updated weights for policy 1, policy_version 7540 (0.0007) [2023-10-12 16:03:43,302][62635] Updated weights for policy 1, policy_version 7550 (0.0008) [2023-10-12 16:03:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 15433728. Throughput: 0: 1682.2, 1: 1700.6. Samples: 3864444. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:03:43,435][61643] Avg episode reward: [(0, '2.950'), (1, '2.980')] [2023-10-12 16:03:44,333][62634] Updated weights for policy 0, policy_version 7530 (0.0007) [2023-10-12 16:03:44,708][62634] Updated weights for policy 0, policy_version 7540 (0.0007) [2023-10-12 16:03:45,095][62634] Updated weights for policy 0, policy_version 7550 (0.0007) [2023-10-12 16:03:47,048][62635] Updated weights for policy 1, policy_version 7560 (0.0009) [2023-10-12 16:03:47,417][62635] Updated weights for policy 1, policy_version 7570 (0.0008) [2023-10-12 16:03:47,792][62635] Updated weights for policy 1, policy_version 7580 (0.0009) [2023-10-12 16:03:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15499264. Throughput: 0: 1673.5, 1: 1680.8. Samples: 3883880. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:03:48,436][61643] Avg episode reward: [(0, '2.930'), (1, '2.980')] [2023-10-12 16:03:49,193][62634] Updated weights for policy 0, policy_version 7560 (0.0008) [2023-10-12 16:03:49,573][62634] Updated weights for policy 0, policy_version 7570 (0.0008) [2023-10-12 16:03:49,952][62634] Updated weights for policy 0, policy_version 7580 (0.0008) [2023-10-12 16:03:51,927][62635] Updated weights for policy 1, policy_version 7590 (0.0010) [2023-10-12 16:03:52,282][62635] Updated weights for policy 1, policy_version 7600 (0.0010) [2023-10-12 16:03:52,650][62635] Updated weights for policy 1, policy_version 7610 (0.0010) [2023-10-12 16:03:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15564800. Throughput: 0: 1666.4, 1: 1711.0. Samples: 3894256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:03:53,435][61643] Avg episode reward: [(0, '2.910'), (1, '2.980')] [2023-10-12 16:03:53,910][62634] Updated weights for policy 0, policy_version 7590 (0.0009) [2023-10-12 16:03:54,295][62634] Updated weights for policy 0, policy_version 7600 (0.0007) [2023-10-12 16:03:54,681][62634] Updated weights for policy 0, policy_version 7610 (0.0008) [2023-10-12 16:03:56,686][62635] Updated weights for policy 1, policy_version 7620 (0.0010) [2023-10-12 16:03:57,052][62635] Updated weights for policy 1, policy_version 7630 (0.0008) [2023-10-12 16:03:57,427][62635] Updated weights for policy 1, policy_version 7640 (0.0008) [2023-10-12 16:03:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15630336. Throughput: 0: 1683.1, 1: 1691.8. Samples: 3914676. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 16:03:58,436][61643] Avg episode reward: [(0, '2.910'), (1, '2.950')] [2023-10-12 16:03:58,765][62634] Updated weights for policy 0, policy_version 7620 (0.0009) [2023-10-12 16:03:59,145][62634] Updated weights for policy 0, policy_version 7630 (0.0007) [2023-10-12 16:03:59,513][62634] Updated weights for policy 0, policy_version 7640 (0.0008) [2023-10-12 16:04:01,487][62635] Updated weights for policy 1, policy_version 7650 (0.0008) [2023-10-12 16:04:01,849][62635] Updated weights for policy 1, policy_version 7660 (0.0007) [2023-10-12 16:04:02,216][62635] Updated weights for policy 1, policy_version 7670 (0.0007) [2023-10-12 16:04:02,589][62635] Updated weights for policy 1, policy_version 7680 (0.0007) [2023-10-12 16:04:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15695872. Throughput: 0: 1688.1, 1: 1676.3. Samples: 3934880. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 16:04:03,435][61643] Avg episode reward: [(0, '2.940'), (1, '2.950')] [2023-10-12 16:04:03,787][62634] Updated weights for policy 0, policy_version 7650 (0.0007) [2023-10-12 16:04:04,193][62634] Updated weights for policy 0, policy_version 7660 (0.0008) [2023-10-12 16:04:04,558][62634] Updated weights for policy 0, policy_version 7670 (0.0010) [2023-10-12 16:04:04,942][62634] Updated weights for policy 0, policy_version 7680 (0.0007) [2023-10-12 16:04:06,793][62635] Updated weights for policy 1, policy_version 7690 (0.0008) [2023-10-12 16:04:07,162][62635] Updated weights for policy 1, policy_version 7700 (0.0009) [2023-10-12 16:04:07,529][62635] Updated weights for policy 1, policy_version 7710 (0.0009) [2023-10-12 16:04:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15761408. Throughput: 0: 1685.7, 1: 1700.5. Samples: 3945114. 
Policy #0 lag: (min: 25.0, avg: 36.0, max: 57.0) [2023-10-12 16:04:08,436][61643] Avg episode reward: [(0, '2.980'), (1, '2.970')] [2023-10-12 16:04:08,772][62634] Updated weights for policy 0, policy_version 7690 (0.0008) [2023-10-12 16:04:09,151][62634] Updated weights for policy 0, policy_version 7700 (0.0007) [2023-10-12 16:04:09,522][62634] Updated weights for policy 0, policy_version 7710 (0.0010) [2023-10-12 16:04:11,625][62635] Updated weights for policy 1, policy_version 7720 (0.0007) [2023-10-12 16:04:11,984][62635] Updated weights for policy 1, policy_version 7730 (0.0008) [2023-10-12 16:04:12,350][62635] Updated weights for policy 1, policy_version 7740 (0.0009) [2023-10-12 16:04:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 15826944. Throughput: 0: 1692.2, 1: 1674.0. Samples: 3965066. Policy #0 lag: (min: 25.0, avg: 36.0, max: 57.0) [2023-10-12 16:04:13,436][61643] Avg episode reward: [(0, '2.980'), (1, '2.940')] [2023-10-12 16:04:13,616][62634] Updated weights for policy 0, policy_version 7720 (0.0008) [2023-10-12 16:04:13,994][62634] Updated weights for policy 0, policy_version 7730 (0.0009) [2023-10-12 16:04:14,376][62634] Updated weights for policy 0, policy_version 7740 (0.0010) [2023-10-12 16:04:16,361][62635] Updated weights for policy 1, policy_version 7750 (0.0009) [2023-10-12 16:04:16,727][62635] Updated weights for policy 1, policy_version 7760 (0.0009) [2023-10-12 16:04:17,102][62635] Updated weights for policy 1, policy_version 7770 (0.0009) [2023-10-12 16:04:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 15892480. Throughput: 0: 1695.3, 1: 1675.0. Samples: 3985490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:04:18,435][61643] Avg episode reward: [(0, '2.950'), (1, '2.960')] [2023-10-12 16:04:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000007776_7962624.pth... 
[2023-10-12 16:04:18,475][62634] Updated weights for policy 0, policy_version 7750 (0.0008) [2023-10-12 16:04:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000006208_6356992.pth [2023-10-12 16:04:18,490][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000007776_7962624.pth [2023-10-12 16:04:18,845][62634] Updated weights for policy 0, policy_version 7760 (0.0008) [2023-10-12 16:04:19,228][62634] Updated weights for policy 0, policy_version 7770 (0.0008) [2023-10-12 16:04:19,453][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000007776_7962624.pth... [2023-10-12 16:04:19,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000006176_6324224.pth [2023-10-12 16:04:19,491][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000007776_7962624.pth [2023-10-12 16:04:21,104][62635] Updated weights for policy 1, policy_version 7780 (0.0008) [2023-10-12 16:04:21,476][62635] Updated weights for policy 1, policy_version 7790 (0.0007) [2023-10-12 16:04:21,850][62635] Updated weights for policy 1, policy_version 7800 (0.0007) [2023-10-12 16:04:23,273][62634] Updated weights for policy 0, policy_version 7780 (0.0010) [2023-10-12 16:04:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15958016. Throughput: 0: 1697.5, 1: 1686.2. Samples: 3995778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:04:23,435][61643] Avg episode reward: [(0, '2.930'), (1, '2.970')] [2023-10-12 16:04:23,650][62634] Updated weights for policy 0, policy_version 7790 (0.0009) [2023-10-12 16:04:24,035][62634] Updated weights for policy 0, policy_version 7800 (0.0007) [2023-10-12 16:04:25,808][62635] Updated weights for policy 1, policy_version 7810 (0.0007) [2023-10-12 16:04:26,187][62635] Updated weights for policy 1, policy_version 7820 (0.0007) [2023-10-12 16:04:26,544][62635] Updated weights for policy 1, policy_version 7830 (0.0009) [2023-10-12 16:04:26,907][62635] Updated weights for policy 1, policy_version 7840 (0.0008) [2023-10-12 16:04:27,928][62634] Updated weights for policy 0, policy_version 7810 (0.0007) [2023-10-12 16:04:28,304][62634] Updated weights for policy 0, policy_version 7820 (0.0007) [2023-10-12 16:04:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16023552. Throughput: 0: 1697.6, 1: 1663.9. Samples: 4015710. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:04:28,435][61643] Avg episode reward: [(0, '2.940'), (1, '3.000')] [2023-10-12 16:04:28,690][62634] Updated weights for policy 0, policy_version 7830 (0.0007) [2023-10-12 16:04:29,062][62634] Updated weights for policy 0, policy_version 7840 (0.0008) [2023-10-12 16:04:30,880][62635] Updated weights for policy 1, policy_version 7850 (0.0010) [2023-10-12 16:04:31,247][62635] Updated weights for policy 1, policy_version 7860 (0.0008) [2023-10-12 16:04:31,613][62635] Updated weights for policy 1, policy_version 7870 (0.0007) [2023-10-12 16:04:33,153][62634] Updated weights for policy 0, policy_version 7850 (0.0007) [2023-10-12 16:04:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16089088. Throughput: 0: 1696.3, 1: 1694.3. Samples: 4036454. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:04:33,435][61643] Avg episode reward: [(0, '2.940'), (1, '2.990')] [2023-10-12 16:04:33,534][62634] Updated weights for policy 0, policy_version 7860 (0.0009) [2023-10-12 16:04:33,915][62634] Updated weights for policy 0, policy_version 7870 (0.0007) [2023-10-12 16:04:35,726][62635] Updated weights for policy 1, policy_version 7880 (0.0008) [2023-10-12 16:04:36,105][62635] Updated weights for policy 1, policy_version 7890 (0.0007) [2023-10-12 16:04:36,476][62635] Updated weights for policy 1, policy_version 7900 (0.0007) [2023-10-12 16:04:38,075][62634] Updated weights for policy 0, policy_version 7880 (0.0010) [2023-10-12 16:04:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 16154624. Throughput: 0: 1699.8, 1: 1680.8. Samples: 4046384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:04:38,435][61643] Avg episode reward: [(0, '2.920'), (1, '3.010')] [2023-10-12 16:04:38,457][62634] Updated weights for policy 0, policy_version 7890 (0.0010) [2023-10-12 16:04:38,837][62634] Updated weights for policy 0, policy_version 7900 (0.0010) [2023-10-12 16:04:40,498][62635] Updated weights for policy 1, policy_version 7910 (0.0010) [2023-10-12 16:04:40,867][62635] Updated weights for policy 1, policy_version 7920 (0.0008) [2023-10-12 16:04:41,234][62635] Updated weights for policy 1, policy_version 7930 (0.0008) [2023-10-12 16:04:42,699][62634] Updated weights for policy 0, policy_version 7910 (0.0008) [2023-10-12 16:04:43,080][62634] Updated weights for policy 0, policy_version 7920 (0.0008) [2023-10-12 16:04:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 16220160. Throughput: 0: 1692.7, 1: 1681.1. Samples: 4066494. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:04:43,435][61643] Avg episode reward: [(0, '2.980'), (1, '3.040')] [2023-10-12 16:04:43,436][62495] Saving new best policy, reward=3.040! [2023-10-12 16:04:43,465][62634] Updated weights for policy 0, policy_version 7930 (0.0011) [2023-10-12 16:04:45,324][62635] Updated weights for policy 1, policy_version 7940 (0.0008) [2023-10-12 16:04:45,686][62635] Updated weights for policy 1, policy_version 7950 (0.0008) [2023-10-12 16:04:46,066][62635] Updated weights for policy 1, policy_version 7960 (0.0010) [2023-10-12 16:04:47,355][62634] Updated weights for policy 0, policy_version 7940 (0.0008) [2023-10-12 16:04:47,734][62634] Updated weights for policy 0, policy_version 7950 (0.0009) [2023-10-12 16:04:48,116][62634] Updated weights for policy 0, policy_version 7960 (0.0008) [2023-10-12 16:04:48,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 16318464. Throughput: 0: 1674.4, 1: 1698.2. Samples: 4086650. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) [2023-10-12 16:04:48,435][61643] Avg episode reward: [(0, '2.980'), (1, '3.040')] [2023-10-12 16:04:50,021][62635] Updated weights for policy 1, policy_version 7970 (0.0007) [2023-10-12 16:04:50,390][62635] Updated weights for policy 1, policy_version 7980 (0.0008) [2023-10-12 16:04:50,763][62635] Updated weights for policy 1, policy_version 7990 (0.0007) [2023-10-12 16:04:51,133][62635] Updated weights for policy 1, policy_version 8000 (0.0008) [2023-10-12 16:04:52,158][62634] Updated weights for policy 0, policy_version 7970 (0.0008) [2023-10-12 16:04:52,557][62634] Updated weights for policy 0, policy_version 7980 (0.0009) [2023-10-12 16:04:52,943][62634] Updated weights for policy 0, policy_version 7990 (0.0007) [2023-10-12 16:04:53,322][62634] Updated weights for policy 0, policy_version 8000 (0.0007) [2023-10-12 16:04:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). 
Total num frames: 16384000. Throughput: 0: 1693.3, 1: 1676.1. Samples: 4096738. Policy #0 lag: (min: 1.0, avg: 12.9, max: 33.0) [2023-10-12 16:04:53,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.020')] [2023-10-12 16:04:55,205][62635] Updated weights for policy 1, policy_version 8010 (0.0011) [2023-10-12 16:04:55,574][62635] Updated weights for policy 1, policy_version 8020 (0.0007) [2023-10-12 16:04:55,941][62635] Updated weights for policy 1, policy_version 8030 (0.0008) [2023-10-12 16:04:57,374][62634] Updated weights for policy 0, policy_version 8010 (0.0010) [2023-10-12 16:04:57,759][62634] Updated weights for policy 0, policy_version 8020 (0.0009) [2023-10-12 16:04:58,145][62634] Updated weights for policy 0, policy_version 8030 (0.0007) [2023-10-12 16:04:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 16449536. Throughput: 0: 1694.6, 1: 1695.6. Samples: 4117626. Policy #0 lag: (min: 1.0, avg: 12.9, max: 33.0) [2023-10-12 16:04:58,436][61643] Avg episode reward: [(0, '2.920'), (1, '3.050')] [2023-10-12 16:04:58,437][62495] Saving new best policy, reward=3.050! [2023-10-12 16:04:59,964][62635] Updated weights for policy 1, policy_version 8040 (0.0007) [2023-10-12 16:05:00,350][62635] Updated weights for policy 1, policy_version 8050 (0.0008) [2023-10-12 16:05:00,729][62635] Updated weights for policy 1, policy_version 8060 (0.0009) [2023-10-12 16:05:02,050][62634] Updated weights for policy 0, policy_version 8040 (0.0008) [2023-10-12 16:05:02,426][62634] Updated weights for policy 0, policy_version 8050 (0.0007) [2023-10-12 16:05:02,816][62634] Updated weights for policy 0, policy_version 8060 (0.0007) [2023-10-12 16:05:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16515072. Throughput: 0: 1667.1, 1: 1708.1. Samples: 4137374. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:05:03,435][61643] Avg episode reward: [(0, '2.920'), (1, '3.030')] [2023-10-12 16:05:04,693][62635] Updated weights for policy 1, policy_version 8070 (0.0009) [2023-10-12 16:05:05,064][62635] Updated weights for policy 1, policy_version 8080 (0.0010) [2023-10-12 16:05:05,434][62635] Updated weights for policy 1, policy_version 8090 (0.0010) [2023-10-12 16:05:06,735][62634] Updated weights for policy 0, policy_version 8070 (0.0009) [2023-10-12 16:05:07,112][62634] Updated weights for policy 0, policy_version 8080 (0.0010) [2023-10-12 16:05:07,497][62634] Updated weights for policy 0, policy_version 8090 (0.0007) [2023-10-12 16:05:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16580608. Throughput: 0: 1695.8, 1: 1678.8. Samples: 4147636. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:05:08,436][61643] Avg episode reward: [(0, '2.930'), (1, '2.970')] [2023-10-12 16:05:09,642][62635] Updated weights for policy 1, policy_version 8100 (0.0010) [2023-10-12 16:05:10,006][62635] Updated weights for policy 1, policy_version 8110 (0.0010) [2023-10-12 16:05:10,378][62635] Updated weights for policy 1, policy_version 8120 (0.0010) [2023-10-12 16:05:11,537][62634] Updated weights for policy 0, policy_version 8100 (0.0007) [2023-10-12 16:05:11,905][62634] Updated weights for policy 0, policy_version 8110 (0.0008) [2023-10-12 16:05:12,290][62634] Updated weights for policy 0, policy_version 8120 (0.0007) [2023-10-12 16:05:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16646144. Throughput: 0: 1683.5, 1: 1694.2. Samples: 4167708. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-10-12 16:05:13,436][61643] Avg episode reward: [(0, '2.990'), (1, '2.980')] [2023-10-12 16:05:14,360][62635] Updated weights for policy 1, policy_version 8130 (0.0009) [2023-10-12 16:05:14,729][62635] Updated weights for policy 1, policy_version 8140 (0.0008) [2023-10-12 16:05:15,090][62635] Updated weights for policy 1, policy_version 8150 (0.0009) [2023-10-12 16:05:15,465][62635] Updated weights for policy 1, policy_version 8160 (0.0009) [2023-10-12 16:05:16,274][62634] Updated weights for policy 0, policy_version 8130 (0.0009) [2023-10-12 16:05:16,645][62634] Updated weights for policy 0, policy_version 8140 (0.0009) [2023-10-12 16:05:17,025][62634] Updated weights for policy 0, policy_version 8150 (0.0008) [2023-10-12 16:05:17,401][62634] Updated weights for policy 0, policy_version 8160 (0.0008) [2023-10-12 16:05:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16711680. Throughput: 0: 1673.3, 1: 1691.5. Samples: 4187870. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-10-12 16:05:18,436][61643] Avg episode reward: [(0, '3.020'), (1, '3.010')] [2023-10-12 16:05:19,563][62635] Updated weights for policy 1, policy_version 8170 (0.0007) [2023-10-12 16:05:19,936][62635] Updated weights for policy 1, policy_version 8180 (0.0008) [2023-10-12 16:05:20,310][62635] Updated weights for policy 1, policy_version 8190 (0.0007) [2023-10-12 16:05:21,465][62634] Updated weights for policy 0, policy_version 8170 (0.0007) [2023-10-12 16:05:21,843][62634] Updated weights for policy 0, policy_version 8180 (0.0008) [2023-10-12 16:05:22,231][62634] Updated weights for policy 0, policy_version 8190 (0.0008) [2023-10-12 16:05:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16777216. Throughput: 0: 1701.3, 1: 1675.1. Samples: 4198322. 
Policy #0 lag: (min: 15.0, avg: 30.9, max: 47.0) [2023-10-12 16:05:23,435][61643] Avg episode reward: [(0, '3.020'), (1, '3.030')] [2023-10-12 16:05:24,295][62635] Updated weights for policy 1, policy_version 8200 (0.0011) [2023-10-12 16:05:24,662][62635] Updated weights for policy 1, policy_version 8210 (0.0011) [2023-10-12 16:05:25,035][62635] Updated weights for policy 1, policy_version 8220 (0.0011) [2023-10-12 16:05:26,396][62634] Updated weights for policy 0, policy_version 8200 (0.0009) [2023-10-12 16:05:26,769][62634] Updated weights for policy 0, policy_version 8210 (0.0011) [2023-10-12 16:05:27,155][62634] Updated weights for policy 0, policy_version 8220 (0.0008) [2023-10-12 16:05:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16842752. Throughput: 0: 1678.3, 1: 1692.9. Samples: 4218198. Policy #0 lag: (min: 15.0, avg: 30.9, max: 47.0) [2023-10-12 16:05:28,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.030')] [2023-10-12 16:05:29,059][62635] Updated weights for policy 1, policy_version 8230 (0.0008) [2023-10-12 16:05:29,428][62635] Updated weights for policy 1, policy_version 8240 (0.0007) [2023-10-12 16:05:29,797][62635] Updated weights for policy 1, policy_version 8250 (0.0009) [2023-10-12 16:05:31,176][62634] Updated weights for policy 0, policy_version 8230 (0.0007) [2023-10-12 16:05:31,546][62634] Updated weights for policy 0, policy_version 8240 (0.0008) [2023-10-12 16:05:31,933][62634] Updated weights for policy 0, policy_version 8250 (0.0009) [2023-10-12 16:05:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16908288. Throughput: 0: 1682.9, 1: 1691.5. Samples: 4238498. 
Policy #0 lag: (min: 30.0, avg: 40.4, max: 62.0) [2023-10-12 16:05:33,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.020')] [2023-10-12 16:05:33,874][62635] Updated weights for policy 1, policy_version 8260 (0.0009) [2023-10-12 16:05:34,239][62635] Updated weights for policy 1, policy_version 8270 (0.0010) [2023-10-12 16:05:34,622][62635] Updated weights for policy 1, policy_version 8280 (0.0010) [2023-10-12 16:05:36,091][62634] Updated weights for policy 0, policy_version 8260 (0.0010) [2023-10-12 16:05:36,473][62634] Updated weights for policy 0, policy_version 8270 (0.0008) [2023-10-12 16:05:36,843][62634] Updated weights for policy 0, policy_version 8280 (0.0009) [2023-10-12 16:05:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 16973824. Throughput: 0: 1696.5, 1: 1681.4. Samples: 4248744. Policy #0 lag: (min: 30.0, avg: 40.4, max: 62.0) [2023-10-12 16:05:38,435][61643] Avg episode reward: [(0, '2.940'), (1, '3.050')] [2023-10-12 16:05:38,737][62635] Updated weights for policy 1, policy_version 8290 (0.0010) [2023-10-12 16:05:39,106][62635] Updated weights for policy 1, policy_version 8300 (0.0009) [2023-10-12 16:05:39,481][62635] Updated weights for policy 1, policy_version 8310 (0.0009) [2023-10-12 16:05:39,858][62635] Updated weights for policy 1, policy_version 8320 (0.0011) [2023-10-12 16:05:40,950][62634] Updated weights for policy 0, policy_version 8290 (0.0008) [2023-10-12 16:05:41,345][62634] Updated weights for policy 0, policy_version 8300 (0.0008) [2023-10-12 16:05:41,710][62634] Updated weights for policy 0, policy_version 8310 (0.0007) [2023-10-12 16:05:42,092][62634] Updated weights for policy 0, policy_version 8320 (0.0009) [2023-10-12 16:05:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 17039360. Throughput: 0: 1663.3, 1: 1684.3. Samples: 4268268. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:05:43,436][61643] Avg episode reward: [(0, '2.940'), (1, '3.010')] [2023-10-12 16:05:43,892][62635] Updated weights for policy 1, policy_version 8330 (0.0009) [2023-10-12 16:05:44,261][62635] Updated weights for policy 1, policy_version 8340 (0.0008) [2023-10-12 16:05:44,631][62635] Updated weights for policy 1, policy_version 8350 (0.0008) [2023-10-12 16:05:45,995][62634] Updated weights for policy 0, policy_version 8330 (0.0009) [2023-10-12 16:05:46,369][62634] Updated weights for policy 0, policy_version 8340 (0.0009) [2023-10-12 16:05:46,753][62634] Updated weights for policy 0, policy_version 8350 (0.0007) [2023-10-12 16:05:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17104896. Throughput: 0: 1678.4, 1: 1688.2. Samples: 4288870. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:05:48,435][61643] Avg episode reward: [(0, '2.930'), (1, '2.960')] [2023-10-12 16:05:48,754][62635] Updated weights for policy 1, policy_version 8360 (0.0009) [2023-10-12 16:05:49,129][62635] Updated weights for policy 1, policy_version 8370 (0.0009) [2023-10-12 16:05:49,493][62635] Updated weights for policy 1, policy_version 8380 (0.0008) [2023-10-12 16:05:50,875][62634] Updated weights for policy 0, policy_version 8360 (0.0007) [2023-10-12 16:05:51,252][62634] Updated weights for policy 0, policy_version 8370 (0.0010) [2023-10-12 16:05:51,631][62634] Updated weights for policy 0, policy_version 8380 (0.0010) [2023-10-12 16:05:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17170432. Throughput: 0: 1671.6, 1: 1684.8. Samples: 4298674. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-12 16:05:53,436][61643] Avg episode reward: [(0, '3.000'), (1, '2.980')] [2023-10-12 16:05:53,560][62635] Updated weights for policy 1, policy_version 8390 (0.0008) [2023-10-12 16:05:53,935][62635] Updated weights for policy 1, policy_version 8400 (0.0009) [2023-10-12 16:05:54,305][62635] Updated weights for policy 1, policy_version 8410 (0.0008) [2023-10-12 16:05:55,845][62634] Updated weights for policy 0, policy_version 8390 (0.0009) [2023-10-12 16:05:56,228][62634] Updated weights for policy 0, policy_version 8400 (0.0007) [2023-10-12 16:05:56,597][62634] Updated weights for policy 0, policy_version 8410 (0.0007) [2023-10-12 16:05:58,244][62635] Updated weights for policy 1, policy_version 8420 (0.0010) [2023-10-12 16:05:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17235968. Throughput: 0: 1657.4, 1: 1693.0. Samples: 4318476. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-12 16:05:58,436][61643] Avg episode reward: [(0, '2.980'), (1, '2.980')] [2023-10-12 16:05:58,610][62635] Updated weights for policy 1, policy_version 8430 (0.0009) [2023-10-12 16:05:58,983][62635] Updated weights for policy 1, policy_version 8440 (0.0007) [2023-10-12 16:06:00,658][62634] Updated weights for policy 0, policy_version 8420 (0.0007) [2023-10-12 16:06:01,037][62634] Updated weights for policy 0, policy_version 8430 (0.0010) [2023-10-12 16:06:01,421][62634] Updated weights for policy 0, policy_version 8440 (0.0009) [2023-10-12 16:06:03,046][62635] Updated weights for policy 1, policy_version 8450 (0.0007) [2023-10-12 16:06:03,408][62635] Updated weights for policy 1, policy_version 8460 (0.0008) [2023-10-12 16:06:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17301504. Throughput: 0: 1672.5, 1: 1694.1. Samples: 4339370. 
Policy #0 lag: (min: 21.0, avg: 27.4, max: 53.0) [2023-10-12 16:06:03,435][61643] Avg episode reward: [(0, '2.950'), (1, '3.030')] [2023-10-12 16:06:03,777][62635] Updated weights for policy 1, policy_version 8470 (0.0011) [2023-10-12 16:06:04,147][62635] Updated weights for policy 1, policy_version 8480 (0.0010) [2023-10-12 16:06:05,370][62634] Updated weights for policy 0, policy_version 8450 (0.0008) [2023-10-12 16:06:05,752][62634] Updated weights for policy 0, policy_version 8460 (0.0008) [2023-10-12 16:06:06,128][62634] Updated weights for policy 0, policy_version 8470 (0.0009) [2023-10-12 16:06:06,507][62634] Updated weights for policy 0, policy_version 8480 (0.0009) [2023-10-12 16:06:08,276][62635] Updated weights for policy 1, policy_version 8490 (0.0009) [2023-10-12 16:06:08,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 17367040. Throughput: 0: 1662.6, 1: 1696.0. Samples: 4349462. Policy #0 lag: (min: 21.0, avg: 27.4, max: 53.0) [2023-10-12 16:06:08,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.060')] [2023-10-12 16:06:08,648][62635] Updated weights for policy 1, policy_version 8500 (0.0009) [2023-10-12 16:06:09,021][62635] Updated weights for policy 1, policy_version 8510 (0.0009) [2023-10-12 16:06:09,093][62495] Saving new best policy, reward=3.060! [2023-10-12 16:06:10,454][62634] Updated weights for policy 0, policy_version 8490 (0.0009) [2023-10-12 16:06:10,832][62634] Updated weights for policy 0, policy_version 8500 (0.0010) [2023-10-12 16:06:11,216][62634] Updated weights for policy 0, policy_version 8510 (0.0008) [2023-10-12 16:06:13,071][62635] Updated weights for policy 1, policy_version 8520 (0.0008) [2023-10-12 16:06:13,433][62635] Updated weights for policy 1, policy_version 8530 (0.0008) [2023-10-12 16:06:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17432576. Throughput: 0: 1673.5, 1: 1691.6. Samples: 4369624. 
Policy #0 lag: (min: 26.0, avg: 34.2, max: 58.0) [2023-10-12 16:06:13,435][61643] Avg episode reward: [(0, '2.980'), (1, '3.070')] [2023-10-12 16:06:13,800][62635] Updated weights for policy 1, policy_version 8540 (0.0010) [2023-10-12 16:06:13,945][62495] Saving new best policy, reward=3.070! [2023-10-12 16:06:15,169][62634] Updated weights for policy 0, policy_version 8520 (0.0007) [2023-10-12 16:06:15,553][62634] Updated weights for policy 0, policy_version 8530 (0.0007) [2023-10-12 16:06:15,930][62634] Updated weights for policy 0, policy_version 8540 (0.0008) [2023-10-12 16:06:17,800][62635] Updated weights for policy 1, policy_version 8550 (0.0008) [2023-10-12 16:06:18,181][62635] Updated weights for policy 1, policy_version 8560 (0.0008) [2023-10-12 16:06:18,435][61643] Fps is (10 sec: 13106.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17498112. Throughput: 0: 1686.5, 1: 1679.9. Samples: 4389990. Policy #0 lag: (min: 26.0, avg: 34.2, max: 58.0) [2023-10-12 16:06:18,436][61643] Avg episode reward: [(0, '2.940'), (1, '3.030')] [2023-10-12 16:06:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000008544_8749056.pth... [2023-10-12 16:06:18,482][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000006976_7143424.pth [2023-10-12 16:06:18,548][62635] Updated weights for policy 1, policy_version 8570 (0.0009) [2023-10-12 16:06:18,764][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000008576_8781824.pth... 
[2023-10-12 16:06:18,794][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000006976_7143424.pth [2023-10-12 16:06:19,897][62634] Updated weights for policy 0, policy_version 8550 (0.0008) [2023-10-12 16:06:20,284][62634] Updated weights for policy 0, policy_version 8560 (0.0007) [2023-10-12 16:06:20,670][62634] Updated weights for policy 0, policy_version 8570 (0.0007) [2023-10-12 16:06:22,649][62635] Updated weights for policy 1, policy_version 8580 (0.0008) [2023-10-12 16:06:23,024][62635] Updated weights for policy 1, policy_version 8590 (0.0008) [2023-10-12 16:06:23,398][62635] Updated weights for policy 1, policy_version 8600 (0.0009) [2023-10-12 16:06:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17563648. Throughput: 0: 1659.2, 1: 1691.7. Samples: 4399534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:06:23,435][61643] Avg episode reward: [(0, '2.960'), (1, '3.020')] [2023-10-12 16:06:24,704][62634] Updated weights for policy 0, policy_version 8580 (0.0009) [2023-10-12 16:06:25,082][62634] Updated weights for policy 0, policy_version 8590 (0.0007) [2023-10-12 16:06:25,464][62634] Updated weights for policy 0, policy_version 8600 (0.0007) [2023-10-12 16:06:27,322][62635] Updated weights for policy 1, policy_version 8610 (0.0010) [2023-10-12 16:06:27,687][62635] Updated weights for policy 1, policy_version 8620 (0.0007) [2023-10-12 16:06:28,064][62635] Updated weights for policy 1, policy_version 8630 (0.0008) [2023-10-12 16:06:28,428][62635] Updated weights for policy 1, policy_version 8640 (0.0008) [2023-10-12 16:06:28,435][61643] Fps is (10 sec: 16384.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 17661952. Throughput: 0: 1688.6, 1: 1694.0. Samples: 4420484. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:06:28,435][61643] Avg episode reward: [(0, '2.960'), (1, '3.010')] [2023-10-12 16:06:29,496][62634] Updated weights for policy 0, policy_version 8610 (0.0009) [2023-10-12 16:06:29,892][62634] Updated weights for policy 0, policy_version 8620 (0.0007) [2023-10-12 16:06:30,265][62634] Updated weights for policy 0, policy_version 8630 (0.0007) [2023-10-12 16:06:30,650][62634] Updated weights for policy 0, policy_version 8640 (0.0009) [2023-10-12 16:06:32,425][62635] Updated weights for policy 1, policy_version 8650 (0.0008) [2023-10-12 16:06:32,791][62635] Updated weights for policy 1, policy_version 8660 (0.0007) [2023-10-12 16:06:33,154][62635] Updated weights for policy 1, policy_version 8670 (0.0011) [2023-10-12 16:06:33,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 17727488. Throughput: 0: 1696.0, 1: 1668.8. Samples: 4440286. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:06:33,435][61643] Avg episode reward: [(0, '3.000'), (1, '2.980')] [2023-10-12 16:06:34,737][62634] Updated weights for policy 0, policy_version 8650 (0.0007) [2023-10-12 16:06:35,114][62634] Updated weights for policy 0, policy_version 8660 (0.0009) [2023-10-12 16:06:35,484][62634] Updated weights for policy 0, policy_version 8670 (0.0008) [2023-10-12 16:06:37,319][62635] Updated weights for policy 1, policy_version 8680 (0.0008) [2023-10-12 16:06:37,710][62635] Updated weights for policy 1, policy_version 8690 (0.0007) [2023-10-12 16:06:38,079][62635] Updated weights for policy 1, policy_version 8700 (0.0009) [2023-10-12 16:06:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 17793024. Throughput: 0: 1672.2, 1: 1694.8. Samples: 4450188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:06:38,435][61643] Avg episode reward: [(0, '2.960'), (1, '3.000')] [2023-10-12 16:06:39,714][62634] Updated weights for policy 0, policy_version 8680 (0.0009) [2023-10-12 16:06:40,088][62634] Updated weights for policy 0, policy_version 8690 (0.0008) [2023-10-12 16:06:40,469][62634] Updated weights for policy 0, policy_version 8700 (0.0007) [2023-10-12 16:06:42,225][62635] Updated weights for policy 1, policy_version 8710 (0.0010) [2023-10-12 16:06:42,594][62635] Updated weights for policy 1, policy_version 8720 (0.0008) [2023-10-12 16:06:42,963][62635] Updated weights for policy 1, policy_version 8730 (0.0008) [2023-10-12 16:06:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17858560. Throughput: 0: 1693.1, 1: 1689.1. Samples: 4470674. Policy #0 lag: (min: 15.0, avg: 15.5, max: 30.0) [2023-10-12 16:06:43,436][61643] Avg episode reward: [(0, '2.920'), (1, '3.010')] [2023-10-12 16:06:44,616][62634] Updated weights for policy 0, policy_version 8710 (0.0009) [2023-10-12 16:06:44,982][62634] Updated weights for policy 0, policy_version 8720 (0.0012) [2023-10-12 16:06:45,368][62634] Updated weights for policy 0, policy_version 8730 (0.0009) [2023-10-12 16:06:47,015][62635] Updated weights for policy 1, policy_version 8740 (0.0010) [2023-10-12 16:06:47,396][62635] Updated weights for policy 1, policy_version 8750 (0.0008) [2023-10-12 16:06:47,759][62635] Updated weights for policy 1, policy_version 8760 (0.0009) [2023-10-12 16:06:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17924096. Throughput: 0: 1691.4, 1: 1658.1. Samples: 4490100. 
Policy #0 lag: (min: 15.0, avg: 15.5, max: 30.0) [2023-10-12 16:06:48,436][61643] Avg episode reward: [(0, '2.920'), (1, '2.970')] [2023-10-12 16:06:49,347][62634] Updated weights for policy 0, policy_version 8740 (0.0009) [2023-10-12 16:06:49,727][62634] Updated weights for policy 0, policy_version 8750 (0.0008) [2023-10-12 16:06:50,104][62634] Updated weights for policy 0, policy_version 8760 (0.0008) [2023-10-12 16:06:51,721][62635] Updated weights for policy 1, policy_version 8770 (0.0008) [2023-10-12 16:06:52,106][62635] Updated weights for policy 1, policy_version 8780 (0.0007) [2023-10-12 16:06:52,470][62635] Updated weights for policy 1, policy_version 8790 (0.0009) [2023-10-12 16:06:52,842][62635] Updated weights for policy 1, policy_version 8800 (0.0007) [2023-10-12 16:06:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17989632. Throughput: 0: 1669.1, 1: 1682.0. Samples: 4500262. Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) [2023-10-12 16:06:53,435][61643] Avg episode reward: [(0, '2.960'), (1, '2.930')] [2023-10-12 16:06:54,216][62634] Updated weights for policy 0, policy_version 8770 (0.0008) [2023-10-12 16:06:54,590][62634] Updated weights for policy 0, policy_version 8780 (0.0009) [2023-10-12 16:06:54,967][62634] Updated weights for policy 0, policy_version 8790 (0.0008) [2023-10-12 16:06:55,342][62634] Updated weights for policy 0, policy_version 8800 (0.0008) [2023-10-12 16:06:56,886][62635] Updated weights for policy 1, policy_version 8810 (0.0008) [2023-10-12 16:06:57,256][62635] Updated weights for policy 1, policy_version 8820 (0.0008) [2023-10-12 16:06:57,621][62635] Updated weights for policy 1, policy_version 8830 (0.0007) [2023-10-12 16:06:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18055168. Throughput: 0: 1680.4, 1: 1669.1. Samples: 4520356. 
Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) [2023-10-12 16:06:58,436][61643] Avg episode reward: [(0, '2.950'), (1, '2.970')] [2023-10-12 16:06:59,301][62634] Updated weights for policy 0, policy_version 8810 (0.0007) [2023-10-12 16:06:59,685][62634] Updated weights for policy 0, policy_version 8820 (0.0007) [2023-10-12 16:07:00,050][62634] Updated weights for policy 0, policy_version 8830 (0.0010) [2023-10-12 16:07:01,723][62635] Updated weights for policy 1, policy_version 8840 (0.0010) [2023-10-12 16:07:02,099][62635] Updated weights for policy 1, policy_version 8850 (0.0009) [2023-10-12 16:07:02,467][62635] Updated weights for policy 1, policy_version 8860 (0.0009) [2023-10-12 16:07:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18120704. Throughput: 0: 1678.2, 1: 1664.5. Samples: 4540410. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:07:03,435][61643] Avg episode reward: [(0, '2.920'), (1, '3.010')] [2023-10-12 16:07:04,165][62634] Updated weights for policy 0, policy_version 8840 (0.0008) [2023-10-12 16:07:04,552][62634] Updated weights for policy 0, policy_version 8850 (0.0007) [2023-10-12 16:07:04,926][62634] Updated weights for policy 0, policy_version 8860 (0.0010) [2023-10-12 16:07:06,651][62635] Updated weights for policy 1, policy_version 8870 (0.0009) [2023-10-12 16:07:07,020][62635] Updated weights for policy 1, policy_version 8880 (0.0010) [2023-10-12 16:07:07,384][62635] Updated weights for policy 1, policy_version 8890 (0.0011) [2023-10-12 16:07:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18186240. Throughput: 0: 1674.7, 1: 1682.8. Samples: 4550620. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:07:08,435][61643] Avg episode reward: [(0, '2.920'), (1, '3.050')] [2023-10-12 16:07:09,016][62634] Updated weights for policy 0, policy_version 8870 (0.0010) [2023-10-12 16:07:09,393][62634] Updated weights for policy 0, policy_version 8880 (0.0007) [2023-10-12 16:07:09,775][62634] Updated weights for policy 0, policy_version 8890 (0.0008) [2023-10-12 16:07:11,493][62635] Updated weights for policy 1, policy_version 8900 (0.0009) [2023-10-12 16:07:11,862][62635] Updated weights for policy 1, policy_version 8910 (0.0010) [2023-10-12 16:07:12,225][62635] Updated weights for policy 1, policy_version 8920 (0.0009) [2023-10-12 16:07:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18251776. Throughput: 0: 1678.6, 1: 1660.8. Samples: 4570756. Policy #0 lag: (min: 17.0, avg: 30.7, max: 49.0) [2023-10-12 16:07:13,435][61643] Avg episode reward: [(0, '2.950'), (1, '3.000')] [2023-10-12 16:07:13,599][62634] Updated weights for policy 0, policy_version 8900 (0.0007) [2023-10-12 16:07:13,966][62634] Updated weights for policy 0, policy_version 8910 (0.0009) [2023-10-12 16:07:14,343][62634] Updated weights for policy 0, policy_version 8920 (0.0009) [2023-10-12 16:07:16,223][62635] Updated weights for policy 1, policy_version 8930 (0.0009) [2023-10-12 16:07:16,589][62635] Updated weights for policy 1, policy_version 8940 (0.0007) [2023-10-12 16:07:16,957][62635] Updated weights for policy 1, policy_version 8950 (0.0007) [2023-10-12 16:07:17,331][62635] Updated weights for policy 1, policy_version 8960 (0.0008) [2023-10-12 16:07:18,000][62634] Updated weights for policy 0, policy_version 8930 (0.0009) [2023-10-12 16:07:18,408][62634] Updated weights for policy 0, policy_version 8940 (0.0008) [2023-10-12 16:07:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 18317312. Throughput: 0: 1688.4, 1: 1670.0. Samples: 4591412. 
Policy #0 lag: (min: 17.0, avg: 30.7, max: 49.0) [2023-10-12 16:07:18,435][61643] Avg episode reward: [(0, '2.940'), (1, '2.990')] [2023-10-12 16:07:18,780][62634] Updated weights for policy 0, policy_version 8950 (0.0008) [2023-10-12 16:07:19,153][62634] Updated weights for policy 0, policy_version 8960 (0.0009) [2023-10-12 16:07:21,291][62635] Updated weights for policy 1, policy_version 8970 (0.0007) [2023-10-12 16:07:21,654][62635] Updated weights for policy 1, policy_version 8980 (0.0008) [2023-10-12 16:07:22,022][62635] Updated weights for policy 1, policy_version 8990 (0.0007) [2023-10-12 16:07:23,022][62634] Updated weights for policy 0, policy_version 8970 (0.0007) [2023-10-12 16:07:23,390][62634] Updated weights for policy 0, policy_version 8980 (0.0009) [2023-10-12 16:07:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18382848. Throughput: 0: 1693.9, 1: 1676.8. Samples: 4601870. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 16:07:23,436][61643] Avg episode reward: [(0, '2.970'), (1, '2.980')] [2023-10-12 16:07:23,767][62634] Updated weights for policy 0, policy_version 8990 (0.0010) [2023-10-12 16:07:26,081][62635] Updated weights for policy 1, policy_version 9000 (0.0008) [2023-10-12 16:07:26,448][62635] Updated weights for policy 1, policy_version 9010 (0.0010) [2023-10-12 16:07:26,827][62635] Updated weights for policy 1, policy_version 9020 (0.0010) [2023-10-12 16:07:27,927][62634] Updated weights for policy 0, policy_version 9000 (0.0008) [2023-10-12 16:07:28,313][62634] Updated weights for policy 0, policy_version 9010 (0.0010) [2023-10-12 16:07:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18448384. Throughput: 0: 1701.7, 1: 1654.3. Samples: 4621696. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 16:07:28,436][61643] Avg episode reward: [(0, '2.950'), (1, '2.990')] [2023-10-12 16:07:28,690][62634] Updated weights for policy 0, policy_version 9020 (0.0010) [2023-10-12 16:07:31,005][62635] Updated weights for policy 1, policy_version 9030 (0.0009) [2023-10-12 16:07:31,386][62635] Updated weights for policy 1, policy_version 9040 (0.0009) [2023-10-12 16:07:31,760][62635] Updated weights for policy 1, policy_version 9050 (0.0009) [2023-10-12 16:07:32,809][62634] Updated weights for policy 0, policy_version 9030 (0.0010) [2023-10-12 16:07:33,193][62634] Updated weights for policy 0, policy_version 9040 (0.0010) [2023-10-12 16:07:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18513920. Throughput: 0: 1690.4, 1: 1678.8. Samples: 4641714. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-12 16:07:33,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.020')] [2023-10-12 16:07:33,578][62634] Updated weights for policy 0, policy_version 9050 (0.0009) [2023-10-12 16:07:35,915][62635] Updated weights for policy 1, policy_version 9060 (0.0007) [2023-10-12 16:07:36,282][62635] Updated weights for policy 1, policy_version 9070 (0.0007) [2023-10-12 16:07:36,659][62635] Updated weights for policy 1, policy_version 9080 (0.0007) [2023-10-12 16:07:37,726][62634] Updated weights for policy 0, policy_version 9060 (0.0008) [2023-10-12 16:07:38,104][62634] Updated weights for policy 0, policy_version 9070 (0.0008) [2023-10-12 16:07:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18579456. Throughput: 0: 1701.9, 1: 1674.9. Samples: 4652218. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-12 16:07:38,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.000')] [2023-10-12 16:07:38,472][62634] Updated weights for policy 0, policy_version 9080 (0.0009) [2023-10-12 16:07:40,796][62635] Updated weights for policy 1, policy_version 9090 (0.0008) [2023-10-12 16:07:41,172][62635] Updated weights for policy 1, policy_version 9100 (0.0009) [2023-10-12 16:07:41,540][62635] Updated weights for policy 1, policy_version 9110 (0.0007) [2023-10-12 16:07:41,914][62635] Updated weights for policy 1, policy_version 9120 (0.0008) [2023-10-12 16:07:42,629][62634] Updated weights for policy 0, policy_version 9090 (0.0011) [2023-10-12 16:07:43,008][62634] Updated weights for policy 0, policy_version 9100 (0.0008) [2023-10-12 16:07:43,391][62634] Updated weights for policy 0, policy_version 9110 (0.0010) [2023-10-12 16:07:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18644992. Throughput: 0: 1703.4, 1: 1663.4. Samples: 4671864. Policy #0 lag: (min: 26.0, avg: 32.4, max: 58.0) [2023-10-12 16:07:43,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.020')] [2023-10-12 16:07:43,768][62634] Updated weights for policy 0, policy_version 9120 (0.0007) [2023-10-12 16:07:45,978][62635] Updated weights for policy 1, policy_version 9130 (0.0009) [2023-10-12 16:07:46,355][62635] Updated weights for policy 1, policy_version 9140 (0.0007) [2023-10-12 16:07:46,728][62635] Updated weights for policy 1, policy_version 9150 (0.0007) [2023-10-12 16:07:47,681][62634] Updated weights for policy 0, policy_version 9130 (0.0008) [2023-10-12 16:07:48,055][62634] Updated weights for policy 0, policy_version 9140 (0.0008) [2023-10-12 16:07:48,423][62634] Updated weights for policy 0, policy_version 9150 (0.0008) [2023-10-12 16:07:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18710528. Throughput: 0: 1693.8, 1: 1677.7. Samples: 4692128. 
Policy #0 lag: (min: 26.0, avg: 32.4, max: 58.0) [2023-10-12 16:07:48,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.040')] [2023-10-12 16:07:50,707][62635] Updated weights for policy 1, policy_version 9160 (0.0008) [2023-10-12 16:07:51,075][62635] Updated weights for policy 1, policy_version 9170 (0.0009) [2023-10-12 16:07:51,442][62635] Updated weights for policy 1, policy_version 9180 (0.0007) [2023-10-12 16:07:52,464][62634] Updated weights for policy 0, policy_version 9160 (0.0007) [2023-10-12 16:07:52,843][62634] Updated weights for policy 0, policy_version 9170 (0.0008) [2023-10-12 16:07:53,225][62634] Updated weights for policy 0, policy_version 9180 (0.0010) [2023-10-12 16:07:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 18808832. Throughput: 0: 1708.3, 1: 1666.2. Samples: 4702472. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-12 16:07:53,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.010')] [2023-10-12 16:07:55,606][62635] Updated weights for policy 1, policy_version 9190 (0.0008) [2023-10-12 16:07:55,973][62635] Updated weights for policy 1, policy_version 9200 (0.0007) [2023-10-12 16:07:56,345][62635] Updated weights for policy 1, policy_version 9210 (0.0008) [2023-10-12 16:07:57,184][62634] Updated weights for policy 0, policy_version 9190 (0.0008) [2023-10-12 16:07:57,560][62634] Updated weights for policy 0, policy_version 9200 (0.0008) [2023-10-12 16:07:57,938][62634] Updated weights for policy 0, policy_version 9210 (0.0007) [2023-10-12 16:07:58,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 18874368. Throughput: 0: 1705.5, 1: 1666.2. Samples: 4722484. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:07:58,436][61643] Avg episode reward: [(0, '2.950'), (1, '3.030')] [2023-10-12 16:08:00,565][62635] Updated weights for policy 1, policy_version 9220 (0.0007) [2023-10-12 16:08:00,940][62635] Updated weights for policy 1, policy_version 9230 (0.0007) [2023-10-12 16:08:01,308][62635] Updated weights for policy 1, policy_version 9240 (0.0007) [2023-10-12 16:08:02,036][62634] Updated weights for policy 0, policy_version 9220 (0.0007) [2023-10-12 16:08:02,419][62634] Updated weights for policy 0, policy_version 9230 (0.0009) [2023-10-12 16:08:02,795][62634] Updated weights for policy 0, policy_version 9240 (0.0008) [2023-10-12 16:08:03,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 18939904. Throughput: 0: 1674.7, 1: 1675.3. Samples: 4742164. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:08:03,437][61643] Avg episode reward: [(0, '2.880'), (1, '3.080')] [2023-10-12 16:08:03,449][62495] Saving new best policy, reward=3.080! [2023-10-12 16:08:05,356][62635] Updated weights for policy 1, policy_version 9250 (0.0009) [2023-10-12 16:08:05,728][62635] Updated weights for policy 1, policy_version 9260 (0.0010) [2023-10-12 16:08:06,110][62635] Updated weights for policy 1, policy_version 9270 (0.0011) [2023-10-12 16:08:06,475][62635] Updated weights for policy 1, policy_version 9280 (0.0008) [2023-10-12 16:08:06,887][62634] Updated weights for policy 0, policy_version 9250 (0.0008) [2023-10-12 16:08:07,286][62634] Updated weights for policy 0, policy_version 9260 (0.0008) [2023-10-12 16:08:07,664][62634] Updated weights for policy 0, policy_version 9270 (0.0007) [2023-10-12 16:08:08,036][62634] Updated weights for policy 0, policy_version 9280 (0.0007) [2023-10-12 16:08:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 19005440. Throughput: 0: 1700.4, 1: 1658.2. Samples: 4753004. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 16:08:08,436][61643] Avg episode reward: [(0, '2.880'), (1, '3.140')] [2023-10-12 16:08:08,437][62495] Saving new best policy, reward=3.140! [2023-10-12 16:08:10,696][62635] Updated weights for policy 1, policy_version 9290 (0.0010) [2023-10-12 16:08:11,061][62635] Updated weights for policy 1, policy_version 9300 (0.0008) [2023-10-12 16:08:11,433][62635] Updated weights for policy 1, policy_version 9310 (0.0009) [2023-10-12 16:08:12,069][62634] Updated weights for policy 0, policy_version 9290 (0.0008) [2023-10-12 16:08:12,444][62634] Updated weights for policy 0, policy_version 9300 (0.0007) [2023-10-12 16:08:12,821][62634] Updated weights for policy 0, policy_version 9310 (0.0008) [2023-10-12 16:08:13,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 19070976. Throughput: 0: 1687.8, 1: 1667.5. Samples: 4772682. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 16:08:13,435][61643] Avg episode reward: [(0, '2.950'), (1, '3.100')] [2023-10-12 16:08:15,524][62635] Updated weights for policy 1, policy_version 9320 (0.0010) [2023-10-12 16:08:15,893][62635] Updated weights for policy 1, policy_version 9330 (0.0010) [2023-10-12 16:08:16,260][62635] Updated weights for policy 1, policy_version 9340 (0.0008) [2023-10-12 16:08:16,723][62634] Updated weights for policy 0, policy_version 9320 (0.0008) [2023-10-12 16:08:17,100][62634] Updated weights for policy 0, policy_version 9330 (0.0009) [2023-10-12 16:08:17,485][62634] Updated weights for policy 0, policy_version 9340 (0.0011) [2023-10-12 16:08:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 19136512. Throughput: 0: 1679.0, 1: 1670.6. Samples: 4792446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:08:18,436][61643] Avg episode reward: [(0, '2.970'), (1, '3.070')] [2023-10-12 16:08:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000009344_9568256.pth... [2023-10-12 16:08:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000009344_9568256.pth... [2023-10-12 16:08:18,490][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000007776_7962624.pth [2023-10-12 16:08:18,490][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000007776_7962624.pth [2023-10-12 16:08:20,438][62635] Updated weights for policy 1, policy_version 9350 (0.0010) [2023-10-12 16:08:20,819][62635] Updated weights for policy 1, policy_version 9360 (0.0009) [2023-10-12 16:08:21,185][62635] Updated weights for policy 1, policy_version 9370 (0.0009) [2023-10-12 16:08:21,259][62634] Updated weights for policy 0, policy_version 9350 (0.0009) [2023-10-12 16:08:21,639][62634] Updated weights for policy 0, policy_version 9360 (0.0009) [2023-10-12 16:08:22,020][62634] Updated weights for policy 0, policy_version 9370 (0.0007) [2023-10-12 16:08:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 19202048. Throughput: 0: 1701.6, 1: 1654.4. Samples: 4803240. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:08:23,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.090')] [2023-10-12 16:08:25,248][62635] Updated weights for policy 1, policy_version 9380 (0.0011) [2023-10-12 16:08:25,629][62635] Updated weights for policy 1, policy_version 9390 (0.0008) [2023-10-12 16:08:25,993][62635] Updated weights for policy 1, policy_version 9400 (0.0008) [2023-10-12 16:08:26,161][62634] Updated weights for policy 0, policy_version 9380 (0.0008) [2023-10-12 16:08:26,530][62634] Updated weights for policy 0, policy_version 9390 (0.0007) [2023-10-12 16:08:26,911][62634] Updated weights for policy 0, policy_version 9400 (0.0008) [2023-10-12 16:08:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 19267584. Throughput: 0: 1677.9, 1: 1669.9. Samples: 4822518. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 16:08:28,436][61643] Avg episode reward: [(0, '2.980'), (1, '3.040')] [2023-10-12 16:08:30,046][62635] Updated weights for policy 1, policy_version 9410 (0.0007) [2023-10-12 16:08:30,414][62635] Updated weights for policy 1, policy_version 9420 (0.0007) [2023-10-12 16:08:30,780][62635] Updated weights for policy 1, policy_version 9430 (0.0007) [2023-10-12 16:08:30,846][62634] Updated weights for policy 0, policy_version 9410 (0.0007) [2023-10-12 16:08:31,154][62635] Updated weights for policy 1, policy_version 9440 (0.0007) [2023-10-12 16:08:31,233][62634] Updated weights for policy 0, policy_version 9420 (0.0008) [2023-10-12 16:08:31,609][62634] Updated weights for policy 0, policy_version 9430 (0.0008) [2023-10-12 16:08:31,987][62634] Updated weights for policy 0, policy_version 9440 (0.0007) [2023-10-12 16:08:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19333120. Throughput: 0: 1682.4, 1: 1672.7. Samples: 4843108. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 16:08:33,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.030')] [2023-10-12 16:08:35,161][62635] Updated weights for policy 1, policy_version 9450 (0.0010) [2023-10-12 16:08:35,544][62635] Updated weights for policy 1, policy_version 9460 (0.0009) [2023-10-12 16:08:35,913][62635] Updated weights for policy 1, policy_version 9470 (0.0009) [2023-10-12 16:08:35,936][62634] Updated weights for policy 0, policy_version 9450 (0.0008) [2023-10-12 16:08:36,311][62634] Updated weights for policy 0, policy_version 9460 (0.0008) [2023-10-12 16:08:36,683][62634] Updated weights for policy 0, policy_version 9470 (0.0008) [2023-10-12 16:08:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19398656. Throughput: 0: 1689.6, 1: 1659.6. Samples: 4853188. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-12 16:08:38,436][61643] Avg episode reward: [(0, '3.000'), (1, '3.010')] [2023-10-12 16:08:39,890][62635] Updated weights for policy 1, policy_version 9480 (0.0011) [2023-10-12 16:08:40,263][62635] Updated weights for policy 1, policy_version 9490 (0.0009) [2023-10-12 16:08:40,628][62635] Updated weights for policy 1, policy_version 9500 (0.0007) [2023-10-12 16:08:40,671][62634] Updated weights for policy 0, policy_version 9480 (0.0007) [2023-10-12 16:08:41,050][62634] Updated weights for policy 0, policy_version 9490 (0.0007) [2023-10-12 16:08:41,419][62634] Updated weights for policy 0, policy_version 9500 (0.0007) [2023-10-12 16:08:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19464192. Throughput: 0: 1667.1, 1: 1673.4. Samples: 4872806. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-12 16:08:43,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.100')] [2023-10-12 16:08:44,837][62635] Updated weights for policy 1, policy_version 9510 (0.0007) [2023-10-12 16:08:45,212][62635] Updated weights for policy 1, policy_version 9520 (0.0009) [2023-10-12 16:08:45,583][62635] Updated weights for policy 1, policy_version 9530 (0.0008) [2023-10-12 16:08:45,600][62634] Updated weights for policy 0, policy_version 9510 (0.0008) [2023-10-12 16:08:45,973][62634] Updated weights for policy 0, policy_version 9520 (0.0010) [2023-10-12 16:08:46,352][62634] Updated weights for policy 0, policy_version 9530 (0.0010) [2023-10-12 16:08:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19529728. Throughput: 0: 1692.1, 1: 1672.9. Samples: 4893592. Policy #0 lag: (min: 9.0, avg: 16.5, max: 41.0) [2023-10-12 16:08:48,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.140')] [2023-10-12 16:08:49,706][62635] Updated weights for policy 1, policy_version 9540 (0.0007) [2023-10-12 16:08:50,077][62635] Updated weights for policy 1, policy_version 9550 (0.0008) [2023-10-12 16:08:50,422][62634] Updated weights for policy 0, policy_version 9540 (0.0009) [2023-10-12 16:08:50,438][62635] Updated weights for policy 1, policy_version 9560 (0.0007) [2023-10-12 16:08:50,806][62634] Updated weights for policy 0, policy_version 9550 (0.0009) [2023-10-12 16:08:51,177][62634] Updated weights for policy 0, policy_version 9560 (0.0009) [2023-10-12 16:08:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 19595264. Throughput: 0: 1677.3, 1: 1663.8. Samples: 4903356. Policy #0 lag: (min: 9.0, avg: 16.5, max: 41.0) [2023-10-12 16:08:53,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.160')] [2023-10-12 16:08:53,436][62495] Saving new best policy, reward=3.160! 
[2023-10-12 16:08:54,371][62635] Updated weights for policy 1, policy_version 9570 (0.0010) [2023-10-12 16:08:54,746][62635] Updated weights for policy 1, policy_version 9580 (0.0008) [2023-10-12 16:08:55,110][62635] Updated weights for policy 1, policy_version 9590 (0.0007) [2023-10-12 16:08:55,388][62634] Updated weights for policy 0, policy_version 9570 (0.0007) [2023-10-12 16:08:55,479][62635] Updated weights for policy 1, policy_version 9600 (0.0007) [2023-10-12 16:08:55,774][62634] Updated weights for policy 0, policy_version 9580 (0.0009) [2023-10-12 16:08:56,154][62634] Updated weights for policy 0, policy_version 9590 (0.0007) [2023-10-12 16:08:56,535][62634] Updated weights for policy 0, policy_version 9600 (0.0007) [2023-10-12 16:08:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 19660800. Throughput: 0: 1674.3, 1: 1683.8. Samples: 4923794. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:08:58,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.140')] [2023-10-12 16:08:59,600][62635] Updated weights for policy 1, policy_version 9610 (0.0009) [2023-10-12 16:08:59,975][62635] Updated weights for policy 1, policy_version 9620 (0.0010) [2023-10-12 16:09:00,347][62635] Updated weights for policy 1, policy_version 9630 (0.0008) [2023-10-12 16:09:00,529][62634] Updated weights for policy 0, policy_version 9610 (0.0010) [2023-10-12 16:09:00,903][62634] Updated weights for policy 0, policy_version 9620 (0.0011) [2023-10-12 16:09:01,280][62634] Updated weights for policy 0, policy_version 9630 (0.0010) [2023-10-12 16:09:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 19726336. Throughput: 0: 1695.0, 1: 1683.1. Samples: 4944460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:09:03,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.200')] [2023-10-12 16:09:03,446][62495] Saving new best policy, reward=3.200! 
[2023-10-12 16:09:04,414][62635] Updated weights for policy 1, policy_version 9640 (0.0008) [2023-10-12 16:09:04,785][62635] Updated weights for policy 1, policy_version 9650 (0.0007) [2023-10-12 16:09:05,143][62635] Updated weights for policy 1, policy_version 9660 (0.0008) [2023-10-12 16:09:05,298][62634] Updated weights for policy 0, policy_version 9640 (0.0007) [2023-10-12 16:09:05,675][62634] Updated weights for policy 0, policy_version 9650 (0.0008) [2023-10-12 16:09:06,064][62634] Updated weights for policy 0, policy_version 9660 (0.0009) [2023-10-12 16:09:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 19791872. Throughput: 0: 1671.9, 1: 1675.7. Samples: 4953884. Policy #0 lag: (min: 25.0, avg: 44.0, max: 57.0) [2023-10-12 16:09:08,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.140')] [2023-10-12 16:09:09,047][62635] Updated weights for policy 1, policy_version 9670 (0.0009) [2023-10-12 16:09:09,416][62635] Updated weights for policy 1, policy_version 9680 (0.0009) [2023-10-12 16:09:09,786][62635] Updated weights for policy 1, policy_version 9690 (0.0009) [2023-10-12 16:09:10,050][62634] Updated weights for policy 0, policy_version 9670 (0.0008) [2023-10-12 16:09:10,427][62634] Updated weights for policy 0, policy_version 9680 (0.0007) [2023-10-12 16:09:10,812][62634] Updated weights for policy 0, policy_version 9690 (0.0007) [2023-10-12 16:09:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 19857408. Throughput: 0: 1691.4, 1: 1683.2. Samples: 4974372. 
Policy #0 lag: (min: 25.0, avg: 44.0, max: 57.0) [2023-10-12 16:09:13,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.150')] [2023-10-12 16:09:13,755][62635] Updated weights for policy 1, policy_version 9700 (0.0009) [2023-10-12 16:09:14,119][62635] Updated weights for policy 1, policy_version 9710 (0.0009) [2023-10-12 16:09:14,492][62635] Updated weights for policy 1, policy_version 9720 (0.0010) [2023-10-12 16:09:14,616][62634] Updated weights for policy 0, policy_version 9700 (0.0009) [2023-10-12 16:09:14,986][62634] Updated weights for policy 0, policy_version 9710 (0.0008) [2023-10-12 16:09:15,368][62634] Updated weights for policy 0, policy_version 9720 (0.0007) [2023-10-12 16:09:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 19922944. Throughput: 0: 1696.5, 1: 1680.8. Samples: 4995086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:09:18,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.050')] [2023-10-12 16:09:18,624][62635] Updated weights for policy 1, policy_version 9730 (0.0008) [2023-10-12 16:09:18,994][62635] Updated weights for policy 1, policy_version 9740 (0.0008) [2023-10-12 16:09:19,363][62635] Updated weights for policy 1, policy_version 9750 (0.0008) [2023-10-12 16:09:19,428][62634] Updated weights for policy 0, policy_version 9730 (0.0007) [2023-10-12 16:09:19,729][62635] Updated weights for policy 1, policy_version 9760 (0.0008) [2023-10-12 16:09:19,816][62634] Updated weights for policy 0, policy_version 9740 (0.0010) [2023-10-12 16:09:20,190][62634] Updated weights for policy 0, policy_version 9750 (0.0009) [2023-10-12 16:09:20,565][62634] Updated weights for policy 0, policy_version 9760 (0.0007) [2023-10-12 16:09:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 19988480. Throughput: 0: 1676.3, 1: 1679.1. Samples: 5004180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:09:23,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.130')] [2023-10-12 16:09:23,857][62635] Updated weights for policy 1, policy_version 9770 (0.0010) [2023-10-12 16:09:24,235][62635] Updated weights for policy 1, policy_version 9780 (0.0009) [2023-10-12 16:09:24,602][62635] Updated weights for policy 1, policy_version 9790 (0.0007) [2023-10-12 16:09:24,662][62634] Updated weights for policy 0, policy_version 9770 (0.0007) [2023-10-12 16:09:25,033][62634] Updated weights for policy 0, policy_version 9780 (0.0007) [2023-10-12 16:09:25,423][62634] Updated weights for policy 0, policy_version 9790 (0.0009) [2023-10-12 16:09:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20054016. Throughput: 0: 1693.2, 1: 1688.3. Samples: 5024972. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) [2023-10-12 16:09:28,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.080')] [2023-10-12 16:09:28,531][62635] Updated weights for policy 1, policy_version 9800 (0.0007) [2023-10-12 16:09:28,906][62635] Updated weights for policy 1, policy_version 9810 (0.0008) [2023-10-12 16:09:29,269][62635] Updated weights for policy 1, policy_version 9820 (0.0007) [2023-10-12 16:09:29,481][62634] Updated weights for policy 0, policy_version 9800 (0.0008) [2023-10-12 16:09:29,863][62634] Updated weights for policy 0, policy_version 9810 (0.0007) [2023-10-12 16:09:30,239][62634] Updated weights for policy 0, policy_version 9820 (0.0008) [2023-10-12 16:09:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20119552. Throughput: 0: 1692.1, 1: 1689.7. Samples: 5045772. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) [2023-10-12 16:09:33,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.190')] [2023-10-12 16:09:33,454][62635] Updated weights for policy 1, policy_version 9830 (0.0009) [2023-10-12 16:09:33,824][62635] Updated weights for policy 1, policy_version 9840 (0.0010) [2023-10-12 16:09:34,198][62635] Updated weights for policy 1, policy_version 9850 (0.0007) [2023-10-12 16:09:34,334][62634] Updated weights for policy 0, policy_version 9830 (0.0008) [2023-10-12 16:09:34,699][62634] Updated weights for policy 0, policy_version 9840 (0.0010) [2023-10-12 16:09:35,081][62634] Updated weights for policy 0, policy_version 9850 (0.0009) [2023-10-12 16:09:38,337][62635] Updated weights for policy 1, policy_version 9860 (0.0008) [2023-10-12 16:09:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20185088. Throughput: 0: 1678.5, 1: 1690.4. Samples: 5054960. Policy #0 lag: (min: 3.0, avg: 3.4, max: 17.0) [2023-10-12 16:09:38,436][61643] Avg episode reward: [(0, '2.970'), (1, '3.140')] [2023-10-12 16:09:38,703][62635] Updated weights for policy 1, policy_version 9870 (0.0010) [2023-10-12 16:09:39,037][62634] Updated weights for policy 0, policy_version 9860 (0.0009) [2023-10-12 16:09:39,076][62635] Updated weights for policy 1, policy_version 9880 (0.0009) [2023-10-12 16:09:39,406][62634] Updated weights for policy 0, policy_version 9870 (0.0009) [2023-10-12 16:09:39,785][62634] Updated weights for policy 0, policy_version 9880 (0.0008) [2023-10-12 16:09:43,146][62635] Updated weights for policy 1, policy_version 9890 (0.0007) [2023-10-12 16:09:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20250624. Throughput: 0: 1688.2, 1: 1683.0. Samples: 5075496. 
Policy #0 lag: (min: 3.0, avg: 3.4, max: 17.0) [2023-10-12 16:09:43,435][61643] Avg episode reward: [(0, '2.980'), (1, '3.200')] [2023-10-12 16:09:43,517][62635] Updated weights for policy 1, policy_version 9900 (0.0009) [2023-10-12 16:09:43,890][62634] Updated weights for policy 0, policy_version 9890 (0.0008) [2023-10-12 16:09:43,893][62635] Updated weights for policy 1, policy_version 9910 (0.0008) [2023-10-12 16:09:44,265][62634] Updated weights for policy 0, policy_version 9900 (0.0008) [2023-10-12 16:09:44,266][62635] Updated weights for policy 1, policy_version 9920 (0.0008) [2023-10-12 16:09:44,638][62634] Updated weights for policy 0, policy_version 9910 (0.0007) [2023-10-12 16:09:45,011][62634] Updated weights for policy 0, policy_version 9920 (0.0009) [2023-10-12 16:09:48,292][62635] Updated weights for policy 1, policy_version 9930 (0.0009) [2023-10-12 16:09:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 20316160. Throughput: 0: 1683.8, 1: 1680.9. Samples: 5095872. Policy #0 lag: (min: 24.0, avg: 48.8, max: 56.0) [2023-10-12 16:09:48,436][61643] Avg episode reward: [(0, '2.980'), (1, '3.170')] [2023-10-12 16:09:48,662][62635] Updated weights for policy 1, policy_version 9940 (0.0007) [2023-10-12 16:09:49,032][62635] Updated weights for policy 1, policy_version 9950 (0.0008) [2023-10-12 16:09:49,197][62634] Updated weights for policy 0, policy_version 9930 (0.0008) [2023-10-12 16:09:49,574][62634] Updated weights for policy 0, policy_version 9940 (0.0007) [2023-10-12 16:09:49,957][62634] Updated weights for policy 0, policy_version 9950 (0.0007) [2023-10-12 16:09:53,223][62635] Updated weights for policy 1, policy_version 9960 (0.0010) [2023-10-12 16:09:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20381696. Throughput: 0: 1672.5, 1: 1684.7. Samples: 5104958. 
Policy #0 lag: (min: 24.0, avg: 48.8, max: 56.0) [2023-10-12 16:09:53,436][61643] Avg episode reward: [(0, '2.980'), (1, '3.140')] [2023-10-12 16:09:53,585][62635] Updated weights for policy 1, policy_version 9970 (0.0007) [2023-10-12 16:09:53,950][62635] Updated weights for policy 1, policy_version 9980 (0.0007) [2023-10-12 16:09:53,962][62634] Updated weights for policy 0, policy_version 9960 (0.0008) [2023-10-12 16:09:54,329][62634] Updated weights for policy 0, policy_version 9970 (0.0007) [2023-10-12 16:09:54,713][62634] Updated weights for policy 0, policy_version 9980 (0.0007) [2023-10-12 16:09:58,043][62635] Updated weights for policy 1, policy_version 9990 (0.0008) [2023-10-12 16:09:58,422][62635] Updated weights for policy 1, policy_version 10000 (0.0008) [2023-10-12 16:09:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 20447232. Throughput: 0: 1675.0, 1: 1686.0. Samples: 5125616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:09:58,437][61643] Avg episode reward: [(0, '2.950'), (1, '3.120')] [2023-10-12 16:09:58,790][62635] Updated weights for policy 1, policy_version 10010 (0.0008) [2023-10-12 16:09:58,856][62634] Updated weights for policy 0, policy_version 9990 (0.0007) [2023-10-12 16:09:59,242][62634] Updated weights for policy 0, policy_version 10000 (0.0009) [2023-10-12 16:09:59,632][62634] Updated weights for policy 0, policy_version 10010 (0.0007) [2023-10-12 16:10:02,827][62635] Updated weights for policy 1, policy_version 10020 (0.0009) [2023-10-12 16:10:03,200][62635] Updated weights for policy 1, policy_version 10030 (0.0011) [2023-10-12 16:10:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20512768. Throughput: 0: 1674.4, 1: 1676.1. Samples: 5145854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:10:03,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.170')] [2023-10-12 16:10:03,572][62635] Updated weights for policy 1, policy_version 10040 (0.0010) [2023-10-12 16:10:03,621][62634] Updated weights for policy 0, policy_version 10020 (0.0010) [2023-10-12 16:10:03,987][62634] Updated weights for policy 0, policy_version 10030 (0.0008) [2023-10-12 16:10:04,366][62634] Updated weights for policy 0, policy_version 10040 (0.0008) [2023-10-12 16:10:07,540][62635] Updated weights for policy 1, policy_version 10050 (0.0009) [2023-10-12 16:10:07,913][62635] Updated weights for policy 1, policy_version 10060 (0.0007) [2023-10-12 16:10:08,284][62635] Updated weights for policy 1, policy_version 10070 (0.0007) [2023-10-12 16:10:08,342][62634] Updated weights for policy 0, policy_version 10050 (0.0007) [2023-10-12 16:10:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 20578304. Throughput: 0: 1676.6, 1: 1687.1. Samples: 5155548. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:10:08,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.170')] [2023-10-12 16:10:08,656][62635] Updated weights for policy 1, policy_version 10080 (0.0007) [2023-10-12 16:10:08,719][62634] Updated weights for policy 0, policy_version 10060 (0.0008) [2023-10-12 16:10:09,094][62634] Updated weights for policy 0, policy_version 10070 (0.0008) [2023-10-12 16:10:09,473][62634] Updated weights for policy 0, policy_version 10080 (0.0008) [2023-10-12 16:10:12,715][62635] Updated weights for policy 1, policy_version 10090 (0.0009) [2023-10-12 16:10:13,076][62635] Updated weights for policy 1, policy_version 10100 (0.0009) [2023-10-12 16:10:13,432][62634] Updated weights for policy 0, policy_version 10090 (0.0008) [2023-10-12 16:10:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20643840. Throughput: 0: 1680.1, 1: 1682.6. 
Samples: 5176294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:10:13,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.170')] [2023-10-12 16:10:13,451][62635] Updated weights for policy 1, policy_version 10110 (0.0008) [2023-10-12 16:10:13,800][62634] Updated weights for policy 0, policy_version 10100 (0.0008) [2023-10-12 16:10:14,177][62634] Updated weights for policy 0, policy_version 10110 (0.0007) [2023-10-12 16:10:17,431][62635] Updated weights for policy 1, policy_version 10120 (0.0009) [2023-10-12 16:10:17,797][62635] Updated weights for policy 1, policy_version 10130 (0.0009) [2023-10-12 16:10:18,165][62635] Updated weights for policy 1, policy_version 10140 (0.0007) [2023-10-12 16:10:18,282][62634] Updated weights for policy 0, policy_version 10120 (0.0007) [2023-10-12 16:10:18,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20742144. Throughput: 0: 1681.4, 1: 1664.8. Samples: 5196352. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 16:10:18,436][61643] Avg episode reward: [(0, '2.970'), (1, '3.190')] [2023-10-12 16:10:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000010144_10387456.pth... [2023-10-12 16:10:18,488][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000008576_8781824.pth [2023-10-12 16:10:18,656][62634] Updated weights for policy 0, policy_version 10130 (0.0008) [2023-10-12 16:10:19,023][62634] Updated weights for policy 0, policy_version 10140 (0.0010) [2023-10-12 16:10:19,172][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000010144_10387456.pth... 
[2023-10-12 16:10:19,210][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000008544_8749056.pth [2023-10-12 16:10:22,395][62635] Updated weights for policy 1, policy_version 10150 (0.0009) [2023-10-12 16:10:22,771][62635] Updated weights for policy 1, policy_version 10160 (0.0010) [2023-10-12 16:10:23,061][62634] Updated weights for policy 0, policy_version 10150 (0.0007) [2023-10-12 16:10:23,132][62635] Updated weights for policy 1, policy_version 10170 (0.0007) [2023-10-12 16:10:23,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20807680. Throughput: 0: 1681.3, 1: 1679.7. Samples: 5206204. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 16:10:23,435][61643] Avg episode reward: [(0, '2.980'), (1, '3.230')] [2023-10-12 16:10:23,436][62495] Saving new best policy, reward=3.230! [2023-10-12 16:10:23,439][62634] Updated weights for policy 0, policy_version 10160 (0.0007) [2023-10-12 16:10:23,819][62634] Updated weights for policy 0, policy_version 10170 (0.0008) [2023-10-12 16:10:27,247][62635] Updated weights for policy 1, policy_version 10180 (0.0009) [2023-10-12 16:10:27,623][62635] Updated weights for policy 1, policy_version 10190 (0.0009) [2023-10-12 16:10:27,720][62634] Updated weights for policy 0, policy_version 10180 (0.0009) [2023-10-12 16:10:27,985][62635] Updated weights for policy 1, policy_version 10200 (0.0007) [2023-10-12 16:10:28,103][62634] Updated weights for policy 0, policy_version 10190 (0.0010) [2023-10-12 16:10:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 20873216. Throughput: 0: 1682.9, 1: 1681.1. Samples: 5226874. Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) [2023-10-12 16:10:28,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.240')] [2023-10-12 16:10:28,436][62495] Saving new best policy, reward=3.240! 
[2023-10-12 16:10:28,470][62634] Updated weights for policy 0, policy_version 10200 (0.0009) [2023-10-12 16:10:31,858][62635] Updated weights for policy 1, policy_version 10210 (0.0007) [2023-10-12 16:10:32,233][62635] Updated weights for policy 1, policy_version 10220 (0.0009) [2023-10-12 16:10:32,564][62634] Updated weights for policy 0, policy_version 10210 (0.0010) [2023-10-12 16:10:32,592][62635] Updated weights for policy 1, policy_version 10230 (0.0008) [2023-10-12 16:10:32,949][62634] Updated weights for policy 0, policy_version 10220 (0.0009) [2023-10-12 16:10:32,965][62635] Updated weights for policy 1, policy_version 10240 (0.0007) [2023-10-12 16:10:33,318][62634] Updated weights for policy 0, policy_version 10230 (0.0011) [2023-10-12 16:10:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20938752. Throughput: 0: 1674.8, 1: 1660.9. Samples: 5245976. Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) [2023-10-12 16:10:33,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.210')] [2023-10-12 16:10:33,694][62634] Updated weights for policy 0, policy_version 10240 (0.0010) [2023-10-12 16:10:36,970][62635] Updated weights for policy 1, policy_version 10250 (0.0009) [2023-10-12 16:10:37,334][62635] Updated weights for policy 1, policy_version 10260 (0.0009) [2023-10-12 16:10:37,709][62635] Updated weights for policy 1, policy_version 10270 (0.0008) [2023-10-12 16:10:38,031][62634] Updated weights for policy 0, policy_version 10250 (0.0008) [2023-10-12 16:10:38,406][62634] Updated weights for policy 0, policy_version 10260 (0.0008) [2023-10-12 16:10:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21004288. Throughput: 0: 1689.1, 1: 1686.7. Samples: 5256872. 
Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 16:10:38,436][61643] Avg episode reward: [(0, '3.010'), (1, '3.230')] [2023-10-12 16:10:38,788][62634] Updated weights for policy 0, policy_version 10270 (0.0009) [2023-10-12 16:10:41,690][62635] Updated weights for policy 1, policy_version 10280 (0.0007) [2023-10-12 16:10:42,062][62635] Updated weights for policy 1, policy_version 10290 (0.0008) [2023-10-12 16:10:42,431][62635] Updated weights for policy 1, policy_version 10300 (0.0009) [2023-10-12 16:10:42,819][62634] Updated weights for policy 0, policy_version 10280 (0.0009) [2023-10-12 16:10:43,205][62634] Updated weights for policy 0, policy_version 10290 (0.0010) [2023-10-12 16:10:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21069824. Throughput: 0: 1690.5, 1: 1672.4. Samples: 5276946. Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 16:10:43,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.220')] [2023-10-12 16:10:43,585][62634] Updated weights for policy 0, policy_version 10300 (0.0008) [2023-10-12 16:10:46,552][62635] Updated weights for policy 1, policy_version 10310 (0.0009) [2023-10-12 16:10:46,942][62635] Updated weights for policy 1, policy_version 10320 (0.0009) [2023-10-12 16:10:47,312][62635] Updated weights for policy 1, policy_version 10330 (0.0008) [2023-10-12 16:10:47,605][62634] Updated weights for policy 0, policy_version 10310 (0.0008) [2023-10-12 16:10:47,983][62634] Updated weights for policy 0, policy_version 10320 (0.0009) [2023-10-12 16:10:48,362][62634] Updated weights for policy 0, policy_version 10330 (0.0010) [2023-10-12 16:10:48,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21135360. Throughput: 0: 1679.0, 1: 1669.9. Samples: 5296552. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 16:10:48,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.240')] [2023-10-12 16:10:51,349][62635] Updated weights for policy 1, policy_version 10340 (0.0008) [2023-10-12 16:10:51,716][62635] Updated weights for policy 1, policy_version 10350 (0.0007) [2023-10-12 16:10:52,093][62635] Updated weights for policy 1, policy_version 10360 (0.0009) [2023-10-12 16:10:52,432][62634] Updated weights for policy 0, policy_version 10340 (0.0008) [2023-10-12 16:10:52,809][62634] Updated weights for policy 0, policy_version 10350 (0.0009) [2023-10-12 16:10:53,188][62634] Updated weights for policy 0, policy_version 10360 (0.0008) [2023-10-12 16:10:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21200896. Throughput: 0: 1688.5, 1: 1689.3. Samples: 5307550. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 16:10:53,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.220')] [2023-10-12 16:10:56,386][62635] Updated weights for policy 1, policy_version 10370 (0.0007) [2023-10-12 16:10:56,756][62635] Updated weights for policy 1, policy_version 10380 (0.0008) [2023-10-12 16:10:57,118][62635] Updated weights for policy 1, policy_version 10390 (0.0009) [2023-10-12 16:10:57,221][62634] Updated weights for policy 0, policy_version 10370 (0.0008) [2023-10-12 16:10:57,492][62635] Updated weights for policy 1, policy_version 10400 (0.0008) [2023-10-12 16:10:57,602][62634] Updated weights for policy 0, policy_version 10380 (0.0008) [2023-10-12 16:10:57,978][62634] Updated weights for policy 0, policy_version 10390 (0.0007) [2023-10-12 16:10:58,348][62634] Updated weights for policy 0, policy_version 10400 (0.0008) [2023-10-12 16:10:58,435][61643] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 21299200. Throughput: 0: 1685.4, 1: 1673.4. Samples: 5327438. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 16:10:58,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.310')] [2023-10-12 16:10:58,437][62495] Saving new best policy, reward=3.310! [2023-10-12 16:11:01,567][62635] Updated weights for policy 1, policy_version 10410 (0.0008) [2023-10-12 16:11:01,941][62635] Updated weights for policy 1, policy_version 10420 (0.0008) [2023-10-12 16:11:02,308][62635] Updated weights for policy 1, policy_version 10430 (0.0008) [2023-10-12 16:11:02,355][62634] Updated weights for policy 0, policy_version 10410 (0.0008) [2023-10-12 16:11:02,727][62634] Updated weights for policy 0, policy_version 10420 (0.0007) [2023-10-12 16:11:03,113][62634] Updated weights for policy 0, policy_version 10430 (0.0007) [2023-10-12 16:11:03,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 21364736. Throughput: 0: 1658.7, 1: 1677.4. Samples: 5346474. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:11:03,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.260')] [2023-10-12 16:11:06,356][62635] Updated weights for policy 1, policy_version 10440 (0.0010) [2023-10-12 16:11:06,735][62635] Updated weights for policy 1, policy_version 10450 (0.0009) [2023-10-12 16:11:07,108][62635] Updated weights for policy 1, policy_version 10460 (0.0008) [2023-10-12 16:11:07,205][62634] Updated weights for policy 0, policy_version 10440 (0.0007) [2023-10-12 16:11:07,587][62634] Updated weights for policy 0, policy_version 10450 (0.0007) [2023-10-12 16:11:07,973][62634] Updated weights for policy 0, policy_version 10460 (0.0008) [2023-10-12 16:11:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 21430272. Throughput: 0: 1680.3, 1: 1690.7. Samples: 5357902. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:11:08,436][61643] Avg episode reward: [(0, '2.970'), (1, '3.250')] [2023-10-12 16:11:11,062][62635] Updated weights for policy 1, policy_version 10470 (0.0007) [2023-10-12 16:11:11,438][62635] Updated weights for policy 1, policy_version 10480 (0.0009) [2023-10-12 16:11:11,810][62635] Updated weights for policy 1, policy_version 10490 (0.0009) [2023-10-12 16:11:12,073][62634] Updated weights for policy 0, policy_version 10470 (0.0009) [2023-10-12 16:11:12,447][62634] Updated weights for policy 0, policy_version 10480 (0.0009) [2023-10-12 16:11:12,828][62634] Updated weights for policy 0, policy_version 10490 (0.0008) [2023-10-12 16:11:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 21495808. Throughput: 0: 1679.3, 1: 1665.6. Samples: 5377394. Policy #0 lag: (min: 28.0, avg: 28.6, max: 46.0) [2023-10-12 16:11:13,435][61643] Avg episode reward: [(0, '2.960'), (1, '3.230')] [2023-10-12 16:11:15,707][62635] Updated weights for policy 1, policy_version 10500 (0.0008) [2023-10-12 16:11:16,064][62635] Updated weights for policy 1, policy_version 10510 (0.0007) [2023-10-12 16:11:16,440][62635] Updated weights for policy 1, policy_version 10520 (0.0010) [2023-10-12 16:11:17,037][62634] Updated weights for policy 0, policy_version 10500 (0.0009) [2023-10-12 16:11:17,404][62634] Updated weights for policy 0, policy_version 10510 (0.0010) [2023-10-12 16:11:17,793][62634] Updated weights for policy 0, policy_version 10520 (0.0010) [2023-10-12 16:11:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 21561344. Throughput: 0: 1663.6, 1: 1691.1. Samples: 5396940. 
Policy #0 lag: (min: 28.0, avg: 28.6, max: 46.0) [2023-10-12 16:11:18,436][61643] Avg episode reward: [(0, '2.960'), (1, '3.280')] [2023-10-12 16:11:20,626][62635] Updated weights for policy 1, policy_version 10530 (0.0008) [2023-10-12 16:11:20,984][62635] Updated weights for policy 1, policy_version 10540 (0.0007) [2023-10-12 16:11:21,349][62635] Updated weights for policy 1, policy_version 10550 (0.0008) [2023-10-12 16:11:21,723][62635] Updated weights for policy 1, policy_version 10560 (0.0009) [2023-10-12 16:11:21,751][62634] Updated weights for policy 0, policy_version 10530 (0.0009) [2023-10-12 16:11:22,140][62634] Updated weights for policy 0, policy_version 10540 (0.0011) [2023-10-12 16:11:22,525][62634] Updated weights for policy 0, policy_version 10550 (0.0011) [2023-10-12 16:11:22,909][62634] Updated weights for policy 0, policy_version 10560 (0.0010) [2023-10-12 16:11:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21626880. Throughput: 0: 1677.1, 1: 1680.0. Samples: 5407938. Policy #0 lag: (min: 25.0, avg: 38.9, max: 57.0) [2023-10-12 16:11:23,435][61643] Avg episode reward: [(0, '2.950'), (1, '3.300')] [2023-10-12 16:11:25,723][62635] Updated weights for policy 1, policy_version 10570 (0.0008) [2023-10-12 16:11:26,095][62635] Updated weights for policy 1, policy_version 10580 (0.0010) [2023-10-12 16:11:26,468][62635] Updated weights for policy 1, policy_version 10590 (0.0008) [2023-10-12 16:11:26,990][62634] Updated weights for policy 0, policy_version 10570 (0.0010) [2023-10-12 16:11:27,375][62634] Updated weights for policy 0, policy_version 10580 (0.0010) [2023-10-12 16:11:27,750][62634] Updated weights for policy 0, policy_version 10590 (0.0007) [2023-10-12 16:11:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21692416. Throughput: 0: 1670.3, 1: 1678.3. Samples: 5427632. 
Policy #0 lag: (min: 25.0, avg: 38.9, max: 57.0) [2023-10-12 16:11:28,435][61643] Avg episode reward: [(0, '2.940'), (1, '3.310')] [2023-10-12 16:11:30,563][62635] Updated weights for policy 1, policy_version 10600 (0.0009) [2023-10-12 16:11:30,930][62635] Updated weights for policy 1, policy_version 10610 (0.0008) [2023-10-12 16:11:31,307][62635] Updated weights for policy 1, policy_version 10620 (0.0007) [2023-10-12 16:11:31,726][62634] Updated weights for policy 0, policy_version 10600 (0.0009) [2023-10-12 16:11:32,090][62634] Updated weights for policy 0, policy_version 10610 (0.0011) [2023-10-12 16:11:32,471][62634] Updated weights for policy 0, policy_version 10620 (0.0010) [2023-10-12 16:11:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21757952. Throughput: 0: 1660.4, 1: 1695.1. Samples: 5447550. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 16:11:33,436][61643] Avg episode reward: [(0, '2.960'), (1, '3.290')] [2023-10-12 16:11:35,413][62635] Updated weights for policy 1, policy_version 10630 (0.0007) [2023-10-12 16:11:35,808][62635] Updated weights for policy 1, policy_version 10640 (0.0007) [2023-10-12 16:11:36,175][62635] Updated weights for policy 1, policy_version 10650 (0.0008) [2023-10-12 16:11:36,345][62634] Updated weights for policy 0, policy_version 10630 (0.0010) [2023-10-12 16:11:36,728][62634] Updated weights for policy 0, policy_version 10640 (0.0008) [2023-10-12 16:11:37,099][62634] Updated weights for policy 0, policy_version 10650 (0.0007) [2023-10-12 16:11:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21823488. Throughput: 0: 1679.1, 1: 1670.3. Samples: 5458278. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 16:11:38,436][61643] Avg episode reward: [(0, '2.970'), (1, '3.260')] [2023-10-12 16:11:40,297][62635] Updated weights for policy 1, policy_version 10660 (0.0010) [2023-10-12 16:11:40,666][62635] Updated weights for policy 1, policy_version 10670 (0.0009) [2023-10-12 16:11:40,950][62634] Updated weights for policy 0, policy_version 10660 (0.0007) [2023-10-12 16:11:41,039][62635] Updated weights for policy 1, policy_version 10680 (0.0007) [2023-10-12 16:11:41,318][62634] Updated weights for policy 0, policy_version 10670 (0.0009) [2023-10-12 16:11:41,707][62634] Updated weights for policy 0, policy_version 10680 (0.0009) [2023-10-12 16:11:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21889024. Throughput: 0: 1659.2, 1: 1674.4. Samples: 5477450. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-12 16:11:43,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.250')] [2023-10-12 16:11:45,251][62635] Updated weights for policy 1, policy_version 10690 (0.0008) [2023-10-12 16:11:45,618][62635] Updated weights for policy 1, policy_version 10700 (0.0009) [2023-10-12 16:11:45,977][62635] Updated weights for policy 1, policy_version 10710 (0.0008) [2023-10-12 16:11:45,983][62634] Updated weights for policy 0, policy_version 10690 (0.0008) [2023-10-12 16:11:46,347][62635] Updated weights for policy 1, policy_version 10720 (0.0008) [2023-10-12 16:11:46,353][62634] Updated weights for policy 0, policy_version 10700 (0.0009) [2023-10-12 16:11:46,733][62634] Updated weights for policy 0, policy_version 10710 (0.0008) [2023-10-12 16:11:47,114][62634] Updated weights for policy 0, policy_version 10720 (0.0009) [2023-10-12 16:11:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21954560. Throughput: 0: 1682.3, 1: 1685.6. Samples: 5498030. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-12 16:11:48,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.250')] [2023-10-12 16:11:50,436][62635] Updated weights for policy 1, policy_version 10730 (0.0007) [2023-10-12 16:11:50,797][62635] Updated weights for policy 1, policy_version 10740 (0.0009) [2023-10-12 16:11:51,142][62634] Updated weights for policy 0, policy_version 10730 (0.0007) [2023-10-12 16:11:51,175][62635] Updated weights for policy 1, policy_version 10750 (0.0008) [2023-10-12 16:11:51,515][62634] Updated weights for policy 0, policy_version 10740 (0.0007) [2023-10-12 16:11:51,902][62634] Updated weights for policy 0, policy_version 10750 (0.0007) [2023-10-12 16:11:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22020096. Throughput: 0: 1686.7, 1: 1661.9. Samples: 5508586. Policy #0 lag: (min: 2.0, avg: 9.8, max: 34.0) [2023-10-12 16:11:53,436][61643] Avg episode reward: [(0, '2.950'), (1, '3.280')] [2023-10-12 16:11:55,232][62635] Updated weights for policy 1, policy_version 10760 (0.0009) [2023-10-12 16:11:55,605][62635] Updated weights for policy 1, policy_version 10770 (0.0008) [2023-10-12 16:11:55,969][62635] Updated weights for policy 1, policy_version 10780 (0.0007) [2023-10-12 16:11:56,168][62634] Updated weights for policy 0, policy_version 10760 (0.0009) [2023-10-12 16:11:56,549][62634] Updated weights for policy 0, policy_version 10770 (0.0009) [2023-10-12 16:11:56,940][62634] Updated weights for policy 0, policy_version 10780 (0.0010) [2023-10-12 16:11:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22085632. Throughput: 0: 1664.2, 1: 1679.7. Samples: 5527870. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 34.0) [2023-10-12 16:11:58,435][61643] Avg episode reward: [(0, '2.920'), (1, '3.300')] [2023-10-12 16:11:59,997][62635] Updated weights for policy 1, policy_version 10790 (0.0008) [2023-10-12 16:12:00,364][62635] Updated weights for policy 1, policy_version 10800 (0.0007) [2023-10-12 16:12:00,728][62635] Updated weights for policy 1, policy_version 10810 (0.0008) [2023-10-12 16:12:00,844][62634] Updated weights for policy 0, policy_version 10790 (0.0007) [2023-10-12 16:12:01,222][62634] Updated weights for policy 0, policy_version 10800 (0.0008) [2023-10-12 16:12:01,603][62634] Updated weights for policy 0, policy_version 10810 (0.0007) [2023-10-12 16:12:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22151168. Throughput: 0: 1688.3, 1: 1681.0. Samples: 5548556. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:12:03,435][61643] Avg episode reward: [(0, '2.960'), (1, '3.280')] [2023-10-12 16:12:04,760][62635] Updated weights for policy 1, policy_version 10820 (0.0009) [2023-10-12 16:12:05,133][62635] Updated weights for policy 1, policy_version 10830 (0.0008) [2023-10-12 16:12:05,501][62635] Updated weights for policy 1, policy_version 10840 (0.0007) [2023-10-12 16:12:05,744][62634] Updated weights for policy 0, policy_version 10820 (0.0007) [2023-10-12 16:12:06,113][62634] Updated weights for policy 0, policy_version 10830 (0.0008) [2023-10-12 16:12:06,489][62634] Updated weights for policy 0, policy_version 10840 (0.0007) [2023-10-12 16:12:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22216704. Throughput: 0: 1683.9, 1: 1660.2. Samples: 5558422. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:12:08,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.270')] [2023-10-12 16:12:09,587][62635] Updated weights for policy 1, policy_version 10850 (0.0008) [2023-10-12 16:12:09,968][62635] Updated weights for policy 1, policy_version 10860 (0.0008) [2023-10-12 16:12:10,337][62635] Updated weights for policy 1, policy_version 10870 (0.0007) [2023-10-12 16:12:10,561][62634] Updated weights for policy 0, policy_version 10850 (0.0008) [2023-10-12 16:12:10,704][62635] Updated weights for policy 1, policy_version 10880 (0.0007) [2023-10-12 16:12:10,937][62634] Updated weights for policy 0, policy_version 10860 (0.0009) [2023-10-12 16:12:11,320][62634] Updated weights for policy 0, policy_version 10870 (0.0011) [2023-10-12 16:12:11,692][62634] Updated weights for policy 0, policy_version 10880 (0.0007) [2023-10-12 16:12:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22282240. Throughput: 0: 1665.2, 1: 1674.5. Samples: 5577920. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-12 16:12:13,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.280')] [2023-10-12 16:12:14,791][62635] Updated weights for policy 1, policy_version 10890 (0.0009) [2023-10-12 16:12:15,155][62635] Updated weights for policy 1, policy_version 10900 (0.0008) [2023-10-12 16:12:15,522][62635] Updated weights for policy 1, policy_version 10910 (0.0008) [2023-10-12 16:12:15,677][62634] Updated weights for policy 0, policy_version 10890 (0.0009) [2023-10-12 16:12:16,060][62634] Updated weights for policy 0, policy_version 10900 (0.0009) [2023-10-12 16:12:16,438][62634] Updated weights for policy 0, policy_version 10910 (0.0009) [2023-10-12 16:12:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22347776. Throughput: 0: 1687.5, 1: 1671.7. Samples: 5598714. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-12 16:12:18,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.260')] [2023-10-12 16:12:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000010912_11173888.pth... [2023-10-12 16:12:18,449][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000010912_11173888.pth... [2023-10-12 16:12:18,478][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000009344_9568256.pth [2023-10-12 16:12:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000009344_9568256.pth [2023-10-12 16:12:19,636][62635] Updated weights for policy 1, policy_version 10920 (0.0011) [2023-10-12 16:12:20,001][62635] Updated weights for policy 1, policy_version 10930 (0.0009) [2023-10-12 16:12:20,290][62634] Updated weights for policy 0, policy_version 10920 (0.0009) [2023-10-12 16:12:20,381][62635] Updated weights for policy 1, policy_version 10940 (0.0008) [2023-10-12 16:12:20,671][62634] Updated weights for policy 0, policy_version 10930 (0.0008) [2023-10-12 16:12:21,044][62634] Updated weights for policy 0, policy_version 10940 (0.0009) [2023-10-12 16:12:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22413312. Throughput: 0: 1665.1, 1: 1665.0. Samples: 5608132. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-12 16:12:23,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.260')] [2023-10-12 16:12:24,407][62635] Updated weights for policy 1, policy_version 10950 (0.0008) [2023-10-12 16:12:24,772][62635] Updated weights for policy 1, policy_version 10960 (0.0007) [2023-10-12 16:12:25,128][62634] Updated weights for policy 0, policy_version 10950 (0.0008) [2023-10-12 16:12:25,141][62635] Updated weights for policy 1, policy_version 10970 (0.0007) [2023-10-12 16:12:25,501][62634] Updated weights for policy 0, policy_version 10960 (0.0010) [2023-10-12 16:12:25,876][62634] Updated weights for policy 0, policy_version 10970 (0.0008) [2023-10-12 16:12:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22478848. Throughput: 0: 1676.9, 1: 1678.9. Samples: 5628462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:12:28,436][61643] Avg episode reward: [(0, '2.980'), (1, '3.280')] [2023-10-12 16:12:29,164][62635] Updated weights for policy 1, policy_version 10980 (0.0007) [2023-10-12 16:12:29,547][62635] Updated weights for policy 1, policy_version 10990 (0.0007) [2023-10-12 16:12:29,910][62634] Updated weights for policy 0, policy_version 10980 (0.0008) [2023-10-12 16:12:29,921][62635] Updated weights for policy 1, policy_version 11000 (0.0008) [2023-10-12 16:12:30,280][62634] Updated weights for policy 0, policy_version 10990 (0.0009) [2023-10-12 16:12:30,655][62634] Updated weights for policy 0, policy_version 11000 (0.0007) [2023-10-12 16:12:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 22544384. Throughput: 0: 1680.4, 1: 1679.7. Samples: 5649234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:12:33,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.290')] [2023-10-12 16:12:33,988][62635] Updated weights for policy 1, policy_version 11010 (0.0007) [2023-10-12 16:12:34,350][62635] Updated weights for policy 1, policy_version 11020 (0.0009) [2023-10-12 16:12:34,709][62635] Updated weights for policy 1, policy_version 11030 (0.0009) [2023-10-12 16:12:34,770][62634] Updated weights for policy 0, policy_version 11010 (0.0007) [2023-10-12 16:12:35,077][62635] Updated weights for policy 1, policy_version 11040 (0.0009) [2023-10-12 16:12:35,138][62634] Updated weights for policy 0, policy_version 11020 (0.0008) [2023-10-12 16:12:35,522][62634] Updated weights for policy 0, policy_version 11030 (0.0008) [2023-10-12 16:12:35,910][62634] Updated weights for policy 0, policy_version 11040 (0.0010) [2023-10-12 16:12:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22609920. Throughput: 0: 1656.3, 1: 1668.6. Samples: 5658206. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 16:12:38,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.300')] [2023-10-12 16:12:39,259][62635] Updated weights for policy 1, policy_version 11050 (0.0009) [2023-10-12 16:12:39,630][62635] Updated weights for policy 1, policy_version 11060 (0.0007) [2023-10-12 16:12:39,723][62634] Updated weights for policy 0, policy_version 11050 (0.0009) [2023-10-12 16:12:40,003][62635] Updated weights for policy 1, policy_version 11070 (0.0009) [2023-10-12 16:12:40,099][62634] Updated weights for policy 0, policy_version 11060 (0.0007) [2023-10-12 16:12:40,473][62634] Updated weights for policy 0, policy_version 11070 (0.0011) [2023-10-12 16:12:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22675456. Throughput: 0: 1686.8, 1: 1680.7. Samples: 5679406. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 16:12:43,436][61643] Avg episode reward: [(0, '2.960'), (1, '3.320')] [2023-10-12 16:12:43,438][62495] Saving new best policy, reward=3.320! [2023-10-12 16:12:43,991][62635] Updated weights for policy 1, policy_version 11080 (0.0009) [2023-10-12 16:12:44,350][62635] Updated weights for policy 1, policy_version 11090 (0.0007) [2023-10-12 16:12:44,619][62634] Updated weights for policy 0, policy_version 11080 (0.0007) [2023-10-12 16:12:44,725][62635] Updated weights for policy 1, policy_version 11100 (0.0009) [2023-10-12 16:12:45,008][62634] Updated weights for policy 0, policy_version 11090 (0.0007) [2023-10-12 16:12:45,374][62634] Updated weights for policy 0, policy_version 11100 (0.0007) [2023-10-12 16:12:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 22740992. Throughput: 0: 1688.4, 1: 1678.4. Samples: 5700060. Policy #0 lag: (min: 0.0, avg: 14.7, max: 32.0) [2023-10-12 16:12:48,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.260')] [2023-10-12 16:12:48,803][62635] Updated weights for policy 1, policy_version 11110 (0.0009) [2023-10-12 16:12:49,168][62635] Updated weights for policy 1, policy_version 11120 (0.0009) [2023-10-12 16:12:49,477][62634] Updated weights for policy 0, policy_version 11110 (0.0007) [2023-10-12 16:12:49,548][62635] Updated weights for policy 1, policy_version 11130 (0.0009) [2023-10-12 16:12:49,857][62634] Updated weights for policy 0, policy_version 11120 (0.0007) [2023-10-12 16:12:50,226][62634] Updated weights for policy 0, policy_version 11130 (0.0009) [2023-10-12 16:12:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 22806528. Throughput: 0: 1665.6, 1: 1679.3. Samples: 5708940. 
Policy #0 lag: (min: 0.0, avg: 14.7, max: 32.0) [2023-10-12 16:12:53,435][61643] Avg episode reward: [(0, '2.970'), (1, '3.240')] [2023-10-12 16:12:53,648][62635] Updated weights for policy 1, policy_version 11140 (0.0008) [2023-10-12 16:12:54,019][62635] Updated weights for policy 1, policy_version 11150 (0.0008) [2023-10-12 16:12:54,388][62635] Updated weights for policy 1, policy_version 11160 (0.0008) [2023-10-12 16:12:54,497][62634] Updated weights for policy 0, policy_version 11140 (0.0008) [2023-10-12 16:12:54,892][62634] Updated weights for policy 0, policy_version 11150 (0.0007) [2023-10-12 16:12:55,265][62634] Updated weights for policy 0, policy_version 11160 (0.0008) [2023-10-12 16:12:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 22872064. Throughput: 0: 1685.6, 1: 1685.5. Samples: 5729616. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:12:58,435][61643] Avg episode reward: [(0, '2.980'), (1, '3.230')] [2023-10-12 16:12:58,445][62635] Updated weights for policy 1, policy_version 11170 (0.0008) [2023-10-12 16:12:58,813][62635] Updated weights for policy 1, policy_version 11180 (0.0009) [2023-10-12 16:12:59,173][62635] Updated weights for policy 1, policy_version 11190 (0.0009) [2023-10-12 16:12:59,393][62634] Updated weights for policy 0, policy_version 11170 (0.0009) [2023-10-12 16:12:59,539][62635] Updated weights for policy 1, policy_version 11200 (0.0009) [2023-10-12 16:12:59,780][62634] Updated weights for policy 0, policy_version 11180 (0.0009) [2023-10-12 16:13:00,158][62634] Updated weights for policy 0, policy_version 11190 (0.0009) [2023-10-12 16:13:00,535][62634] Updated weights for policy 0, policy_version 11200 (0.0010) [2023-10-12 16:13:03,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 22937600. Throughput: 0: 1685.1, 1: 1683.6. Samples: 5750308. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:13:03,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.270')] [2023-10-12 16:13:03,837][62635] Updated weights for policy 1, policy_version 11210 (0.0010) [2023-10-12 16:13:04,210][62635] Updated weights for policy 1, policy_version 11220 (0.0008) [2023-10-12 16:13:04,570][62634] Updated weights for policy 0, policy_version 11210 (0.0008) [2023-10-12 16:13:04,584][62635] Updated weights for policy 1, policy_version 11230 (0.0007) [2023-10-12 16:13:04,943][62634] Updated weights for policy 0, policy_version 11220 (0.0010) [2023-10-12 16:13:05,328][62634] Updated weights for policy 0, policy_version 11230 (0.0007) [2023-10-12 16:13:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23003136. Throughput: 0: 1672.3, 1: 1684.5. Samples: 5759188. Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) [2023-10-12 16:13:08,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.280')] [2023-10-12 16:13:08,640][62635] Updated weights for policy 1, policy_version 11240 (0.0009) [2023-10-12 16:13:09,004][62635] Updated weights for policy 1, policy_version 11250 (0.0009) [2023-10-12 16:13:09,260][62634] Updated weights for policy 0, policy_version 11240 (0.0010) [2023-10-12 16:13:09,386][62635] Updated weights for policy 1, policy_version 11260 (0.0008) [2023-10-12 16:13:09,643][62634] Updated weights for policy 0, policy_version 11250 (0.0007) [2023-10-12 16:13:10,024][62634] Updated weights for policy 0, policy_version 11260 (0.0007) [2023-10-12 16:13:13,329][62635] Updated weights for policy 1, policy_version 11270 (0.0010) [2023-10-12 16:13:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23068672. Throughput: 0: 1685.9, 1: 1683.4. Samples: 5780082. 
Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) [2023-10-12 16:13:13,436][61643] Avg episode reward: [(0, '2.960'), (1, '3.260')] [2023-10-12 16:13:13,697][62635] Updated weights for policy 1, policy_version 11280 (0.0009) [2023-10-12 16:13:14,061][62635] Updated weights for policy 1, policy_version 11290 (0.0007) [2023-10-12 16:13:14,063][62634] Updated weights for policy 0, policy_version 11270 (0.0007) [2023-10-12 16:13:14,442][62634] Updated weights for policy 0, policy_version 11280 (0.0008) [2023-10-12 16:13:14,821][62634] Updated weights for policy 0, policy_version 11290 (0.0010) [2023-10-12 16:13:18,145][62635] Updated weights for policy 1, policy_version 11300 (0.0009) [2023-10-12 16:13:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23134208. Throughput: 0: 1681.3, 1: 1684.3. Samples: 5800686. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:13:18,435][61643] Avg episode reward: [(0, '2.950'), (1, '3.260')] [2023-10-12 16:13:18,539][62635] Updated weights for policy 1, policy_version 11310 (0.0007) [2023-10-12 16:13:18,811][62634] Updated weights for policy 0, policy_version 11300 (0.0009) [2023-10-12 16:13:18,899][62635] Updated weights for policy 1, policy_version 11320 (0.0007) [2023-10-12 16:13:19,195][62634] Updated weights for policy 0, policy_version 11310 (0.0007) [2023-10-12 16:13:19,574][62634] Updated weights for policy 0, policy_version 11320 (0.0007) [2023-10-12 16:13:22,956][62635] Updated weights for policy 1, policy_version 11330 (0.0007) [2023-10-12 16:13:23,314][62635] Updated weights for policy 1, policy_version 11340 (0.0009) [2023-10-12 16:13:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23199744. Throughput: 0: 1680.0, 1: 1689.2. Samples: 5809824. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:13:23,435][61643] Avg episode reward: [(0, '2.940'), (1, '3.270')] [2023-10-12 16:13:23,682][62635] Updated weights for policy 1, policy_version 11350 (0.0007) [2023-10-12 16:13:23,911][62634] Updated weights for policy 0, policy_version 11330 (0.0009) [2023-10-12 16:13:24,052][62635] Updated weights for policy 1, policy_version 11360 (0.0009) [2023-10-12 16:13:24,280][62634] Updated weights for policy 0, policy_version 11340 (0.0008) [2023-10-12 16:13:24,662][62634] Updated weights for policy 0, policy_version 11350 (0.0008) [2023-10-12 16:13:25,034][62634] Updated weights for policy 0, policy_version 11360 (0.0009) [2023-10-12 16:13:28,097][62635] Updated weights for policy 1, policy_version 11370 (0.0009) [2023-10-12 16:13:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23265280. Throughput: 0: 1671.9, 1: 1686.0. Samples: 5830514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:13:28,436][61643] Avg episode reward: [(0, '2.970'), (1, '3.230')] [2023-10-12 16:13:28,474][62635] Updated weights for policy 1, policy_version 11380 (0.0009) [2023-10-12 16:13:28,848][62635] Updated weights for policy 1, policy_version 11390 (0.0008) [2023-10-12 16:13:29,017][62634] Updated weights for policy 0, policy_version 11370 (0.0009) [2023-10-12 16:13:29,397][62634] Updated weights for policy 0, policy_version 11380 (0.0010) [2023-10-12 16:13:29,782][62634] Updated weights for policy 0, policy_version 11390 (0.0007) [2023-10-12 16:13:32,728][62635] Updated weights for policy 1, policy_version 11400 (0.0008) [2023-10-12 16:13:33,096][62635] Updated weights for policy 1, policy_version 11410 (0.0008) [2023-10-12 16:13:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23330816. Throughput: 0: 1677.2, 1: 1676.0. Samples: 5850954. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:13:33,435][61643] Avg episode reward: [(0, '3.030'), (1, '3.230')] [2023-10-12 16:13:33,459][62635] Updated weights for policy 1, policy_version 11420 (0.0009) [2023-10-12 16:13:33,842][62634] Updated weights for policy 0, policy_version 11400 (0.0007) [2023-10-12 16:13:34,223][62634] Updated weights for policy 0, policy_version 11410 (0.0007) [2023-10-12 16:13:34,601][62634] Updated weights for policy 0, policy_version 11420 (0.0007) [2023-10-12 16:13:37,526][62635] Updated weights for policy 1, policy_version 11430 (0.0007) [2023-10-12 16:13:37,893][62635] Updated weights for policy 1, policy_version 11440 (0.0007) [2023-10-12 16:13:38,266][62635] Updated weights for policy 1, policy_version 11450 (0.0007) [2023-10-12 16:13:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23396352. Throughput: 0: 1680.0, 1: 1693.3. Samples: 5860740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:13:38,436][61643] Avg episode reward: [(0, '3.040'), (1, '3.200')] [2023-10-12 16:13:38,464][62634] Updated weights for policy 0, policy_version 11430 (0.0009) [2023-10-12 16:13:38,833][62634] Updated weights for policy 0, policy_version 11440 (0.0008) [2023-10-12 16:13:39,218][62634] Updated weights for policy 0, policy_version 11450 (0.0007) [2023-10-12 16:13:42,248][62635] Updated weights for policy 1, policy_version 11460 (0.0007) [2023-10-12 16:13:42,619][62635] Updated weights for policy 1, policy_version 11470 (0.0007) [2023-10-12 16:13:42,982][62635] Updated weights for policy 1, policy_version 11480 (0.0007) [2023-10-12 16:13:43,242][62634] Updated weights for policy 0, policy_version 11460 (0.0008) [2023-10-12 16:13:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 23494656. Throughput: 0: 1691.4, 1: 1689.7. Samples: 5881766. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:13:43,435][61643] Avg episode reward: [(0, '3.040'), (1, '3.230')] [2023-10-12 16:13:43,613][62634] Updated weights for policy 0, policy_version 11470 (0.0009) [2023-10-12 16:13:43,987][62634] Updated weights for policy 0, policy_version 11480 (0.0009) [2023-10-12 16:13:46,959][62635] Updated weights for policy 1, policy_version 11490 (0.0007) [2023-10-12 16:13:47,328][62635] Updated weights for policy 1, policy_version 11500 (0.0009) [2023-10-12 16:13:47,699][62635] Updated weights for policy 1, policy_version 11510 (0.0007) [2023-10-12 16:13:48,012][62634] Updated weights for policy 0, policy_version 11490 (0.0009) [2023-10-12 16:13:48,059][62635] Updated weights for policy 1, policy_version 11520 (0.0007) [2023-10-12 16:13:48,396][62634] Updated weights for policy 0, policy_version 11500 (0.0008) [2023-10-12 16:13:48,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23560192. Throughput: 0: 1693.2, 1: 1664.8. Samples: 5901418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:13:48,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.180')] [2023-10-12 16:13:48,761][62634] Updated weights for policy 0, policy_version 11510 (0.0009) [2023-10-12 16:13:49,137][62634] Updated weights for policy 0, policy_version 11520 (0.0011) [2023-10-12 16:13:52,167][62635] Updated weights for policy 1, policy_version 11530 (0.0008) [2023-10-12 16:13:52,541][62635] Updated weights for policy 1, policy_version 11540 (0.0009) [2023-10-12 16:13:52,907][62635] Updated weights for policy 1, policy_version 11550 (0.0007) [2023-10-12 16:13:53,123][62634] Updated weights for policy 0, policy_version 11530 (0.0007) [2023-10-12 16:13:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 23625728. Throughput: 0: 1701.2, 1: 1690.5. Samples: 5911814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:13:53,435][61643] Avg episode reward: [(0, '2.990'), (1, '3.270')] [2023-10-12 16:13:53,496][62634] Updated weights for policy 0, policy_version 11540 (0.0008) [2023-10-12 16:13:53,865][62634] Updated weights for policy 0, policy_version 11550 (0.0010) [2023-10-12 16:13:56,928][62635] Updated weights for policy 1, policy_version 11560 (0.0008) [2023-10-12 16:13:57,298][62635] Updated weights for policy 1, policy_version 11570 (0.0009) [2023-10-12 16:13:57,663][62635] Updated weights for policy 1, policy_version 11580 (0.0010) [2023-10-12 16:13:57,912][62634] Updated weights for policy 0, policy_version 11560 (0.0009) [2023-10-12 16:13:58,286][62634] Updated weights for policy 0, policy_version 11570 (0.0007) [2023-10-12 16:13:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23691264. Throughput: 0: 1698.8, 1: 1683.3. Samples: 5932280. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 16:13:58,436][61643] Avg episode reward: [(0, '3.000'), (1, '3.270')] [2023-10-12 16:13:58,662][62634] Updated weights for policy 0, policy_version 11580 (0.0008) [2023-10-12 16:14:01,718][62635] Updated weights for policy 1, policy_version 11590 (0.0008) [2023-10-12 16:14:02,080][62635] Updated weights for policy 1, policy_version 11600 (0.0008) [2023-10-12 16:14:02,453][62635] Updated weights for policy 1, policy_version 11610 (0.0009) [2023-10-12 16:14:02,705][62634] Updated weights for policy 0, policy_version 11590 (0.0009) [2023-10-12 16:14:03,092][62634] Updated weights for policy 0, policy_version 11600 (0.0009) [2023-10-12 16:14:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 23756800. Throughput: 0: 1688.9, 1: 1664.6. Samples: 5951592. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 16:14:03,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.320')] [2023-10-12 16:14:03,476][62634] Updated weights for policy 0, policy_version 11610 (0.0008) [2023-10-12 16:14:06,580][62635] Updated weights for policy 1, policy_version 11620 (0.0007) [2023-10-12 16:14:06,970][62635] Updated weights for policy 1, policy_version 11630 (0.0007) [2023-10-12 16:14:07,332][62635] Updated weights for policy 1, policy_version 11640 (0.0009) [2023-10-12 16:14:07,489][62634] Updated weights for policy 0, policy_version 11620 (0.0009) [2023-10-12 16:14:07,865][62634] Updated weights for policy 0, policy_version 11630 (0.0009) [2023-10-12 16:14:08,248][62634] Updated weights for policy 0, policy_version 11640 (0.0007) [2023-10-12 16:14:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23822336. Throughput: 0: 1698.1, 1: 1693.4. Samples: 5962440. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 16:14:08,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.260')] [2023-10-12 16:14:11,339][62635] Updated weights for policy 1, policy_version 11650 (0.0008) [2023-10-12 16:14:11,711][62635] Updated weights for policy 1, policy_version 11660 (0.0010) [2023-10-12 16:14:12,090][62635] Updated weights for policy 1, policy_version 11670 (0.0009) [2023-10-12 16:14:12,301][62634] Updated weights for policy 0, policy_version 11650 (0.0010) [2023-10-12 16:14:12,453][62635] Updated weights for policy 1, policy_version 11680 (0.0008) [2023-10-12 16:14:12,669][62634] Updated weights for policy 0, policy_version 11660 (0.0007) [2023-10-12 16:14:13,044][62634] Updated weights for policy 0, policy_version 11670 (0.0009) [2023-10-12 16:14:13,425][62634] Updated weights for policy 0, policy_version 11680 (0.0009) [2023-10-12 16:14:13,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 23920640. Throughput: 0: 1698.2, 1: 1672.5. 
Samples: 5982196. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:14:13,435][61643] Avg episode reward: [(0, '3.020'), (1, '3.290')] [2023-10-12 16:14:16,425][62635] Updated weights for policy 1, policy_version 11690 (0.0009) [2023-10-12 16:14:16,803][62635] Updated weights for policy 1, policy_version 11700 (0.0008) [2023-10-12 16:14:17,175][62635] Updated weights for policy 1, policy_version 11710 (0.0008) [2023-10-12 16:14:17,401][62634] Updated weights for policy 0, policy_version 11690 (0.0009) [2023-10-12 16:14:17,779][62634] Updated weights for policy 0, policy_version 11700 (0.0009) [2023-10-12 16:14:18,164][62634] Updated weights for policy 0, policy_version 11710 (0.0009) [2023-10-12 16:14:18,435][61643] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 23986176. Throughput: 0: 1673.1, 1: 1673.1. Samples: 6001534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:14:18,436][61643] Avg episode reward: [(0, '3.010'), (1, '3.270')] [2023-10-12 16:14:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000011712_11993088.pth... [2023-10-12 16:14:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000011712_11993088.pth... 
[2023-10-12 16:14:18,474][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000010144_10387456.pth [2023-10-12 16:14:18,480][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000010144_10387456.pth [2023-10-12 16:14:21,240][62635] Updated weights for policy 1, policy_version 11720 (0.0008) [2023-10-12 16:14:21,605][62635] Updated weights for policy 1, policy_version 11730 (0.0008) [2023-10-12 16:14:21,982][62635] Updated weights for policy 1, policy_version 11740 (0.0008) [2023-10-12 16:14:22,297][62634] Updated weights for policy 0, policy_version 11720 (0.0009) [2023-10-12 16:14:22,673][62634] Updated weights for policy 0, policy_version 11730 (0.0009) [2023-10-12 16:14:23,049][62634] Updated weights for policy 0, policy_version 11740 (0.0010) [2023-10-12 16:14:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 24051712. Throughput: 0: 1693.2, 1: 1686.4. Samples: 6012826. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:14:23,436][61643] Avg episode reward: [(0, '3.050'), (1, '3.320')] [2023-10-12 16:14:23,438][62354] Saving new best policy, reward=3.050! [2023-10-12 16:14:26,077][62635] Updated weights for policy 1, policy_version 11750 (0.0009) [2023-10-12 16:14:26,440][62635] Updated weights for policy 1, policy_version 11760 (0.0012) [2023-10-12 16:14:26,808][62635] Updated weights for policy 1, policy_version 11770 (0.0008) [2023-10-12 16:14:27,267][62634] Updated weights for policy 0, policy_version 11750 (0.0009) [2023-10-12 16:14:27,642][62634] Updated weights for policy 0, policy_version 11760 (0.0009) [2023-10-12 16:14:28,016][62634] Updated weights for policy 0, policy_version 11770 (0.0009) [2023-10-12 16:14:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 24117248. Throughput: 0: 1684.3, 1: 1662.0. Samples: 6032348. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:14:28,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.340')] [2023-10-12 16:14:28,436][62495] Saving new best policy, reward=3.340! [2023-10-12 16:14:30,882][62635] Updated weights for policy 1, policy_version 11780 (0.0009) [2023-10-12 16:14:31,251][62635] Updated weights for policy 1, policy_version 11790 (0.0008) [2023-10-12 16:14:31,623][62635] Updated weights for policy 1, policy_version 11800 (0.0008) [2023-10-12 16:14:32,037][62634] Updated weights for policy 0, policy_version 11780 (0.0008) [2023-10-12 16:14:32,406][62634] Updated weights for policy 0, policy_version 11790 (0.0009) [2023-10-12 16:14:32,779][62634] Updated weights for policy 0, policy_version 11800 (0.0009) [2023-10-12 16:14:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 24182784. Throughput: 0: 1655.9, 1: 1691.9. Samples: 6052068. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:14:33,436][61643] Avg episode reward: [(0, '2.980'), (1, '3.370')] [2023-10-12 16:14:33,450][62495] Saving new best policy, reward=3.370! [2023-10-12 16:14:35,600][62635] Updated weights for policy 1, policy_version 11810 (0.0008) [2023-10-12 16:14:35,973][62635] Updated weights for policy 1, policy_version 11820 (0.0007) [2023-10-12 16:14:36,339][62635] Updated weights for policy 1, policy_version 11830 (0.0007) [2023-10-12 16:14:36,706][62635] Updated weights for policy 1, policy_version 11840 (0.0007) [2023-10-12 16:14:36,810][62634] Updated weights for policy 0, policy_version 11810 (0.0008) [2023-10-12 16:14:37,187][62634] Updated weights for policy 0, policy_version 11820 (0.0011) [2023-10-12 16:14:37,573][62634] Updated weights for policy 0, policy_version 11830 (0.0008) [2023-10-12 16:14:37,957][62634] Updated weights for policy 0, policy_version 11840 (0.0011) [2023-10-12 16:14:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). 
Total num frames: 24248320. Throughput: 0: 1680.3, 1: 1685.4. Samples: 6063268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 16:14:38,436][61643] Avg episode reward: [(0, '2.980'), (1, '3.350')] [2023-10-12 16:14:40,891][62635] Updated weights for policy 1, policy_version 11850 (0.0007) [2023-10-12 16:14:41,265][62635] Updated weights for policy 1, policy_version 11860 (0.0008) [2023-10-12 16:14:41,626][62635] Updated weights for policy 1, policy_version 11870 (0.0009) [2023-10-12 16:14:41,945][62634] Updated weights for policy 0, policy_version 11850 (0.0010) [2023-10-12 16:14:42,316][62634] Updated weights for policy 0, policy_version 11860 (0.0009) [2023-10-12 16:14:42,694][62634] Updated weights for policy 0, policy_version 11870 (0.0007) [2023-10-12 16:14:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24313856. Throughput: 0: 1673.0, 1: 1677.9. Samples: 6083070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 16:14:43,436][61643] Avg episode reward: [(0, '3.030'), (1, '3.320')] [2023-10-12 16:14:45,518][62635] Updated weights for policy 1, policy_version 11880 (0.0008) [2023-10-12 16:14:45,889][62635] Updated weights for policy 1, policy_version 11890 (0.0008) [2023-10-12 16:14:46,260][62635] Updated weights for policy 1, policy_version 11900 (0.0009) [2023-10-12 16:14:46,745][62634] Updated weights for policy 0, policy_version 11880 (0.0009) [2023-10-12 16:14:47,116][62634] Updated weights for policy 0, policy_version 11890 (0.0008) [2023-10-12 16:14:47,493][62634] Updated weights for policy 0, policy_version 11900 (0.0009) [2023-10-12 16:14:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24379392. Throughput: 0: 1664.0, 1: 1700.5. Samples: 6102998. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 16:14:48,436][61643] Avg episode reward: [(0, '3.060'), (1, '3.410')] [2023-10-12 16:14:48,444][62354] Saving new best policy, reward=3.060! [2023-10-12 16:14:48,444][62495] Saving new best policy, reward=3.410! [2023-10-12 16:14:50,364][62635] Updated weights for policy 1, policy_version 11910 (0.0008) [2023-10-12 16:14:50,732][62635] Updated weights for policy 1, policy_version 11920 (0.0007) [2023-10-12 16:14:51,101][62635] Updated weights for policy 1, policy_version 11930 (0.0007) [2023-10-12 16:14:51,387][62634] Updated weights for policy 0, policy_version 11910 (0.0010) [2023-10-12 16:14:51,759][62634] Updated weights for policy 0, policy_version 11920 (0.0008) [2023-10-12 16:14:52,134][62634] Updated weights for policy 0, policy_version 11930 (0.0009) [2023-10-12 16:14:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24444928. Throughput: 0: 1687.2, 1: 1676.0. Samples: 6113788. Policy #0 lag: (min: 1.0, avg: 3.1, max: 31.0) [2023-10-12 16:14:53,435][61643] Avg episode reward: [(0, '3.060'), (1, '3.470')] [2023-10-12 16:14:53,436][62495] Saving new best policy, reward=3.470! [2023-10-12 16:14:55,287][62635] Updated weights for policy 1, policy_version 11940 (0.0008) [2023-10-12 16:14:55,654][62635] Updated weights for policy 1, policy_version 11950 (0.0007) [2023-10-12 16:14:56,026][62635] Updated weights for policy 1, policy_version 11960 (0.0008) [2023-10-12 16:14:56,184][62634] Updated weights for policy 0, policy_version 11940 (0.0009) [2023-10-12 16:14:56,573][62634] Updated weights for policy 0, policy_version 11950 (0.0007) [2023-10-12 16:14:56,955][62634] Updated weights for policy 0, policy_version 11960 (0.0007) [2023-10-12 16:14:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 24510464. Throughput: 0: 1666.3, 1: 1688.0. Samples: 6133140. 
Policy #0 lag: (min: 1.0, avg: 3.1, max: 31.0) [2023-10-12 16:14:58,435][61643] Avg episode reward: [(0, '3.050'), (1, '3.500')] [2023-10-12 16:14:58,436][62495] Saving new best policy, reward=3.500! [2023-10-12 16:15:00,043][62635] Updated weights for policy 1, policy_version 11970 (0.0007) [2023-10-12 16:15:00,454][62635] Updated weights for policy 1, policy_version 11980 (0.0009) [2023-10-12 16:15:00,806][62635] Updated weights for policy 1, policy_version 11990 (0.0009) [2023-10-12 16:15:01,044][62634] Updated weights for policy 0, policy_version 11970 (0.0008) [2023-10-12 16:15:01,181][62635] Updated weights for policy 1, policy_version 12000 (0.0007) [2023-10-12 16:15:01,422][62634] Updated weights for policy 0, policy_version 11980 (0.0008) [2023-10-12 16:15:01,795][62634] Updated weights for policy 0, policy_version 11990 (0.0010) [2023-10-12 16:15:02,175][62634] Updated weights for policy 0, policy_version 12000 (0.0010) [2023-10-12 16:15:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 24576000. Throughput: 0: 1682.5, 1: 1695.4. Samples: 6153536. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 16:15:03,435][61643] Avg episode reward: [(0, '3.000'), (1, '3.530')] [2023-10-12 16:15:03,443][62495] Saving new best policy, reward=3.530! [2023-10-12 16:15:05,067][62635] Updated weights for policy 1, policy_version 12010 (0.0011) [2023-10-12 16:15:05,434][62635] Updated weights for policy 1, policy_version 12020 (0.0010) [2023-10-12 16:15:05,794][62635] Updated weights for policy 1, policy_version 12030 (0.0009) [2023-10-12 16:15:06,236][62634] Updated weights for policy 0, policy_version 12010 (0.0008) [2023-10-12 16:15:06,611][62634] Updated weights for policy 0, policy_version 12020 (0.0011) [2023-10-12 16:15:07,000][62634] Updated weights for policy 0, policy_version 12030 (0.0010) [2023-10-12 16:15:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24641536. 
Throughput: 0: 1693.7, 1: 1665.7. Samples: 6164002. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 16:15:08,435][61643] Avg episode reward: [(0, '3.050'), (1, '3.550')] [2023-10-12 16:15:08,436][62495] Saving new best policy, reward=3.550! [2023-10-12 16:15:09,970][62635] Updated weights for policy 1, policy_version 12040 (0.0008) [2023-10-12 16:15:10,334][62635] Updated weights for policy 1, policy_version 12050 (0.0009) [2023-10-12 16:15:10,699][62635] Updated weights for policy 1, policy_version 12060 (0.0007) [2023-10-12 16:15:11,008][62634] Updated weights for policy 0, policy_version 12040 (0.0008) [2023-10-12 16:15:11,384][62634] Updated weights for policy 0, policy_version 12050 (0.0008) [2023-10-12 16:15:11,771][62634] Updated weights for policy 0, policy_version 12060 (0.0009) [2023-10-12 16:15:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 24707072. Throughput: 0: 1665.7, 1: 1692.5. Samples: 6183468. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 16:15:13,435][61643] Avg episode reward: [(0, '3.030'), (1, '3.520')] [2023-10-12 16:15:14,572][62635] Updated weights for policy 1, policy_version 12070 (0.0008) [2023-10-12 16:15:14,941][62635] Updated weights for policy 1, policy_version 12080 (0.0011) [2023-10-12 16:15:15,310][62635] Updated weights for policy 1, policy_version 12090 (0.0010) [2023-10-12 16:15:15,720][62634] Updated weights for policy 0, policy_version 12070 (0.0009) [2023-10-12 16:15:16,106][62634] Updated weights for policy 0, policy_version 12080 (0.0007) [2023-10-12 16:15:16,483][62634] Updated weights for policy 0, policy_version 12090 (0.0007) [2023-10-12 16:15:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 24772608. Throughput: 0: 1689.9, 1: 1694.0. Samples: 6204344. 
Policy #0 lag: (min: 22.0, avg: 23.7, max: 50.0) [2023-10-12 16:15:18,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.530')] [2023-10-12 16:15:19,435][62635] Updated weights for policy 1, policy_version 12100 (0.0007) [2023-10-12 16:15:19,800][62635] Updated weights for policy 1, policy_version 12110 (0.0009) [2023-10-12 16:15:20,171][62635] Updated weights for policy 1, policy_version 12120 (0.0009) [2023-10-12 16:15:20,510][62634] Updated weights for policy 0, policy_version 12100 (0.0007) [2023-10-12 16:15:20,889][62634] Updated weights for policy 0, policy_version 12110 (0.0007) [2023-10-12 16:15:21,274][62634] Updated weights for policy 0, policy_version 12120 (0.0009) [2023-10-12 16:15:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 24838144. Throughput: 0: 1679.4, 1: 1674.8. Samples: 6214204. Policy #0 lag: (min: 22.0, avg: 23.7, max: 50.0) [2023-10-12 16:15:23,435][61643] Avg episode reward: [(0, '3.010'), (1, '3.570')] [2023-10-12 16:15:23,436][62495] Saving new best policy, reward=3.570! [2023-10-12 16:15:24,114][62635] Updated weights for policy 1, policy_version 12130 (0.0008) [2023-10-12 16:15:24,473][62635] Updated weights for policy 1, policy_version 12140 (0.0008) [2023-10-12 16:15:24,838][62635] Updated weights for policy 1, policy_version 12150 (0.0008) [2023-10-12 16:15:25,217][62635] Updated weights for policy 1, policy_version 12160 (0.0009) [2023-10-12 16:15:25,320][62634] Updated weights for policy 0, policy_version 12130 (0.0009) [2023-10-12 16:15:25,692][62634] Updated weights for policy 0, policy_version 12140 (0.0008) [2023-10-12 16:15:26,069][62634] Updated weights for policy 0, policy_version 12150 (0.0009) [2023-10-12 16:15:26,448][62634] Updated weights for policy 0, policy_version 12160 (0.0009) [2023-10-12 16:15:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 24903680. Throughput: 0: 1665.8, 1: 1691.8. Samples: 6234162. 
Policy #0 lag: (min: 22.0, avg: 23.7, max: 50.0) [2023-10-12 16:15:28,435][61643] Avg episode reward: [(0, '3.040'), (1, '3.670')] [2023-10-12 16:15:28,437][62495] Saving new best policy, reward=3.670! [2023-10-12 16:15:29,230][62635] Updated weights for policy 1, policy_version 12170 (0.0008) [2023-10-12 16:15:29,599][62635] Updated weights for policy 1, policy_version 12180 (0.0009) [2023-10-12 16:15:29,965][62635] Updated weights for policy 1, policy_version 12190 (0.0009) [2023-10-12 16:15:30,603][62634] Updated weights for policy 0, policy_version 12170 (0.0011) [2023-10-12 16:15:30,980][62634] Updated weights for policy 0, policy_version 12180 (0.0008) [2023-10-12 16:15:31,362][62634] Updated weights for policy 0, policy_version 12190 (0.0007) [2023-10-12 16:15:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 24969216. Throughput: 0: 1687.6, 1: 1691.2. Samples: 6255040. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 16:15:33,436][61643] Avg episode reward: [(0, '3.120'), (1, '3.760')] [2023-10-12 16:15:33,445][62354] Saving new best policy, reward=3.120! [2023-10-12 16:15:33,446][62495] Saving new best policy, reward=3.760! [2023-10-12 16:15:34,066][62635] Updated weights for policy 1, policy_version 12200 (0.0007) [2023-10-12 16:15:34,439][62635] Updated weights for policy 1, policy_version 12210 (0.0008) [2023-10-12 16:15:34,804][62635] Updated weights for policy 1, policy_version 12220 (0.0011) [2023-10-12 16:15:35,468][62634] Updated weights for policy 0, policy_version 12200 (0.0008) [2023-10-12 16:15:35,846][62634] Updated weights for policy 0, policy_version 12210 (0.0007) [2023-10-12 16:15:36,217][62634] Updated weights for policy 0, policy_version 12220 (0.0007) [2023-10-12 16:15:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 25034752. Throughput: 0: 1664.2, 1: 1686.2. Samples: 6264554. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 16:15:38,435][61643] Avg episode reward: [(0, '3.070'), (1, '3.830')] [2023-10-12 16:15:38,436][62495] Saving new best policy, reward=3.830! [2023-10-12 16:15:38,842][62635] Updated weights for policy 1, policy_version 12230 (0.0010) [2023-10-12 16:15:39,200][62635] Updated weights for policy 1, policy_version 12240 (0.0010) [2023-10-12 16:15:39,572][62635] Updated weights for policy 1, policy_version 12250 (0.0009) [2023-10-12 16:15:40,397][62634] Updated weights for policy 0, policy_version 12230 (0.0009) [2023-10-12 16:15:40,775][62634] Updated weights for policy 0, policy_version 12240 (0.0010) [2023-10-12 16:15:41,163][62634] Updated weights for policy 0, policy_version 12250 (0.0010) [2023-10-12 16:15:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25100288. Throughput: 0: 1676.5, 1: 1699.6. Samples: 6285066. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 16:15:43,436][61643] Avg episode reward: [(0, '3.050'), (1, '3.880')] [2023-10-12 16:15:43,486][62635] Updated weights for policy 1, policy_version 12260 (0.0009) [2023-10-12 16:15:43,855][62635] Updated weights for policy 1, policy_version 12270 (0.0007) [2023-10-12 16:15:44,225][62635] Updated weights for policy 1, policy_version 12280 (0.0007) [2023-10-12 16:15:44,518][62495] Saving new best policy, reward=3.880! [2023-10-12 16:15:45,157][62634] Updated weights for policy 0, policy_version 12260 (0.0010) [2023-10-12 16:15:45,528][62634] Updated weights for policy 0, policy_version 12270 (0.0009) [2023-10-12 16:15:45,897][62634] Updated weights for policy 0, policy_version 12280 (0.0010) [2023-10-12 16:15:48,296][62635] Updated weights for policy 1, policy_version 12290 (0.0007) [2023-10-12 16:15:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25165824. Throughput: 0: 1682.9, 1: 1702.4. Samples: 6305872. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:15:48,436][61643] Avg episode reward: [(0, '2.990'), (1, '3.890')] [2023-10-12 16:15:48,685][62635] Updated weights for policy 1, policy_version 12300 (0.0007) [2023-10-12 16:15:49,056][62635] Updated weights for policy 1, policy_version 12310 (0.0008) [2023-10-12 16:15:49,418][62495] Saving new best policy, reward=3.890! [2023-10-12 16:15:49,424][62635] Updated weights for policy 1, policy_version 12320 (0.0008) [2023-10-12 16:15:49,786][62634] Updated weights for policy 0, policy_version 12290 (0.0008) [2023-10-12 16:15:50,168][62634] Updated weights for policy 0, policy_version 12300 (0.0007) [2023-10-12 16:15:50,546][62634] Updated weights for policy 0, policy_version 12310 (0.0009) [2023-10-12 16:15:50,933][62634] Updated weights for policy 0, policy_version 12320 (0.0008) [2023-10-12 16:15:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25231360. Throughput: 0: 1652.6, 1: 1700.0. Samples: 6314866. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:15:53,435][61643] Avg episode reward: [(0, '3.090'), (1, '3.840')] [2023-10-12 16:15:53,568][62635] Updated weights for policy 1, policy_version 12330 (0.0009) [2023-10-12 16:15:53,937][62635] Updated weights for policy 1, policy_version 12340 (0.0009) [2023-10-12 16:15:54,318][62635] Updated weights for policy 1, policy_version 12350 (0.0009) [2023-10-12 16:15:55,127][62634] Updated weights for policy 0, policy_version 12330 (0.0009) [2023-10-12 16:15:55,496][62634] Updated weights for policy 0, policy_version 12340 (0.0009) [2023-10-12 16:15:55,876][62634] Updated weights for policy 0, policy_version 12350 (0.0009) [2023-10-12 16:15:58,311][62635] Updated weights for policy 1, policy_version 12360 (0.0009) [2023-10-12 16:15:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25296896. Throughput: 0: 1683.9, 1: 1699.2. Samples: 6335708. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:15:58,435][61643] Avg episode reward: [(0, '3.090'), (1, '3.860')] [2023-10-12 16:15:58,686][62635] Updated weights for policy 1, policy_version 12370 (0.0009) [2023-10-12 16:15:59,045][62635] Updated weights for policy 1, policy_version 12380 (0.0009) [2023-10-12 16:15:59,825][62634] Updated weights for policy 0, policy_version 12360 (0.0008) [2023-10-12 16:16:00,198][62634] Updated weights for policy 0, policy_version 12370 (0.0008) [2023-10-12 16:16:00,566][62634] Updated weights for policy 0, policy_version 12380 (0.0007) [2023-10-12 16:16:03,134][62635] Updated weights for policy 1, policy_version 12390 (0.0009) [2023-10-12 16:16:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25362432. Throughput: 0: 1691.9, 1: 1693.8. Samples: 6356698. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:16:03,435][61643] Avg episode reward: [(0, '3.080'), (1, '3.900')] [2023-10-12 16:16:03,503][62635] Updated weights for policy 1, policy_version 12400 (0.0008) [2023-10-12 16:16:03,867][62635] Updated weights for policy 1, policy_version 12410 (0.0007) [2023-10-12 16:16:04,090][62495] Saving new best policy, reward=3.900! [2023-10-12 16:16:04,398][62634] Updated weights for policy 0, policy_version 12390 (0.0010) [2023-10-12 16:16:04,776][62634] Updated weights for policy 0, policy_version 12400 (0.0009) [2023-10-12 16:16:05,143][62634] Updated weights for policy 0, policy_version 12410 (0.0009) [2023-10-12 16:16:07,943][62635] Updated weights for policy 1, policy_version 12420 (0.0009) [2023-10-12 16:16:08,311][62635] Updated weights for policy 1, policy_version 12430 (0.0008) [2023-10-12 16:16:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25427968. Throughput: 0: 1672.7, 1: 1699.9. Samples: 6365970. 
Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:16:08,435][61643] Avg episode reward: [(0, '3.070'), (1, '4.070')] [2023-10-12 16:16:08,672][62635] Updated weights for policy 1, policy_version 12440 (0.0007) [2023-10-12 16:16:08,964][62495] Saving new best policy, reward=4.070! [2023-10-12 16:16:09,216][62634] Updated weights for policy 0, policy_version 12420 (0.0009) [2023-10-12 16:16:09,599][62634] Updated weights for policy 0, policy_version 12430 (0.0008) [2023-10-12 16:16:09,972][62634] Updated weights for policy 0, policy_version 12440 (0.0007) [2023-10-12 16:16:12,810][62635] Updated weights for policy 1, policy_version 12450 (0.0008) [2023-10-12 16:16:13,186][62635] Updated weights for policy 1, policy_version 12460 (0.0007) [2023-10-12 16:16:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25493504. Throughput: 0: 1693.4, 1: 1698.6. Samples: 6386800. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:16:13,436][61643] Avg episode reward: [(0, '3.130'), (1, '4.220')] [2023-10-12 16:16:13,437][62354] Saving new best policy, reward=3.130! [2023-10-12 16:16:13,559][62635] Updated weights for policy 1, policy_version 12470 (0.0010) [2023-10-12 16:16:13,929][62635] Updated weights for policy 1, policy_version 12480 (0.0007) [2023-10-12 16:16:13,929][62495] Saving new best policy, reward=4.220! 
[2023-10-12 16:16:14,013][62634] Updated weights for policy 0, policy_version 12450 (0.0008) [2023-10-12 16:16:14,394][62634] Updated weights for policy 0, policy_version 12460 (0.0008) [2023-10-12 16:16:14,773][62634] Updated weights for policy 0, policy_version 12470 (0.0007) [2023-10-12 16:16:15,154][62634] Updated weights for policy 0, policy_version 12480 (0.0007) [2023-10-12 16:16:17,952][62635] Updated weights for policy 1, policy_version 12490 (0.0009) [2023-10-12 16:16:18,313][62635] Updated weights for policy 1, policy_version 12500 (0.0010) [2023-10-12 16:16:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 25559040. Throughput: 0: 1690.7, 1: 1688.4. Samples: 6407098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:16:18,435][61643] Avg episode reward: [(0, '3.180'), (1, '4.310')] [2023-10-12 16:16:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000012480_12779520.pth... [2023-10-12 16:16:18,480][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000010912_11173888.pth [2023-10-12 16:16:18,484][62354] Saving new best policy, reward=3.180! [2023-10-12 16:16:18,688][62635] Updated weights for policy 1, policy_version 12510 (0.0010) [2023-10-12 16:16:18,760][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000012512_12812288.pth... [2023-10-12 16:16:18,800][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000010912_11173888.pth [2023-10-12 16:16:18,810][62495] Saving new best policy, reward=4.310! 
[2023-10-12 16:16:19,295][62634] Updated weights for policy 0, policy_version 12490 (0.0007) [2023-10-12 16:16:19,669][62634] Updated weights for policy 0, policy_version 12500 (0.0007) [2023-10-12 16:16:20,045][62634] Updated weights for policy 0, policy_version 12510 (0.0007) [2023-10-12 16:16:22,784][62635] Updated weights for policy 1, policy_version 12520 (0.0007) [2023-10-12 16:16:23,159][62635] Updated weights for policy 1, policy_version 12530 (0.0009) [2023-10-12 16:16:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25624576. Throughput: 0: 1679.1, 1: 1698.7. Samples: 6416554. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:16:23,435][61643] Avg episode reward: [(0, '3.230'), (1, '4.400')] [2023-10-12 16:16:23,436][62354] Saving new best policy, reward=3.230! [2023-10-12 16:16:23,517][62635] Updated weights for policy 1, policy_version 12540 (0.0007) [2023-10-12 16:16:23,659][62495] Saving new best policy, reward=4.400! [2023-10-12 16:16:24,108][62634] Updated weights for policy 0, policy_version 12520 (0.0009) [2023-10-12 16:16:24,475][62634] Updated weights for policy 0, policy_version 12530 (0.0010) [2023-10-12 16:16:24,852][62634] Updated weights for policy 0, policy_version 12540 (0.0010) [2023-10-12 16:16:27,606][62635] Updated weights for policy 1, policy_version 12550 (0.0008) [2023-10-12 16:16:27,974][62635] Updated weights for policy 1, policy_version 12560 (0.0007) [2023-10-12 16:16:28,342][62635] Updated weights for policy 1, policy_version 12570 (0.0009) [2023-10-12 16:16:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25690112. Throughput: 0: 1692.6, 1: 1695.2. Samples: 6437518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:16:28,435][61643] Avg episode reward: [(0, '3.240'), (1, '4.560')] [2023-10-12 16:16:28,436][62354] Saving new best policy, reward=3.240! 
[2023-10-12 16:16:28,559][62495] Saving new best policy, reward=4.560! [2023-10-12 16:16:28,841][62634] Updated weights for policy 0, policy_version 12550 (0.0011) [2023-10-12 16:16:29,228][62634] Updated weights for policy 0, policy_version 12560 (0.0007) [2023-10-12 16:16:29,603][62634] Updated weights for policy 0, policy_version 12570 (0.0008) [2023-10-12 16:16:32,168][62635] Updated weights for policy 1, policy_version 12580 (0.0008) [2023-10-12 16:16:32,539][62635] Updated weights for policy 1, policy_version 12590 (0.0008) [2023-10-12 16:16:32,913][62635] Updated weights for policy 1, policy_version 12600 (0.0008) [2023-10-12 16:16:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 25788416. Throughput: 0: 1694.3, 1: 1673.9. Samples: 6457440. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 16:16:33,435][61643] Avg episode reward: [(0, '3.240'), (1, '4.720')] [2023-10-12 16:16:33,445][62495] Saving new best policy, reward=4.720! [2023-10-12 16:16:33,635][62634] Updated weights for policy 0, policy_version 12580 (0.0008) [2023-10-12 16:16:34,011][62634] Updated weights for policy 0, policy_version 12590 (0.0011) [2023-10-12 16:16:34,389][62634] Updated weights for policy 0, policy_version 12600 (0.0011) [2023-10-12 16:16:37,076][62635] Updated weights for policy 1, policy_version 12610 (0.0008) [2023-10-12 16:16:37,507][62635] Updated weights for policy 1, policy_version 12620 (0.0009) [2023-10-12 16:16:37,865][62635] Updated weights for policy 1, policy_version 12630 (0.0008) [2023-10-12 16:16:38,231][62635] Updated weights for policy 1, policy_version 12640 (0.0007) [2023-10-12 16:16:38,324][62634] Updated weights for policy 0, policy_version 12610 (0.0009) [2023-10-12 16:16:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 25853952. Throughput: 0: 1689.4, 1: 1702.5. Samples: 6467504. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 16:16:38,435][61643] Avg episode reward: [(0, '3.230'), (1, '4.910')] [2023-10-12 16:16:38,436][62495] Saving new best policy, reward=4.910! [2023-10-12 16:16:38,704][62634] Updated weights for policy 0, policy_version 12620 (0.0007) [2023-10-12 16:16:39,084][62634] Updated weights for policy 0, policy_version 12630 (0.0009) [2023-10-12 16:16:39,468][62634] Updated weights for policy 0, policy_version 12640 (0.0007) [2023-10-12 16:16:42,309][62635] Updated weights for policy 1, policy_version 12650 (0.0009) [2023-10-12 16:16:42,687][62635] Updated weights for policy 1, policy_version 12660 (0.0009) [2023-10-12 16:16:43,050][62635] Updated weights for policy 1, policy_version 12670 (0.0008) [2023-10-12 16:16:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 25919488. Throughput: 0: 1689.6, 1: 1691.6. Samples: 6487862. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 16:16:43,435][61643] Avg episode reward: [(0, '3.180'), (1, '5.110')] [2023-10-12 16:16:43,436][62495] Saving new best policy, reward=5.110! [2023-10-12 16:16:43,517][62634] Updated weights for policy 0, policy_version 12650 (0.0009) [2023-10-12 16:16:43,882][62634] Updated weights for policy 0, policy_version 12660 (0.0009) [2023-10-12 16:16:44,268][62634] Updated weights for policy 0, policy_version 12670 (0.0009) [2023-10-12 16:16:47,032][62635] Updated weights for policy 1, policy_version 12680 (0.0009) [2023-10-12 16:16:47,402][62635] Updated weights for policy 1, policy_version 12690 (0.0009) [2023-10-12 16:16:47,764][62635] Updated weights for policy 1, policy_version 12700 (0.0007) [2023-10-12 16:16:48,250][62634] Updated weights for policy 0, policy_version 12680 (0.0009) [2023-10-12 16:16:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 25985024. Throughput: 0: 1689.1, 1: 1667.4. Samples: 6507742. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 16:16:48,435][61643] Avg episode reward: [(0, '3.180'), (1, '5.270')] [2023-10-12 16:16:48,444][62495] Saving new best policy, reward=5.270! [2023-10-12 16:16:48,631][62634] Updated weights for policy 0, policy_version 12690 (0.0010) [2023-10-12 16:16:49,016][62634] Updated weights for policy 0, policy_version 12700 (0.0008) [2023-10-12 16:16:51,622][62635] Updated weights for policy 1, policy_version 12710 (0.0007) [2023-10-12 16:16:51,984][62635] Updated weights for policy 1, policy_version 12720 (0.0007) [2023-10-12 16:16:52,355][62635] Updated weights for policy 1, policy_version 12730 (0.0008) [2023-10-12 16:16:53,079][62634] Updated weights for policy 0, policy_version 12710 (0.0008) [2023-10-12 16:16:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26050560. Throughput: 0: 1690.6, 1: 1693.6. Samples: 6518258. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 16:16:53,435][61643] Avg episode reward: [(0, '3.220'), (1, '5.520')] [2023-10-12 16:16:53,436][62495] Saving new best policy, reward=5.520! [2023-10-12 16:16:53,456][62634] Updated weights for policy 0, policy_version 12720 (0.0010) [2023-10-12 16:16:53,840][62634] Updated weights for policy 0, policy_version 12730 (0.0010) [2023-10-12 16:16:56,377][62635] Updated weights for policy 1, policy_version 12740 (0.0009) [2023-10-12 16:16:56,746][62635] Updated weights for policy 1, policy_version 12750 (0.0011) [2023-10-12 16:16:57,114][62635] Updated weights for policy 1, policy_version 12760 (0.0009) [2023-10-12 16:16:57,880][62634] Updated weights for policy 0, policy_version 12740 (0.0009) [2023-10-12 16:16:58,266][62634] Updated weights for policy 0, policy_version 12750 (0.0008) [2023-10-12 16:16:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26116096. Throughput: 0: 1693.2, 1: 1673.1. Samples: 6538284. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 16:16:58,435][61643] Avg episode reward: [(0, '3.230'), (1, '5.680')] [2023-10-12 16:16:58,436][62495] Saving new best policy, reward=5.680! [2023-10-12 16:16:58,648][62634] Updated weights for policy 0, policy_version 12760 (0.0010) [2023-10-12 16:17:01,248][62635] Updated weights for policy 1, policy_version 12770 (0.0010) [2023-10-12 16:17:01,621][62635] Updated weights for policy 1, policy_version 12780 (0.0008) [2023-10-12 16:17:01,997][62635] Updated weights for policy 1, policy_version 12790 (0.0008) [2023-10-12 16:17:02,359][62635] Updated weights for policy 1, policy_version 12800 (0.0008) [2023-10-12 16:17:02,538][62634] Updated weights for policy 0, policy_version 12770 (0.0010) [2023-10-12 16:17:02,922][62634] Updated weights for policy 0, policy_version 12780 (0.0007) [2023-10-12 16:17:03,309][62634] Updated weights for policy 0, policy_version 12790 (0.0007) [2023-10-12 16:17:03,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 26181632. Throughput: 0: 1686.9, 1: 1666.6. Samples: 6558006. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 16:17:03,436][61643] Avg episode reward: [(0, '3.210'), (1, '5.870')] [2023-10-12 16:17:03,444][62495] Saving new best policy, reward=5.870! 
[2023-10-12 16:17:03,693][62634] Updated weights for policy 0, policy_version 12800 (0.0009) [2023-10-12 16:17:06,535][62635] Updated weights for policy 1, policy_version 12810 (0.0009) [2023-10-12 16:17:06,909][62635] Updated weights for policy 1, policy_version 12820 (0.0008) [2023-10-12 16:17:07,276][62635] Updated weights for policy 1, policy_version 12830 (0.0008) [2023-10-12 16:17:07,603][62634] Updated weights for policy 0, policy_version 12810 (0.0009) [2023-10-12 16:17:07,982][62634] Updated weights for policy 0, policy_version 12820 (0.0008) [2023-10-12 16:17:08,350][62634] Updated weights for policy 0, policy_version 12830 (0.0009) [2023-10-12 16:17:08,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26279936. Throughput: 0: 1706.3, 1: 1683.1. Samples: 6569076. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 16:17:08,435][61643] Avg episode reward: [(0, '3.190'), (1, '5.610')] [2023-10-12 16:17:11,413][62635] Updated weights for policy 1, policy_version 12840 (0.0009) [2023-10-12 16:17:11,773][62635] Updated weights for policy 1, policy_version 12850 (0.0009) [2023-10-12 16:17:12,144][62635] Updated weights for policy 1, policy_version 12860 (0.0008) [2023-10-12 16:17:12,325][62634] Updated weights for policy 0, policy_version 12840 (0.0009) [2023-10-12 16:17:12,703][62634] Updated weights for policy 0, policy_version 12850 (0.0008) [2023-10-12 16:17:13,083][62634] Updated weights for policy 0, policy_version 12860 (0.0009) [2023-10-12 16:17:13,435][61643] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26345472. Throughput: 0: 1703.2, 1: 1662.6. Samples: 6588978. 
Policy #0 lag: (min: 24.0, avg: 43.2, max: 56.0) [2023-10-12 16:17:13,436][61643] Avg episode reward: [(0, '3.180'), (1, '5.770')] [2023-10-12 16:17:16,241][62635] Updated weights for policy 1, policy_version 12870 (0.0009) [2023-10-12 16:17:16,604][62635] Updated weights for policy 1, policy_version 12880 (0.0008) [2023-10-12 16:17:16,973][62635] Updated weights for policy 1, policy_version 12890 (0.0007) [2023-10-12 16:17:17,118][62634] Updated weights for policy 0, policy_version 12870 (0.0008) [2023-10-12 16:17:17,498][62634] Updated weights for policy 0, policy_version 12880 (0.0008) [2023-10-12 16:17:17,883][62634] Updated weights for policy 0, policy_version 12890 (0.0007) [2023-10-12 16:17:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26411008. Throughput: 0: 1676.5, 1: 1676.7. Samples: 6608334. Policy #0 lag: (min: 24.0, avg: 43.2, max: 56.0) [2023-10-12 16:17:18,435][61643] Avg episode reward: [(0, '3.210'), (1, '5.660')] [2023-10-12 16:17:21,173][62635] Updated weights for policy 1, policy_version 12900 (0.0008) [2023-10-12 16:17:21,544][62635] Updated weights for policy 1, policy_version 12910 (0.0008) [2023-10-12 16:17:21,917][62635] Updated weights for policy 1, policy_version 12920 (0.0007) [2023-10-12 16:17:21,996][62634] Updated weights for policy 0, policy_version 12900 (0.0008) [2023-10-12 16:17:22,371][62634] Updated weights for policy 0, policy_version 12910 (0.0009) [2023-10-12 16:17:22,748][62634] Updated weights for policy 0, policy_version 12920 (0.0008) [2023-10-12 16:17:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26476544. Throughput: 0: 1703.0, 1: 1679.1. Samples: 6619700. 
Policy #0 lag: (min: 24.0, avg: 43.2, max: 56.0) [2023-10-12 16:17:23,435][61643] Avg episode reward: [(0, '3.220'), (1, '5.770')] [2023-10-12 16:17:26,000][62635] Updated weights for policy 1, policy_version 12930 (0.0009) [2023-10-12 16:17:26,420][62635] Updated weights for policy 1, policy_version 12940 (0.0010) [2023-10-12 16:17:26,684][62634] Updated weights for policy 0, policy_version 12930 (0.0008) [2023-10-12 16:17:26,782][62635] Updated weights for policy 1, policy_version 12950 (0.0010) [2023-10-12 16:17:27,051][62634] Updated weights for policy 0, policy_version 12940 (0.0009) [2023-10-12 16:17:27,142][62635] Updated weights for policy 1, policy_version 12960 (0.0008) [2023-10-12 16:17:27,425][62634] Updated weights for policy 0, policy_version 12950 (0.0009) [2023-10-12 16:17:27,795][62634] Updated weights for policy 0, policy_version 12960 (0.0008) [2023-10-12 16:17:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 26542080. Throughput: 0: 1699.1, 1: 1660.8. Samples: 6639060. Policy #0 lag: (min: 1.0, avg: 7.3, max: 33.0) [2023-10-12 16:17:28,435][61643] Avg episode reward: [(0, '3.220'), (1, '5.510')] [2023-10-12 16:17:31,360][62635] Updated weights for policy 1, policy_version 12970 (0.0010) [2023-10-12 16:17:31,721][62635] Updated weights for policy 1, policy_version 12980 (0.0008) [2023-10-12 16:17:31,870][62634] Updated weights for policy 0, policy_version 12970 (0.0007) [2023-10-12 16:17:32,096][62635] Updated weights for policy 1, policy_version 12990 (0.0008) [2023-10-12 16:17:32,255][62634] Updated weights for policy 0, policy_version 12980 (0.0010) [2023-10-12 16:17:32,626][62634] Updated weights for policy 0, policy_version 12990 (0.0007) [2023-10-12 16:17:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 26607616. Throughput: 0: 1673.7, 1: 1680.2. Samples: 6658668. 
Policy #0 lag: (min: 1.0, avg: 7.3, max: 33.0) [2023-10-12 16:17:33,436][61643] Avg episode reward: [(0, '3.280'), (1, '5.930')] [2023-10-12 16:17:33,444][62354] Saving new best policy, reward=3.280! [2023-10-12 16:17:33,444][62495] Saving new best policy, reward=5.930! [2023-10-12 16:17:36,092][62635] Updated weights for policy 1, policy_version 13000 (0.0010) [2023-10-12 16:17:36,469][62635] Updated weights for policy 1, policy_version 13010 (0.0010) [2023-10-12 16:17:36,811][62634] Updated weights for policy 0, policy_version 13000 (0.0007) [2023-10-12 16:17:36,831][62635] Updated weights for policy 1, policy_version 13020 (0.0007) [2023-10-12 16:17:37,184][62634] Updated weights for policy 0, policy_version 13010 (0.0007) [2023-10-12 16:17:37,560][62634] Updated weights for policy 0, policy_version 13020 (0.0007) [2023-10-12 16:17:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 26673152. Throughput: 0: 1700.7, 1: 1671.9. Samples: 6670026. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:17:38,436][61643] Avg episode reward: [(0, '3.240'), (1, '5.790')] [2023-10-12 16:17:40,888][62635] Updated weights for policy 1, policy_version 13030 (0.0007) [2023-10-12 16:17:41,260][62635] Updated weights for policy 1, policy_version 13040 (0.0008) [2023-10-12 16:17:41,629][62635] Updated weights for policy 1, policy_version 13050 (0.0008) [2023-10-12 16:17:41,679][62634] Updated weights for policy 0, policy_version 13030 (0.0008) [2023-10-12 16:17:42,051][62634] Updated weights for policy 0, policy_version 13040 (0.0009) [2023-10-12 16:17:42,436][62634] Updated weights for policy 0, policy_version 13050 (0.0009) [2023-10-12 16:17:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 26738688. Throughput: 0: 1688.0, 1: 1665.5. Samples: 6689192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:17:43,436][61643] Avg episode reward: [(0, '3.230'), (1, '5.830')] [2023-10-12 16:17:45,581][62635] Updated weights for policy 1, policy_version 13060 (0.0008) [2023-10-12 16:17:45,965][62635] Updated weights for policy 1, policy_version 13070 (0.0008) [2023-10-12 16:17:46,323][62635] Updated weights for policy 1, policy_version 13080 (0.0010) [2023-10-12 16:17:46,372][62634] Updated weights for policy 0, policy_version 13060 (0.0008) [2023-10-12 16:17:46,742][62634] Updated weights for policy 0, policy_version 13070 (0.0009) [2023-10-12 16:17:47,118][62634] Updated weights for policy 0, policy_version 13080 (0.0010) [2023-10-12 16:17:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 26804224. Throughput: 0: 1679.0, 1: 1682.1. Samples: 6709256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:17:48,436][61643] Avg episode reward: [(0, '3.200'), (1, '5.810')] [2023-10-12 16:17:50,326][62635] Updated weights for policy 1, policy_version 13090 (0.0008) [2023-10-12 16:17:50,698][62635] Updated weights for policy 1, policy_version 13100 (0.0008) [2023-10-12 16:17:51,073][62635] Updated weights for policy 1, policy_version 13110 (0.0007) [2023-10-12 16:17:51,263][62634] Updated weights for policy 0, policy_version 13090 (0.0009) [2023-10-12 16:17:51,445][62635] Updated weights for policy 1, policy_version 13120 (0.0008) [2023-10-12 16:17:51,641][62634] Updated weights for policy 0, policy_version 13100 (0.0007) [2023-10-12 16:17:52,017][62634] Updated weights for policy 0, policy_version 13110 (0.0008) [2023-10-12 16:17:52,402][62634] Updated weights for policy 0, policy_version 13120 (0.0008) [2023-10-12 16:17:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 26869760. Throughput: 0: 1688.9, 1: 1666.7. Samples: 6720078. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-12 16:17:53,435][61643] Avg episode reward: [(0, '3.250'), (1, '5.870')] [2023-10-12 16:17:55,592][62635] Updated weights for policy 1, policy_version 13130 (0.0009) [2023-10-12 16:17:55,960][62635] Updated weights for policy 1, policy_version 13140 (0.0009) [2023-10-12 16:17:56,325][62635] Updated weights for policy 1, policy_version 13150 (0.0008) [2023-10-12 16:17:56,356][62634] Updated weights for policy 0, policy_version 13130 (0.0008) [2023-10-12 16:17:56,733][62634] Updated weights for policy 0, policy_version 13140 (0.0008) [2023-10-12 16:17:57,102][62634] Updated weights for policy 0, policy_version 13150 (0.0008) [2023-10-12 16:17:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 26935296. Throughput: 0: 1661.1, 1: 1666.6. Samples: 6738726. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-12 16:17:58,435][61643] Avg episode reward: [(0, '3.250'), (1, '5.880')] [2023-10-12 16:18:00,521][62635] Updated weights for policy 1, policy_version 13160 (0.0007) [2023-10-12 16:18:00,891][62635] Updated weights for policy 1, policy_version 13170 (0.0007) [2023-10-12 16:18:01,251][62634] Updated weights for policy 0, policy_version 13160 (0.0007) [2023-10-12 16:18:01,259][62635] Updated weights for policy 1, policy_version 13180 (0.0007) [2023-10-12 16:18:01,628][62634] Updated weights for policy 0, policy_version 13170 (0.0007) [2023-10-12 16:18:02,008][62634] Updated weights for policy 0, policy_version 13180 (0.0008) [2023-10-12 16:18:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 27000832. Throughput: 0: 1678.8, 1: 1672.6. Samples: 6759148. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-12 16:18:03,436][61643] Avg episode reward: [(0, '3.200'), (1, '5.920')] [2023-10-12 16:18:05,403][62635] Updated weights for policy 1, policy_version 13190 (0.0009) [2023-10-12 16:18:05,777][62635] Updated weights for policy 1, policy_version 13200 (0.0008) [2023-10-12 16:18:06,029][62634] Updated weights for policy 0, policy_version 13190 (0.0009) [2023-10-12 16:18:06,148][62635] Updated weights for policy 1, policy_version 13210 (0.0007) [2023-10-12 16:18:06,398][62634] Updated weights for policy 0, policy_version 13200 (0.0008) [2023-10-12 16:18:06,781][62634] Updated weights for policy 0, policy_version 13210 (0.0010) [2023-10-12 16:18:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 27066368. Throughput: 0: 1682.0, 1: 1650.3. Samples: 6769652. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:18:08,436][61643] Avg episode reward: [(0, '3.230'), (1, '5.860')] [2023-10-12 16:18:10,232][62635] Updated weights for policy 1, policy_version 13220 (0.0009) [2023-10-12 16:18:10,599][62635] Updated weights for policy 1, policy_version 13230 (0.0011) [2023-10-12 16:18:10,777][62634] Updated weights for policy 0, policy_version 13220 (0.0008) [2023-10-12 16:18:10,962][62635] Updated weights for policy 1, policy_version 13240 (0.0007) [2023-10-12 16:18:11,154][62634] Updated weights for policy 0, policy_version 13230 (0.0010) [2023-10-12 16:18:11,526][62634] Updated weights for policy 0, policy_version 13240 (0.0010) [2023-10-12 16:18:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 27131904. Throughput: 0: 1658.5, 1: 1666.0. Samples: 6788662. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:18:13,435][61643] Avg episode reward: [(0, '3.240'), (1, '6.210')] [2023-10-12 16:18:13,436][62495] Saving new best policy, reward=6.210! 
[2023-10-12 16:18:15,083][62635] Updated weights for policy 1, policy_version 13250 (0.0008) [2023-10-12 16:18:15,485][62635] Updated weights for policy 1, policy_version 13260 (0.0007) [2023-10-12 16:18:15,576][62634] Updated weights for policy 0, policy_version 13250 (0.0008) [2023-10-12 16:18:15,849][62635] Updated weights for policy 1, policy_version 13270 (0.0007) [2023-10-12 16:18:15,952][62634] Updated weights for policy 0, policy_version 13260 (0.0009) [2023-10-12 16:18:16,213][62635] Updated weights for policy 1, policy_version 13280 (0.0008) [2023-10-12 16:18:16,324][62634] Updated weights for policy 0, policy_version 13270 (0.0008) [2023-10-12 16:18:16,701][62634] Updated weights for policy 0, policy_version 13280 (0.0009) [2023-10-12 16:18:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 27197440. Throughput: 0: 1679.1, 1: 1668.2. Samples: 6809294. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:18:18,436][61643] Avg episode reward: [(0, '3.330'), (1, '6.220')] [2023-10-12 16:18:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000013280_13598720.pth... [2023-10-12 16:18:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000013280_13598720.pth... [2023-10-12 16:18:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000011712_11993088.pth [2023-10-12 16:18:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000011712_11993088.pth [2023-10-12 16:18:18,488][62495] Saving new best policy, reward=6.220! [2023-10-12 16:18:18,490][62354] Saving new best policy, reward=3.330! 
[2023-10-12 16:18:20,259][62635] Updated weights for policy 1, policy_version 13290 (0.0007) [2023-10-12 16:18:20,628][62635] Updated weights for policy 1, policy_version 13300 (0.0008) [2023-10-12 16:18:20,830][62634] Updated weights for policy 0, policy_version 13290 (0.0008) [2023-10-12 16:18:21,008][62635] Updated weights for policy 1, policy_version 13310 (0.0008) [2023-10-12 16:18:21,206][62634] Updated weights for policy 0, policy_version 13300 (0.0009) [2023-10-12 16:18:21,592][62634] Updated weights for policy 0, policy_version 13310 (0.0010) [2023-10-12 16:18:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 27262976. Throughput: 0: 1665.4, 1: 1645.8. Samples: 6819030. Policy #0 lag: (min: 31.0, avg: 41.4, max: 63.0) [2023-10-12 16:18:23,436][61643] Avg episode reward: [(0, '3.290'), (1, '6.270')] [2023-10-12 16:18:23,437][62495] Saving new best policy, reward=6.270! [2023-10-12 16:18:25,099][62635] Updated weights for policy 1, policy_version 13320 (0.0009) [2023-10-12 16:18:25,467][62635] Updated weights for policy 1, policy_version 13330 (0.0008) [2023-10-12 16:18:25,717][62634] Updated weights for policy 0, policy_version 13320 (0.0008) [2023-10-12 16:18:25,831][62635] Updated weights for policy 1, policy_version 13340 (0.0007) [2023-10-12 16:18:26,086][62634] Updated weights for policy 0, policy_version 13330 (0.0007) [2023-10-12 16:18:26,469][62634] Updated weights for policy 0, policy_version 13340 (0.0008) [2023-10-12 16:18:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 27328512. Throughput: 0: 1656.4, 1: 1668.0. Samples: 6838792. 
Policy #0 lag: (min: 31.0, avg: 41.4, max: 63.0) [2023-10-12 16:18:28,436][61643] Avg episode reward: [(0, '3.270'), (1, '6.230')] [2023-10-12 16:18:29,776][62635] Updated weights for policy 1, policy_version 13350 (0.0007) [2023-10-12 16:18:30,142][62635] Updated weights for policy 1, policy_version 13360 (0.0010) [2023-10-12 16:18:30,521][62635] Updated weights for policy 1, policy_version 13370 (0.0009) [2023-10-12 16:18:30,623][62634] Updated weights for policy 0, policy_version 13350 (0.0010) [2023-10-12 16:18:31,002][62634] Updated weights for policy 0, policy_version 13360 (0.0010) [2023-10-12 16:18:31,375][62634] Updated weights for policy 0, policy_version 13370 (0.0011) [2023-10-12 16:18:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 27394048. Throughput: 0: 1671.4, 1: 1668.8. Samples: 6859568. Policy #0 lag: (min: 31.0, avg: 41.4, max: 63.0) [2023-10-12 16:18:33,436][61643] Avg episode reward: [(0, '3.240'), (1, '6.030')] [2023-10-12 16:18:34,600][62635] Updated weights for policy 1, policy_version 13380 (0.0008) [2023-10-12 16:18:34,969][62635] Updated weights for policy 1, policy_version 13390 (0.0008) [2023-10-12 16:18:35,344][62635] Updated weights for policy 1, policy_version 13400 (0.0009) [2023-10-12 16:18:35,505][62634] Updated weights for policy 0, policy_version 13380 (0.0009) [2023-10-12 16:18:35,886][62634] Updated weights for policy 0, policy_version 13390 (0.0008) [2023-10-12 16:18:36,267][62634] Updated weights for policy 0, policy_version 13400 (0.0008) [2023-10-12 16:18:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27459584. Throughput: 0: 1662.4, 1: 1653.8. Samples: 6869306. Policy #0 lag: (min: 20.0, avg: 27.7, max: 52.0) [2023-10-12 16:18:38,436][61643] Avg episode reward: [(0, '3.340'), (1, '6.050')] [2023-10-12 16:18:38,437][62354] Saving new best policy, reward=3.340! 
[2023-10-12 16:18:39,315][62635] Updated weights for policy 1, policy_version 13410 (0.0009) [2023-10-12 16:18:39,691][62635] Updated weights for policy 1, policy_version 13420 (0.0007) [2023-10-12 16:18:40,056][62635] Updated weights for policy 1, policy_version 13430 (0.0010) [2023-10-12 16:18:40,146][62634] Updated weights for policy 0, policy_version 13410 (0.0007) [2023-10-12 16:18:40,418][62635] Updated weights for policy 1, policy_version 13440 (0.0008) [2023-10-12 16:18:40,512][62634] Updated weights for policy 0, policy_version 13420 (0.0008) [2023-10-12 16:18:40,885][62634] Updated weights for policy 0, policy_version 13430 (0.0007) [2023-10-12 16:18:41,267][62634] Updated weights for policy 0, policy_version 13440 (0.0008) [2023-10-12 16:18:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27525120. Throughput: 0: 1679.5, 1: 1677.4. Samples: 6889784. Policy #0 lag: (min: 20.0, avg: 27.7, max: 52.0) [2023-10-12 16:18:43,435][61643] Avg episode reward: [(0, '3.310'), (1, '5.750')] [2023-10-12 16:18:44,406][62635] Updated weights for policy 1, policy_version 13450 (0.0007) [2023-10-12 16:18:44,765][62635] Updated weights for policy 1, policy_version 13460 (0.0010) [2023-10-12 16:18:45,136][62635] Updated weights for policy 1, policy_version 13470 (0.0009) [2023-10-12 16:18:45,180][62634] Updated weights for policy 0, policy_version 13450 (0.0008) [2023-10-12 16:18:45,565][62634] Updated weights for policy 0, policy_version 13460 (0.0009) [2023-10-12 16:18:45,953][62634] Updated weights for policy 0, policy_version 13470 (0.0009) [2023-10-12 16:18:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27590656. Throughput: 0: 1685.8, 1: 1678.3. Samples: 6910532. 
Policy #0 lag: (min: 20.0, avg: 27.7, max: 52.0) [2023-10-12 16:18:48,435][61643] Avg episode reward: [(0, '3.300'), (1, '5.940')] [2023-10-12 16:18:49,407][62635] Updated weights for policy 1, policy_version 13480 (0.0008) [2023-10-12 16:18:49,776][62634] Updated weights for policy 0, policy_version 13480 (0.0009) [2023-10-12 16:18:49,780][62635] Updated weights for policy 1, policy_version 13490 (0.0008) [2023-10-12 16:18:50,139][62635] Updated weights for policy 1, policy_version 13500 (0.0008) [2023-10-12 16:18:50,149][62634] Updated weights for policy 0, policy_version 13490 (0.0008) [2023-10-12 16:18:50,526][62634] Updated weights for policy 0, policy_version 13500 (0.0010) [2023-10-12 16:18:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27656192. Throughput: 0: 1658.9, 1: 1669.5. Samples: 6919430. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 16:18:53,435][61643] Avg episode reward: [(0, '3.240'), (1, '5.680')] [2023-10-12 16:18:54,333][62635] Updated weights for policy 1, policy_version 13510 (0.0009) [2023-10-12 16:18:54,577][62634] Updated weights for policy 0, policy_version 13510 (0.0007) [2023-10-12 16:18:54,713][62635] Updated weights for policy 1, policy_version 13520 (0.0009) [2023-10-12 16:18:54,943][62634] Updated weights for policy 0, policy_version 13520 (0.0007) [2023-10-12 16:18:55,081][62635] Updated weights for policy 1, policy_version 13530 (0.0007) [2023-10-12 16:18:55,329][62634] Updated weights for policy 0, policy_version 13530 (0.0008) [2023-10-12 16:18:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27721728. Throughput: 0: 1689.4, 1: 1675.9. Samples: 6940102. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 16:18:58,436][61643] Avg episode reward: [(0, '3.300'), (1, '5.830')] [2023-10-12 16:18:59,218][62635] Updated weights for policy 1, policy_version 13540 (0.0008) [2023-10-12 16:18:59,462][62634] Updated weights for policy 0, policy_version 13540 (0.0008) [2023-10-12 16:18:59,595][62635] Updated weights for policy 1, policy_version 13550 (0.0008) [2023-10-12 16:18:59,833][62634] Updated weights for policy 0, policy_version 13550 (0.0008) [2023-10-12 16:18:59,971][62635] Updated weights for policy 1, policy_version 13560 (0.0008) [2023-10-12 16:19:00,207][62634] Updated weights for policy 0, policy_version 13560 (0.0007) [2023-10-12 16:19:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27787264. Throughput: 0: 1685.6, 1: 1678.0. Samples: 6960654. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-12 16:19:03,435][61643] Avg episode reward: [(0, '3.340'), (1, '5.760')] [2023-10-12 16:19:04,065][62635] Updated weights for policy 1, policy_version 13570 (0.0007) [2023-10-12 16:19:04,251][62634] Updated weights for policy 0, policy_version 13570 (0.0007) [2023-10-12 16:19:04,477][62635] Updated weights for policy 1, policy_version 13580 (0.0008) [2023-10-12 16:19:04,622][62634] Updated weights for policy 0, policy_version 13580 (0.0008) [2023-10-12 16:19:04,848][62635] Updated weights for policy 1, policy_version 13590 (0.0008) [2023-10-12 16:19:05,001][62634] Updated weights for policy 0, policy_version 13590 (0.0007) [2023-10-12 16:19:05,213][62635] Updated weights for policy 1, policy_version 13600 (0.0010) [2023-10-12 16:19:05,375][62634] Updated weights for policy 0, policy_version 13600 (0.0007) [2023-10-12 16:19:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 27852800. Throughput: 0: 1671.5, 1: 1671.5. Samples: 6969466. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:19:08,436][61643] Avg episode reward: [(0, '3.390'), (1, '5.630')] [2023-10-12 16:19:08,437][62354] Saving new best policy, reward=3.390! [2023-10-12 16:19:09,276][62635] Updated weights for policy 1, policy_version 13610 (0.0009) [2023-10-12 16:19:09,440][62634] Updated weights for policy 0, policy_version 13610 (0.0009) [2023-10-12 16:19:09,633][62635] Updated weights for policy 1, policy_version 13620 (0.0009) [2023-10-12 16:19:09,806][62634] Updated weights for policy 0, policy_version 13620 (0.0010) [2023-10-12 16:19:10,007][62635] Updated weights for policy 1, policy_version 13630 (0.0009) [2023-10-12 16:19:10,186][62634] Updated weights for policy 0, policy_version 13630 (0.0010) [2023-10-12 16:19:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27918336. Throughput: 0: 1687.3, 1: 1673.8. Samples: 6990038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:19:13,436][61643] Avg episode reward: [(0, '3.400'), (1, '5.560')] [2023-10-12 16:19:13,437][62354] Saving new best policy, reward=3.400! [2023-10-12 16:19:14,325][62635] Updated weights for policy 1, policy_version 13640 (0.0007) [2023-10-12 16:19:14,552][62634] Updated weights for policy 0, policy_version 13640 (0.0009) [2023-10-12 16:19:14,697][62635] Updated weights for policy 1, policy_version 13650 (0.0009) [2023-10-12 16:19:14,935][62634] Updated weights for policy 0, policy_version 13650 (0.0009) [2023-10-12 16:19:15,074][62635] Updated weights for policy 1, policy_version 13660 (0.0010) [2023-10-12 16:19:15,302][62634] Updated weights for policy 0, policy_version 13660 (0.0010) [2023-10-12 16:19:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27983872. Throughput: 0: 1689.2, 1: 1664.9. Samples: 7010506. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:19:18,435][61643] Avg episode reward: [(0, '3.380'), (1, '5.480')] [2023-10-12 16:19:19,146][62635] Updated weights for policy 1, policy_version 13670 (0.0008) [2023-10-12 16:19:19,474][62634] Updated weights for policy 0, policy_version 13670 (0.0009) [2023-10-12 16:19:19,522][62635] Updated weights for policy 1, policy_version 13680 (0.0007) [2023-10-12 16:19:19,837][62634] Updated weights for policy 0, policy_version 13680 (0.0008) [2023-10-12 16:19:19,887][62635] Updated weights for policy 1, policy_version 13690 (0.0007) [2023-10-12 16:19:20,213][62634] Updated weights for policy 0, policy_version 13690 (0.0008) [2023-10-12 16:19:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28049408. Throughput: 0: 1670.1, 1: 1669.3. Samples: 7019578. Policy #0 lag: (min: 21.0, avg: 28.9, max: 53.0) [2023-10-12 16:19:23,436][61643] Avg episode reward: [(0, '3.390'), (1, '5.840')] [2023-10-12 16:19:23,976][62635] Updated weights for policy 1, policy_version 13700 (0.0008) [2023-10-12 16:19:24,180][62634] Updated weights for policy 0, policy_version 13700 (0.0009) [2023-10-12 16:19:24,346][62635] Updated weights for policy 1, policy_version 13710 (0.0007) [2023-10-12 16:19:24,544][62634] Updated weights for policy 0, policy_version 13710 (0.0008) [2023-10-12 16:19:24,714][62635] Updated weights for policy 1, policy_version 13720 (0.0007) [2023-10-12 16:19:24,916][62634] Updated weights for policy 0, policy_version 13720 (0.0008) [2023-10-12 16:19:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 28114944. Throughput: 0: 1682.5, 1: 1661.0. Samples: 7040240. 
Policy #0 lag: (min: 21.0, avg: 28.9, max: 53.0) [2023-10-12 16:19:28,436][61643] Avg episode reward: [(0, '3.400'), (1, '5.520')] [2023-10-12 16:19:28,789][62635] Updated weights for policy 1, policy_version 13730 (0.0007) [2023-10-12 16:19:28,889][62634] Updated weights for policy 0, policy_version 13730 (0.0007) [2023-10-12 16:19:29,153][62635] Updated weights for policy 1, policy_version 13740 (0.0007) [2023-10-12 16:19:29,258][62634] Updated weights for policy 0, policy_version 13740 (0.0007) [2023-10-12 16:19:29,522][62635] Updated weights for policy 1, policy_version 13750 (0.0007) [2023-10-12 16:19:29,638][62634] Updated weights for policy 0, policy_version 13750 (0.0009) [2023-10-12 16:19:29,895][62635] Updated weights for policy 1, policy_version 13760 (0.0008) [2023-10-12 16:19:30,008][62634] Updated weights for policy 0, policy_version 13760 (0.0008) [2023-10-12 16:19:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28180480. Throughput: 0: 1686.1, 1: 1660.8. Samples: 7061146. Policy #0 lag: (min: 21.0, avg: 28.9, max: 53.0) [2023-10-12 16:19:33,436][61643] Avg episode reward: [(0, '3.410'), (1, '5.790')] [2023-10-12 16:19:33,445][62354] Saving new best policy, reward=3.410! [2023-10-12 16:19:34,025][62635] Updated weights for policy 1, policy_version 13770 (0.0009) [2023-10-12 16:19:34,093][62634] Updated weights for policy 0, policy_version 13770 (0.0009) [2023-10-12 16:19:34,388][62635] Updated weights for policy 1, policy_version 13780 (0.0008) [2023-10-12 16:19:34,469][62634] Updated weights for policy 0, policy_version 13780 (0.0007) [2023-10-12 16:19:34,750][62635] Updated weights for policy 1, policy_version 13790 (0.0009) [2023-10-12 16:19:34,853][62634] Updated weights for policy 0, policy_version 13790 (0.0009) [2023-10-12 16:19:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 28246016. Throughput: 0: 1681.6, 1: 1663.2. Samples: 7069946. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:19:38,436][61643] Avg episode reward: [(0, '3.350'), (1, '5.840')] [2023-10-12 16:19:38,826][62635] Updated weights for policy 1, policy_version 13800 (0.0008) [2023-10-12 16:19:38,940][62634] Updated weights for policy 0, policy_version 13800 (0.0010) [2023-10-12 16:19:39,200][62635] Updated weights for policy 1, policy_version 13810 (0.0007) [2023-10-12 16:19:39,313][62634] Updated weights for policy 0, policy_version 13810 (0.0009) [2023-10-12 16:19:39,562][62635] Updated weights for policy 1, policy_version 13820 (0.0008) [2023-10-12 16:19:39,687][62634] Updated weights for policy 0, policy_version 13820 (0.0007) [2023-10-12 16:19:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28311552. Throughput: 0: 1677.5, 1: 1669.8. Samples: 7090730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:19:43,435][61643] Avg episode reward: [(0, '3.320'), (1, '6.170')] [2023-10-12 16:19:43,558][62635] Updated weights for policy 1, policy_version 13830 (0.0008) [2023-10-12 16:19:43,689][62634] Updated weights for policy 0, policy_version 13830 (0.0010) [2023-10-12 16:19:43,918][62635] Updated weights for policy 1, policy_version 13840 (0.0008) [2023-10-12 16:19:44,071][62634] Updated weights for policy 0, policy_version 13840 (0.0008) [2023-10-12 16:19:44,291][62635] Updated weights for policy 1, policy_version 13850 (0.0007) [2023-10-12 16:19:44,449][62634] Updated weights for policy 0, policy_version 13850 (0.0009) [2023-10-12 16:19:48,369][62635] Updated weights for policy 1, policy_version 13860 (0.0009) [2023-10-12 16:19:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28377088. Throughput: 0: 1677.8, 1: 1671.6. Samples: 7111378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:19:48,435][61643] Avg episode reward: [(0, '3.380'), (1, '6.000')] [2023-10-12 16:19:48,519][62634] Updated weights for policy 0, policy_version 13860 (0.0010) [2023-10-12 16:19:48,734][62635] Updated weights for policy 1, policy_version 13870 (0.0009) [2023-10-12 16:19:48,883][62634] Updated weights for policy 0, policy_version 13870 (0.0008) [2023-10-12 16:19:49,101][62635] Updated weights for policy 1, policy_version 13880 (0.0009) [2023-10-12 16:19:49,260][62634] Updated weights for policy 0, policy_version 13880 (0.0009) [2023-10-12 16:19:53,341][62635] Updated weights for policy 1, policy_version 13890 (0.0009) [2023-10-12 16:19:53,392][62634] Updated weights for policy 0, policy_version 13890 (0.0009) [2023-10-12 16:19:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28442624. Throughput: 0: 1675.5, 1: 1673.2. Samples: 7120158. Policy #0 lag: (min: 18.0, avg: 21.2, max: 50.0) [2023-10-12 16:19:53,436][61643] Avg episode reward: [(0, '3.450'), (1, '5.970')] [2023-10-12 16:19:53,720][62635] Updated weights for policy 1, policy_version 13900 (0.0007) [2023-10-12 16:19:53,775][62634] Updated weights for policy 0, policy_version 13900 (0.0009) [2023-10-12 16:19:54,095][62635] Updated weights for policy 1, policy_version 13910 (0.0009) [2023-10-12 16:19:54,149][62634] Updated weights for policy 0, policy_version 13910 (0.0009) [2023-10-12 16:19:54,468][62635] Updated weights for policy 1, policy_version 13920 (0.0007) [2023-10-12 16:19:54,512][62354] Saving new best policy, reward=3.450! [2023-10-12 16:19:54,513][62634] Updated weights for policy 0, policy_version 13920 (0.0009) [2023-10-12 16:19:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28508160. Throughput: 0: 1679.9, 1: 1672.3. Samples: 7140888. 
Policy #0 lag: (min: 18.0, avg: 21.2, max: 50.0) [2023-10-12 16:19:58,435][61643] Avg episode reward: [(0, '3.490'), (1, '5.950')] [2023-10-12 16:19:58,507][62635] Updated weights for policy 1, policy_version 13930 (0.0009) [2023-10-12 16:19:58,706][62634] Updated weights for policy 0, policy_version 13930 (0.0009) [2023-10-12 16:19:58,879][62635] Updated weights for policy 1, policy_version 13940 (0.0007) [2023-10-12 16:19:59,077][62634] Updated weights for policy 0, policy_version 13940 (0.0009) [2023-10-12 16:19:59,242][62635] Updated weights for policy 1, policy_version 13950 (0.0007) [2023-10-12 16:19:59,468][62634] Updated weights for policy 0, policy_version 13950 (0.0009) [2023-10-12 16:19:59,533][62354] Saving new best policy, reward=3.490! [2023-10-12 16:20:03,208][62635] Updated weights for policy 1, policy_version 13960 (0.0008) [2023-10-12 16:20:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28573696. Throughput: 0: 1679.1, 1: 1675.4. Samples: 7161456. Policy #0 lag: (min: 18.0, avg: 21.2, max: 50.0) [2023-10-12 16:20:03,436][61643] Avg episode reward: [(0, '3.510'), (1, '6.030')] [2023-10-12 16:20:03,456][62634] Updated weights for policy 0, policy_version 13960 (0.0010) [2023-10-12 16:20:03,584][62635] Updated weights for policy 1, policy_version 13970 (0.0008) [2023-10-12 16:20:03,833][62634] Updated weights for policy 0, policy_version 13970 (0.0008) [2023-10-12 16:20:03,949][62635] Updated weights for policy 1, policy_version 13980 (0.0009) [2023-10-12 16:20:04,202][62634] Updated weights for policy 0, policy_version 13980 (0.0009) [2023-10-12 16:20:04,353][62354] Saving new best policy, reward=3.510! 
[2023-10-12 16:20:08,018][62635] Updated weights for policy 1, policy_version 13990 (0.0008) [2023-10-12 16:20:08,211][62634] Updated weights for policy 0, policy_version 13990 (0.0009) [2023-10-12 16:20:08,384][62635] Updated weights for policy 1, policy_version 14000 (0.0008) [2023-10-12 16:20:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 28639232. Throughput: 0: 1681.8, 1: 1677.6. Samples: 7170752. Policy #0 lag: (min: 24.0, avg: 45.2, max: 56.0) [2023-10-12 16:20:08,436][61643] Avg episode reward: [(0, '3.600'), (1, '5.710')] [2023-10-12 16:20:08,586][62634] Updated weights for policy 0, policy_version 14000 (0.0010) [2023-10-12 16:20:08,751][62635] Updated weights for policy 1, policy_version 14010 (0.0009) [2023-10-12 16:20:08,955][62634] Updated weights for policy 0, policy_version 14010 (0.0008) [2023-10-12 16:20:09,182][62354] Saving new best policy, reward=3.600! [2023-10-12 16:20:12,893][62635] Updated weights for policy 1, policy_version 14020 (0.0008) [2023-10-12 16:20:12,898][62634] Updated weights for policy 0, policy_version 14020 (0.0008) [2023-10-12 16:20:13,253][62635] Updated weights for policy 1, policy_version 14030 (0.0008) [2023-10-12 16:20:13,267][62634] Updated weights for policy 0, policy_version 14030 (0.0008) [2023-10-12 16:20:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28704768. Throughput: 0: 1687.2, 1: 1678.4. Samples: 7191690. 
Policy #0 lag: (min: 24.0, avg: 45.2, max: 56.0) [2023-10-12 16:20:13,436][61643] Avg episode reward: [(0, '3.550'), (1, '5.980')] [2023-10-12 16:20:13,620][62635] Updated weights for policy 1, policy_version 14040 (0.0007) [2023-10-12 16:20:13,649][62634] Updated weights for policy 0, policy_version 14040 (0.0009) [2023-10-12 16:20:17,719][62634] Updated weights for policy 0, policy_version 14050 (0.0009) [2023-10-12 16:20:17,838][62635] Updated weights for policy 1, policy_version 14050 (0.0009) [2023-10-12 16:20:18,107][62634] Updated weights for policy 0, policy_version 14060 (0.0009) [2023-10-12 16:20:18,204][62635] Updated weights for policy 1, policy_version 14060 (0.0010) [2023-10-12 16:20:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28770304. Throughput: 0: 1676.6, 1: 1669.9. Samples: 7211736. Policy #0 lag: (min: 24.0, avg: 45.2, max: 56.0) [2023-10-12 16:20:18,437][61643] Avg episode reward: [(0, '3.540'), (1, '6.050')] [2023-10-12 16:20:18,474][62634] Updated weights for policy 0, policy_version 14070 (0.0009) [2023-10-12 16:20:18,573][62635] Updated weights for policy 1, policy_version 14070 (0.0007) [2023-10-12 16:20:18,850][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000014080_14417920.pth... [2023-10-12 16:20:18,854][62634] Updated weights for policy 0, policy_version 14080 (0.0008) [2023-10-12 16:20:18,890][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000012480_12779520.pth [2023-10-12 16:20:18,946][62635] Updated weights for policy 1, policy_version 14080 (0.0008) [2023-10-12 16:20:18,946][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000014080_14417920.pth... 
[2023-10-12 16:20:18,986][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000012512_12812288.pth [2023-10-12 16:20:22,854][62634] Updated weights for policy 0, policy_version 14090 (0.0008) [2023-10-12 16:20:23,010][62635] Updated weights for policy 1, policy_version 14090 (0.0008) [2023-10-12 16:20:23,230][62634] Updated weights for policy 0, policy_version 14100 (0.0007) [2023-10-12 16:20:23,388][62635] Updated weights for policy 1, policy_version 14100 (0.0008) [2023-10-12 16:20:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 28835840. Throughput: 0: 1688.3, 1: 1678.0. Samples: 7221428. Policy #0 lag: (min: 3.0, avg: 9.6, max: 35.0) [2023-10-12 16:20:23,436][61643] Avg episode reward: [(0, '3.610'), (1, '6.160')] [2023-10-12 16:20:23,613][62634] Updated weights for policy 0, policy_version 14110 (0.0009) [2023-10-12 16:20:23,684][62354] Saving new best policy, reward=3.610! [2023-10-12 16:20:23,750][62635] Updated weights for policy 1, policy_version 14110 (0.0008) [2023-10-12 16:20:27,744][62634] Updated weights for policy 0, policy_version 14120 (0.0008) [2023-10-12 16:20:27,902][62635] Updated weights for policy 1, policy_version 14120 (0.0010) [2023-10-12 16:20:28,122][62634] Updated weights for policy 0, policy_version 14130 (0.0009) [2023-10-12 16:20:28,275][62635] Updated weights for policy 1, policy_version 14130 (0.0008) [2023-10-12 16:20:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28901376. Throughput: 0: 1684.8, 1: 1673.7. Samples: 7241862. Policy #0 lag: (min: 3.0, avg: 9.6, max: 35.0) [2023-10-12 16:20:28,435][61643] Avg episode reward: [(0, '3.630'), (1, '5.810')] [2023-10-12 16:20:28,494][62634] Updated weights for policy 0, policy_version 14140 (0.0010) [2023-10-12 16:20:28,640][62635] Updated weights for policy 1, policy_version 14140 (0.0008) [2023-10-12 16:20:28,641][62354] Saving new best policy, reward=3.630! 
[2023-10-12 16:20:32,527][62634] Updated weights for policy 0, policy_version 14150 (0.0008) [2023-10-12 16:20:32,758][62635] Updated weights for policy 1, policy_version 14150 (0.0008) [2023-10-12 16:20:32,907][62634] Updated weights for policy 0, policy_version 14160 (0.0007) [2023-10-12 16:20:33,127][62635] Updated weights for policy 1, policy_version 14160 (0.0008) [2023-10-12 16:20:33,289][62634] Updated weights for policy 0, policy_version 14170 (0.0008) [2023-10-12 16:20:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28966912. Throughput: 0: 1668.0, 1: 1663.2. Samples: 7261280. Policy #0 lag: (min: 3.0, avg: 9.6, max: 35.0) [2023-10-12 16:20:33,435][61643] Avg episode reward: [(0, '3.610'), (1, '6.050')] [2023-10-12 16:20:33,498][62635] Updated weights for policy 1, policy_version 14170 (0.0009) [2023-10-12 16:20:37,335][62634] Updated weights for policy 0, policy_version 14180 (0.0009) [2023-10-12 16:20:37,555][62635] Updated weights for policy 1, policy_version 14180 (0.0007) [2023-10-12 16:20:37,718][62634] Updated weights for policy 0, policy_version 14190 (0.0008) [2023-10-12 16:20:37,929][62635] Updated weights for policy 1, policy_version 14190 (0.0008) [2023-10-12 16:20:38,090][62634] Updated weights for policy 0, policy_version 14200 (0.0009) [2023-10-12 16:20:38,300][62635] Updated weights for policy 1, policy_version 14200 (0.0009) [2023-10-12 16:20:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29065216. Throughput: 0: 1685.0, 1: 1679.8. Samples: 7271576. 
Policy #0 lag: (min: 30.0, avg: 31.4, max: 54.0) [2023-10-12 16:20:38,436][61643] Avg episode reward: [(0, '3.610'), (1, '5.910')] [2023-10-12 16:20:42,056][62634] Updated weights for policy 0, policy_version 14210 (0.0009) [2023-10-12 16:20:42,279][62635] Updated weights for policy 1, policy_version 14210 (0.0009) [2023-10-12 16:20:42,432][62634] Updated weights for policy 0, policy_version 14220 (0.0008) [2023-10-12 16:20:42,669][62635] Updated weights for policy 1, policy_version 14220 (0.0007) [2023-10-12 16:20:42,814][62634] Updated weights for policy 0, policy_version 14230 (0.0009) [2023-10-12 16:20:43,035][62635] Updated weights for policy 1, policy_version 14230 (0.0010) [2023-10-12 16:20:43,185][62634] Updated weights for policy 0, policy_version 14240 (0.0009) [2023-10-12 16:20:43,409][62635] Updated weights for policy 1, policy_version 14240 (0.0008) [2023-10-12 16:20:43,435][61643] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 29163520. Throughput: 0: 1691.5, 1: 1680.9. Samples: 7292650. Policy #0 lag: (min: 30.0, avg: 31.4, max: 54.0) [2023-10-12 16:20:43,436][61643] Avg episode reward: [(0, '3.670'), (1, '5.660')] [2023-10-12 16:20:43,438][62354] Saving new best policy, reward=3.670! [2023-10-12 16:20:47,259][62634] Updated weights for policy 0, policy_version 14250 (0.0009) [2023-10-12 16:20:47,413][62635] Updated weights for policy 1, policy_version 14250 (0.0010) [2023-10-12 16:20:47,635][62634] Updated weights for policy 0, policy_version 14260 (0.0008) [2023-10-12 16:20:47,782][62635] Updated weights for policy 1, policy_version 14260 (0.0009) [2023-10-12 16:20:48,009][62634] Updated weights for policy 0, policy_version 14270 (0.0007) [2023-10-12 16:20:48,155][62635] Updated weights for policy 1, policy_version 14270 (0.0009) [2023-10-12 16:20:48,435][61643] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 29229056. Throughput: 0: 1667.9, 1: 1658.1. Samples: 7311126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:20:48,436][61643] Avg episode reward: [(0, '3.690'), (1, '6.050')] [2023-10-12 16:20:48,447][62354] Saving new best policy, reward=3.690! [2023-10-12 16:20:52,161][62635] Updated weights for policy 1, policy_version 14280 (0.0007) [2023-10-12 16:20:52,234][62634] Updated weights for policy 0, policy_version 14280 (0.0009) [2023-10-12 16:20:52,530][62635] Updated weights for policy 1, policy_version 14290 (0.0009) [2023-10-12 16:20:52,613][62634] Updated weights for policy 0, policy_version 14290 (0.0010) [2023-10-12 16:20:52,893][62635] Updated weights for policy 1, policy_version 14300 (0.0009) [2023-10-12 16:20:52,991][62634] Updated weights for policy 0, policy_version 14300 (0.0008) [2023-10-12 16:20:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29294592. Throughput: 0: 1684.5, 1: 1675.5. Samples: 7321950. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:20:53,435][61643] Avg episode reward: [(0, '3.730'), (1, '5.870')] [2023-10-12 16:20:53,436][62354] Saving new best policy, reward=3.730! [2023-10-12 16:20:56,913][62635] Updated weights for policy 1, policy_version 14310 (0.0007) [2023-10-12 16:20:57,008][62634] Updated weights for policy 0, policy_version 14310 (0.0008) [2023-10-12 16:20:57,281][62635] Updated weights for policy 1, policy_version 14320 (0.0008) [2023-10-12 16:20:57,397][62634] Updated weights for policy 0, policy_version 14320 (0.0008) [2023-10-12 16:20:57,655][62635] Updated weights for policy 1, policy_version 14330 (0.0009) [2023-10-12 16:20:57,770][62634] Updated weights for policy 0, policy_version 14330 (0.0008) [2023-10-12 16:20:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29360128. Throughput: 0: 1675.7, 1: 1671.2. Samples: 7342302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:20:58,435][61643] Avg episode reward: [(0, '3.740'), (1, '6.130')] [2023-10-12 16:20:58,436][62354] Saving new best policy, reward=3.740! [2023-10-12 16:21:01,762][62634] Updated weights for policy 0, policy_version 14340 (0.0007) [2023-10-12 16:21:01,846][62635] Updated weights for policy 1, policy_version 14340 (0.0008) [2023-10-12 16:21:02,138][62634] Updated weights for policy 0, policy_version 14350 (0.0008) [2023-10-12 16:21:02,212][62635] Updated weights for policy 1, policy_version 14350 (0.0009) [2023-10-12 16:21:02,518][62634] Updated weights for policy 0, policy_version 14360 (0.0008) [2023-10-12 16:21:02,585][62635] Updated weights for policy 1, policy_version 14360 (0.0009) [2023-10-12 16:21:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29425664. Throughput: 0: 1659.5, 1: 1652.4. Samples: 7360768. Policy #0 lag: (min: 27.0, avg: 32.8, max: 59.0) [2023-10-12 16:21:03,436][61643] Avg episode reward: [(0, '3.760'), (1, '6.080')] [2023-10-12 16:21:03,443][62354] Saving new best policy, reward=3.760! [2023-10-12 16:21:06,496][62634] Updated weights for policy 0, policy_version 14370 (0.0007) [2023-10-12 16:21:06,781][62635] Updated weights for policy 1, policy_version 14370 (0.0009) [2023-10-12 16:21:06,875][62634] Updated weights for policy 0, policy_version 14380 (0.0007) [2023-10-12 16:21:07,140][62635] Updated weights for policy 1, policy_version 14380 (0.0007) [2023-10-12 16:21:07,253][62634] Updated weights for policy 0, policy_version 14390 (0.0008) [2023-10-12 16:21:07,505][62635] Updated weights for policy 1, policy_version 14390 (0.0008) [2023-10-12 16:21:07,629][62634] Updated weights for policy 0, policy_version 14400 (0.0007) [2023-10-12 16:21:07,877][62635] Updated weights for policy 1, policy_version 14400 (0.0009) [2023-10-12 16:21:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). 
Total num frames: 29491200. Throughput: 0: 1684.5, 1: 1674.2. Samples: 7372570. Policy #0 lag: (min: 27.0, avg: 32.8, max: 59.0) [2023-10-12 16:21:08,435][61643] Avg episode reward: [(0, '3.790'), (1, '5.960')] [2023-10-12 16:21:08,436][62354] Saving new best policy, reward=3.790! [2023-10-12 16:21:11,870][62634] Updated weights for policy 0, policy_version 14410 (0.0009) [2023-10-12 16:21:12,050][62635] Updated weights for policy 1, policy_version 14410 (0.0007) [2023-10-12 16:21:12,236][62634] Updated weights for policy 0, policy_version 14420 (0.0010) [2023-10-12 16:21:12,426][62635] Updated weights for policy 1, policy_version 14420 (0.0010) [2023-10-12 16:21:12,619][62634] Updated weights for policy 0, policy_version 14430 (0.0007) [2023-10-12 16:21:12,800][62635] Updated weights for policy 1, policy_version 14430 (0.0010) [2023-10-12 16:21:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29556736. Throughput: 0: 1671.6, 1: 1668.9. Samples: 7392184. Policy #0 lag: (min: 27.0, avg: 32.8, max: 59.0) [2023-10-12 16:21:13,436][61643] Avg episode reward: [(0, '3.780'), (1, '6.120')] [2023-10-12 16:21:16,744][62634] Updated weights for policy 0, policy_version 14440 (0.0009) [2023-10-12 16:21:16,827][62635] Updated weights for policy 1, policy_version 14440 (0.0007) [2023-10-12 16:21:17,126][62634] Updated weights for policy 0, policy_version 14450 (0.0009) [2023-10-12 16:21:17,195][62635] Updated weights for policy 1, policy_version 14450 (0.0008) [2023-10-12 16:21:17,499][62634] Updated weights for policy 0, policy_version 14460 (0.0007) [2023-10-12 16:21:17,568][62635] Updated weights for policy 1, policy_version 14460 (0.0008) [2023-10-12 16:21:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 29622272. Throughput: 0: 1670.4, 1: 1658.1. Samples: 7411064. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-12 16:21:18,436][61643] Avg episode reward: [(0, '3.820'), (1, '6.090')] [2023-10-12 16:21:18,447][62354] Saving new best policy, reward=3.820! [2023-10-12 16:21:21,596][62635] Updated weights for policy 1, policy_version 14470 (0.0009) [2023-10-12 16:21:21,623][62634] Updated weights for policy 0, policy_version 14470 (0.0007) [2023-10-12 16:21:21,973][62635] Updated weights for policy 1, policy_version 14480 (0.0010) [2023-10-12 16:21:21,994][62634] Updated weights for policy 0, policy_version 14480 (0.0008) [2023-10-12 16:21:22,335][62635] Updated weights for policy 1, policy_version 14490 (0.0007) [2023-10-12 16:21:22,369][62634] Updated weights for policy 0, policy_version 14490 (0.0009) [2023-10-12 16:21:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29687808. Throughput: 0: 1684.4, 1: 1673.9. Samples: 7422698. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-12 16:21:23,436][61643] Avg episode reward: [(0, '3.830'), (1, '5.800')] [2023-10-12 16:21:23,436][62354] Saving new best policy, reward=3.830! [2023-10-12 16:21:26,281][62634] Updated weights for policy 0, policy_version 14500 (0.0009) [2023-10-12 16:21:26,502][62635] Updated weights for policy 1, policy_version 14500 (0.0007) [2023-10-12 16:21:26,662][62634] Updated weights for policy 0, policy_version 14510 (0.0007) [2023-10-12 16:21:26,876][62635] Updated weights for policy 1, policy_version 14510 (0.0008) [2023-10-12 16:21:27,030][62634] Updated weights for policy 0, policy_version 14520 (0.0007) [2023-10-12 16:21:27,246][62635] Updated weights for policy 1, policy_version 14520 (0.0007) [2023-10-12 16:21:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 29753344. Throughput: 0: 1664.4, 1: 1656.7. Samples: 7442100. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-12 16:21:28,435][61643] Avg episode reward: [(0, '3.730'), (1, '5.830')] [2023-10-12 16:21:31,178][62634] Updated weights for policy 0, policy_version 14530 (0.0008) [2023-10-12 16:21:31,394][62635] Updated weights for policy 1, policy_version 14530 (0.0008) [2023-10-12 16:21:31,549][62634] Updated weights for policy 0, policy_version 14540 (0.0010) [2023-10-12 16:21:31,804][62635] Updated weights for policy 1, policy_version 14540 (0.0007) [2023-10-12 16:21:31,924][62634] Updated weights for policy 0, policy_version 14550 (0.0007) [2023-10-12 16:21:32,173][62635] Updated weights for policy 1, policy_version 14550 (0.0009) [2023-10-12 16:21:32,306][62634] Updated weights for policy 0, policy_version 14560 (0.0007) [2023-10-12 16:21:32,541][62635] Updated weights for policy 1, policy_version 14560 (0.0008) [2023-10-12 16:21:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 29818880. Throughput: 0: 1674.5, 1: 1664.2. Samples: 7461366. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-12 16:21:33,435][61643] Avg episode reward: [(0, '3.610'), (1, '6.440')] [2023-10-12 16:21:33,445][62495] Saving new best policy, reward=6.440! [2023-10-12 16:21:36,337][62634] Updated weights for policy 0, policy_version 14570 (0.0008) [2023-10-12 16:21:36,704][62634] Updated weights for policy 0, policy_version 14580 (0.0009) [2023-10-12 16:21:36,735][62635] Updated weights for policy 1, policy_version 14570 (0.0007) [2023-10-12 16:21:37,081][62634] Updated weights for policy 0, policy_version 14590 (0.0009) [2023-10-12 16:21:37,103][62635] Updated weights for policy 1, policy_version 14580 (0.0007) [2023-10-12 16:21:37,472][62635] Updated weights for policy 1, policy_version 14590 (0.0011) [2023-10-12 16:21:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 29884416. Throughput: 0: 1685.9, 1: 1668.4. Samples: 7472894. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-12 16:21:38,435][61643] Avg episode reward: [(0, '3.640'), (1, '6.380')] [2023-10-12 16:21:41,050][62634] Updated weights for policy 0, policy_version 14600 (0.0007) [2023-10-12 16:21:41,423][62634] Updated weights for policy 0, policy_version 14610 (0.0009) [2023-10-12 16:21:41,460][62635] Updated weights for policy 1, policy_version 14600 (0.0009) [2023-10-12 16:21:41,798][62634] Updated weights for policy 0, policy_version 14620 (0.0008) [2023-10-12 16:21:41,831][62635] Updated weights for policy 1, policy_version 14610 (0.0007) [2023-10-12 16:21:42,207][62635] Updated weights for policy 1, policy_version 14620 (0.0009) [2023-10-12 16:21:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 29949952. Throughput: 0: 1664.4, 1: 1654.7. Samples: 7491660. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-12 16:21:43,435][61643] Avg episode reward: [(0, '3.730'), (1, '6.210')] [2023-10-12 16:21:45,605][62634] Updated weights for policy 0, policy_version 14630 (0.0008) [2023-10-12 16:21:45,980][62634] Updated weights for policy 0, policy_version 14640 (0.0009) [2023-10-12 16:21:46,078][62635] Updated weights for policy 1, policy_version 14630 (0.0007) [2023-10-12 16:21:46,355][62634] Updated weights for policy 0, policy_version 14650 (0.0009) [2023-10-12 16:21:46,446][62635] Updated weights for policy 1, policy_version 14640 (0.0008) [2023-10-12 16:21:46,817][62635] Updated weights for policy 1, policy_version 14650 (0.0010) [2023-10-12 16:21:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 30015488. Throughput: 0: 1695.2, 1: 1676.9. Samples: 7512512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:21:48,435][61643] Avg episode reward: [(0, '3.860'), (1, '6.270')] [2023-10-12 16:21:48,445][62354] Saving new best policy, reward=3.860! 
[2023-10-12 16:21:50,411][62634] Updated weights for policy 0, policy_version 14660 (0.0008) [2023-10-12 16:21:50,794][62634] Updated weights for policy 0, policy_version 14670 (0.0011) [2023-10-12 16:21:51,054][62635] Updated weights for policy 1, policy_version 14660 (0.0009) [2023-10-12 16:21:51,166][62634] Updated weights for policy 0, policy_version 14680 (0.0009) [2023-10-12 16:21:51,429][62635] Updated weights for policy 1, policy_version 14670 (0.0008) [2023-10-12 16:21:51,789][62635] Updated weights for policy 1, policy_version 14680 (0.0009) [2023-10-12 16:21:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30081024. Throughput: 0: 1676.0, 1: 1673.1. Samples: 7523282. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:21:53,435][61643] Avg episode reward: [(0, '3.820'), (1, '6.360')] [2023-10-12 16:21:55,308][62634] Updated weights for policy 0, policy_version 14690 (0.0008) [2023-10-12 16:21:55,682][62634] Updated weights for policy 0, policy_version 14700 (0.0009) [2023-10-12 16:21:55,762][62635] Updated weights for policy 1, policy_version 14690 (0.0009) [2023-10-12 16:21:56,068][62634] Updated weights for policy 0, policy_version 14710 (0.0010) [2023-10-12 16:21:56,129][62635] Updated weights for policy 1, policy_version 14700 (0.0009) [2023-10-12 16:21:56,443][62634] Updated weights for policy 0, policy_version 14720 (0.0009) [2023-10-12 16:21:56,499][62635] Updated weights for policy 1, policy_version 14710 (0.0008) [2023-10-12 16:21:56,867][62635] Updated weights for policy 1, policy_version 14720 (0.0007) [2023-10-12 16:21:58,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 30146560. Throughput: 0: 1676.0, 1: 1656.3. Samples: 7542140. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:21:58,436][61643] Avg episode reward: [(0, '3.850'), (1, '6.370')] [2023-10-12 16:22:00,555][62634] Updated weights for policy 0, policy_version 14730 (0.0010) [2023-10-12 16:22:00,935][62634] Updated weights for policy 0, policy_version 14740 (0.0009) [2023-10-12 16:22:01,001][62635] Updated weights for policy 1, policy_version 14730 (0.0008) [2023-10-12 16:22:01,311][62634] Updated weights for policy 0, policy_version 14750 (0.0009) [2023-10-12 16:22:01,370][62635] Updated weights for policy 1, policy_version 14740 (0.0009) [2023-10-12 16:22:01,748][62635] Updated weights for policy 1, policy_version 14750 (0.0008) [2023-10-12 16:22:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30212096. Throughput: 0: 1688.3, 1: 1677.2. Samples: 7562508. Policy #0 lag: (min: 5.0, avg: 5.1, max: 11.0) [2023-10-12 16:22:03,435][61643] Avg episode reward: [(0, '3.770'), (1, '6.590')] [2023-10-12 16:22:03,441][62495] Saving new best policy, reward=6.590! [2023-10-12 16:22:05,357][62634] Updated weights for policy 0, policy_version 14760 (0.0008) [2023-10-12 16:22:05,726][62634] Updated weights for policy 0, policy_version 14770 (0.0007) [2023-10-12 16:22:05,783][62635] Updated weights for policy 1, policy_version 14760 (0.0008) [2023-10-12 16:22:06,101][62634] Updated weights for policy 0, policy_version 14780 (0.0010) [2023-10-12 16:22:06,149][62635] Updated weights for policy 1, policy_version 14770 (0.0008) [2023-10-12 16:22:06,509][62635] Updated weights for policy 1, policy_version 14780 (0.0009) [2023-10-12 16:22:08,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30277632. Throughput: 0: 1668.7, 1: 1666.5. Samples: 7572782. Policy #0 lag: (min: 5.0, avg: 5.1, max: 11.0) [2023-10-12 16:22:08,436][61643] Avg episode reward: [(0, '3.770'), (1, '6.790')] [2023-10-12 16:22:08,437][62495] Saving new best policy, reward=6.790! 
[2023-10-12 16:22:10,225][62634] Updated weights for policy 0, policy_version 14790 (0.0008) [2023-10-12 16:22:10,612][62634] Updated weights for policy 0, policy_version 14800 (0.0009) [2023-10-12 16:22:10,722][62635] Updated weights for policy 1, policy_version 14790 (0.0009) [2023-10-12 16:22:10,979][62634] Updated weights for policy 0, policy_version 14810 (0.0009) [2023-10-12 16:22:11,092][62635] Updated weights for policy 1, policy_version 14800 (0.0007) [2023-10-12 16:22:11,464][62635] Updated weights for policy 1, policy_version 14810 (0.0008) [2023-10-12 16:22:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30343168. Throughput: 0: 1667.5, 1: 1665.2. Samples: 7592074. Policy #0 lag: (min: 5.0, avg: 5.1, max: 11.0) [2023-10-12 16:22:13,435][61643] Avg episode reward: [(0, '3.720'), (1, '6.600')] [2023-10-12 16:22:15,006][62634] Updated weights for policy 0, policy_version 14820 (0.0008) [2023-10-12 16:22:15,345][62635] Updated weights for policy 1, policy_version 14820 (0.0008) [2023-10-12 16:22:15,382][62634] Updated weights for policy 0, policy_version 14830 (0.0008) [2023-10-12 16:22:15,721][62635] Updated weights for policy 1, policy_version 14830 (0.0008) [2023-10-12 16:22:15,750][62634] Updated weights for policy 0, policy_version 14840 (0.0009) [2023-10-12 16:22:16,086][62635] Updated weights for policy 1, policy_version 14840 (0.0009) [2023-10-12 16:22:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30408704. Throughput: 0: 1685.6, 1: 1681.9. Samples: 7612902. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 16:22:18,436][61643] Avg episode reward: [(0, '3.780'), (1, '7.080')] [2023-10-12 16:22:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000014848_15204352.pth... [2023-10-12 16:22:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000014848_15204352.pth... 
[2023-10-12 16:22:18,481][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000013280_13598720.pth [2023-10-12 16:22:18,481][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000013280_13598720.pth [2023-10-12 16:22:18,485][62495] Saving new best policy, reward=7.080! [2023-10-12 16:22:19,739][62634] Updated weights for policy 0, policy_version 14850 (0.0008) [2023-10-12 16:22:20,124][62634] Updated weights for policy 0, policy_version 14860 (0.0009) [2023-10-12 16:22:20,245][62635] Updated weights for policy 1, policy_version 14850 (0.0007) [2023-10-12 16:22:20,498][62634] Updated weights for policy 0, policy_version 14870 (0.0007) [2023-10-12 16:22:20,625][62635] Updated weights for policy 1, policy_version 14860 (0.0007) [2023-10-12 16:22:20,873][62634] Updated weights for policy 0, policy_version 14880 (0.0007) [2023-10-12 16:22:20,998][62635] Updated weights for policy 1, policy_version 14870 (0.0008) [2023-10-12 16:22:21,371][62635] Updated weights for policy 1, policy_version 14880 (0.0011) [2023-10-12 16:22:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30474240. Throughput: 0: 1652.9, 1: 1665.2. Samples: 7622210. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 16:22:23,435][61643] Avg episode reward: [(0, '3.710'), (1, '6.960')] [2023-10-12 16:22:25,106][62634] Updated weights for policy 0, policy_version 14890 (0.0007) [2023-10-12 16:22:25,356][62635] Updated weights for policy 1, policy_version 14890 (0.0008) [2023-10-12 16:22:25,485][62634] Updated weights for policy 0, policy_version 14900 (0.0007) [2023-10-12 16:22:25,723][62635] Updated weights for policy 1, policy_version 14900 (0.0009) [2023-10-12 16:22:25,867][62634] Updated weights for policy 0, policy_version 14910 (0.0007) [2023-10-12 16:22:26,085][62635] Updated weights for policy 1, policy_version 14910 (0.0009) [2023-10-12 16:22:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30539776. Throughput: 0: 1672.6, 1: 1677.0. Samples: 7642394. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 16:22:28,436][61643] Avg episode reward: [(0, '3.710'), (1, '7.190')] [2023-10-12 16:22:28,438][62495] Saving new best policy, reward=7.190! [2023-10-12 16:22:30,048][62634] Updated weights for policy 0, policy_version 14920 (0.0009) [2023-10-12 16:22:30,154][62635] Updated weights for policy 1, policy_version 14920 (0.0009) [2023-10-12 16:22:30,422][62634] Updated weights for policy 0, policy_version 14930 (0.0007) [2023-10-12 16:22:30,530][62635] Updated weights for policy 1, policy_version 14930 (0.0007) [2023-10-12 16:22:30,806][62634] Updated weights for policy 0, policy_version 14940 (0.0008) [2023-10-12 16:22:30,898][62635] Updated weights for policy 1, policy_version 14940 (0.0008) [2023-10-12 16:22:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30605312. Throughput: 0: 1659.7, 1: 1678.4. Samples: 7662726. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 16:22:33,435][61643] Avg episode reward: [(0, '3.640'), (1, '6.960')] [2023-10-12 16:22:35,012][62635] Updated weights for policy 1, policy_version 14950 (0.0008) [2023-10-12 16:22:35,066][62634] Updated weights for policy 0, policy_version 14950 (0.0007) [2023-10-12 16:22:35,388][62635] Updated weights for policy 1, policy_version 14960 (0.0009) [2023-10-12 16:22:35,440][62634] Updated weights for policy 0, policy_version 14960 (0.0007) [2023-10-12 16:22:35,748][62635] Updated weights for policy 1, policy_version 14970 (0.0009) [2023-10-12 16:22:35,814][62634] Updated weights for policy 0, policy_version 14970 (0.0007) [2023-10-12 16:22:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30670848. Throughput: 0: 1646.8, 1: 1653.6. Samples: 7671798. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 16:22:38,436][61643] Avg episode reward: [(0, '3.630'), (1, '7.110')] [2023-10-12 16:22:39,780][62635] Updated weights for policy 1, policy_version 14980 (0.0007) [2023-10-12 16:22:39,851][62634] Updated weights for policy 0, policy_version 14980 (0.0007) [2023-10-12 16:22:40,143][62635] Updated weights for policy 1, policy_version 14990 (0.0008) [2023-10-12 16:22:40,227][62634] Updated weights for policy 0, policy_version 14990 (0.0008) [2023-10-12 16:22:40,511][62635] Updated weights for policy 1, policy_version 15000 (0.0008) [2023-10-12 16:22:40,607][62634] Updated weights for policy 0, policy_version 15000 (0.0009) [2023-10-12 16:22:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30736384. Throughput: 0: 1664.0, 1: 1675.8. Samples: 7692434. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 16:22:43,436][61643] Avg episode reward: [(0, '3.730'), (1, '7.340')] [2023-10-12 16:22:43,437][62495] Saving new best policy, reward=7.340! 
[2023-10-12 16:22:44,543][62634] Updated weights for policy 0, policy_version 15010 (0.0007) [2023-10-12 16:22:44,623][62635] Updated weights for policy 1, policy_version 15010 (0.0009) [2023-10-12 16:22:44,956][62634] Updated weights for policy 0, policy_version 15020 (0.0008) [2023-10-12 16:22:44,986][62635] Updated weights for policy 1, policy_version 15020 (0.0007) [2023-10-12 16:22:45,323][62634] Updated weights for policy 0, policy_version 15030 (0.0009) [2023-10-12 16:22:45,350][62635] Updated weights for policy 1, policy_version 15030 (0.0009) [2023-10-12 16:22:45,700][62634] Updated weights for policy 0, policy_version 15040 (0.0008) [2023-10-12 16:22:45,720][62635] Updated weights for policy 1, policy_version 15040 (0.0007) [2023-10-12 16:22:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30801920. Throughput: 0: 1669.9, 1: 1677.1. Samples: 7713122. Policy #0 lag: (min: 31.0, avg: 31.0, max: 36.0) [2023-10-12 16:22:48,435][61643] Avg episode reward: [(0, '3.770'), (1, '7.230')] [2023-10-12 16:22:49,828][62635] Updated weights for policy 1, policy_version 15050 (0.0008) [2023-10-12 16:22:49,849][62634] Updated weights for policy 0, policy_version 15050 (0.0009) [2023-10-12 16:22:50,199][62635] Updated weights for policy 1, policy_version 15060 (0.0007) [2023-10-12 16:22:50,221][62634] Updated weights for policy 0, policy_version 15060 (0.0008) [2023-10-12 16:22:50,572][62635] Updated weights for policy 1, policy_version 15070 (0.0008) [2023-10-12 16:22:50,593][62634] Updated weights for policy 0, policy_version 15070 (0.0009) [2023-10-12 16:22:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 30867456. Throughput: 0: 1658.7, 1: 1657.2. Samples: 7721998. Policy #0 lag: (min: 31.0, avg: 31.0, max: 36.0) [2023-10-12 16:22:53,436][61643] Avg episode reward: [(0, '3.830'), (1, '7.390')] [2023-10-12 16:22:53,438][62495] Saving new best policy, reward=7.390! 
[2023-10-12 16:22:54,771][62634] Updated weights for policy 0, policy_version 15080 (0.0007) [2023-10-12 16:22:54,788][62635] Updated weights for policy 1, policy_version 15080 (0.0009) [2023-10-12 16:22:55,149][62634] Updated weights for policy 0, policy_version 15090 (0.0008) [2023-10-12 16:22:55,152][62635] Updated weights for policy 1, policy_version 15090 (0.0008) [2023-10-12 16:22:55,531][62635] Updated weights for policy 1, policy_version 15100 (0.0007) [2023-10-12 16:22:55,532][62634] Updated weights for policy 0, policy_version 15100 (0.0009) [2023-10-12 16:22:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 30932992. Throughput: 0: 1668.6, 1: 1672.8. Samples: 7742436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 36.0) [2023-10-12 16:22:58,436][61643] Avg episode reward: [(0, '3.820'), (1, '7.230')] [2023-10-12 16:22:59,544][62635] Updated weights for policy 1, policy_version 15110 (0.0007) [2023-10-12 16:22:59,583][62634] Updated weights for policy 0, policy_version 15110 (0.0008) [2023-10-12 16:22:59,920][62635] Updated weights for policy 1, policy_version 15120 (0.0007) [2023-10-12 16:22:59,966][62634] Updated weights for policy 0, policy_version 15120 (0.0008) [2023-10-12 16:23:00,300][62635] Updated weights for policy 1, policy_version 15130 (0.0007) [2023-10-12 16:23:00,343][62634] Updated weights for policy 0, policy_version 15130 (0.0009) [2023-10-12 16:23:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30998528. Throughput: 0: 1659.7, 1: 1677.7. Samples: 7763086. 
Policy #0 lag: (min: 24.0, avg: 44.9, max: 56.0) [2023-10-12 16:23:03,435][61643] Avg episode reward: [(0, '3.830'), (1, '7.070')] [2023-10-12 16:23:04,340][62635] Updated weights for policy 1, policy_version 15140 (0.0008) [2023-10-12 16:23:04,482][62634] Updated weights for policy 0, policy_version 15140 (0.0008) [2023-10-12 16:23:04,708][62635] Updated weights for policy 1, policy_version 15150 (0.0008) [2023-10-12 16:23:04,861][62634] Updated weights for policy 0, policy_version 15150 (0.0008) [2023-10-12 16:23:05,077][62635] Updated weights for policy 1, policy_version 15160 (0.0008) [2023-10-12 16:23:05,236][62634] Updated weights for policy 0, policy_version 15160 (0.0009) [2023-10-12 16:23:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 31064064. Throughput: 0: 1662.0, 1: 1669.7. Samples: 7772136. Policy #0 lag: (min: 24.0, avg: 44.9, max: 56.0) [2023-10-12 16:23:08,436][61643] Avg episode reward: [(0, '3.730'), (1, '7.060')] [2023-10-12 16:23:09,255][62634] Updated weights for policy 0, policy_version 15170 (0.0007) [2023-10-12 16:23:09,331][62635] Updated weights for policy 1, policy_version 15170 (0.0008) [2023-10-12 16:23:09,632][62634] Updated weights for policy 0, policy_version 15180 (0.0009) [2023-10-12 16:23:09,740][62635] Updated weights for policy 1, policy_version 15180 (0.0009) [2023-10-12 16:23:10,005][62634] Updated weights for policy 0, policy_version 15190 (0.0009) [2023-10-12 16:23:10,104][62635] Updated weights for policy 1, policy_version 15190 (0.0008) [2023-10-12 16:23:10,370][62634] Updated weights for policy 0, policy_version 15200 (0.0008) [2023-10-12 16:23:10,465][62635] Updated weights for policy 1, policy_version 15200 (0.0007) [2023-10-12 16:23:13,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31129600. Throughput: 0: 1664.0, 1: 1673.5. Samples: 7792580. 
Policy #0 lag: (min: 24.0, avg: 44.9, max: 56.0) [2023-10-12 16:23:13,436][61643] Avg episode reward: [(0, '3.710'), (1, '7.380')] [2023-10-12 16:23:14,342][62634] Updated weights for policy 0, policy_version 15210 (0.0008) [2023-10-12 16:23:14,542][62635] Updated weights for policy 1, policy_version 15210 (0.0007) [2023-10-12 16:23:14,723][62634] Updated weights for policy 0, policy_version 15220 (0.0007) [2023-10-12 16:23:14,905][62635] Updated weights for policy 1, policy_version 15220 (0.0009) [2023-10-12 16:23:15,097][62634] Updated weights for policy 0, policy_version 15230 (0.0007) [2023-10-12 16:23:15,276][62635] Updated weights for policy 1, policy_version 15230 (0.0007) [2023-10-12 16:23:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31195136. Throughput: 0: 1671.6, 1: 1676.1. Samples: 7813374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:23:18,436][61643] Avg episode reward: [(0, '3.750'), (1, '7.400')] [2023-10-12 16:23:18,449][62495] Saving new best policy, reward=7.400! [2023-10-12 16:23:19,037][62634] Updated weights for policy 0, policy_version 15240 (0.0008) [2023-10-12 16:23:19,249][62635] Updated weights for policy 1, policy_version 15240 (0.0008) [2023-10-12 16:23:19,414][62634] Updated weights for policy 0, policy_version 15250 (0.0007) [2023-10-12 16:23:19,607][62635] Updated weights for policy 1, policy_version 15250 (0.0007) [2023-10-12 16:23:19,787][62634] Updated weights for policy 0, policy_version 15260 (0.0007) [2023-10-12 16:23:19,979][62635] Updated weights for policy 1, policy_version 15260 (0.0009) [2023-10-12 16:23:23,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31260672. Throughput: 0: 1673.2, 1: 1674.8. Samples: 7822456. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:23:23,435][61643] Avg episode reward: [(0, '3.770'), (1, '7.580')] [2023-10-12 16:23:23,436][62495] Saving new best policy, reward=7.580! 
[2023-10-12 16:23:23,880][62634] Updated weights for policy 0, policy_version 15270 (0.0008) [2023-10-12 16:23:24,227][62635] Updated weights for policy 1, policy_version 15270 (0.0010) [2023-10-12 16:23:24,257][62634] Updated weights for policy 0, policy_version 15280 (0.0009) [2023-10-12 16:23:24,592][62635] Updated weights for policy 1, policy_version 15280 (0.0008) [2023-10-12 16:23:24,632][62634] Updated weights for policy 0, policy_version 15290 (0.0008) [2023-10-12 16:23:24,967][62635] Updated weights for policy 1, policy_version 15290 (0.0010) [2023-10-12 16:23:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31326208. Throughput: 0: 1671.1, 1: 1674.7. Samples: 7842996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:23:28,435][61643] Avg episode reward: [(0, '3.790'), (1, '7.710')] [2023-10-12 16:23:28,436][62495] Saving new best policy, reward=7.710! [2023-10-12 16:23:28,774][62634] Updated weights for policy 0, policy_version 15300 (0.0010) [2023-10-12 16:23:29,156][62634] Updated weights for policy 0, policy_version 15310 (0.0008) [2023-10-12 16:23:29,185][62635] Updated weights for policy 1, policy_version 15300 (0.0008) [2023-10-12 16:23:29,532][62634] Updated weights for policy 0, policy_version 15320 (0.0008) [2023-10-12 16:23:29,551][62635] Updated weights for policy 1, policy_version 15310 (0.0008) [2023-10-12 16:23:29,920][62635] Updated weights for policy 1, policy_version 15320 (0.0009) [2023-10-12 16:23:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 31391744. Throughput: 0: 1671.7, 1: 1670.2. Samples: 7863508. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:23:33,436][61643] Avg episode reward: [(0, '3.690'), (1, '7.950')] [2023-10-12 16:23:33,448][62495] Saving new best policy, reward=7.950! 
[2023-10-12 16:23:33,653][62634] Updated weights for policy 0, policy_version 15330 (0.0008) [2023-10-12 16:23:33,840][62635] Updated weights for policy 1, policy_version 15330 (0.0008) [2023-10-12 16:23:34,050][62634] Updated weights for policy 0, policy_version 15340 (0.0007) [2023-10-12 16:23:34,202][62635] Updated weights for policy 1, policy_version 15340 (0.0009) [2023-10-12 16:23:34,440][62634] Updated weights for policy 0, policy_version 15350 (0.0008) [2023-10-12 16:23:34,567][62635] Updated weights for policy 1, policy_version 15350 (0.0009) [2023-10-12 16:23:34,815][62634] Updated weights for policy 0, policy_version 15360 (0.0009) [2023-10-12 16:23:34,937][62635] Updated weights for policy 1, policy_version 15360 (0.0008) [2023-10-12 16:23:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31457280. Throughput: 0: 1674.0, 1: 1670.7. Samples: 7872508. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:23:38,435][61643] Avg episode reward: [(0, '3.790'), (1, '7.920')] [2023-10-12 16:23:38,735][62634] Updated weights for policy 0, policy_version 15370 (0.0009) [2023-10-12 16:23:39,027][62635] Updated weights for policy 1, policy_version 15370 (0.0008) [2023-10-12 16:23:39,106][62634] Updated weights for policy 0, policy_version 15380 (0.0010) [2023-10-12 16:23:39,393][62635] Updated weights for policy 1, policy_version 15380 (0.0008) [2023-10-12 16:23:39,495][62634] Updated weights for policy 0, policy_version 15390 (0.0007) [2023-10-12 16:23:39,762][62635] Updated weights for policy 1, policy_version 15390 (0.0011) [2023-10-12 16:23:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31522816. Throughput: 0: 1680.6, 1: 1671.5. Samples: 7893280. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:23:43,436][61643] Avg episode reward: [(0, '3.770'), (1, '7.770')] [2023-10-12 16:23:43,697][62634] Updated weights for policy 0, policy_version 15400 (0.0009) [2023-10-12 16:23:43,883][62635] Updated weights for policy 1, policy_version 15400 (0.0009) [2023-10-12 16:23:44,076][62634] Updated weights for policy 0, policy_version 15410 (0.0008) [2023-10-12 16:23:44,243][62635] Updated weights for policy 1, policy_version 15410 (0.0007) [2023-10-12 16:23:44,454][62634] Updated weights for policy 0, policy_version 15420 (0.0008) [2023-10-12 16:23:44,609][62635] Updated weights for policy 1, policy_version 15420 (0.0008) [2023-10-12 16:23:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31588352. Throughput: 0: 1683.0, 1: 1673.9. Samples: 7914146. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-12 16:23:48,435][61643] Avg episode reward: [(0, '3.780'), (1, '7.780')] [2023-10-12 16:23:48,468][62634] Updated weights for policy 0, policy_version 15430 (0.0009) [2023-10-12 16:23:48,645][62635] Updated weights for policy 1, policy_version 15430 (0.0007) [2023-10-12 16:23:48,847][62634] Updated weights for policy 0, policy_version 15440 (0.0008) [2023-10-12 16:23:49,007][62635] Updated weights for policy 1, policy_version 15440 (0.0009) [2023-10-12 16:23:49,224][62634] Updated weights for policy 0, policy_version 15450 (0.0009) [2023-10-12 16:23:49,377][62635] Updated weights for policy 1, policy_version 15450 (0.0009) [2023-10-12 16:23:53,124][62634] Updated weights for policy 0, policy_version 15460 (0.0008) [2023-10-12 16:23:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31653888. Throughput: 0: 1681.8, 1: 1672.0. Samples: 7923058. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-12 16:23:53,435][61643] Avg episode reward: [(0, '3.770'), (1, '7.870')] [2023-10-12 16:23:53,464][62635] Updated weights for policy 1, policy_version 15460 (0.0010) [2023-10-12 16:23:53,502][62634] Updated weights for policy 0, policy_version 15470 (0.0008) [2023-10-12 16:23:53,838][62635] Updated weights for policy 1, policy_version 15470 (0.0008) [2023-10-12 16:23:53,868][62634] Updated weights for policy 0, policy_version 15480 (0.0009) [2023-10-12 16:23:54,195][62635] Updated weights for policy 1, policy_version 15480 (0.0009) [2023-10-12 16:23:58,150][62634] Updated weights for policy 0, policy_version 15490 (0.0008) [2023-10-12 16:23:58,398][62635] Updated weights for policy 1, policy_version 15490 (0.0008) [2023-10-12 16:23:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31719424. Throughput: 0: 1685.9, 1: 1677.0. Samples: 7943910. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-12 16:23:58,435][61643] Avg episode reward: [(0, '3.700'), (1, '7.700')] [2023-10-12 16:23:58,521][62634] Updated weights for policy 0, policy_version 15500 (0.0007) [2023-10-12 16:23:58,792][62635] Updated weights for policy 1, policy_version 15500 (0.0007) [2023-10-12 16:23:58,897][62634] Updated weights for policy 0, policy_version 15510 (0.0008) [2023-10-12 16:23:59,167][62635] Updated weights for policy 1, policy_version 15510 (0.0009) [2023-10-12 16:23:59,276][62634] Updated weights for policy 0, policy_version 15520 (0.0010) [2023-10-12 16:23:59,531][62635] Updated weights for policy 1, policy_version 15520 (0.0007) [2023-10-12 16:24:03,357][62634] Updated weights for policy 0, policy_version 15530 (0.0008) [2023-10-12 16:24:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31784960. Throughput: 0: 1680.1, 1: 1674.7. Samples: 7964336. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:24:03,435][61643] Avg episode reward: [(0, '3.710'), (1, '7.580')] [2023-10-12 16:24:03,638][62635] Updated weights for policy 1, policy_version 15530 (0.0007) [2023-10-12 16:24:03,733][62634] Updated weights for policy 0, policy_version 15540 (0.0007) [2023-10-12 16:24:04,001][62635] Updated weights for policy 1, policy_version 15540 (0.0007) [2023-10-12 16:24:04,115][62634] Updated weights for policy 0, policy_version 15550 (0.0008) [2023-10-12 16:24:04,370][62635] Updated weights for policy 1, policy_version 15550 (0.0008) [2023-10-12 16:24:08,342][62634] Updated weights for policy 0, policy_version 15560 (0.0009) [2023-10-12 16:24:08,422][62635] Updated weights for policy 1, policy_version 15560 (0.0009) [2023-10-12 16:24:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31850496. Throughput: 0: 1675.2, 1: 1677.3. Samples: 7973322. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:24:08,435][61643] Avg episode reward: [(0, '3.800'), (1, '7.610')] [2023-10-12 16:24:08,717][62634] Updated weights for policy 0, policy_version 15570 (0.0008) [2023-10-12 16:24:08,792][62635] Updated weights for policy 1, policy_version 15570 (0.0009) [2023-10-12 16:24:09,093][62634] Updated weights for policy 0, policy_version 15580 (0.0009) [2023-10-12 16:24:09,153][62635] Updated weights for policy 1, policy_version 15580 (0.0008) [2023-10-12 16:24:13,158][62634] Updated weights for policy 0, policy_version 15590 (0.0008) [2023-10-12 16:24:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31916032. Throughput: 0: 1678.1, 1: 1675.5. Samples: 7993908. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:24:13,436][61643] Avg episode reward: [(0, '3.810'), (1, '7.560')] [2023-10-12 16:24:13,464][62635] Updated weights for policy 1, policy_version 15590 (0.0009) [2023-10-12 16:24:13,531][62634] Updated weights for policy 0, policy_version 15600 (0.0009) [2023-10-12 16:24:13,829][62635] Updated weights for policy 1, policy_version 15600 (0.0009) [2023-10-12 16:24:13,912][62634] Updated weights for policy 0, policy_version 15610 (0.0007) [2023-10-12 16:24:14,198][62635] Updated weights for policy 1, policy_version 15610 (0.0007) [2023-10-12 16:24:18,077][62634] Updated weights for policy 0, policy_version 15620 (0.0008) [2023-10-12 16:24:18,241][62635] Updated weights for policy 1, policy_version 15620 (0.0007) [2023-10-12 16:24:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31981568. Throughput: 0: 1673.3, 1: 1676.6. Samples: 8014252. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:24:18,435][61643] Avg episode reward: [(0, '3.890'), (1, '7.760')] [2023-10-12 16:24:18,454][62634] Updated weights for policy 0, policy_version 15630 (0.0008) [2023-10-12 16:24:18,605][62635] Updated weights for policy 1, policy_version 15630 (0.0007) [2023-10-12 16:24:18,825][62634] Updated weights for policy 0, policy_version 15640 (0.0007) [2023-10-12 16:24:18,976][62635] Updated weights for policy 1, policy_version 15640 (0.0008) [2023-10-12 16:24:19,124][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000015648_16023552.pth... [2023-10-12 16:24:19,153][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000014080_14417920.pth [2023-10-12 16:24:19,156][62354] Saving new best policy, reward=3.890! 
[2023-10-12 16:24:19,189][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000015648_16023552.pth [2023-10-12 16:24:19,266][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000015648_16023552.pth... [2023-10-12 16:24:19,296][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000014080_14417920.pth [2023-10-12 16:24:19,299][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000015648_16023552.pth [2023-10-12 16:24:22,973][62634] Updated weights for policy 0, policy_version 15650 (0.0007) [2023-10-12 16:24:23,065][62635] Updated weights for policy 1, policy_version 15650 (0.0008) [2023-10-12 16:24:23,379][62634] Updated weights for policy 0, policy_version 15660 (0.0007) [2023-10-12 16:24:23,420][62635] Updated weights for policy 1, policy_version 15660 (0.0007) [2023-10-12 16:24:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32047104. Throughput: 0: 1674.7, 1: 1676.4. Samples: 8023304. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:24:23,436][61643] Avg episode reward: [(0, '3.810'), (1, '7.850')] [2023-10-12 16:24:23,759][62634] Updated weights for policy 0, policy_version 15670 (0.0008) [2023-10-12 16:24:23,791][62635] Updated weights for policy 1, policy_version 15670 (0.0007) [2023-10-12 16:24:24,131][62634] Updated weights for policy 0, policy_version 15680 (0.0007) [2023-10-12 16:24:24,155][62635] Updated weights for policy 1, policy_version 15680 (0.0007) [2023-10-12 16:24:28,096][62634] Updated weights for policy 0, policy_version 15690 (0.0008) [2023-10-12 16:24:28,246][62635] Updated weights for policy 1, policy_version 15690 (0.0010) [2023-10-12 16:24:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32112640. Throughput: 0: 1667.8, 1: 1679.8. Samples: 8043924. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:24:28,436][61643] Avg episode reward: [(0, '3.820'), (1, '8.030')] [2023-10-12 16:24:28,483][62634] Updated weights for policy 0, policy_version 15700 (0.0007) [2023-10-12 16:24:28,618][62635] Updated weights for policy 1, policy_version 15700 (0.0008) [2023-10-12 16:24:28,862][62634] Updated weights for policy 0, policy_version 15710 (0.0008) [2023-10-12 16:24:28,990][62635] Updated weights for policy 1, policy_version 15710 (0.0007) [2023-10-12 16:24:29,056][62495] Saving new best policy, reward=8.030! [2023-10-12 16:24:32,789][62634] Updated weights for policy 0, policy_version 15720 (0.0009) [2023-10-12 16:24:33,029][62635] Updated weights for policy 1, policy_version 15720 (0.0008) [2023-10-12 16:24:33,163][62634] Updated weights for policy 0, policy_version 15730 (0.0007) [2023-10-12 16:24:33,396][62635] Updated weights for policy 1, policy_version 15730 (0.0009) [2023-10-12 16:24:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32178176. Throughput: 0: 1657.9, 1: 1667.3. Samples: 8063780. Policy #0 lag: (min: 1.0, avg: 8.7, max: 33.0) [2023-10-12 16:24:33,435][61643] Avg episode reward: [(0, '3.850'), (1, '7.960')] [2023-10-12 16:24:33,534][62634] Updated weights for policy 0, policy_version 15740 (0.0009) [2023-10-12 16:24:33,764][62635] Updated weights for policy 1, policy_version 15740 (0.0009) [2023-10-12 16:24:37,672][62634] Updated weights for policy 0, policy_version 15750 (0.0008) [2023-10-12 16:24:37,922][62635] Updated weights for policy 1, policy_version 15750 (0.0009) [2023-10-12 16:24:38,045][62634] Updated weights for policy 0, policy_version 15760 (0.0007) [2023-10-12 16:24:38,284][62635] Updated weights for policy 1, policy_version 15760 (0.0009) [2023-10-12 16:24:38,423][62634] Updated weights for policy 0, policy_version 15770 (0.0008) [2023-10-12 16:24:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 32243712. Throughput: 0: 1667.4, 1: 1676.6. Samples: 8073538. Policy #0 lag: (min: 1.0, avg: 8.7, max: 33.0) [2023-10-12 16:24:38,435][61643] Avg episode reward: [(0, '3.840'), (1, '7.810')] [2023-10-12 16:24:38,651][62635] Updated weights for policy 1, policy_version 15770 (0.0009) [2023-10-12 16:24:42,529][62634] Updated weights for policy 0, policy_version 15780 (0.0008) [2023-10-12 16:24:42,706][62635] Updated weights for policy 1, policy_version 15780 (0.0009) [2023-10-12 16:24:42,911][62634] Updated weights for policy 0, policy_version 15790 (0.0010) [2023-10-12 16:24:43,071][62635] Updated weights for policy 1, policy_version 15790 (0.0007) [2023-10-12 16:24:43,290][62634] Updated weights for policy 0, policy_version 15800 (0.0008) [2023-10-12 16:24:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32309248. Throughput: 0: 1658.9, 1: 1675.0. Samples: 8093936. Policy #0 lag: (min: 1.0, avg: 8.7, max: 33.0) [2023-10-12 16:24:43,435][61643] Avg episode reward: [(0, '3.820'), (1, '8.080')] [2023-10-12 16:24:43,440][62635] Updated weights for policy 1, policy_version 15800 (0.0008) [2023-10-12 16:24:43,737][62495] Saving new best policy, reward=8.080! [2023-10-12 16:24:47,378][62634] Updated weights for policy 0, policy_version 15810 (0.0008) [2023-10-12 16:24:47,486][62635] Updated weights for policy 1, policy_version 15810 (0.0008) [2023-10-12 16:24:47,760][62634] Updated weights for policy 0, policy_version 15820 (0.0007) [2023-10-12 16:24:47,885][62635] Updated weights for policy 1, policy_version 15820 (0.0009) [2023-10-12 16:24:48,138][62634] Updated weights for policy 0, policy_version 15830 (0.0007) [2023-10-12 16:24:48,254][62635] Updated weights for policy 1, policy_version 15830 (0.0009) [2023-10-12 16:24:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32374784. Throughput: 0: 1649.1, 1: 1662.1. Samples: 8113344. 
Policy #0 lag: (min: 2.0, avg: 2.3, max: 14.0) [2023-10-12 16:24:48,436][61643] Avg episode reward: [(0, '3.780'), (1, '7.820')] [2023-10-12 16:24:48,503][62634] Updated weights for policy 0, policy_version 15840 (0.0007) [2023-10-12 16:24:48,619][62635] Updated weights for policy 1, policy_version 15840 (0.0008) [2023-10-12 16:24:52,422][62634] Updated weights for policy 0, policy_version 15850 (0.0010) [2023-10-12 16:24:52,550][62635] Updated weights for policy 1, policy_version 15850 (0.0009) [2023-10-12 16:24:52,791][62634] Updated weights for policy 0, policy_version 15860 (0.0009) [2023-10-12 16:24:52,918][62635] Updated weights for policy 1, policy_version 15860 (0.0009) [2023-10-12 16:24:53,171][62634] Updated weights for policy 0, policy_version 15870 (0.0008) [2023-10-12 16:24:53,284][62635] Updated weights for policy 1, policy_version 15870 (0.0007) [2023-10-12 16:24:53,435][61643] Fps is (10 sec: 19660.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 32505856. Throughput: 0: 1667.7, 1: 1674.5. Samples: 8123724. Policy #0 lag: (min: 2.0, avg: 2.3, max: 14.0) [2023-10-12 16:24:53,436][61643] Avg episode reward: [(0, '3.770'), (1, '7.700')] [2023-10-12 16:24:57,191][62634] Updated weights for policy 0, policy_version 15880 (0.0010) [2023-10-12 16:24:57,495][62635] Updated weights for policy 1, policy_version 15880 (0.0009) [2023-10-12 16:24:57,560][62634] Updated weights for policy 0, policy_version 15890 (0.0009) [2023-10-12 16:24:57,868][62635] Updated weights for policy 1, policy_version 15890 (0.0009) [2023-10-12 16:24:57,952][62634] Updated weights for policy 0, policy_version 15900 (0.0007) [2023-10-12 16:24:58,239][62635] Updated weights for policy 1, policy_version 15900 (0.0007) [2023-10-12 16:24:58,435][61643] Fps is (10 sec: 19660.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 32571392. Throughput: 0: 1662.7, 1: 1677.0. Samples: 8144194. 
Policy #0 lag: (min: 0.0, avg: 20.1, max: 32.0) [2023-10-12 16:24:58,436][61643] Avg episode reward: [(0, '3.850'), (1, '7.480')] [2023-10-12 16:25:01,936][62634] Updated weights for policy 0, policy_version 15910 (0.0008) [2023-10-12 16:25:02,297][62635] Updated weights for policy 1, policy_version 15910 (0.0008) [2023-10-12 16:25:02,313][62634] Updated weights for policy 0, policy_version 15920 (0.0009) [2023-10-12 16:25:02,658][62635] Updated weights for policy 1, policy_version 15920 (0.0008) [2023-10-12 16:25:02,697][62634] Updated weights for policy 0, policy_version 15930 (0.0008) [2023-10-12 16:25:03,031][62635] Updated weights for policy 1, policy_version 15930 (0.0008) [2023-10-12 16:25:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 32636928. Throughput: 0: 1642.6, 1: 1654.5. Samples: 8162622. Policy #0 lag: (min: 0.0, avg: 20.1, max: 32.0) [2023-10-12 16:25:03,436][61643] Avg episode reward: [(0, '3.940'), (1, '7.860')] [2023-10-12 16:25:03,445][62354] Saving new best policy, reward=3.940! [2023-10-12 16:25:06,778][62634] Updated weights for policy 0, policy_version 15940 (0.0007) [2023-10-12 16:25:07,166][62634] Updated weights for policy 0, policy_version 15950 (0.0009) [2023-10-12 16:25:07,270][62635] Updated weights for policy 1, policy_version 15940 (0.0009) [2023-10-12 16:25:07,533][62634] Updated weights for policy 0, policy_version 15960 (0.0007) [2023-10-12 16:25:07,646][62635] Updated weights for policy 1, policy_version 15950 (0.0008) [2023-10-12 16:25:08,004][62635] Updated weights for policy 1, policy_version 15960 (0.0009) [2023-10-12 16:25:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 32702464. Throughput: 0: 1673.2, 1: 1676.6. Samples: 8174048. Policy #0 lag: (min: 0.0, avg: 20.1, max: 32.0) [2023-10-12 16:25:08,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.390')] [2023-10-12 16:25:08,437][62354] Saving new best policy, reward=3.950! 
[2023-10-12 16:25:08,437][62495] Saving new best policy, reward=8.390! [2023-10-12 16:25:11,672][62634] Updated weights for policy 0, policy_version 15970 (0.0008) [2023-10-12 16:25:11,957][62635] Updated weights for policy 1, policy_version 15970 (0.0008) [2023-10-12 16:25:12,058][62634] Updated weights for policy 0, policy_version 15980 (0.0009) [2023-10-12 16:25:12,329][62635] Updated weights for policy 1, policy_version 15980 (0.0009) [2023-10-12 16:25:12,423][62634] Updated weights for policy 0, policy_version 15990 (0.0008) [2023-10-12 16:25:12,698][62635] Updated weights for policy 1, policy_version 15990 (0.0009) [2023-10-12 16:25:12,796][62634] Updated weights for policy 0, policy_version 16000 (0.0007) [2023-10-12 16:25:13,068][62635] Updated weights for policy 1, policy_version 16000 (0.0008) [2023-10-12 16:25:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 32768000. Throughput: 0: 1666.6, 1: 1671.4. Samples: 8194134. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:25:13,435][61643] Avg episode reward: [(0, '3.920'), (1, '8.220')] [2023-10-12 16:25:17,027][62634] Updated weights for policy 0, policy_version 16010 (0.0008) [2023-10-12 16:25:17,222][62635] Updated weights for policy 1, policy_version 16010 (0.0011) [2023-10-12 16:25:17,406][62634] Updated weights for policy 0, policy_version 16020 (0.0007) [2023-10-12 16:25:17,594][62635] Updated weights for policy 1, policy_version 16020 (0.0007) [2023-10-12 16:25:17,787][62634] Updated weights for policy 0, policy_version 16030 (0.0007) [2023-10-12 16:25:17,959][62635] Updated weights for policy 1, policy_version 16030 (0.0009) [2023-10-12 16:25:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 32833536. Throughput: 0: 1655.5, 1: 1650.3. Samples: 8212540. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:25:18,436][61643] Avg episode reward: [(0, '3.920'), (1, '8.140')] [2023-10-12 16:25:21,859][62634] Updated weights for policy 0, policy_version 16040 (0.0008) [2023-10-12 16:25:21,994][62635] Updated weights for policy 1, policy_version 16040 (0.0009) [2023-10-12 16:25:22,240][62634] Updated weights for policy 0, policy_version 16050 (0.0010) [2023-10-12 16:25:22,358][62635] Updated weights for policy 1, policy_version 16050 (0.0009) [2023-10-12 16:25:22,622][62634] Updated weights for policy 0, policy_version 16060 (0.0009) [2023-10-12 16:25:22,739][62635] Updated weights for policy 1, policy_version 16060 (0.0007) [2023-10-12 16:25:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 32899072. Throughput: 0: 1676.4, 1: 1668.2. Samples: 8224044. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:25:23,435][61643] Avg episode reward: [(0, '3.830'), (1, '7.920')] [2023-10-12 16:25:26,705][62634] Updated weights for policy 0, policy_version 16070 (0.0008) [2023-10-12 16:25:26,894][62635] Updated weights for policy 1, policy_version 16070 (0.0008) [2023-10-12 16:25:27,079][62634] Updated weights for policy 0, policy_version 16080 (0.0007) [2023-10-12 16:25:27,255][62635] Updated weights for policy 1, policy_version 16080 (0.0008) [2023-10-12 16:25:27,459][62634] Updated weights for policy 0, policy_version 16090 (0.0009) [2023-10-12 16:25:27,622][62635] Updated weights for policy 1, policy_version 16090 (0.0009) [2023-10-12 16:25:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 32964608. Throughput: 0: 1670.2, 1: 1660.6. Samples: 8243824. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 16:25:28,435][61643] Avg episode reward: [(0, '3.840'), (1, '7.960')] [2023-10-12 16:25:31,521][62634] Updated weights for policy 0, policy_version 16100 (0.0007) [2023-10-12 16:25:31,768][62635] Updated weights for policy 1, policy_version 16100 (0.0007) [2023-10-12 16:25:31,901][62634] Updated weights for policy 0, policy_version 16110 (0.0007) [2023-10-12 16:25:32,137][62635] Updated weights for policy 1, policy_version 16110 (0.0009) [2023-10-12 16:25:32,275][62634] Updated weights for policy 0, policy_version 16120 (0.0008) [2023-10-12 16:25:32,504][62635] Updated weights for policy 1, policy_version 16120 (0.0009) [2023-10-12 16:25:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 33030144. Throughput: 0: 1664.9, 1: 1656.1. Samples: 8262788. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 16:25:33,435][61643] Avg episode reward: [(0, '3.840'), (1, '8.170')] [2023-10-12 16:25:36,406][62634] Updated weights for policy 0, policy_version 16130 (0.0009) [2023-10-12 16:25:36,629][62635] Updated weights for policy 1, policy_version 16130 (0.0008) [2023-10-12 16:25:36,779][62634] Updated weights for policy 0, policy_version 16140 (0.0008) [2023-10-12 16:25:37,038][62635] Updated weights for policy 1, policy_version 16140 (0.0007) [2023-10-12 16:25:37,151][62634] Updated weights for policy 0, policy_version 16150 (0.0007) [2023-10-12 16:25:37,414][62635] Updated weights for policy 1, policy_version 16150 (0.0007) [2023-10-12 16:25:37,516][62634] Updated weights for policy 0, policy_version 16160 (0.0007) [2023-10-12 16:25:37,774][62635] Updated weights for policy 1, policy_version 16160 (0.0007) [2023-10-12 16:25:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 33095680. Throughput: 0: 1678.9, 1: 1670.3. Samples: 8274438. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 16:25:38,436][61643] Avg episode reward: [(0, '3.860'), (1, '7.870')] [2023-10-12 16:25:41,592][62634] Updated weights for policy 0, policy_version 16170 (0.0007) [2023-10-12 16:25:41,895][62635] Updated weights for policy 1, policy_version 16170 (0.0009) [2023-10-12 16:25:41,967][62634] Updated weights for policy 0, policy_version 16180 (0.0007) [2023-10-12 16:25:42,265][62635] Updated weights for policy 1, policy_version 16180 (0.0008) [2023-10-12 16:25:42,335][62634] Updated weights for policy 0, policy_version 16190 (0.0008) [2023-10-12 16:25:42,648][62635] Updated weights for policy 1, policy_version 16190 (0.0010) [2023-10-12 16:25:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 33161216. Throughput: 0: 1663.5, 1: 1656.3. Samples: 8293586. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-12 16:25:43,435][61643] Avg episode reward: [(0, '3.850'), (1, '7.750')] [2023-10-12 16:25:46,483][62634] Updated weights for policy 0, policy_version 16200 (0.0010) [2023-10-12 16:25:46,778][62635] Updated weights for policy 1, policy_version 16200 (0.0007) [2023-10-12 16:25:46,863][62634] Updated weights for policy 0, policy_version 16210 (0.0008) [2023-10-12 16:25:47,149][62635] Updated weights for policy 1, policy_version 16210 (0.0009) [2023-10-12 16:25:47,237][62634] Updated weights for policy 0, policy_version 16220 (0.0008) [2023-10-12 16:25:47,515][62635] Updated weights for policy 1, policy_version 16220 (0.0009) [2023-10-12 16:25:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13329.3). Total num frames: 33226752. Throughput: 0: 1675.6, 1: 1660.0. Samples: 8312722. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-12 16:25:48,436][61643] Avg episode reward: [(0, '3.820'), (1, '8.040')] [2023-10-12 16:25:51,133][62634] Updated weights for policy 0, policy_version 16230 (0.0009) [2023-10-12 16:25:51,504][62634] Updated weights for policy 0, policy_version 16240 (0.0007) [2023-10-12 16:25:51,575][62635] Updated weights for policy 1, policy_version 16230 (0.0009) [2023-10-12 16:25:51,886][62634] Updated weights for policy 0, policy_version 16250 (0.0008) [2023-10-12 16:25:51,943][62635] Updated weights for policy 1, policy_version 16240 (0.0007) [2023-10-12 16:25:52,308][62635] Updated weights for policy 1, policy_version 16250 (0.0008) [2023-10-12 16:25:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 33292288. Throughput: 0: 1676.1, 1: 1667.7. Samples: 8324520. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-12 16:25:53,435][61643] Avg episode reward: [(0, '3.810'), (1, '8.090')] [2023-10-12 16:25:55,868][62634] Updated weights for policy 0, policy_version 16260 (0.0008) [2023-10-12 16:25:56,236][62634] Updated weights for policy 0, policy_version 16270 (0.0008) [2023-10-12 16:25:56,398][62635] Updated weights for policy 1, policy_version 16260 (0.0011) [2023-10-12 16:25:56,615][62634] Updated weights for policy 0, policy_version 16280 (0.0007) [2023-10-12 16:25:56,752][62635] Updated weights for policy 1, policy_version 16270 (0.0009) [2023-10-12 16:25:57,125][62635] Updated weights for policy 1, policy_version 16280 (0.0008) [2023-10-12 16:25:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33357824. Throughput: 0: 1660.3, 1: 1652.3. Samples: 8343198. 
Policy #0 lag: (min: 27.0, avg: 30.3, max: 59.0) [2023-10-12 16:25:58,436][61643] Avg episode reward: [(0, '3.800'), (1, '8.150')] [2023-10-12 16:26:00,694][62634] Updated weights for policy 0, policy_version 16290 (0.0008) [2023-10-12 16:26:00,962][62635] Updated weights for policy 1, policy_version 16290 (0.0007) [2023-10-12 16:26:01,085][62634] Updated weights for policy 0, policy_version 16300 (0.0009) [2023-10-12 16:26:01,320][62635] Updated weights for policy 1, policy_version 16300 (0.0008) [2023-10-12 16:26:01,462][62634] Updated weights for policy 0, policy_version 16310 (0.0009) [2023-10-12 16:26:01,695][62635] Updated weights for policy 1, policy_version 16310 (0.0008) [2023-10-12 16:26:01,831][62634] Updated weights for policy 0, policy_version 16320 (0.0009) [2023-10-12 16:26:02,064][62635] Updated weights for policy 1, policy_version 16320 (0.0009) [2023-10-12 16:26:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 33423360. Throughput: 0: 1681.3, 1: 1672.9. Samples: 8363480. Policy #0 lag: (min: 27.0, avg: 30.3, max: 59.0) [2023-10-12 16:26:03,435][61643] Avg episode reward: [(0, '3.830'), (1, '7.920')] [2023-10-12 16:26:05,900][62634] Updated weights for policy 0, policy_version 16330 (0.0008) [2023-10-12 16:26:06,225][62635] Updated weights for policy 1, policy_version 16330 (0.0008) [2023-10-12 16:26:06,280][62634] Updated weights for policy 0, policy_version 16340 (0.0010) [2023-10-12 16:26:06,599][62635] Updated weights for policy 1, policy_version 16340 (0.0008) [2023-10-12 16:26:06,649][62634] Updated weights for policy 0, policy_version 16350 (0.0010) [2023-10-12 16:26:06,968][62635] Updated weights for policy 1, policy_version 16350 (0.0009) [2023-10-12 16:26:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33488896. Throughput: 0: 1672.5, 1: 1674.4. Samples: 8374656. 
Policy #0 lag: (min: 27.0, avg: 30.3, max: 59.0) [2023-10-12 16:26:08,436][61643] Avg episode reward: [(0, '3.830'), (1, '8.050')] [2023-10-12 16:26:10,731][62634] Updated weights for policy 0, policy_version 16360 (0.0009) [2023-10-12 16:26:11,124][62634] Updated weights for policy 0, policy_version 16370 (0.0008) [2023-10-12 16:26:11,141][62635] Updated weights for policy 1, policy_version 16360 (0.0010) [2023-10-12 16:26:11,495][62634] Updated weights for policy 0, policy_version 16380 (0.0008) [2023-10-12 16:26:11,512][62635] Updated weights for policy 1, policy_version 16370 (0.0008) [2023-10-12 16:26:11,874][62635] Updated weights for policy 1, policy_version 16380 (0.0008) [2023-10-12 16:26:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33554432. Throughput: 0: 1665.2, 1: 1662.0. Samples: 8393544. Policy #0 lag: (min: 8.0, avg: 27.6, max: 40.0) [2023-10-12 16:26:13,436][61643] Avg episode reward: [(0, '3.830'), (1, '8.030')] [2023-10-12 16:26:15,432][62634] Updated weights for policy 0, policy_version 16390 (0.0009) [2023-10-12 16:26:15,817][62634] Updated weights for policy 0, policy_version 16400 (0.0008) [2023-10-12 16:26:15,954][62635] Updated weights for policy 1, policy_version 16390 (0.0007) [2023-10-12 16:26:16,190][62634] Updated weights for policy 0, policy_version 16410 (0.0008) [2023-10-12 16:26:16,317][62635] Updated weights for policy 1, policy_version 16400 (0.0008) [2023-10-12 16:26:16,692][62635] Updated weights for policy 1, policy_version 16410 (0.0011) [2023-10-12 16:26:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33619968. Throughput: 0: 1685.5, 1: 1677.1. Samples: 8414108. Policy #0 lag: (min: 8.0, avg: 27.6, max: 40.0) [2023-10-12 16:26:18,436][61643] Avg episode reward: [(0, '3.780'), (1, '8.180')] [2023-10-12 16:26:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000016416_16809984.pth... 
[2023-10-12 16:26:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000016416_16809984.pth... [2023-10-12 16:26:18,482][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000014848_15204352.pth [2023-10-12 16:26:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000014848_15204352.pth [2023-10-12 16:26:20,282][62634] Updated weights for policy 0, policy_version 16420 (0.0008) [2023-10-12 16:26:20,669][62634] Updated weights for policy 0, policy_version 16430 (0.0008) [2023-10-12 16:26:20,793][62635] Updated weights for policy 1, policy_version 16420 (0.0009) [2023-10-12 16:26:21,032][62634] Updated weights for policy 0, policy_version 16440 (0.0007) [2023-10-12 16:26:21,168][62635] Updated weights for policy 1, policy_version 16430 (0.0009) [2023-10-12 16:26:21,535][62635] Updated weights for policy 1, policy_version 16440 (0.0010) [2023-10-12 16:26:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33685504. Throughput: 0: 1663.8, 1: 1670.9. Samples: 8424502. 
Policy #0 lag: (min: 8.0, avg: 27.6, max: 40.0) [2023-10-12 16:26:23,435][61643] Avg episode reward: [(0, '3.800'), (1, '8.360')] [2023-10-12 16:26:25,135][62634] Updated weights for policy 0, policy_version 16450 (0.0008) [2023-10-12 16:26:25,479][62635] Updated weights for policy 1, policy_version 16450 (0.0009) [2023-10-12 16:26:25,519][62634] Updated weights for policy 0, policy_version 16460 (0.0007) [2023-10-12 16:26:25,851][62635] Updated weights for policy 1, policy_version 16460 (0.0007) [2023-10-12 16:26:25,883][62634] Updated weights for policy 0, policy_version 16470 (0.0007) [2023-10-12 16:26:26,215][62635] Updated weights for policy 1, policy_version 16470 (0.0007) [2023-10-12 16:26:26,261][62634] Updated weights for policy 0, policy_version 16480 (0.0007) [2023-10-12 16:26:26,587][62635] Updated weights for policy 1, policy_version 16480 (0.0008) [2023-10-12 16:26:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33751040. Throughput: 0: 1674.7, 1: 1669.5. Samples: 8444074. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 16:26:28,435][61643] Avg episode reward: [(0, '3.800'), (1, '8.500')] [2023-10-12 16:26:28,437][62495] Saving new best policy, reward=8.500! [2023-10-12 16:26:30,252][62634] Updated weights for policy 0, policy_version 16490 (0.0007) [2023-10-12 16:26:30,623][62634] Updated weights for policy 0, policy_version 16500 (0.0007) [2023-10-12 16:26:30,635][62635] Updated weights for policy 1, policy_version 16490 (0.0007) [2023-10-12 16:26:30,996][62635] Updated weights for policy 1, policy_version 16500 (0.0007) [2023-10-12 16:26:30,998][62634] Updated weights for policy 0, policy_version 16510 (0.0007) [2023-10-12 16:26:31,362][62635] Updated weights for policy 1, policy_version 16510 (0.0010) [2023-10-12 16:26:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33816576. Throughput: 0: 1689.6, 1: 1686.8. Samples: 8464660. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 16:26:33,435][61643] Avg episode reward: [(0, '3.800'), (1, '8.390')] [2023-10-12 16:26:34,978][62634] Updated weights for policy 0, policy_version 16520 (0.0007) [2023-10-12 16:26:35,346][62634] Updated weights for policy 0, policy_version 16530 (0.0008) [2023-10-12 16:26:35,463][62635] Updated weights for policy 1, policy_version 16520 (0.0008) [2023-10-12 16:26:35,727][62634] Updated weights for policy 0, policy_version 16540 (0.0007) [2023-10-12 16:26:35,835][62635] Updated weights for policy 1, policy_version 16530 (0.0007) [2023-10-12 16:26:36,199][62635] Updated weights for policy 1, policy_version 16540 (0.0009) [2023-10-12 16:26:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 33882112. Throughput: 0: 1659.6, 1: 1668.7. Samples: 8474292. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 16:26:38,435][61643] Avg episode reward: [(0, '3.800'), (1, '8.420')] [2023-10-12 16:26:39,632][62634] Updated weights for policy 0, policy_version 16550 (0.0008) [2023-10-12 16:26:40,009][62634] Updated weights for policy 0, policy_version 16560 (0.0010) [2023-10-12 16:26:40,387][62634] Updated weights for policy 0, policy_version 16570 (0.0008) [2023-10-12 16:26:40,455][62635] Updated weights for policy 1, policy_version 16550 (0.0010) [2023-10-12 16:26:40,821][62635] Updated weights for policy 1, policy_version 16560 (0.0009) [2023-10-12 16:26:41,188][62635] Updated weights for policy 1, policy_version 16570 (0.0008) [2023-10-12 16:26:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33947648. Throughput: 0: 1687.9, 1: 1677.5. Samples: 8494642. Policy #0 lag: (min: 9.0, avg: 13.9, max: 41.0) [2023-10-12 16:26:43,435][61643] Avg episode reward: [(0, '3.800'), (1, '8.550')] [2023-10-12 16:26:43,436][62495] Saving new best policy, reward=8.550! 
[2023-10-12 16:26:44,561][62634] Updated weights for policy 0, policy_version 16580 (0.0007) [2023-10-12 16:26:44,936][62634] Updated weights for policy 0, policy_version 16590 (0.0009) [2023-10-12 16:26:45,258][62635] Updated weights for policy 1, policy_version 16580 (0.0009) [2023-10-12 16:26:45,318][62634] Updated weights for policy 0, policy_version 16600 (0.0009) [2023-10-12 16:26:45,634][62635] Updated weights for policy 1, policy_version 16590 (0.0007) [2023-10-12 16:26:45,998][62635] Updated weights for policy 1, policy_version 16600 (0.0008) [2023-10-12 16:26:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34013184. Throughput: 0: 1688.4, 1: 1685.4. Samples: 8515302. Policy #0 lag: (min: 9.0, avg: 13.9, max: 41.0) [2023-10-12 16:26:48,436][61643] Avg episode reward: [(0, '3.780'), (1, '8.380')] [2023-10-12 16:26:49,486][62634] Updated weights for policy 0, policy_version 16610 (0.0008) [2023-10-12 16:26:49,890][62634] Updated weights for policy 0, policy_version 16620 (0.0008) [2023-10-12 16:26:49,960][62635] Updated weights for policy 1, policy_version 16610 (0.0008) [2023-10-12 16:26:50,265][62634] Updated weights for policy 0, policy_version 16630 (0.0008) [2023-10-12 16:26:50,335][62635] Updated weights for policy 1, policy_version 16620 (0.0008) [2023-10-12 16:26:50,644][62634] Updated weights for policy 0, policy_version 16640 (0.0007) [2023-10-12 16:26:50,702][62635] Updated weights for policy 1, policy_version 16630 (0.0008) [2023-10-12 16:26:51,070][62635] Updated weights for policy 1, policy_version 16640 (0.0009) [2023-10-12 16:26:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 34078720. Throughput: 0: 1663.1, 1: 1661.4. Samples: 8524260. 
Policy #0 lag: (min: 9.0, avg: 13.9, max: 41.0) [2023-10-12 16:26:53,436][61643] Avg episode reward: [(0, '3.770'), (1, '8.540')] [2023-10-12 16:26:54,633][62634] Updated weights for policy 0, policy_version 16650 (0.0007) [2023-10-12 16:26:54,999][62634] Updated weights for policy 0, policy_version 16660 (0.0007) [2023-10-12 16:26:55,220][62635] Updated weights for policy 1, policy_version 16650 (0.0009) [2023-10-12 16:26:55,377][62634] Updated weights for policy 0, policy_version 16670 (0.0007) [2023-10-12 16:26:55,593][62635] Updated weights for policy 1, policy_version 16660 (0.0007) [2023-10-12 16:26:55,960][62635] Updated weights for policy 1, policy_version 16670 (0.0008) [2023-10-12 16:26:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34144256. Throughput: 0: 1686.2, 1: 1678.0. Samples: 8544932. Policy #0 lag: (min: 23.0, avg: 27.9, max: 55.0) [2023-10-12 16:26:58,435][61643] Avg episode reward: [(0, '3.830'), (1, '8.630')] [2023-10-12 16:26:58,436][62495] Saving new best policy, reward=8.630! [2023-10-12 16:26:59,572][62634] Updated weights for policy 0, policy_version 16680 (0.0008) [2023-10-12 16:26:59,923][62635] Updated weights for policy 1, policy_version 16680 (0.0010) [2023-10-12 16:26:59,954][62634] Updated weights for policy 0, policy_version 16690 (0.0009) [2023-10-12 16:27:00,292][62635] Updated weights for policy 1, policy_version 16690 (0.0007) [2023-10-12 16:27:00,323][62634] Updated weights for policy 0, policy_version 16700 (0.0008) [2023-10-12 16:27:00,664][62635] Updated weights for policy 1, policy_version 16700 (0.0009) [2023-10-12 16:27:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34209792. Throughput: 0: 1684.9, 1: 1683.4. Samples: 8565680. Policy #0 lag: (min: 23.0, avg: 27.9, max: 55.0) [2023-10-12 16:27:03,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.830')] [2023-10-12 16:27:03,443][62495] Saving new best policy, reward=8.830! 
[2023-10-12 16:27:04,387][62634] Updated weights for policy 0, policy_version 16710 (0.0008) [2023-10-12 16:27:04,746][62635] Updated weights for policy 1, policy_version 16710 (0.0008) [2023-10-12 16:27:04,763][62634] Updated weights for policy 0, policy_version 16720 (0.0008) [2023-10-12 16:27:05,106][62635] Updated weights for policy 1, policy_version 16720 (0.0007) [2023-10-12 16:27:05,136][62634] Updated weights for policy 0, policy_version 16730 (0.0010) [2023-10-12 16:27:05,470][62635] Updated weights for policy 1, policy_version 16730 (0.0008) [2023-10-12 16:27:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34275328. Throughput: 0: 1677.8, 1: 1661.2. Samples: 8574758. Policy #0 lag: (min: 23.0, avg: 27.9, max: 55.0) [2023-10-12 16:27:08,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.690')] [2023-10-12 16:27:09,299][62634] Updated weights for policy 0, policy_version 16740 (0.0009) [2023-10-12 16:27:09,579][62635] Updated weights for policy 1, policy_version 16740 (0.0007) [2023-10-12 16:27:09,671][62634] Updated weights for policy 0, policy_version 16750 (0.0009) [2023-10-12 16:27:09,947][62635] Updated weights for policy 1, policy_version 16750 (0.0009) [2023-10-12 16:27:10,056][62634] Updated weights for policy 0, policy_version 16760 (0.0008) [2023-10-12 16:27:10,311][62635] Updated weights for policy 1, policy_version 16760 (0.0008) [2023-10-12 16:27:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34340864. Throughput: 0: 1684.2, 1: 1679.6. Samples: 8595446. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 44.0) [2023-10-12 16:27:13,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.750')] [2023-10-12 16:27:14,057][62634] Updated weights for policy 0, policy_version 16770 (0.0007) [2023-10-12 16:27:14,317][62635] Updated weights for policy 1, policy_version 16770 (0.0007) [2023-10-12 16:27:14,429][62634] Updated weights for policy 0, policy_version 16780 (0.0009) [2023-10-12 16:27:14,681][62635] Updated weights for policy 1, policy_version 16780 (0.0008) [2023-10-12 16:27:14,812][62634] Updated weights for policy 0, policy_version 16790 (0.0008) [2023-10-12 16:27:15,052][62635] Updated weights for policy 1, policy_version 16790 (0.0007) [2023-10-12 16:27:15,178][62634] Updated weights for policy 0, policy_version 16800 (0.0009) [2023-10-12 16:27:15,419][62635] Updated weights for policy 1, policy_version 16800 (0.0008) [2023-10-12 16:27:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34406400. Throughput: 0: 1679.9, 1: 1682.8. Samples: 8615984. Policy #0 lag: (min: 31.0, avg: 31.3, max: 44.0) [2023-10-12 16:27:18,435][61643] Avg episode reward: [(0, '3.850'), (1, '8.820')] [2023-10-12 16:27:19,258][62634] Updated weights for policy 0, policy_version 16810 (0.0007) [2023-10-12 16:27:19,642][62634] Updated weights for policy 0, policy_version 16820 (0.0009) [2023-10-12 16:27:19,649][62635] Updated weights for policy 1, policy_version 16810 (0.0007) [2023-10-12 16:27:20,012][62634] Updated weights for policy 0, policy_version 16830 (0.0010) [2023-10-12 16:27:20,016][62635] Updated weights for policy 1, policy_version 16820 (0.0008) [2023-10-12 16:27:20,390][62635] Updated weights for policy 1, policy_version 16830 (0.0009) [2023-10-12 16:27:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34471936. Throughput: 0: 1679.2, 1: 1669.7. Samples: 8624996. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 44.0) [2023-10-12 16:27:23,436][61643] Avg episode reward: [(0, '3.880'), (1, '8.760')] [2023-10-12 16:27:24,004][62634] Updated weights for policy 0, policy_version 16840 (0.0009) [2023-10-12 16:27:24,379][62634] Updated weights for policy 0, policy_version 16850 (0.0009) [2023-10-12 16:27:24,586][62635] Updated weights for policy 1, policy_version 16840 (0.0008) [2023-10-12 16:27:24,745][62634] Updated weights for policy 0, policy_version 16860 (0.0010) [2023-10-12 16:27:24,948][62635] Updated weights for policy 1, policy_version 16850 (0.0009) [2023-10-12 16:27:25,317][62635] Updated weights for policy 1, policy_version 16860 (0.0009) [2023-10-12 16:27:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34537472. Throughput: 0: 1676.8, 1: 1675.4. Samples: 8645492. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:27:28,436][61643] Avg episode reward: [(0, '3.820'), (1, '8.670')] [2023-10-12 16:27:28,914][62634] Updated weights for policy 0, policy_version 16870 (0.0007) [2023-10-12 16:27:29,288][62634] Updated weights for policy 0, policy_version 16880 (0.0008) [2023-10-12 16:27:29,315][62635] Updated weights for policy 1, policy_version 16870 (0.0008) [2023-10-12 16:27:29,665][62634] Updated weights for policy 0, policy_version 16890 (0.0008) [2023-10-12 16:27:29,676][62635] Updated weights for policy 1, policy_version 16880 (0.0009) [2023-10-12 16:27:30,050][62635] Updated weights for policy 1, policy_version 16890 (0.0007) [2023-10-12 16:27:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34603008. Throughput: 0: 1673.8, 1: 1677.1. Samples: 8666094. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:27:33,435][61643] Avg episode reward: [(0, '3.810'), (1, '8.590')] [2023-10-12 16:27:33,778][62634] Updated weights for policy 0, policy_version 16900 (0.0009) [2023-10-12 16:27:34,029][62635] Updated weights for policy 1, policy_version 16900 (0.0008) [2023-10-12 16:27:34,150][62634] Updated weights for policy 0, policy_version 16910 (0.0009) [2023-10-12 16:27:34,394][62635] Updated weights for policy 1, policy_version 16910 (0.0008) [2023-10-12 16:27:34,527][62634] Updated weights for policy 0, policy_version 16920 (0.0007) [2023-10-12 16:27:34,757][62635] Updated weights for policy 1, policy_version 16920 (0.0008) [2023-10-12 16:27:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 34668544. Throughput: 0: 1679.7, 1: 1674.8. Samples: 8675214. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:27:38,436][61643] Avg episode reward: [(0, '3.810'), (1, '8.640')] [2023-10-12 16:27:38,588][62634] Updated weights for policy 0, policy_version 16930 (0.0007) [2023-10-12 16:27:38,700][62635] Updated weights for policy 1, policy_version 16930 (0.0009) [2023-10-12 16:27:38,976][62634] Updated weights for policy 0, policy_version 16940 (0.0009) [2023-10-12 16:27:39,072][62635] Updated weights for policy 1, policy_version 16940 (0.0009) [2023-10-12 16:27:39,355][62634] Updated weights for policy 0, policy_version 16950 (0.0010) [2023-10-12 16:27:39,453][62635] Updated weights for policy 1, policy_version 16950 (0.0007) [2023-10-12 16:27:39,731][62634] Updated weights for policy 0, policy_version 16960 (0.0009) [2023-10-12 16:27:39,826][62635] Updated weights for policy 1, policy_version 16960 (0.0008) [2023-10-12 16:27:43,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34734080. Throughput: 0: 1673.1, 1: 1680.8. Samples: 8695860. 
Policy #0 lag: (min: 12.0, avg: 19.8, max: 44.0) [2023-10-12 16:27:43,436][61643] Avg episode reward: [(0, '3.820'), (1, '8.600')] [2023-10-12 16:27:43,870][62634] Updated weights for policy 0, policy_version 16970 (0.0010) [2023-10-12 16:27:43,893][62635] Updated weights for policy 1, policy_version 16970 (0.0008) [2023-10-12 16:27:44,235][62634] Updated weights for policy 0, policy_version 16980 (0.0007) [2023-10-12 16:27:44,269][62635] Updated weights for policy 1, policy_version 16980 (0.0009) [2023-10-12 16:27:44,619][62634] Updated weights for policy 0, policy_version 16990 (0.0008) [2023-10-12 16:27:44,645][62635] Updated weights for policy 1, policy_version 16990 (0.0008) [2023-10-12 16:27:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 34799616. Throughput: 0: 1672.9, 1: 1679.3. Samples: 8716530. Policy #0 lag: (min: 12.0, avg: 19.8, max: 44.0) [2023-10-12 16:27:48,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.420')] [2023-10-12 16:27:48,695][62634] Updated weights for policy 0, policy_version 17000 (0.0008) [2023-10-12 16:27:48,856][62635] Updated weights for policy 1, policy_version 17000 (0.0008) [2023-10-12 16:27:49,065][62634] Updated weights for policy 0, policy_version 17010 (0.0008) [2023-10-12 16:27:49,223][62635] Updated weights for policy 1, policy_version 17010 (0.0007) [2023-10-12 16:27:49,452][62634] Updated weights for policy 0, policy_version 17020 (0.0009) [2023-10-12 16:27:49,582][62635] Updated weights for policy 1, policy_version 17020 (0.0007) [2023-10-12 16:27:53,377][62634] Updated weights for policy 0, policy_version 17030 (0.0008) [2023-10-12 16:27:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34865152. Throughput: 0: 1672.9, 1: 1681.1. Samples: 8725688. 
Policy #0 lag: (min: 12.0, avg: 19.8, max: 44.0) [2023-10-12 16:27:53,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.540')] [2023-10-12 16:27:53,617][62635] Updated weights for policy 1, policy_version 17030 (0.0009) [2023-10-12 16:27:53,751][62634] Updated weights for policy 0, policy_version 17040 (0.0007) [2023-10-12 16:27:53,975][62635] Updated weights for policy 1, policy_version 17040 (0.0007) [2023-10-12 16:27:54,138][62634] Updated weights for policy 0, policy_version 17050 (0.0007) [2023-10-12 16:27:54,351][62635] Updated weights for policy 1, policy_version 17050 (0.0009) [2023-10-12 16:27:57,989][62634] Updated weights for policy 0, policy_version 17060 (0.0008) [2023-10-12 16:27:58,320][62635] Updated weights for policy 1, policy_version 17060 (0.0008) [2023-10-12 16:27:58,359][62634] Updated weights for policy 0, policy_version 17070 (0.0007) [2023-10-12 16:27:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 34930688. Throughput: 0: 1679.3, 1: 1680.3. Samples: 8746630. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 16:27:58,436][61643] Avg episode reward: [(0, '3.840'), (1, '8.750')] [2023-10-12 16:27:58,687][62635] Updated weights for policy 1, policy_version 17070 (0.0007) [2023-10-12 16:27:58,734][62634] Updated weights for policy 0, policy_version 17080 (0.0008) [2023-10-12 16:27:59,054][62635] Updated weights for policy 1, policy_version 17080 (0.0008) [2023-10-12 16:28:02,681][62634] Updated weights for policy 0, policy_version 17090 (0.0010) [2023-10-12 16:28:03,063][62634] Updated weights for policy 0, policy_version 17100 (0.0010) [2023-10-12 16:28:03,214][62635] Updated weights for policy 1, policy_version 17090 (0.0009) [2023-10-12 16:28:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34996224. Throughput: 0: 1675.1, 1: 1679.1. Samples: 8766924. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 16:28:03,436][61643] Avg episode reward: [(0, '3.860'), (1, '8.680')] [2023-10-12 16:28:03,437][62634] Updated weights for policy 0, policy_version 17110 (0.0008) [2023-10-12 16:28:03,586][62635] Updated weights for policy 1, policy_version 17100 (0.0008) [2023-10-12 16:28:03,810][62634] Updated weights for policy 0, policy_version 17120 (0.0009) [2023-10-12 16:28:03,948][62635] Updated weights for policy 1, policy_version 17110 (0.0010) [2023-10-12 16:28:04,322][62635] Updated weights for policy 1, policy_version 17120 (0.0007) [2023-10-12 16:28:07,875][62634] Updated weights for policy 0, policy_version 17130 (0.0009) [2023-10-12 16:28:08,246][62634] Updated weights for policy 0, policy_version 17140 (0.0009) [2023-10-12 16:28:08,361][62635] Updated weights for policy 1, policy_version 17130 (0.0008) [2023-10-12 16:28:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 35061760. Throughput: 0: 1682.6, 1: 1685.2. Samples: 8776550. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 16:28:08,436][61643] Avg episode reward: [(0, '3.860'), (1, '8.610')] [2023-10-12 16:28:08,616][62634] Updated weights for policy 0, policy_version 17150 (0.0008) [2023-10-12 16:28:08,721][62635] Updated weights for policy 1, policy_version 17140 (0.0009) [2023-10-12 16:28:09,087][62635] Updated weights for policy 1, policy_version 17150 (0.0010) [2023-10-12 16:28:12,895][62634] Updated weights for policy 0, policy_version 17160 (0.0010) [2023-10-12 16:28:13,277][62634] Updated weights for policy 0, policy_version 17170 (0.0009) [2023-10-12 16:28:13,313][62635] Updated weights for policy 1, policy_version 17160 (0.0008) [2023-10-12 16:28:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 35127296. Throughput: 0: 1677.7, 1: 1684.0. Samples: 8796770. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:28:13,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.720')] [2023-10-12 16:28:13,643][62634] Updated weights for policy 0, policy_version 17180 (0.0007) [2023-10-12 16:28:13,676][62635] Updated weights for policy 1, policy_version 17170 (0.0009) [2023-10-12 16:28:14,040][62635] Updated weights for policy 1, policy_version 17180 (0.0007) [2023-10-12 16:28:17,780][62634] Updated weights for policy 0, policy_version 17190 (0.0009) [2023-10-12 16:28:18,032][62635] Updated weights for policy 1, policy_version 17190 (0.0007) [2023-10-12 16:28:18,157][62634] Updated weights for policy 0, policy_version 17200 (0.0010) [2023-10-12 16:28:18,404][62635] Updated weights for policy 1, policy_version 17200 (0.0007) [2023-10-12 16:28:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 35192832. Throughput: 0: 1671.7, 1: 1679.0. Samples: 8816876. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:28:18,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.680')] [2023-10-12 16:28:18,532][62634] Updated weights for policy 0, policy_version 17210 (0.0009) [2023-10-12 16:28:18,754][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000017216_17629184.pth... [2023-10-12 16:28:18,765][62635] Updated weights for policy 1, policy_version 17210 (0.0008) [2023-10-12 16:28:18,794][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000015648_16023552.pth [2023-10-12 16:28:18,985][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000017216_17629184.pth... 
[2023-10-12 16:28:19,014][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000015648_16023552.pth [2023-10-12 16:28:22,724][62634] Updated weights for policy 0, policy_version 17220 (0.0007) [2023-10-12 16:28:22,848][62635] Updated weights for policy 1, policy_version 17220 (0.0007) [2023-10-12 16:28:23,097][62634] Updated weights for policy 0, policy_version 17230 (0.0008) [2023-10-12 16:28:23,214][62635] Updated weights for policy 1, policy_version 17230 (0.0008) [2023-10-12 16:28:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 35258368. Throughput: 0: 1677.3, 1: 1683.3. Samples: 8826436. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:28:23,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.550')] [2023-10-12 16:28:23,481][62634] Updated weights for policy 0, policy_version 17240 (0.0008) [2023-10-12 16:28:23,572][62635] Updated weights for policy 1, policy_version 17240 (0.0014) [2023-10-12 16:28:27,723][62634] Updated weights for policy 0, policy_version 17250 (0.0009) [2023-10-12 16:28:27,787][62635] Updated weights for policy 1, policy_version 17250 (0.0008) [2023-10-12 16:28:28,123][62634] Updated weights for policy 0, policy_version 17260 (0.0008) [2023-10-12 16:28:28,164][62635] Updated weights for policy 1, policy_version 17260 (0.0008) [2023-10-12 16:28:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 35323904. Throughput: 0: 1679.2, 1: 1676.4. Samples: 8846858. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 16:28:28,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.470')] [2023-10-12 16:28:28,498][62634] Updated weights for policy 0, policy_version 17270 (0.0008) [2023-10-12 16:28:28,535][62635] Updated weights for policy 1, policy_version 17270 (0.0007) [2023-10-12 16:28:28,873][62634] Updated weights for policy 0, policy_version 17280 (0.0007) [2023-10-12 16:28:28,896][62635] Updated weights for policy 1, policy_version 17280 (0.0007) [2023-10-12 16:28:32,990][62635] Updated weights for policy 1, policy_version 17290 (0.0007) [2023-10-12 16:28:33,038][62634] Updated weights for policy 0, policy_version 17290 (0.0007) [2023-10-12 16:28:33,358][62635] Updated weights for policy 1, policy_version 17300 (0.0007) [2023-10-12 16:28:33,424][62634] Updated weights for policy 0, policy_version 17300 (0.0007) [2023-10-12 16:28:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 35389440. Throughput: 0: 1668.2, 1: 1664.9. Samples: 8866518. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 16:28:33,435][61643] Avg episode reward: [(0, '3.890'), (1, '8.540')] [2023-10-12 16:28:33,728][62635] Updated weights for policy 1, policy_version 17310 (0.0008) [2023-10-12 16:28:33,792][62634] Updated weights for policy 0, policy_version 17310 (0.0009) [2023-10-12 16:28:37,787][62635] Updated weights for policy 1, policy_version 17320 (0.0007) [2023-10-12 16:28:37,799][62634] Updated weights for policy 0, policy_version 17320 (0.0007) [2023-10-12 16:28:38,156][62635] Updated weights for policy 1, policy_version 17330 (0.0008) [2023-10-12 16:28:38,185][62634] Updated weights for policy 0, policy_version 17330 (0.0008) [2023-10-12 16:28:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 35454976. Throughput: 0: 1676.9, 1: 1674.5. Samples: 8876500. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 16:28:38,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.530')] [2023-10-12 16:28:38,529][62635] Updated weights for policy 1, policy_version 17340 (0.0010) [2023-10-12 16:28:38,556][62634] Updated weights for policy 0, policy_version 17340 (0.0009) [2023-10-12 16:28:42,563][62634] Updated weights for policy 0, policy_version 17350 (0.0008) [2023-10-12 16:28:42,646][62635] Updated weights for policy 1, policy_version 17350 (0.0008) [2023-10-12 16:28:42,940][62634] Updated weights for policy 0, policy_version 17360 (0.0008) [2023-10-12 16:28:43,007][62635] Updated weights for policy 1, policy_version 17360 (0.0009) [2023-10-12 16:28:43,320][62634] Updated weights for policy 0, policy_version 17370 (0.0009) [2023-10-12 16:28:43,373][62635] Updated weights for policy 1, policy_version 17370 (0.0008) [2023-10-12 16:28:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 35520512. Throughput: 0: 1672.3, 1: 1669.9. Samples: 8897030. Policy #0 lag: (min: 15.0, avg: 15.1, max: 21.0) [2023-10-12 16:28:43,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.450')] [2023-10-12 16:28:47,121][62634] Updated weights for policy 0, policy_version 17380 (0.0008) [2023-10-12 16:28:47,441][62635] Updated weights for policy 1, policy_version 17380 (0.0009) [2023-10-12 16:28:47,499][62634] Updated weights for policy 0, policy_version 17390 (0.0008) [2023-10-12 16:28:47,813][62635] Updated weights for policy 1, policy_version 17390 (0.0008) [2023-10-12 16:28:47,874][62634] Updated weights for policy 0, policy_version 17400 (0.0008) [2023-10-12 16:28:48,178][62635] Updated weights for policy 1, policy_version 17400 (0.0009) [2023-10-12 16:28:48,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35618816. Throughput: 0: 1662.1, 1: 1659.1. Samples: 8916378. 
Policy #0 lag: (min: 15.0, avg: 15.1, max: 21.0) [2023-10-12 16:28:48,435][61643] Avg episode reward: [(0, '3.780'), (1, '8.580')] [2023-10-12 16:28:51,872][62634] Updated weights for policy 0, policy_version 17410 (0.0008) [2023-10-12 16:28:52,241][62634] Updated weights for policy 0, policy_version 17420 (0.0008) [2023-10-12 16:28:52,421][62635] Updated weights for policy 1, policy_version 17410 (0.0009) [2023-10-12 16:28:52,621][62634] Updated weights for policy 0, policy_version 17430 (0.0008) [2023-10-12 16:28:52,790][62635] Updated weights for policy 1, policy_version 17420 (0.0007) [2023-10-12 16:28:52,993][62634] Updated weights for policy 0, policy_version 17440 (0.0009) [2023-10-12 16:28:53,158][62635] Updated weights for policy 1, policy_version 17430 (0.0007) [2023-10-12 16:28:53,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35684352. Throughput: 0: 1674.3, 1: 1668.8. Samples: 8926988. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:28:53,435][61643] Avg episode reward: [(0, '3.750'), (1, '8.560')] [2023-10-12 16:28:53,521][62635] Updated weights for policy 1, policy_version 17440 (0.0007) [2023-10-12 16:28:56,969][62634] Updated weights for policy 0, policy_version 17450 (0.0007) [2023-10-12 16:28:57,351][62634] Updated weights for policy 0, policy_version 17460 (0.0008) [2023-10-12 16:28:57,724][62634] Updated weights for policy 0, policy_version 17470 (0.0008) [2023-10-12 16:28:57,787][62635] Updated weights for policy 1, policy_version 17450 (0.0009) [2023-10-12 16:28:58,150][62635] Updated weights for policy 1, policy_version 17460 (0.0011) [2023-10-12 16:28:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 35749888. Throughput: 0: 1673.2, 1: 1674.1. Samples: 8947400. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:28:58,435][61643] Avg episode reward: [(0, '3.790'), (1, '8.580')] [2023-10-12 16:28:58,523][62635] Updated weights for policy 1, policy_version 17470 (0.0011) [2023-10-12 16:29:01,887][62634] Updated weights for policy 0, policy_version 17480 (0.0008) [2023-10-12 16:29:02,270][62634] Updated weights for policy 0, policy_version 17490 (0.0009) [2023-10-12 16:29:02,598][62635] Updated weights for policy 1, policy_version 17480 (0.0008) [2023-10-12 16:29:02,644][62634] Updated weights for policy 0, policy_version 17500 (0.0008) [2023-10-12 16:29:02,959][62635] Updated weights for policy 1, policy_version 17490 (0.0010) [2023-10-12 16:29:03,335][62635] Updated weights for policy 1, policy_version 17500 (0.0009) [2023-10-12 16:29:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 35815424. Throughput: 0: 1663.6, 1: 1656.3. Samples: 8966270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:29:03,435][61643] Avg episode reward: [(0, '3.770'), (1, '8.780')] [2023-10-12 16:29:06,577][62634] Updated weights for policy 0, policy_version 17510 (0.0007) [2023-10-12 16:29:06,947][62634] Updated weights for policy 0, policy_version 17520 (0.0010) [2023-10-12 16:29:07,325][62634] Updated weights for policy 0, policy_version 17530 (0.0010) [2023-10-12 16:29:07,363][62635] Updated weights for policy 1, policy_version 17510 (0.0009) [2023-10-12 16:29:07,728][62635] Updated weights for policy 1, policy_version 17520 (0.0010) [2023-10-12 16:29:08,099][62635] Updated weights for policy 1, policy_version 17530 (0.0009) [2023-10-12 16:29:08,435][61643] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 35913728. Throughput: 0: 1693.0, 1: 1665.1. Samples: 8977550. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 16:29:08,436][61643] Avg episode reward: [(0, '3.900'), (1, '8.570')] [2023-10-12 16:29:11,257][62634] Updated weights for policy 0, policy_version 17540 (0.0008) [2023-10-12 16:29:11,624][62634] Updated weights for policy 0, policy_version 17550 (0.0007) [2023-10-12 16:29:12,008][62634] Updated weights for policy 0, policy_version 17560 (0.0009) [2023-10-12 16:29:12,205][62635] Updated weights for policy 1, policy_version 17540 (0.0007) [2023-10-12 16:29:12,572][62635] Updated weights for policy 1, policy_version 17550 (0.0007) [2023-10-12 16:29:12,935][62635] Updated weights for policy 1, policy_version 17560 (0.0007) [2023-10-12 16:29:13,435][61643] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 35979264. Throughput: 0: 1675.2, 1: 1670.7. Samples: 8997424. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 16:29:13,436][61643] Avg episode reward: [(0, '3.930'), (1, '8.690')] [2023-10-12 16:29:16,114][62634] Updated weights for policy 0, policy_version 17570 (0.0007) [2023-10-12 16:29:16,528][62634] Updated weights for policy 0, policy_version 17580 (0.0008) [2023-10-12 16:29:16,899][62634] Updated weights for policy 0, policy_version 17590 (0.0008) [2023-10-12 16:29:17,051][62635] Updated weights for policy 1, policy_version 17570 (0.0008) [2023-10-12 16:29:17,273][62634] Updated weights for policy 0, policy_version 17600 (0.0007) [2023-10-12 16:29:17,420][62635] Updated weights for policy 1, policy_version 17580 (0.0009) [2023-10-12 16:29:17,787][62635] Updated weights for policy 1, policy_version 17590 (0.0008) [2023-10-12 16:29:18,155][62635] Updated weights for policy 1, policy_version 17600 (0.0007) [2023-10-12 16:29:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 36044800. Throughput: 0: 1678.4, 1: 1656.4. Samples: 9016582. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 16:29:18,435][61643] Avg episode reward: [(0, '3.930'), (1, '8.910')] [2023-10-12 16:29:18,445][62495] Saving new best policy, reward=8.910! [2023-10-12 16:29:21,263][62634] Updated weights for policy 0, policy_version 17610 (0.0009) [2023-10-12 16:29:21,641][62634] Updated weights for policy 0, policy_version 17620 (0.0009) [2023-10-12 16:29:22,020][62634] Updated weights for policy 0, policy_version 17630 (0.0008) [2023-10-12 16:29:22,199][62635] Updated weights for policy 1, policy_version 17610 (0.0009) [2023-10-12 16:29:22,570][62635] Updated weights for policy 1, policy_version 17620 (0.0009) [2023-10-12 16:29:22,940][62635] Updated weights for policy 1, policy_version 17630 (0.0008) [2023-10-12 16:29:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 36110336. Throughput: 0: 1694.8, 1: 1669.7. Samples: 9027906. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) [2023-10-12 16:29:23,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.790')] [2023-10-12 16:29:26,135][62634] Updated weights for policy 0, policy_version 17640 (0.0009) [2023-10-12 16:29:26,519][62634] Updated weights for policy 0, policy_version 17650 (0.0008) [2023-10-12 16:29:26,896][62634] Updated weights for policy 0, policy_version 17660 (0.0010) [2023-10-12 16:29:27,049][62635] Updated weights for policy 1, policy_version 17640 (0.0007) [2023-10-12 16:29:27,416][62635] Updated weights for policy 1, policy_version 17650 (0.0008) [2023-10-12 16:29:27,793][62635] Updated weights for policy 1, policy_version 17660 (0.0009) [2023-10-12 16:29:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 36175872. Throughput: 0: 1669.1, 1: 1668.2. Samples: 9047208. 
Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) [2023-10-12 16:29:28,436][61643] Avg episode reward: [(0, '3.940'), (1, '8.770')] [2023-10-12 16:29:31,042][62634] Updated weights for policy 0, policy_version 17670 (0.0009) [2023-10-12 16:29:31,420][62634] Updated weights for policy 0, policy_version 17680 (0.0010) [2023-10-12 16:29:31,795][62634] Updated weights for policy 0, policy_version 17690 (0.0008) [2023-10-12 16:29:31,917][62635] Updated weights for policy 1, policy_version 17670 (0.0008) [2023-10-12 16:29:32,287][62635] Updated weights for policy 1, policy_version 17680 (0.0008) [2023-10-12 16:29:32,645][62635] Updated weights for policy 1, policy_version 17690 (0.0009) [2023-10-12 16:29:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 36241408. Throughput: 0: 1685.5, 1: 1655.7. Samples: 9066734. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) [2023-10-12 16:29:33,436][61643] Avg episode reward: [(0, '3.920'), (1, '8.560')] [2023-10-12 16:29:35,865][62634] Updated weights for policy 0, policy_version 17700 (0.0009) [2023-10-12 16:29:36,236][62634] Updated weights for policy 0, policy_version 17710 (0.0010) [2023-10-12 16:29:36,612][62634] Updated weights for policy 0, policy_version 17720 (0.0009) [2023-10-12 16:29:36,740][62635] Updated weights for policy 1, policy_version 17700 (0.0008) [2023-10-12 16:29:37,110][62635] Updated weights for policy 1, policy_version 17710 (0.0008) [2023-10-12 16:29:37,477][62635] Updated weights for policy 1, policy_version 17720 (0.0008) [2023-10-12 16:29:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 36306944. Throughput: 0: 1688.2, 1: 1669.0. Samples: 9078062. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:29:38,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.530')] [2023-10-12 16:29:40,611][62634] Updated weights for policy 0, policy_version 17730 (0.0010) [2023-10-12 16:29:40,983][62634] Updated weights for policy 0, policy_version 17740 (0.0009) [2023-10-12 16:29:41,365][62634] Updated weights for policy 0, policy_version 17750 (0.0009) [2023-10-12 16:29:41,499][62635] Updated weights for policy 1, policy_version 17730 (0.0007) [2023-10-12 16:29:41,745][62634] Updated weights for policy 0, policy_version 17760 (0.0008) [2023-10-12 16:29:41,862][62635] Updated weights for policy 1, policy_version 17740 (0.0009) [2023-10-12 16:29:42,239][62635] Updated weights for policy 1, policy_version 17750 (0.0009) [2023-10-12 16:29:42,611][62635] Updated weights for policy 1, policy_version 17760 (0.0008) [2023-10-12 16:29:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 36372480. Throughput: 0: 1673.3, 1: 1657.9. Samples: 9097304. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:29:43,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.570')] [2023-10-12 16:29:45,753][62634] Updated weights for policy 0, policy_version 17770 (0.0007) [2023-10-12 16:29:46,123][62634] Updated weights for policy 0, policy_version 17780 (0.0007) [2023-10-12 16:29:46,496][62634] Updated weights for policy 0, policy_version 17790 (0.0007) [2023-10-12 16:29:46,734][62635] Updated weights for policy 1, policy_version 17770 (0.0008) [2023-10-12 16:29:47,102][62635] Updated weights for policy 1, policy_version 17780 (0.0010) [2023-10-12 16:29:47,476][62635] Updated weights for policy 1, policy_version 17790 (0.0007) [2023-10-12 16:29:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36438016. Throughput: 0: 1695.9, 1: 1660.2. Samples: 9117292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:29:48,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.540')] [2023-10-12 16:29:50,513][62634] Updated weights for policy 0, policy_version 17800 (0.0009) [2023-10-12 16:29:50,881][62634] Updated weights for policy 0, policy_version 17810 (0.0008) [2023-10-12 16:29:51,264][62634] Updated weights for policy 0, policy_version 17820 (0.0009) [2023-10-12 16:29:51,391][62635] Updated weights for policy 1, policy_version 17800 (0.0008) [2023-10-12 16:29:51,753][62635] Updated weights for policy 1, policy_version 17810 (0.0010) [2023-10-12 16:29:52,123][62635] Updated weights for policy 1, policy_version 17820 (0.0010) [2023-10-12 16:29:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36503552. Throughput: 0: 1671.5, 1: 1678.3. Samples: 9128290. Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-12 16:29:53,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.620')] [2023-10-12 16:29:55,109][62634] Updated weights for policy 0, policy_version 17830 (0.0008) [2023-10-12 16:29:55,487][62634] Updated weights for policy 0, policy_version 17840 (0.0007) [2023-10-12 16:29:55,859][62634] Updated weights for policy 0, policy_version 17850 (0.0007) [2023-10-12 16:29:56,314][62635] Updated weights for policy 1, policy_version 17830 (0.0011) [2023-10-12 16:29:56,679][62635] Updated weights for policy 1, policy_version 17840 (0.0008) [2023-10-12 16:29:57,053][62635] Updated weights for policy 1, policy_version 17850 (0.0008) [2023-10-12 16:29:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36569088. Throughput: 0: 1678.5, 1: 1656.5. Samples: 9147500. 
Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-12 16:29:58,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.850')] [2023-10-12 16:30:00,058][62634] Updated weights for policy 0, policy_version 17860 (0.0010) [2023-10-12 16:30:00,437][62634] Updated weights for policy 0, policy_version 17870 (0.0009) [2023-10-12 16:30:00,811][62634] Updated weights for policy 0, policy_version 17880 (0.0007) [2023-10-12 16:30:01,047][62635] Updated weights for policy 1, policy_version 17860 (0.0008) [2023-10-12 16:30:01,426][62635] Updated weights for policy 1, policy_version 17870 (0.0007) [2023-10-12 16:30:01,801][62635] Updated weights for policy 1, policy_version 17880 (0.0007) [2023-10-12 16:30:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36634624. Throughput: 0: 1686.7, 1: 1681.9. Samples: 9168170. Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-12 16:30:03,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.510')] [2023-10-12 16:30:04,728][62634] Updated weights for policy 0, policy_version 17890 (0.0007) [2023-10-12 16:30:05,112][62634] Updated weights for policy 0, policy_version 17900 (0.0007) [2023-10-12 16:30:05,485][62634] Updated weights for policy 0, policy_version 17910 (0.0007) [2023-10-12 16:30:05,772][62635] Updated weights for policy 1, policy_version 17890 (0.0009) [2023-10-12 16:30:05,861][62634] Updated weights for policy 0, policy_version 17920 (0.0007) [2023-10-12 16:30:06,143][62635] Updated weights for policy 1, policy_version 17900 (0.0010) [2023-10-12 16:30:06,515][62635] Updated weights for policy 1, policy_version 17910 (0.0008) [2023-10-12 16:30:06,874][62635] Updated weights for policy 1, policy_version 17920 (0.0009) [2023-10-12 16:30:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36700160. Throughput: 0: 1658.9, 1: 1678.1. Samples: 9178074. 
Policy #0 lag: (min: 22.0, avg: 36.6, max: 54.0) [2023-10-12 16:30:08,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.690')] [2023-10-12 16:30:10,007][62634] Updated weights for policy 0, policy_version 17930 (0.0008) [2023-10-12 16:30:10,391][62634] Updated weights for policy 0, policy_version 17940 (0.0010) [2023-10-12 16:30:10,763][62634] Updated weights for policy 0, policy_version 17950 (0.0007) [2023-10-12 16:30:10,904][62635] Updated weights for policy 1, policy_version 17930 (0.0008) [2023-10-12 16:30:11,270][62635] Updated weights for policy 1, policy_version 17940 (0.0010) [2023-10-12 16:30:11,644][62635] Updated weights for policy 1, policy_version 17950 (0.0008) [2023-10-12 16:30:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36765696. Throughput: 0: 1684.8, 1: 1663.7. Samples: 9197892. Policy #0 lag: (min: 22.0, avg: 36.6, max: 54.0) [2023-10-12 16:30:13,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.730')] [2023-10-12 16:30:14,929][62634] Updated weights for policy 0, policy_version 17960 (0.0008) [2023-10-12 16:30:15,310][62634] Updated weights for policy 0, policy_version 17970 (0.0008) [2023-10-12 16:30:15,680][62634] Updated weights for policy 0, policy_version 17980 (0.0007) [2023-10-12 16:30:15,894][62635] Updated weights for policy 1, policy_version 17960 (0.0009) [2023-10-12 16:30:16,258][62635] Updated weights for policy 1, policy_version 17970 (0.0007) [2023-10-12 16:30:16,636][62635] Updated weights for policy 1, policy_version 17980 (0.0008) [2023-10-12 16:30:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 36831232. Throughput: 0: 1684.1, 1: 1690.2. Samples: 9218580. Policy #0 lag: (min: 22.0, avg: 36.6, max: 54.0) [2023-10-12 16:30:18,436][61643] Avg episode reward: [(0, '3.870'), (1, '8.720')] [2023-10-12 16:30:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000017984_18415616.pth... 
[2023-10-12 16:30:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000017984_18415616.pth... [2023-10-12 16:30:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000016416_16809984.pth [2023-10-12 16:30:18,488][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000016416_16809984.pth [2023-10-12 16:30:19,768][62634] Updated weights for policy 0, policy_version 17990 (0.0009) [2023-10-12 16:30:20,148][62634] Updated weights for policy 0, policy_version 18000 (0.0008) [2023-10-12 16:30:20,529][62634] Updated weights for policy 0, policy_version 18010 (0.0010) [2023-10-12 16:30:20,555][62635] Updated weights for policy 1, policy_version 17990 (0.0007) [2023-10-12 16:30:20,927][62635] Updated weights for policy 1, policy_version 18000 (0.0007) [2023-10-12 16:30:21,299][62635] Updated weights for policy 1, policy_version 18010 (0.0007) [2023-10-12 16:30:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36896768. Throughput: 0: 1657.7, 1: 1676.8. Samples: 9228112. Policy #0 lag: (min: 24.0, avg: 46.2, max: 56.0) [2023-10-12 16:30:23,436][61643] Avg episode reward: [(0, '3.880'), (1, '8.590')] [2023-10-12 16:30:24,434][62634] Updated weights for policy 0, policy_version 18020 (0.0008) [2023-10-12 16:30:24,815][62634] Updated weights for policy 0, policy_version 18030 (0.0010) [2023-10-12 16:30:25,202][62634] Updated weights for policy 0, policy_version 18040 (0.0010) [2023-10-12 16:30:25,250][62635] Updated weights for policy 1, policy_version 18020 (0.0007) [2023-10-12 16:30:25,619][62635] Updated weights for policy 1, policy_version 18030 (0.0008) [2023-10-12 16:30:25,985][62635] Updated weights for policy 1, policy_version 18040 (0.0008) [2023-10-12 16:30:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 36962304. Throughput: 0: 1681.1, 1: 1680.0. Samples: 9248552. 
Policy #0 lag: (min: 24.0, avg: 46.2, max: 56.0) [2023-10-12 16:30:28,436][61643] Avg episode reward: [(0, '3.840'), (1, '8.880')] [2023-10-12 16:30:29,371][62634] Updated weights for policy 0, policy_version 18050 (0.0007) [2023-10-12 16:30:29,750][62634] Updated weights for policy 0, policy_version 18060 (0.0009) [2023-10-12 16:30:30,017][62635] Updated weights for policy 1, policy_version 18050 (0.0008) [2023-10-12 16:30:30,116][62634] Updated weights for policy 0, policy_version 18070 (0.0008) [2023-10-12 16:30:30,380][62635] Updated weights for policy 1, policy_version 18060 (0.0009) [2023-10-12 16:30:30,497][62634] Updated weights for policy 0, policy_version 18080 (0.0009) [2023-10-12 16:30:30,745][62635] Updated weights for policy 1, policy_version 18070 (0.0010) [2023-10-12 16:30:31,108][62635] Updated weights for policy 1, policy_version 18080 (0.0007) [2023-10-12 16:30:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 37027840. Throughput: 0: 1682.3, 1: 1698.3. Samples: 9269420. Policy #0 lag: (min: 24.0, avg: 46.2, max: 56.0) [2023-10-12 16:30:33,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.820')] [2023-10-12 16:30:34,443][62634] Updated weights for policy 0, policy_version 18090 (0.0009) [2023-10-12 16:30:34,813][62634] Updated weights for policy 0, policy_version 18100 (0.0009) [2023-10-12 16:30:35,192][62634] Updated weights for policy 0, policy_version 18110 (0.0009) [2023-10-12 16:30:35,245][62635] Updated weights for policy 1, policy_version 18090 (0.0008) [2023-10-12 16:30:35,616][62635] Updated weights for policy 1, policy_version 18100 (0.0011) [2023-10-12 16:30:35,986][62635] Updated weights for policy 1, policy_version 18110 (0.0010) [2023-10-12 16:30:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37093376. Throughput: 0: 1668.3, 1: 1665.8. Samples: 9278326. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-12 16:30:38,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.600')] [2023-10-12 16:30:39,158][62634] Updated weights for policy 0, policy_version 18120 (0.0008) [2023-10-12 16:30:39,536][62634] Updated weights for policy 0, policy_version 18130 (0.0008) [2023-10-12 16:30:39,918][62634] Updated weights for policy 0, policy_version 18140 (0.0009) [2023-10-12 16:30:40,178][62635] Updated weights for policy 1, policy_version 18120 (0.0009) [2023-10-12 16:30:40,555][62635] Updated weights for policy 1, policy_version 18130 (0.0010) [2023-10-12 16:30:40,929][62635] Updated weights for policy 1, policy_version 18140 (0.0007) [2023-10-12 16:30:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37158912. Throughput: 0: 1685.9, 1: 1682.8. Samples: 9299094. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-12 16:30:43,436][61643] Avg episode reward: [(0, '3.850'), (1, '9.040')] [2023-10-12 16:30:43,437][62495] Saving new best policy, reward=9.040! [2023-10-12 16:30:44,160][62634] Updated weights for policy 0, policy_version 18150 (0.0008) [2023-10-12 16:30:44,540][62634] Updated weights for policy 0, policy_version 18160 (0.0008) [2023-10-12 16:30:44,915][62634] Updated weights for policy 0, policy_version 18170 (0.0009) [2023-10-12 16:30:44,988][62635] Updated weights for policy 1, policy_version 18150 (0.0007) [2023-10-12 16:30:45,352][62635] Updated weights for policy 1, policy_version 18160 (0.0008) [2023-10-12 16:30:45,719][62635] Updated weights for policy 1, policy_version 18170 (0.0008) [2023-10-12 16:30:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 37224448. Throughput: 0: 1685.8, 1: 1685.6. Samples: 9319884. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-12 16:30:48,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.640')] [2023-10-12 16:30:49,053][62634] Updated weights for policy 0, policy_version 18180 (0.0009) [2023-10-12 16:30:49,434][62634] Updated weights for policy 0, policy_version 18190 (0.0008) [2023-10-12 16:30:49,752][62635] Updated weights for policy 1, policy_version 18180 (0.0007) [2023-10-12 16:30:49,804][62634] Updated weights for policy 0, policy_version 18200 (0.0008) [2023-10-12 16:30:50,114][62635] Updated weights for policy 1, policy_version 18190 (0.0008) [2023-10-12 16:30:50,498][62635] Updated weights for policy 1, policy_version 18200 (0.0007) [2023-10-12 16:30:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37289984. Throughput: 0: 1686.4, 1: 1663.6. Samples: 9328822. Policy #0 lag: (min: 16.0, avg: 42.4, max: 48.0) [2023-10-12 16:30:53,435][61643] Avg episode reward: [(0, '3.810'), (1, '8.600')] [2023-10-12 16:30:53,837][62634] Updated weights for policy 0, policy_version 18210 (0.0008) [2023-10-12 16:30:54,226][62634] Updated weights for policy 0, policy_version 18220 (0.0007) [2023-10-12 16:30:54,603][62634] Updated weights for policy 0, policy_version 18230 (0.0007) [2023-10-12 16:30:54,635][62635] Updated weights for policy 1, policy_version 18210 (0.0008) [2023-10-12 16:30:54,982][62634] Updated weights for policy 0, policy_version 18240 (0.0007) [2023-10-12 16:30:55,008][62635] Updated weights for policy 1, policy_version 18220 (0.0007) [2023-10-12 16:30:55,380][62635] Updated weights for policy 1, policy_version 18230 (0.0009) [2023-10-12 16:30:55,747][62635] Updated weights for policy 1, policy_version 18240 (0.0008) [2023-10-12 16:30:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37355520. Throughput: 0: 1689.0, 1: 1682.1. Samples: 9349590. 
Policy #0 lag: (min: 16.0, avg: 42.4, max: 48.0) [2023-10-12 16:30:58,435][61643] Avg episode reward: [(0, '3.830'), (1, '8.910')] [2023-10-12 16:30:58,853][62634] Updated weights for policy 0, policy_version 18250 (0.0011) [2023-10-12 16:30:59,225][62634] Updated weights for policy 0, policy_version 18260 (0.0011) [2023-10-12 16:30:59,606][62634] Updated weights for policy 0, policy_version 18270 (0.0007) [2023-10-12 16:30:59,855][62635] Updated weights for policy 1, policy_version 18250 (0.0009) [2023-10-12 16:31:00,226][62635] Updated weights for policy 1, policy_version 18260 (0.0008) [2023-10-12 16:31:00,589][62635] Updated weights for policy 1, policy_version 18270 (0.0008) [2023-10-12 16:31:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37421056. Throughput: 0: 1694.5, 1: 1678.1. Samples: 9370344. Policy #0 lag: (min: 16.0, avg: 42.4, max: 48.0) [2023-10-12 16:31:03,436][61643] Avg episode reward: [(0, '3.860'), (1, '8.870')] [2023-10-12 16:31:03,665][62634] Updated weights for policy 0, policy_version 18280 (0.0008) [2023-10-12 16:31:04,043][62634] Updated weights for policy 0, policy_version 18290 (0.0010) [2023-10-12 16:31:04,418][62634] Updated weights for policy 0, policy_version 18300 (0.0008) [2023-10-12 16:31:04,579][62635] Updated weights for policy 1, policy_version 18280 (0.0009) [2023-10-12 16:31:04,956][62635] Updated weights for policy 1, policy_version 18290 (0.0009) [2023-10-12 16:31:05,324][62635] Updated weights for policy 1, policy_version 18300 (0.0009) [2023-10-12 16:31:08,372][62634] Updated weights for policy 0, policy_version 18310 (0.0008) [2023-10-12 16:31:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37486592. Throughput: 0: 1695.8, 1: 1664.7. Samples: 9379336. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-12 16:31:08,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.910')] [2023-10-12 16:31:08,759][62634] Updated weights for policy 0, policy_version 18320 (0.0008) [2023-10-12 16:31:09,129][62634] Updated weights for policy 0, policy_version 18330 (0.0011) [2023-10-12 16:31:09,369][62635] Updated weights for policy 1, policy_version 18310 (0.0009) [2023-10-12 16:31:09,736][62635] Updated weights for policy 1, policy_version 18320 (0.0008) [2023-10-12 16:31:10,101][62635] Updated weights for policy 1, policy_version 18330 (0.0007) [2023-10-12 16:31:12,949][62634] Updated weights for policy 0, policy_version 18340 (0.0009) [2023-10-12 16:31:13,323][62634] Updated weights for policy 0, policy_version 18350 (0.0010) [2023-10-12 16:31:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 37552128. Throughput: 0: 1699.2, 1: 1669.9. Samples: 9400162. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-12 16:31:13,435][61643] Avg episode reward: [(0, '3.870'), (1, '9.110')] [2023-10-12 16:31:13,436][62495] Saving new best policy, reward=9.110! [2023-10-12 16:31:13,707][62634] Updated weights for policy 0, policy_version 18360 (0.0008) [2023-10-12 16:31:14,250][62635] Updated weights for policy 1, policy_version 18340 (0.0008) [2023-10-12 16:31:14,619][62635] Updated weights for policy 1, policy_version 18350 (0.0011) [2023-10-12 16:31:14,986][62635] Updated weights for policy 1, policy_version 18360 (0.0008) [2023-10-12 16:31:17,658][62634] Updated weights for policy 0, policy_version 18370 (0.0008) [2023-10-12 16:31:18,035][62634] Updated weights for policy 0, policy_version 18380 (0.0008) [2023-10-12 16:31:18,412][62634] Updated weights for policy 0, policy_version 18390 (0.0008) [2023-10-12 16:31:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 37617664. Throughput: 0: 1690.1, 1: 1671.6. Samples: 9420700. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-12 16:31:18,435][61643] Avg episode reward: [(0, '3.840'), (1, '9.120')] [2023-10-12 16:31:18,443][62495] Saving new best policy, reward=9.120! [2023-10-12 16:31:18,791][62634] Updated weights for policy 0, policy_version 18400 (0.0008) [2023-10-12 16:31:19,045][62635] Updated weights for policy 1, policy_version 18370 (0.0008) [2023-10-12 16:31:19,433][62635] Updated weights for policy 1, policy_version 18380 (0.0008) [2023-10-12 16:31:19,796][62635] Updated weights for policy 1, policy_version 18390 (0.0007) [2023-10-12 16:31:20,167][62635] Updated weights for policy 1, policy_version 18400 (0.0009) [2023-10-12 16:31:22,971][62634] Updated weights for policy 0, policy_version 18410 (0.0007) [2023-10-12 16:31:23,363][62634] Updated weights for policy 0, policy_version 18420 (0.0009) [2023-10-12 16:31:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37683200. Throughput: 0: 1701.3, 1: 1671.7. Samples: 9430110. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:31:23,435][61643] Avg episode reward: [(0, '3.760'), (1, '9.190')] [2023-10-12 16:31:23,436][62495] Saving new best policy, reward=9.190! [2023-10-12 16:31:23,726][62634] Updated weights for policy 0, policy_version 18430 (0.0009) [2023-10-12 16:31:24,247][62635] Updated weights for policy 1, policy_version 18410 (0.0007) [2023-10-12 16:31:24,614][62635] Updated weights for policy 1, policy_version 18420 (0.0007) [2023-10-12 16:31:24,975][62635] Updated weights for policy 1, policy_version 18430 (0.0008) [2023-10-12 16:31:27,785][62634] Updated weights for policy 0, policy_version 18440 (0.0007) [2023-10-12 16:31:28,174][62634] Updated weights for policy 0, policy_version 18450 (0.0009) [2023-10-12 16:31:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 37748736. Throughput: 0: 1695.0, 1: 1681.2. Samples: 9451022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:31:28,435][61643] Avg episode reward: [(0, '3.730'), (1, '9.060')] [2023-10-12 16:31:28,546][62634] Updated weights for policy 0, policy_version 18460 (0.0011) [2023-10-12 16:31:29,103][62635] Updated weights for policy 1, policy_version 18440 (0.0007) [2023-10-12 16:31:29,479][62635] Updated weights for policy 1, policy_version 18450 (0.0009) [2023-10-12 16:31:29,847][62635] Updated weights for policy 1, policy_version 18460 (0.0008) [2023-10-12 16:31:32,604][62634] Updated weights for policy 0, policy_version 18470 (0.0008) [2023-10-12 16:31:32,985][62634] Updated weights for policy 0, policy_version 18480 (0.0009) [2023-10-12 16:31:33,360][62634] Updated weights for policy 0, policy_version 18490 (0.0008) [2023-10-12 16:31:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 37814272. Throughput: 0: 1681.6, 1: 1686.1. Samples: 9471434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:31:33,436][61643] Avg episode reward: [(0, '3.750'), (1, '8.980')] [2023-10-12 16:31:33,830][62635] Updated weights for policy 1, policy_version 18470 (0.0007) [2023-10-12 16:31:34,200][62635] Updated weights for policy 1, policy_version 18480 (0.0008) [2023-10-12 16:31:34,568][62635] Updated weights for policy 1, policy_version 18490 (0.0009) [2023-10-12 16:31:37,475][62634] Updated weights for policy 0, policy_version 18500 (0.0010) [2023-10-12 16:31:37,856][62634] Updated weights for policy 0, policy_version 18510 (0.0008) [2023-10-12 16:31:38,227][62634] Updated weights for policy 0, policy_version 18520 (0.0010) [2023-10-12 16:31:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37879808. Throughput: 0: 1697.9, 1: 1688.6. Samples: 9481214. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:31:38,435][61643] Avg episode reward: [(0, '3.800'), (1, '9.150')] [2023-10-12 16:31:38,587][62635] Updated weights for policy 1, policy_version 18500 (0.0008) [2023-10-12 16:31:38,954][62635] Updated weights for policy 1, policy_version 18510 (0.0011) [2023-10-12 16:31:39,329][62635] Updated weights for policy 1, policy_version 18520 (0.0009) [2023-10-12 16:31:42,410][62634] Updated weights for policy 0, policy_version 18530 (0.0008) [2023-10-12 16:31:42,815][62634] Updated weights for policy 0, policy_version 18540 (0.0008) [2023-10-12 16:31:43,198][62634] Updated weights for policy 0, policy_version 18550 (0.0009) [2023-10-12 16:31:43,315][62635] Updated weights for policy 1, policy_version 18530 (0.0009) [2023-10-12 16:31:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37945344. Throughput: 0: 1689.1, 1: 1692.3. Samples: 9501754. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:31:43,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.990')] [2023-10-12 16:31:43,577][62634] Updated weights for policy 0, policy_version 18560 (0.0009) [2023-10-12 16:31:43,676][62635] Updated weights for policy 1, policy_version 18540 (0.0010) [2023-10-12 16:31:44,050][62635] Updated weights for policy 1, policy_version 18550 (0.0007) [2023-10-12 16:31:44,419][62635] Updated weights for policy 1, policy_version 18560 (0.0007) [2023-10-12 16:31:47,768][62634] Updated weights for policy 0, policy_version 18570 (0.0009) [2023-10-12 16:31:48,146][62634] Updated weights for policy 0, policy_version 18580 (0.0008) [2023-10-12 16:31:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38010880. Throughput: 0: 1665.6, 1: 1691.0. Samples: 9521394. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:31:48,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.690')] [2023-10-12 16:31:48,523][62634] Updated weights for policy 0, policy_version 18590 (0.0007) [2023-10-12 16:31:48,562][62635] Updated weights for policy 1, policy_version 18570 (0.0007) [2023-10-12 16:31:48,947][62635] Updated weights for policy 1, policy_version 18580 (0.0008) [2023-10-12 16:31:49,317][62635] Updated weights for policy 1, policy_version 18590 (0.0008) [2023-10-12 16:31:52,546][62634] Updated weights for policy 0, policy_version 18600 (0.0009) [2023-10-12 16:31:52,929][62634] Updated weights for policy 0, policy_version 18610 (0.0007) [2023-10-12 16:31:53,297][62635] Updated weights for policy 1, policy_version 18600 (0.0008) [2023-10-12 16:31:53,308][62634] Updated weights for policy 0, policy_version 18620 (0.0007) [2023-10-12 16:31:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38076416. Throughput: 0: 1677.9, 1: 1689.4. Samples: 9530864. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:31:53,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.950')] [2023-10-12 16:31:53,665][62635] Updated weights for policy 1, policy_version 18610 (0.0010) [2023-10-12 16:31:54,030][62635] Updated weights for policy 1, policy_version 18620 (0.0009) [2023-10-12 16:31:57,181][62634] Updated weights for policy 0, policy_version 18630 (0.0008) [2023-10-12 16:31:57,566][62634] Updated weights for policy 0, policy_version 18640 (0.0008) [2023-10-12 16:31:57,941][62634] Updated weights for policy 0, policy_version 18650 (0.0009) [2023-10-12 16:31:58,093][62635] Updated weights for policy 1, policy_version 18630 (0.0008) [2023-10-12 16:31:58,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38174720. Throughput: 0: 1678.0, 1: 1693.0. Samples: 9551858. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-12 16:31:58,436][61643] Avg episode reward: [(0, '3.890'), (1, '9.060')] [2023-10-12 16:31:58,451][62635] Updated weights for policy 1, policy_version 18640 (0.0010) [2023-10-12 16:31:58,830][62635] Updated weights for policy 1, policy_version 18650 (0.0010) [2023-10-12 16:32:02,066][62634] Updated weights for policy 0, policy_version 18660 (0.0007) [2023-10-12 16:32:02,442][62634] Updated weights for policy 0, policy_version 18670 (0.0008) [2023-10-12 16:32:02,820][62634] Updated weights for policy 0, policy_version 18680 (0.0008) [2023-10-12 16:32:02,990][62635] Updated weights for policy 1, policy_version 18660 (0.0009) [2023-10-12 16:32:03,359][62635] Updated weights for policy 1, policy_version 18670 (0.0010) [2023-10-12 16:32:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38240256. Throughput: 0: 1663.1, 1: 1686.9. Samples: 9571450. Policy #0 lag: (min: 11.0, avg: 12.2, max: 34.0) [2023-10-12 16:32:03,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.640')] [2023-10-12 16:32:03,732][62635] Updated weights for policy 1, policy_version 18680 (0.0009) [2023-10-12 16:32:06,860][62634] Updated weights for policy 0, policy_version 18690 (0.0010) [2023-10-12 16:32:07,225][62634] Updated weights for policy 0, policy_version 18700 (0.0010) [2023-10-12 16:32:07,599][62634] Updated weights for policy 0, policy_version 18710 (0.0011) [2023-10-12 16:32:07,984][62634] Updated weights for policy 0, policy_version 18720 (0.0010) [2023-10-12 16:32:08,015][62635] Updated weights for policy 1, policy_version 18690 (0.0008) [2023-10-12 16:32:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38305792. Throughput: 0: 1677.6, 1: 1694.1. Samples: 9581838. 
Policy #0 lag: (min: 11.0, avg: 12.2, max: 34.0) [2023-10-12 16:32:08,435][61643] Avg episode reward: [(0, '3.890'), (1, '8.650')] [2023-10-12 16:32:08,438][62635] Updated weights for policy 1, policy_version 18700 (0.0007) [2023-10-12 16:32:08,805][62635] Updated weights for policy 1, policy_version 18710 (0.0007) [2023-10-12 16:32:09,166][62635] Updated weights for policy 1, policy_version 18720 (0.0008) [2023-10-12 16:32:12,003][62634] Updated weights for policy 0, policy_version 18730 (0.0008) [2023-10-12 16:32:12,384][62634] Updated weights for policy 0, policy_version 18740 (0.0007) [2023-10-12 16:32:12,764][62634] Updated weights for policy 0, policy_version 18750 (0.0007) [2023-10-12 16:32:13,114][62635] Updated weights for policy 1, policy_version 18730 (0.0009) [2023-10-12 16:32:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38371328. Throughput: 0: 1666.8, 1: 1687.4. Samples: 9601962. Policy #0 lag: (min: 11.0, avg: 12.2, max: 34.0) [2023-10-12 16:32:13,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.930')] [2023-10-12 16:32:13,481][62635] Updated weights for policy 1, policy_version 18740 (0.0009) [2023-10-12 16:32:13,856][62635] Updated weights for policy 1, policy_version 18750 (0.0008) [2023-10-12 16:32:16,811][62634] Updated weights for policy 0, policy_version 18760 (0.0009) [2023-10-12 16:32:17,183][62634] Updated weights for policy 0, policy_version 18770 (0.0009) [2023-10-12 16:32:17,565][62634] Updated weights for policy 0, policy_version 18780 (0.0009) [2023-10-12 16:32:17,851][62635] Updated weights for policy 1, policy_version 18760 (0.0007) [2023-10-12 16:32:18,219][62635] Updated weights for policy 1, policy_version 18770 (0.0008) [2023-10-12 16:32:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38436864. Throughput: 0: 1658.9, 1: 1669.1. Samples: 9621194. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 54.0) [2023-10-12 16:32:18,435][61643] Avg episode reward: [(0, '3.890'), (1, '8.670')] [2023-10-12 16:32:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000018784_19234816.pth... [2023-10-12 16:32:18,488][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000017216_17629184.pth [2023-10-12 16:32:18,594][62635] Updated weights for policy 1, policy_version 18780 (0.0009) [2023-10-12 16:32:18,738][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000018784_19234816.pth... [2023-10-12 16:32:18,775][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000017216_17629184.pth [2023-10-12 16:32:21,665][62634] Updated weights for policy 0, policy_version 18790 (0.0010) [2023-10-12 16:32:22,042][62634] Updated weights for policy 0, policy_version 18800 (0.0010) [2023-10-12 16:32:22,426][62634] Updated weights for policy 0, policy_version 18810 (0.0008) [2023-10-12 16:32:22,734][62635] Updated weights for policy 1, policy_version 18790 (0.0009) [2023-10-12 16:32:23,100][62635] Updated weights for policy 1, policy_version 18800 (0.0010) [2023-10-12 16:32:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38502400. Throughput: 0: 1674.9, 1: 1680.0. Samples: 9632186. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 54.0) [2023-10-12 16:32:23,436][61643] Avg episode reward: [(0, '3.870'), (1, '8.690')] [2023-10-12 16:32:23,469][62635] Updated weights for policy 1, policy_version 18810 (0.0008) [2023-10-12 16:32:26,564][62634] Updated weights for policy 0, policy_version 18820 (0.0009) [2023-10-12 16:32:26,932][62634] Updated weights for policy 0, policy_version 18830 (0.0009) [2023-10-12 16:32:27,303][62634] Updated weights for policy 0, policy_version 18840 (0.0009) [2023-10-12 16:32:27,498][62635] Updated weights for policy 1, policy_version 18820 (0.0008) [2023-10-12 16:32:27,861][62635] Updated weights for policy 1, policy_version 18830 (0.0007) [2023-10-12 16:32:28,240][62635] Updated weights for policy 1, policy_version 18840 (0.0007) [2023-10-12 16:32:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38567936. Throughput: 0: 1664.1, 1: 1677.0. Samples: 9652104. Policy #0 lag: (min: 31.0, avg: 32.3, max: 54.0) [2023-10-12 16:32:28,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.810')] [2023-10-12 16:32:31,437][62634] Updated weights for policy 0, policy_version 18850 (0.0008) [2023-10-12 16:32:31,817][62634] Updated weights for policy 0, policy_version 18860 (0.0010) [2023-10-12 16:32:32,183][62634] Updated weights for policy 0, policy_version 18870 (0.0008) [2023-10-12 16:32:32,260][62635] Updated weights for policy 1, policy_version 18850 (0.0010) [2023-10-12 16:32:32,565][62634] Updated weights for policy 0, policy_version 18880 (0.0010) [2023-10-12 16:32:32,634][62635] Updated weights for policy 1, policy_version 18860 (0.0007) [2023-10-12 16:32:33,002][62635] Updated weights for policy 1, policy_version 18870 (0.0007) [2023-10-12 16:32:33,371][62635] Updated weights for policy 1, policy_version 18880 (0.0008) [2023-10-12 16:32:33,435][61643] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 38666240. Throughput: 0: 1666.1, 1: 1660.0. 
Samples: 9671070. Policy #0 lag: (min: 23.0, avg: 23.8, max: 41.0) [2023-10-12 16:32:33,435][61643] Avg episode reward: [(0, '3.810'), (1, '8.860')] [2023-10-12 16:32:36,552][62634] Updated weights for policy 0, policy_version 18890 (0.0009) [2023-10-12 16:32:36,935][62634] Updated weights for policy 0, policy_version 18900 (0.0007) [2023-10-12 16:32:37,310][62634] Updated weights for policy 0, policy_version 18910 (0.0008) [2023-10-12 16:32:37,460][62635] Updated weights for policy 1, policy_version 18890 (0.0009) [2023-10-12 16:32:37,835][62635] Updated weights for policy 1, policy_version 18900 (0.0009) [2023-10-12 16:32:38,203][62635] Updated weights for policy 1, policy_version 18910 (0.0007) [2023-10-12 16:32:38,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 38731776. Throughput: 0: 1682.7, 1: 1680.8. Samples: 9682220. Policy #0 lag: (min: 23.0, avg: 23.8, max: 41.0) [2023-10-12 16:32:38,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.920')] [2023-10-12 16:32:41,344][62634] Updated weights for policy 0, policy_version 18920 (0.0009) [2023-10-12 16:32:41,727][62634] Updated weights for policy 0, policy_version 18930 (0.0009) [2023-10-12 16:32:42,080][62635] Updated weights for policy 1, policy_version 18920 (0.0009) [2023-10-12 16:32:42,092][62634] Updated weights for policy 0, policy_version 18940 (0.0008) [2023-10-12 16:32:42,448][62635] Updated weights for policy 1, policy_version 18930 (0.0011) [2023-10-12 16:32:42,812][62635] Updated weights for policy 1, policy_version 18940 (0.0009) [2023-10-12 16:32:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 38797312. Throughput: 0: 1654.5, 1: 1680.4. Samples: 9701924. 
Policy #0 lag: (min: 23.0, avg: 23.8, max: 41.0) [2023-10-12 16:32:43,435][61643] Avg episode reward: [(0, '3.820'), (1, '8.990')] [2023-10-12 16:32:46,030][62634] Updated weights for policy 0, policy_version 18950 (0.0010) [2023-10-12 16:32:46,409][62634] Updated weights for policy 0, policy_version 18960 (0.0009) [2023-10-12 16:32:46,789][62634] Updated weights for policy 0, policy_version 18970 (0.0010) [2023-10-12 16:32:47,008][62635] Updated weights for policy 1, policy_version 18950 (0.0008) [2023-10-12 16:32:47,369][62635] Updated weights for policy 1, policy_version 18960 (0.0007) [2023-10-12 16:32:47,734][62635] Updated weights for policy 1, policy_version 18970 (0.0010) [2023-10-12 16:32:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 38862848. Throughput: 0: 1673.4, 1: 1658.6. Samples: 9721390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:32:48,435][61643] Avg episode reward: [(0, '3.800'), (1, '9.010')] [2023-10-12 16:32:50,723][62634] Updated weights for policy 0, policy_version 18980 (0.0009) [2023-10-12 16:32:51,110][62634] Updated weights for policy 0, policy_version 18990 (0.0008) [2023-10-12 16:32:51,493][62634] Updated weights for policy 0, policy_version 19000 (0.0008) [2023-10-12 16:32:51,728][62635] Updated weights for policy 1, policy_version 18980 (0.0008) [2023-10-12 16:32:52,103][62635] Updated weights for policy 1, policy_version 18990 (0.0010) [2023-10-12 16:32:52,457][62635] Updated weights for policy 1, policy_version 19000 (0.0009) [2023-10-12 16:32:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 38928384. Throughput: 0: 1674.0, 1: 1680.6. Samples: 9732798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:32:53,436][61643] Avg episode reward: [(0, '3.820'), (1, '9.030')] [2023-10-12 16:32:55,439][62634] Updated weights for policy 0, policy_version 19010 (0.0009) [2023-10-12 16:32:55,817][62634] Updated weights for policy 0, policy_version 19020 (0.0008) [2023-10-12 16:32:56,196][62634] Updated weights for policy 0, policy_version 19030 (0.0010) [2023-10-12 16:32:56,579][62634] Updated weights for policy 0, policy_version 19040 (0.0008) [2023-10-12 16:32:56,774][62635] Updated weights for policy 1, policy_version 19010 (0.0009) [2023-10-12 16:32:57,169][62635] Updated weights for policy 1, policy_version 19020 (0.0008) [2023-10-12 16:32:57,539][62635] Updated weights for policy 1, policy_version 19030 (0.0010) [2023-10-12 16:32:57,906][62635] Updated weights for policy 1, policy_version 19040 (0.0010) [2023-10-12 16:32:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 38993920. Throughput: 0: 1669.2, 1: 1672.9. Samples: 9752358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:32:58,436][61643] Avg episode reward: [(0, '3.790'), (1, '9.130')] [2023-10-12 16:33:00,530][62634] Updated weights for policy 0, policy_version 19050 (0.0009) [2023-10-12 16:33:00,913][62634] Updated weights for policy 0, policy_version 19060 (0.0010) [2023-10-12 16:33:01,288][62634] Updated weights for policy 0, policy_version 19070 (0.0010) [2023-10-12 16:33:02,056][62635] Updated weights for policy 1, policy_version 19050 (0.0007) [2023-10-12 16:33:02,432][62635] Updated weights for policy 1, policy_version 19060 (0.0007) [2023-10-12 16:33:02,792][62635] Updated weights for policy 1, policy_version 19070 (0.0007) [2023-10-12 16:33:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 39059456. Throughput: 0: 1696.7, 1: 1663.9. Samples: 9772422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:03,435][61643] Avg episode reward: [(0, '3.850'), (1, '8.620')] [2023-10-12 16:33:05,263][62634] Updated weights for policy 0, policy_version 19080 (0.0007) [2023-10-12 16:33:05,630][62634] Updated weights for policy 0, policy_version 19090 (0.0007) [2023-10-12 16:33:06,009][62634] Updated weights for policy 0, policy_version 19100 (0.0007) [2023-10-12 16:33:06,789][62635] Updated weights for policy 1, policy_version 19080 (0.0009) [2023-10-12 16:33:07,155][62635] Updated weights for policy 1, policy_version 19090 (0.0007) [2023-10-12 16:33:07,532][62635] Updated weights for policy 1, policy_version 19100 (0.0008) [2023-10-12 16:33:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 39124992. Throughput: 0: 1671.2, 1: 1685.5. Samples: 9783236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:08,435][61643] Avg episode reward: [(0, '3.850'), (1, '8.560')] [2023-10-12 16:33:10,036][62634] Updated weights for policy 0, policy_version 19110 (0.0008) [2023-10-12 16:33:10,421][62634] Updated weights for policy 0, policy_version 19120 (0.0010) [2023-10-12 16:33:10,806][62634] Updated weights for policy 0, policy_version 19130 (0.0010) [2023-10-12 16:33:11,554][62635] Updated weights for policy 1, policy_version 19110 (0.0008) [2023-10-12 16:33:11,919][62635] Updated weights for policy 1, policy_version 19120 (0.0010) [2023-10-12 16:33:12,287][62635] Updated weights for policy 1, policy_version 19130 (0.0009) [2023-10-12 16:33:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 39190528. Throughput: 0: 1686.0, 1: 1668.2. Samples: 9803042. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:13,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.410')] [2023-10-12 16:33:14,923][62634] Updated weights for policy 0, policy_version 19140 (0.0007) [2023-10-12 16:33:15,311][62634] Updated weights for policy 0, policy_version 19150 (0.0008) [2023-10-12 16:33:15,687][62634] Updated weights for policy 0, policy_version 19160 (0.0010) [2023-10-12 16:33:16,308][62635] Updated weights for policy 1, policy_version 19140 (0.0011) [2023-10-12 16:33:16,673][62635] Updated weights for policy 1, policy_version 19150 (0.0011) [2023-10-12 16:33:17,050][62635] Updated weights for policy 1, policy_version 19160 (0.0010) [2023-10-12 16:33:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 39256064. Throughput: 0: 1702.7, 1: 1676.5. Samples: 9823136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:18,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.440')] [2023-10-12 16:33:19,739][62634] Updated weights for policy 0, policy_version 19170 (0.0009) [2023-10-12 16:33:20,118][62634] Updated weights for policy 0, policy_version 19180 (0.0009) [2023-10-12 16:33:20,493][62634] Updated weights for policy 0, policy_version 19190 (0.0007) [2023-10-12 16:33:20,882][62634] Updated weights for policy 0, policy_version 19200 (0.0010) [2023-10-12 16:33:21,151][62635] Updated weights for policy 1, policy_version 19170 (0.0009) [2023-10-12 16:33:21,528][62635] Updated weights for policy 1, policy_version 19180 (0.0008) [2023-10-12 16:33:21,892][62635] Updated weights for policy 1, policy_version 19190 (0.0008) [2023-10-12 16:33:22,258][62635] Updated weights for policy 1, policy_version 19200 (0.0009) [2023-10-12 16:33:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 39321600. Throughput: 0: 1675.1, 1: 1687.6. Samples: 9833540. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:23,435][61643] Avg episode reward: [(0, '3.890'), (1, '8.370')] [2023-10-12 16:33:24,833][62634] Updated weights for policy 0, policy_version 19210 (0.0009) [2023-10-12 16:33:25,198][62634] Updated weights for policy 0, policy_version 19220 (0.0008) [2023-10-12 16:33:25,574][62634] Updated weights for policy 0, policy_version 19230 (0.0009) [2023-10-12 16:33:26,501][62635] Updated weights for policy 1, policy_version 19210 (0.0009) [2023-10-12 16:33:26,865][62635] Updated weights for policy 1, policy_version 19220 (0.0009) [2023-10-12 16:33:27,241][62635] Updated weights for policy 1, policy_version 19230 (0.0009) [2023-10-12 16:33:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 39387136. Throughput: 0: 1704.4, 1: 1666.2. Samples: 9853600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:28,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.570')] [2023-10-12 16:33:29,646][62634] Updated weights for policy 0, policy_version 19240 (0.0008) [2023-10-12 16:33:30,024][62634] Updated weights for policy 0, policy_version 19250 (0.0010) [2023-10-12 16:33:30,415][62634] Updated weights for policy 0, policy_version 19260 (0.0008) [2023-10-12 16:33:31,194][62635] Updated weights for policy 1, policy_version 19240 (0.0009) [2023-10-12 16:33:31,565][62635] Updated weights for policy 1, policy_version 19250 (0.0008) [2023-10-12 16:33:31,937][62635] Updated weights for policy 1, policy_version 19260 (0.0008) [2023-10-12 16:33:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 39452672. Throughput: 0: 1708.3, 1: 1682.3. Samples: 9873966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:33,436][61643] Avg episode reward: [(0, '3.880'), (1, '8.910')] [2023-10-12 16:33:34,464][62634] Updated weights for policy 0, policy_version 19270 (0.0009) [2023-10-12 16:33:34,837][62634] Updated weights for policy 0, policy_version 19280 (0.0008) [2023-10-12 16:33:35,217][62634] Updated weights for policy 0, policy_version 19290 (0.0010) [2023-10-12 16:33:35,989][62635] Updated weights for policy 1, policy_version 19270 (0.0008) [2023-10-12 16:33:36,351][62635] Updated weights for policy 1, policy_version 19280 (0.0009) [2023-10-12 16:33:36,724][62635] Updated weights for policy 1, policy_version 19290 (0.0010) [2023-10-12 16:33:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 39518208. Throughput: 0: 1681.8, 1: 1676.1. Samples: 9883904. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:38,436][61643] Avg episode reward: [(0, '3.930'), (1, '8.620')] [2023-10-12 16:33:39,445][62634] Updated weights for policy 0, policy_version 19300 (0.0010) [2023-10-12 16:33:39,814][62634] Updated weights for policy 0, policy_version 19310 (0.0010) [2023-10-12 16:33:40,199][62634] Updated weights for policy 0, policy_version 19320 (0.0009) [2023-10-12 16:33:40,866][62635] Updated weights for policy 1, policy_version 19300 (0.0009) [2023-10-12 16:33:41,228][62635] Updated weights for policy 1, policy_version 19310 (0.0007) [2023-10-12 16:33:41,609][62635] Updated weights for policy 1, policy_version 19320 (0.0007) [2023-10-12 16:33:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39583744. Throughput: 0: 1692.4, 1: 1662.0. Samples: 9903304. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:33:43,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.850')] [2023-10-12 16:33:44,273][62634] Updated weights for policy 0, policy_version 19330 (0.0011) [2023-10-12 16:33:44,659][62634] Updated weights for policy 0, policy_version 19340 (0.0008) [2023-10-12 16:33:45,027][62634] Updated weights for policy 0, policy_version 19350 (0.0009) [2023-10-12 16:33:45,405][62634] Updated weights for policy 0, policy_version 19360 (0.0007) [2023-10-12 16:33:45,685][62635] Updated weights for policy 1, policy_version 19330 (0.0008) [2023-10-12 16:33:46,080][62635] Updated weights for policy 1, policy_version 19340 (0.0007) [2023-10-12 16:33:46,439][62635] Updated weights for policy 1, policy_version 19350 (0.0009) [2023-10-12 16:33:46,809][62635] Updated weights for policy 1, policy_version 19360 (0.0007) [2023-10-12 16:33:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39649280. Throughput: 0: 1684.0, 1: 1678.5. Samples: 9923732. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-12 16:33:48,436][61643] Avg episode reward: [(0, '3.910'), (1, '9.100')] [2023-10-12 16:33:49,569][62634] Updated weights for policy 0, policy_version 19370 (0.0008) [2023-10-12 16:33:49,951][62634] Updated weights for policy 0, policy_version 19380 (0.0011) [2023-10-12 16:33:50,320][62634] Updated weights for policy 0, policy_version 19390 (0.0008) [2023-10-12 16:33:50,812][62635] Updated weights for policy 1, policy_version 19370 (0.0007) [2023-10-12 16:33:51,178][62635] Updated weights for policy 1, policy_version 19380 (0.0008) [2023-10-12 16:33:51,551][62635] Updated weights for policy 1, policy_version 19390 (0.0007) [2023-10-12 16:33:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 39714816. Throughput: 0: 1677.6, 1: 1663.0. Samples: 9933562. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-12 16:33:53,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.850')] [2023-10-12 16:33:54,428][62634] Updated weights for policy 0, policy_version 19400 (0.0008) [2023-10-12 16:33:54,796][62634] Updated weights for policy 0, policy_version 19410 (0.0007) [2023-10-12 16:33:55,177][62634] Updated weights for policy 0, policy_version 19420 (0.0009) [2023-10-12 16:33:55,728][62635] Updated weights for policy 1, policy_version 19400 (0.0008) [2023-10-12 16:33:56,091][62635] Updated weights for policy 1, policy_version 19410 (0.0010) [2023-10-12 16:33:56,460][62635] Updated weights for policy 1, policy_version 19420 (0.0009) [2023-10-12 16:33:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39780352. Throughput: 0: 1681.0, 1: 1664.9. Samples: 9953606. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-12 16:33:58,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.860')] [2023-10-12 16:33:59,220][62634] Updated weights for policy 0, policy_version 19430 (0.0010) [2023-10-12 16:33:59,600][62634] Updated weights for policy 0, policy_version 19440 (0.0008) [2023-10-12 16:33:59,975][62634] Updated weights for policy 0, policy_version 19450 (0.0009) [2023-10-12 16:34:00,491][62635] Updated weights for policy 1, policy_version 19430 (0.0010) [2023-10-12 16:34:00,865][62635] Updated weights for policy 1, policy_version 19440 (0.0009) [2023-10-12 16:34:01,241][62635] Updated weights for policy 1, policy_version 19450 (0.0008) [2023-10-12 16:34:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39845888. Throughput: 0: 1684.2, 1: 1675.9. Samples: 9974338. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) [2023-10-12 16:34:03,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.030')] [2023-10-12 16:34:03,979][62634] Updated weights for policy 0, policy_version 19460 (0.0008) [2023-10-12 16:34:04,360][62634] Updated weights for policy 0, policy_version 19470 (0.0008) [2023-10-12 16:34:04,734][62634] Updated weights for policy 0, policy_version 19480 (0.0008) [2023-10-12 16:34:05,414][62635] Updated weights for policy 1, policy_version 19460 (0.0009) [2023-10-12 16:34:05,779][62635] Updated weights for policy 1, policy_version 19470 (0.0008) [2023-10-12 16:34:06,148][62635] Updated weights for policy 1, policy_version 19480 (0.0008) [2023-10-12 16:34:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39911424. Throughput: 0: 1684.8, 1: 1654.1. Samples: 9983792. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) [2023-10-12 16:34:08,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.840')] [2023-10-12 16:34:08,818][62634] Updated weights for policy 0, policy_version 19490 (0.0009) [2023-10-12 16:34:09,195][62634] Updated weights for policy 0, policy_version 19500 (0.0010) [2023-10-12 16:34:09,575][62634] Updated weights for policy 0, policy_version 19510 (0.0009) [2023-10-12 16:34:09,959][62634] Updated weights for policy 0, policy_version 19520 (0.0010) [2023-10-12 16:34:10,141][62635] Updated weights for policy 1, policy_version 19490 (0.0010) [2023-10-12 16:34:10,513][62635] Updated weights for policy 1, policy_version 19500 (0.0010) [2023-10-12 16:34:10,875][62635] Updated weights for policy 1, policy_version 19510 (0.0008) [2023-10-12 16:34:11,245][62635] Updated weights for policy 1, policy_version 19520 (0.0007) [2023-10-12 16:34:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39976960. Throughput: 0: 1677.0, 1: 1665.7. Samples: 10004022. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) [2023-10-12 16:34:13,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.890')] [2023-10-12 16:34:13,908][62634] Updated weights for policy 0, policy_version 19530 (0.0010) [2023-10-12 16:34:14,277][62634] Updated weights for policy 0, policy_version 19540 (0.0007) [2023-10-12 16:34:14,655][62634] Updated weights for policy 0, policy_version 19550 (0.0007) [2023-10-12 16:34:15,372][62635] Updated weights for policy 1, policy_version 19530 (0.0009) [2023-10-12 16:34:15,744][62635] Updated weights for policy 1, policy_version 19540 (0.0008) [2023-10-12 16:34:16,105][62635] Updated weights for policy 1, policy_version 19550 (0.0008) [2023-10-12 16:34:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40042496. Throughput: 0: 1676.0, 1: 1673.1. Samples: 10024678. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:34:18,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.880')] [2023-10-12 16:34:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000019552_20021248.pth... [2023-10-12 16:34:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000017984_18415616.pth [2023-10-12 16:34:18,669][62634] Updated weights for policy 0, policy_version 19560 (0.0008) [2023-10-12 16:34:19,053][62634] Updated weights for policy 0, policy_version 19570 (0.0010) [2023-10-12 16:34:19,415][62634] Updated weights for policy 0, policy_version 19580 (0.0010) [2023-10-12 16:34:19,568][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000019584_20054016.pth... 
[2023-10-12 16:34:19,607][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000017984_18415616.pth [2023-10-12 16:34:20,177][62635] Updated weights for policy 1, policy_version 19560 (0.0010) [2023-10-12 16:34:20,549][62635] Updated weights for policy 1, policy_version 19570 (0.0009) [2023-10-12 16:34:20,918][62635] Updated weights for policy 1, policy_version 19580 (0.0009) [2023-10-12 16:34:23,414][62634] Updated weights for policy 0, policy_version 19590 (0.0010) [2023-10-12 16:34:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40108032. Throughput: 0: 1677.7, 1: 1653.8. Samples: 10033820. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:34:23,435][61643] Avg episode reward: [(0, '3.820'), (1, '8.910')] [2023-10-12 16:34:23,791][62634] Updated weights for policy 0, policy_version 19600 (0.0011) [2023-10-12 16:34:24,165][62634] Updated weights for policy 0, policy_version 19610 (0.0010) [2023-10-12 16:34:25,069][62635] Updated weights for policy 1, policy_version 19590 (0.0009) [2023-10-12 16:34:25,437][62635] Updated weights for policy 1, policy_version 19600 (0.0010) [2023-10-12 16:34:25,806][62635] Updated weights for policy 1, policy_version 19610 (0.0010) [2023-10-12 16:34:28,183][62634] Updated weights for policy 0, policy_version 19620 (0.0010) [2023-10-12 16:34:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40173568. Throughput: 0: 1683.4, 1: 1672.8. Samples: 10054336. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:34:28,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.940')] [2023-10-12 16:34:28,556][62634] Updated weights for policy 0, policy_version 19630 (0.0011) [2023-10-12 16:34:28,936][62634] Updated weights for policy 0, policy_version 19640 (0.0009) [2023-10-12 16:34:29,918][62635] Updated weights for policy 1, policy_version 19620 (0.0011) [2023-10-12 16:34:30,286][62635] Updated weights for policy 1, policy_version 19630 (0.0011) [2023-10-12 16:34:30,650][62635] Updated weights for policy 1, policy_version 19640 (0.0009) [2023-10-12 16:34:33,015][62634] Updated weights for policy 0, policy_version 19650 (0.0011) [2023-10-12 16:34:33,395][62634] Updated weights for policy 0, policy_version 19660 (0.0007) [2023-10-12 16:34:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40239104. Throughput: 0: 1688.2, 1: 1674.0. Samples: 10075034. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) [2023-10-12 16:34:33,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.750')] [2023-10-12 16:34:33,774][62634] Updated weights for policy 0, policy_version 19670 (0.0007) [2023-10-12 16:34:34,160][62634] Updated weights for policy 0, policy_version 19680 (0.0011) [2023-10-12 16:34:34,948][62635] Updated weights for policy 1, policy_version 19650 (0.0010) [2023-10-12 16:34:35,331][62635] Updated weights for policy 1, policy_version 19660 (0.0009) [2023-10-12 16:34:35,700][62635] Updated weights for policy 1, policy_version 19670 (0.0009) [2023-10-12 16:34:36,066][62635] Updated weights for policy 1, policy_version 19680 (0.0008) [2023-10-12 16:34:38,253][62634] Updated weights for policy 0, policy_version 19690 (0.0007) [2023-10-12 16:34:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40304640. Throughput: 0: 1687.4, 1: 1657.2. Samples: 10084070. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:34:38,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.820')] [2023-10-12 16:34:38,628][62634] Updated weights for policy 0, policy_version 19700 (0.0008) [2023-10-12 16:34:39,005][62634] Updated weights for policy 0, policy_version 19710 (0.0008) [2023-10-12 16:34:40,167][62635] Updated weights for policy 1, policy_version 19690 (0.0011) [2023-10-12 16:34:40,538][62635] Updated weights for policy 1, policy_version 19700 (0.0009) [2023-10-12 16:34:40,913][62635] Updated weights for policy 1, policy_version 19710 (0.0008) [2023-10-12 16:34:43,093][62634] Updated weights for policy 0, policy_version 19720 (0.0011) [2023-10-12 16:34:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40370176. Throughput: 0: 1684.0, 1: 1672.3. Samples: 10104638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:34:43,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.780')] [2023-10-12 16:34:43,480][62634] Updated weights for policy 0, policy_version 19730 (0.0011) [2023-10-12 16:34:43,855][62634] Updated weights for policy 0, policy_version 19740 (0.0008) [2023-10-12 16:34:45,006][62635] Updated weights for policy 1, policy_version 19720 (0.0007) [2023-10-12 16:34:45,371][62635] Updated weights for policy 1, policy_version 19730 (0.0008) [2023-10-12 16:34:45,737][62635] Updated weights for policy 1, policy_version 19740 (0.0010) [2023-10-12 16:34:47,986][62634] Updated weights for policy 0, policy_version 19750 (0.0008) [2023-10-12 16:34:48,367][62634] Updated weights for policy 0, policy_version 19760 (0.0008) [2023-10-12 16:34:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40435712. Throughput: 0: 1677.8, 1: 1669.6. Samples: 10124968. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:34:48,435][61643] Avg episode reward: [(0, '3.920'), (1, '8.790')] [2023-10-12 16:34:48,741][62634] Updated weights for policy 0, policy_version 19770 (0.0009) [2023-10-12 16:34:49,824][62635] Updated weights for policy 1, policy_version 19750 (0.0011) [2023-10-12 16:34:50,189][62635] Updated weights for policy 1, policy_version 19760 (0.0008) [2023-10-12 16:34:50,562][62635] Updated weights for policy 1, policy_version 19770 (0.0009) [2023-10-12 16:34:52,758][62634] Updated weights for policy 0, policy_version 19780 (0.0007) [2023-10-12 16:34:53,137][62634] Updated weights for policy 0, policy_version 19790 (0.0007) [2023-10-12 16:34:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40501248. Throughput: 0: 1684.4, 1: 1661.2. Samples: 10134342. Policy #0 lag: (min: 11.0, avg: 13.2, max: 43.0) [2023-10-12 16:34:53,435][61643] Avg episode reward: [(0, '3.930'), (1, '8.740')] [2023-10-12 16:34:53,514][62634] Updated weights for policy 0, policy_version 19800 (0.0007) [2023-10-12 16:34:54,573][62635] Updated weights for policy 1, policy_version 19780 (0.0007) [2023-10-12 16:34:54,945][62635] Updated weights for policy 1, policy_version 19790 (0.0007) [2023-10-12 16:34:55,314][62635] Updated weights for policy 1, policy_version 19800 (0.0008) [2023-10-12 16:34:57,577][62634] Updated weights for policy 0, policy_version 19810 (0.0008) [2023-10-12 16:34:57,953][62634] Updated weights for policy 0, policy_version 19820 (0.0010) [2023-10-12 16:34:58,328][62634] Updated weights for policy 0, policy_version 19830 (0.0009) [2023-10-12 16:34:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40566784. Throughput: 0: 1681.5, 1: 1672.9. Samples: 10154968. 
Policy #0 lag: (min: 11.0, avg: 13.2, max: 43.0) [2023-10-12 16:34:58,435][61643] Avg episode reward: [(0, '3.840'), (1, '8.860')] [2023-10-12 16:34:58,704][62634] Updated weights for policy 0, policy_version 19840 (0.0009) [2023-10-12 16:34:59,406][62635] Updated weights for policy 1, policy_version 19810 (0.0007) [2023-10-12 16:34:59,779][62635] Updated weights for policy 1, policy_version 19820 (0.0008) [2023-10-12 16:35:00,147][62635] Updated weights for policy 1, policy_version 19830 (0.0008) [2023-10-12 16:35:00,518][62635] Updated weights for policy 1, policy_version 19840 (0.0009) [2023-10-12 16:35:02,668][62634] Updated weights for policy 0, policy_version 19850 (0.0007) [2023-10-12 16:35:03,045][62634] Updated weights for policy 0, policy_version 19860 (0.0007) [2023-10-12 16:35:03,432][62634] Updated weights for policy 0, policy_version 19870 (0.0009) [2023-10-12 16:35:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40632320. Throughput: 0: 1667.2, 1: 1675.7. Samples: 10175112. Policy #0 lag: (min: 11.0, avg: 13.2, max: 43.0) [2023-10-12 16:35:03,435][61643] Avg episode reward: [(0, '3.800'), (1, '8.910')] [2023-10-12 16:35:04,535][62635] Updated weights for policy 1, policy_version 19850 (0.0007) [2023-10-12 16:35:04,904][62635] Updated weights for policy 1, policy_version 19860 (0.0009) [2023-10-12 16:35:05,276][62635] Updated weights for policy 1, policy_version 19870 (0.0007) [2023-10-12 16:35:07,469][62634] Updated weights for policy 0, policy_version 19880 (0.0009) [2023-10-12 16:35:07,851][62634] Updated weights for policy 0, policy_version 19890 (0.0008) [2023-10-12 16:35:08,226][62634] Updated weights for policy 0, policy_version 19900 (0.0009) [2023-10-12 16:35:08,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40730624. Throughput: 0: 1683.8, 1: 1672.1. Samples: 10184836. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:35:08,435][61643] Avg episode reward: [(0, '3.830'), (1, '8.930')] [2023-10-12 16:35:09,343][62635] Updated weights for policy 1, policy_version 19880 (0.0007) [2023-10-12 16:35:09,708][62635] Updated weights for policy 1, policy_version 19890 (0.0007) [2023-10-12 16:35:10,084][62635] Updated weights for policy 1, policy_version 19900 (0.0009) [2023-10-12 16:35:12,098][62634] Updated weights for policy 0, policy_version 19910 (0.0009) [2023-10-12 16:35:12,476][62634] Updated weights for policy 0, policy_version 19920 (0.0009) [2023-10-12 16:35:12,852][62634] Updated weights for policy 0, policy_version 19930 (0.0010) [2023-10-12 16:35:13,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 40796160. Throughput: 0: 1688.7, 1: 1678.6. Samples: 10205864. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:35:13,435][61643] Avg episode reward: [(0, '3.900'), (1, '9.000')] [2023-10-12 16:35:14,170][62635] Updated weights for policy 1, policy_version 19910 (0.0008) [2023-10-12 16:35:14,529][62635] Updated weights for policy 1, policy_version 19920 (0.0007) [2023-10-12 16:35:14,903][62635] Updated weights for policy 1, policy_version 19930 (0.0007) [2023-10-12 16:35:16,862][62634] Updated weights for policy 0, policy_version 19940 (0.0008) [2023-10-12 16:35:17,235][62634] Updated weights for policy 0, policy_version 19950 (0.0010) [2023-10-12 16:35:17,618][62634] Updated weights for policy 0, policy_version 19960 (0.0009) [2023-10-12 16:35:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40861696. Throughput: 0: 1660.6, 1: 1680.6. Samples: 10225388. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:35:18,436][61643] Avg episode reward: [(0, '3.910'), (1, '9.150')] [2023-10-12 16:35:18,921][62635] Updated weights for policy 1, policy_version 19940 (0.0009) [2023-10-12 16:35:19,297][62635] Updated weights for policy 1, policy_version 19950 (0.0009) [2023-10-12 16:35:19,668][62635] Updated weights for policy 1, policy_version 19960 (0.0007) [2023-10-12 16:35:21,699][62634] Updated weights for policy 0, policy_version 19970 (0.0008) [2023-10-12 16:35:22,071][62634] Updated weights for policy 0, policy_version 19980 (0.0008) [2023-10-12 16:35:22,454][62634] Updated weights for policy 0, policy_version 19990 (0.0007) [2023-10-12 16:35:22,834][62634] Updated weights for policy 0, policy_version 20000 (0.0007) [2023-10-12 16:35:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40927232. Throughput: 0: 1689.6, 1: 1678.9. Samples: 10235656. Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 16:35:23,435][61643] Avg episode reward: [(0, '3.890'), (1, '9.140')] [2023-10-12 16:35:23,666][62635] Updated weights for policy 1, policy_version 19970 (0.0010) [2023-10-12 16:35:24,053][62635] Updated weights for policy 1, policy_version 19980 (0.0009) [2023-10-12 16:35:24,431][62635] Updated weights for policy 1, policy_version 19990 (0.0008) [2023-10-12 16:35:24,809][62635] Updated weights for policy 1, policy_version 20000 (0.0010) [2023-10-12 16:35:26,906][62634] Updated weights for policy 0, policy_version 20010 (0.0007) [2023-10-12 16:35:27,280][62634] Updated weights for policy 0, policy_version 20020 (0.0007) [2023-10-12 16:35:27,668][62634] Updated weights for policy 0, policy_version 20030 (0.0008) [2023-10-12 16:35:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40992768. Throughput: 0: 1682.1, 1: 1681.6. Samples: 10256004. 
Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 16:35:28,435][61643] Avg episode reward: [(0, '3.860'), (1, '9.040')] [2023-10-12 16:35:28,797][62635] Updated weights for policy 1, policy_version 20010 (0.0007) [2023-10-12 16:35:29,166][62635] Updated weights for policy 1, policy_version 20020 (0.0008) [2023-10-12 16:35:29,537][62635] Updated weights for policy 1, policy_version 20030 (0.0009) [2023-10-12 16:35:31,622][62634] Updated weights for policy 0, policy_version 20040 (0.0009) [2023-10-12 16:35:31,997][62634] Updated weights for policy 0, policy_version 20050 (0.0008) [2023-10-12 16:35:32,382][62634] Updated weights for policy 0, policy_version 20060 (0.0008) [2023-10-12 16:35:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41058304. Throughput: 0: 1669.2, 1: 1688.3. Samples: 10276054. Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 16:35:33,435][61643] Avg episode reward: [(0, '3.890'), (1, '8.940')] [2023-10-12 16:35:33,558][62635] Updated weights for policy 1, policy_version 20040 (0.0008) [2023-10-12 16:35:33,933][62635] Updated weights for policy 1, policy_version 20050 (0.0008) [2023-10-12 16:35:34,300][62635] Updated weights for policy 1, policy_version 20060 (0.0008) [2023-10-12 16:35:36,331][62634] Updated weights for policy 0, policy_version 20070 (0.0008) [2023-10-12 16:35:36,711][62634] Updated weights for policy 0, policy_version 20080 (0.0007) [2023-10-12 16:35:37,081][62634] Updated weights for policy 0, policy_version 20090 (0.0010) [2023-10-12 16:35:38,236][62635] Updated weights for policy 1, policy_version 20070 (0.0009) [2023-10-12 16:35:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41123840. Throughput: 0: 1695.1, 1: 1689.7. Samples: 10286658. 
Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 16:35:38,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.880')] [2023-10-12 16:35:38,601][62635] Updated weights for policy 1, policy_version 20080 (0.0010) [2023-10-12 16:35:38,966][62635] Updated weights for policy 1, policy_version 20090 (0.0009) [2023-10-12 16:35:41,086][62634] Updated weights for policy 0, policy_version 20100 (0.0009) [2023-10-12 16:35:41,487][62634] Updated weights for policy 0, policy_version 20110 (0.0007) [2023-10-12 16:35:41,860][62634] Updated weights for policy 0, policy_version 20120 (0.0009) [2023-10-12 16:35:43,155][62635] Updated weights for policy 1, policy_version 20100 (0.0009) [2023-10-12 16:35:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41189376. Throughput: 0: 1677.2, 1: 1684.9. Samples: 10306260. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 16:35:43,435][61643] Avg episode reward: [(0, '3.930'), (1, '8.850')] [2023-10-12 16:35:43,529][62635] Updated weights for policy 1, policy_version 20110 (0.0007) [2023-10-12 16:35:43,901][62635] Updated weights for policy 1, policy_version 20120 (0.0009) [2023-10-12 16:35:45,832][62634] Updated weights for policy 0, policy_version 20130 (0.0009) [2023-10-12 16:35:46,203][62634] Updated weights for policy 0, policy_version 20140 (0.0008) [2023-10-12 16:35:46,582][62634] Updated weights for policy 0, policy_version 20150 (0.0011) [2023-10-12 16:35:46,960][62634] Updated weights for policy 0, policy_version 20160 (0.0011) [2023-10-12 16:35:47,926][62635] Updated weights for policy 1, policy_version 20130 (0.0007) [2023-10-12 16:35:48,295][62635] Updated weights for policy 1, policy_version 20140 (0.0009) [2023-10-12 16:35:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41254912. Throughput: 0: 1687.9, 1: 1681.1. Samples: 10326714. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 16:35:48,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.640')] [2023-10-12 16:35:48,666][62635] Updated weights for policy 1, policy_version 20150 (0.0010) [2023-10-12 16:35:49,031][62635] Updated weights for policy 1, policy_version 20160 (0.0009) [2023-10-12 16:35:50,878][62634] Updated weights for policy 0, policy_version 20170 (0.0009) [2023-10-12 16:35:51,255][62634] Updated weights for policy 0, policy_version 20180 (0.0008) [2023-10-12 16:35:51,628][62634] Updated weights for policy 0, policy_version 20190 (0.0009) [2023-10-12 16:35:53,195][62635] Updated weights for policy 1, policy_version 20170 (0.0007) [2023-10-12 16:35:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41320448. Throughput: 0: 1690.8, 1: 1684.0. Samples: 10336702. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 16:35:53,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.920')] [2023-10-12 16:35:53,567][62635] Updated weights for policy 1, policy_version 20180 (0.0008) [2023-10-12 16:35:53,935][62635] Updated weights for policy 1, policy_version 20190 (0.0008) [2023-10-12 16:35:55,860][62634] Updated weights for policy 0, policy_version 20200 (0.0010) [2023-10-12 16:35:56,230][62634] Updated weights for policy 0, policy_version 20210 (0.0008) [2023-10-12 16:35:56,601][62634] Updated weights for policy 0, policy_version 20220 (0.0011) [2023-10-12 16:35:58,048][62635] Updated weights for policy 1, policy_version 20200 (0.0007) [2023-10-12 16:35:58,415][62635] Updated weights for policy 1, policy_version 20210 (0.0007) [2023-10-12 16:35:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41385984. Throughput: 0: 1663.4, 1: 1682.4. Samples: 10356424. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 16:35:58,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.440')] [2023-10-12 16:35:58,772][62635] Updated weights for policy 1, policy_version 20220 (0.0008) [2023-10-12 16:36:00,721][62634] Updated weights for policy 0, policy_version 20230 (0.0008) [2023-10-12 16:36:01,097][62634] Updated weights for policy 0, policy_version 20240 (0.0009) [2023-10-12 16:36:01,478][62634] Updated weights for policy 0, policy_version 20250 (0.0007) [2023-10-12 16:36:02,824][62635] Updated weights for policy 1, policy_version 20230 (0.0007) [2023-10-12 16:36:03,192][62635] Updated weights for policy 1, policy_version 20240 (0.0009) [2023-10-12 16:36:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41451520. Throughput: 0: 1686.2, 1: 1676.5. Samples: 10376712. Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) [2023-10-12 16:36:03,436][61643] Avg episode reward: [(0, '3.880'), (1, '8.500')] [2023-10-12 16:36:03,559][62635] Updated weights for policy 1, policy_version 20250 (0.0009) [2023-10-12 16:36:05,621][62634] Updated weights for policy 0, policy_version 20260 (0.0010) [2023-10-12 16:36:06,001][62634] Updated weights for policy 0, policy_version 20270 (0.0009) [2023-10-12 16:36:06,381][62634] Updated weights for policy 0, policy_version 20280 (0.0009) [2023-10-12 16:36:07,700][62635] Updated weights for policy 1, policy_version 20260 (0.0008) [2023-10-12 16:36:08,073][62635] Updated weights for policy 1, policy_version 20270 (0.0009) [2023-10-12 16:36:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41517056. Throughput: 0: 1675.0, 1: 1687.2. Samples: 10386954. 
Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) [2023-10-12 16:36:08,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.670')] [2023-10-12 16:36:08,450][62635] Updated weights for policy 1, policy_version 20280 (0.0008) [2023-10-12 16:36:10,648][62634] Updated weights for policy 0, policy_version 20290 (0.0010) [2023-10-12 16:36:11,025][62634] Updated weights for policy 0, policy_version 20300 (0.0010) [2023-10-12 16:36:11,408][62634] Updated weights for policy 0, policy_version 20310 (0.0011) [2023-10-12 16:36:11,774][62634] Updated weights for policy 0, policy_version 20320 (0.0011) [2023-10-12 16:36:12,618][62635] Updated weights for policy 1, policy_version 20290 (0.0008) [2023-10-12 16:36:13,025][62635] Updated weights for policy 1, policy_version 20300 (0.0009) [2023-10-12 16:36:13,392][62635] Updated weights for policy 1, policy_version 20310 (0.0010) [2023-10-12 16:36:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41582592. Throughput: 0: 1660.4, 1: 1687.1. Samples: 10406644. Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) [2023-10-12 16:36:13,436][61643] Avg episode reward: [(0, '3.900'), (1, '8.430')] [2023-10-12 16:36:13,758][62635] Updated weights for policy 1, policy_version 20320 (0.0011) [2023-10-12 16:36:15,600][62634] Updated weights for policy 0, policy_version 20330 (0.0008) [2023-10-12 16:36:15,981][62634] Updated weights for policy 0, policy_version 20340 (0.0009) [2023-10-12 16:36:16,350][62634] Updated weights for policy 0, policy_version 20350 (0.0009) [2023-10-12 16:36:17,939][62635] Updated weights for policy 1, policy_version 20330 (0.0009) [2023-10-12 16:36:18,315][62635] Updated weights for policy 1, policy_version 20340 (0.0008) [2023-10-12 16:36:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41648128. Throughput: 0: 1685.6, 1: 1666.2. Samples: 10426888. 
Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) [2023-10-12 16:36:18,435][61643] Avg episode reward: [(0, '3.930'), (1, '8.450')] [2023-10-12 16:36:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000020352_20840448.pth... [2023-10-12 16:36:18,473][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000018784_19234816.pth [2023-10-12 16:36:18,682][62635] Updated weights for policy 1, policy_version 20350 (0.0010) [2023-10-12 16:36:18,760][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000020352_20840448.pth... [2023-10-12 16:36:18,799][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000018784_19234816.pth [2023-10-12 16:36:20,465][62634] Updated weights for policy 0, policy_version 20360 (0.0009) [2023-10-12 16:36:20,839][62634] Updated weights for policy 0, policy_version 20370 (0.0008) [2023-10-12 16:36:21,223][62634] Updated weights for policy 0, policy_version 20380 (0.0007) [2023-10-12 16:36:22,776][62635] Updated weights for policy 1, policy_version 20360 (0.0007) [2023-10-12 16:36:23,142][62635] Updated weights for policy 1, policy_version 20370 (0.0009) [2023-10-12 16:36:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41713664. Throughput: 0: 1663.3, 1: 1671.8. Samples: 10436736. 
Policy #0 lag: (min: 18.0, avg: 30.6, max: 50.0) [2023-10-12 16:36:23,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.500')] [2023-10-12 16:36:23,510][62635] Updated weights for policy 1, policy_version 20380 (0.0010) [2023-10-12 16:36:25,225][62634] Updated weights for policy 0, policy_version 20390 (0.0009) [2023-10-12 16:36:25,609][62634] Updated weights for policy 0, policy_version 20400 (0.0009) [2023-10-12 16:36:25,977][62634] Updated weights for policy 0, policy_version 20410 (0.0009) [2023-10-12 16:36:27,362][62635] Updated weights for policy 1, policy_version 20390 (0.0010) [2023-10-12 16:36:27,726][62635] Updated weights for policy 1, policy_version 20400 (0.0011) [2023-10-12 16:36:28,100][62635] Updated weights for policy 1, policy_version 20410 (0.0008) [2023-10-12 16:36:28,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 41811968. Throughput: 0: 1676.8, 1: 1677.8. Samples: 10457214. Policy #0 lag: (min: 18.0, avg: 30.6, max: 50.0) [2023-10-12 16:36:28,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.430')] [2023-10-12 16:36:28,436][62354] Saving new best policy, reward=3.970! [2023-10-12 16:36:30,173][62634] Updated weights for policy 0, policy_version 20420 (0.0009) [2023-10-12 16:36:30,547][62634] Updated weights for policy 0, policy_version 20430 (0.0010) [2023-10-12 16:36:30,925][62634] Updated weights for policy 0, policy_version 20440 (0.0009) [2023-10-12 16:36:31,931][62635] Updated weights for policy 1, policy_version 20420 (0.0009) [2023-10-12 16:36:32,309][62635] Updated weights for policy 1, policy_version 20430 (0.0009) [2023-10-12 16:36:32,683][62635] Updated weights for policy 1, policy_version 20440 (0.0010) [2023-10-12 16:36:33,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 41877504. Throughput: 0: 1676.5, 1: 1653.7. Samples: 10476574. 
Policy #0 lag: (min: 18.0, avg: 30.6, max: 50.0) [2023-10-12 16:36:33,436][61643] Avg episode reward: [(0, '4.000'), (1, '8.520')] [2023-10-12 16:36:33,446][62354] Saving new best policy, reward=4.000! [2023-10-12 16:36:34,929][62634] Updated weights for policy 0, policy_version 20450 (0.0009) [2023-10-12 16:36:35,317][62634] Updated weights for policy 0, policy_version 20460 (0.0009) [2023-10-12 16:36:35,677][62634] Updated weights for policy 0, policy_version 20470 (0.0009) [2023-10-12 16:36:36,059][62634] Updated weights for policy 0, policy_version 20480 (0.0009) [2023-10-12 16:36:36,700][62635] Updated weights for policy 1, policy_version 20450 (0.0009) [2023-10-12 16:36:37,070][62635] Updated weights for policy 1, policy_version 20460 (0.0008) [2023-10-12 16:36:37,450][62635] Updated weights for policy 1, policy_version 20470 (0.0008) [2023-10-12 16:36:37,823][62635] Updated weights for policy 1, policy_version 20480 (0.0009) [2023-10-12 16:36:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 41943040. Throughput: 0: 1662.3, 1: 1684.1. Samples: 10487292. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:36:38,436][61643] Avg episode reward: [(0, '4.000'), (1, '8.630')] [2023-10-12 16:36:40,153][62634] Updated weights for policy 0, policy_version 20490 (0.0007) [2023-10-12 16:36:40,527][62634] Updated weights for policy 0, policy_version 20500 (0.0011) [2023-10-12 16:36:40,913][62634] Updated weights for policy 0, policy_version 20510 (0.0007) [2023-10-12 16:36:42,107][62635] Updated weights for policy 1, policy_version 20490 (0.0008) [2023-10-12 16:36:42,479][62635] Updated weights for policy 1, policy_version 20500 (0.0009) [2023-10-12 16:36:42,851][62635] Updated weights for policy 1, policy_version 20510 (0.0009) [2023-10-12 16:36:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 42008576. Throughput: 0: 1682.2, 1: 1673.3. Samples: 10507420. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:36:43,435][61643] Avg episode reward: [(0, '4.000'), (1, '8.720')] [2023-10-12 16:36:44,836][62634] Updated weights for policy 0, policy_version 20520 (0.0009) [2023-10-12 16:36:45,215][62634] Updated weights for policy 0, policy_version 20530 (0.0010) [2023-10-12 16:36:45,592][62634] Updated weights for policy 0, policy_version 20540 (0.0007) [2023-10-12 16:36:46,937][62635] Updated weights for policy 1, policy_version 20520 (0.0010) [2023-10-12 16:36:47,306][62635] Updated weights for policy 1, policy_version 20530 (0.0011) [2023-10-12 16:36:47,681][62635] Updated weights for policy 1, policy_version 20540 (0.0008) [2023-10-12 16:36:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 42074112. Throughput: 0: 1688.1, 1: 1658.4. Samples: 10527300. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:36:48,435][61643] Avg episode reward: [(0, '4.000'), (1, '8.530')] [2023-10-12 16:36:49,591][62634] Updated weights for policy 0, policy_version 20550 (0.0010) [2023-10-12 16:36:49,964][62634] Updated weights for policy 0, policy_version 20560 (0.0009) [2023-10-12 16:36:50,344][62634] Updated weights for policy 0, policy_version 20570 (0.0009) [2023-10-12 16:36:51,893][62635] Updated weights for policy 1, policy_version 20550 (0.0010) [2023-10-12 16:36:52,267][62635] Updated weights for policy 1, policy_version 20560 (0.0008) [2023-10-12 16:36:52,628][62635] Updated weights for policy 1, policy_version 20570 (0.0008) [2023-10-12 16:36:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42139648. Throughput: 0: 1666.8, 1: 1675.4. Samples: 10537352. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 16:36:53,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.590')] [2023-10-12 16:36:54,362][62634] Updated weights for policy 0, policy_version 20580 (0.0009) [2023-10-12 16:36:54,737][62634] Updated weights for policy 0, policy_version 20590 (0.0007) [2023-10-12 16:36:55,112][62634] Updated weights for policy 0, policy_version 20600 (0.0007) [2023-10-12 16:36:56,761][62635] Updated weights for policy 1, policy_version 20580 (0.0008) [2023-10-12 16:36:57,135][62635] Updated weights for policy 1, policy_version 20590 (0.0008) [2023-10-12 16:36:57,501][62635] Updated weights for policy 1, policy_version 20600 (0.0008) [2023-10-12 16:36:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42205184. Throughput: 0: 1697.2, 1: 1664.7. Samples: 10557932. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-12 16:36:58,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.700')] [2023-10-12 16:36:59,104][62634] Updated weights for policy 0, policy_version 20610 (0.0009) [2023-10-12 16:36:59,479][62634] Updated weights for policy 0, policy_version 20620 (0.0009) [2023-10-12 16:36:59,857][62634] Updated weights for policy 0, policy_version 20630 (0.0008) [2023-10-12 16:37:00,234][62634] Updated weights for policy 0, policy_version 20640 (0.0009) [2023-10-12 16:37:01,536][62635] Updated weights for policy 1, policy_version 20610 (0.0008) [2023-10-12 16:37:01,940][62635] Updated weights for policy 1, policy_version 20620 (0.0009) [2023-10-12 16:37:02,310][62635] Updated weights for policy 1, policy_version 20630 (0.0008) [2023-10-12 16:37:02,687][62635] Updated weights for policy 1, policy_version 20640 (0.0007) [2023-10-12 16:37:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 42270720. Throughput: 0: 1696.7, 1: 1664.1. Samples: 10578122. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-12 16:37:03,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.670')] [2023-10-12 16:37:04,065][62634] Updated weights for policy 0, policy_version 20650 (0.0007) [2023-10-12 16:37:04,433][62634] Updated weights for policy 0, policy_version 20660 (0.0007) [2023-10-12 16:37:04,810][62634] Updated weights for policy 0, policy_version 20670 (0.0007) [2023-10-12 16:37:06,477][62635] Updated weights for policy 1, policy_version 20650 (0.0007) [2023-10-12 16:37:06,844][62635] Updated weights for policy 1, policy_version 20660 (0.0007) [2023-10-12 16:37:07,217][62635] Updated weights for policy 1, policy_version 20670 (0.0007) [2023-10-12 16:37:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42336256. Throughput: 0: 1684.8, 1: 1692.5. Samples: 10588714. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-12 16:37:08,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.720')] [2023-10-12 16:37:08,942][62634] Updated weights for policy 0, policy_version 20680 (0.0007) [2023-10-12 16:37:09,323][62634] Updated weights for policy 0, policy_version 20690 (0.0007) [2023-10-12 16:37:09,697][62634] Updated weights for policy 0, policy_version 20700 (0.0009) [2023-10-12 16:37:11,214][62635] Updated weights for policy 1, policy_version 20680 (0.0010) [2023-10-12 16:37:11,583][62635] Updated weights for policy 1, policy_version 20690 (0.0008) [2023-10-12 16:37:11,962][62635] Updated weights for policy 1, policy_version 20700 (0.0008) [2023-10-12 16:37:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 42401792. Throughput: 0: 1692.4, 1: 1669.3. Samples: 10608492. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-12 16:37:13,435][61643] Avg episode reward: [(0, '3.830'), (1, '9.130')] [2023-10-12 16:37:13,669][62634] Updated weights for policy 0, policy_version 20710 (0.0009) [2023-10-12 16:37:14,049][62634] Updated weights for policy 0, policy_version 20720 (0.0008) [2023-10-12 16:37:14,431][62634] Updated weights for policy 0, policy_version 20730 (0.0009) [2023-10-12 16:37:16,006][62635] Updated weights for policy 1, policy_version 20710 (0.0009) [2023-10-12 16:37:16,368][62635] Updated weights for policy 1, policy_version 20720 (0.0009) [2023-10-12 16:37:16,748][62635] Updated weights for policy 1, policy_version 20730 (0.0007) [2023-10-12 16:37:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42467328. Throughput: 0: 1698.3, 1: 1695.2. Samples: 10629278. Policy #0 lag: (min: 0.0, avg: 19.8, max: 32.0) [2023-10-12 16:37:18,435][61643] Avg episode reward: [(0, '3.840'), (1, '9.110')] [2023-10-12 16:37:18,576][62634] Updated weights for policy 0, policy_version 20740 (0.0008) [2023-10-12 16:37:18,962][62634] Updated weights for policy 0, policy_version 20750 (0.0010) [2023-10-12 16:37:19,338][62634] Updated weights for policy 0, policy_version 20760 (0.0008) [2023-10-12 16:37:20,751][62635] Updated weights for policy 1, policy_version 20740 (0.0008) [2023-10-12 16:37:21,118][62635] Updated weights for policy 1, policy_version 20750 (0.0007) [2023-10-12 16:37:21,493][62635] Updated weights for policy 1, policy_version 20760 (0.0009) [2023-10-12 16:37:23,422][62634] Updated weights for policy 0, policy_version 20770 (0.0008) [2023-10-12 16:37:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 42532864. Throughput: 0: 1691.4, 1: 1681.6. Samples: 10639076. 
Policy #0 lag: (min: 0.0, avg: 19.8, max: 32.0) [2023-10-12 16:37:23,436][61643] Avg episode reward: [(0, '3.840'), (1, '9.040')] [2023-10-12 16:37:23,800][62634] Updated weights for policy 0, policy_version 20780 (0.0011) [2023-10-12 16:37:24,179][62634] Updated weights for policy 0, policy_version 20790 (0.0009) [2023-10-12 16:37:24,560][62634] Updated weights for policy 0, policy_version 20800 (0.0010) [2023-10-12 16:37:25,698][62635] Updated weights for policy 1, policy_version 20770 (0.0008) [2023-10-12 16:37:26,065][62635] Updated weights for policy 1, policy_version 20780 (0.0007) [2023-10-12 16:37:26,436][62635] Updated weights for policy 1, policy_version 20790 (0.0009) [2023-10-12 16:37:26,807][62635] Updated weights for policy 1, policy_version 20800 (0.0008) [2023-10-12 16:37:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42598400. Throughput: 0: 1696.3, 1: 1672.2. Samples: 10659004. Policy #0 lag: (min: 0.0, avg: 19.8, max: 32.0) [2023-10-12 16:37:28,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.890')] [2023-10-12 16:37:28,625][62634] Updated weights for policy 0, policy_version 20810 (0.0007) [2023-10-12 16:37:29,009][62634] Updated weights for policy 0, policy_version 20820 (0.0007) [2023-10-12 16:37:29,394][62634] Updated weights for policy 0, policy_version 20830 (0.0009) [2023-10-12 16:37:30,812][62635] Updated weights for policy 1, policy_version 20810 (0.0009) [2023-10-12 16:37:31,182][62635] Updated weights for policy 1, policy_version 20820 (0.0010) [2023-10-12 16:37:31,558][62635] Updated weights for policy 1, policy_version 20830 (0.0010) [2023-10-12 16:37:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42663936. Throughput: 0: 1691.8, 1: 1692.6. Samples: 10679598. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 16:37:33,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.950')] [2023-10-12 16:37:33,443][62634] Updated weights for policy 0, policy_version 20840 (0.0011) [2023-10-12 16:37:33,813][62634] Updated weights for policy 0, policy_version 20850 (0.0010) [2023-10-12 16:37:34,196][62634] Updated weights for policy 0, policy_version 20860 (0.0009) [2023-10-12 16:37:35,454][62635] Updated weights for policy 1, policy_version 20840 (0.0010) [2023-10-12 16:37:35,819][62635] Updated weights for policy 1, policy_version 20850 (0.0008) [2023-10-12 16:37:36,189][62635] Updated weights for policy 1, policy_version 20860 (0.0007) [2023-10-12 16:37:38,181][62634] Updated weights for policy 0, policy_version 20870 (0.0009) [2023-10-12 16:37:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42729472. Throughput: 0: 1699.2, 1: 1677.2. Samples: 10689292. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 16:37:38,436][61643] Avg episode reward: [(0, '3.870'), (1, '8.810')] [2023-10-12 16:37:38,552][62634] Updated weights for policy 0, policy_version 20880 (0.0010) [2023-10-12 16:37:38,936][62634] Updated weights for policy 0, policy_version 20890 (0.0008) [2023-10-12 16:37:40,445][62635] Updated weights for policy 1, policy_version 20870 (0.0010) [2023-10-12 16:37:40,811][62635] Updated weights for policy 1, policy_version 20880 (0.0008) [2023-10-12 16:37:41,178][62635] Updated weights for policy 1, policy_version 20890 (0.0007) [2023-10-12 16:37:42,797][62634] Updated weights for policy 0, policy_version 20900 (0.0010) [2023-10-12 16:37:43,173][62634] Updated weights for policy 0, policy_version 20910 (0.0009) [2023-10-12 16:37:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42795008. Throughput: 0: 1696.1, 1: 1678.1. Samples: 10709768. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 16:37:43,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.760')] [2023-10-12 16:37:43,553][62634] Updated weights for policy 0, policy_version 20920 (0.0009) [2023-10-12 16:37:45,118][62635] Updated weights for policy 1, policy_version 20900 (0.0007) [2023-10-12 16:37:45,489][62635] Updated weights for policy 1, policy_version 20910 (0.0007) [2023-10-12 16:37:45,851][62635] Updated weights for policy 1, policy_version 20920 (0.0007) [2023-10-12 16:37:47,596][62634] Updated weights for policy 0, policy_version 20930 (0.0007) [2023-10-12 16:37:47,978][62634] Updated weights for policy 0, policy_version 20940 (0.0008) [2023-10-12 16:37:48,355][62634] Updated weights for policy 0, policy_version 20950 (0.0008) [2023-10-12 16:37:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 42860544. Throughput: 0: 1676.0, 1: 1702.3. Samples: 10730148. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 16:37:48,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.730')] [2023-10-12 16:37:48,731][62634] Updated weights for policy 0, policy_version 20960 (0.0009) [2023-10-12 16:37:49,842][62635] Updated weights for policy 1, policy_version 20930 (0.0008) [2023-10-12 16:37:50,235][62635] Updated weights for policy 1, policy_version 20940 (0.0009) [2023-10-12 16:37:50,603][62635] Updated weights for policy 1, policy_version 20950 (0.0007) [2023-10-12 16:37:50,982][62635] Updated weights for policy 1, policy_version 20960 (0.0007) [2023-10-12 16:37:52,861][62634] Updated weights for policy 0, policy_version 20970 (0.0009) [2023-10-12 16:37:53,250][62634] Updated weights for policy 0, policy_version 20980 (0.0009) [2023-10-12 16:37:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42926080. Throughput: 0: 1688.5, 1: 1666.0. Samples: 10739670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:37:53,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.560')] [2023-10-12 16:37:53,635][62634] Updated weights for policy 0, policy_version 20990 (0.0008) [2023-10-12 16:37:55,001][62635] Updated weights for policy 1, policy_version 20970 (0.0009) [2023-10-12 16:37:55,365][62635] Updated weights for policy 1, policy_version 20980 (0.0007) [2023-10-12 16:37:55,747][62635] Updated weights for policy 1, policy_version 20990 (0.0011) [2023-10-12 16:37:57,616][62634] Updated weights for policy 0, policy_version 21000 (0.0007) [2023-10-12 16:37:57,981][62634] Updated weights for policy 0, policy_version 21010 (0.0007) [2023-10-12 16:37:58,355][62634] Updated weights for policy 0, policy_version 21020 (0.0008) [2023-10-12 16:37:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42991616. Throughput: 0: 1690.8, 1: 1686.9. Samples: 10760488. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:37:58,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.770')] [2023-10-12 16:37:59,872][62635] Updated weights for policy 1, policy_version 21000 (0.0011) [2023-10-12 16:38:00,246][62635] Updated weights for policy 1, policy_version 21010 (0.0008) [2023-10-12 16:38:00,615][62635] Updated weights for policy 1, policy_version 21020 (0.0008) [2023-10-12 16:38:02,422][62634] Updated weights for policy 0, policy_version 21030 (0.0008) [2023-10-12 16:38:02,805][62634] Updated weights for policy 0, policy_version 21040 (0.0007) [2023-10-12 16:38:03,184][62634] Updated weights for policy 0, policy_version 21050 (0.0010) [2023-10-12 16:38:03,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43089920. Throughput: 0: 1672.2, 1: 1688.1. Samples: 10780492. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:38:03,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.700')] [2023-10-12 16:38:04,802][62635] Updated weights for policy 1, policy_version 21030 (0.0007) [2023-10-12 16:38:05,178][62635] Updated weights for policy 1, policy_version 21040 (0.0008) [2023-10-12 16:38:05,551][62635] Updated weights for policy 1, policy_version 21050 (0.0007) [2023-10-12 16:38:07,216][62634] Updated weights for policy 0, policy_version 21060 (0.0010) [2023-10-12 16:38:07,610][62634] Updated weights for policy 0, policy_version 21070 (0.0007) [2023-10-12 16:38:07,990][62634] Updated weights for policy 0, policy_version 21080 (0.0007) [2023-10-12 16:38:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43155456. Throughput: 0: 1694.2, 1: 1667.8. Samples: 10790366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:38:08,436][61643] Avg episode reward: [(0, '3.920'), (1, '8.660')] [2023-10-12 16:38:09,460][62635] Updated weights for policy 1, policy_version 21060 (0.0009) [2023-10-12 16:38:09,828][62635] Updated weights for policy 1, policy_version 21070 (0.0008) [2023-10-12 16:38:10,194][62635] Updated weights for policy 1, policy_version 21080 (0.0009) [2023-10-12 16:38:12,155][62634] Updated weights for policy 0, policy_version 21090 (0.0008) [2023-10-12 16:38:12,533][62634] Updated weights for policy 0, policy_version 21100 (0.0009) [2023-10-12 16:38:12,909][62634] Updated weights for policy 0, policy_version 21110 (0.0008) [2023-10-12 16:38:13,277][62634] Updated weights for policy 0, policy_version 21120 (0.0009) [2023-10-12 16:38:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43220992. Throughput: 0: 1687.4, 1: 1693.0. Samples: 10811120. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:38:13,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.950')] [2023-10-12 16:38:14,277][62635] Updated weights for policy 1, policy_version 21090 (0.0008) [2023-10-12 16:38:14,649][62635] Updated weights for policy 1, policy_version 21100 (0.0007) [2023-10-12 16:38:15,008][62635] Updated weights for policy 1, policy_version 21110 (0.0007) [2023-10-12 16:38:15,386][62635] Updated weights for policy 1, policy_version 21120 (0.0010) [2023-10-12 16:38:17,330][62634] Updated weights for policy 0, policy_version 21130 (0.0008) [2023-10-12 16:38:17,717][62634] Updated weights for policy 0, policy_version 21140 (0.0010) [2023-10-12 16:38:18,097][62634] Updated weights for policy 0, policy_version 21150 (0.0007) [2023-10-12 16:38:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43286528. Throughput: 0: 1668.2, 1: 1694.6. Samples: 10830924. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:38:18,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.730')] [2023-10-12 16:38:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000021152_21659648.pth... [2023-10-12 16:38:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000021120_21626880.pth... 
[2023-10-12 16:38:18,477][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000019584_20054016.pth [2023-10-12 16:38:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000019552_20021248.pth [2023-10-12 16:38:19,526][62635] Updated weights for policy 1, policy_version 21130 (0.0009) [2023-10-12 16:38:19,903][62635] Updated weights for policy 1, policy_version 21140 (0.0009) [2023-10-12 16:38:20,264][62635] Updated weights for policy 1, policy_version 21150 (0.0009) [2023-10-12 16:38:22,063][62634] Updated weights for policy 0, policy_version 21160 (0.0008) [2023-10-12 16:38:22,435][62634] Updated weights for policy 0, policy_version 21170 (0.0008) [2023-10-12 16:38:22,812][62634] Updated weights for policy 0, policy_version 21180 (0.0008) [2023-10-12 16:38:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43352064. Throughput: 0: 1688.9, 1: 1681.8. Samples: 10840974. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:38:23,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.970')] [2023-10-12 16:38:24,291][62635] Updated weights for policy 1, policy_version 21160 (0.0011) [2023-10-12 16:38:24,662][62635] Updated weights for policy 1, policy_version 21170 (0.0008) [2023-10-12 16:38:25,022][62635] Updated weights for policy 1, policy_version 21180 (0.0008) [2023-10-12 16:38:27,061][62634] Updated weights for policy 0, policy_version 21190 (0.0008) [2023-10-12 16:38:27,436][62634] Updated weights for policy 0, policy_version 21200 (0.0008) [2023-10-12 16:38:27,820][62634] Updated weights for policy 0, policy_version 21210 (0.0010) [2023-10-12 16:38:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43417600. Throughput: 0: 1681.5, 1: 1689.6. Samples: 10861470. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:38:28,435][61643] Avg episode reward: [(0, '3.920'), (1, '8.980')] [2023-10-12 16:38:28,969][62635] Updated weights for policy 1, policy_version 21190 (0.0009) [2023-10-12 16:38:29,343][62635] Updated weights for policy 1, policy_version 21200 (0.0009) [2023-10-12 16:38:29,717][62635] Updated weights for policy 1, policy_version 21210 (0.0009) [2023-10-12 16:38:31,741][62634] Updated weights for policy 0, policy_version 21220 (0.0008) [2023-10-12 16:38:32,119][62634] Updated weights for policy 0, policy_version 21230 (0.0008) [2023-10-12 16:38:32,500][62634] Updated weights for policy 0, policy_version 21240 (0.0009) [2023-10-12 16:38:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43483136. Throughput: 0: 1667.5, 1: 1695.2. Samples: 10881470. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:38:33,436][61643] Avg episode reward: [(0, '3.970'), (1, '8.800')] [2023-10-12 16:38:33,777][62635] Updated weights for policy 1, policy_version 21220 (0.0010) [2023-10-12 16:38:34,148][62635] Updated weights for policy 1, policy_version 21230 (0.0009) [2023-10-12 16:38:34,527][62635] Updated weights for policy 1, policy_version 21240 (0.0008) [2023-10-12 16:38:36,411][62634] Updated weights for policy 0, policy_version 21250 (0.0008) [2023-10-12 16:38:36,781][62634] Updated weights for policy 0, policy_version 21260 (0.0007) [2023-10-12 16:38:37,166][62634] Updated weights for policy 0, policy_version 21270 (0.0010) [2023-10-12 16:38:37,537][62634] Updated weights for policy 0, policy_version 21280 (0.0008) [2023-10-12 16:38:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43548672. Throughput: 0: 1690.7, 1: 1695.9. Samples: 10892068. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:38:38,436][61643] Avg episode reward: [(0, '3.880'), (1, '8.950')] [2023-10-12 16:38:38,685][62635] Updated weights for policy 1, policy_version 21250 (0.0008) [2023-10-12 16:38:39,080][62635] Updated weights for policy 1, policy_version 21260 (0.0009) [2023-10-12 16:38:39,446][62635] Updated weights for policy 1, policy_version 21270 (0.0007) [2023-10-12 16:38:39,815][62635] Updated weights for policy 1, policy_version 21280 (0.0009) [2023-10-12 16:38:41,683][62634] Updated weights for policy 0, policy_version 21290 (0.0009) [2023-10-12 16:38:42,064][62634] Updated weights for policy 0, policy_version 21300 (0.0009) [2023-10-12 16:38:42,427][62634] Updated weights for policy 0, policy_version 21310 (0.0009) [2023-10-12 16:38:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 43614208. Throughput: 0: 1675.8, 1: 1691.6. Samples: 10912022. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-12 16:38:43,436][61643] Avg episode reward: [(0, '3.880'), (1, '9.080')] [2023-10-12 16:38:43,878][62635] Updated weights for policy 1, policy_version 21290 (0.0007) [2023-10-12 16:38:44,241][62635] Updated weights for policy 1, policy_version 21300 (0.0011) [2023-10-12 16:38:44,604][62635] Updated weights for policy 1, policy_version 21310 (0.0010) [2023-10-12 16:38:46,461][62634] Updated weights for policy 0, policy_version 21320 (0.0009) [2023-10-12 16:38:46,844][62634] Updated weights for policy 0, policy_version 21330 (0.0009) [2023-10-12 16:38:47,230][62634] Updated weights for policy 0, policy_version 21340 (0.0008) [2023-10-12 16:38:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43679744. Throughput: 0: 1677.4, 1: 1689.6. Samples: 10932006. 
Policy #0 lag: (min: 27.0, avg: 27.3, max: 39.0) [2023-10-12 16:38:48,435][61643] Avg episode reward: [(0, '3.840'), (1, '8.760')] [2023-10-12 16:38:48,584][62635] Updated weights for policy 1, policy_version 21320 (0.0008) [2023-10-12 16:38:48,950][62635] Updated weights for policy 1, policy_version 21330 (0.0010) [2023-10-12 16:38:49,324][62635] Updated weights for policy 1, policy_version 21340 (0.0010) [2023-10-12 16:38:51,121][62634] Updated weights for policy 0, policy_version 21350 (0.0008) [2023-10-12 16:38:51,500][62634] Updated weights for policy 0, policy_version 21360 (0.0010) [2023-10-12 16:38:51,874][62634] Updated weights for policy 0, policy_version 21370 (0.0008) [2023-10-12 16:38:53,407][62635] Updated weights for policy 1, policy_version 21350 (0.0008) [2023-10-12 16:38:53,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43745280. Throughput: 0: 1684.4, 1: 1689.7. Samples: 10942202. Policy #0 lag: (min: 27.0, avg: 27.3, max: 39.0) [2023-10-12 16:38:53,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.930')] [2023-10-12 16:38:53,772][62635] Updated weights for policy 1, policy_version 21360 (0.0009) [2023-10-12 16:38:54,133][62635] Updated weights for policy 1, policy_version 21370 (0.0009) [2023-10-12 16:38:55,832][62634] Updated weights for policy 0, policy_version 21380 (0.0008) [2023-10-12 16:38:56,203][62634] Updated weights for policy 0, policy_version 21390 (0.0007) [2023-10-12 16:38:56,595][62634] Updated weights for policy 0, policy_version 21400 (0.0007) [2023-10-12 16:38:58,184][62635] Updated weights for policy 1, policy_version 21380 (0.0008) [2023-10-12 16:38:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43810816. Throughput: 0: 1666.4, 1: 1686.2. Samples: 10961990. 
Policy #0 lag: (min: 27.0, avg: 27.3, max: 39.0) [2023-10-12 16:38:58,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.140')] [2023-10-12 16:38:58,548][62635] Updated weights for policy 1, policy_version 21390 (0.0009) [2023-10-12 16:38:58,925][62635] Updated weights for policy 1, policy_version 21400 (0.0009) [2023-10-12 16:39:00,717][62634] Updated weights for policy 0, policy_version 21410 (0.0007) [2023-10-12 16:39:01,123][62634] Updated weights for policy 0, policy_version 21420 (0.0010) [2023-10-12 16:39:01,499][62634] Updated weights for policy 0, policy_version 21430 (0.0009) [2023-10-12 16:39:01,867][62634] Updated weights for policy 0, policy_version 21440 (0.0008) [2023-10-12 16:39:02,813][62635] Updated weights for policy 1, policy_version 21410 (0.0008) [2023-10-12 16:39:03,180][62635] Updated weights for policy 1, policy_version 21420 (0.0007) [2023-10-12 16:39:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 43876352. Throughput: 0: 1679.8, 1: 1687.6. Samples: 10982454. Policy #0 lag: (min: 27.0, avg: 27.3, max: 39.0) [2023-10-12 16:39:03,436][61643] Avg episode reward: [(0, '3.980'), (1, '8.940')] [2023-10-12 16:39:03,552][62635] Updated weights for policy 1, policy_version 21430 (0.0010) [2023-10-12 16:39:03,924][62635] Updated weights for policy 1, policy_version 21440 (0.0010) [2023-10-12 16:39:05,760][62634] Updated weights for policy 0, policy_version 21450 (0.0008) [2023-10-12 16:39:06,142][62634] Updated weights for policy 0, policy_version 21460 (0.0007) [2023-10-12 16:39:06,515][62634] Updated weights for policy 0, policy_version 21470 (0.0007) [2023-10-12 16:39:07,974][62635] Updated weights for policy 1, policy_version 21450 (0.0010) [2023-10-12 16:39:08,337][62635] Updated weights for policy 1, policy_version 21460 (0.0007) [2023-10-12 16:39:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 43941888. Throughput: 0: 1676.2, 1: 1696.4. 
Samples: 10992740. Policy #0 lag: (min: 17.0, avg: 31.5, max: 49.0) [2023-10-12 16:39:08,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.020')] [2023-10-12 16:39:08,702][62635] Updated weights for policy 1, policy_version 21470 (0.0010) [2023-10-12 16:39:10,723][62634] Updated weights for policy 0, policy_version 21480 (0.0007) [2023-10-12 16:39:11,104][62634] Updated weights for policy 0, policy_version 21490 (0.0007) [2023-10-12 16:39:11,483][62634] Updated weights for policy 0, policy_version 21500 (0.0007) [2023-10-12 16:39:12,740][62635] Updated weights for policy 1, policy_version 21480 (0.0008) [2023-10-12 16:39:13,114][62635] Updated weights for policy 1, policy_version 21490 (0.0007) [2023-10-12 16:39:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44007424. Throughput: 0: 1663.0, 1: 1697.3. Samples: 11012686. Policy #0 lag: (min: 17.0, avg: 31.5, max: 49.0) [2023-10-12 16:39:13,436][61643] Avg episode reward: [(0, '3.960'), (1, '9.120')] [2023-10-12 16:39:13,478][62635] Updated weights for policy 1, policy_version 21500 (0.0008) [2023-10-12 16:39:15,411][62634] Updated weights for policy 0, policy_version 21510 (0.0007) [2023-10-12 16:39:15,779][62634] Updated weights for policy 0, policy_version 21520 (0.0009) [2023-10-12 16:39:16,165][62634] Updated weights for policy 0, policy_version 21530 (0.0007) [2023-10-12 16:39:17,410][62635] Updated weights for policy 1, policy_version 21510 (0.0010) [2023-10-12 16:39:17,781][62635] Updated weights for policy 1, policy_version 21520 (0.0008) [2023-10-12 16:39:18,146][62635] Updated weights for policy 1, policy_version 21530 (0.0008) [2023-10-12 16:39:18,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 44105728. Throughput: 0: 1692.8, 1: 1668.8. Samples: 11032740. 
Policy #0 lag: (min: 17.0, avg: 31.5, max: 49.0) [2023-10-12 16:39:18,435][61643] Avg episode reward: [(0, '3.910'), (1, '9.100')] [2023-10-12 16:39:20,346][62634] Updated weights for policy 0, policy_version 21540 (0.0008) [2023-10-12 16:39:20,718][62634] Updated weights for policy 0, policy_version 21550 (0.0009) [2023-10-12 16:39:21,093][62634] Updated weights for policy 0, policy_version 21560 (0.0007) [2023-10-12 16:39:22,242][62635] Updated weights for policy 1, policy_version 21540 (0.0009) [2023-10-12 16:39:22,605][62635] Updated weights for policy 1, policy_version 21550 (0.0009) [2023-10-12 16:39:22,980][62635] Updated weights for policy 1, policy_version 21560 (0.0010) [2023-10-12 16:39:23,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 44171264. Throughput: 0: 1667.6, 1: 1691.1. Samples: 11043206. Policy #0 lag: (min: 17.0, avg: 31.5, max: 49.0) [2023-10-12 16:39:23,435][61643] Avg episode reward: [(0, '3.870'), (1, '9.140')] [2023-10-12 16:39:25,090][62634] Updated weights for policy 0, policy_version 21570 (0.0010) [2023-10-12 16:39:25,462][62634] Updated weights for policy 0, policy_version 21580 (0.0009) [2023-10-12 16:39:25,836][62634] Updated weights for policy 0, policy_version 21590 (0.0009) [2023-10-12 16:39:26,217][62634] Updated weights for policy 0, policy_version 21600 (0.0009) [2023-10-12 16:39:27,068][62635] Updated weights for policy 1, policy_version 21570 (0.0010) [2023-10-12 16:39:27,459][62635] Updated weights for policy 1, policy_version 21580 (0.0009) [2023-10-12 16:39:27,834][62635] Updated weights for policy 1, policy_version 21590 (0.0009) [2023-10-12 16:39:28,196][62635] Updated weights for policy 1, policy_version 21600 (0.0007) [2023-10-12 16:39:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44236800. Throughput: 0: 1670.5, 1: 1692.6. Samples: 11063360. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:39:28,435][61643] Avg episode reward: [(0, '3.880'), (1, '9.010')] [2023-10-12 16:39:30,384][62634] Updated weights for policy 0, policy_version 21610 (0.0011) [2023-10-12 16:39:30,770][62634] Updated weights for policy 0, policy_version 21620 (0.0009) [2023-10-12 16:39:31,146][62634] Updated weights for policy 0, policy_version 21630 (0.0010) [2023-10-12 16:39:32,086][62635] Updated weights for policy 1, policy_version 21610 (0.0007) [2023-10-12 16:39:32,466][62635] Updated weights for policy 1, policy_version 21620 (0.0008) [2023-10-12 16:39:32,826][62635] Updated weights for policy 1, policy_version 21630 (0.0008) [2023-10-12 16:39:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 44302336. Throughput: 0: 1684.6, 1: 1671.2. Samples: 11083016. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:39:33,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.820')] [2023-10-12 16:39:35,184][62634] Updated weights for policy 0, policy_version 21640 (0.0007) [2023-10-12 16:39:35,548][62634] Updated weights for policy 0, policy_version 21650 (0.0008) [2023-10-12 16:39:35,930][62634] Updated weights for policy 0, policy_version 21660 (0.0010) [2023-10-12 16:39:36,985][62635] Updated weights for policy 1, policy_version 21640 (0.0010) [2023-10-12 16:39:37,357][62635] Updated weights for policy 1, policy_version 21650 (0.0010) [2023-10-12 16:39:37,719][62635] Updated weights for policy 1, policy_version 21660 (0.0008) [2023-10-12 16:39:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 44367872. Throughput: 0: 1661.3, 1: 1701.1. Samples: 11093510. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:39:38,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.810')] [2023-10-12 16:39:39,786][62634] Updated weights for policy 0, policy_version 21670 (0.0008) [2023-10-12 16:39:40,158][62634] Updated weights for policy 0, policy_version 21680 (0.0010) [2023-10-12 16:39:40,543][62634] Updated weights for policy 0, policy_version 21690 (0.0008) [2023-10-12 16:39:41,806][62635] Updated weights for policy 1, policy_version 21670 (0.0011) [2023-10-12 16:39:42,182][62635] Updated weights for policy 1, policy_version 21680 (0.0011) [2023-10-12 16:39:42,551][62635] Updated weights for policy 1, policy_version 21690 (0.0010) [2023-10-12 16:39:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 44433408. Throughput: 0: 1680.7, 1: 1686.3. Samples: 11113502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:39:43,435][61643] Avg episode reward: [(0, '3.920'), (1, '8.780')] [2023-10-12 16:39:44,515][62634] Updated weights for policy 0, policy_version 21700 (0.0007) [2023-10-12 16:39:44,893][62634] Updated weights for policy 0, policy_version 21710 (0.0008) [2023-10-12 16:39:45,268][62634] Updated weights for policy 0, policy_version 21720 (0.0010) [2023-10-12 16:39:46,592][62635] Updated weights for policy 1, policy_version 21700 (0.0009) [2023-10-12 16:39:46,969][62635] Updated weights for policy 1, policy_version 21710 (0.0008) [2023-10-12 16:39:47,345][62635] Updated weights for policy 1, policy_version 21720 (0.0009) [2023-10-12 16:39:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44498944. Throughput: 0: 1691.8, 1: 1665.0. Samples: 11133508. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) [2023-10-12 16:39:48,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.710')] [2023-10-12 16:39:49,411][62634] Updated weights for policy 0, policy_version 21730 (0.0009) [2023-10-12 16:39:49,819][62634] Updated weights for policy 0, policy_version 21740 (0.0009) [2023-10-12 16:39:50,210][62634] Updated weights for policy 0, policy_version 21750 (0.0009) [2023-10-12 16:39:50,581][62634] Updated weights for policy 0, policy_version 21760 (0.0010) [2023-10-12 16:39:51,521][62635] Updated weights for policy 1, policy_version 21730 (0.0008) [2023-10-12 16:39:51,891][62635] Updated weights for policy 1, policy_version 21740 (0.0008) [2023-10-12 16:39:52,262][62635] Updated weights for policy 1, policy_version 21750 (0.0007) [2023-10-12 16:39:52,619][62635] Updated weights for policy 1, policy_version 21760 (0.0008) [2023-10-12 16:39:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44564480. Throughput: 0: 1667.7, 1: 1685.8. Samples: 11143648. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) [2023-10-12 16:39:53,436][61643] Avg episode reward: [(0, '3.930'), (1, '8.840')] [2023-10-12 16:39:54,632][62634] Updated weights for policy 0, policy_version 21770 (0.0007) [2023-10-12 16:39:55,012][62634] Updated weights for policy 0, policy_version 21780 (0.0007) [2023-10-12 16:39:55,405][62634] Updated weights for policy 0, policy_version 21790 (0.0007) [2023-10-12 16:39:56,645][62635] Updated weights for policy 1, policy_version 21770 (0.0010) [2023-10-12 16:39:57,026][62635] Updated weights for policy 1, policy_version 21780 (0.0008) [2023-10-12 16:39:57,392][62635] Updated weights for policy 1, policy_version 21790 (0.0009) [2023-10-12 16:39:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44630016. Throughput: 0: 1683.7, 1: 1669.3. Samples: 11163572. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) [2023-10-12 16:39:58,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.780')] [2023-10-12 16:39:59,690][62634] Updated weights for policy 0, policy_version 21800 (0.0007) [2023-10-12 16:40:00,074][62634] Updated weights for policy 0, policy_version 21810 (0.0008) [2023-10-12 16:40:00,457][62634] Updated weights for policy 0, policy_version 21820 (0.0009) [2023-10-12 16:40:01,525][62635] Updated weights for policy 1, policy_version 21800 (0.0010) [2023-10-12 16:40:01,900][62635] Updated weights for policy 1, policy_version 21810 (0.0009) [2023-10-12 16:40:02,276][62635] Updated weights for policy 1, policy_version 21820 (0.0010) [2023-10-12 16:40:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 44695552. Throughput: 0: 1681.8, 1: 1676.3. Samples: 11183854. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) [2023-10-12 16:40:03,436][61643] Avg episode reward: [(0, '3.920'), (1, '8.780')] [2023-10-12 16:40:04,496][62634] Updated weights for policy 0, policy_version 21830 (0.0010) [2023-10-12 16:40:04,869][62634] Updated weights for policy 0, policy_version 21840 (0.0010) [2023-10-12 16:40:05,254][62634] Updated weights for policy 0, policy_version 21850 (0.0011) [2023-10-12 16:40:06,280][62635] Updated weights for policy 1, policy_version 21830 (0.0007) [2023-10-12 16:40:06,655][62635] Updated weights for policy 1, policy_version 21840 (0.0008) [2023-10-12 16:40:07,025][62635] Updated weights for policy 1, policy_version 21850 (0.0009) [2023-10-12 16:40:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 44761088. Throughput: 0: 1669.2, 1: 1684.6. Samples: 11194124. 
Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) [2023-10-12 16:40:08,436][61643] Avg episode reward: [(0, '3.930'), (1, '8.820')] [2023-10-12 16:40:09,447][62634] Updated weights for policy 0, policy_version 21860 (0.0010) [2023-10-12 16:40:09,824][62634] Updated weights for policy 0, policy_version 21870 (0.0009) [2023-10-12 16:40:10,205][62634] Updated weights for policy 0, policy_version 21880 (0.0009) [2023-10-12 16:40:11,091][62635] Updated weights for policy 1, policy_version 21860 (0.0007) [2023-10-12 16:40:11,460][62635] Updated weights for policy 1, policy_version 21870 (0.0009) [2023-10-12 16:40:11,823][62635] Updated weights for policy 1, policy_version 21880 (0.0008) [2023-10-12 16:40:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 44826624. Throughput: 0: 1680.6, 1: 1664.8. Samples: 11213902. Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) [2023-10-12 16:40:13,435][61643] Avg episode reward: [(0, '3.890'), (1, '8.800')] [2023-10-12 16:40:14,339][62634] Updated weights for policy 0, policy_version 21890 (0.0009) [2023-10-12 16:40:14,707][62634] Updated weights for policy 0, policy_version 21900 (0.0010) [2023-10-12 16:40:15,084][62634] Updated weights for policy 0, policy_version 21910 (0.0010) [2023-10-12 16:40:15,461][62634] Updated weights for policy 0, policy_version 21920 (0.0008) [2023-10-12 16:40:15,887][62635] Updated weights for policy 1, policy_version 21890 (0.0009) [2023-10-12 16:40:16,288][62635] Updated weights for policy 1, policy_version 21900 (0.0007) [2023-10-12 16:40:16,660][62635] Updated weights for policy 1, policy_version 21910 (0.0007) [2023-10-12 16:40:17,038][62635] Updated weights for policy 1, policy_version 21920 (0.0008) [2023-10-12 16:40:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44892160. Throughput: 0: 1678.8, 1: 1681.6. Samples: 11234236. 
Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) [2023-10-12 16:40:18,436][61643] Avg episode reward: [(0, '3.940'), (1, '8.950')] [2023-10-12 16:40:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000021920_22446080.pth... [2023-10-12 16:40:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000021920_22446080.pth... [2023-10-12 16:40:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000020352_20840448.pth [2023-10-12 16:40:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000020352_20840448.pth [2023-10-12 16:40:19,570][62634] Updated weights for policy 0, policy_version 21930 (0.0008) [2023-10-12 16:40:19,946][62634] Updated weights for policy 0, policy_version 21940 (0.0008) [2023-10-12 16:40:20,323][62634] Updated weights for policy 0, policy_version 21950 (0.0008) [2023-10-12 16:40:20,943][62635] Updated weights for policy 1, policy_version 21930 (0.0008) [2023-10-12 16:40:21,304][62635] Updated weights for policy 1, policy_version 21940 (0.0009) [2023-10-12 16:40:21,669][62635] Updated weights for policy 1, policy_version 21950 (0.0009) [2023-10-12 16:40:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44957696. Throughput: 0: 1675.3, 1: 1676.2. Samples: 11244328. 
Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) [2023-10-12 16:40:23,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.920')] [2023-10-12 16:40:24,456][62634] Updated weights for policy 0, policy_version 21960 (0.0007) [2023-10-12 16:40:24,832][62634] Updated weights for policy 0, policy_version 21970 (0.0007) [2023-10-12 16:40:25,199][62634] Updated weights for policy 0, policy_version 21980 (0.0008) [2023-10-12 16:40:25,829][62635] Updated weights for policy 1, policy_version 21960 (0.0010) [2023-10-12 16:40:26,197][62635] Updated weights for policy 1, policy_version 21970 (0.0007) [2023-10-12 16:40:26,572][62635] Updated weights for policy 1, policy_version 21980 (0.0007) [2023-10-12 16:40:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45023232. Throughput: 0: 1672.2, 1: 1670.5. Samples: 11263922. Policy #0 lag: (min: 18.0, avg: 18.8, max: 38.0) [2023-10-12 16:40:28,435][61643] Avg episode reward: [(0, '3.920'), (1, '8.980')] [2023-10-12 16:40:29,513][62634] Updated weights for policy 0, policy_version 21990 (0.0010) [2023-10-12 16:40:29,876][62634] Updated weights for policy 0, policy_version 22000 (0.0010) [2023-10-12 16:40:30,250][62634] Updated weights for policy 0, policy_version 22010 (0.0008) [2023-10-12 16:40:30,660][62635] Updated weights for policy 1, policy_version 21990 (0.0009) [2023-10-12 16:40:31,036][62635] Updated weights for policy 1, policy_version 22000 (0.0010) [2023-10-12 16:40:31,405][62635] Updated weights for policy 1, policy_version 22010 (0.0010) [2023-10-12 16:40:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45088768. Throughput: 0: 1661.6, 1: 1693.3. Samples: 11284480. 
Policy #0 lag: (min: 18.0, avg: 18.8, max: 38.0) [2023-10-12 16:40:33,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.810')] [2023-10-12 16:40:34,399][62634] Updated weights for policy 0, policy_version 22020 (0.0007) [2023-10-12 16:40:34,783][62634] Updated weights for policy 0, policy_version 22030 (0.0009) [2023-10-12 16:40:35,157][62634] Updated weights for policy 0, policy_version 22040 (0.0008) [2023-10-12 16:40:35,360][62635] Updated weights for policy 1, policy_version 22020 (0.0007) [2023-10-12 16:40:35,737][62635] Updated weights for policy 1, policy_version 22030 (0.0007) [2023-10-12 16:40:36,104][62635] Updated weights for policy 1, policy_version 22040 (0.0007) [2023-10-12 16:40:38,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 45154304. Throughput: 0: 1661.8, 1: 1679.1. Samples: 11293986. Policy #0 lag: (min: 18.0, avg: 18.8, max: 38.0) [2023-10-12 16:40:38,436][61643] Avg episode reward: [(0, '3.890'), (1, '9.160')] [2023-10-12 16:40:39,164][62634] Updated weights for policy 0, policy_version 22050 (0.0007) [2023-10-12 16:40:39,549][62634] Updated weights for policy 0, policy_version 22060 (0.0011) [2023-10-12 16:40:39,924][62634] Updated weights for policy 0, policy_version 22070 (0.0007) [2023-10-12 16:40:40,162][62635] Updated weights for policy 1, policy_version 22050 (0.0010) [2023-10-12 16:40:40,300][62634] Updated weights for policy 0, policy_version 22080 (0.0007) [2023-10-12 16:40:40,533][62635] Updated weights for policy 1, policy_version 22060 (0.0007) [2023-10-12 16:40:40,905][62635] Updated weights for policy 1, policy_version 22070 (0.0007) [2023-10-12 16:40:41,266][62635] Updated weights for policy 1, policy_version 22080 (0.0009) [2023-10-12 16:40:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45219840. Throughput: 0: 1662.9, 1: 1682.7. Samples: 11314122. 
Policy #0 lag: (min: 18.0, avg: 18.8, max: 38.0) [2023-10-12 16:40:43,436][61643] Avg episode reward: [(0, '3.900'), (1, '9.090')] [2023-10-12 16:40:44,215][62634] Updated weights for policy 0, policy_version 22090 (0.0009) [2023-10-12 16:40:44,592][62634] Updated weights for policy 0, policy_version 22100 (0.0009) [2023-10-12 16:40:44,974][62634] Updated weights for policy 0, policy_version 22110 (0.0008) [2023-10-12 16:40:45,412][62635] Updated weights for policy 1, policy_version 22090 (0.0010) [2023-10-12 16:40:45,777][62635] Updated weights for policy 1, policy_version 22100 (0.0010) [2023-10-12 16:40:46,152][62635] Updated weights for policy 1, policy_version 22110 (0.0010) [2023-10-12 16:40:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45285376. Throughput: 0: 1661.3, 1: 1691.6. Samples: 11334736. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:40:48,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.930')] [2023-10-12 16:40:49,063][62634] Updated weights for policy 0, policy_version 22120 (0.0007) [2023-10-12 16:40:49,455][62634] Updated weights for policy 0, policy_version 22130 (0.0007) [2023-10-12 16:40:49,830][62634] Updated weights for policy 0, policy_version 22140 (0.0009) [2023-10-12 16:40:50,178][62635] Updated weights for policy 1, policy_version 22120 (0.0008) [2023-10-12 16:40:50,547][62635] Updated weights for policy 1, policy_version 22130 (0.0007) [2023-10-12 16:40:50,910][62635] Updated weights for policy 1, policy_version 22140 (0.0009) [2023-10-12 16:40:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45350912. Throughput: 0: 1664.0, 1: 1664.8. Samples: 11343920. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:40:53,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.030')] [2023-10-12 16:40:53,915][62634] Updated weights for policy 0, policy_version 22150 (0.0008) [2023-10-12 16:40:54,296][62634] Updated weights for policy 0, policy_version 22160 (0.0009) [2023-10-12 16:40:54,682][62634] Updated weights for policy 0, policy_version 22170 (0.0008) [2023-10-12 16:40:55,125][62635] Updated weights for policy 1, policy_version 22150 (0.0009) [2023-10-12 16:40:55,493][62635] Updated weights for policy 1, policy_version 22160 (0.0007) [2023-10-12 16:40:55,863][62635] Updated weights for policy 1, policy_version 22170 (0.0011) [2023-10-12 16:40:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45416448. Throughput: 0: 1665.0, 1: 1678.9. Samples: 11364376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:40:58,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.860')] [2023-10-12 16:40:58,473][62634] Updated weights for policy 0, policy_version 22180 (0.0009) [2023-10-12 16:40:58,847][62634] Updated weights for policy 0, policy_version 22190 (0.0010) [2023-10-12 16:40:59,217][62634] Updated weights for policy 0, policy_version 22200 (0.0008) [2023-10-12 16:40:59,981][62635] Updated weights for policy 1, policy_version 22180 (0.0009) [2023-10-12 16:41:00,345][62635] Updated weights for policy 1, policy_version 22190 (0.0009) [2023-10-12 16:41:00,720][62635] Updated weights for policy 1, policy_version 22200 (0.0010) [2023-10-12 16:41:03,264][62634] Updated weights for policy 0, policy_version 22210 (0.0008) [2023-10-12 16:41:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45481984. Throughput: 0: 1671.7, 1: 1683.6. Samples: 11385224. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:41:03,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.900')] [2023-10-12 16:41:03,635][62634] Updated weights for policy 0, policy_version 22220 (0.0011) [2023-10-12 16:41:04,019][62634] Updated weights for policy 0, policy_version 22230 (0.0010) [2023-10-12 16:41:04,400][62634] Updated weights for policy 0, policy_version 22240 (0.0007) [2023-10-12 16:41:04,831][62635] Updated weights for policy 1, policy_version 22210 (0.0009) [2023-10-12 16:41:05,228][62635] Updated weights for policy 1, policy_version 22220 (0.0009) [2023-10-12 16:41:05,590][62635] Updated weights for policy 1, policy_version 22230 (0.0009) [2023-10-12 16:41:05,963][62635] Updated weights for policy 1, policy_version 22240 (0.0008) [2023-10-12 16:41:08,392][62634] Updated weights for policy 0, policy_version 22250 (0.0011) [2023-10-12 16:41:08,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45547520. Throughput: 0: 1675.6, 1: 1656.9. Samples: 11394290. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:41:08,437][61643] Avg episode reward: [(0, '3.920'), (1, '8.980')] [2023-10-12 16:41:08,759][62634] Updated weights for policy 0, policy_version 22260 (0.0009) [2023-10-12 16:41:09,147][62634] Updated weights for policy 0, policy_version 22270 (0.0007) [2023-10-12 16:41:09,991][62635] Updated weights for policy 1, policy_version 22250 (0.0010) [2023-10-12 16:41:10,369][62635] Updated weights for policy 1, policy_version 22260 (0.0010) [2023-10-12 16:41:10,730][62635] Updated weights for policy 1, policy_version 22270 (0.0009) [2023-10-12 16:41:13,243][62634] Updated weights for policy 0, policy_version 22280 (0.0008) [2023-10-12 16:41:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45613056. Throughput: 0: 1679.6, 1: 1673.6. Samples: 11414814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:41:13,435][61643] Avg episode reward: [(0, '3.910'), (1, '9.130')] [2023-10-12 16:41:13,614][62634] Updated weights for policy 0, policy_version 22290 (0.0007) [2023-10-12 16:41:13,997][62634] Updated weights for policy 0, policy_version 22300 (0.0008) [2023-10-12 16:41:14,803][62635] Updated weights for policy 1, policy_version 22280 (0.0009) [2023-10-12 16:41:15,170][62635] Updated weights for policy 1, policy_version 22290 (0.0008) [2023-10-12 16:41:15,546][62635] Updated weights for policy 1, policy_version 22300 (0.0008) [2023-10-12 16:41:17,995][62634] Updated weights for policy 0, policy_version 22310 (0.0008) [2023-10-12 16:41:18,367][62634] Updated weights for policy 0, policy_version 22320 (0.0009) [2023-10-12 16:41:18,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 45678592. Throughput: 0: 1681.1, 1: 1673.3. Samples: 11435428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:41:18,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.060')] [2023-10-12 16:41:18,759][62634] Updated weights for policy 0, policy_version 22330 (0.0009) [2023-10-12 16:41:19,699][62635] Updated weights for policy 1, policy_version 22310 (0.0009) [2023-10-12 16:41:20,062][62635] Updated weights for policy 1, policy_version 22320 (0.0007) [2023-10-12 16:41:20,434][62635] Updated weights for policy 1, policy_version 22330 (0.0009) [2023-10-12 16:41:22,858][62634] Updated weights for policy 0, policy_version 22340 (0.0010) [2023-10-12 16:41:23,243][62634] Updated weights for policy 0, policy_version 22350 (0.0008) [2023-10-12 16:41:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 45744128. Throughput: 0: 1689.1, 1: 1660.7. Samples: 11444726. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:41:23,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.140')] [2023-10-12 16:41:23,623][62634] Updated weights for policy 0, policy_version 22360 (0.0007) [2023-10-12 16:41:24,396][62635] Updated weights for policy 1, policy_version 22340 (0.0008) [2023-10-12 16:41:24,768][62635] Updated weights for policy 1, policy_version 22350 (0.0009) [2023-10-12 16:41:25,140][62635] Updated weights for policy 1, policy_version 22360 (0.0007) [2023-10-12 16:41:27,646][62634] Updated weights for policy 0, policy_version 22370 (0.0008) [2023-10-12 16:41:28,034][62634] Updated weights for policy 0, policy_version 22380 (0.0009) [2023-10-12 16:41:28,399][62634] Updated weights for policy 0, policy_version 22390 (0.0010) [2023-10-12 16:41:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45809664. Throughput: 0: 1690.2, 1: 1669.2. Samples: 11465294. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:41:28,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.150')] [2023-10-12 16:41:28,779][62634] Updated weights for policy 0, policy_version 22400 (0.0007) [2023-10-12 16:41:29,332][62635] Updated weights for policy 1, policy_version 22370 (0.0009) [2023-10-12 16:41:29,708][62635] Updated weights for policy 1, policy_version 22380 (0.0009) [2023-10-12 16:41:30,074][62635] Updated weights for policy 1, policy_version 22390 (0.0007) [2023-10-12 16:41:30,447][62635] Updated weights for policy 1, policy_version 22400 (0.0007) [2023-10-12 16:41:32,810][62634] Updated weights for policy 0, policy_version 22410 (0.0008) [2023-10-12 16:41:33,189][62634] Updated weights for policy 0, policy_version 22420 (0.0008) [2023-10-12 16:41:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45875200. Throughput: 0: 1677.4, 1: 1667.7. Samples: 11485266. 
Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:41:33,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.910')] [2023-10-12 16:41:33,568][62634] Updated weights for policy 0, policy_version 22430 (0.0008) [2023-10-12 16:41:34,610][62635] Updated weights for policy 1, policy_version 22410 (0.0009) [2023-10-12 16:41:34,995][62635] Updated weights for policy 1, policy_version 22420 (0.0009) [2023-10-12 16:41:35,356][62635] Updated weights for policy 1, policy_version 22430 (0.0011) [2023-10-12 16:41:37,616][62634] Updated weights for policy 0, policy_version 22440 (0.0010) [2023-10-12 16:41:37,997][62634] Updated weights for policy 0, policy_version 22450 (0.0010) [2023-10-12 16:41:38,375][62634] Updated weights for policy 0, policy_version 22460 (0.0009) [2023-10-12 16:41:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 45940736. Throughput: 0: 1690.9, 1: 1661.8. Samples: 11494792. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:41:38,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.950')] [2023-10-12 16:41:39,397][62635] Updated weights for policy 1, policy_version 22440 (0.0010) [2023-10-12 16:41:39,770][62635] Updated weights for policy 1, policy_version 22450 (0.0009) [2023-10-12 16:41:40,131][62635] Updated weights for policy 1, policy_version 22460 (0.0008) [2023-10-12 16:41:42,586][62634] Updated weights for policy 0, policy_version 22470 (0.0008) [2023-10-12 16:41:42,957][62634] Updated weights for policy 0, policy_version 22480 (0.0008) [2023-10-12 16:41:43,344][62634] Updated weights for policy 0, policy_version 22490 (0.0009) [2023-10-12 16:41:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 46006272. Throughput: 0: 1684.4, 1: 1667.9. Samples: 11515226. 
Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) [2023-10-12 16:41:43,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.990')] [2023-10-12 16:41:44,209][62635] Updated weights for policy 1, policy_version 22470 (0.0009) [2023-10-12 16:41:44,581][62635] Updated weights for policy 1, policy_version 22480 (0.0007) [2023-10-12 16:41:44,953][62635] Updated weights for policy 1, policy_version 22490 (0.0007) [2023-10-12 16:41:47,286][62634] Updated weights for policy 0, policy_version 22500 (0.0009) [2023-10-12 16:41:47,663][62634] Updated weights for policy 0, policy_version 22510 (0.0012) [2023-10-12 16:41:48,042][62634] Updated weights for policy 0, policy_version 22520 (0.0009) [2023-10-12 16:41:48,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46104576. Throughput: 0: 1663.3, 1: 1674.2. Samples: 11535414. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:41:48,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.750')] [2023-10-12 16:41:48,949][62635] Updated weights for policy 1, policy_version 22500 (0.0007) [2023-10-12 16:41:49,323][62635] Updated weights for policy 1, policy_version 22510 (0.0007) [2023-10-12 16:41:49,683][62635] Updated weights for policy 1, policy_version 22520 (0.0008) [2023-10-12 16:41:52,037][62634] Updated weights for policy 0, policy_version 22530 (0.0008) [2023-10-12 16:41:52,412][62634] Updated weights for policy 0, policy_version 22540 (0.0008) [2023-10-12 16:41:52,797][62634] Updated weights for policy 0, policy_version 22550 (0.0009) [2023-10-12 16:41:53,171][62634] Updated weights for policy 0, policy_version 22560 (0.0007) [2023-10-12 16:41:53,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46170112. Throughput: 0: 1681.1, 1: 1678.2. Samples: 11545458. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:41:53,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.800')] [2023-10-12 16:41:53,876][62635] Updated weights for policy 1, policy_version 22530 (0.0009) [2023-10-12 16:41:54,235][62635] Updated weights for policy 1, policy_version 22540 (0.0008) [2023-10-12 16:41:54,609][62635] Updated weights for policy 1, policy_version 22550 (0.0009) [2023-10-12 16:41:54,969][62635] Updated weights for policy 1, policy_version 22560 (0.0008) [2023-10-12 16:41:57,251][62634] Updated weights for policy 0, policy_version 22570 (0.0007) [2023-10-12 16:41:57,633][62634] Updated weights for policy 0, policy_version 22580 (0.0007) [2023-10-12 16:41:57,998][62634] Updated weights for policy 0, policy_version 22590 (0.0007) [2023-10-12 16:41:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46235648. Throughput: 0: 1684.0, 1: 1684.1. Samples: 11566378. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:41:58,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.800')] [2023-10-12 16:41:59,077][62635] Updated weights for policy 1, policy_version 22570 (0.0007) [2023-10-12 16:41:59,455][62635] Updated weights for policy 1, policy_version 22580 (0.0009) [2023-10-12 16:41:59,825][62635] Updated weights for policy 1, policy_version 22590 (0.0009) [2023-10-12 16:42:01,856][62634] Updated weights for policy 0, policy_version 22600 (0.0008) [2023-10-12 16:42:02,233][62634] Updated weights for policy 0, policy_version 22610 (0.0009) [2023-10-12 16:42:02,617][62634] Updated weights for policy 0, policy_version 22620 (0.0009) [2023-10-12 16:42:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46301184. Throughput: 0: 1663.4, 1: 1679.7. Samples: 11585866. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:03,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.900')] [2023-10-12 16:42:03,864][62635] Updated weights for policy 1, policy_version 22600 (0.0007) [2023-10-12 16:42:04,238][62635] Updated weights for policy 1, policy_version 22610 (0.0008) [2023-10-12 16:42:04,598][62635] Updated weights for policy 1, policy_version 22620 (0.0010) [2023-10-12 16:42:06,663][62634] Updated weights for policy 0, policy_version 22630 (0.0008) [2023-10-12 16:42:07,039][62634] Updated weights for policy 0, policy_version 22640 (0.0010) [2023-10-12 16:42:07,422][62634] Updated weights for policy 0, policy_version 22650 (0.0010) [2023-10-12 16:42:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46366720. Throughput: 0: 1693.1, 1: 1677.3. Samples: 11596392. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:08,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.900')] [2023-10-12 16:42:08,583][62635] Updated weights for policy 1, policy_version 22630 (0.0007) [2023-10-12 16:42:08,961][62635] Updated weights for policy 1, policy_version 22640 (0.0008) [2023-10-12 16:42:09,331][62635] Updated weights for policy 1, policy_version 22650 (0.0007) [2023-10-12 16:42:11,527][62634] Updated weights for policy 0, policy_version 22660 (0.0009) [2023-10-12 16:42:11,905][62634] Updated weights for policy 0, policy_version 22670 (0.0008) [2023-10-12 16:42:12,282][62634] Updated weights for policy 0, policy_version 22680 (0.0009) [2023-10-12 16:42:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46432256. Throughput: 0: 1679.0, 1: 1679.8. Samples: 11616440. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:13,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.120')] [2023-10-12 16:42:13,466][62635] Updated weights for policy 1, policy_version 22660 (0.0009) [2023-10-12 16:42:13,834][62635] Updated weights for policy 1, policy_version 22670 (0.0011) [2023-10-12 16:42:14,204][62635] Updated weights for policy 1, policy_version 22680 (0.0010) [2023-10-12 16:42:16,245][62634] Updated weights for policy 0, policy_version 22690 (0.0008) [2023-10-12 16:42:16,616][62634] Updated weights for policy 0, policy_version 22700 (0.0010) [2023-10-12 16:42:16,989][62634] Updated weights for policy 0, policy_version 22710 (0.0009) [2023-10-12 16:42:17,364][62634] Updated weights for policy 0, policy_version 22720 (0.0007) [2023-10-12 16:42:18,237][62635] Updated weights for policy 1, policy_version 22690 (0.0009) [2023-10-12 16:42:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46497792. Throughput: 0: 1678.7, 1: 1686.3. Samples: 11636690. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:18,436][61643] Avg episode reward: [(0, '3.990'), (1, '9.070')] [2023-10-12 16:42:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000022720_23265280.pth... [2023-10-12 16:42:18,478][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000021152_21659648.pth [2023-10-12 16:42:18,601][62635] Updated weights for policy 1, policy_version 22700 (0.0007) [2023-10-12 16:42:18,970][62635] Updated weights for policy 1, policy_version 22710 (0.0010) [2023-10-12 16:42:19,346][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000022720_23265280.pth... 
[2023-10-12 16:42:19,346][62635] Updated weights for policy 1, policy_version 22720 (0.0010) [2023-10-12 16:42:19,386][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000021120_21626880.pth [2023-10-12 16:42:21,414][62634] Updated weights for policy 0, policy_version 22730 (0.0009) [2023-10-12 16:42:21,780][62634] Updated weights for policy 0, policy_version 22740 (0.0008) [2023-10-12 16:42:22,159][62634] Updated weights for policy 0, policy_version 22750 (0.0007) [2023-10-12 16:42:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46563328. Throughput: 0: 1694.1, 1: 1688.6. Samples: 11647012. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:23,436][61643] Avg episode reward: [(0, '3.990'), (1, '9.240')] [2023-10-12 16:42:23,576][62635] Updated weights for policy 1, policy_version 22730 (0.0011) [2023-10-12 16:42:23,946][62635] Updated weights for policy 1, policy_version 22740 (0.0009) [2023-10-12 16:42:24,318][62635] Updated weights for policy 1, policy_version 22750 (0.0008) [2023-10-12 16:42:24,387][62495] Saving new best policy, reward=9.240! [2023-10-12 16:42:26,230][62634] Updated weights for policy 0, policy_version 22760 (0.0008) [2023-10-12 16:42:26,611][62634] Updated weights for policy 0, policy_version 22770 (0.0008) [2023-10-12 16:42:26,977][62634] Updated weights for policy 0, policy_version 22780 (0.0007) [2023-10-12 16:42:28,334][62635] Updated weights for policy 1, policy_version 22760 (0.0008) [2023-10-12 16:42:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46628864. Throughput: 0: 1673.6, 1: 1687.0. Samples: 11666456. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:28,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.110')] [2023-10-12 16:42:28,703][62635] Updated weights for policy 1, policy_version 22770 (0.0008) [2023-10-12 16:42:29,074][62635] Updated weights for policy 1, policy_version 22780 (0.0010) [2023-10-12 16:42:31,082][62634] Updated weights for policy 0, policy_version 22790 (0.0010) [2023-10-12 16:42:31,454][62634] Updated weights for policy 0, policy_version 22800 (0.0011) [2023-10-12 16:42:31,825][62634] Updated weights for policy 0, policy_version 22810 (0.0010) [2023-10-12 16:42:32,948][62635] Updated weights for policy 1, policy_version 22790 (0.0010) [2023-10-12 16:42:33,312][62635] Updated weights for policy 1, policy_version 22800 (0.0007) [2023-10-12 16:42:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46694400. Throughput: 0: 1681.4, 1: 1676.8. Samples: 11686532. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:33,435][61643] Avg episode reward: [(0, '4.000'), (1, '8.890')] [2023-10-12 16:42:33,677][62635] Updated weights for policy 1, policy_version 22810 (0.0010) [2023-10-12 16:42:35,866][62634] Updated weights for policy 0, policy_version 22820 (0.0009) [2023-10-12 16:42:36,243][62634] Updated weights for policy 0, policy_version 22830 (0.0009) [2023-10-12 16:42:36,616][62634] Updated weights for policy 0, policy_version 22840 (0.0007) [2023-10-12 16:42:37,714][62635] Updated weights for policy 1, policy_version 22820 (0.0010) [2023-10-12 16:42:38,090][62635] Updated weights for policy 1, policy_version 22830 (0.0007) [2023-10-12 16:42:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46759936. Throughput: 0: 1684.9, 1: 1687.5. Samples: 11697214. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:42:38,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.050')] [2023-10-12 16:42:38,451][62635] Updated weights for policy 1, policy_version 22840 (0.0009) [2023-10-12 16:42:40,488][62634] Updated weights for policy 0, policy_version 22850 (0.0007) [2023-10-12 16:42:40,878][62634] Updated weights for policy 0, policy_version 22860 (0.0010) [2023-10-12 16:42:41,249][62634] Updated weights for policy 0, policy_version 22870 (0.0007) [2023-10-12 16:42:41,623][62634] Updated weights for policy 0, policy_version 22880 (0.0008) [2023-10-12 16:42:42,716][62635] Updated weights for policy 1, policy_version 22850 (0.0010) [2023-10-12 16:42:43,119][62635] Updated weights for policy 1, policy_version 22860 (0.0007) [2023-10-12 16:42:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46825472. Throughput: 0: 1663.1, 1: 1682.3. Samples: 11716920. Policy #0 lag: (min: 16.0, avg: 36.7, max: 48.0) [2023-10-12 16:42:43,435][61643] Avg episode reward: [(0, '3.850'), (1, '8.890')] [2023-10-12 16:42:43,486][62635] Updated weights for policy 1, policy_version 22870 (0.0007) [2023-10-12 16:42:43,851][62635] Updated weights for policy 1, policy_version 22880 (0.0008) [2023-10-12 16:42:45,604][62634] Updated weights for policy 0, policy_version 22890 (0.0008) [2023-10-12 16:42:45,979][62634] Updated weights for policy 0, policy_version 22900 (0.0009) [2023-10-12 16:42:46,352][62634] Updated weights for policy 0, policy_version 22910 (0.0008) [2023-10-12 16:42:47,905][62635] Updated weights for policy 1, policy_version 22890 (0.0009) [2023-10-12 16:42:48,280][62635] Updated weights for policy 1, policy_version 22900 (0.0009) [2023-10-12 16:42:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46891008. Throughput: 0: 1692.5, 1: 1670.0. Samples: 11737180. 
Policy #0 lag: (min: 16.0, avg: 36.7, max: 48.0) [2023-10-12 16:42:48,436][61643] Avg episode reward: [(0, '3.850'), (1, '8.950')] [2023-10-12 16:42:48,656][62635] Updated weights for policy 1, policy_version 22910 (0.0009) [2023-10-12 16:42:50,355][62634] Updated weights for policy 0, policy_version 22920 (0.0010) [2023-10-12 16:42:50,742][62634] Updated weights for policy 0, policy_version 22930 (0.0009) [2023-10-12 16:42:51,112][62634] Updated weights for policy 0, policy_version 22940 (0.0011) [2023-10-12 16:42:52,555][62635] Updated weights for policy 1, policy_version 22920 (0.0010) [2023-10-12 16:42:52,937][62635] Updated weights for policy 1, policy_version 22930 (0.0009) [2023-10-12 16:42:53,304][62635] Updated weights for policy 1, policy_version 22940 (0.0010) [2023-10-12 16:42:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46956544. Throughput: 0: 1668.4, 1: 1682.4. Samples: 11747174. Policy #0 lag: (min: 16.0, avg: 36.7, max: 48.0) [2023-10-12 16:42:53,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.960')] [2023-10-12 16:42:55,204][62634] Updated weights for policy 0, policy_version 22950 (0.0009) [2023-10-12 16:42:55,578][62634] Updated weights for policy 0, policy_version 22960 (0.0009) [2023-10-12 16:42:55,955][62634] Updated weights for policy 0, policy_version 22970 (0.0009) [2023-10-12 16:42:57,542][62635] Updated weights for policy 1, policy_version 22950 (0.0009) [2023-10-12 16:42:57,912][62635] Updated weights for policy 1, policy_version 22960 (0.0008) [2023-10-12 16:42:58,274][62635] Updated weights for policy 1, policy_version 22970 (0.0007) [2023-10-12 16:42:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47022080. Throughput: 0: 1674.8, 1: 1684.2. Samples: 11767598. 
Policy #0 lag: (min: 16.0, avg: 36.7, max: 48.0) [2023-10-12 16:42:58,435][61643] Avg episode reward: [(0, '3.910'), (1, '9.020')] [2023-10-12 16:43:00,165][62634] Updated weights for policy 0, policy_version 22980 (0.0009) [2023-10-12 16:43:00,550][62634] Updated weights for policy 0, policy_version 22990 (0.0009) [2023-10-12 16:43:00,929][62634] Updated weights for policy 0, policy_version 23000 (0.0009) [2023-10-12 16:43:02,440][62635] Updated weights for policy 1, policy_version 22980 (0.0009) [2023-10-12 16:43:02,800][62635] Updated weights for policy 1, policy_version 22990 (0.0010) [2023-10-12 16:43:03,173][62635] Updated weights for policy 1, policy_version 23000 (0.0007) [2023-10-12 16:43:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47087616. Throughput: 0: 1684.1, 1: 1662.0. Samples: 11787264. Policy #0 lag: (min: 6.0, avg: 8.1, max: 34.0) [2023-10-12 16:43:03,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.990')] [2023-10-12 16:43:04,936][62634] Updated weights for policy 0, policy_version 23010 (0.0008) [2023-10-12 16:43:05,317][62634] Updated weights for policy 0, policy_version 23020 (0.0007) [2023-10-12 16:43:05,683][62634] Updated weights for policy 0, policy_version 23030 (0.0008) [2023-10-12 16:43:06,069][62634] Updated weights for policy 0, policy_version 23040 (0.0009) [2023-10-12 16:43:07,118][62635] Updated weights for policy 1, policy_version 23010 (0.0008) [2023-10-12 16:43:07,486][62635] Updated weights for policy 1, policy_version 23020 (0.0009) [2023-10-12 16:43:07,854][62635] Updated weights for policy 1, policy_version 23030 (0.0007) [2023-10-12 16:43:08,218][62635] Updated weights for policy 1, policy_version 23040 (0.0007) [2023-10-12 16:43:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47185920. Throughput: 0: 1658.2, 1: 1680.1. Samples: 11797232. 
Policy #0 lag: (min: 6.0, avg: 8.1, max: 34.0) [2023-10-12 16:43:08,436][61643] Avg episode reward: [(0, '3.940'), (1, '8.950')] [2023-10-12 16:43:10,176][62634] Updated weights for policy 0, policy_version 23050 (0.0007) [2023-10-12 16:43:10,558][62634] Updated weights for policy 0, policy_version 23060 (0.0007) [2023-10-12 16:43:10,929][62634] Updated weights for policy 0, policy_version 23070 (0.0009) [2023-10-12 16:43:12,268][62635] Updated weights for policy 1, policy_version 23050 (0.0007) [2023-10-12 16:43:12,629][62635] Updated weights for policy 1, policy_version 23060 (0.0008) [2023-10-12 16:43:12,994][62635] Updated weights for policy 1, policy_version 23070 (0.0009) [2023-10-12 16:43:13,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47251456. Throughput: 0: 1680.4, 1: 1678.2. Samples: 11817592. Policy #0 lag: (min: 6.0, avg: 8.1, max: 34.0) [2023-10-12 16:43:13,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.210')] [2023-10-12 16:43:14,867][62634] Updated weights for policy 0, policy_version 23080 (0.0010) [2023-10-12 16:43:15,246][62634] Updated weights for policy 0, policy_version 23090 (0.0009) [2023-10-12 16:43:15,631][62634] Updated weights for policy 0, policy_version 23100 (0.0010) [2023-10-12 16:43:17,022][62635] Updated weights for policy 1, policy_version 23080 (0.0007) [2023-10-12 16:43:17,388][62635] Updated weights for policy 1, policy_version 23090 (0.0009) [2023-10-12 16:43:17,770][62635] Updated weights for policy 1, policy_version 23100 (0.0009) [2023-10-12 16:43:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47316992. Throughput: 0: 1690.0, 1: 1658.0. Samples: 11837194. 
Policy #0 lag: (min: 6.0, avg: 8.1, max: 34.0) [2023-10-12 16:43:18,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.010')] [2023-10-12 16:43:19,901][62634] Updated weights for policy 0, policy_version 23110 (0.0008) [2023-10-12 16:43:20,271][62634] Updated weights for policy 0, policy_version 23120 (0.0008) [2023-10-12 16:43:20,644][62634] Updated weights for policy 0, policy_version 23130 (0.0009) [2023-10-12 16:43:21,898][62635] Updated weights for policy 1, policy_version 23110 (0.0008) [2023-10-12 16:43:22,257][62635] Updated weights for policy 1, policy_version 23120 (0.0007) [2023-10-12 16:43:22,632][62635] Updated weights for policy 1, policy_version 23130 (0.0007) [2023-10-12 16:43:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 47382528. Throughput: 0: 1664.0, 1: 1674.7. Samples: 11847452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:43:23,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.930')] [2023-10-12 16:43:24,739][62634] Updated weights for policy 0, policy_version 23140 (0.0008) [2023-10-12 16:43:25,124][62634] Updated weights for policy 0, policy_version 23150 (0.0007) [2023-10-12 16:43:25,504][62634] Updated weights for policy 0, policy_version 23160 (0.0007) [2023-10-12 16:43:26,844][62635] Updated weights for policy 1, policy_version 23140 (0.0008) [2023-10-12 16:43:27,214][62635] Updated weights for policy 1, policy_version 23150 (0.0008) [2023-10-12 16:43:27,579][62635] Updated weights for policy 1, policy_version 23160 (0.0007) [2023-10-12 16:43:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47448064. Throughput: 0: 1680.7, 1: 1669.2. Samples: 11867666. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:43:28,436][61643] Avg episode reward: [(0, '3.890'), (1, '8.920')] [2023-10-12 16:43:29,574][62634] Updated weights for policy 0, policy_version 23170 (0.0008) [2023-10-12 16:43:29,943][62634] Updated weights for policy 0, policy_version 23180 (0.0011) [2023-10-12 16:43:30,321][62634] Updated weights for policy 0, policy_version 23190 (0.0010) [2023-10-12 16:43:30,706][62634] Updated weights for policy 0, policy_version 23200 (0.0008) [2023-10-12 16:43:31,664][62635] Updated weights for policy 1, policy_version 23170 (0.0010) [2023-10-12 16:43:32,061][62635] Updated weights for policy 1, policy_version 23180 (0.0011) [2023-10-12 16:43:32,419][62635] Updated weights for policy 1, policy_version 23190 (0.0010) [2023-10-12 16:43:32,785][62635] Updated weights for policy 1, policy_version 23200 (0.0010) [2023-10-12 16:43:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47513600. Throughput: 0: 1677.5, 1: 1661.9. Samples: 11887450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:43:33,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.860')] [2023-10-12 16:43:34,831][62634] Updated weights for policy 0, policy_version 23210 (0.0007) [2023-10-12 16:43:35,221][62634] Updated weights for policy 0, policy_version 23220 (0.0009) [2023-10-12 16:43:35,604][62634] Updated weights for policy 0, policy_version 23230 (0.0007) [2023-10-12 16:43:36,561][62635] Updated weights for policy 1, policy_version 23210 (0.0008) [2023-10-12 16:43:36,929][62635] Updated weights for policy 1, policy_version 23220 (0.0009) [2023-10-12 16:43:37,292][62635] Updated weights for policy 1, policy_version 23230 (0.0007) [2023-10-12 16:43:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47579136. Throughput: 0: 1667.2, 1: 1683.6. Samples: 11897958. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:43:38,436][61643] Avg episode reward: [(0, '3.940'), (1, '8.890')] [2023-10-12 16:43:39,572][62634] Updated weights for policy 0, policy_version 23240 (0.0007) [2023-10-12 16:43:39,948][62634] Updated weights for policy 0, policy_version 23250 (0.0009) [2023-10-12 16:43:40,320][62634] Updated weights for policy 0, policy_version 23260 (0.0011) [2023-10-12 16:43:41,380][62635] Updated weights for policy 1, policy_version 23240 (0.0007) [2023-10-12 16:43:41,758][62635] Updated weights for policy 1, policy_version 23250 (0.0008) [2023-10-12 16:43:42,128][62635] Updated weights for policy 1, policy_version 23260 (0.0008) [2023-10-12 16:43:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47644672. Throughput: 0: 1674.0, 1: 1665.2. Samples: 11917860. Policy #0 lag: (min: 0.0, avg: 28.4, max: 32.0) [2023-10-12 16:43:43,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.790')] [2023-10-12 16:43:44,412][62634] Updated weights for policy 0, policy_version 23270 (0.0007) [2023-10-12 16:43:44,780][62634] Updated weights for policy 0, policy_version 23280 (0.0009) [2023-10-12 16:43:45,165][62634] Updated weights for policy 0, policy_version 23290 (0.0008) [2023-10-12 16:43:46,236][62635] Updated weights for policy 1, policy_version 23270 (0.0009) [2023-10-12 16:43:46,603][62635] Updated weights for policy 1, policy_version 23280 (0.0007) [2023-10-12 16:43:46,973][62635] Updated weights for policy 1, policy_version 23290 (0.0008) [2023-10-12 16:43:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47710208. Throughput: 0: 1676.2, 1: 1676.0. Samples: 11938114. 
Policy #0 lag: (min: 0.0, avg: 28.4, max: 32.0) [2023-10-12 16:43:48,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.900')] [2023-10-12 16:43:49,301][62634] Updated weights for policy 0, policy_version 23300 (0.0009) [2023-10-12 16:43:49,691][62634] Updated weights for policy 0, policy_version 23310 (0.0008) [2023-10-12 16:43:50,071][62634] Updated weights for policy 0, policy_version 23320 (0.0010) [2023-10-12 16:43:51,032][62635] Updated weights for policy 1, policy_version 23300 (0.0008) [2023-10-12 16:43:51,407][62635] Updated weights for policy 1, policy_version 23310 (0.0009) [2023-10-12 16:43:51,781][62635] Updated weights for policy 1, policy_version 23320 (0.0007) [2023-10-12 16:43:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47775744. Throughput: 0: 1670.2, 1: 1685.6. Samples: 11948240. Policy #0 lag: (min: 0.0, avg: 28.4, max: 32.0) [2023-10-12 16:43:53,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.940')] [2023-10-12 16:43:54,099][62634] Updated weights for policy 0, policy_version 23330 (0.0008) [2023-10-12 16:43:54,474][62634] Updated weights for policy 0, policy_version 23340 (0.0007) [2023-10-12 16:43:54,844][62634] Updated weights for policy 0, policy_version 23350 (0.0009) [2023-10-12 16:43:55,227][62634] Updated weights for policy 0, policy_version 23360 (0.0009) [2023-10-12 16:43:55,823][62635] Updated weights for policy 1, policy_version 23330 (0.0008) [2023-10-12 16:43:56,191][62635] Updated weights for policy 1, policy_version 23340 (0.0007) [2023-10-12 16:43:56,560][62635] Updated weights for policy 1, policy_version 23350 (0.0009) [2023-10-12 16:43:56,936][62635] Updated weights for policy 1, policy_version 23360 (0.0007) [2023-10-12 16:43:58,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47841280. Throughput: 0: 1674.5, 1: 1666.2. Samples: 11967926. 
Policy #0 lag: (min: 0.0, avg: 28.4, max: 32.0) [2023-10-12 16:43:58,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.870')] [2023-10-12 16:43:59,204][62634] Updated weights for policy 0, policy_version 23370 (0.0010) [2023-10-12 16:43:59,586][62634] Updated weights for policy 0, policy_version 23380 (0.0008) [2023-10-12 16:43:59,952][62634] Updated weights for policy 0, policy_version 23390 (0.0010) [2023-10-12 16:44:00,948][62635] Updated weights for policy 1, policy_version 23370 (0.0009) [2023-10-12 16:44:01,322][62635] Updated weights for policy 1, policy_version 23380 (0.0008) [2023-10-12 16:44:01,685][62635] Updated weights for policy 1, policy_version 23390 (0.0009) [2023-10-12 16:44:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 47906816. Throughput: 0: 1674.7, 1: 1693.8. Samples: 11988776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:03,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.110')] [2023-10-12 16:44:04,034][62634] Updated weights for policy 0, policy_version 23400 (0.0010) [2023-10-12 16:44:04,407][62634] Updated weights for policy 0, policy_version 23410 (0.0007) [2023-10-12 16:44:04,790][62634] Updated weights for policy 0, policy_version 23420 (0.0007) [2023-10-12 16:44:05,719][62635] Updated weights for policy 1, policy_version 23400 (0.0010) [2023-10-12 16:44:06,093][62635] Updated weights for policy 1, policy_version 23410 (0.0010) [2023-10-12 16:44:06,457][62635] Updated weights for policy 1, policy_version 23420 (0.0011) [2023-10-12 16:44:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47972352. Throughput: 0: 1675.4, 1: 1681.3. Samples: 11998506. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:08,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.140')] [2023-10-12 16:44:08,819][62634] Updated weights for policy 0, policy_version 23430 (0.0010) [2023-10-12 16:44:09,188][62634] Updated weights for policy 0, policy_version 23440 (0.0010) [2023-10-12 16:44:09,565][62634] Updated weights for policy 0, policy_version 23450 (0.0008) [2023-10-12 16:44:10,571][62635] Updated weights for policy 1, policy_version 23430 (0.0009) [2023-10-12 16:44:10,939][62635] Updated weights for policy 1, policy_version 23440 (0.0008) [2023-10-12 16:44:11,306][62635] Updated weights for policy 1, policy_version 23450 (0.0008) [2023-10-12 16:44:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48037888. Throughput: 0: 1679.9, 1: 1673.2. Samples: 12018554. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:13,435][61643] Avg episode reward: [(0, '3.920'), (1, '9.040')] [2023-10-12 16:44:13,622][62634] Updated weights for policy 0, policy_version 23460 (0.0009) [2023-10-12 16:44:13,996][62634] Updated weights for policy 0, policy_version 23470 (0.0010) [2023-10-12 16:44:14,364][62634] Updated weights for policy 0, policy_version 23480 (0.0010) [2023-10-12 16:44:15,328][62635] Updated weights for policy 1, policy_version 23460 (0.0009) [2023-10-12 16:44:15,699][62635] Updated weights for policy 1, policy_version 23470 (0.0011) [2023-10-12 16:44:16,062][62635] Updated weights for policy 1, policy_version 23480 (0.0011) [2023-10-12 16:44:18,252][62634] Updated weights for policy 0, policy_version 23490 (0.0009) [2023-10-12 16:44:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 48103424. Throughput: 0: 1684.8, 1: 1696.4. Samples: 12039604. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:18,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.140')] [2023-10-12 16:44:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000023488_24051712.pth... [2023-10-12 16:44:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000021920_22446080.pth [2023-10-12 16:44:18,487][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000023488_24051712.pth [2023-10-12 16:44:18,629][62634] Updated weights for policy 0, policy_version 23500 (0.0010) [2023-10-12 16:44:19,012][62634] Updated weights for policy 0, policy_version 23510 (0.0010) [2023-10-12 16:44:19,380][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000023520_24084480.pth... [2023-10-12 16:44:19,384][62634] Updated weights for policy 0, policy_version 23520 (0.0007) [2023-10-12 16:44:19,415][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000021920_22446080.pth [2023-10-12 16:44:19,419][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000023520_24084480.pth [2023-10-12 16:44:20,158][62635] Updated weights for policy 1, policy_version 23490 (0.0010) [2023-10-12 16:44:20,562][62635] Updated weights for policy 1, policy_version 23500 (0.0010) [2023-10-12 16:44:20,939][62635] Updated weights for policy 1, policy_version 23510 (0.0007) [2023-10-12 16:44:21,305][62635] Updated weights for policy 1, policy_version 23520 (0.0008) [2023-10-12 16:44:23,423][62634] Updated weights for policy 0, policy_version 23530 (0.0010) [2023-10-12 16:44:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 48168960. Throughput: 0: 1684.1, 1: 1668.4. Samples: 12048822. 
Policy #0 lag: (min: 8.0, avg: 30.0, max: 40.0) [2023-10-12 16:44:23,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.140')] [2023-10-12 16:44:23,797][62634] Updated weights for policy 0, policy_version 23540 (0.0011) [2023-10-12 16:44:24,170][62634] Updated weights for policy 0, policy_version 23550 (0.0010) [2023-10-12 16:44:25,205][62635] Updated weights for policy 1, policy_version 23530 (0.0008) [2023-10-12 16:44:25,578][62635] Updated weights for policy 1, policy_version 23540 (0.0007) [2023-10-12 16:44:25,943][62635] Updated weights for policy 1, policy_version 23550 (0.0009) [2023-10-12 16:44:28,244][62634] Updated weights for policy 0, policy_version 23560 (0.0007) [2023-10-12 16:44:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 48234496. Throughput: 0: 1685.0, 1: 1680.6. Samples: 12069312. Policy #0 lag: (min: 8.0, avg: 30.0, max: 40.0) [2023-10-12 16:44:28,435][61643] Avg episode reward: [(0, '3.930'), (1, '9.010')] [2023-10-12 16:44:28,629][62634] Updated weights for policy 0, policy_version 23570 (0.0007) [2023-10-12 16:44:29,006][62634] Updated weights for policy 0, policy_version 23580 (0.0008) [2023-10-12 16:44:29,868][62635] Updated weights for policy 1, policy_version 23560 (0.0008) [2023-10-12 16:44:30,240][62635] Updated weights for policy 1, policy_version 23570 (0.0007) [2023-10-12 16:44:30,613][62635] Updated weights for policy 1, policy_version 23580 (0.0009) [2023-10-12 16:44:33,021][62634] Updated weights for policy 0, policy_version 23590 (0.0007) [2023-10-12 16:44:33,394][62634] Updated weights for policy 0, policy_version 23600 (0.0009) [2023-10-12 16:44:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48300032. Throughput: 0: 1684.2, 1: 1694.1. Samples: 12090134. 
Policy #0 lag: (min: 8.0, avg: 30.0, max: 40.0) [2023-10-12 16:44:33,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.130')] [2023-10-12 16:44:33,772][62634] Updated weights for policy 0, policy_version 23610 (0.0008) [2023-10-12 16:44:34,878][62635] Updated weights for policy 1, policy_version 23590 (0.0009) [2023-10-12 16:44:35,240][62635] Updated weights for policy 1, policy_version 23600 (0.0008) [2023-10-12 16:44:35,615][62635] Updated weights for policy 1, policy_version 23610 (0.0007) [2023-10-12 16:44:37,769][62634] Updated weights for policy 0, policy_version 23620 (0.0007) [2023-10-12 16:44:38,158][62634] Updated weights for policy 0, policy_version 23630 (0.0009) [2023-10-12 16:44:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48365568. Throughput: 0: 1695.3, 1: 1666.7. Samples: 12099530. Policy #0 lag: (min: 8.0, avg: 30.0, max: 40.0) [2023-10-12 16:44:38,436][61643] Avg episode reward: [(0, '3.970'), (1, '8.930')] [2023-10-12 16:44:38,534][62634] Updated weights for policy 0, policy_version 23640 (0.0007) [2023-10-12 16:44:39,927][62635] Updated weights for policy 1, policy_version 23620 (0.0007) [2023-10-12 16:44:40,294][62635] Updated weights for policy 1, policy_version 23630 (0.0007) [2023-10-12 16:44:40,653][62635] Updated weights for policy 1, policy_version 23640 (0.0008) [2023-10-12 16:44:42,623][62634] Updated weights for policy 0, policy_version 23650 (0.0007) [2023-10-12 16:44:42,993][62634] Updated weights for policy 0, policy_version 23660 (0.0008) [2023-10-12 16:44:43,381][62634] Updated weights for policy 0, policy_version 23670 (0.0010) [2023-10-12 16:44:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48431104. Throughput: 0: 1696.0, 1: 1686.1. Samples: 12120122. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:43,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.000')] [2023-10-12 16:44:43,746][62634] Updated weights for policy 0, policy_version 23680 (0.0010) [2023-10-12 16:44:44,729][62635] Updated weights for policy 1, policy_version 23650 (0.0008) [2023-10-12 16:44:45,101][62635] Updated weights for policy 1, policy_version 23660 (0.0010) [2023-10-12 16:44:45,468][62635] Updated weights for policy 1, policy_version 23670 (0.0008) [2023-10-12 16:44:45,845][62635] Updated weights for policy 1, policy_version 23680 (0.0009) [2023-10-12 16:44:47,972][62634] Updated weights for policy 0, policy_version 23690 (0.0010) [2023-10-12 16:44:48,349][62634] Updated weights for policy 0, policy_version 23700 (0.0009) [2023-10-12 16:44:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48496640. Throughput: 0: 1687.0, 1: 1689.6. Samples: 12140720. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:48,436][61643] Avg episode reward: [(0, '3.990'), (1, '9.200')] [2023-10-12 16:44:48,724][62634] Updated weights for policy 0, policy_version 23710 (0.0009) [2023-10-12 16:44:49,728][62635] Updated weights for policy 1, policy_version 23690 (0.0007) [2023-10-12 16:44:50,103][62635] Updated weights for policy 1, policy_version 23700 (0.0009) [2023-10-12 16:44:50,467][62635] Updated weights for policy 1, policy_version 23710 (0.0008) [2023-10-12 16:44:52,907][62634] Updated weights for policy 0, policy_version 23720 (0.0009) [2023-10-12 16:44:53,291][62634] Updated weights for policy 0, policy_version 23730 (0.0010) [2023-10-12 16:44:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48562176. Throughput: 0: 1695.7, 1: 1677.2. Samples: 12150286. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:53,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.050')] [2023-10-12 16:44:53,675][62634] Updated weights for policy 0, policy_version 23740 (0.0009) [2023-10-12 16:44:54,433][62635] Updated weights for policy 1, policy_version 23720 (0.0008) [2023-10-12 16:44:54,797][62635] Updated weights for policy 1, policy_version 23730 (0.0007) [2023-10-12 16:44:55,173][62635] Updated weights for policy 1, policy_version 23740 (0.0009) [2023-10-12 16:44:57,569][62634] Updated weights for policy 0, policy_version 23750 (0.0007) [2023-10-12 16:44:57,955][62634] Updated weights for policy 0, policy_version 23760 (0.0007) [2023-10-12 16:44:58,325][62634] Updated weights for policy 0, policy_version 23770 (0.0007) [2023-10-12 16:44:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48627712. Throughput: 0: 1694.8, 1: 1693.7. Samples: 12171040. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:44:58,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.130')] [2023-10-12 16:44:59,388][62635] Updated weights for policy 1, policy_version 23750 (0.0010) [2023-10-12 16:44:59,754][62635] Updated weights for policy 1, policy_version 23760 (0.0010) [2023-10-12 16:45:00,117][62635] Updated weights for policy 1, policy_version 23770 (0.0010) [2023-10-12 16:45:02,481][62634] Updated weights for policy 0, policy_version 23780 (0.0007) [2023-10-12 16:45:02,871][62634] Updated weights for policy 0, policy_version 23790 (0.0007) [2023-10-12 16:45:03,242][62634] Updated weights for policy 0, policy_version 23800 (0.0007) [2023-10-12 16:45:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48693248. Throughput: 0: 1676.5, 1: 1692.2. Samples: 12191198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:45:03,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.220')] [2023-10-12 16:45:04,214][62635] Updated weights for policy 1, policy_version 23780 (0.0008) [2023-10-12 16:45:04,581][62635] Updated weights for policy 1, policy_version 23790 (0.0009) [2023-10-12 16:45:04,957][62635] Updated weights for policy 1, policy_version 23800 (0.0009) [2023-10-12 16:45:07,324][62634] Updated weights for policy 0, policy_version 23810 (0.0008) [2023-10-12 16:45:07,705][62634] Updated weights for policy 0, policy_version 23820 (0.0007) [2023-10-12 16:45:08,069][62634] Updated weights for policy 0, policy_version 23830 (0.0008) [2023-10-12 16:45:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48758784. Throughput: 0: 1692.5, 1: 1687.8. Samples: 12200936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:45:08,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.160')] [2023-10-12 16:45:08,446][62634] Updated weights for policy 0, policy_version 23840 (0.0009) [2023-10-12 16:45:08,906][62635] Updated weights for policy 1, policy_version 23810 (0.0007) [2023-10-12 16:45:09,301][62635] Updated weights for policy 1, policy_version 23820 (0.0010) [2023-10-12 16:45:09,667][62635] Updated weights for policy 1, policy_version 23830 (0.0008) [2023-10-12 16:45:10,039][62635] Updated weights for policy 1, policy_version 23840 (0.0008) [2023-10-12 16:45:12,526][62634] Updated weights for policy 0, policy_version 23850 (0.0008) [2023-10-12 16:45:12,903][62634] Updated weights for policy 0, policy_version 23860 (0.0009) [2023-10-12 16:45:13,280][62634] Updated weights for policy 0, policy_version 23870 (0.0008) [2023-10-12 16:45:13,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48857088. Throughput: 0: 1691.7, 1: 1693.7. Samples: 12221656. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:45:13,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.120')] [2023-10-12 16:45:13,989][62635] Updated weights for policy 1, policy_version 23850 (0.0008) [2023-10-12 16:45:14,363][62635] Updated weights for policy 1, policy_version 23860 (0.0007) [2023-10-12 16:45:14,730][62635] Updated weights for policy 1, policy_version 23870 (0.0007) [2023-10-12 16:45:17,210][62634] Updated weights for policy 0, policy_version 23880 (0.0008) [2023-10-12 16:45:17,598][62634] Updated weights for policy 0, policy_version 23890 (0.0008) [2023-10-12 16:45:17,979][62634] Updated weights for policy 0, policy_version 23900 (0.0008) [2023-10-12 16:45:18,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48922624. Throughput: 0: 1671.4, 1: 1686.4. Samples: 12241234. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:45:18,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.990')] [2023-10-12 16:45:18,843][62635] Updated weights for policy 1, policy_version 23880 (0.0008) [2023-10-12 16:45:19,206][62635] Updated weights for policy 1, policy_version 23890 (0.0010) [2023-10-12 16:45:19,576][62635] Updated weights for policy 1, policy_version 23900 (0.0009) [2023-10-12 16:45:21,814][62634] Updated weights for policy 0, policy_version 23910 (0.0009) [2023-10-12 16:45:22,189][62634] Updated weights for policy 0, policy_version 23920 (0.0009) [2023-10-12 16:45:22,568][62634] Updated weights for policy 0, policy_version 23930 (0.0010) [2023-10-12 16:45:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48988160. Throughput: 0: 1688.6, 1: 1688.0. Samples: 12251474. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:45:23,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.900')] [2023-10-12 16:45:23,523][62635] Updated weights for policy 1, policy_version 23910 (0.0010) [2023-10-12 16:45:23,894][62635] Updated weights for policy 1, policy_version 23920 (0.0010) [2023-10-12 16:45:24,263][62635] Updated weights for policy 1, policy_version 23930 (0.0009) [2023-10-12 16:45:26,686][62634] Updated weights for policy 0, policy_version 23940 (0.0007) [2023-10-12 16:45:27,063][62634] Updated weights for policy 0, policy_version 23950 (0.0009) [2023-10-12 16:45:27,441][62634] Updated weights for policy 0, policy_version 23960 (0.0009) [2023-10-12 16:45:28,201][62635] Updated weights for policy 1, policy_version 23940 (0.0009) [2023-10-12 16:45:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49053696. Throughput: 0: 1676.1, 1: 1697.9. Samples: 12271952. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:45:28,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.940')] [2023-10-12 16:45:28,572][62635] Updated weights for policy 1, policy_version 23950 (0.0009) [2023-10-12 16:45:28,942][62635] Updated weights for policy 1, policy_version 23960 (0.0009) [2023-10-12 16:45:31,553][62634] Updated weights for policy 0, policy_version 23970 (0.0008) [2023-10-12 16:45:31,920][62634] Updated weights for policy 0, policy_version 23980 (0.0009) [2023-10-12 16:45:32,300][62634] Updated weights for policy 0, policy_version 23990 (0.0009) [2023-10-12 16:45:32,671][62634] Updated weights for policy 0, policy_version 24000 (0.0009) [2023-10-12 16:45:32,993][62635] Updated weights for policy 1, policy_version 23970 (0.0010) [2023-10-12 16:45:33,364][62635] Updated weights for policy 1, policy_version 23980 (0.0008) [2023-10-12 16:45:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49119232. Throughput: 0: 1665.1, 1: 1696.2. 
Samples: 12291980. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:45:33,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.050')] [2023-10-12 16:45:33,738][62635] Updated weights for policy 1, policy_version 23990 (0.0007) [2023-10-12 16:45:34,102][62635] Updated weights for policy 1, policy_version 24000 (0.0008) [2023-10-12 16:45:36,531][62634] Updated weights for policy 0, policy_version 24010 (0.0008) [2023-10-12 16:45:36,909][62634] Updated weights for policy 0, policy_version 24020 (0.0008) [2023-10-12 16:45:37,284][62634] Updated weights for policy 0, policy_version 24030 (0.0008) [2023-10-12 16:45:37,983][62635] Updated weights for policy 1, policy_version 24010 (0.0007) [2023-10-12 16:45:38,356][62635] Updated weights for policy 1, policy_version 24020 (0.0008) [2023-10-12 16:45:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49184768. Throughput: 0: 1690.8, 1: 1696.0. Samples: 12302692. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:45:38,436][61643] Avg episode reward: [(0, '3.980'), (1, '8.960')] [2023-10-12 16:45:38,729][62635] Updated weights for policy 1, policy_version 24030 (0.0008) [2023-10-12 16:45:41,461][62634] Updated weights for policy 0, policy_version 24040 (0.0008) [2023-10-12 16:45:41,839][62634] Updated weights for policy 0, policy_version 24050 (0.0007) [2023-10-12 16:45:42,224][62634] Updated weights for policy 0, policy_version 24060 (0.0007) [2023-10-12 16:45:42,747][62635] Updated weights for policy 1, policy_version 24040 (0.0010) [2023-10-12 16:45:43,123][62635] Updated weights for policy 1, policy_version 24050 (0.0010) [2023-10-12 16:45:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49250304. Throughput: 0: 1673.1, 1: 1694.9. Samples: 12322602. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:45:43,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.090')] [2023-10-12 16:45:43,499][62635] Updated weights for policy 1, policy_version 24060 (0.0011) [2023-10-12 16:45:46,193][62634] Updated weights for policy 0, policy_version 24070 (0.0009) [2023-10-12 16:45:46,576][62634] Updated weights for policy 0, policy_version 24080 (0.0007) [2023-10-12 16:45:46,952][62634] Updated weights for policy 0, policy_version 24090 (0.0008) [2023-10-12 16:45:47,550][62635] Updated weights for policy 1, policy_version 24070 (0.0012) [2023-10-12 16:45:47,917][62635] Updated weights for policy 1, policy_version 24080 (0.0010) [2023-10-12 16:45:48,293][62635] Updated weights for policy 1, policy_version 24090 (0.0009) [2023-10-12 16:45:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49315840. Throughput: 0: 1675.0, 1: 1680.6. Samples: 12342198. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:45:48,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.010')] [2023-10-12 16:45:50,835][62634] Updated weights for policy 0, policy_version 24100 (0.0007) [2023-10-12 16:45:51,219][62634] Updated weights for policy 0, policy_version 24110 (0.0007) [2023-10-12 16:45:51,592][62634] Updated weights for policy 0, policy_version 24120 (0.0007) [2023-10-12 16:45:52,538][62635] Updated weights for policy 1, policy_version 24100 (0.0008) [2023-10-12 16:45:52,906][62635] Updated weights for policy 1, policy_version 24110 (0.0007) [2023-10-12 16:45:53,268][62635] Updated weights for policy 1, policy_version 24120 (0.0007) [2023-10-12 16:45:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49381376. Throughput: 0: 1686.5, 1: 1696.5. Samples: 12353174. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:45:53,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.080')] [2023-10-12 16:45:55,513][62634] Updated weights for policy 0, policy_version 24130 (0.0008) [2023-10-12 16:45:55,889][62634] Updated weights for policy 0, policy_version 24140 (0.0007) [2023-10-12 16:45:56,267][62634] Updated weights for policy 0, policy_version 24150 (0.0009) [2023-10-12 16:45:56,647][62634] Updated weights for policy 0, policy_version 24160 (0.0010) [2023-10-12 16:45:57,404][62635] Updated weights for policy 1, policy_version 24130 (0.0011) [2023-10-12 16:45:57,809][62635] Updated weights for policy 1, policy_version 24140 (0.0010) [2023-10-12 16:45:58,179][62635] Updated weights for policy 1, policy_version 24150 (0.0007) [2023-10-12 16:45:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49446912. Throughput: 0: 1667.1, 1: 1694.7. Samples: 12372934. Policy #0 lag: (min: 5.0, avg: 6.2, max: 29.0) [2023-10-12 16:45:58,436][61643] Avg episode reward: [(0, '3.970'), (1, '9.220')] [2023-10-12 16:45:58,552][62635] Updated weights for policy 1, policy_version 24160 (0.0008) [2023-10-12 16:46:00,817][62634] Updated weights for policy 0, policy_version 24170 (0.0007) [2023-10-12 16:46:01,192][62634] Updated weights for policy 0, policy_version 24180 (0.0007) [2023-10-12 16:46:01,570][62634] Updated weights for policy 0, policy_version 24190 (0.0008) [2023-10-12 16:46:02,681][62635] Updated weights for policy 1, policy_version 24170 (0.0008) [2023-10-12 16:46:03,052][62635] Updated weights for policy 1, policy_version 24180 (0.0007) [2023-10-12 16:46:03,416][62635] Updated weights for policy 1, policy_version 24190 (0.0007) [2023-10-12 16:46:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49512448. Throughput: 0: 1693.9, 1: 1676.5. Samples: 12392902. 
Policy #0 lag: (min: 5.0, avg: 6.2, max: 29.0) [2023-10-12 16:46:03,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.040')] [2023-10-12 16:46:05,638][62634] Updated weights for policy 0, policy_version 24200 (0.0008) [2023-10-12 16:46:06,014][62634] Updated weights for policy 0, policy_version 24210 (0.0008) [2023-10-12 16:46:06,384][62634] Updated weights for policy 0, policy_version 24220 (0.0007) [2023-10-12 16:46:07,272][62635] Updated weights for policy 1, policy_version 24200 (0.0009) [2023-10-12 16:46:07,648][62635] Updated weights for policy 1, policy_version 24210 (0.0009) [2023-10-12 16:46:08,022][62635] Updated weights for policy 1, policy_version 24220 (0.0008) [2023-10-12 16:46:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 49610752. Throughput: 0: 1684.6, 1: 1693.5. Samples: 12403488. Policy #0 lag: (min: 5.0, avg: 6.2, max: 29.0) [2023-10-12 16:46:08,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.990')] [2023-10-12 16:46:10,389][62634] Updated weights for policy 0, policy_version 24230 (0.0008) [2023-10-12 16:46:10,776][62634] Updated weights for policy 0, policy_version 24240 (0.0007) [2023-10-12 16:46:11,151][62634] Updated weights for policy 0, policy_version 24250 (0.0009) [2023-10-12 16:46:12,300][62635] Updated weights for policy 1, policy_version 24230 (0.0008) [2023-10-12 16:46:12,676][62635] Updated weights for policy 1, policy_version 24240 (0.0007) [2023-10-12 16:46:13,037][62635] Updated weights for policy 1, policy_version 24250 (0.0008) [2023-10-12 16:46:13,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49676288. Throughput: 0: 1680.0, 1: 1687.1. Samples: 12423472. 
Policy #0 lag: (min: 5.0, avg: 6.2, max: 29.0) [2023-10-12 16:46:13,435][61643] Avg episode reward: [(0, '3.920'), (1, '9.090')] [2023-10-12 16:46:15,168][62634] Updated weights for policy 0, policy_version 24260 (0.0010) [2023-10-12 16:46:15,553][62634] Updated weights for policy 0, policy_version 24270 (0.0008) [2023-10-12 16:46:15,938][62634] Updated weights for policy 0, policy_version 24280 (0.0008) [2023-10-12 16:46:17,194][62635] Updated weights for policy 1, policy_version 24260 (0.0008) [2023-10-12 16:46:17,561][62635] Updated weights for policy 1, policy_version 24270 (0.0008) [2023-10-12 16:46:17,932][62635] Updated weights for policy 1, policy_version 24280 (0.0008) [2023-10-12 16:46:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 49741824. Throughput: 0: 1699.3, 1: 1660.6. Samples: 12443176. Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) [2023-10-12 16:46:18,436][61643] Avg episode reward: [(0, '3.930'), (1, '8.940')] [2023-10-12 16:46:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000024288_24870912.pth... [2023-10-12 16:46:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000024288_24870912.pth... 
[2023-10-12 16:46:18,481][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000022720_23265280.pth [2023-10-12 16:46:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000022720_23265280.pth [2023-10-12 16:46:19,991][62634] Updated weights for policy 0, policy_version 24290 (0.0007) [2023-10-12 16:46:20,374][62634] Updated weights for policy 0, policy_version 24300 (0.0007) [2023-10-12 16:46:20,750][62634] Updated weights for policy 0, policy_version 24310 (0.0007) [2023-10-12 16:46:21,119][62634] Updated weights for policy 0, policy_version 24320 (0.0008) [2023-10-12 16:46:22,078][62635] Updated weights for policy 1, policy_version 24290 (0.0009) [2023-10-12 16:46:22,443][62635] Updated weights for policy 1, policy_version 24300 (0.0008) [2023-10-12 16:46:22,817][62635] Updated weights for policy 1, policy_version 24310 (0.0008) [2023-10-12 16:46:23,189][62635] Updated weights for policy 1, policy_version 24320 (0.0008) [2023-10-12 16:46:23,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49807360. Throughput: 0: 1673.1, 1: 1678.9. Samples: 12453534. Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) [2023-10-12 16:46:23,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.960')] [2023-10-12 16:46:24,970][62634] Updated weights for policy 0, policy_version 24330 (0.0009) [2023-10-12 16:46:25,349][62634] Updated weights for policy 0, policy_version 24340 (0.0008) [2023-10-12 16:46:25,726][62634] Updated weights for policy 0, policy_version 24350 (0.0010) [2023-10-12 16:46:27,163][62635] Updated weights for policy 1, policy_version 24330 (0.0007) [2023-10-12 16:46:27,539][62635] Updated weights for policy 1, policy_version 24340 (0.0007) [2023-10-12 16:46:27,900][62635] Updated weights for policy 1, policy_version 24350 (0.0009) [2023-10-12 16:46:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49872896. 
Throughput: 0: 1685.7, 1: 1674.3. Samples: 12473800. Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) [2023-10-12 16:46:28,435][61643] Avg episode reward: [(0, '3.900'), (1, '8.980')] [2023-10-12 16:46:29,836][62634] Updated weights for policy 0, policy_version 24360 (0.0008) [2023-10-12 16:46:30,210][62634] Updated weights for policy 0, policy_version 24370 (0.0009) [2023-10-12 16:46:30,589][62634] Updated weights for policy 0, policy_version 24380 (0.0007) [2023-10-12 16:46:31,934][62635] Updated weights for policy 1, policy_version 24360 (0.0010) [2023-10-12 16:46:32,294][62635] Updated weights for policy 1, policy_version 24370 (0.0008) [2023-10-12 16:46:32,663][62635] Updated weights for policy 1, policy_version 24380 (0.0007) [2023-10-12 16:46:33,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49938432. Throughput: 0: 1698.9, 1: 1665.9. Samples: 12493612. Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) [2023-10-12 16:46:33,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.930')] [2023-10-12 16:46:34,674][62634] Updated weights for policy 0, policy_version 24390 (0.0009) [2023-10-12 16:46:35,052][62634] Updated weights for policy 0, policy_version 24400 (0.0007) [2023-10-12 16:46:35,434][62634] Updated weights for policy 0, policy_version 24410 (0.0010) [2023-10-12 16:46:36,563][62635] Updated weights for policy 1, policy_version 24390 (0.0008) [2023-10-12 16:46:36,936][62635] Updated weights for policy 1, policy_version 24400 (0.0009) [2023-10-12 16:46:37,310][62635] Updated weights for policy 1, policy_version 24410 (0.0011) [2023-10-12 16:46:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 50003968. Throughput: 0: 1672.8, 1: 1683.2. Samples: 12504196. 
Policy #0 lag: (min: 16.0, avg: 34.8, max: 48.0) [2023-10-12 16:46:38,436][61643] Avg episode reward: [(0, '3.920'), (1, '9.130')] [2023-10-12 16:46:39,518][62634] Updated weights for policy 0, policy_version 24420 (0.0009) [2023-10-12 16:46:39,902][62634] Updated weights for policy 0, policy_version 24430 (0.0009) [2023-10-12 16:46:40,275][62634] Updated weights for policy 0, policy_version 24440 (0.0008) [2023-10-12 16:46:41,319][62635] Updated weights for policy 1, policy_version 24420 (0.0010) [2023-10-12 16:46:41,684][62635] Updated weights for policy 1, policy_version 24430 (0.0010) [2023-10-12 16:46:42,045][62635] Updated weights for policy 1, policy_version 24440 (0.0007) [2023-10-12 16:46:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50069504. Throughput: 0: 1694.2, 1: 1665.3. Samples: 12524112. Policy #0 lag: (min: 16.0, avg: 34.8, max: 48.0) [2023-10-12 16:46:43,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.020')] [2023-10-12 16:46:44,229][62634] Updated weights for policy 0, policy_version 24450 (0.0007) [2023-10-12 16:46:44,615][62634] Updated weights for policy 0, policy_version 24460 (0.0009) [2023-10-12 16:46:44,993][62634] Updated weights for policy 0, policy_version 24470 (0.0008) [2023-10-12 16:46:45,374][62634] Updated weights for policy 0, policy_version 24480 (0.0010) [2023-10-12 16:46:46,158][62635] Updated weights for policy 1, policy_version 24450 (0.0007) [2023-10-12 16:46:46,533][62635] Updated weights for policy 1, policy_version 24460 (0.0007) [2023-10-12 16:46:46,907][62635] Updated weights for policy 1, policy_version 24470 (0.0007) [2023-10-12 16:46:47,276][62635] Updated weights for policy 1, policy_version 24480 (0.0008) [2023-10-12 16:46:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50135040. Throughput: 0: 1690.0, 1: 1674.8. Samples: 12544318. 
Policy #0 lag: (min: 16.0, avg: 34.8, max: 48.0) [2023-10-12 16:46:48,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.040')] [2023-10-12 16:46:49,392][62634] Updated weights for policy 0, policy_version 24490 (0.0008) [2023-10-12 16:46:49,771][62634] Updated weights for policy 0, policy_version 24500 (0.0008) [2023-10-12 16:46:50,150][62634] Updated weights for policy 0, policy_version 24510 (0.0007) [2023-10-12 16:46:51,353][62635] Updated weights for policy 1, policy_version 24490 (0.0008) [2023-10-12 16:46:51,720][62635] Updated weights for policy 1, policy_version 24500 (0.0008) [2023-10-12 16:46:52,093][62635] Updated weights for policy 1, policy_version 24510 (0.0009) [2023-10-12 16:46:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50200576. Throughput: 0: 1674.2, 1: 1683.1. Samples: 12554564. Policy #0 lag: (min: 16.0, avg: 34.8, max: 48.0) [2023-10-12 16:46:53,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.900')] [2023-10-12 16:46:54,274][62634] Updated weights for policy 0, policy_version 24520 (0.0007) [2023-10-12 16:46:54,662][62634] Updated weights for policy 0, policy_version 24530 (0.0007) [2023-10-12 16:46:55,035][62634] Updated weights for policy 0, policy_version 24540 (0.0009) [2023-10-12 16:46:56,143][62635] Updated weights for policy 1, policy_version 24520 (0.0009) [2023-10-12 16:46:56,505][62635] Updated weights for policy 1, policy_version 24530 (0.0009) [2023-10-12 16:46:56,875][62635] Updated weights for policy 1, policy_version 24540 (0.0007) [2023-10-12 16:46:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50266112. Throughput: 0: 1684.9, 1: 1659.1. Samples: 12573954. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:46:58,436][61643] Avg episode reward: [(0, '3.910'), (1, '8.960')] [2023-10-12 16:46:58,982][62634] Updated weights for policy 0, policy_version 24550 (0.0010) [2023-10-12 16:46:59,363][62634] Updated weights for policy 0, policy_version 24560 (0.0010) [2023-10-12 16:46:59,739][62634] Updated weights for policy 0, policy_version 24570 (0.0009) [2023-10-12 16:47:00,941][62635] Updated weights for policy 1, policy_version 24550 (0.0008) [2023-10-12 16:47:01,313][62635] Updated weights for policy 1, policy_version 24560 (0.0009) [2023-10-12 16:47:01,698][62635] Updated weights for policy 1, policy_version 24570 (0.0010) [2023-10-12 16:47:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50331648. Throughput: 0: 1686.2, 1: 1681.9. Samples: 12594740. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:47:03,435][61643] Avg episode reward: [(0, '3.920'), (1, '9.060')] [2023-10-12 16:47:03,860][62634] Updated weights for policy 0, policy_version 24580 (0.0008) [2023-10-12 16:47:04,269][62634] Updated weights for policy 0, policy_version 24590 (0.0007) [2023-10-12 16:47:04,644][62634] Updated weights for policy 0, policy_version 24600 (0.0007) [2023-10-12 16:47:05,697][62635] Updated weights for policy 1, policy_version 24580 (0.0008) [2023-10-12 16:47:06,059][62635] Updated weights for policy 1, policy_version 24590 (0.0010) [2023-10-12 16:47:06,432][62635] Updated weights for policy 1, policy_version 24600 (0.0010) [2023-10-12 16:47:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50397184. Throughput: 0: 1676.8, 1: 1676.1. Samples: 12604410. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:47:08,435][61643] Avg episode reward: [(0, '3.920'), (1, '8.780')] [2023-10-12 16:47:08,509][62634] Updated weights for policy 0, policy_version 24610 (0.0007) [2023-10-12 16:47:08,883][62634] Updated weights for policy 0, policy_version 24620 (0.0008) [2023-10-12 16:47:09,264][62634] Updated weights for policy 0, policy_version 24630 (0.0010) [2023-10-12 16:47:09,637][62634] Updated weights for policy 0, policy_version 24640 (0.0010) [2023-10-12 16:47:10,505][62635] Updated weights for policy 1, policy_version 24610 (0.0008) [2023-10-12 16:47:10,875][62635] Updated weights for policy 1, policy_version 24620 (0.0007) [2023-10-12 16:47:11,230][62635] Updated weights for policy 1, policy_version 24630 (0.0007) [2023-10-12 16:47:11,603][62635] Updated weights for policy 1, policy_version 24640 (0.0008) [2023-10-12 16:47:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50462720. Throughput: 0: 1686.1, 1: 1663.4. Samples: 12624528. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:47:13,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.920')] [2023-10-12 16:47:13,705][62634] Updated weights for policy 0, policy_version 24650 (0.0009) [2023-10-12 16:47:14,090][62634] Updated weights for policy 0, policy_version 24660 (0.0009) [2023-10-12 16:47:14,461][62634] Updated weights for policy 0, policy_version 24670 (0.0007) [2023-10-12 16:47:15,801][62635] Updated weights for policy 1, policy_version 24650 (0.0009) [2023-10-12 16:47:16,167][62635] Updated weights for policy 1, policy_version 24660 (0.0008) [2023-10-12 16:47:16,539][62635] Updated weights for policy 1, policy_version 24670 (0.0009) [2023-10-12 16:47:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 50528256. Throughput: 0: 1684.0, 1: 1686.5. Samples: 12645288. 
Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) [2023-10-12 16:47:18,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.200')] [2023-10-12 16:47:18,690][62634] Updated weights for policy 0, policy_version 24680 (0.0008) [2023-10-12 16:47:19,064][62634] Updated weights for policy 0, policy_version 24690 (0.0009) [2023-10-12 16:47:19,440][62634] Updated weights for policy 0, policy_version 24700 (0.0009) [2023-10-12 16:47:20,543][62635] Updated weights for policy 1, policy_version 24680 (0.0009) [2023-10-12 16:47:20,905][62635] Updated weights for policy 1, policy_version 24690 (0.0007) [2023-10-12 16:47:21,278][62635] Updated weights for policy 1, policy_version 24700 (0.0007) [2023-10-12 16:47:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 50593792. Throughput: 0: 1683.5, 1: 1663.6. Samples: 12654812. Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) [2023-10-12 16:47:23,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.030')] [2023-10-12 16:47:23,474][62634] Updated weights for policy 0, policy_version 24710 (0.0008) [2023-10-12 16:47:23,859][62634] Updated weights for policy 0, policy_version 24720 (0.0010) [2023-10-12 16:47:24,225][62634] Updated weights for policy 0, policy_version 24730 (0.0007) [2023-10-12 16:47:25,479][62635] Updated weights for policy 1, policy_version 24710 (0.0009) [2023-10-12 16:47:25,849][62635] Updated weights for policy 1, policy_version 24720 (0.0012) [2023-10-12 16:47:26,214][62635] Updated weights for policy 1, policy_version 24730 (0.0009) [2023-10-12 16:47:28,365][62634] Updated weights for policy 0, policy_version 24740 (0.0008) [2023-10-12 16:47:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50659328. Throughput: 0: 1677.8, 1: 1671.7. Samples: 12674838. 
Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) [2023-10-12 16:47:28,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.890')] [2023-10-12 16:47:28,734][62634] Updated weights for policy 0, policy_version 24750 (0.0007) [2023-10-12 16:47:29,108][62634] Updated weights for policy 0, policy_version 24760 (0.0007) [2023-10-12 16:47:30,433][62635] Updated weights for policy 1, policy_version 24740 (0.0009) [2023-10-12 16:47:30,802][62635] Updated weights for policy 1, policy_version 24750 (0.0011) [2023-10-12 16:47:31,164][62635] Updated weights for policy 1, policy_version 24760 (0.0009) [2023-10-12 16:47:33,149][62634] Updated weights for policy 0, policy_version 24770 (0.0007) [2023-10-12 16:47:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50724864. Throughput: 0: 1681.2, 1: 1676.3. Samples: 12695408. Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) [2023-10-12 16:47:33,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.920')] [2023-10-12 16:47:33,518][62634] Updated weights for policy 0, policy_version 24780 (0.0010) [2023-10-12 16:47:33,892][62634] Updated weights for policy 0, policy_version 24790 (0.0011) [2023-10-12 16:47:34,260][62634] Updated weights for policy 0, policy_version 24800 (0.0009) [2023-10-12 16:47:35,352][62635] Updated weights for policy 1, policy_version 24770 (0.0009) [2023-10-12 16:47:35,760][62635] Updated weights for policy 1, policy_version 24780 (0.0008) [2023-10-12 16:47:36,124][62635] Updated weights for policy 1, policy_version 24790 (0.0009) [2023-10-12 16:47:36,498][62635] Updated weights for policy 1, policy_version 24800 (0.0010) [2023-10-12 16:47:38,400][62634] Updated weights for policy 0, policy_version 24810 (0.0008) [2023-10-12 16:47:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50790400. Throughput: 0: 1677.9, 1: 1658.8. Samples: 12704714. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-12 16:47:38,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.040')] [2023-10-12 16:47:38,780][62634] Updated weights for policy 0, policy_version 24820 (0.0009) [2023-10-12 16:47:39,150][62634] Updated weights for policy 0, policy_version 24830 (0.0009) [2023-10-12 16:47:40,484][62635] Updated weights for policy 1, policy_version 24810 (0.0007) [2023-10-12 16:47:40,856][62635] Updated weights for policy 1, policy_version 24820 (0.0007) [2023-10-12 16:47:41,228][62635] Updated weights for policy 1, policy_version 24830 (0.0007) [2023-10-12 16:47:43,269][62634] Updated weights for policy 0, policy_version 24840 (0.0010) [2023-10-12 16:47:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50855936. Throughput: 0: 1681.6, 1: 1670.2. Samples: 12724784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-12 16:47:43,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.830')] [2023-10-12 16:47:43,643][62634] Updated weights for policy 0, policy_version 24850 (0.0010) [2023-10-12 16:47:44,030][62634] Updated weights for policy 0, policy_version 24860 (0.0011) [2023-10-12 16:47:45,274][62635] Updated weights for policy 1, policy_version 24840 (0.0007) [2023-10-12 16:47:45,642][62635] Updated weights for policy 1, policy_version 24850 (0.0009) [2023-10-12 16:47:46,013][62635] Updated weights for policy 1, policy_version 24860 (0.0009) [2023-10-12 16:47:47,953][62634] Updated weights for policy 0, policy_version 24870 (0.0010) [2023-10-12 16:47:48,336][62634] Updated weights for policy 0, policy_version 24880 (0.0008) [2023-10-12 16:47:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50921472. Throughput: 0: 1676.3, 1: 1671.8. Samples: 12745404. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-12 16:47:48,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.080')] [2023-10-12 16:47:48,719][62634] Updated weights for policy 0, policy_version 24890 (0.0009) [2023-10-12 16:47:50,019][62635] Updated weights for policy 1, policy_version 24870 (0.0008) [2023-10-12 16:47:50,385][62635] Updated weights for policy 1, policy_version 24880 (0.0008) [2023-10-12 16:47:50,756][62635] Updated weights for policy 1, policy_version 24890 (0.0008) [2023-10-12 16:47:52,829][62634] Updated weights for policy 0, policy_version 24900 (0.0008) [2023-10-12 16:47:53,216][62634] Updated weights for policy 0, policy_version 24910 (0.0009) [2023-10-12 16:47:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50987008. Throughput: 0: 1685.1, 1: 1656.9. Samples: 12754804. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-12 16:47:53,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.770')] [2023-10-12 16:47:53,586][62634] Updated weights for policy 0, policy_version 24920 (0.0009) [2023-10-12 16:47:54,734][62635] Updated weights for policy 1, policy_version 24900 (0.0009) [2023-10-12 16:47:55,107][62635] Updated weights for policy 1, policy_version 24910 (0.0008) [2023-10-12 16:47:55,467][62635] Updated weights for policy 1, policy_version 24920 (0.0008) [2023-10-12 16:47:57,585][62634] Updated weights for policy 0, policy_version 24930 (0.0009) [2023-10-12 16:47:57,972][62634] Updated weights for policy 0, policy_version 24940 (0.0009) [2023-10-12 16:47:58,346][62634] Updated weights for policy 0, policy_version 24950 (0.0009) [2023-10-12 16:47:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 51052544. Throughput: 0: 1677.2, 1: 1678.8. Samples: 12775548. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:47:58,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.850')] [2023-10-12 16:47:58,725][62634] Updated weights for policy 0, policy_version 24960 (0.0010) [2023-10-12 16:47:59,528][62635] Updated weights for policy 1, policy_version 24930 (0.0008) [2023-10-12 16:47:59,891][62635] Updated weights for policy 1, policy_version 24940 (0.0010) [2023-10-12 16:48:00,269][62635] Updated weights for policy 1, policy_version 24950 (0.0009) [2023-10-12 16:48:00,628][62635] Updated weights for policy 1, policy_version 24960 (0.0007) [2023-10-12 16:48:02,860][62634] Updated weights for policy 0, policy_version 24970 (0.0007) [2023-10-12 16:48:03,232][62634] Updated weights for policy 0, policy_version 24980 (0.0008) [2023-10-12 16:48:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 51118080. Throughput: 0: 1666.0, 1: 1681.7. Samples: 12795936. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:48:03,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.690')] [2023-10-12 16:48:03,611][62634] Updated weights for policy 0, policy_version 24990 (0.0007) [2023-10-12 16:48:04,635][62635] Updated weights for policy 1, policy_version 24970 (0.0007) [2023-10-12 16:48:05,003][62635] Updated weights for policy 1, policy_version 24980 (0.0007) [2023-10-12 16:48:05,374][62635] Updated weights for policy 1, policy_version 24990 (0.0007) [2023-10-12 16:48:07,595][62634] Updated weights for policy 0, policy_version 25000 (0.0010) [2023-10-12 16:48:07,977][62634] Updated weights for policy 0, policy_version 25010 (0.0010) [2023-10-12 16:48:08,350][62634] Updated weights for policy 0, policy_version 25020 (0.0009) [2023-10-12 16:48:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 51183616. Throughput: 0: 1679.5, 1: 1670.1. Samples: 12805544. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 16:48:08,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.770')] [2023-10-12 16:48:09,630][62635] Updated weights for policy 1, policy_version 25000 (0.0007) [2023-10-12 16:48:09,997][62635] Updated weights for policy 1, policy_version 25010 (0.0008) [2023-10-12 16:48:10,367][62635] Updated weights for policy 1, policy_version 25020 (0.0007) [2023-10-12 16:48:12,333][62634] Updated weights for policy 0, policy_version 25030 (0.0009) [2023-10-12 16:48:12,708][62634] Updated weights for policy 0, policy_version 25040 (0.0009) [2023-10-12 16:48:13,089][62634] Updated weights for policy 0, policy_version 25050 (0.0007) [2023-10-12 16:48:13,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 51281920. Throughput: 0: 1684.1, 1: 1679.5. Samples: 12826200. Policy #0 lag: (min: 1.0, avg: 17.2, max: 33.0) [2023-10-12 16:48:13,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.820')] [2023-10-12 16:48:14,395][62635] Updated weights for policy 1, policy_version 25030 (0.0007) [2023-10-12 16:48:14,756][62635] Updated weights for policy 1, policy_version 25040 (0.0007) [2023-10-12 16:48:15,129][62635] Updated weights for policy 1, policy_version 25050 (0.0010) [2023-10-12 16:48:17,047][62634] Updated weights for policy 0, policy_version 25060 (0.0010) [2023-10-12 16:48:17,424][62634] Updated weights for policy 0, policy_version 25070 (0.0009) [2023-10-12 16:48:17,799][62634] Updated weights for policy 0, policy_version 25080 (0.0009) [2023-10-12 16:48:18,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51347456. Throughput: 0: 1657.3, 1: 1686.4. Samples: 12845872. Policy #0 lag: (min: 1.0, avg: 17.2, max: 33.0) [2023-10-12 16:48:18,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.100')] [2023-10-12 16:48:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000025056_25657344.pth... 
[2023-10-12 16:48:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000025088_25690112.pth... [2023-10-12 16:48:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000023520_24084480.pth [2023-10-12 16:48:18,486][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000023488_24051712.pth [2023-10-12 16:48:19,106][62635] Updated weights for policy 1, policy_version 25060 (0.0007) [2023-10-12 16:48:19,475][62635] Updated weights for policy 1, policy_version 25070 (0.0008) [2023-10-12 16:48:19,839][62635] Updated weights for policy 1, policy_version 25080 (0.0008) [2023-10-12 16:48:21,880][62634] Updated weights for policy 0, policy_version 25090 (0.0009) [2023-10-12 16:48:22,258][62634] Updated weights for policy 0, policy_version 25100 (0.0008) [2023-10-12 16:48:22,632][62634] Updated weights for policy 0, policy_version 25110 (0.0008) [2023-10-12 16:48:23,007][62634] Updated weights for policy 0, policy_version 25120 (0.0007) [2023-10-12 16:48:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51412992. Throughput: 0: 1686.9, 1: 1677.7. Samples: 12856124. 
Policy #0 lag: (min: 1.0, avg: 17.2, max: 33.0) [2023-10-12 16:48:23,436][61643] Avg episode reward: [(0, '3.920'), (1, '8.870')] [2023-10-12 16:48:23,848][62635] Updated weights for policy 1, policy_version 25090 (0.0008) [2023-10-12 16:48:24,223][62635] Updated weights for policy 1, policy_version 25100 (0.0009) [2023-10-12 16:48:24,595][62635] Updated weights for policy 1, policy_version 25110 (0.0009) [2023-10-12 16:48:24,969][62635] Updated weights for policy 1, policy_version 25120 (0.0008) [2023-10-12 16:48:27,065][62634] Updated weights for policy 0, policy_version 25130 (0.0008) [2023-10-12 16:48:27,443][62634] Updated weights for policy 0, policy_version 25140 (0.0007) [2023-10-12 16:48:27,826][62634] Updated weights for policy 0, policy_version 25150 (0.0007) [2023-10-12 16:48:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51478528. Throughput: 0: 1685.2, 1: 1695.6. Samples: 12876920. Policy #0 lag: (min: 1.0, avg: 17.2, max: 33.0) [2023-10-12 16:48:28,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.920')] [2023-10-12 16:48:28,865][62635] Updated weights for policy 1, policy_version 25130 (0.0007) [2023-10-12 16:48:29,230][62635] Updated weights for policy 1, policy_version 25140 (0.0007) [2023-10-12 16:48:29,601][62635] Updated weights for policy 1, policy_version 25150 (0.0007) [2023-10-12 16:48:31,874][62634] Updated weights for policy 0, policy_version 25160 (0.0008) [2023-10-12 16:48:32,244][62634] Updated weights for policy 0, policy_version 25170 (0.0007) [2023-10-12 16:48:32,627][62634] Updated weights for policy 0, policy_version 25180 (0.0007) [2023-10-12 16:48:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 51544064. Throughput: 0: 1665.4, 1: 1699.4. Samples: 12896820. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:48:33,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.210')] [2023-10-12 16:48:33,534][62635] Updated weights for policy 1, policy_version 25160 (0.0010) [2023-10-12 16:48:33,903][62635] Updated weights for policy 1, policy_version 25170 (0.0007) [2023-10-12 16:48:34,268][62635] Updated weights for policy 1, policy_version 25180 (0.0007) [2023-10-12 16:48:36,823][62634] Updated weights for policy 0, policy_version 25190 (0.0008) [2023-10-12 16:48:37,204][62634] Updated weights for policy 0, policy_version 25200 (0.0010) [2023-10-12 16:48:37,581][62634] Updated weights for policy 0, policy_version 25210 (0.0011) [2023-10-12 16:48:38,417][62635] Updated weights for policy 1, policy_version 25190 (0.0008) [2023-10-12 16:48:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51609600. Throughput: 0: 1684.8, 1: 1699.2. Samples: 12907082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:48:38,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.080')] [2023-10-12 16:48:38,781][62635] Updated weights for policy 1, policy_version 25200 (0.0009) [2023-10-12 16:48:39,156][62635] Updated weights for policy 1, policy_version 25210 (0.0008) [2023-10-12 16:48:41,670][62634] Updated weights for policy 0, policy_version 25220 (0.0008) [2023-10-12 16:48:42,068][62634] Updated weights for policy 0, policy_version 25230 (0.0007) [2023-10-12 16:48:42,434][62634] Updated weights for policy 0, policy_version 25240 (0.0007) [2023-10-12 16:48:43,071][62635] Updated weights for policy 1, policy_version 25220 (0.0008) [2023-10-12 16:48:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51675136. Throughput: 0: 1677.4, 1: 1694.0. Samples: 12927262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:48:43,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.040')] [2023-10-12 16:48:43,438][62635] Updated weights for policy 1, policy_version 25230 (0.0009) [2023-10-12 16:48:43,809][62635] Updated weights for policy 1, policy_version 25240 (0.0009) [2023-10-12 16:48:46,463][62634] Updated weights for policy 0, policy_version 25250 (0.0009) [2023-10-12 16:48:46,852][62634] Updated weights for policy 0, policy_version 25260 (0.0009) [2023-10-12 16:48:47,221][62634] Updated weights for policy 0, policy_version 25270 (0.0007) [2023-10-12 16:48:47,599][62634] Updated weights for policy 0, policy_version 25280 (0.0008) [2023-10-12 16:48:47,930][62635] Updated weights for policy 1, policy_version 25250 (0.0008) [2023-10-12 16:48:48,300][62635] Updated weights for policy 1, policy_version 25260 (0.0010) [2023-10-12 16:48:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51740672. Throughput: 0: 1669.9, 1: 1688.5. Samples: 12947064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:48:48,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.850')] [2023-10-12 16:48:48,676][62635] Updated weights for policy 1, policy_version 25270 (0.0009) [2023-10-12 16:48:49,053][62635] Updated weights for policy 1, policy_version 25280 (0.0008) [2023-10-12 16:48:51,704][62634] Updated weights for policy 0, policy_version 25290 (0.0009) [2023-10-12 16:48:52,068][62634] Updated weights for policy 0, policy_version 25300 (0.0009) [2023-10-12 16:48:52,442][62634] Updated weights for policy 0, policy_version 25310 (0.0008) [2023-10-12 16:48:53,187][62635] Updated weights for policy 1, policy_version 25290 (0.0007) [2023-10-12 16:48:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 51806208. Throughput: 0: 1683.9, 1: 1695.6. Samples: 12957624. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:48:53,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.980')] [2023-10-12 16:48:53,551][62635] Updated weights for policy 1, policy_version 25300 (0.0008) [2023-10-12 16:48:53,912][62635] Updated weights for policy 1, policy_version 25310 (0.0008) [2023-10-12 16:48:56,472][62634] Updated weights for policy 0, policy_version 25320 (0.0008) [2023-10-12 16:48:56,852][62634] Updated weights for policy 0, policy_version 25330 (0.0008) [2023-10-12 16:48:57,219][62634] Updated weights for policy 0, policy_version 25340 (0.0008) [2023-10-12 16:48:57,949][62635] Updated weights for policy 1, policy_version 25320 (0.0009) [2023-10-12 16:48:58,311][62635] Updated weights for policy 1, policy_version 25330 (0.0008) [2023-10-12 16:48:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51871744. Throughput: 0: 1665.9, 1: 1700.4. Samples: 12977684. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:48:58,435][61643] Avg episode reward: [(0, '3.910'), (1, '9.020')] [2023-10-12 16:48:58,689][62635] Updated weights for policy 1, policy_version 25340 (0.0007) [2023-10-12 16:49:01,139][62634] Updated weights for policy 0, policy_version 25350 (0.0009) [2023-10-12 16:49:01,520][62634] Updated weights for policy 0, policy_version 25360 (0.0007) [2023-10-12 16:49:01,894][62634] Updated weights for policy 0, policy_version 25370 (0.0010) [2023-10-12 16:49:02,655][62635] Updated weights for policy 1, policy_version 25350 (0.0008) [2023-10-12 16:49:03,036][62635] Updated weights for policy 1, policy_version 25360 (0.0008) [2023-10-12 16:49:03,407][62635] Updated weights for policy 1, policy_version 25370 (0.0008) [2023-10-12 16:49:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51937280. Throughput: 0: 1684.8, 1: 1691.9. Samples: 12997824. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:49:03,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.890')] [2023-10-12 16:49:05,998][62634] Updated weights for policy 0, policy_version 25380 (0.0008) [2023-10-12 16:49:06,377][62634] Updated weights for policy 0, policy_version 25390 (0.0008) [2023-10-12 16:49:06,752][62634] Updated weights for policy 0, policy_version 25400 (0.0009) [2023-10-12 16:49:07,377][62635] Updated weights for policy 1, policy_version 25380 (0.0007) [2023-10-12 16:49:07,738][62635] Updated weights for policy 1, policy_version 25390 (0.0010) [2023-10-12 16:49:08,116][62635] Updated weights for policy 1, policy_version 25400 (0.0010) [2023-10-12 16:49:08,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 52035584. Throughput: 0: 1686.4, 1: 1707.6. Samples: 13008854. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 16:49:08,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.040')] [2023-10-12 16:49:10,732][62634] Updated weights for policy 0, policy_version 25410 (0.0009) [2023-10-12 16:49:11,113][62634] Updated weights for policy 0, policy_version 25420 (0.0011) [2023-10-12 16:49:11,489][62634] Updated weights for policy 0, policy_version 25430 (0.0009) [2023-10-12 16:49:11,860][62634] Updated weights for policy 0, policy_version 25440 (0.0008) [2023-10-12 16:49:12,279][62635] Updated weights for policy 1, policy_version 25410 (0.0009) [2023-10-12 16:49:12,681][62635] Updated weights for policy 1, policy_version 25420 (0.0009) [2023-10-12 16:49:13,049][62635] Updated weights for policy 1, policy_version 25430 (0.0010) [2023-10-12 16:49:13,416][62635] Updated weights for policy 1, policy_version 25440 (0.0007) [2023-10-12 16:49:13,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52101120. Throughput: 0: 1661.5, 1: 1706.6. Samples: 13028486. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:49:13,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.210')] [2023-10-12 16:49:15,730][62634] Updated weights for policy 0, policy_version 25450 (0.0008) [2023-10-12 16:49:16,103][62634] Updated weights for policy 0, policy_version 25460 (0.0009) [2023-10-12 16:49:16,481][62634] Updated weights for policy 0, policy_version 25470 (0.0008) [2023-10-12 16:49:17,496][62635] Updated weights for policy 1, policy_version 25450 (0.0008) [2023-10-12 16:49:17,861][62635] Updated weights for policy 1, policy_version 25460 (0.0008) [2023-10-12 16:49:18,218][62635] Updated weights for policy 1, policy_version 25470 (0.0008) [2023-10-12 16:49:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52166656. Throughput: 0: 1690.3, 1: 1677.8. Samples: 13048388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:49:18,436][61643] Avg episode reward: [(0, '3.980'), (1, '8.990')] [2023-10-12 16:49:20,588][62634] Updated weights for policy 0, policy_version 25480 (0.0009) [2023-10-12 16:49:20,974][62634] Updated weights for policy 0, policy_version 25490 (0.0008) [2023-10-12 16:49:21,348][62634] Updated weights for policy 0, policy_version 25500 (0.0008) [2023-10-12 16:49:22,332][62635] Updated weights for policy 1, policy_version 25480 (0.0010) [2023-10-12 16:49:22,692][62635] Updated weights for policy 1, policy_version 25490 (0.0008) [2023-10-12 16:49:23,056][62635] Updated weights for policy 1, policy_version 25500 (0.0007) [2023-10-12 16:49:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 52232192. Throughput: 0: 1678.8, 1: 1697.6. Samples: 13059024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:49:23,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.080')] [2023-10-12 16:49:25,263][62634] Updated weights for policy 0, policy_version 25510 (0.0007) [2023-10-12 16:49:25,645][62634] Updated weights for policy 0, policy_version 25520 (0.0008) [2023-10-12 16:49:26,026][62634] Updated weights for policy 0, policy_version 25530 (0.0007) [2023-10-12 16:49:27,174][62635] Updated weights for policy 1, policy_version 25510 (0.0009) [2023-10-12 16:49:27,550][62635] Updated weights for policy 1, policy_version 25520 (0.0010) [2023-10-12 16:49:27,904][62635] Updated weights for policy 1, policy_version 25530 (0.0011) [2023-10-12 16:49:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52297728. Throughput: 0: 1681.5, 1: 1693.8. Samples: 13079148. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:49:28,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.300')] [2023-10-12 16:49:28,436][62495] Saving new best policy, reward=9.300! [2023-10-12 16:49:30,098][62634] Updated weights for policy 0, policy_version 25540 (0.0007) [2023-10-12 16:49:30,484][62634] Updated weights for policy 0, policy_version 25550 (0.0008) [2023-10-12 16:49:30,860][62634] Updated weights for policy 0, policy_version 25560 (0.0007) [2023-10-12 16:49:31,834][62635] Updated weights for policy 1, policy_version 25540 (0.0009) [2023-10-12 16:49:32,204][62635] Updated weights for policy 1, policy_version 25550 (0.0007) [2023-10-12 16:49:32,579][62635] Updated weights for policy 1, policy_version 25560 (0.0007) [2023-10-12 16:49:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52363264. Throughput: 0: 1700.1, 1: 1670.3. Samples: 13098732. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:49:33,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.140')] [2023-10-12 16:49:34,852][62634] Updated weights for policy 0, policy_version 25570 (0.0007) [2023-10-12 16:49:35,229][62634] Updated weights for policy 0, policy_version 25580 (0.0008) [2023-10-12 16:49:35,617][62634] Updated weights for policy 0, policy_version 25590 (0.0007) [2023-10-12 16:49:35,992][62634] Updated weights for policy 0, policy_version 25600 (0.0009) [2023-10-12 16:49:36,462][62635] Updated weights for policy 1, policy_version 25570 (0.0008) [2023-10-12 16:49:36,828][62635] Updated weights for policy 1, policy_version 25580 (0.0008) [2023-10-12 16:49:37,198][62635] Updated weights for policy 1, policy_version 25590 (0.0009) [2023-10-12 16:49:37,565][62635] Updated weights for policy 1, policy_version 25600 (0.0007) [2023-10-12 16:49:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52428800. Throughput: 0: 1675.9, 1: 1693.5. Samples: 13109244. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:49:38,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.100')] [2023-10-12 16:49:40,029][62634] Updated weights for policy 0, policy_version 25610 (0.0009) [2023-10-12 16:49:40,392][62634] Updated weights for policy 0, policy_version 25620 (0.0011) [2023-10-12 16:49:40,777][62634] Updated weights for policy 0, policy_version 25630 (0.0010) [2023-10-12 16:49:41,667][62635] Updated weights for policy 1, policy_version 25610 (0.0008) [2023-10-12 16:49:42,039][62635] Updated weights for policy 1, policy_version 25620 (0.0008) [2023-10-12 16:49:42,397][62635] Updated weights for policy 1, policy_version 25630 (0.0007) [2023-10-12 16:49:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52494336. Throughput: 0: 1692.2, 1: 1673.1. Samples: 13129122. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:49:43,435][61643] Avg episode reward: [(0, '3.940'), (1, '9.160')] [2023-10-12 16:49:44,849][62634] Updated weights for policy 0, policy_version 25640 (0.0009) [2023-10-12 16:49:45,234][62634] Updated weights for policy 0, policy_version 25650 (0.0008) [2023-10-12 16:49:45,619][62634] Updated weights for policy 0, policy_version 25660 (0.0007) [2023-10-12 16:49:46,582][62635] Updated weights for policy 1, policy_version 25640 (0.0010) [2023-10-12 16:49:46,957][62635] Updated weights for policy 1, policy_version 25650 (0.0007) [2023-10-12 16:49:47,336][62635] Updated weights for policy 1, policy_version 25660 (0.0009) [2023-10-12 16:49:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52559872. Throughput: 0: 1694.2, 1: 1669.5. Samples: 13149190. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:49:48,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.030')] [2023-10-12 16:49:49,722][62634] Updated weights for policy 0, policy_version 25670 (0.0008) [2023-10-12 16:49:50,093][62634] Updated weights for policy 0, policy_version 25680 (0.0011) [2023-10-12 16:49:50,475][62634] Updated weights for policy 0, policy_version 25690 (0.0008) [2023-10-12 16:49:51,396][62635] Updated weights for policy 1, policy_version 25670 (0.0009) [2023-10-12 16:49:51,764][62635] Updated weights for policy 1, policy_version 25680 (0.0008) [2023-10-12 16:49:52,132][62635] Updated weights for policy 1, policy_version 25690 (0.0008) [2023-10-12 16:49:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52625408. Throughput: 0: 1666.7, 1: 1681.7. Samples: 13159534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:49:53,436][61643] Avg episode reward: [(0, '3.960'), (1, '9.030')] [2023-10-12 16:49:54,596][62634] Updated weights for policy 0, policy_version 25700 (0.0008) [2023-10-12 16:49:54,972][62634] Updated weights for policy 0, policy_version 25710 (0.0007) [2023-10-12 16:49:55,347][62634] Updated weights for policy 0, policy_version 25720 (0.0010) [2023-10-12 16:49:56,236][62635] Updated weights for policy 1, policy_version 25700 (0.0007) [2023-10-12 16:49:56,599][62635] Updated weights for policy 1, policy_version 25710 (0.0008) [2023-10-12 16:49:56,963][62635] Updated weights for policy 1, policy_version 25720 (0.0007) [2023-10-12 16:49:58,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52690944. Throughput: 0: 1690.4, 1: 1656.8. Samples: 13179108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:49:58,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.050')] [2023-10-12 16:49:59,378][62634] Updated weights for policy 0, policy_version 25730 (0.0008) [2023-10-12 16:49:59,746][62634] Updated weights for policy 0, policy_version 25740 (0.0010) [2023-10-12 16:50:00,123][62634] Updated weights for policy 0, policy_version 25750 (0.0010) [2023-10-12 16:50:00,501][62634] Updated weights for policy 0, policy_version 25760 (0.0010) [2023-10-12 16:50:01,047][62635] Updated weights for policy 1, policy_version 25730 (0.0008) [2023-10-12 16:50:01,475][62635] Updated weights for policy 1, policy_version 25740 (0.0010) [2023-10-12 16:50:01,846][62635] Updated weights for policy 1, policy_version 25750 (0.0009) [2023-10-12 16:50:02,214][62635] Updated weights for policy 1, policy_version 25760 (0.0009) [2023-10-12 16:50:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 52756480. Throughput: 0: 1682.7, 1: 1667.7. Samples: 13199152. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:03,435][61643] Avg episode reward: [(0, '3.930'), (1, '9.020')] [2023-10-12 16:50:04,585][62634] Updated weights for policy 0, policy_version 25770 (0.0008) [2023-10-12 16:50:04,963][62634] Updated weights for policy 0, policy_version 25780 (0.0007) [2023-10-12 16:50:05,347][62634] Updated weights for policy 0, policy_version 25790 (0.0009) [2023-10-12 16:50:06,405][62635] Updated weights for policy 1, policy_version 25770 (0.0011) [2023-10-12 16:50:06,780][62635] Updated weights for policy 1, policy_version 25780 (0.0009) [2023-10-12 16:50:07,145][62635] Updated weights for policy 1, policy_version 25790 (0.0009) [2023-10-12 16:50:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52822016. Throughput: 0: 1667.0, 1: 1675.8. Samples: 13209450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:08,436][61643] Avg episode reward: [(0, '3.930'), (1, '9.120')] [2023-10-12 16:50:09,432][62634] Updated weights for policy 0, policy_version 25800 (0.0008) [2023-10-12 16:50:09,806][62634] Updated weights for policy 0, policy_version 25810 (0.0007) [2023-10-12 16:50:10,181][62634] Updated weights for policy 0, policy_version 25820 (0.0007) [2023-10-12 16:50:11,155][62635] Updated weights for policy 1, policy_version 25800 (0.0007) [2023-10-12 16:50:11,523][62635] Updated weights for policy 1, policy_version 25810 (0.0007) [2023-10-12 16:50:11,880][62635] Updated weights for policy 1, policy_version 25820 (0.0007) [2023-10-12 16:50:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52887552. Throughput: 0: 1677.3, 1: 1652.7. Samples: 13228998. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:13,435][61643] Avg episode reward: [(0, '3.910'), (1, '8.940')] [2023-10-12 16:50:14,267][62634] Updated weights for policy 0, policy_version 25830 (0.0007) [2023-10-12 16:50:14,638][62634] Updated weights for policy 0, policy_version 25840 (0.0009) [2023-10-12 16:50:15,016][62634] Updated weights for policy 0, policy_version 25850 (0.0008) [2023-10-12 16:50:15,976][62635] Updated weights for policy 1, policy_version 25830 (0.0008) [2023-10-12 16:50:16,340][62635] Updated weights for policy 1, policy_version 25840 (0.0009) [2023-10-12 16:50:16,709][62635] Updated weights for policy 1, policy_version 25850 (0.0010) [2023-10-12 16:50:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52953088. Throughput: 0: 1678.1, 1: 1677.9. Samples: 13249752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:18,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.760')] [2023-10-12 16:50:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000025856_26476544.pth... [2023-10-12 16:50:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000025856_26476544.pth... 
[2023-10-12 16:50:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000024288_24870912.pth [2023-10-12 16:50:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000024288_24870912.pth [2023-10-12 16:50:19,184][62634] Updated weights for policy 0, policy_version 25860 (0.0007) [2023-10-12 16:50:19,568][62634] Updated weights for policy 0, policy_version 25870 (0.0007) [2023-10-12 16:50:19,943][62634] Updated weights for policy 0, policy_version 25880 (0.0007) [2023-10-12 16:50:20,950][62635] Updated weights for policy 1, policy_version 25860 (0.0008) [2023-10-12 16:50:21,331][62635] Updated weights for policy 1, policy_version 25870 (0.0008) [2023-10-12 16:50:21,706][62635] Updated weights for policy 1, policy_version 25880 (0.0007) [2023-10-12 16:50:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53018624. Throughput: 0: 1673.1, 1: 1669.0. Samples: 13259638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:23,435][61643] Avg episode reward: [(0, '3.860'), (1, '8.940')] [2023-10-12 16:50:23,842][62634] Updated weights for policy 0, policy_version 25890 (0.0007) [2023-10-12 16:50:24,220][62634] Updated weights for policy 0, policy_version 25900 (0.0010) [2023-10-12 16:50:24,604][62634] Updated weights for policy 0, policy_version 25910 (0.0009) [2023-10-12 16:50:24,985][62634] Updated weights for policy 0, policy_version 25920 (0.0008) [2023-10-12 16:50:25,870][62635] Updated weights for policy 1, policy_version 25890 (0.0007) [2023-10-12 16:50:26,239][62635] Updated weights for policy 1, policy_version 25900 (0.0009) [2023-10-12 16:50:26,604][62635] Updated weights for policy 1, policy_version 25910 (0.0008) [2023-10-12 16:50:26,976][62635] Updated weights for policy 1, policy_version 25920 (0.0008) [2023-10-12 16:50:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 53084160. 
Throughput: 0: 1677.9, 1: 1665.4. Samples: 13279570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:28,436][61643] Avg episode reward: [(0, '3.880'), (1, '9.160')] [2023-10-12 16:50:29,014][62634] Updated weights for policy 0, policy_version 25930 (0.0009) [2023-10-12 16:50:29,390][62634] Updated weights for policy 0, policy_version 25940 (0.0009) [2023-10-12 16:50:29,776][62634] Updated weights for policy 0, policy_version 25950 (0.0009) [2023-10-12 16:50:30,871][62635] Updated weights for policy 1, policy_version 25930 (0.0007) [2023-10-12 16:50:31,236][62635] Updated weights for policy 1, policy_version 25940 (0.0008) [2023-10-12 16:50:31,603][62635] Updated weights for policy 1, policy_version 25950 (0.0009) [2023-10-12 16:50:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 53149696. Throughput: 0: 1679.9, 1: 1679.5. Samples: 13300364. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:33,435][61643] Avg episode reward: [(0, '3.910'), (1, '9.010')] [2023-10-12 16:50:33,912][62634] Updated weights for policy 0, policy_version 25960 (0.0007) [2023-10-12 16:50:34,281][62634] Updated weights for policy 0, policy_version 25970 (0.0010) [2023-10-12 16:50:34,666][62634] Updated weights for policy 0, policy_version 25980 (0.0009) [2023-10-12 16:50:35,711][62635] Updated weights for policy 1, policy_version 25960 (0.0009) [2023-10-12 16:50:36,075][62635] Updated weights for policy 1, policy_version 25970 (0.0010) [2023-10-12 16:50:36,439][62635] Updated weights for policy 1, policy_version 25980 (0.0011) [2023-10-12 16:50:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53215232. Throughput: 0: 1683.2, 1: 1661.9. Samples: 13310062. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:38,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.060')] [2023-10-12 16:50:38,451][62634] Updated weights for policy 0, policy_version 25990 (0.0009) [2023-10-12 16:50:38,838][62634] Updated weights for policy 0, policy_version 26000 (0.0010) [2023-10-12 16:50:39,215][62634] Updated weights for policy 0, policy_version 26010 (0.0008) [2023-10-12 16:50:40,568][62635] Updated weights for policy 1, policy_version 25990 (0.0009) [2023-10-12 16:50:40,939][62635] Updated weights for policy 1, policy_version 26000 (0.0007) [2023-10-12 16:50:41,304][62635] Updated weights for policy 1, policy_version 26010 (0.0007) [2023-10-12 16:50:43,238][62634] Updated weights for policy 0, policy_version 26020 (0.0008) [2023-10-12 16:50:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53280768. Throughput: 0: 1691.1, 1: 1666.2. Samples: 13330184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:43,435][61643] Avg episode reward: [(0, '3.940'), (1, '9.060')] [2023-10-12 16:50:43,604][62634] Updated weights for policy 0, policy_version 26030 (0.0009) [2023-10-12 16:50:43,980][62634] Updated weights for policy 0, policy_version 26040 (0.0007) [2023-10-12 16:50:45,339][62635] Updated weights for policy 1, policy_version 26020 (0.0007) [2023-10-12 16:50:45,712][62635] Updated weights for policy 1, policy_version 26030 (0.0011) [2023-10-12 16:50:46,065][62635] Updated weights for policy 1, policy_version 26040 (0.0010) [2023-10-12 16:50:48,026][62634] Updated weights for policy 0, policy_version 26050 (0.0008) [2023-10-12 16:50:48,393][62634] Updated weights for policy 0, policy_version 26060 (0.0008) [2023-10-12 16:50:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53346304. Throughput: 0: 1697.6, 1: 1680.5. Samples: 13351168. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:48,436][61643] Avg episode reward: [(0, '3.920'), (1, '9.010')] [2023-10-12 16:50:48,772][62634] Updated weights for policy 0, policy_version 26070 (0.0008) [2023-10-12 16:50:49,150][62634] Updated weights for policy 0, policy_version 26080 (0.0007) [2023-10-12 16:50:50,188][62635] Updated weights for policy 1, policy_version 26050 (0.0010) [2023-10-12 16:50:50,597][62635] Updated weights for policy 1, policy_version 26060 (0.0007) [2023-10-12 16:50:50,970][62635] Updated weights for policy 1, policy_version 26070 (0.0007) [2023-10-12 16:50:51,341][62635] Updated weights for policy 1, policy_version 26080 (0.0008) [2023-10-12 16:50:53,334][62634] Updated weights for policy 0, policy_version 26090 (0.0007) [2023-10-12 16:50:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53411840. Throughput: 0: 1697.9, 1: 1656.2. Samples: 13360384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:53,435][61643] Avg episode reward: [(0, '3.920'), (1, '9.000')] [2023-10-12 16:50:53,700][62634] Updated weights for policy 0, policy_version 26100 (0.0008) [2023-10-12 16:50:54,081][62634] Updated weights for policy 0, policy_version 26110 (0.0007) [2023-10-12 16:50:55,335][62635] Updated weights for policy 1, policy_version 26090 (0.0007) [2023-10-12 16:50:55,711][62635] Updated weights for policy 1, policy_version 26100 (0.0008) [2023-10-12 16:50:56,075][62635] Updated weights for policy 1, policy_version 26110 (0.0009) [2023-10-12 16:50:58,095][62634] Updated weights for policy 0, policy_version 26120 (0.0008) [2023-10-12 16:50:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53477376. Throughput: 0: 1694.8, 1: 1676.7. Samples: 13380718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:50:58,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.940')] [2023-10-12 16:50:58,472][62634] Updated weights for policy 0, policy_version 26130 (0.0007) [2023-10-12 16:50:58,845][62634] Updated weights for policy 0, policy_version 26140 (0.0007) [2023-10-12 16:51:00,052][62635] Updated weights for policy 1, policy_version 26120 (0.0007) [2023-10-12 16:51:00,425][62635] Updated weights for policy 1, policy_version 26130 (0.0007) [2023-10-12 16:51:00,802][62635] Updated weights for policy 1, policy_version 26140 (0.0007) [2023-10-12 16:51:02,812][62634] Updated weights for policy 0, policy_version 26150 (0.0010) [2023-10-12 16:51:03,183][62634] Updated weights for policy 0, policy_version 26160 (0.0008) [2023-10-12 16:51:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53542912. Throughput: 0: 1688.0, 1: 1684.9. Samples: 13401534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:51:03,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.750')] [2023-10-12 16:51:03,570][62634] Updated weights for policy 0, policy_version 26170 (0.0007) [2023-10-12 16:51:04,809][62635] Updated weights for policy 1, policy_version 26150 (0.0010) [2023-10-12 16:51:05,179][62635] Updated weights for policy 1, policy_version 26160 (0.0008) [2023-10-12 16:51:05,560][62635] Updated weights for policy 1, policy_version 26170 (0.0007) [2023-10-12 16:51:07,372][62634] Updated weights for policy 0, policy_version 26180 (0.0008) [2023-10-12 16:51:07,759][62634] Updated weights for policy 0, policy_version 26190 (0.0010) [2023-10-12 16:51:08,135][62634] Updated weights for policy 0, policy_version 26200 (0.0008) [2023-10-12 16:51:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 53608448. Throughput: 0: 1704.3, 1: 1663.8. Samples: 13411206. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:51:08,436][61643] Avg episode reward: [(0, '4.000'), (1, '8.820')] [2023-10-12 16:51:09,673][62635] Updated weights for policy 1, policy_version 26180 (0.0010) [2023-10-12 16:51:10,039][62635] Updated weights for policy 1, policy_version 26190 (0.0009) [2023-10-12 16:51:10,399][62635] Updated weights for policy 1, policy_version 26200 (0.0008) [2023-10-12 16:51:11,963][62634] Updated weights for policy 0, policy_version 26210 (0.0008) [2023-10-12 16:51:12,332][62634] Updated weights for policy 0, policy_version 26220 (0.0007) [2023-10-12 16:51:12,707][62634] Updated weights for policy 0, policy_version 26230 (0.0007) [2023-10-12 16:51:13,087][62634] Updated weights for policy 0, policy_version 26240 (0.0008) [2023-10-12 16:51:13,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53706752. Throughput: 0: 1703.3, 1: 1685.3. Samples: 13432058. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) [2023-10-12 16:51:13,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.060')] [2023-10-12 16:51:14,484][62635] Updated weights for policy 1, policy_version 26210 (0.0009) [2023-10-12 16:51:14,846][62635] Updated weights for policy 1, policy_version 26220 (0.0009) [2023-10-12 16:51:15,211][62635] Updated weights for policy 1, policy_version 26230 (0.0007) [2023-10-12 16:51:15,583][62635] Updated weights for policy 1, policy_version 26240 (0.0007) [2023-10-12 16:51:17,259][62634] Updated weights for policy 0, policy_version 26250 (0.0009) [2023-10-12 16:51:17,640][62634] Updated weights for policy 0, policy_version 26260 (0.0008) [2023-10-12 16:51:18,024][62634] Updated weights for policy 0, policy_version 26270 (0.0007) [2023-10-12 16:51:18,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 53772288. Throughput: 0: 1677.5, 1: 1681.6. Samples: 13451526. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) [2023-10-12 16:51:18,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.960')] [2023-10-12 16:51:19,753][62635] Updated weights for policy 1, policy_version 26250 (0.0008) [2023-10-12 16:51:20,128][62635] Updated weights for policy 1, policy_version 26260 (0.0007) [2023-10-12 16:51:20,493][62635] Updated weights for policy 1, policy_version 26270 (0.0008) [2023-10-12 16:51:22,102][62634] Updated weights for policy 0, policy_version 26280 (0.0009) [2023-10-12 16:51:22,467][62634] Updated weights for policy 0, policy_version 26290 (0.0007) [2023-10-12 16:51:22,847][62634] Updated weights for policy 0, policy_version 26300 (0.0007) [2023-10-12 16:51:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53837824. Throughput: 0: 1698.9, 1: 1669.7. Samples: 13461648. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) [2023-10-12 16:51:23,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.790')] [2023-10-12 16:51:24,550][62635] Updated weights for policy 1, policy_version 26280 (0.0007) [2023-10-12 16:51:24,935][62635] Updated weights for policy 1, policy_version 26290 (0.0007) [2023-10-12 16:51:25,302][62635] Updated weights for policy 1, policy_version 26300 (0.0007) [2023-10-12 16:51:26,873][62634] Updated weights for policy 0, policy_version 26310 (0.0009) [2023-10-12 16:51:27,238][62634] Updated weights for policy 0, policy_version 26320 (0.0010) [2023-10-12 16:51:27,620][62634] Updated weights for policy 0, policy_version 26330 (0.0008) [2023-10-12 16:51:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 53903360. Throughput: 0: 1689.2, 1: 1689.8. Samples: 13482238. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-12 16:51:28,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.660')] [2023-10-12 16:51:29,329][62635] Updated weights for policy 1, policy_version 26310 (0.0010) [2023-10-12 16:51:29,694][62635] Updated weights for policy 1, policy_version 26320 (0.0008) [2023-10-12 16:51:30,065][62635] Updated weights for policy 1, policy_version 26330 (0.0010) [2023-10-12 16:51:31,736][62634] Updated weights for policy 0, policy_version 26340 (0.0009) [2023-10-12 16:51:32,104][62634] Updated weights for policy 0, policy_version 26350 (0.0010) [2023-10-12 16:51:32,476][62634] Updated weights for policy 0, policy_version 26360 (0.0011) [2023-10-12 16:51:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53968896. Throughput: 0: 1662.3, 1: 1689.2. Samples: 13501982. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-12 16:51:33,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.840')] [2023-10-12 16:51:34,024][62635] Updated weights for policy 1, policy_version 26340 (0.0011) [2023-10-12 16:51:34,389][62635] Updated weights for policy 1, policy_version 26350 (0.0011) [2023-10-12 16:51:34,756][62635] Updated weights for policy 1, policy_version 26360 (0.0010) [2023-10-12 16:51:36,407][62634] Updated weights for policy 0, policy_version 26370 (0.0008) [2023-10-12 16:51:36,787][62634] Updated weights for policy 0, policy_version 26380 (0.0009) [2023-10-12 16:51:37,163][62634] Updated weights for policy 0, policy_version 26390 (0.0010) [2023-10-12 16:51:37,542][62634] Updated weights for policy 0, policy_version 26400 (0.0010) [2023-10-12 16:51:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54034432. Throughput: 0: 1694.5, 1: 1682.9. Samples: 13512366. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-12 16:51:38,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.950')] [2023-10-12 16:51:38,868][62635] Updated weights for policy 1, policy_version 26370 (0.0010) [2023-10-12 16:51:39,271][62635] Updated weights for policy 1, policy_version 26380 (0.0008) [2023-10-12 16:51:39,644][62635] Updated weights for policy 1, policy_version 26390 (0.0007) [2023-10-12 16:51:40,008][62635] Updated weights for policy 1, policy_version 26400 (0.0008) [2023-10-12 16:51:41,592][62634] Updated weights for policy 0, policy_version 26410 (0.0009) [2023-10-12 16:51:41,974][62634] Updated weights for policy 0, policy_version 26420 (0.0009) [2023-10-12 16:51:42,361][62634] Updated weights for policy 0, policy_version 26430 (0.0008) [2023-10-12 16:51:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54099968. Throughput: 0: 1679.6, 1: 1695.1. Samples: 13532582. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-12 16:51:43,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.760')] [2023-10-12 16:51:44,038][62635] Updated weights for policy 1, policy_version 26410 (0.0007) [2023-10-12 16:51:44,408][62635] Updated weights for policy 1, policy_version 26420 (0.0008) [2023-10-12 16:51:44,771][62635] Updated weights for policy 1, policy_version 26430 (0.0010) [2023-10-12 16:51:46,377][62634] Updated weights for policy 0, policy_version 26440 (0.0009) [2023-10-12 16:51:46,755][62634] Updated weights for policy 0, policy_version 26450 (0.0007) [2023-10-12 16:51:47,130][62634] Updated weights for policy 0, policy_version 26460 (0.0008) [2023-10-12 16:51:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54165504. Throughput: 0: 1672.8, 1: 1687.9. Samples: 13552764. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:51:48,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.780')] [2023-10-12 16:51:48,937][62635] Updated weights for policy 1, policy_version 26440 (0.0010) [2023-10-12 16:51:49,306][62635] Updated weights for policy 1, policy_version 26450 (0.0008) [2023-10-12 16:51:49,670][62635] Updated weights for policy 1, policy_version 26460 (0.0008) [2023-10-12 16:51:51,307][62634] Updated weights for policy 0, policy_version 26470 (0.0010) [2023-10-12 16:51:51,696][62634] Updated weights for policy 0, policy_version 26480 (0.0011) [2023-10-12 16:51:52,062][62634] Updated weights for policy 0, policy_version 26490 (0.0009) [2023-10-12 16:51:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54231040. Throughput: 0: 1687.3, 1: 1687.7. Samples: 13563080. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:51:53,435][61643] Avg episode reward: [(0, '3.930'), (1, '8.830')] [2023-10-12 16:51:53,777][62635] Updated weights for policy 1, policy_version 26470 (0.0009) [2023-10-12 16:51:54,152][62635] Updated weights for policy 1, policy_version 26480 (0.0008) [2023-10-12 16:51:54,531][62635] Updated weights for policy 1, policy_version 26490 (0.0011) [2023-10-12 16:51:56,126][62634] Updated weights for policy 0, policy_version 26500 (0.0008) [2023-10-12 16:51:56,510][62634] Updated weights for policy 0, policy_version 26510 (0.0007) [2023-10-12 16:51:56,883][62634] Updated weights for policy 0, policy_version 26520 (0.0009) [2023-10-12 16:51:58,424][62635] Updated weights for policy 1, policy_version 26500 (0.0010) [2023-10-12 16:51:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54296576. Throughput: 0: 1663.4, 1: 1688.2. Samples: 13582882. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:51:58,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.020')] [2023-10-12 16:51:58,787][62635] Updated weights for policy 1, policy_version 26510 (0.0007) [2023-10-12 16:51:59,152][62635] Updated weights for policy 1, policy_version 26520 (0.0008) [2023-10-12 16:52:01,051][62634] Updated weights for policy 0, policy_version 26530 (0.0009) [2023-10-12 16:52:01,425][62634] Updated weights for policy 0, policy_version 26540 (0.0009) [2023-10-12 16:52:01,809][62634] Updated weights for policy 0, policy_version 26550 (0.0009) [2023-10-12 16:52:02,175][62634] Updated weights for policy 0, policy_version 26560 (0.0009) [2023-10-12 16:52:03,247][62635] Updated weights for policy 1, policy_version 26530 (0.0008) [2023-10-12 16:52:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54362112. Throughput: 0: 1681.7, 1: 1688.7. Samples: 13603194. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 16:52:03,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.020')] [2023-10-12 16:52:03,610][62635] Updated weights for policy 1, policy_version 26540 (0.0009) [2023-10-12 16:52:03,993][62635] Updated weights for policy 1, policy_version 26550 (0.0009) [2023-10-12 16:52:04,357][62635] Updated weights for policy 1, policy_version 26560 (0.0008) [2023-10-12 16:52:06,151][62634] Updated weights for policy 0, policy_version 26570 (0.0010) [2023-10-12 16:52:06,522][62634] Updated weights for policy 0, policy_version 26580 (0.0007) [2023-10-12 16:52:06,890][62634] Updated weights for policy 0, policy_version 26590 (0.0011) [2023-10-12 16:52:08,387][62635] Updated weights for policy 1, policy_version 26570 (0.0010) [2023-10-12 16:52:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54427648. Throughput: 0: 1682.0, 1: 1691.5. Samples: 13613456. 
Policy #0 lag: (min: 7.0, avg: 14.0, max: 39.0) [2023-10-12 16:52:08,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.820')] [2023-10-12 16:52:08,758][62635] Updated weights for policy 1, policy_version 26580 (0.0008) [2023-10-12 16:52:09,128][62635] Updated weights for policy 1, policy_version 26590 (0.0009) [2023-10-12 16:52:11,070][62634] Updated weights for policy 0, policy_version 26600 (0.0008) [2023-10-12 16:52:11,449][62634] Updated weights for policy 0, policy_version 26610 (0.0009) [2023-10-12 16:52:11,830][62634] Updated weights for policy 0, policy_version 26620 (0.0008) [2023-10-12 16:52:13,138][62635] Updated weights for policy 1, policy_version 26600 (0.0008) [2023-10-12 16:52:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 54493184. Throughput: 0: 1662.8, 1: 1685.6. Samples: 13632916. Policy #0 lag: (min: 7.0, avg: 14.0, max: 39.0) [2023-10-12 16:52:13,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.140')] [2023-10-12 16:52:13,492][62635] Updated weights for policy 1, policy_version 26610 (0.0007) [2023-10-12 16:52:13,863][62635] Updated weights for policy 1, policy_version 26620 (0.0008) [2023-10-12 16:52:15,854][62634] Updated weights for policy 0, policy_version 26630 (0.0009) [2023-10-12 16:52:16,236][62634] Updated weights for policy 0, policy_version 26640 (0.0010) [2023-10-12 16:52:16,617][62634] Updated weights for policy 0, policy_version 26650 (0.0010) [2023-10-12 16:52:17,981][62635] Updated weights for policy 1, policy_version 26630 (0.0008) [2023-10-12 16:52:18,347][62635] Updated weights for policy 1, policy_version 26640 (0.0011) [2023-10-12 16:52:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 54558720. Throughput: 0: 1686.3, 1: 1678.9. Samples: 13653416. 
Policy #0 lag: (min: 7.0, avg: 14.0, max: 39.0) [2023-10-12 16:52:18,436][61643] Avg episode reward: [(0, '3.970'), (1, '9.180')] [2023-10-12 16:52:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000026656_27295744.pth... [2023-10-12 16:52:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000025088_25690112.pth [2023-10-12 16:52:18,718][62635] Updated weights for policy 1, policy_version 26650 (0.0007) [2023-10-12 16:52:18,932][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000026656_27295744.pth... [2023-10-12 16:52:18,962][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000025056_25657344.pth [2023-10-12 16:52:20,748][62634] Updated weights for policy 0, policy_version 26660 (0.0009) [2023-10-12 16:52:21,127][62634] Updated weights for policy 0, policy_version 26670 (0.0008) [2023-10-12 16:52:21,504][62634] Updated weights for policy 0, policy_version 26680 (0.0008) [2023-10-12 16:52:22,819][62635] Updated weights for policy 1, policy_version 26660 (0.0008) [2023-10-12 16:52:23,202][62635] Updated weights for policy 1, policy_version 26670 (0.0009) [2023-10-12 16:52:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 54624256. Throughput: 0: 1675.0, 1: 1688.9. Samples: 13663740. 
Policy #0 lag: (min: 7.0, avg: 14.0, max: 39.0) [2023-10-12 16:52:23,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.060')] [2023-10-12 16:52:23,570][62635] Updated weights for policy 1, policy_version 26680 (0.0010) [2023-10-12 16:52:25,464][62634] Updated weights for policy 0, policy_version 26690 (0.0007) [2023-10-12 16:52:25,841][62634] Updated weights for policy 0, policy_version 26700 (0.0007) [2023-10-12 16:52:26,219][62634] Updated weights for policy 0, policy_version 26710 (0.0009) [2023-10-12 16:52:26,602][62634] Updated weights for policy 0, policy_version 26720 (0.0009) [2023-10-12 16:52:27,510][62635] Updated weights for policy 1, policy_version 26690 (0.0009) [2023-10-12 16:52:27,907][62635] Updated weights for policy 1, policy_version 26700 (0.0007) [2023-10-12 16:52:28,274][62635] Updated weights for policy 1, policy_version 26710 (0.0008) [2023-10-12 16:52:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 54689792. Throughput: 0: 1671.6, 1: 1688.3. Samples: 13683778. Policy #0 lag: (min: 24.0, avg: 47.6, max: 56.0) [2023-10-12 16:52:28,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.150')] [2023-10-12 16:52:28,647][62635] Updated weights for policy 1, policy_version 26720 (0.0009) [2023-10-12 16:52:30,517][62634] Updated weights for policy 0, policy_version 26730 (0.0007) [2023-10-12 16:52:30,903][62634] Updated weights for policy 0, policy_version 26740 (0.0009) [2023-10-12 16:52:31,281][62634] Updated weights for policy 0, policy_version 26750 (0.0009) [2023-10-12 16:52:32,712][62635] Updated weights for policy 1, policy_version 26730 (0.0008) [2023-10-12 16:52:33,092][62635] Updated weights for policy 1, policy_version 26740 (0.0008) [2023-10-12 16:52:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 54755328. Throughput: 0: 1684.3, 1: 1671.6. Samples: 13703780. 
Policy #0 lag: (min: 24.0, avg: 47.6, max: 56.0) [2023-10-12 16:52:33,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.250')] [2023-10-12 16:52:33,474][62635] Updated weights for policy 1, policy_version 26750 (0.0008) [2023-10-12 16:52:35,470][62634] Updated weights for policy 0, policy_version 26760 (0.0008) [2023-10-12 16:52:35,852][62634] Updated weights for policy 0, policy_version 26770 (0.0008) [2023-10-12 16:52:36,224][62634] Updated weights for policy 0, policy_version 26780 (0.0007) [2023-10-12 16:52:37,655][62635] Updated weights for policy 1, policy_version 26760 (0.0007) [2023-10-12 16:52:38,031][62635] Updated weights for policy 1, policy_version 26770 (0.0007) [2023-10-12 16:52:38,404][62635] Updated weights for policy 1, policy_version 26780 (0.0009) [2023-10-12 16:52:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 54820864. Throughput: 0: 1665.1, 1: 1689.7. Samples: 13714050. Policy #0 lag: (min: 24.0, avg: 47.6, max: 56.0) [2023-10-12 16:52:38,436][61643] Avg episode reward: [(0, '3.980'), (1, '9.000')] [2023-10-12 16:52:40,442][62634] Updated weights for policy 0, policy_version 26790 (0.0007) [2023-10-12 16:52:40,829][62634] Updated weights for policy 0, policy_version 26800 (0.0007) [2023-10-12 16:52:41,198][62634] Updated weights for policy 0, policy_version 26810 (0.0007) [2023-10-12 16:52:42,413][62635] Updated weights for policy 1, policy_version 26790 (0.0008) [2023-10-12 16:52:42,769][62635] Updated weights for policy 1, policy_version 26800 (0.0011) [2023-10-12 16:52:43,143][62635] Updated weights for policy 1, policy_version 26810 (0.0010) [2023-10-12 16:52:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 54919168. Throughput: 0: 1672.7, 1: 1686.4. Samples: 13734042. 
Policy #0 lag: (min: 24.0, avg: 47.6, max: 56.0) [2023-10-12 16:52:43,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.080')] [2023-10-12 16:52:45,105][62634] Updated weights for policy 0, policy_version 26820 (0.0007) [2023-10-12 16:52:45,489][62634] Updated weights for policy 0, policy_version 26830 (0.0008) [2023-10-12 16:52:45,871][62634] Updated weights for policy 0, policy_version 26840 (0.0010) [2023-10-12 16:52:47,153][62635] Updated weights for policy 1, policy_version 26820 (0.0009) [2023-10-12 16:52:47,516][62635] Updated weights for policy 1, policy_version 26830 (0.0009) [2023-10-12 16:52:47,879][62635] Updated weights for policy 1, policy_version 26840 (0.0009) [2023-10-12 16:52:48,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 54984704. Throughput: 0: 1684.0, 1: 1661.5. Samples: 13753740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:52:48,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.990')] [2023-10-12 16:52:49,916][62634] Updated weights for policy 0, policy_version 26850 (0.0008) [2023-10-12 16:52:50,287][62634] Updated weights for policy 0, policy_version 26860 (0.0008) [2023-10-12 16:52:50,666][62634] Updated weights for policy 0, policy_version 26870 (0.0007) [2023-10-12 16:52:51,045][62634] Updated weights for policy 0, policy_version 26880 (0.0007) [2023-10-12 16:52:52,076][62635] Updated weights for policy 1, policy_version 26850 (0.0009) [2023-10-12 16:52:52,447][62635] Updated weights for policy 1, policy_version 26860 (0.0008) [2023-10-12 16:52:52,818][62635] Updated weights for policy 1, policy_version 26870 (0.0008) [2023-10-12 16:52:53,183][62635] Updated weights for policy 1, policy_version 26880 (0.0009) [2023-10-12 16:52:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 55050240. Throughput: 0: 1663.1, 1: 1683.7. Samples: 13764060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:52:53,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.790')] [2023-10-12 16:52:55,116][62634] Updated weights for policy 0, policy_version 26890 (0.0008) [2023-10-12 16:52:55,492][62634] Updated weights for policy 0, policy_version 26900 (0.0008) [2023-10-12 16:52:55,871][62634] Updated weights for policy 0, policy_version 26910 (0.0009) [2023-10-12 16:52:57,145][62635] Updated weights for policy 1, policy_version 26890 (0.0008) [2023-10-12 16:52:57,511][62635] Updated weights for policy 1, policy_version 26900 (0.0008) [2023-10-12 16:52:57,883][62635] Updated weights for policy 1, policy_version 26910 (0.0007) [2023-10-12 16:52:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 55115776. Throughput: 0: 1684.8, 1: 1680.9. Samples: 13784374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:52:58,435][61643] Avg episode reward: [(0, '3.980'), (1, '8.960')] [2023-10-12 16:52:59,764][62634] Updated weights for policy 0, policy_version 26920 (0.0009) [2023-10-12 16:53:00,140][62634] Updated weights for policy 0, policy_version 26930 (0.0008) [2023-10-12 16:53:00,525][62634] Updated weights for policy 0, policy_version 26940 (0.0008) [2023-10-12 16:53:01,932][62635] Updated weights for policy 1, policy_version 26920 (0.0008) [2023-10-12 16:53:02,309][62635] Updated weights for policy 1, policy_version 26930 (0.0009) [2023-10-12 16:53:02,682][62635] Updated weights for policy 1, policy_version 26940 (0.0007) [2023-10-12 16:53:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 55181312. Throughput: 0: 1687.6, 1: 1664.2. Samples: 13804246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:03,436][61643] Avg episode reward: [(0, '3.980'), (1, '8.940')] [2023-10-12 16:53:04,547][62634] Updated weights for policy 0, policy_version 26950 (0.0009) [2023-10-12 16:53:04,928][62634] Updated weights for policy 0, policy_version 26960 (0.0007) [2023-10-12 16:53:05,305][62634] Updated weights for policy 0, policy_version 26970 (0.0011) [2023-10-12 16:53:06,761][62635] Updated weights for policy 1, policy_version 26950 (0.0009) [2023-10-12 16:53:07,142][62635] Updated weights for policy 1, policy_version 26960 (0.0009) [2023-10-12 16:53:07,509][62635] Updated weights for policy 1, policy_version 26970 (0.0008) [2023-10-12 16:53:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55246848. Throughput: 0: 1665.2, 1: 1690.4. Samples: 13814742. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:08,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.890')] [2023-10-12 16:53:09,477][62634] Updated weights for policy 0, policy_version 26980 (0.0010) [2023-10-12 16:53:09,846][62634] Updated weights for policy 0, policy_version 26990 (0.0008) [2023-10-12 16:53:10,229][62634] Updated weights for policy 0, policy_version 27000 (0.0008) [2023-10-12 16:53:11,614][62635] Updated weights for policy 1, policy_version 26980 (0.0008) [2023-10-12 16:53:11,983][62635] Updated weights for policy 1, policy_version 26990 (0.0007) [2023-10-12 16:53:12,353][62635] Updated weights for policy 1, policy_version 27000 (0.0007) [2023-10-12 16:53:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55312384. Throughput: 0: 1681.6, 1: 1671.0. Samples: 13834646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:13,436][61643] Avg episode reward: [(0, '4.000'), (1, '8.850')] [2023-10-12 16:53:14,303][62634] Updated weights for policy 0, policy_version 27010 (0.0008) [2023-10-12 16:53:14,683][62634] Updated weights for policy 0, policy_version 27020 (0.0010) [2023-10-12 16:53:15,059][62634] Updated weights for policy 0, policy_version 27030 (0.0011) [2023-10-12 16:53:15,433][62634] Updated weights for policy 0, policy_version 27040 (0.0010) [2023-10-12 16:53:16,268][62635] Updated weights for policy 1, policy_version 27010 (0.0008) [2023-10-12 16:53:16,674][62635] Updated weights for policy 1, policy_version 27020 (0.0008) [2023-10-12 16:53:17,039][62635] Updated weights for policy 1, policy_version 27030 (0.0010) [2023-10-12 16:53:17,411][62635] Updated weights for policy 1, policy_version 27040 (0.0007) [2023-10-12 16:53:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55377920. Throughput: 0: 1680.3, 1: 1670.6. Samples: 13854572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:18,436][61643] Avg episode reward: [(0, '4.000'), (1, '9.000')] [2023-10-12 16:53:19,383][62634] Updated weights for policy 0, policy_version 27050 (0.0007) [2023-10-12 16:53:19,763][62634] Updated weights for policy 0, policy_version 27060 (0.0008) [2023-10-12 16:53:20,139][62634] Updated weights for policy 0, policy_version 27070 (0.0009) [2023-10-12 16:53:21,561][62635] Updated weights for policy 1, policy_version 27050 (0.0009) [2023-10-12 16:53:21,930][62635] Updated weights for policy 1, policy_version 27060 (0.0010) [2023-10-12 16:53:22,304][62635] Updated weights for policy 1, policy_version 27070 (0.0008) [2023-10-12 16:53:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55443456. Throughput: 0: 1670.5, 1: 1682.5. Samples: 13864934. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:23,436][61643] Avg episode reward: [(0, '3.930'), (1, '9.130')] [2023-10-12 16:53:24,231][62634] Updated weights for policy 0, policy_version 27080 (0.0008) [2023-10-12 16:53:24,603][62634] Updated weights for policy 0, policy_version 27090 (0.0010) [2023-10-12 16:53:24,970][62634] Updated weights for policy 0, policy_version 27100 (0.0009) [2023-10-12 16:53:26,339][62635] Updated weights for policy 1, policy_version 27080 (0.0010) [2023-10-12 16:53:26,708][62635] Updated weights for policy 1, policy_version 27090 (0.0010) [2023-10-12 16:53:27,076][62635] Updated weights for policy 1, policy_version 27100 (0.0008) [2023-10-12 16:53:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55508992. Throughput: 0: 1685.8, 1: 1666.7. Samples: 13884902. Policy #0 lag: (min: 8.0, avg: 32.2, max: 40.0) [2023-10-12 16:53:28,436][61643] Avg episode reward: [(0, '3.910'), (1, '9.120')] [2023-10-12 16:53:29,073][62634] Updated weights for policy 0, policy_version 27110 (0.0008) [2023-10-12 16:53:29,445][62634] Updated weights for policy 0, policy_version 27120 (0.0010) [2023-10-12 16:53:29,826][62634] Updated weights for policy 0, policy_version 27130 (0.0010) [2023-10-12 16:53:30,978][62635] Updated weights for policy 1, policy_version 27110 (0.0009) [2023-10-12 16:53:31,349][62635] Updated weights for policy 1, policy_version 27120 (0.0008) [2023-10-12 16:53:31,714][62635] Updated weights for policy 1, policy_version 27130 (0.0008) [2023-10-12 16:53:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55574528. Throughput: 0: 1682.5, 1: 1689.5. Samples: 13905482. 
Policy #0 lag: (min: 8.0, avg: 32.2, max: 40.0) [2023-10-12 16:53:33,436][61643] Avg episode reward: [(0, '3.910'), (1, '9.150')] [2023-10-12 16:53:33,938][62634] Updated weights for policy 0, policy_version 27140 (0.0009) [2023-10-12 16:53:34,322][62634] Updated weights for policy 0, policy_version 27150 (0.0007) [2023-10-12 16:53:34,699][62634] Updated weights for policy 0, policy_version 27160 (0.0007) [2023-10-12 16:53:35,685][62635] Updated weights for policy 1, policy_version 27140 (0.0009) [2023-10-12 16:53:36,053][62635] Updated weights for policy 1, policy_version 27150 (0.0007) [2023-10-12 16:53:36,431][62635] Updated weights for policy 1, policy_version 27160 (0.0009) [2023-10-12 16:53:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 55640064. Throughput: 0: 1675.9, 1: 1688.0. Samples: 13915434. Policy #0 lag: (min: 8.0, avg: 32.2, max: 40.0) [2023-10-12 16:53:38,435][61643] Avg episode reward: [(0, '3.910'), (1, '9.170')] [2023-10-12 16:53:38,697][62634] Updated weights for policy 0, policy_version 27170 (0.0007) [2023-10-12 16:53:39,069][62634] Updated weights for policy 0, policy_version 27180 (0.0011) [2023-10-12 16:53:39,443][62634] Updated weights for policy 0, policy_version 27190 (0.0009) [2023-10-12 16:53:39,816][62634] Updated weights for policy 0, policy_version 27200 (0.0010) [2023-10-12 16:53:40,477][62635] Updated weights for policy 1, policy_version 27170 (0.0008) [2023-10-12 16:53:40,855][62635] Updated weights for policy 1, policy_version 27180 (0.0007) [2023-10-12 16:53:41,218][62635] Updated weights for policy 1, policy_version 27190 (0.0009) [2023-10-12 16:53:41,596][62635] Updated weights for policy 1, policy_version 27200 (0.0009) [2023-10-12 16:53:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55705600. Throughput: 0: 1679.7, 1: 1677.5. Samples: 13935450. 
Policy #0 lag: (min: 8.0, avg: 32.2, max: 40.0) [2023-10-12 16:53:43,436][61643] Avg episode reward: [(0, '4.000'), (1, '9.260')] [2023-10-12 16:53:43,914][62634] Updated weights for policy 0, policy_version 27210 (0.0007) [2023-10-12 16:53:44,297][62634] Updated weights for policy 0, policy_version 27220 (0.0010) [2023-10-12 16:53:44,678][62634] Updated weights for policy 0, policy_version 27230 (0.0009) [2023-10-12 16:53:45,720][62635] Updated weights for policy 1, policy_version 27210 (0.0009) [2023-10-12 16:53:46,087][62635] Updated weights for policy 1, policy_version 27220 (0.0009) [2023-10-12 16:53:46,465][62635] Updated weights for policy 1, policy_version 27230 (0.0009) [2023-10-12 16:53:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55771136. Throughput: 0: 1673.0, 1: 1703.6. Samples: 13956196. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:48,436][61643] Avg episode reward: [(0, '4.000'), (1, '9.330')] [2023-10-12 16:53:48,447][62495] Saving new best policy, reward=9.330! [2023-10-12 16:53:48,675][62634] Updated weights for policy 0, policy_version 27240 (0.0008) [2023-10-12 16:53:49,057][62634] Updated weights for policy 0, policy_version 27250 (0.0008) [2023-10-12 16:53:49,444][62634] Updated weights for policy 0, policy_version 27260 (0.0007) [2023-10-12 16:53:50,534][62635] Updated weights for policy 1, policy_version 27240 (0.0008) [2023-10-12 16:53:50,902][62635] Updated weights for policy 1, policy_version 27250 (0.0009) [2023-10-12 16:53:51,269][62635] Updated weights for policy 1, policy_version 27260 (0.0007) [2023-10-12 16:53:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55836672. Throughput: 0: 1676.7, 1: 1678.8. Samples: 13965740. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:53,435][61643] Avg episode reward: [(0, '4.000'), (1, '9.220')] [2023-10-12 16:53:53,436][62634] Updated weights for policy 0, policy_version 27270 (0.0009) [2023-10-12 16:53:53,814][62634] Updated weights for policy 0, policy_version 27280 (0.0008) [2023-10-12 16:53:54,191][62634] Updated weights for policy 0, policy_version 27290 (0.0009) [2023-10-12 16:53:55,311][62635] Updated weights for policy 1, policy_version 27270 (0.0008) [2023-10-12 16:53:55,678][62635] Updated weights for policy 1, policy_version 27280 (0.0007) [2023-10-12 16:53:56,051][62635] Updated weights for policy 1, policy_version 27290 (0.0007) [2023-10-12 16:53:58,261][62634] Updated weights for policy 0, policy_version 27300 (0.0007) [2023-10-12 16:53:58,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55902208. Throughput: 0: 1684.4, 1: 1682.2. Samples: 13986144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:53:58,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.260')] [2023-10-12 16:53:58,642][62634] Updated weights for policy 0, policy_version 27310 (0.0007) [2023-10-12 16:53:59,011][62634] Updated weights for policy 0, policy_version 27320 (0.0008) [2023-10-12 16:53:59,933][62635] Updated weights for policy 1, policy_version 27300 (0.0009) [2023-10-12 16:54:00,296][62635] Updated weights for policy 1, policy_version 27310 (0.0010) [2023-10-12 16:54:00,660][62635] Updated weights for policy 1, policy_version 27320 (0.0008) [2023-10-12 16:54:03,169][62634] Updated weights for policy 0, policy_version 27330 (0.0007) [2023-10-12 16:54:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 55967744. Throughput: 0: 1689.1, 1: 1698.5. Samples: 14007012. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:03,436][61643] Avg episode reward: [(0, '3.910'), (1, '9.030')] [2023-10-12 16:54:03,540][62634] Updated weights for policy 0, policy_version 27340 (0.0007) [2023-10-12 16:54:03,917][62634] Updated weights for policy 0, policy_version 27350 (0.0008) [2023-10-12 16:54:04,298][62634] Updated weights for policy 0, policy_version 27360 (0.0009) [2023-10-12 16:54:04,795][62635] Updated weights for policy 1, policy_version 27330 (0.0008) [2023-10-12 16:54:05,206][62635] Updated weights for policy 1, policy_version 27340 (0.0008) [2023-10-12 16:54:05,585][62635] Updated weights for policy 1, policy_version 27350 (0.0008) [2023-10-12 16:54:05,942][62635] Updated weights for policy 1, policy_version 27360 (0.0008) [2023-10-12 16:54:08,384][62634] Updated weights for policy 0, policy_version 27370 (0.0009) [2023-10-12 16:54:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56033280. Throughput: 0: 1690.1, 1: 1669.1. Samples: 14016098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:08,435][61643] Avg episode reward: [(0, '3.880'), (1, '9.060')] [2023-10-12 16:54:08,757][62634] Updated weights for policy 0, policy_version 27380 (0.0007) [2023-10-12 16:54:09,138][62634] Updated weights for policy 0, policy_version 27390 (0.0007) [2023-10-12 16:54:09,893][62635] Updated weights for policy 1, policy_version 27370 (0.0007) [2023-10-12 16:54:10,267][62635] Updated weights for policy 1, policy_version 27380 (0.0007) [2023-10-12 16:54:10,639][62635] Updated weights for policy 1, policy_version 27390 (0.0009) [2023-10-12 16:54:12,838][62634] Updated weights for policy 0, policy_version 27400 (0.0007) [2023-10-12 16:54:13,218][62634] Updated weights for policy 0, policy_version 27410 (0.0010) [2023-10-12 16:54:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56098816. Throughput: 0: 1694.2, 1: 1683.2. 
Samples: 14036886. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:13,435][61643] Avg episode reward: [(0, '3.890'), (1, '9.060')] [2023-10-12 16:54:13,603][62634] Updated weights for policy 0, policy_version 27420 (0.0008) [2023-10-12 16:54:14,904][62635] Updated weights for policy 1, policy_version 27400 (0.0008) [2023-10-12 16:54:15,269][62635] Updated weights for policy 1, policy_version 27410 (0.0008) [2023-10-12 16:54:15,639][62635] Updated weights for policy 1, policy_version 27420 (0.0010) [2023-10-12 16:54:17,567][62634] Updated weights for policy 0, policy_version 27430 (0.0009) [2023-10-12 16:54:17,940][62634] Updated weights for policy 0, policy_version 27440 (0.0008) [2023-10-12 16:54:18,317][62634] Updated weights for policy 0, policy_version 27450 (0.0010) [2023-10-12 16:54:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 56164352. Throughput: 0: 1684.9, 1: 1689.7. Samples: 14057340. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:18,436][61643] Avg episode reward: [(0, '3.970'), (1, '9.050')] [2023-10-12 16:54:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000027424_28082176.pth... [2023-10-12 16:54:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000025856_26476544.pth [2023-10-12 16:54:18,532][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000027456_28114944.pth... 
[2023-10-12 16:54:18,567][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000025856_26476544.pth [2023-10-12 16:54:19,606][62635] Updated weights for policy 1, policy_version 27430 (0.0008) [2023-10-12 16:54:19,973][62635] Updated weights for policy 1, policy_version 27440 (0.0010) [2023-10-12 16:54:20,348][62635] Updated weights for policy 1, policy_version 27450 (0.0011) [2023-10-12 16:54:22,307][62634] Updated weights for policy 0, policy_version 27460 (0.0008) [2023-10-12 16:54:22,679][62634] Updated weights for policy 0, policy_version 27470 (0.0008) [2023-10-12 16:54:23,054][62634] Updated weights for policy 0, policy_version 27480 (0.0010) [2023-10-12 16:54:23,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56262656. Throughput: 0: 1701.7, 1: 1667.6. Samples: 14067050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:23,435][61643] Avg episode reward: [(0, '4.000'), (1, '9.060')] [2023-10-12 16:54:24,520][62635] Updated weights for policy 1, policy_version 27460 (0.0008) [2023-10-12 16:54:24,889][62635] Updated weights for policy 1, policy_version 27470 (0.0008) [2023-10-12 16:54:25,255][62635] Updated weights for policy 1, policy_version 27480 (0.0008) [2023-10-12 16:54:26,959][62634] Updated weights for policy 0, policy_version 27490 (0.0010) [2023-10-12 16:54:27,331][62634] Updated weights for policy 0, policy_version 27500 (0.0009) [2023-10-12 16:54:27,714][62634] Updated weights for policy 0, policy_version 27510 (0.0011) [2023-10-12 16:54:28,087][62634] Updated weights for policy 0, policy_version 27520 (0.0010) [2023-10-12 16:54:28,435][61643] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56328192. Throughput: 0: 1700.9, 1: 1686.8. Samples: 14087900. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:28,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.970')] [2023-10-12 16:54:29,238][62635] Updated weights for policy 1, policy_version 27490 (0.0009) [2023-10-12 16:54:29,610][62635] Updated weights for policy 1, policy_version 27500 (0.0008) [2023-10-12 16:54:29,974][62635] Updated weights for policy 1, policy_version 27510 (0.0008) [2023-10-12 16:54:30,341][62635] Updated weights for policy 1, policy_version 27520 (0.0008) [2023-10-12 16:54:32,100][62634] Updated weights for policy 0, policy_version 27530 (0.0007) [2023-10-12 16:54:32,469][62634] Updated weights for policy 0, policy_version 27540 (0.0007) [2023-10-12 16:54:32,852][62634] Updated weights for policy 0, policy_version 27550 (0.0009) [2023-10-12 16:54:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56393728. Throughput: 0: 1677.4, 1: 1683.4. Samples: 14107432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:33,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.120')] [2023-10-12 16:54:34,426][62635] Updated weights for policy 1, policy_version 27530 (0.0008) [2023-10-12 16:54:34,797][62635] Updated weights for policy 1, policy_version 27540 (0.0007) [2023-10-12 16:54:35,164][62635] Updated weights for policy 1, policy_version 27550 (0.0007) [2023-10-12 16:54:37,061][62634] Updated weights for policy 0, policy_version 27560 (0.0008) [2023-10-12 16:54:37,436][62634] Updated weights for policy 0, policy_version 27570 (0.0009) [2023-10-12 16:54:37,821][62634] Updated weights for policy 0, policy_version 27580 (0.0008) [2023-10-12 16:54:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56459264. Throughput: 0: 1707.1, 1: 1674.7. Samples: 14117920. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:38,436][61643] Avg episode reward: [(0, '3.940'), (1, '9.170')] [2023-10-12 16:54:39,203][62635] Updated weights for policy 1, policy_version 27560 (0.0010) [2023-10-12 16:54:39,574][62635] Updated weights for policy 1, policy_version 27570 (0.0007) [2023-10-12 16:54:39,940][62635] Updated weights for policy 1, policy_version 27580 (0.0009) [2023-10-12 16:54:41,807][62634] Updated weights for policy 0, policy_version 27590 (0.0010) [2023-10-12 16:54:42,181][62634] Updated weights for policy 0, policy_version 27600 (0.0008) [2023-10-12 16:54:42,559][62634] Updated weights for policy 0, policy_version 27610 (0.0009) [2023-10-12 16:54:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56524800. Throughput: 0: 1695.1, 1: 1687.1. Samples: 14138340. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:54:43,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.180')] [2023-10-12 16:54:43,907][62635] Updated weights for policy 1, policy_version 27590 (0.0007) [2023-10-12 16:54:44,287][62635] Updated weights for policy 1, policy_version 27600 (0.0008) [2023-10-12 16:54:44,657][62635] Updated weights for policy 1, policy_version 27610 (0.0009) [2023-10-12 16:54:46,514][62634] Updated weights for policy 0, policy_version 27620 (0.0008) [2023-10-12 16:54:46,897][62634] Updated weights for policy 0, policy_version 27630 (0.0008) [2023-10-12 16:54:47,274][62634] Updated weights for policy 0, policy_version 27640 (0.0008) [2023-10-12 16:54:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56590336. Throughput: 0: 1670.7, 1: 1692.7. Samples: 14158364. 
Policy #0 lag: (min: 18.0, avg: 24.8, max: 50.0) [2023-10-12 16:54:48,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.180')] [2023-10-12 16:54:48,669][62635] Updated weights for policy 1, policy_version 27620 (0.0007) [2023-10-12 16:54:49,037][62635] Updated weights for policy 1, policy_version 27630 (0.0009) [2023-10-12 16:54:49,393][62635] Updated weights for policy 1, policy_version 27640 (0.0009) [2023-10-12 16:54:51,249][62634] Updated weights for policy 0, policy_version 27650 (0.0009) [2023-10-12 16:54:51,625][62634] Updated weights for policy 0, policy_version 27660 (0.0009) [2023-10-12 16:54:52,008][62634] Updated weights for policy 0, policy_version 27670 (0.0009) [2023-10-12 16:54:52,377][62634] Updated weights for policy 0, policy_version 27680 (0.0008) [2023-10-12 16:54:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56655872. Throughput: 0: 1701.6, 1: 1690.9. Samples: 14168762. Policy #0 lag: (min: 18.0, avg: 24.8, max: 50.0) [2023-10-12 16:54:53,435][61643] Avg episode reward: [(0, '3.960'), (1, '9.060')] [2023-10-12 16:54:53,472][62635] Updated weights for policy 1, policy_version 27650 (0.0008) [2023-10-12 16:54:53,869][62635] Updated weights for policy 1, policy_version 27660 (0.0007) [2023-10-12 16:54:54,236][62635] Updated weights for policy 1, policy_version 27670 (0.0007) [2023-10-12 16:54:54,601][62635] Updated weights for policy 1, policy_version 27680 (0.0009) [2023-10-12 16:54:56,380][62634] Updated weights for policy 0, policy_version 27690 (0.0011) [2023-10-12 16:54:56,755][62634] Updated weights for policy 0, policy_version 27700 (0.0007) [2023-10-12 16:54:57,130][62634] Updated weights for policy 0, policy_version 27710 (0.0008) [2023-10-12 16:54:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 56721408. Throughput: 0: 1678.5, 1: 1693.2. Samples: 14188616. 
Policy #0 lag: (min: 18.0, avg: 24.8, max: 50.0) [2023-10-12 16:54:58,436][61643] Avg episode reward: [(0, '3.970'), (1, '9.160')] [2023-10-12 16:54:58,738][62635] Updated weights for policy 1, policy_version 27690 (0.0008) [2023-10-12 16:54:59,110][62635] Updated weights for policy 1, policy_version 27700 (0.0011) [2023-10-12 16:54:59,483][62635] Updated weights for policy 1, policy_version 27710 (0.0007) [2023-10-12 16:55:01,253][62634] Updated weights for policy 0, policy_version 27720 (0.0008) [2023-10-12 16:55:01,628][62634] Updated weights for policy 0, policy_version 27730 (0.0007) [2023-10-12 16:55:02,002][62634] Updated weights for policy 0, policy_version 27740 (0.0008) [2023-10-12 16:55:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56786944. Throughput: 0: 1680.3, 1: 1693.4. Samples: 14209156. Policy #0 lag: (min: 18.0, avg: 24.8, max: 50.0) [2023-10-12 16:55:03,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.040')] [2023-10-12 16:55:03,470][62635] Updated weights for policy 1, policy_version 27720 (0.0010) [2023-10-12 16:55:03,843][62635] Updated weights for policy 1, policy_version 27730 (0.0009) [2023-10-12 16:55:04,216][62635] Updated weights for policy 1, policy_version 27740 (0.0008) [2023-10-12 16:55:06,055][62634] Updated weights for policy 0, policy_version 27750 (0.0008) [2023-10-12 16:55:06,428][62634] Updated weights for policy 0, policy_version 27760 (0.0009) [2023-10-12 16:55:06,806][62634] Updated weights for policy 0, policy_version 27770 (0.0009) [2023-10-12 16:55:08,372][62635] Updated weights for policy 1, policy_version 27750 (0.0009) [2023-10-12 16:55:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56852480. Throughput: 0: 1691.7, 1: 1692.8. Samples: 14219352. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:55:08,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.950')] [2023-10-12 16:55:08,739][62635] Updated weights for policy 1, policy_version 27760 (0.0008) [2023-10-12 16:55:09,120][62635] Updated weights for policy 1, policy_version 27770 (0.0009) [2023-10-12 16:55:10,886][62634] Updated weights for policy 0, policy_version 27780 (0.0007) [2023-10-12 16:55:11,285][62634] Updated weights for policy 0, policy_version 27790 (0.0007) [2023-10-12 16:55:11,664][62634] Updated weights for policy 0, policy_version 27800 (0.0008) [2023-10-12 16:55:13,234][62635] Updated weights for policy 1, policy_version 27780 (0.0008) [2023-10-12 16:55:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56918016. Throughput: 0: 1662.8, 1: 1692.4. Samples: 14238882. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:55:13,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.000')] [2023-10-12 16:55:13,599][62635] Updated weights for policy 1, policy_version 27790 (0.0008) [2023-10-12 16:55:13,980][62635] Updated weights for policy 1, policy_version 27800 (0.0011) [2023-10-12 16:55:15,835][62634] Updated weights for policy 0, policy_version 27810 (0.0008) [2023-10-12 16:55:16,217][62634] Updated weights for policy 0, policy_version 27820 (0.0011) [2023-10-12 16:55:16,585][62634] Updated weights for policy 0, policy_version 27830 (0.0011) [2023-10-12 16:55:16,966][62634] Updated weights for policy 0, policy_version 27840 (0.0010) [2023-10-12 16:55:17,886][62635] Updated weights for policy 1, policy_version 27810 (0.0009) [2023-10-12 16:55:18,267][62635] Updated weights for policy 1, policy_version 27820 (0.0010) [2023-10-12 16:55:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56983552. Throughput: 0: 1686.8, 1: 1688.8. Samples: 14259338. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:55:18,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.250')] [2023-10-12 16:55:18,632][62635] Updated weights for policy 1, policy_version 27830 (0.0008) [2023-10-12 16:55:18,997][62635] Updated weights for policy 1, policy_version 27840 (0.0008) [2023-10-12 16:55:21,234][62634] Updated weights for policy 0, policy_version 27850 (0.0007) [2023-10-12 16:55:21,609][62634] Updated weights for policy 0, policy_version 27860 (0.0007) [2023-10-12 16:55:21,996][62634] Updated weights for policy 0, policy_version 27870 (0.0007) [2023-10-12 16:55:23,091][62635] Updated weights for policy 1, policy_version 27850 (0.0008) [2023-10-12 16:55:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57049088. Throughput: 0: 1681.5, 1: 1690.9. Samples: 14269678. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-12 16:55:23,435][61643] Avg episode reward: [(0, '3.930'), (1, '8.980')] [2023-10-12 16:55:23,465][62635] Updated weights for policy 1, policy_version 27860 (0.0009) [2023-10-12 16:55:23,832][62635] Updated weights for policy 1, policy_version 27870 (0.0007) [2023-10-12 16:55:26,096][62634] Updated weights for policy 0, policy_version 27880 (0.0008) [2023-10-12 16:55:26,469][62634] Updated weights for policy 0, policy_version 27890 (0.0009) [2023-10-12 16:55:26,850][62634] Updated weights for policy 0, policy_version 27900 (0.0008) [2023-10-12 16:55:27,912][62635] Updated weights for policy 1, policy_version 27880 (0.0009) [2023-10-12 16:55:28,292][62635] Updated weights for policy 1, policy_version 27890 (0.0010) [2023-10-12 16:55:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57114624. Throughput: 0: 1661.7, 1: 1692.3. Samples: 14289272. 
Policy #0 lag: (min: 25.0, avg: 44.0, max: 57.0) [2023-10-12 16:55:28,435][61643] Avg episode reward: [(0, '3.880'), (1, '8.990')] [2023-10-12 16:55:28,660][62635] Updated weights for policy 1, policy_version 27900 (0.0007) [2023-10-12 16:55:30,784][62634] Updated weights for policy 0, policy_version 27910 (0.0008) [2023-10-12 16:55:31,167][62634] Updated weights for policy 0, policy_version 27920 (0.0008) [2023-10-12 16:55:31,536][62634] Updated weights for policy 0, policy_version 27930 (0.0008) [2023-10-12 16:55:32,740][62635] Updated weights for policy 1, policy_version 27910 (0.0007) [2023-10-12 16:55:33,105][62635] Updated weights for policy 1, policy_version 27920 (0.0007) [2023-10-12 16:55:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57180160. Throughput: 0: 1679.2, 1: 1675.9. Samples: 14309348. Policy #0 lag: (min: 25.0, avg: 44.0, max: 57.0) [2023-10-12 16:55:33,436][61643] Avg episode reward: [(0, '3.870'), (1, '9.250')] [2023-10-12 16:55:33,475][62635] Updated weights for policy 1, policy_version 27930 (0.0010) [2023-10-12 16:55:35,515][62634] Updated weights for policy 0, policy_version 27940 (0.0008) [2023-10-12 16:55:35,892][62634] Updated weights for policy 0, policy_version 27950 (0.0007) [2023-10-12 16:55:36,259][62634] Updated weights for policy 0, policy_version 27960 (0.0007) [2023-10-12 16:55:37,464][62635] Updated weights for policy 1, policy_version 27940 (0.0008) [2023-10-12 16:55:37,820][62635] Updated weights for policy 1, policy_version 27950 (0.0009) [2023-10-12 16:55:38,200][62635] Updated weights for policy 1, policy_version 27960 (0.0010) [2023-10-12 16:55:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 57245696. Throughput: 0: 1663.9, 1: 1686.1. Samples: 14319512. 
Policy #0 lag: (min: 25.0, avg: 44.0, max: 57.0) [2023-10-12 16:55:38,435][61643] Avg episode reward: [(0, '3.940'), (1, '9.190')] [2023-10-12 16:55:40,242][62634] Updated weights for policy 0, policy_version 27970 (0.0007) [2023-10-12 16:55:40,610][62634] Updated weights for policy 0, policy_version 27980 (0.0008) [2023-10-12 16:55:40,988][62634] Updated weights for policy 0, policy_version 27990 (0.0008) [2023-10-12 16:55:41,362][62634] Updated weights for policy 0, policy_version 28000 (0.0008) [2023-10-12 16:55:42,295][62635] Updated weights for policy 1, policy_version 27970 (0.0009) [2023-10-12 16:55:42,700][62635] Updated weights for policy 1, policy_version 27980 (0.0007) [2023-10-12 16:55:43,072][62635] Updated weights for policy 1, policy_version 27990 (0.0008) [2023-10-12 16:55:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 57311232. Throughput: 0: 1666.4, 1: 1688.8. Samples: 14339602. Policy #0 lag: (min: 25.0, avg: 44.0, max: 57.0) [2023-10-12 16:55:43,436][61643] Avg episode reward: [(0, '3.960'), (1, '9.130')] [2023-10-12 16:55:43,438][62635] Updated weights for policy 1, policy_version 28000 (0.0008) [2023-10-12 16:55:45,492][62634] Updated weights for policy 0, policy_version 28010 (0.0007) [2023-10-12 16:55:45,864][62634] Updated weights for policy 0, policy_version 28020 (0.0010) [2023-10-12 16:55:46,240][62634] Updated weights for policy 0, policy_version 28030 (0.0011) [2023-10-12 16:55:47,492][62635] Updated weights for policy 1, policy_version 28010 (0.0007) [2023-10-12 16:55:47,869][62635] Updated weights for policy 1, policy_version 28020 (0.0008) [2023-10-12 16:55:48,242][62635] Updated weights for policy 1, policy_version 28030 (0.0009) [2023-10-12 16:55:48,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57409536. Throughput: 0: 1675.2, 1: 1657.6. Samples: 14359134. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:55:48,435][61643] Avg episode reward: [(0, '4.000'), (1, '9.070')] [2023-10-12 16:55:50,370][62634] Updated weights for policy 0, policy_version 28040 (0.0008) [2023-10-12 16:55:50,737][62634] Updated weights for policy 0, policy_version 28050 (0.0008) [2023-10-12 16:55:51,119][62634] Updated weights for policy 0, policy_version 28060 (0.0007) [2023-10-12 16:55:52,358][62635] Updated weights for policy 1, policy_version 28040 (0.0009) [2023-10-12 16:55:52,720][62635] Updated weights for policy 1, policy_version 28050 (0.0010) [2023-10-12 16:55:53,102][62635] Updated weights for policy 1, policy_version 28060 (0.0007) [2023-10-12 16:55:53,435][61643] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57475072. Throughput: 0: 1658.1, 1: 1679.9. Samples: 14369562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:55:53,435][61643] Avg episode reward: [(0, '4.000'), (1, '8.930')] [2023-10-12 16:55:55,061][62634] Updated weights for policy 0, policy_version 28070 (0.0009) [2023-10-12 16:55:55,432][62634] Updated weights for policy 0, policy_version 28080 (0.0008) [2023-10-12 16:55:55,815][62634] Updated weights for policy 0, policy_version 28090 (0.0008) [2023-10-12 16:55:57,113][62635] Updated weights for policy 1, policy_version 28070 (0.0008) [2023-10-12 16:55:57,487][62635] Updated weights for policy 1, policy_version 28080 (0.0009) [2023-10-12 16:55:57,860][62635] Updated weights for policy 1, policy_version 28090 (0.0007) [2023-10-12 16:55:58,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57540608. Throughput: 0: 1678.6, 1: 1671.7. Samples: 14389646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:55:58,436][61643] Avg episode reward: [(0, '4.000'), (1, '8.970')] [2023-10-12 16:55:59,844][62634] Updated weights for policy 0, policy_version 28100 (0.0008) [2023-10-12 16:56:00,226][62634] Updated weights for policy 0, policy_version 28110 (0.0009) [2023-10-12 16:56:00,605][62634] Updated weights for policy 0, policy_version 28120 (0.0008) [2023-10-12 16:56:01,972][62635] Updated weights for policy 1, policy_version 28100 (0.0008) [2023-10-12 16:56:02,329][62635] Updated weights for policy 1, policy_version 28110 (0.0007) [2023-10-12 16:56:02,704][62635] Updated weights for policy 1, policy_version 28120 (0.0007) [2023-10-12 16:56:03,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57606144. Throughput: 0: 1678.3, 1: 1649.8. Samples: 14409104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:56:03,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.950')] [2023-10-12 16:56:04,706][62634] Updated weights for policy 0, policy_version 28130 (0.0008) [2023-10-12 16:56:05,082][62634] Updated weights for policy 0, policy_version 28140 (0.0009) [2023-10-12 16:56:05,459][62634] Updated weights for policy 0, policy_version 28150 (0.0007) [2023-10-12 16:56:05,843][62634] Updated weights for policy 0, policy_version 28160 (0.0008) [2023-10-12 16:56:06,908][62635] Updated weights for policy 1, policy_version 28130 (0.0007) [2023-10-12 16:56:07,274][62635] Updated weights for policy 1, policy_version 28140 (0.0008) [2023-10-12 16:56:07,636][62635] Updated weights for policy 1, policy_version 28150 (0.0008) [2023-10-12 16:56:08,014][62635] Updated weights for policy 1, policy_version 28160 (0.0009) [2023-10-12 16:56:08,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57671680. Throughput: 0: 1651.4, 1: 1675.1. Samples: 14419372. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) [2023-10-12 16:56:08,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.670')] [2023-10-12 16:56:09,892][62634] Updated weights for policy 0, policy_version 28170 (0.0008) [2023-10-12 16:56:10,274][62634] Updated weights for policy 0, policy_version 28180 (0.0009) [2023-10-12 16:56:10,666][62634] Updated weights for policy 0, policy_version 28190 (0.0009) [2023-10-12 16:56:12,174][62635] Updated weights for policy 1, policy_version 28170 (0.0007) [2023-10-12 16:56:12,539][62635] Updated weights for policy 1, policy_version 28180 (0.0008) [2023-10-12 16:56:12,904][62635] Updated weights for policy 1, policy_version 28190 (0.0008) [2023-10-12 16:56:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57737216. Throughput: 0: 1679.2, 1: 1665.3. Samples: 14439774. Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) [2023-10-12 16:56:13,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.560')] [2023-10-12 16:56:14,678][62634] Updated weights for policy 0, policy_version 28200 (0.0008) [2023-10-12 16:56:15,048][62634] Updated weights for policy 0, policy_version 28210 (0.0007) [2023-10-12 16:56:15,430][62634] Updated weights for policy 0, policy_version 28220 (0.0009) [2023-10-12 16:56:16,974][62635] Updated weights for policy 1, policy_version 28200 (0.0007) [2023-10-12 16:56:17,334][62635] Updated weights for policy 1, policy_version 28210 (0.0007) [2023-10-12 16:56:17,700][62635] Updated weights for policy 1, policy_version 28220 (0.0009) [2023-10-12 16:56:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57802752. Throughput: 0: 1683.5, 1: 1651.2. Samples: 14459410. Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) [2023-10-12 16:56:18,436][61643] Avg episode reward: [(0, '3.930'), (1, '8.580')] [2023-10-12 16:56:18,449][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000028224_28901376.pth... 
[2023-10-12 16:56:18,449][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000028224_28901376.pth... [2023-10-12 16:56:18,499][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000026656_27295744.pth [2023-10-12 16:56:18,500][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000026656_27295744.pth [2023-10-12 16:56:19,528][62634] Updated weights for policy 0, policy_version 28230 (0.0008) [2023-10-12 16:56:19,904][62634] Updated weights for policy 0, policy_version 28240 (0.0009) [2023-10-12 16:56:20,278][62634] Updated weights for policy 0, policy_version 28250 (0.0009) [2023-10-12 16:56:21,903][62635] Updated weights for policy 1, policy_version 28230 (0.0008) [2023-10-12 16:56:22,274][62635] Updated weights for policy 1, policy_version 28240 (0.0008) [2023-10-12 16:56:22,645][62635] Updated weights for policy 1, policy_version 28250 (0.0008) [2023-10-12 16:56:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57868288. Throughput: 0: 1668.3, 1: 1670.8. Samples: 14469772. Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) [2023-10-12 16:56:23,436][61643] Avg episode reward: [(0, '3.970'), (1, '8.490')] [2023-10-12 16:56:24,293][62634] Updated weights for policy 0, policy_version 28260 (0.0009) [2023-10-12 16:56:24,675][62634] Updated weights for policy 0, policy_version 28270 (0.0009) [2023-10-12 16:56:25,056][62634] Updated weights for policy 0, policy_version 28280 (0.0008) [2023-10-12 16:56:26,733][62635] Updated weights for policy 1, policy_version 28260 (0.0010) [2023-10-12 16:56:27,094][62635] Updated weights for policy 1, policy_version 28270 (0.0011) [2023-10-12 16:56:27,458][62635] Updated weights for policy 1, policy_version 28280 (0.0011) [2023-10-12 16:56:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57933824. Throughput: 0: 1684.4, 1: 1661.4. Samples: 14490164. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:56:28,436][61643] Avg episode reward: [(0, '3.980'), (1, '8.440')] [2023-10-12 16:56:29,036][62634] Updated weights for policy 0, policy_version 28290 (0.0009) [2023-10-12 16:56:29,418][62634] Updated weights for policy 0, policy_version 28300 (0.0008) [2023-10-12 16:56:29,783][62634] Updated weights for policy 0, policy_version 28310 (0.0007) [2023-10-12 16:56:30,164][62634] Updated weights for policy 0, policy_version 28320 (0.0007) [2023-10-12 16:56:31,508][62635] Updated weights for policy 1, policy_version 28290 (0.0010) [2023-10-12 16:56:31,900][62635] Updated weights for policy 1, policy_version 28300 (0.0008) [2023-10-12 16:56:32,275][62635] Updated weights for policy 1, policy_version 28310 (0.0009) [2023-10-12 16:56:32,646][62635] Updated weights for policy 1, policy_version 28320 (0.0010) [2023-10-12 16:56:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57999360. Throughput: 0: 1686.7, 1: 1666.1. Samples: 14510014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:56:33,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.760')] [2023-10-12 16:56:34,308][62634] Updated weights for policy 0, policy_version 28330 (0.0008) [2023-10-12 16:56:34,693][62634] Updated weights for policy 0, policy_version 28340 (0.0007) [2023-10-12 16:56:35,072][62634] Updated weights for policy 0, policy_version 28350 (0.0010) [2023-10-12 16:56:36,785][62635] Updated weights for policy 1, policy_version 28330 (0.0008) [2023-10-12 16:56:37,153][62635] Updated weights for policy 1, policy_version 28340 (0.0009) [2023-10-12 16:56:37,530][62635] Updated weights for policy 1, policy_version 28350 (0.0007) [2023-10-12 16:56:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 58064896. Throughput: 0: 1677.7, 1: 1671.4. Samples: 14520272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:56:38,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.760')] [2023-10-12 16:56:39,010][62634] Updated weights for policy 0, policy_version 28360 (0.0008) [2023-10-12 16:56:39,388][62634] Updated weights for policy 0, policy_version 28370 (0.0009) [2023-10-12 16:56:39,775][62634] Updated weights for policy 0, policy_version 28380 (0.0009) [2023-10-12 16:56:41,655][62635] Updated weights for policy 1, policy_version 28360 (0.0008) [2023-10-12 16:56:42,027][62635] Updated weights for policy 1, policy_version 28370 (0.0008) [2023-10-12 16:56:42,395][62635] Updated weights for policy 1, policy_version 28380 (0.0008) [2023-10-12 16:56:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 58130432. Throughput: 0: 1686.0, 1: 1661.1. Samples: 14540264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:56:43,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.760')] [2023-10-12 16:56:43,818][62634] Updated weights for policy 0, policy_version 28390 (0.0007) [2023-10-12 16:56:44,191][62634] Updated weights for policy 0, policy_version 28400 (0.0008) [2023-10-12 16:56:44,574][62634] Updated weights for policy 0, policy_version 28410 (0.0009) [2023-10-12 16:56:46,552][62635] Updated weights for policy 1, policy_version 28390 (0.0009) [2023-10-12 16:56:46,925][62635] Updated weights for policy 1, policy_version 28400 (0.0008) [2023-10-12 16:56:47,287][62635] Updated weights for policy 1, policy_version 28410 (0.0008) [2023-10-12 16:56:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58195968. Throughput: 0: 1690.6, 1: 1668.0. Samples: 14560242. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 16:56:48,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.020')] [2023-10-12 16:56:48,654][62634] Updated weights for policy 0, policy_version 28420 (0.0010) [2023-10-12 16:56:49,047][62634] Updated weights for policy 0, policy_version 28430 (0.0009) [2023-10-12 16:56:49,426][62634] Updated weights for policy 0, policy_version 28440 (0.0009) [2023-10-12 16:56:51,291][62635] Updated weights for policy 1, policy_version 28420 (0.0009) [2023-10-12 16:56:51,668][62635] Updated weights for policy 1, policy_version 28430 (0.0008) [2023-10-12 16:56:52,033][62635] Updated weights for policy 1, policy_version 28440 (0.0007) [2023-10-12 16:56:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58261504. Throughput: 0: 1688.3, 1: 1671.0. Samples: 14570542. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:56:53,435][61643] Avg episode reward: [(0, '3.960'), (1, '8.870')] [2023-10-12 16:56:53,536][62634] Updated weights for policy 0, policy_version 28450 (0.0008) [2023-10-12 16:56:53,907][62634] Updated weights for policy 0, policy_version 28460 (0.0008) [2023-10-12 16:56:54,287][62634] Updated weights for policy 0, policy_version 28470 (0.0008) [2023-10-12 16:56:54,665][62634] Updated weights for policy 0, policy_version 28480 (0.0007) [2023-10-12 16:56:56,144][62635] Updated weights for policy 1, policy_version 28450 (0.0010) [2023-10-12 16:56:56,521][62635] Updated weights for policy 1, policy_version 28460 (0.0010) [2023-10-12 16:56:56,891][62635] Updated weights for policy 1, policy_version 28470 (0.0009) [2023-10-12 16:56:57,258][62635] Updated weights for policy 1, policy_version 28480 (0.0008) [2023-10-12 16:56:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 58327040. Throughput: 0: 1687.7, 1: 1653.6. Samples: 14590136. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:56:58,436][61643] Avg episode reward: [(0, '3.900'), (1, '9.100')] [2023-10-12 16:56:58,604][62634] Updated weights for policy 0, policy_version 28490 (0.0008) [2023-10-12 16:56:58,974][62634] Updated weights for policy 0, policy_version 28500 (0.0009) [2023-10-12 16:56:59,356][62634] Updated weights for policy 0, policy_version 28510 (0.0010) [2023-10-12 16:57:01,294][62635] Updated weights for policy 1, policy_version 28490 (0.0009) [2023-10-12 16:57:01,666][62635] Updated weights for policy 1, policy_version 28500 (0.0008) [2023-10-12 16:57:02,029][62635] Updated weights for policy 1, policy_version 28510 (0.0007) [2023-10-12 16:57:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58392576. Throughput: 0: 1686.0, 1: 1673.4. Samples: 14610580. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:57:03,435][61643] Avg episode reward: [(0, '3.890'), (1, '9.220')] [2023-10-12 16:57:03,441][62634] Updated weights for policy 0, policy_version 28520 (0.0008) [2023-10-12 16:57:03,823][62634] Updated weights for policy 0, policy_version 28530 (0.0007) [2023-10-12 16:57:04,198][62634] Updated weights for policy 0, policy_version 28540 (0.0007) [2023-10-12 16:57:05,930][62635] Updated weights for policy 1, policy_version 28520 (0.0008) [2023-10-12 16:57:06,292][62635] Updated weights for policy 1, policy_version 28530 (0.0009) [2023-10-12 16:57:06,666][62635] Updated weights for policy 1, policy_version 28540 (0.0009) [2023-10-12 16:57:08,164][62634] Updated weights for policy 0, policy_version 28550 (0.0009) [2023-10-12 16:57:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 58458112. Throughput: 0: 1684.7, 1: 1666.2. Samples: 14620560. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:57:08,436][61643] Avg episode reward: [(0, '3.850'), (1, '9.130')] [2023-10-12 16:57:08,539][62634] Updated weights for policy 0, policy_version 28560 (0.0010) [2023-10-12 16:57:08,917][62634] Updated weights for policy 0, policy_version 28570 (0.0008) [2023-10-12 16:57:10,637][62635] Updated weights for policy 1, policy_version 28550 (0.0007) [2023-10-12 16:57:11,000][62635] Updated weights for policy 1, policy_version 28560 (0.0010) [2023-10-12 16:57:11,372][62635] Updated weights for policy 1, policy_version 28570 (0.0008) [2023-10-12 16:57:13,052][62634] Updated weights for policy 0, policy_version 28580 (0.0009) [2023-10-12 16:57:13,434][62634] Updated weights for policy 0, policy_version 28590 (0.0011) [2023-10-12 16:57:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58523648. Throughput: 0: 1686.0, 1: 1653.3. Samples: 14640428. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-12 16:57:13,436][61643] Avg episode reward: [(0, '3.800'), (1, '9.010')] [2023-10-12 16:57:13,798][62634] Updated weights for policy 0, policy_version 28600 (0.0007) [2023-10-12 16:57:15,504][62635] Updated weights for policy 1, policy_version 28580 (0.0009) [2023-10-12 16:57:15,867][62635] Updated weights for policy 1, policy_version 28590 (0.0007) [2023-10-12 16:57:16,244][62635] Updated weights for policy 1, policy_version 28600 (0.0010) [2023-10-12 16:57:17,736][62634] Updated weights for policy 0, policy_version 28610 (0.0007) [2023-10-12 16:57:18,106][62634] Updated weights for policy 0, policy_version 28620 (0.0008) [2023-10-12 16:57:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58589184. Throughput: 0: 1677.9, 1: 1671.2. Samples: 14660724. 
Policy #0 lag: (min: 17.0, avg: 28.9, max: 49.0) [2023-10-12 16:57:18,436][61643] Avg episode reward: [(0, '3.820'), (1, '9.320')] [2023-10-12 16:57:18,488][62634] Updated weights for policy 0, policy_version 28630 (0.0008) [2023-10-12 16:57:18,857][62634] Updated weights for policy 0, policy_version 28640 (0.0008) [2023-10-12 16:57:20,511][62635] Updated weights for policy 1, policy_version 28610 (0.0010) [2023-10-12 16:57:20,927][62635] Updated weights for policy 1, policy_version 28620 (0.0009) [2023-10-12 16:57:21,294][62635] Updated weights for policy 1, policy_version 28630 (0.0010) [2023-10-12 16:57:21,666][62635] Updated weights for policy 1, policy_version 28640 (0.0011) [2023-10-12 16:57:23,016][62634] Updated weights for policy 0, policy_version 28650 (0.0007) [2023-10-12 16:57:23,398][62634] Updated weights for policy 0, policy_version 28660 (0.0009) [2023-10-12 16:57:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58654720. Throughput: 0: 1685.5, 1: 1653.7. Samples: 14670536. Policy #0 lag: (min: 17.0, avg: 28.9, max: 49.0) [2023-10-12 16:57:23,435][61643] Avg episode reward: [(0, '3.850'), (1, '9.150')] [2023-10-12 16:57:23,763][62634] Updated weights for policy 0, policy_version 28670 (0.0010) [2023-10-12 16:57:25,853][62635] Updated weights for policy 1, policy_version 28650 (0.0007) [2023-10-12 16:57:26,222][62635] Updated weights for policy 1, policy_version 28660 (0.0007) [2023-10-12 16:57:26,598][62635] Updated weights for policy 1, policy_version 28670 (0.0011) [2023-10-12 16:57:27,973][62634] Updated weights for policy 0, policy_version 28680 (0.0010) [2023-10-12 16:57:28,349][62634] Updated weights for policy 0, policy_version 28690 (0.0008) [2023-10-12 16:57:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58720256. Throughput: 0: 1686.0, 1: 1652.9. Samples: 14690512. 
Policy #0 lag: (min: 17.0, avg: 28.9, max: 49.0) [2023-10-12 16:57:28,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.040')] [2023-10-12 16:57:28,729][62634] Updated weights for policy 0, policy_version 28700 (0.0007) [2023-10-12 16:57:30,766][62635] Updated weights for policy 1, policy_version 28680 (0.0009) [2023-10-12 16:57:31,137][62635] Updated weights for policy 1, policy_version 28690 (0.0008) [2023-10-12 16:57:31,506][62635] Updated weights for policy 1, policy_version 28700 (0.0009) [2023-10-12 16:57:32,838][62634] Updated weights for policy 0, policy_version 28710 (0.0007) [2023-10-12 16:57:33,213][62634] Updated weights for policy 0, policy_version 28720 (0.0007) [2023-10-12 16:57:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58785792. Throughput: 0: 1675.6, 1: 1671.4. Samples: 14710860. Policy #0 lag: (min: 17.0, avg: 28.9, max: 49.0) [2023-10-12 16:57:33,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.160')] [2023-10-12 16:57:33,593][62634] Updated weights for policy 0, policy_version 28730 (0.0008) [2023-10-12 16:57:35,658][62635] Updated weights for policy 1, policy_version 28710 (0.0008) [2023-10-12 16:57:36,030][62635] Updated weights for policy 1, policy_version 28720 (0.0009) [2023-10-12 16:57:36,392][62635] Updated weights for policy 1, policy_version 28730 (0.0008) [2023-10-12 16:57:37,717][62634] Updated weights for policy 0, policy_version 28740 (0.0008) [2023-10-12 16:57:38,112][62634] Updated weights for policy 0, policy_version 28750 (0.0010) [2023-10-12 16:57:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58851328. Throughput: 0: 1686.8, 1: 1652.5. Samples: 14720812. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:57:38,436][61643] Avg episode reward: [(0, '3.940'), (1, '8.770')] [2023-10-12 16:57:38,489][62634] Updated weights for policy 0, policy_version 28760 (0.0009) [2023-10-12 16:57:40,487][62635] Updated weights for policy 1, policy_version 28740 (0.0008) [2023-10-12 16:57:40,849][62635] Updated weights for policy 1, policy_version 28750 (0.0008) [2023-10-12 16:57:41,226][62635] Updated weights for policy 1, policy_version 28760 (0.0010) [2023-10-12 16:57:42,439][62634] Updated weights for policy 0, policy_version 28770 (0.0009) [2023-10-12 16:57:42,807][62634] Updated weights for policy 0, policy_version 28780 (0.0007) [2023-10-12 16:57:43,177][62634] Updated weights for policy 0, policy_version 28790 (0.0008) [2023-10-12 16:57:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58916864. Throughput: 0: 1688.5, 1: 1663.0. Samples: 14740954. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:57:43,435][61643] Avg episode reward: [(0, '3.940'), (1, '8.840')] [2023-10-12 16:57:43,553][62634] Updated weights for policy 0, policy_version 28800 (0.0008) [2023-10-12 16:57:45,182][62635] Updated weights for policy 1, policy_version 28770 (0.0009) [2023-10-12 16:57:45,558][62635] Updated weights for policy 1, policy_version 28780 (0.0010) [2023-10-12 16:57:45,928][62635] Updated weights for policy 1, policy_version 28790 (0.0007) [2023-10-12 16:57:46,296][62635] Updated weights for policy 1, policy_version 28800 (0.0008) [2023-10-12 16:57:47,540][62634] Updated weights for policy 0, policy_version 28810 (0.0008) [2023-10-12 16:57:47,916][62634] Updated weights for policy 0, policy_version 28820 (0.0011) [2023-10-12 16:57:48,292][62634] Updated weights for policy 0, policy_version 28830 (0.0009) [2023-10-12 16:57:48,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59015168. Throughput: 0: 1667.7, 1: 1670.7. 
Samples: 14760810. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:57:48,435][61643] Avg episode reward: [(0, '3.950'), (1, '9.070')] [2023-10-12 16:57:50,394][62635] Updated weights for policy 1, policy_version 28810 (0.0007) [2023-10-12 16:57:50,769][62635] Updated weights for policy 1, policy_version 28820 (0.0007) [2023-10-12 16:57:51,141][62635] Updated weights for policy 1, policy_version 28830 (0.0007) [2023-10-12 16:57:52,223][62634] Updated weights for policy 0, policy_version 28840 (0.0008) [2023-10-12 16:57:52,605][62634] Updated weights for policy 0, policy_version 28850 (0.0009) [2023-10-12 16:57:52,984][62634] Updated weights for policy 0, policy_version 28860 (0.0007) [2023-10-12 16:57:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59080704. Throughput: 0: 1687.9, 1: 1657.7. Samples: 14771108. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 16:57:53,435][61643] Avg episode reward: [(0, '3.950'), (1, '8.680')] [2023-10-12 16:57:55,082][62635] Updated weights for policy 1, policy_version 28840 (0.0008) [2023-10-12 16:57:55,458][62635] Updated weights for policy 1, policy_version 28850 (0.0009) [2023-10-12 16:57:55,815][62635] Updated weights for policy 1, policy_version 28860 (0.0008) [2023-10-12 16:57:57,024][62634] Updated weights for policy 0, policy_version 28870 (0.0009) [2023-10-12 16:57:57,393][62634] Updated weights for policy 0, policy_version 28880 (0.0008) [2023-10-12 16:57:57,779][62634] Updated weights for policy 0, policy_version 28890 (0.0007) [2023-10-12 16:57:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59146240. Throughput: 0: 1685.1, 1: 1676.8. Samples: 14791714. 
Policy #0 lag: (min: 30.0, avg: 42.3, max: 62.0) [2023-10-12 16:57:58,436][61643] Avg episode reward: [(0, '3.900'), (1, '8.740')] [2023-10-12 16:57:59,864][62635] Updated weights for policy 1, policy_version 28870 (0.0007) [2023-10-12 16:58:00,241][62635] Updated weights for policy 1, policy_version 28880 (0.0007) [2023-10-12 16:58:00,601][62635] Updated weights for policy 1, policy_version 28890 (0.0007) [2023-10-12 16:58:01,801][62634] Updated weights for policy 0, policy_version 28900 (0.0007) [2023-10-12 16:58:02,175][62634] Updated weights for policy 0, policy_version 28910 (0.0010) [2023-10-12 16:58:02,558][62634] Updated weights for policy 0, policy_version 28920 (0.0011) [2023-10-12 16:58:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59211776. Throughput: 0: 1663.7, 1: 1684.6. Samples: 14811400. Policy #0 lag: (min: 30.0, avg: 42.3, max: 62.0) [2023-10-12 16:58:03,435][61643] Avg episode reward: [(0, '3.900'), (1, '9.190')] [2023-10-12 16:58:04,545][62635] Updated weights for policy 1, policy_version 28900 (0.0009) [2023-10-12 16:58:04,911][62635] Updated weights for policy 1, policy_version 28910 (0.0008) [2023-10-12 16:58:05,286][62635] Updated weights for policy 1, policy_version 28920 (0.0010) [2023-10-12 16:58:06,651][62634] Updated weights for policy 0, policy_version 28930 (0.0010) [2023-10-12 16:58:07,026][62634] Updated weights for policy 0, policy_version 28940 (0.0009) [2023-10-12 16:58:07,395][62634] Updated weights for policy 0, policy_version 28950 (0.0008) [2023-10-12 16:58:07,773][62634] Updated weights for policy 0, policy_version 28960 (0.0009) [2023-10-12 16:58:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59277312. Throughput: 0: 1687.9, 1: 1673.6. Samples: 14821806. 
Policy #0 lag: (min: 30.0, avg: 42.3, max: 62.0) [2023-10-12 16:58:08,436][61643] Avg episode reward: [(0, '3.880'), (1, '8.920')] [2023-10-12 16:58:09,564][62635] Updated weights for policy 1, policy_version 28930 (0.0009) [2023-10-12 16:58:09,978][62635] Updated weights for policy 1, policy_version 28940 (0.0008) [2023-10-12 16:58:10,335][62635] Updated weights for policy 1, policy_version 28950 (0.0010) [2023-10-12 16:58:10,709][62635] Updated weights for policy 1, policy_version 28960 (0.0009) [2023-10-12 16:58:11,835][62634] Updated weights for policy 0, policy_version 28970 (0.0007) [2023-10-12 16:58:12,220][62634] Updated weights for policy 0, policy_version 28980 (0.0009) [2023-10-12 16:58:12,592][62634] Updated weights for policy 0, policy_version 28990 (0.0007) [2023-10-12 16:58:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59342848. Throughput: 0: 1674.1, 1: 1686.1. Samples: 14841722. Policy #0 lag: (min: 30.0, avg: 42.3, max: 62.0) [2023-10-12 16:58:13,435][61643] Avg episode reward: [(0, '3.870'), (1, '8.970')] [2023-10-12 16:58:14,865][62635] Updated weights for policy 1, policy_version 28970 (0.0008) [2023-10-12 16:58:15,232][62635] Updated weights for policy 1, policy_version 28980 (0.0007) [2023-10-12 16:58:15,594][62635] Updated weights for policy 1, policy_version 28990 (0.0007) [2023-10-12 16:58:16,555][62634] Updated weights for policy 0, policy_version 29000 (0.0008) [2023-10-12 16:58:16,933][62634] Updated weights for policy 0, policy_version 29010 (0.0007) [2023-10-12 16:58:17,313][62634] Updated weights for policy 0, policy_version 29020 (0.0007) [2023-10-12 16:58:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59408384. Throughput: 0: 1669.0, 1: 1684.5. Samples: 14861768. 
Policy #0 lag: (min: 30.0, avg: 42.3, max: 62.0) [2023-10-12 16:58:18,435][61643] Avg episode reward: [(0, '3.830'), (1, '9.240')] [2023-10-12 16:58:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000029024_29720576.pth... [2023-10-12 16:58:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000028992_29687808.pth... [2023-10-12 16:58:18,474][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000027456_28114944.pth [2023-10-12 16:58:18,477][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000027424_28082176.pth [2023-10-12 16:58:19,712][62635] Updated weights for policy 1, policy_version 29000 (0.0009) [2023-10-12 16:58:20,084][62635] Updated weights for policy 1, policy_version 29010 (0.0008) [2023-10-12 16:58:20,447][62635] Updated weights for policy 1, policy_version 29020 (0.0008) [2023-10-12 16:58:21,388][62634] Updated weights for policy 0, policy_version 29030 (0.0009) [2023-10-12 16:58:21,763][62634] Updated weights for policy 0, policy_version 29040 (0.0009) [2023-10-12 16:58:22,138][62634] Updated weights for policy 0, policy_version 29050 (0.0007) [2023-10-12 16:58:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59473920. Throughput: 0: 1692.3, 1: 1669.9. Samples: 14872110. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:23,436][61643] Avg episode reward: [(0, '3.770'), (1, '8.900')] [2023-10-12 16:58:24,495][62635] Updated weights for policy 1, policy_version 29030 (0.0007) [2023-10-12 16:58:24,858][62635] Updated weights for policy 1, policy_version 29040 (0.0007) [2023-10-12 16:58:25,224][62635] Updated weights for policy 1, policy_version 29050 (0.0010) [2023-10-12 16:58:26,261][62634] Updated weights for policy 0, policy_version 29060 (0.0007) [2023-10-12 16:58:26,649][62634] Updated weights for policy 0, policy_version 29070 (0.0007) [2023-10-12 16:58:27,025][62634] Updated weights for policy 0, policy_version 29080 (0.0008) [2023-10-12 16:58:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59539456. Throughput: 0: 1669.0, 1: 1688.8. Samples: 14892052. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:28,435][61643] Avg episode reward: [(0, '3.820'), (1, '8.950')] [2023-10-12 16:58:29,117][62635] Updated weights for policy 1, policy_version 29060 (0.0009) [2023-10-12 16:58:29,487][62635] Updated weights for policy 1, policy_version 29070 (0.0007) [2023-10-12 16:58:29,847][62635] Updated weights for policy 1, policy_version 29080 (0.0007) [2023-10-12 16:58:31,005][62634] Updated weights for policy 0, policy_version 29090 (0.0007) [2023-10-12 16:58:31,382][62634] Updated weights for policy 0, policy_version 29100 (0.0009) [2023-10-12 16:58:31,754][62634] Updated weights for policy 0, policy_version 29110 (0.0009) [2023-10-12 16:58:32,131][62634] Updated weights for policy 0, policy_version 29120 (0.0009) [2023-10-12 16:58:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59604992. Throughput: 0: 1681.2, 1: 1689.5. Samples: 14912490. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:33,435][61643] Avg episode reward: [(0, '3.840'), (1, '9.090')] [2023-10-12 16:58:33,866][62635] Updated weights for policy 1, policy_version 29090 (0.0007) [2023-10-12 16:58:34,232][62635] Updated weights for policy 1, policy_version 29100 (0.0007) [2023-10-12 16:58:34,595][62635] Updated weights for policy 1, policy_version 29110 (0.0008) [2023-10-12 16:58:34,964][62635] Updated weights for policy 1, policy_version 29120 (0.0007) [2023-10-12 16:58:36,241][62634] Updated weights for policy 0, policy_version 29130 (0.0007) [2023-10-12 16:58:36,614][62634] Updated weights for policy 0, policy_version 29140 (0.0007) [2023-10-12 16:58:36,989][62634] Updated weights for policy 0, policy_version 29150 (0.0008) [2023-10-12 16:58:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59670528. Throughput: 0: 1691.3, 1: 1682.7. Samples: 14922938. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:38,435][61643] Avg episode reward: [(0, '3.930'), (1, '9.010')] [2023-10-12 16:58:38,927][62635] Updated weights for policy 1, policy_version 29130 (0.0008) [2023-10-12 16:58:39,299][62635] Updated weights for policy 1, policy_version 29140 (0.0009) [2023-10-12 16:58:39,667][62635] Updated weights for policy 1, policy_version 29150 (0.0011) [2023-10-12 16:58:41,091][62634] Updated weights for policy 0, policy_version 29160 (0.0007) [2023-10-12 16:58:41,477][62634] Updated weights for policy 0, policy_version 29170 (0.0007) [2023-10-12 16:58:41,855][62634] Updated weights for policy 0, policy_version 29180 (0.0009) [2023-10-12 16:58:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59736064. Throughput: 0: 1666.3, 1: 1693.2. Samples: 14942888. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:43,436][61643] Avg episode reward: [(0, '3.970'), (1, '9.100')] [2023-10-12 16:58:43,508][62635] Updated weights for policy 1, policy_version 29160 (0.0008) [2023-10-12 16:58:43,879][62635] Updated weights for policy 1, policy_version 29170 (0.0007) [2023-10-12 16:58:44,251][62635] Updated weights for policy 1, policy_version 29180 (0.0008) [2023-10-12 16:58:45,813][62634] Updated weights for policy 0, policy_version 29190 (0.0008) [2023-10-12 16:58:46,195][62634] Updated weights for policy 0, policy_version 29200 (0.0007) [2023-10-12 16:58:46,570][62634] Updated weights for policy 0, policy_version 29210 (0.0009) [2023-10-12 16:58:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 59801600. Throughput: 0: 1689.4, 1: 1690.3. Samples: 14963484. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:48,435][61643] Avg episode reward: [(0, '3.970'), (1, '9.180')] [2023-10-12 16:58:48,470][62635] Updated weights for policy 1, policy_version 29190 (0.0008) [2023-10-12 16:58:48,833][62635] Updated weights for policy 1, policy_version 29200 (0.0007) [2023-10-12 16:58:49,199][62635] Updated weights for policy 1, policy_version 29210 (0.0010) [2023-10-12 16:58:50,667][62634] Updated weights for policy 0, policy_version 29220 (0.0008) [2023-10-12 16:58:51,045][62634] Updated weights for policy 0, policy_version 29230 (0.0009) [2023-10-12 16:58:51,437][62634] Updated weights for policy 0, policy_version 29240 (0.0011) [2023-10-12 16:58:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 59867136. Throughput: 0: 1677.7, 1: 1687.7. Samples: 14973252. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:53,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.850')] [2023-10-12 16:58:53,471][62635] Updated weights for policy 1, policy_version 29220 (0.0010) [2023-10-12 16:58:53,837][62635] Updated weights for policy 1, policy_version 29230 (0.0008) [2023-10-12 16:58:54,205][62635] Updated weights for policy 1, policy_version 29240 (0.0009) [2023-10-12 16:58:55,262][62634] Updated weights for policy 0, policy_version 29250 (0.0007) [2023-10-12 16:58:55,643][62634] Updated weights for policy 0, policy_version 29260 (0.0008) [2023-10-12 16:58:56,011][62634] Updated weights for policy 0, policy_version 29270 (0.0009) [2023-10-12 16:58:56,392][62634] Updated weights for policy 0, policy_version 29280 (0.0010) [2023-10-12 16:58:58,267][62635] Updated weights for policy 1, policy_version 29250 (0.0007) [2023-10-12 16:58:58,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 59932672. Throughput: 0: 1672.2, 1: 1689.1. Samples: 14992980. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:58:58,436][61643] Avg episode reward: [(0, '3.950'), (1, '8.970')] [2023-10-12 16:58:58,672][62635] Updated weights for policy 1, policy_version 29260 (0.0009) [2023-10-12 16:58:59,040][62635] Updated weights for policy 1, policy_version 29270 (0.0010) [2023-10-12 16:58:59,417][62635] Updated weights for policy 1, policy_version 29280 (0.0007) [2023-10-12 16:59:00,617][62634] Updated weights for policy 0, policy_version 29290 (0.0008) [2023-10-12 16:59:00,991][62634] Updated weights for policy 0, policy_version 29300 (0.0007) [2023-10-12 16:59:01,375][62634] Updated weights for policy 0, policy_version 29310 (0.0007) [2023-10-12 16:59:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 59998208. Throughput: 0: 1688.4, 1: 1684.8. Samples: 15013560. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:59:03,435][61643] Avg episode reward: [(0, '3.980'), (1, '9.340')] [2023-10-12 16:59:03,488][62635] Updated weights for policy 1, policy_version 29290 (0.0011) [2023-10-12 16:59:03,864][62635] Updated weights for policy 1, policy_version 29300 (0.0007) [2023-10-12 16:59:04,227][62635] Updated weights for policy 1, policy_version 29310 (0.0008) [2023-10-12 16:59:04,297][62495] Saving new best policy, reward=9.340! [2023-10-12 16:59:05,268][62634] Updated weights for policy 0, policy_version 29320 (0.0008) [2023-10-12 16:59:05,645][62634] Updated weights for policy 0, policy_version 29330 (0.0008) [2023-10-12 16:59:06,031][62634] Updated weights for policy 0, policy_version 29340 (0.0008) [2023-10-12 16:59:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60063744. Throughput: 0: 1665.5, 1: 1685.2. Samples: 15022892. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 16:59:08,436][61643] Avg episode reward: [(0, '3.980'), (1, '8.960')] [2023-10-12 16:59:08,574][62635] Updated weights for policy 1, policy_version 29320 (0.0010) [2023-10-12 16:59:08,946][62635] Updated weights for policy 1, policy_version 29330 (0.0009) [2023-10-12 16:59:09,309][62635] Updated weights for policy 1, policy_version 29340 (0.0010) [2023-10-12 16:59:10,029][62634] Updated weights for policy 0, policy_version 29350 (0.0009) [2023-10-12 16:59:10,417][62634] Updated weights for policy 0, policy_version 29360 (0.0011) [2023-10-12 16:59:10,782][62634] Updated weights for policy 0, policy_version 29370 (0.0008) [2023-10-12 16:59:13,328][62635] Updated weights for policy 1, policy_version 29350 (0.0009) [2023-10-12 16:59:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 60129280. Throughput: 0: 1683.6, 1: 1675.4. Samples: 15043208. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 16:59:13,435][61643] Avg episode reward: [(0, '3.990'), (1, '8.910')] [2023-10-12 16:59:13,694][62635] Updated weights for policy 1, policy_version 29360 (0.0007) [2023-10-12 16:59:14,077][62635] Updated weights for policy 1, policy_version 29370 (0.0009) [2023-10-12 16:59:14,970][62634] Updated weights for policy 0, policy_version 29380 (0.0008) [2023-10-12 16:59:15,356][62634] Updated weights for policy 0, policy_version 29390 (0.0007) [2023-10-12 16:59:15,731][62634] Updated weights for policy 0, policy_version 29400 (0.0008) [2023-10-12 16:59:17,984][62635] Updated weights for policy 1, policy_version 29380 (0.0008) [2023-10-12 16:59:18,357][62635] Updated weights for policy 1, policy_version 29390 (0.0008) [2023-10-12 16:59:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 60194816. Throughput: 0: 1689.6, 1: 1671.1. Samples: 15063726. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 16:59:18,436][61643] Avg episode reward: [(0, '4.000'), (1, '9.010')] [2023-10-12 16:59:18,718][62635] Updated weights for policy 1, policy_version 29400 (0.0008) [2023-10-12 16:59:19,652][62634] Updated weights for policy 0, policy_version 29410 (0.0008) [2023-10-12 16:59:20,027][62634] Updated weights for policy 0, policy_version 29420 (0.0007) [2023-10-12 16:59:20,408][62634] Updated weights for policy 0, policy_version 29430 (0.0008) [2023-10-12 16:59:20,789][62634] Updated weights for policy 0, policy_version 29440 (0.0007) [2023-10-12 16:59:22,883][62635] Updated weights for policy 1, policy_version 29410 (0.0009) [2023-10-12 16:59:23,257][62635] Updated weights for policy 1, policy_version 29420 (0.0007) [2023-10-12 16:59:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60260352. Throughput: 0: 1660.1, 1: 1677.6. Samples: 15073134. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 16:59:23,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.030')] [2023-10-12 16:59:23,631][62635] Updated weights for policy 1, policy_version 29430 (0.0009) [2023-10-12 16:59:23,986][62635] Updated weights for policy 1, policy_version 29440 (0.0008) [2023-10-12 16:59:24,847][62634] Updated weights for policy 0, policy_version 29450 (0.0007) [2023-10-12 16:59:25,223][62634] Updated weights for policy 0, policy_version 29460 (0.0007) [2023-10-12 16:59:25,599][62634] Updated weights for policy 0, policy_version 29470 (0.0007) [2023-10-12 16:59:27,963][62635] Updated weights for policy 1, policy_version 29450 (0.0007) [2023-10-12 16:59:28,336][62635] Updated weights for policy 1, policy_version 29460 (0.0009) [2023-10-12 16:59:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60325888. Throughput: 0: 1688.4, 1: 1667.6. Samples: 15093908. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 16:59:28,435][61643] Avg episode reward: [(0, '4.000'), (1, '9.140')] [2023-10-12 16:59:28,698][62635] Updated weights for policy 1, policy_version 29470 (0.0009) [2023-10-12 16:59:29,523][62634] Updated weights for policy 0, policy_version 29480 (0.0007) [2023-10-12 16:59:29,896][62634] Updated weights for policy 0, policy_version 29490 (0.0010) [2023-10-12 16:59:30,270][62634] Updated weights for policy 0, policy_version 29500 (0.0009) [2023-10-12 16:59:32,794][62635] Updated weights for policy 1, policy_version 29480 (0.0007) [2023-10-12 16:59:33,169][62635] Updated weights for policy 1, policy_version 29490 (0.0009) [2023-10-12 16:59:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60391424. Throughput: 0: 1695.5, 1: 1654.6. Samples: 15114238. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 16:59:33,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.030')] [2023-10-12 16:59:33,539][62635] Updated weights for policy 1, policy_version 29500 (0.0008) [2023-10-12 16:59:34,310][62634] Updated weights for policy 0, policy_version 29510 (0.0009) [2023-10-12 16:59:34,690][62634] Updated weights for policy 0, policy_version 29520 (0.0007) [2023-10-12 16:59:35,066][62634] Updated weights for policy 0, policy_version 29530 (0.0008) [2023-10-12 16:59:37,525][62635] Updated weights for policy 1, policy_version 29510 (0.0007) [2023-10-12 16:59:37,890][62635] Updated weights for policy 1, policy_version 29520 (0.0007) [2023-10-12 16:59:38,254][62635] Updated weights for policy 1, policy_version 29530 (0.0009) [2023-10-12 16:59:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60456960. Throughput: 0: 1679.0, 1: 1672.8. Samples: 15124082. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:59:38,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.040')] [2023-10-12 16:59:39,115][62634] Updated weights for policy 0, policy_version 29540 (0.0008) [2023-10-12 16:59:39,486][62634] Updated weights for policy 0, policy_version 29550 (0.0008) [2023-10-12 16:59:39,863][62634] Updated weights for policy 0, policy_version 29560 (0.0009) [2023-10-12 16:59:42,502][62635] Updated weights for policy 1, policy_version 29540 (0.0009) [2023-10-12 16:59:42,871][62635] Updated weights for policy 1, policy_version 29550 (0.0007) [2023-10-12 16:59:43,248][62635] Updated weights for policy 1, policy_version 29560 (0.0007) [2023-10-12 16:59:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60522496. Throughput: 0: 1696.1, 1: 1674.5. Samples: 15144654. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:59:43,435][61643] Avg episode reward: [(0, '3.990'), (1, '9.080')] [2023-10-12 16:59:43,880][62634] Updated weights for policy 0, policy_version 29570 (0.0011) [2023-10-12 16:59:44,251][62634] Updated weights for policy 0, policy_version 29580 (0.0010) [2023-10-12 16:59:44,637][62634] Updated weights for policy 0, policy_version 29590 (0.0008) [2023-10-12 16:59:45,009][62634] Updated weights for policy 0, policy_version 29600 (0.0009) [2023-10-12 16:59:47,342][62635] Updated weights for policy 1, policy_version 29570 (0.0008) [2023-10-12 16:59:47,746][62635] Updated weights for policy 1, policy_version 29580 (0.0009) [2023-10-12 16:59:48,106][62635] Updated weights for policy 1, policy_version 29590 (0.0008) [2023-10-12 16:59:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60588032. Throughput: 0: 1698.0, 1: 1660.6. Samples: 15164696. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:59:48,435][61643] Avg episode reward: [(0, '4.000'), (1, '9.150')] [2023-10-12 16:59:48,472][62635] Updated weights for policy 1, policy_version 29600 (0.0009) [2023-10-12 16:59:49,085][62634] Updated weights for policy 0, policy_version 29610 (0.0011) [2023-10-12 16:59:49,452][62634] Updated weights for policy 0, policy_version 29620 (0.0010) [2023-10-12 16:59:49,836][62634] Updated weights for policy 0, policy_version 29630 (0.0007) [2023-10-12 16:59:52,577][62635] Updated weights for policy 1, policy_version 29610 (0.0008) [2023-10-12 16:59:52,951][62635] Updated weights for policy 1, policy_version 29620 (0.0010) [2023-10-12 16:59:53,323][62635] Updated weights for policy 1, policy_version 29630 (0.0009) [2023-10-12 16:59:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 60686336. Throughput: 0: 1688.3, 1: 1678.5. Samples: 15174394. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:59:53,435][61643] Avg episode reward: [(0, '4.000'), (1, '8.970')] [2023-10-12 16:59:53,754][62634] Updated weights for policy 0, policy_version 29640 (0.0010) [2023-10-12 16:59:54,131][62634] Updated weights for policy 0, policy_version 29650 (0.0010) [2023-10-12 16:59:54,513][62634] Updated weights for policy 0, policy_version 29660 (0.0009) [2023-10-12 16:59:57,439][62635] Updated weights for policy 1, policy_version 29640 (0.0008) [2023-10-12 16:59:57,813][62635] Updated weights for policy 1, policy_version 29650 (0.0008) [2023-10-12 16:59:58,186][62635] Updated weights for policy 1, policy_version 29660 (0.0009) [2023-10-12 16:59:58,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 60751872. Throughput: 0: 1697.9, 1: 1682.1. Samples: 15195308. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-12 16:59:58,435][61643] Avg episode reward: [(0, '3.930'), (1, '9.000')] [2023-10-12 16:59:58,540][62634] Updated weights for policy 0, policy_version 29670 (0.0007) [2023-10-12 16:59:58,917][62634] Updated weights for policy 0, policy_version 29680 (0.0010) [2023-10-12 16:59:59,292][62634] Updated weights for policy 0, policy_version 29690 (0.0009) [2023-10-12 17:00:02,213][62635] Updated weights for policy 1, policy_version 29670 (0.0009) [2023-10-12 17:00:02,579][62635] Updated weights for policy 1, policy_version 29680 (0.0008) [2023-10-12 17:00:02,952][62635] Updated weights for policy 1, policy_version 29690 (0.0008) [2023-10-12 17:00:03,399][62634] Updated weights for policy 0, policy_version 29700 (0.0010) [2023-10-12 17:00:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60817408. Throughput: 0: 1700.5, 1: 1665.7. Samples: 15215204. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 17:00:03,436][61643] Avg episode reward: [(0, '3.950'), (1, '9.140')] [2023-10-12 17:00:03,780][62634] Updated weights for policy 0, policy_version 29710 (0.0010) [2023-10-12 17:00:04,158][62634] Updated weights for policy 0, policy_version 29720 (0.0008) [2023-10-12 17:00:07,054][62635] Updated weights for policy 1, policy_version 29700 (0.0008) [2023-10-12 17:00:07,427][62635] Updated weights for policy 1, policy_version 29710 (0.0007) [2023-10-12 17:00:07,794][62635] Updated weights for policy 1, policy_version 29720 (0.0009) [2023-10-12 17:00:08,289][62634] Updated weights for policy 0, policy_version 29730 (0.0008) [2023-10-12 17:00:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 60882944. Throughput: 0: 1694.4, 1: 1682.7. Samples: 15225104. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 17:00:08,435][61643] Avg episode reward: [(0, '3.970'), (1, '8.910')] [2023-10-12 17:00:08,662][62634] Updated weights for policy 0, policy_version 29740 (0.0010) [2023-10-12 17:00:09,037][62634] Updated weights for policy 0, policy_version 29750 (0.0011) [2023-10-12 17:00:09,422][62634] Updated weights for policy 0, policy_version 29760 (0.0010) [2023-10-12 17:00:11,739][62635] Updated weights for policy 1, policy_version 29730 (0.0008) [2023-10-12 17:00:12,111][62635] Updated weights for policy 1, policy_version 29740 (0.0008) [2023-10-12 17:00:12,482][62635] Updated weights for policy 1, policy_version 29750 (0.0008) [2023-10-12 17:00:12,838][62635] Updated weights for policy 1, policy_version 29760 (0.0008) [2023-10-12 17:00:13,412][62634] Updated weights for policy 0, policy_version 29770 (0.0010) [2023-10-12 17:00:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60948480. Throughput: 0: 1687.1, 1: 1677.6. Samples: 15245316. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 17:00:13,435][61643] Avg episode reward: [(0, '4.030'), (1, '8.990')] [2023-10-12 17:00:13,791][62634] Updated weights for policy 0, policy_version 29780 (0.0009) [2023-10-12 17:00:14,168][62634] Updated weights for policy 0, policy_version 29790 (0.0007) [2023-10-12 17:00:14,245][62354] Saving new best policy, reward=4.030! [2023-10-12 17:00:16,822][62635] Updated weights for policy 1, policy_version 29770 (0.0008) [2023-10-12 17:00:17,189][62635] Updated weights for policy 1, policy_version 29780 (0.0007) [2023-10-12 17:00:17,567][62635] Updated weights for policy 1, policy_version 29790 (0.0009) [2023-10-12 17:00:18,211][62634] Updated weights for policy 0, policy_version 29800 (0.0007) [2023-10-12 17:00:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61014016. Throughput: 0: 1685.3, 1: 1673.5. Samples: 15265388. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 17:00:18,436][61643] Avg episode reward: [(0, '4.060'), (1, '9.000')] [2023-10-12 17:00:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000029792_30507008.pth... [2023-10-12 17:00:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000028224_28901376.pth [2023-10-12 17:00:18,595][62634] Updated weights for policy 0, policy_version 29810 (0.0011) [2023-10-12 17:00:18,984][62634] Updated weights for policy 0, policy_version 29820 (0.0011) [2023-10-12 17:00:19,131][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000029824_30539776.pth... [2023-10-12 17:00:19,171][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000028224_28901376.pth [2023-10-12 17:00:19,175][62354] Saving new best policy, reward=4.060! 
[2023-10-12 17:00:21,633][62635] Updated weights for policy 1, policy_version 29800 (0.0008) [2023-10-12 17:00:21,997][62635] Updated weights for policy 1, policy_version 29810 (0.0009) [2023-10-12 17:00:22,364][62635] Updated weights for policy 1, policy_version 29820 (0.0007) [2023-10-12 17:00:23,132][62634] Updated weights for policy 0, policy_version 29830 (0.0010) [2023-10-12 17:00:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61079552. Throughput: 0: 1680.8, 1: 1688.1. Samples: 15275684. Policy #0 lag: (min: 1.0, avg: 8.5, max: 33.0) [2023-10-12 17:00:23,436][61643] Avg episode reward: [(0, '3.960'), (1, '8.900')] [2023-10-12 17:00:23,498][62634] Updated weights for policy 0, policy_version 29840 (0.0007) [2023-10-12 17:00:23,876][62634] Updated weights for policy 0, policy_version 29850 (0.0009) [2023-10-12 17:00:26,331][62635] Updated weights for policy 1, policy_version 29830 (0.0008) [2023-10-12 17:00:26,699][62635] Updated weights for policy 1, policy_version 29840 (0.0009) [2023-10-12 17:00:27,079][62635] Updated weights for policy 1, policy_version 29850 (0.0009) [2023-10-12 17:00:28,048][62634] Updated weights for policy 0, policy_version 29860 (0.0009) [2023-10-12 17:00:28,430][62634] Updated weights for policy 0, policy_version 29870 (0.0007) [2023-10-12 17:00:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61145088. Throughput: 0: 1681.4, 1: 1671.9. Samples: 15295554. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 33.0) [2023-10-12 17:00:28,436][61643] Avg episode reward: [(0, '4.000'), (1, '8.980')] [2023-10-12 17:00:28,814][62634] Updated weights for policy 0, policy_version 29880 (0.0009) [2023-10-12 17:00:31,070][62635] Updated weights for policy 1, policy_version 29860 (0.0007) [2023-10-12 17:00:31,436][62635] Updated weights for policy 1, policy_version 29870 (0.0007) [2023-10-12 17:00:31,802][62635] Updated weights for policy 1, policy_version 29880 (0.0007) [2023-10-12 17:00:32,701][62634] Updated weights for policy 0, policy_version 29890 (0.0008) [2023-10-12 17:00:33,085][62634] Updated weights for policy 0, policy_version 29900 (0.0009) [2023-10-12 17:00:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61210624. Throughput: 0: 1673.4, 1: 1685.8. Samples: 15315860. Policy #0 lag: (min: 1.0, avg: 8.5, max: 33.0) [2023-10-12 17:00:33,436][61643] Avg episode reward: [(0, '3.970'), (1, '9.060')] [2023-10-12 17:00:33,449][62634] Updated weights for policy 0, policy_version 29910 (0.0008) [2023-10-12 17:00:33,827][62634] Updated weights for policy 0, policy_version 29920 (0.0010) [2023-10-12 17:00:35,899][62635] Updated weights for policy 1, policy_version 29890 (0.0007) [2023-10-12 17:00:36,299][62635] Updated weights for policy 1, policy_version 29900 (0.0008) [2023-10-12 17:00:36,663][62635] Updated weights for policy 1, policy_version 29910 (0.0008) [2023-10-12 17:00:37,025][62635] Updated weights for policy 1, policy_version 29920 (0.0008) [2023-10-12 17:00:37,865][62634] Updated weights for policy 0, policy_version 29930 (0.0009) [2023-10-12 17:00:38,250][62634] Updated weights for policy 0, policy_version 29940 (0.0009) [2023-10-12 17:00:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61276160. Throughput: 0: 1680.6, 1: 1695.3. Samples: 15326312. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 33.0) [2023-10-12 17:00:38,436][61643] Avg episode reward: [(0, '3.990'), (1, '8.840')] [2023-10-12 17:00:38,617][62634] Updated weights for policy 0, policy_version 29950 (0.0011) [2023-10-12 17:00:41,084][62635] Updated weights for policy 1, policy_version 29930 (0.0008) [2023-10-12 17:00:41,451][62635] Updated weights for policy 1, policy_version 29940 (0.0007) [2023-10-12 17:00:41,815][62635] Updated weights for policy 1, policy_version 29950 (0.0009) [2023-10-12 17:00:42,709][62634] Updated weights for policy 0, policy_version 29960 (0.0007) [2023-10-12 17:00:43,095][62634] Updated weights for policy 0, policy_version 29970 (0.0009) [2023-10-12 17:00:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 61341696. Throughput: 0: 1676.9, 1: 1672.1. Samples: 15346014. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 17:00:43,435][61643] Avg episode reward: [(0, '4.150'), (1, '8.900')] [2023-10-12 17:00:43,468][62634] Updated weights for policy 0, policy_version 29980 (0.0009) [2023-10-12 17:00:43,613][62354] Saving new best policy, reward=4.150! [2023-10-12 17:00:45,708][62635] Updated weights for policy 1, policy_version 29960 (0.0007) [2023-10-12 17:00:46,080][62635] Updated weights for policy 1, policy_version 29970 (0.0009) [2023-10-12 17:00:46,448][62635] Updated weights for policy 1, policy_version 29980 (0.0011) [2023-10-12 17:00:47,414][62634] Updated weights for policy 0, policy_version 29990 (0.0010) [2023-10-12 17:00:47,797][62634] Updated weights for policy 0, policy_version 30000 (0.0010) [2023-10-12 17:00:48,171][62634] Updated weights for policy 0, policy_version 30010 (0.0009) [2023-10-12 17:00:48,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 61440000. Throughput: 0: 1664.8, 1: 1696.6. Samples: 15366468. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 17:00:48,436][61643] Avg episode reward: [(0, '4.160'), (1, '8.980')] [2023-10-12 17:00:48,444][62354] Saving new best policy, reward=4.160! [2023-10-12 17:00:50,595][62635] Updated weights for policy 1, policy_version 29990 (0.0009) [2023-10-12 17:00:50,970][62635] Updated weights for policy 1, policy_version 30000 (0.0009) [2023-10-12 17:00:51,336][62635] Updated weights for policy 1, policy_version 30010 (0.0007) [2023-10-12 17:00:52,361][62634] Updated weights for policy 0, policy_version 30020 (0.0007) [2023-10-12 17:00:52,765][62634] Updated weights for policy 0, policy_version 30030 (0.0010) [2023-10-12 17:00:53,142][62634] Updated weights for policy 0, policy_version 30040 (0.0010) [2023-10-12 17:00:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61472768. Throughput: 0: 1687.5, 1: 1686.0. Samples: 15376910. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 17:00:53,435][61643] Avg episode reward: [(0, '4.230'), (1, '8.910')] [2023-10-12 17:00:53,446][62354] Saving new best policy, reward=4.230! [2023-10-12 17:00:55,300][62635] Updated weights for policy 1, policy_version 30020 (0.0007) [2023-10-12 17:00:55,674][62635] Updated weights for policy 1, policy_version 30030 (0.0009) [2023-10-12 17:00:56,034][62635] Updated weights for policy 1, policy_version 30040 (0.0009) [2023-10-12 17:00:57,047][62634] Updated weights for policy 0, policy_version 30050 (0.0009) [2023-10-12 17:00:57,430][62634] Updated weights for policy 0, policy_version 30060 (0.0007) [2023-10-12 17:00:57,801][62634] Updated weights for policy 0, policy_version 30070 (0.0007) [2023-10-12 17:00:58,180][62634] Updated weights for policy 0, policy_version 30080 (0.0008) [2023-10-12 17:00:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 61571072. Throughput: 0: 1691.3, 1: 1683.6. Samples: 15397186. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-12 17:00:58,435][61643] Avg episode reward: [(0, '4.260'), (1, '8.960')] [2023-10-12 17:00:58,436][62354] Saving new best policy, reward=4.260! [2023-10-12 17:01:00,093][62635] Updated weights for policy 1, policy_version 30050 (0.0009) [2023-10-12 17:01:00,450][62635] Updated weights for policy 1, policy_version 30060 (0.0010) [2023-10-12 17:01:00,825][62635] Updated weights for policy 1, policy_version 30070 (0.0010) [2023-10-12 17:01:01,187][62635] Updated weights for policy 1, policy_version 30080 (0.0008) [2023-10-12 17:01:02,116][62634] Updated weights for policy 0, policy_version 30090 (0.0008) [2023-10-12 17:01:02,499][62634] Updated weights for policy 0, policy_version 30100 (0.0007) [2023-10-12 17:01:02,878][62634] Updated weights for policy 0, policy_version 30110 (0.0007) [2023-10-12 17:01:03,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61636608. Throughput: 0: 1662.8, 1: 1704.2. Samples: 15416904. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:01:03,435][61643] Avg episode reward: [(0, '4.330'), (1, '9.060')] [2023-10-12 17:01:03,443][62354] Saving new best policy, reward=4.330! [2023-10-12 17:01:05,254][62635] Updated weights for policy 1, policy_version 30090 (0.0009) [2023-10-12 17:01:05,628][62635] Updated weights for policy 1, policy_version 30100 (0.0007) [2023-10-12 17:01:05,993][62635] Updated weights for policy 1, policy_version 30110 (0.0008) [2023-10-12 17:01:06,908][62634] Updated weights for policy 0, policy_version 30120 (0.0007) [2023-10-12 17:01:07,283][62634] Updated weights for policy 0, policy_version 30130 (0.0008) [2023-10-12 17:01:07,655][62634] Updated weights for policy 0, policy_version 30140 (0.0007) [2023-10-12 17:01:08,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61702144. Throughput: 0: 1695.6, 1: 1676.6. Samples: 15427434. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:01:08,436][61643] Avg episode reward: [(0, '4.350'), (1, '9.270')] [2023-10-12 17:01:08,437][62354] Saving new best policy, reward=4.350! [2023-10-12 17:01:10,291][62635] Updated weights for policy 1, policy_version 30120 (0.0008) [2023-10-12 17:01:10,668][62635] Updated weights for policy 1, policy_version 30130 (0.0008) [2023-10-12 17:01:11,035][62635] Updated weights for policy 1, policy_version 30140 (0.0007) [2023-10-12 17:01:11,772][62634] Updated weights for policy 0, policy_version 30150 (0.0008) [2023-10-12 17:01:12,149][62634] Updated weights for policy 0, policy_version 30160 (0.0010) [2023-10-12 17:01:12,518][62634] Updated weights for policy 0, policy_version 30170 (0.0007) [2023-10-12 17:01:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61767680. Throughput: 0: 1688.0, 1: 1691.3. Samples: 15447624. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:01:13,436][61643] Avg episode reward: [(0, '4.360'), (1, '9.140')] [2023-10-12 17:01:13,436][62354] Saving new best policy, reward=4.360! [2023-10-12 17:01:14,983][62635] Updated weights for policy 1, policy_version 30150 (0.0007) [2023-10-12 17:01:15,361][62635] Updated weights for policy 1, policy_version 30160 (0.0008) [2023-10-12 17:01:15,727][62635] Updated weights for policy 1, policy_version 30170 (0.0009) [2023-10-12 17:01:16,578][62634] Updated weights for policy 0, policy_version 30180 (0.0007) [2023-10-12 17:01:16,948][62634] Updated weights for policy 0, policy_version 30190 (0.0008) [2023-10-12 17:01:17,338][62634] Updated weights for policy 0, policy_version 30200 (0.0010) [2023-10-12 17:01:18,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61833216. Throughput: 0: 1673.9, 1: 1697.2. Samples: 15467556. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:01:18,435][61643] Avg episode reward: [(0, '4.460'), (1, '9.340')] [2023-10-12 17:01:18,445][62354] Saving new best policy, reward=4.460! [2023-10-12 17:01:19,780][62635] Updated weights for policy 1, policy_version 30180 (0.0008) [2023-10-12 17:01:20,151][62635] Updated weights for policy 1, policy_version 30190 (0.0009) [2023-10-12 17:01:20,521][62635] Updated weights for policy 1, policy_version 30200 (0.0007) [2023-10-12 17:01:21,341][62634] Updated weights for policy 0, policy_version 30210 (0.0009) [2023-10-12 17:01:21,719][62634] Updated weights for policy 0, policy_version 30220 (0.0008) [2023-10-12 17:01:22,100][62634] Updated weights for policy 0, policy_version 30230 (0.0011) [2023-10-12 17:01:22,487][62634] Updated weights for policy 0, policy_version 30240 (0.0010) [2023-10-12 17:01:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61898752. Throughput: 0: 1697.3, 1: 1668.6. Samples: 15477778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:01:23,435][61643] Avg episode reward: [(0, '4.440'), (1, '9.400')] [2023-10-12 17:01:23,436][62495] Saving new best policy, reward=9.400! [2023-10-12 17:01:24,570][62635] Updated weights for policy 1, policy_version 30210 (0.0007) [2023-10-12 17:01:24,946][62635] Updated weights for policy 1, policy_version 30220 (0.0007) [2023-10-12 17:01:25,318][62635] Updated weights for policy 1, policy_version 30230 (0.0008) [2023-10-12 17:01:25,681][62635] Updated weights for policy 1, policy_version 30240 (0.0007) [2023-10-12 17:01:26,453][62634] Updated weights for policy 0, policy_version 30250 (0.0009) [2023-10-12 17:01:26,838][62634] Updated weights for policy 0, policy_version 30260 (0.0007) [2023-10-12 17:01:27,220][62634] Updated weights for policy 0, policy_version 30270 (0.0008) [2023-10-12 17:01:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 61964288. Throughput: 0: 1677.0, 1: 1692.3. Samples: 15497630. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-12 17:01:28,436][61643] Avg episode reward: [(0, '4.450'), (1, '9.150')] [2023-10-12 17:01:29,909][62635] Updated weights for policy 1, policy_version 30250 (0.0009) [2023-10-12 17:01:30,273][62635] Updated weights for policy 1, policy_version 30260 (0.0007) [2023-10-12 17:01:30,644][62635] Updated weights for policy 1, policy_version 30270 (0.0007) [2023-10-12 17:01:31,244][62634] Updated weights for policy 0, policy_version 30280 (0.0010) [2023-10-12 17:01:31,617][62634] Updated weights for policy 0, policy_version 30290 (0.0011) [2023-10-12 17:01:31,998][62634] Updated weights for policy 0, policy_version 30300 (0.0009) [2023-10-12 17:01:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 62029824. Throughput: 0: 1675.0, 1: 1684.2. Samples: 15517634. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-12 17:01:33,435][61643] Avg episode reward: [(0, '4.420'), (1, '9.330')] [2023-10-12 17:01:34,597][62635] Updated weights for policy 1, policy_version 30280 (0.0009) [2023-10-12 17:01:34,964][62635] Updated weights for policy 1, policy_version 30290 (0.0010) [2023-10-12 17:01:35,335][62635] Updated weights for policy 1, policy_version 30300 (0.0009) [2023-10-12 17:01:36,189][62634] Updated weights for policy 0, policy_version 30310 (0.0010) [2023-10-12 17:01:36,556][62634] Updated weights for policy 0, policy_version 30320 (0.0011) [2023-10-12 17:01:36,931][62634] Updated weights for policy 0, policy_version 30330 (0.0011) [2023-10-12 17:01:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 62095360. Throughput: 0: 1687.8, 1: 1667.4. Samples: 15527892. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-12 17:01:38,435][61643] Avg episode reward: [(0, '4.470'), (1, '9.330')] [2023-10-12 17:01:38,436][62354] Saving new best policy, reward=4.470! [2023-10-12 17:01:39,355][62635] Updated weights for policy 1, policy_version 30310 (0.0007) [2023-10-12 17:01:39,715][62635] Updated weights for policy 1, policy_version 30320 (0.0009) [2023-10-12 17:01:40,089][62635] Updated weights for policy 1, policy_version 30330 (0.0009) [2023-10-12 17:01:41,069][62634] Updated weights for policy 0, policy_version 30340 (0.0009) [2023-10-12 17:01:41,454][62634] Updated weights for policy 0, policy_version 30350 (0.0008) [2023-10-12 17:01:41,834][62634] Updated weights for policy 0, policy_version 30360 (0.0010) [2023-10-12 17:01:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62160896. Throughput: 0: 1661.9, 1: 1685.6. Samples: 15547822. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-12 17:01:43,436][61643] Avg episode reward: [(0, '4.550'), (1, '9.090')] [2023-10-12 17:01:43,438][62354] Saving new best policy, reward=4.550! [2023-10-12 17:01:44,180][62635] Updated weights for policy 1, policy_version 30340 (0.0008) [2023-10-12 17:01:44,546][62635] Updated weights for policy 1, policy_version 30350 (0.0007) [2023-10-12 17:01:44,921][62635] Updated weights for policy 1, policy_version 30360 (0.0007) [2023-10-12 17:01:45,832][62634] Updated weights for policy 0, policy_version 30370 (0.0010) [2023-10-12 17:01:46,206][62634] Updated weights for policy 0, policy_version 30380 (0.0009) [2023-10-12 17:01:46,586][62634] Updated weights for policy 0, policy_version 30390 (0.0007) [2023-10-12 17:01:46,962][62634] Updated weights for policy 0, policy_version 30400 (0.0007) [2023-10-12 17:01:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62226432. Throughput: 0: 1684.0, 1: 1677.2. Samples: 15568158. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-12 17:01:48,436][61643] Avg episode reward: [(0, '4.650'), (1, '9.350')] [2023-10-12 17:01:48,442][62354] Saving new best policy, reward=4.650! [2023-10-12 17:01:48,941][62635] Updated weights for policy 1, policy_version 30370 (0.0007) [2023-10-12 17:01:49,306][62635] Updated weights for policy 1, policy_version 30380 (0.0008) [2023-10-12 17:01:49,675][62635] Updated weights for policy 1, policy_version 30390 (0.0010) [2023-10-12 17:01:50,039][62635] Updated weights for policy 1, policy_version 30400 (0.0007) [2023-10-12 17:01:51,066][62634] Updated weights for policy 0, policy_version 30410 (0.0010) [2023-10-12 17:01:51,440][62634] Updated weights for policy 0, policy_version 30420 (0.0008) [2023-10-12 17:01:51,813][62634] Updated weights for policy 0, policy_version 30430 (0.0008) [2023-10-12 17:01:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62291968. Throughput: 0: 1675.7, 1: 1675.4. Samples: 15578234. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-12 17:01:53,435][61643] Avg episode reward: [(0, '4.710'), (1, '9.190')] [2023-10-12 17:01:53,436][62354] Saving new best policy, reward=4.710! [2023-10-12 17:01:54,223][62635] Updated weights for policy 1, policy_version 30410 (0.0009) [2023-10-12 17:01:54,590][62635] Updated weights for policy 1, policy_version 30420 (0.0009) [2023-10-12 17:01:54,969][62635] Updated weights for policy 1, policy_version 30430 (0.0008) [2023-10-12 17:01:55,748][62634] Updated weights for policy 0, policy_version 30440 (0.0009) [2023-10-12 17:01:56,127][62634] Updated weights for policy 0, policy_version 30450 (0.0009) [2023-10-12 17:01:56,507][62634] Updated weights for policy 0, policy_version 30460 (0.0009) [2023-10-12 17:01:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 62357504. Throughput: 0: 1664.2, 1: 1678.5. Samples: 15598046. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-12 17:01:58,436][61643] Avg episode reward: [(0, '4.870'), (1, '8.960')] [2023-10-12 17:01:58,437][62354] Saving new best policy, reward=4.870! [2023-10-12 17:01:59,021][62635] Updated weights for policy 1, policy_version 30440 (0.0010) [2023-10-12 17:01:59,387][62635] Updated weights for policy 1, policy_version 30450 (0.0010) [2023-10-12 17:01:59,742][62635] Updated weights for policy 1, policy_version 30460 (0.0011) [2023-10-12 17:02:00,482][62634] Updated weights for policy 0, policy_version 30470 (0.0009) [2023-10-12 17:02:00,864][62634] Updated weights for policy 0, policy_version 30480 (0.0009) [2023-10-12 17:02:01,244][62634] Updated weights for policy 0, policy_version 30490 (0.0009) [2023-10-12 17:02:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62423040. Throughput: 0: 1685.0, 1: 1673.3. Samples: 15618680. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-12 17:02:03,436][61643] Avg episode reward: [(0, '5.060'), (1, '9.100')] [2023-10-12 17:02:03,444][62354] Saving new best policy, reward=5.060! [2023-10-12 17:02:03,911][62635] Updated weights for policy 1, policy_version 30470 (0.0008) [2023-10-12 17:02:04,283][62635] Updated weights for policy 1, policy_version 30480 (0.0008) [2023-10-12 17:02:04,645][62635] Updated weights for policy 1, policy_version 30490 (0.0009) [2023-10-12 17:02:05,347][62634] Updated weights for policy 0, policy_version 30500 (0.0009) [2023-10-12 17:02:05,718][62634] Updated weights for policy 0, policy_version 30510 (0.0009) [2023-10-12 17:02:06,094][62634] Updated weights for policy 0, policy_version 30520 (0.0009) [2023-10-12 17:02:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62488576. Throughput: 0: 1669.6, 1: 1674.8. Samples: 15628280. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-12 17:02:08,435][61643] Avg episode reward: [(0, '5.250'), (1, '9.130')] [2023-10-12 17:02:08,436][62354] Saving new best policy, reward=5.250! [2023-10-12 17:02:08,696][62635] Updated weights for policy 1, policy_version 30500 (0.0007) [2023-10-12 17:02:09,069][62635] Updated weights for policy 1, policy_version 30510 (0.0008) [2023-10-12 17:02:09,441][62635] Updated weights for policy 1, policy_version 30520 (0.0008) [2023-10-12 17:02:10,313][62634] Updated weights for policy 0, policy_version 30530 (0.0009) [2023-10-12 17:02:10,697][62634] Updated weights for policy 0, policy_version 30540 (0.0008) [2023-10-12 17:02:11,067][62634] Updated weights for policy 0, policy_version 30550 (0.0010) [2023-10-12 17:02:11,448][62634] Updated weights for policy 0, policy_version 30560 (0.0007) [2023-10-12 17:02:13,380][62635] Updated weights for policy 1, policy_version 30530 (0.0009) [2023-10-12 17:02:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62554112. Throughput: 0: 1673.3, 1: 1679.6. Samples: 15648512. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-12 17:02:13,435][61643] Avg episode reward: [(0, '5.520'), (1, '9.000')] [2023-10-12 17:02:13,436][62354] Saving new best policy, reward=5.520! [2023-10-12 17:02:13,752][62635] Updated weights for policy 1, policy_version 30540 (0.0010) [2023-10-12 17:02:14,120][62635] Updated weights for policy 1, policy_version 30550 (0.0007) [2023-10-12 17:02:14,494][62635] Updated weights for policy 1, policy_version 30560 (0.0007) [2023-10-12 17:02:15,344][62634] Updated weights for policy 0, policy_version 30570 (0.0007) [2023-10-12 17:02:15,715][62634] Updated weights for policy 0, policy_version 30580 (0.0007) [2023-10-12 17:02:16,099][62634] Updated weights for policy 0, policy_version 30590 (0.0007) [2023-10-12 17:02:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). 
Total num frames: 62619648. Throughput: 0: 1692.2, 1: 1682.0. Samples: 15669474. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:02:18,436][61643] Avg episode reward: [(0, '5.790'), (1, '9.150')] [2023-10-12 17:02:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000030592_31326208.pth... [2023-10-12 17:02:18,490][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000029024_29720576.pth [2023-10-12 17:02:18,495][62354] Saving new best policy, reward=5.790! [2023-10-12 17:02:18,689][62635] Updated weights for policy 1, policy_version 30570 (0.0008) [2023-10-12 17:02:19,065][62635] Updated weights for policy 1, policy_version 30580 (0.0010) [2023-10-12 17:02:19,419][62635] Updated weights for policy 1, policy_version 30590 (0.0008) [2023-10-12 17:02:19,492][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000030592_31326208.pth... [2023-10-12 17:02:19,522][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000028992_29687808.pth [2023-10-12 17:02:20,108][62634] Updated weights for policy 0, policy_version 30600 (0.0010) [2023-10-12 17:02:20,484][62634] Updated weights for policy 0, policy_version 30610 (0.0009) [2023-10-12 17:02:20,862][62634] Updated weights for policy 0, policy_version 30620 (0.0010) [2023-10-12 17:02:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62685184. Throughput: 0: 1665.9, 1: 1682.4. Samples: 15678568. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:02:23,435][61643] Avg episode reward: [(0, '5.950'), (1, '9.170')] [2023-10-12 17:02:23,436][62354] Saving new best policy, reward=5.950! 
[2023-10-12 17:02:23,548][62635] Updated weights for policy 1, policy_version 30600 (0.0010) [2023-10-12 17:02:23,918][62635] Updated weights for policy 1, policy_version 30610 (0.0007) [2023-10-12 17:02:24,297][62635] Updated weights for policy 1, policy_version 30620 (0.0007) [2023-10-12 17:02:24,963][62634] Updated weights for policy 0, policy_version 30630 (0.0009) [2023-10-12 17:02:25,330][62634] Updated weights for policy 0, policy_version 30640 (0.0009) [2023-10-12 17:02:25,706][62634] Updated weights for policy 0, policy_version 30650 (0.0011) [2023-10-12 17:02:28,235][62635] Updated weights for policy 1, policy_version 30630 (0.0011) [2023-10-12 17:02:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62750720. Throughput: 0: 1686.5, 1: 1677.7. Samples: 15699210. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:02:28,436][61643] Avg episode reward: [(0, '6.010'), (1, '9.060')] [2023-10-12 17:02:28,437][62354] Saving new best policy, reward=6.010! [2023-10-12 17:02:28,595][62635] Updated weights for policy 1, policy_version 30640 (0.0008) [2023-10-12 17:02:28,972][62635] Updated weights for policy 1, policy_version 30650 (0.0010) [2023-10-12 17:02:29,785][62634] Updated weights for policy 0, policy_version 30660 (0.0011) [2023-10-12 17:02:30,168][62634] Updated weights for policy 0, policy_version 30670 (0.0007) [2023-10-12 17:02:30,554][62634] Updated weights for policy 0, policy_version 30680 (0.0007) [2023-10-12 17:02:32,936][62635] Updated weights for policy 1, policy_version 30660 (0.0009) [2023-10-12 17:02:33,307][62635] Updated weights for policy 1, policy_version 30670 (0.0011) [2023-10-12 17:02:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62816256. Throughput: 0: 1693.3, 1: 1678.8. Samples: 15719902. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:02:33,436][61643] Avg episode reward: [(0, '6.120'), (1, '9.220')] [2023-10-12 17:02:33,444][62354] Saving new best policy, reward=6.120! [2023-10-12 17:02:33,674][62635] Updated weights for policy 1, policy_version 30680 (0.0008) [2023-10-12 17:02:34,427][62634] Updated weights for policy 0, policy_version 30690 (0.0007) [2023-10-12 17:02:34,806][62634] Updated weights for policy 0, policy_version 30700 (0.0009) [2023-10-12 17:02:35,183][62634] Updated weights for policy 0, policy_version 30710 (0.0007) [2023-10-12 17:02:35,556][62634] Updated weights for policy 0, policy_version 30720 (0.0007) [2023-10-12 17:02:37,851][62635] Updated weights for policy 1, policy_version 30690 (0.0009) [2023-10-12 17:02:38,222][62635] Updated weights for policy 1, policy_version 30700 (0.0008) [2023-10-12 17:02:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62881792. Throughput: 0: 1671.9, 1: 1684.8. Samples: 15729286. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:02:38,436][61643] Avg episode reward: [(0, '6.130'), (1, '9.160')] [2023-10-12 17:02:38,437][62354] Saving new best policy, reward=6.130! [2023-10-12 17:02:38,588][62635] Updated weights for policy 1, policy_version 30710 (0.0009) [2023-10-12 17:02:38,962][62635] Updated weights for policy 1, policy_version 30720 (0.0010) [2023-10-12 17:02:39,753][62634] Updated weights for policy 0, policy_version 30730 (0.0009) [2023-10-12 17:02:40,131][62634] Updated weights for policy 0, policy_version 30740 (0.0011) [2023-10-12 17:02:40,508][62634] Updated weights for policy 0, policy_version 30750 (0.0010) [2023-10-12 17:02:43,046][62635] Updated weights for policy 1, policy_version 30730 (0.0009) [2023-10-12 17:02:43,411][62635] Updated weights for policy 1, policy_version 30740 (0.0009) [2023-10-12 17:02:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 62947328. Throughput: 0: 1691.7, 1: 1682.5. Samples: 15749884. Policy #0 lag: (min: 31.0, avg: 33.3, max: 58.0) [2023-10-12 17:02:43,435][61643] Avg episode reward: [(0, '5.930'), (1, '8.910')] [2023-10-12 17:02:43,783][62635] Updated weights for policy 1, policy_version 30750 (0.0011) [2023-10-12 17:02:44,610][62634] Updated weights for policy 0, policy_version 30760 (0.0008) [2023-10-12 17:02:44,990][62634] Updated weights for policy 0, policy_version 30770 (0.0008) [2023-10-12 17:02:45,358][62634] Updated weights for policy 0, policy_version 30780 (0.0009) [2023-10-12 17:02:47,827][62635] Updated weights for policy 1, policy_version 30760 (0.0009) [2023-10-12 17:02:48,202][62635] Updated weights for policy 1, policy_version 30770 (0.0010) [2023-10-12 17:02:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63012864. Throughput: 0: 1684.4, 1: 1676.9. Samples: 15769936. Policy #0 lag: (min: 31.0, avg: 33.3, max: 58.0) [2023-10-12 17:02:48,435][61643] Avg episode reward: [(0, '5.780'), (1, '9.070')] [2023-10-12 17:02:48,563][62635] Updated weights for policy 1, policy_version 30780 (0.0009) [2023-10-12 17:02:49,472][62634] Updated weights for policy 0, policy_version 30790 (0.0008) [2023-10-12 17:02:49,858][62634] Updated weights for policy 0, policy_version 30800 (0.0009) [2023-10-12 17:02:50,232][62634] Updated weights for policy 0, policy_version 30810 (0.0009) [2023-10-12 17:02:52,670][62635] Updated weights for policy 1, policy_version 30790 (0.0009) [2023-10-12 17:02:53,042][62635] Updated weights for policy 1, policy_version 30800 (0.0008) [2023-10-12 17:02:53,415][62635] Updated weights for policy 1, policy_version 30810 (0.0008) [2023-10-12 17:02:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63078400. Throughput: 0: 1667.2, 1: 1688.3. Samples: 15779278. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 58.0) [2023-10-12 17:02:53,436][61643] Avg episode reward: [(0, '5.460'), (1, '9.170')] [2023-10-12 17:02:54,198][62634] Updated weights for policy 0, policy_version 30820 (0.0009) [2023-10-12 17:02:54,566][62634] Updated weights for policy 0, policy_version 30830 (0.0010) [2023-10-12 17:02:54,948][62634] Updated weights for policy 0, policy_version 30840 (0.0009) [2023-10-12 17:02:57,293][62635] Updated weights for policy 1, policy_version 30820 (0.0009) [2023-10-12 17:02:57,666][62635] Updated weights for policy 1, policy_version 30830 (0.0011) [2023-10-12 17:02:58,038][62635] Updated weights for policy 1, policy_version 30840 (0.0011) [2023-10-12 17:02:58,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63176704. Throughput: 0: 1681.6, 1: 1690.4. Samples: 15800254. Policy #0 lag: (min: 31.0, avg: 33.3, max: 58.0) [2023-10-12 17:02:58,436][61643] Avg episode reward: [(0, '5.600'), (1, '8.970')] [2023-10-12 17:02:58,999][62634] Updated weights for policy 0, policy_version 30850 (0.0008) [2023-10-12 17:02:59,365][62634] Updated weights for policy 0, policy_version 30860 (0.0009) [2023-10-12 17:02:59,749][62634] Updated weights for policy 0, policy_version 30870 (0.0009) [2023-10-12 17:03:00,126][62634] Updated weights for policy 0, policy_version 30880 (0.0010) [2023-10-12 17:03:02,178][62635] Updated weights for policy 1, policy_version 30850 (0.0009) [2023-10-12 17:03:02,546][62635] Updated weights for policy 1, policy_version 30860 (0.0008) [2023-10-12 17:03:02,919][62635] Updated weights for policy 1, policy_version 30870 (0.0011) [2023-10-12 17:03:03,283][62635] Updated weights for policy 1, policy_version 30880 (0.0010) [2023-10-12 17:03:03,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63242240. Throughput: 0: 1680.5, 1: 1664.1. Samples: 15819978. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 58.0) [2023-10-12 17:03:03,435][61643] Avg episode reward: [(0, '5.600'), (1, '9.160')] [2023-10-12 17:03:04,105][62634] Updated weights for policy 0, policy_version 30890 (0.0008) [2023-10-12 17:03:04,480][62634] Updated weights for policy 0, policy_version 30900 (0.0009) [2023-10-12 17:03:04,860][62634] Updated weights for policy 0, policy_version 30910 (0.0008) [2023-10-12 17:03:07,487][62635] Updated weights for policy 1, policy_version 30890 (0.0009) [2023-10-12 17:03:07,858][62635] Updated weights for policy 1, policy_version 30900 (0.0008) [2023-10-12 17:03:08,229][62635] Updated weights for policy 1, policy_version 30910 (0.0009) [2023-10-12 17:03:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63307776. Throughput: 0: 1677.1, 1: 1690.3. Samples: 15830102. Policy #0 lag: (min: 26.0, avg: 27.0, max: 42.0) [2023-10-12 17:03:08,436][61643] Avg episode reward: [(0, '5.830'), (1, '9.180')] [2023-10-12 17:03:08,869][62634] Updated weights for policy 0, policy_version 30920 (0.0009) [2023-10-12 17:03:09,247][62634] Updated weights for policy 0, policy_version 30930 (0.0007) [2023-10-12 17:03:09,614][62634] Updated weights for policy 0, policy_version 30940 (0.0007) [2023-10-12 17:03:12,191][62635] Updated weights for policy 1, policy_version 30920 (0.0008) [2023-10-12 17:03:12,559][62635] Updated weights for policy 1, policy_version 30930 (0.0008) [2023-10-12 17:03:12,928][62635] Updated weights for policy 1, policy_version 30940 (0.0008) [2023-10-12 17:03:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63373312. Throughput: 0: 1685.2, 1: 1682.5. Samples: 15850754. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 42.0) [2023-10-12 17:03:13,435][61643] Avg episode reward: [(0, '5.740'), (1, '9.150')] [2023-10-12 17:03:13,790][62634] Updated weights for policy 0, policy_version 30950 (0.0008) [2023-10-12 17:03:14,175][62634] Updated weights for policy 0, policy_version 30960 (0.0008) [2023-10-12 17:03:14,551][62634] Updated weights for policy 0, policy_version 30970 (0.0008) [2023-10-12 17:03:17,065][62635] Updated weights for policy 1, policy_version 30950 (0.0008) [2023-10-12 17:03:17,430][62635] Updated weights for policy 1, policy_version 30960 (0.0008) [2023-10-12 17:03:17,797][62635] Updated weights for policy 1, policy_version 30970 (0.0007) [2023-10-12 17:03:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 63438848. Throughput: 0: 1685.5, 1: 1661.8. Samples: 15870530. Policy #0 lag: (min: 26.0, avg: 27.0, max: 42.0) [2023-10-12 17:03:18,435][61643] Avg episode reward: [(0, '5.730'), (1, '9.380')] [2023-10-12 17:03:18,521][62634] Updated weights for policy 0, policy_version 30980 (0.0008) [2023-10-12 17:03:18,915][62634] Updated weights for policy 0, policy_version 30990 (0.0008) [2023-10-12 17:03:19,286][62634] Updated weights for policy 0, policy_version 31000 (0.0008) [2023-10-12 17:03:21,997][62635] Updated weights for policy 1, policy_version 30980 (0.0008) [2023-10-12 17:03:22,366][62635] Updated weights for policy 1, policy_version 30990 (0.0009) [2023-10-12 17:03:22,732][62635] Updated weights for policy 1, policy_version 31000 (0.0009) [2023-10-12 17:03:23,210][62634] Updated weights for policy 0, policy_version 31010 (0.0007) [2023-10-12 17:03:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63504384. Throughput: 0: 1682.3, 1: 1681.5. Samples: 15880656. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 42.0) [2023-10-12 17:03:23,435][61643] Avg episode reward: [(0, '5.690'), (1, '9.370')] [2023-10-12 17:03:23,587][62634] Updated weights for policy 0, policy_version 31020 (0.0009) [2023-10-12 17:03:23,963][62634] Updated weights for policy 0, policy_version 31030 (0.0010) [2023-10-12 17:03:24,347][62634] Updated weights for policy 0, policy_version 31040 (0.0007) [2023-10-12 17:03:26,935][62635] Updated weights for policy 1, policy_version 31010 (0.0008) [2023-10-12 17:03:27,300][62635] Updated weights for policy 1, policy_version 31020 (0.0009) [2023-10-12 17:03:27,669][62635] Updated weights for policy 1, policy_version 31030 (0.0010) [2023-10-12 17:03:28,034][62635] Updated weights for policy 1, policy_version 31040 (0.0009) [2023-10-12 17:03:28,348][62634] Updated weights for policy 0, policy_version 31050 (0.0009) [2023-10-12 17:03:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 63569920. Throughput: 0: 1683.0, 1: 1677.9. Samples: 15901126. Policy #0 lag: (min: 26.0, avg: 27.0, max: 42.0) [2023-10-12 17:03:28,435][61643] Avg episode reward: [(0, '5.570'), (1, '9.270')] [2023-10-12 17:03:28,732][62634] Updated weights for policy 0, policy_version 31060 (0.0007) [2023-10-12 17:03:29,116][62634] Updated weights for policy 0, policy_version 31070 (0.0011) [2023-10-12 17:03:32,032][62635] Updated weights for policy 1, policy_version 31050 (0.0009) [2023-10-12 17:03:32,401][62635] Updated weights for policy 1, policy_version 31060 (0.0011) [2023-10-12 17:03:32,763][62635] Updated weights for policy 1, policy_version 31070 (0.0011) [2023-10-12 17:03:33,160][62634] Updated weights for policy 0, policy_version 31080 (0.0009) [2023-10-12 17:03:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63635456. Throughput: 0: 1688.5, 1: 1665.0. Samples: 15920846. 
Policy #0 lag: (min: 1.0, avg: 27.3, max: 32.0) [2023-10-12 17:03:33,435][61643] Avg episode reward: [(0, '5.760'), (1, '9.260')] [2023-10-12 17:03:33,549][62634] Updated weights for policy 0, policy_version 31090 (0.0007) [2023-10-12 17:03:33,917][62634] Updated weights for policy 0, policy_version 31100 (0.0007) [2023-10-12 17:03:36,739][62635] Updated weights for policy 1, policy_version 31080 (0.0008) [2023-10-12 17:03:37,107][62635] Updated weights for policy 1, policy_version 31090 (0.0007) [2023-10-12 17:03:37,475][62635] Updated weights for policy 1, policy_version 31100 (0.0007) [2023-10-12 17:03:38,007][62634] Updated weights for policy 0, policy_version 31110 (0.0010) [2023-10-12 17:03:38,388][62634] Updated weights for policy 0, policy_version 31120 (0.0011) [2023-10-12 17:03:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 63700992. Throughput: 0: 1694.2, 1: 1686.6. Samples: 15931416. Policy #0 lag: (min: 1.0, avg: 27.3, max: 32.0) [2023-10-12 17:03:38,435][61643] Avg episode reward: [(0, '5.820'), (1, '9.300')] [2023-10-12 17:03:38,769][62634] Updated weights for policy 0, policy_version 31130 (0.0009) [2023-10-12 17:03:41,397][62635] Updated weights for policy 1, policy_version 31110 (0.0009) [2023-10-12 17:03:41,760][62635] Updated weights for policy 1, policy_version 31120 (0.0009) [2023-10-12 17:03:42,127][62635] Updated weights for policy 1, policy_version 31130 (0.0009) [2023-10-12 17:03:42,760][62634] Updated weights for policy 0, policy_version 31140 (0.0009) [2023-10-12 17:03:43,138][62634] Updated weights for policy 0, policy_version 31150 (0.0008) [2023-10-12 17:03:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63766528. Throughput: 0: 1695.8, 1: 1663.6. Samples: 15951426. 
Policy #0 lag: (min: 1.0, avg: 27.3, max: 32.0) [2023-10-12 17:03:43,435][61643] Avg episode reward: [(0, '5.920'), (1, '9.060')] [2023-10-12 17:03:43,517][62634] Updated weights for policy 0, policy_version 31160 (0.0009) [2023-10-12 17:03:46,209][62635] Updated weights for policy 1, policy_version 31140 (0.0008) [2023-10-12 17:03:46,577][62635] Updated weights for policy 1, policy_version 31150 (0.0010) [2023-10-12 17:03:46,944][62635] Updated weights for policy 1, policy_version 31160 (0.0009) [2023-10-12 17:03:47,533][62634] Updated weights for policy 0, policy_version 31170 (0.0009) [2023-10-12 17:03:47,909][62634] Updated weights for policy 0, policy_version 31180 (0.0009) [2023-10-12 17:03:48,280][62634] Updated weights for policy 0, policy_version 31190 (0.0011) [2023-10-12 17:03:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63832064. Throughput: 0: 1685.1, 1: 1678.3. Samples: 15971330. Policy #0 lag: (min: 1.0, avg: 27.3, max: 32.0) [2023-10-12 17:03:48,436][61643] Avg episode reward: [(0, '5.780'), (1, '9.140')] [2023-10-12 17:03:48,657][62634] Updated weights for policy 0, policy_version 31200 (0.0009) [2023-10-12 17:03:51,007][62635] Updated weights for policy 1, policy_version 31170 (0.0007) [2023-10-12 17:03:51,383][62635] Updated weights for policy 1, policy_version 31180 (0.0009) [2023-10-12 17:03:51,744][62635] Updated weights for policy 1, policy_version 31190 (0.0008) [2023-10-12 17:03:52,114][62635] Updated weights for policy 1, policy_version 31200 (0.0007) [2023-10-12 17:03:52,655][62634] Updated weights for policy 0, policy_version 31210 (0.0007) [2023-10-12 17:03:53,040][62634] Updated weights for policy 0, policy_version 31220 (0.0010) [2023-10-12 17:03:53,416][62634] Updated weights for policy 0, policy_version 31230 (0.0010) [2023-10-12 17:03:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 63897600. Throughput: 0: 1693.9, 1: 1682.6. 
Samples: 15982044. Policy #0 lag: (min: 1.0, avg: 27.3, max: 32.0) [2023-10-12 17:03:53,435][61643] Avg episode reward: [(0, '5.750'), (1, '9.050')] [2023-10-12 17:03:56,078][62635] Updated weights for policy 1, policy_version 31210 (0.0009) [2023-10-12 17:03:56,450][62635] Updated weights for policy 1, policy_version 31220 (0.0007) [2023-10-12 17:03:56,828][62635] Updated weights for policy 1, policy_version 31230 (0.0010) [2023-10-12 17:03:57,286][62634] Updated weights for policy 0, policy_version 31240 (0.0010) [2023-10-12 17:03:57,672][62634] Updated weights for policy 0, policy_version 31250 (0.0008) [2023-10-12 17:03:58,047][62634] Updated weights for policy 0, policy_version 31260 (0.0007) [2023-10-12 17:03:58,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 63995904. Throughput: 0: 1694.2, 1: 1663.2. Samples: 16001838. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:03:58,436][61643] Avg episode reward: [(0, '5.750'), (1, '9.030')] [2023-10-12 17:04:01,067][62635] Updated weights for policy 1, policy_version 31240 (0.0010) [2023-10-12 17:04:01,438][62635] Updated weights for policy 1, policy_version 31250 (0.0011) [2023-10-12 17:04:01,798][62635] Updated weights for policy 1, policy_version 31260 (0.0010) [2023-10-12 17:04:02,085][62634] Updated weights for policy 0, policy_version 31270 (0.0008) [2023-10-12 17:04:02,468][62634] Updated weights for policy 0, policy_version 31280 (0.0007) [2023-10-12 17:04:02,842][62634] Updated weights for policy 0, policy_version 31290 (0.0008) [2023-10-12 17:04:03,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64061440. Throughput: 0: 1667.8, 1: 1685.3. Samples: 16021418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:04:03,436][61643] Avg episode reward: [(0, '5.860'), (1, '9.240')] [2023-10-12 17:04:05,923][62635] Updated weights for policy 1, policy_version 31270 (0.0008) [2023-10-12 17:04:06,289][62635] Updated weights for policy 1, policy_version 31280 (0.0009) [2023-10-12 17:04:06,659][62635] Updated weights for policy 1, policy_version 31290 (0.0008) [2023-10-12 17:04:06,913][62634] Updated weights for policy 0, policy_version 31300 (0.0008) [2023-10-12 17:04:07,307][62634] Updated weights for policy 0, policy_version 31310 (0.0010) [2023-10-12 17:04:07,685][62634] Updated weights for policy 0, policy_version 31320 (0.0007) [2023-10-12 17:04:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 64126976. Throughput: 0: 1695.7, 1: 1679.4. Samples: 16032538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:04:08,436][61643] Avg episode reward: [(0, '5.690'), (1, '9.200')] [2023-10-12 17:04:10,490][62635] Updated weights for policy 1, policy_version 31300 (0.0007) [2023-10-12 17:04:10,855][62635] Updated weights for policy 1, policy_version 31310 (0.0007) [2023-10-12 17:04:11,222][62635] Updated weights for policy 1, policy_version 31320 (0.0007) [2023-10-12 17:04:11,688][62634] Updated weights for policy 0, policy_version 31330 (0.0007) [2023-10-12 17:04:12,064][62634] Updated weights for policy 0, policy_version 31340 (0.0009) [2023-10-12 17:04:12,446][62634] Updated weights for policy 0, policy_version 31350 (0.0009) [2023-10-12 17:04:12,825][62634] Updated weights for policy 0, policy_version 31360 (0.0009) [2023-10-12 17:04:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64192512. Throughput: 0: 1687.4, 1: 1665.4. Samples: 16052004. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:04:13,436][61643] Avg episode reward: [(0, '5.750'), (1, '9.140')] [2023-10-12 17:04:15,407][62635] Updated weights for policy 1, policy_version 31330 (0.0008) [2023-10-12 17:04:15,778][62635] Updated weights for policy 1, policy_version 31340 (0.0011) [2023-10-12 17:04:16,142][62635] Updated weights for policy 1, policy_version 31350 (0.0008) [2023-10-12 17:04:16,516][62635] Updated weights for policy 1, policy_version 31360 (0.0010) [2023-10-12 17:04:16,790][62634] Updated weights for policy 0, policy_version 31370 (0.0007) [2023-10-12 17:04:17,173][62634] Updated weights for policy 0, policy_version 31380 (0.0008) [2023-10-12 17:04:17,557][62634] Updated weights for policy 0, policy_version 31390 (0.0008) [2023-10-12 17:04:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64258048. Throughput: 0: 1664.1, 1: 1688.3. Samples: 16071704. Policy #0 lag: (min: 18.0, avg: 30.1, max: 50.0) [2023-10-12 17:04:18,435][61643] Avg episode reward: [(0, '5.830'), (1, '9.160')] [2023-10-12 17:04:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000031360_32112640.pth... [2023-10-12 17:04:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000031392_32145408.pth... 
[2023-10-12 17:04:18,496][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000029824_30539776.pth [2023-10-12 17:04:18,497][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000029792_30507008.pth [2023-10-12 17:04:18,502][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000031392_32145408.pth [2023-10-12 17:04:18,502][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000031360_32112640.pth [2023-10-12 17:04:20,558][62635] Updated weights for policy 1, policy_version 31370 (0.0010) [2023-10-12 17:04:20,929][62635] Updated weights for policy 1, policy_version 31380 (0.0010) [2023-10-12 17:04:21,300][62635] Updated weights for policy 1, policy_version 31390 (0.0007) [2023-10-12 17:04:21,742][62634] Updated weights for policy 0, policy_version 31400 (0.0008) [2023-10-12 17:04:22,119][62634] Updated weights for policy 0, policy_version 31410 (0.0009) [2023-10-12 17:04:22,505][62634] Updated weights for policy 0, policy_version 31420 (0.0009) [2023-10-12 17:04:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64323584. Throughput: 0: 1687.9, 1: 1665.3. Samples: 16082310. 
Policy #0 lag: (min: 18.0, avg: 30.1, max: 50.0) [2023-10-12 17:04:23,435][61643] Avg episode reward: [(0, '6.000'), (1, '9.100')] [2023-10-12 17:04:25,336][62635] Updated weights for policy 1, policy_version 31400 (0.0010) [2023-10-12 17:04:25,715][62635] Updated weights for policy 1, policy_version 31410 (0.0009) [2023-10-12 17:04:26,089][62635] Updated weights for policy 1, policy_version 31420 (0.0010) [2023-10-12 17:04:26,523][62634] Updated weights for policy 0, policy_version 31430 (0.0010) [2023-10-12 17:04:26,895][62634] Updated weights for policy 0, policy_version 31440 (0.0008) [2023-10-12 17:04:27,279][62634] Updated weights for policy 0, policy_version 31450 (0.0008) [2023-10-12 17:04:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64389120. Throughput: 0: 1669.5, 1: 1676.8. Samples: 16102006. Policy #0 lag: (min: 18.0, avg: 30.1, max: 50.0) [2023-10-12 17:04:28,436][61643] Avg episode reward: [(0, '6.080'), (1, '8.890')] [2023-10-12 17:04:30,270][62635] Updated weights for policy 1, policy_version 31430 (0.0008) [2023-10-12 17:04:30,636][62635] Updated weights for policy 1, policy_version 31440 (0.0009) [2023-10-12 17:04:30,995][62635] Updated weights for policy 1, policy_version 31450 (0.0010) [2023-10-12 17:04:31,381][62634] Updated weights for policy 0, policy_version 31460 (0.0007) [2023-10-12 17:04:31,762][62634] Updated weights for policy 0, policy_version 31470 (0.0009) [2023-10-12 17:04:32,150][62634] Updated weights for policy 0, policy_version 31480 (0.0010) [2023-10-12 17:04:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64454656. Throughput: 0: 1662.6, 1: 1687.7. Samples: 16122096. 
Policy #0 lag: (min: 18.0, avg: 30.1, max: 50.0) [2023-10-12 17:04:33,435][61643] Avg episode reward: [(0, '6.050'), (1, '9.110')] [2023-10-12 17:04:35,052][62635] Updated weights for policy 1, policy_version 31460 (0.0008) [2023-10-12 17:04:35,425][62635] Updated weights for policy 1, policy_version 31470 (0.0010) [2023-10-12 17:04:35,793][62635] Updated weights for policy 1, policy_version 31480 (0.0010) [2023-10-12 17:04:36,276][62634] Updated weights for policy 0, policy_version 31490 (0.0009) [2023-10-12 17:04:36,658][62634] Updated weights for policy 0, policy_version 31500 (0.0009) [2023-10-12 17:04:37,030][62634] Updated weights for policy 0, policy_version 31510 (0.0008) [2023-10-12 17:04:37,400][62634] Updated weights for policy 0, policy_version 31520 (0.0007) [2023-10-12 17:04:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64520192. Throughput: 0: 1681.6, 1: 1661.7. Samples: 16132490. Policy #0 lag: (min: 18.0, avg: 30.1, max: 50.0) [2023-10-12 17:04:38,435][61643] Avg episode reward: [(0, '6.150'), (1, '9.090')] [2023-10-12 17:04:38,436][62354] Saving new best policy, reward=6.150! [2023-10-12 17:04:39,778][62635] Updated weights for policy 1, policy_version 31490 (0.0009) [2023-10-12 17:04:40,150][62635] Updated weights for policy 1, policy_version 31500 (0.0009) [2023-10-12 17:04:40,534][62635] Updated weights for policy 1, policy_version 31510 (0.0010) [2023-10-12 17:04:40,906][62635] Updated weights for policy 1, policy_version 31520 (0.0008) [2023-10-12 17:04:41,406][62634] Updated weights for policy 0, policy_version 31530 (0.0010) [2023-10-12 17:04:41,785][62634] Updated weights for policy 0, policy_version 31540 (0.0008) [2023-10-12 17:04:42,161][62634] Updated weights for policy 0, policy_version 31550 (0.0008) [2023-10-12 17:04:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 64585728. Throughput: 0: 1659.6, 1: 1686.4. Samples: 16152408. 
Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) [2023-10-12 17:04:43,436][61643] Avg episode reward: [(0, '6.240'), (1, '8.980')] [2023-10-12 17:04:43,437][62354] Saving new best policy, reward=6.240! [2023-10-12 17:04:44,997][62635] Updated weights for policy 1, policy_version 31530 (0.0007) [2023-10-12 17:04:45,372][62635] Updated weights for policy 1, policy_version 31540 (0.0008) [2023-10-12 17:04:45,739][62635] Updated weights for policy 1, policy_version 31550 (0.0008) [2023-10-12 17:04:46,281][62634] Updated weights for policy 0, policy_version 31560 (0.0010) [2023-10-12 17:04:46,658][62634] Updated weights for policy 0, policy_version 31570 (0.0007) [2023-10-12 17:04:47,027][62634] Updated weights for policy 0, policy_version 31580 (0.0011) [2023-10-12 17:04:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64651264. Throughput: 0: 1678.2, 1: 1684.3. Samples: 16172732. Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) [2023-10-12 17:04:48,435][61643] Avg episode reward: [(0, '6.490'), (1, '8.950')] [2023-10-12 17:04:48,443][62354] Saving new best policy, reward=6.490! [2023-10-12 17:04:49,913][62635] Updated weights for policy 1, policy_version 31560 (0.0009) [2023-10-12 17:04:50,288][62635] Updated weights for policy 1, policy_version 31570 (0.0010) [2023-10-12 17:04:50,660][62635] Updated weights for policy 1, policy_version 31580 (0.0008) [2023-10-12 17:04:51,099][62634] Updated weights for policy 0, policy_version 31590 (0.0009) [2023-10-12 17:04:51,474][62634] Updated weights for policy 0, policy_version 31600 (0.0008) [2023-10-12 17:04:51,853][62634] Updated weights for policy 0, policy_version 31610 (0.0009) [2023-10-12 17:04:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64716800. Throughput: 0: 1682.4, 1: 1661.3. Samples: 16183006. 
Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) [2023-10-12 17:04:53,435][61643] Avg episode reward: [(0, '6.510'), (1, '9.200')] [2023-10-12 17:04:53,436][62354] Saving new best policy, reward=6.510! [2023-10-12 17:04:54,529][62635] Updated weights for policy 1, policy_version 31590 (0.0007) [2023-10-12 17:04:54,894][62635] Updated weights for policy 1, policy_version 31600 (0.0007) [2023-10-12 17:04:55,255][62635] Updated weights for policy 1, policy_version 31610 (0.0009) [2023-10-12 17:04:55,936][62634] Updated weights for policy 0, policy_version 31620 (0.0008) [2023-10-12 17:04:56,301][62634] Updated weights for policy 0, policy_version 31630 (0.0009) [2023-10-12 17:04:56,679][62634] Updated weights for policy 0, policy_version 31640 (0.0009) [2023-10-12 17:04:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64782336. Throughput: 0: 1662.7, 1: 1684.5. Samples: 16202630. Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) [2023-10-12 17:04:58,436][61643] Avg episode reward: [(0, '6.500'), (1, '9.000')] [2023-10-12 17:04:59,320][62635] Updated weights for policy 1, policy_version 31620 (0.0008) [2023-10-12 17:04:59,680][62635] Updated weights for policy 1, policy_version 31630 (0.0008) [2023-10-12 17:05:00,039][62635] Updated weights for policy 1, policy_version 31640 (0.0011) [2023-10-12 17:05:00,618][62634] Updated weights for policy 0, policy_version 31650 (0.0010) [2023-10-12 17:05:01,011][62634] Updated weights for policy 0, policy_version 31660 (0.0008) [2023-10-12 17:05:01,394][62634] Updated weights for policy 0, policy_version 31670 (0.0011) [2023-10-12 17:05:01,771][62634] Updated weights for policy 0, policy_version 31680 (0.0008) [2023-10-12 17:05:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64847872. Throughput: 0: 1685.6, 1: 1688.7. Samples: 16223544. 
Policy #0 lag: (min: 14.0, avg: 21.2, max: 46.0) [2023-10-12 17:05:03,435][61643] Avg episode reward: [(0, '6.420'), (1, '8.800')] [2023-10-12 17:05:04,083][62635] Updated weights for policy 1, policy_version 31650 (0.0009) [2023-10-12 17:05:04,447][62635] Updated weights for policy 1, policy_version 31660 (0.0009) [2023-10-12 17:05:04,813][62635] Updated weights for policy 1, policy_version 31670 (0.0008) [2023-10-12 17:05:05,189][62635] Updated weights for policy 1, policy_version 31680 (0.0008) [2023-10-12 17:05:05,859][62634] Updated weights for policy 0, policy_version 31690 (0.0009) [2023-10-12 17:05:06,233][62634] Updated weights for policy 0, policy_version 31700 (0.0009) [2023-10-12 17:05:06,612][62634] Updated weights for policy 0, policy_version 31710 (0.0009) [2023-10-12 17:05:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64913408. Throughput: 0: 1673.9, 1: 1678.3. Samples: 16233158. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 17:05:08,436][61643] Avg episode reward: [(0, '6.660'), (1, '8.950')] [2023-10-12 17:05:08,437][62354] Saving new best policy, reward=6.660! [2023-10-12 17:05:09,395][62635] Updated weights for policy 1, policy_version 31690 (0.0008) [2023-10-12 17:05:09,761][62635] Updated weights for policy 1, policy_version 31700 (0.0008) [2023-10-12 17:05:10,145][62635] Updated weights for policy 1, policy_version 31710 (0.0009) [2023-10-12 17:05:10,559][62634] Updated weights for policy 0, policy_version 31720 (0.0008) [2023-10-12 17:05:10,937][62634] Updated weights for policy 0, policy_version 31730 (0.0007) [2023-10-12 17:05:11,313][62634] Updated weights for policy 0, policy_version 31740 (0.0007) [2023-10-12 17:05:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64978944. Throughput: 0: 1675.6, 1: 1682.6. Samples: 16253124. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 17:05:13,436][61643] Avg episode reward: [(0, '6.720'), (1, '8.930')] [2023-10-12 17:05:13,437][62354] Saving new best policy, reward=6.720! [2023-10-12 17:05:14,214][62635] Updated weights for policy 1, policy_version 31720 (0.0007) [2023-10-12 17:05:14,594][62635] Updated weights for policy 1, policy_version 31730 (0.0009) [2023-10-12 17:05:14,959][62635] Updated weights for policy 1, policy_version 31740 (0.0007) [2023-10-12 17:05:15,502][62634] Updated weights for policy 0, policy_version 31750 (0.0007) [2023-10-12 17:05:15,878][62634] Updated weights for policy 0, policy_version 31760 (0.0008) [2023-10-12 17:05:16,256][62634] Updated weights for policy 0, policy_version 31770 (0.0007) [2023-10-12 17:05:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65044480. Throughput: 0: 1690.4, 1: 1683.3. Samples: 16273912. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 17:05:18,435][61643] Avg episode reward: [(0, '6.460'), (1, '8.710')] [2023-10-12 17:05:19,087][62635] Updated weights for policy 1, policy_version 31750 (0.0008) [2023-10-12 17:05:19,457][62635] Updated weights for policy 1, policy_version 31760 (0.0007) [2023-10-12 17:05:19,827][62635] Updated weights for policy 1, policy_version 31770 (0.0009) [2023-10-12 17:05:20,293][62634] Updated weights for policy 0, policy_version 31780 (0.0008) [2023-10-12 17:05:20,676][62634] Updated weights for policy 0, policy_version 31790 (0.0009) [2023-10-12 17:05:21,052][62634] Updated weights for policy 0, policy_version 31800 (0.0007) [2023-10-12 17:05:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65110016. Throughput: 0: 1672.4, 1: 1681.2. Samples: 16283398. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 17:05:23,436][61643] Avg episode reward: [(0, '6.450'), (1, '8.630')] [2023-10-12 17:05:23,929][62635] Updated weights for policy 1, policy_version 31780 (0.0007) [2023-10-12 17:05:24,298][62635] Updated weights for policy 1, policy_version 31790 (0.0008) [2023-10-12 17:05:24,661][62635] Updated weights for policy 1, policy_version 31800 (0.0009) [2023-10-12 17:05:25,254][62634] Updated weights for policy 0, policy_version 31810 (0.0008) [2023-10-12 17:05:25,640][62634] Updated weights for policy 0, policy_version 31820 (0.0007) [2023-10-12 17:05:26,026][62634] Updated weights for policy 0, policy_version 31830 (0.0007) [2023-10-12 17:05:26,404][62634] Updated weights for policy 0, policy_version 31840 (0.0007) [2023-10-12 17:05:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65175552. Throughput: 0: 1682.9, 1: 1678.7. Samples: 16303680. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-12 17:05:28,436][61643] Avg episode reward: [(0, '6.360'), (1, '8.790')] [2023-10-12 17:05:28,541][62635] Updated weights for policy 1, policy_version 31810 (0.0009) [2023-10-12 17:05:28,919][62635] Updated weights for policy 1, policy_version 31820 (0.0008) [2023-10-12 17:05:29,287][62635] Updated weights for policy 1, policy_version 31830 (0.0007) [2023-10-12 17:05:29,653][62635] Updated weights for policy 1, policy_version 31840 (0.0007) [2023-10-12 17:05:30,322][62634] Updated weights for policy 0, policy_version 31850 (0.0010) [2023-10-12 17:05:30,692][62634] Updated weights for policy 0, policy_version 31860 (0.0011) [2023-10-12 17:05:31,059][62634] Updated weights for policy 0, policy_version 31870 (0.0009) [2023-10-12 17:05:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 65241088. Throughput: 0: 1681.5, 1: 1685.7. Samples: 16324260. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:05:33,436][61643] Avg episode reward: [(0, '6.490'), (1, '8.880')] [2023-10-12 17:05:33,729][62635] Updated weights for policy 1, policy_version 31850 (0.0010) [2023-10-12 17:05:34,093][62635] Updated weights for policy 1, policy_version 31860 (0.0010) [2023-10-12 17:05:34,463][62635] Updated weights for policy 1, policy_version 31870 (0.0010) [2023-10-12 17:05:35,083][62634] Updated weights for policy 0, policy_version 31880 (0.0009) [2023-10-12 17:05:35,463][62634] Updated weights for policy 0, policy_version 31890 (0.0009) [2023-10-12 17:05:35,837][62634] Updated weights for policy 0, policy_version 31900 (0.0008) [2023-10-12 17:05:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65306624. Throughput: 0: 1654.0, 1: 1687.9. Samples: 16333392. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:05:38,435][61643] Avg episode reward: [(0, '6.410'), (1, '8.800')] [2023-10-12 17:05:38,759][62635] Updated weights for policy 1, policy_version 31880 (0.0010) [2023-10-12 17:05:39,138][62635] Updated weights for policy 1, policy_version 31890 (0.0009) [2023-10-12 17:05:39,508][62635] Updated weights for policy 1, policy_version 31900 (0.0009) [2023-10-12 17:05:39,760][62634] Updated weights for policy 0, policy_version 31910 (0.0009) [2023-10-12 17:05:40,131][62634] Updated weights for policy 0, policy_version 31920 (0.0009) [2023-10-12 17:05:40,514][62634] Updated weights for policy 0, policy_version 31930 (0.0010) [2023-10-12 17:05:43,386][62635] Updated weights for policy 1, policy_version 31910 (0.0009) [2023-10-12 17:05:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65372160. Throughput: 0: 1684.0, 1: 1685.5. Samples: 16354254. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:05:43,435][61643] Avg episode reward: [(0, '6.330'), (1, '8.700')] [2023-10-12 17:05:43,757][62635] Updated weights for policy 1, policy_version 31920 (0.0009) [2023-10-12 17:05:44,123][62635] Updated weights for policy 1, policy_version 31930 (0.0010) [2023-10-12 17:05:44,357][62634] Updated weights for policy 0, policy_version 31940 (0.0008) [2023-10-12 17:05:44,742][62634] Updated weights for policy 0, policy_version 31950 (0.0010) [2023-10-12 17:05:45,107][62634] Updated weights for policy 0, policy_version 31960 (0.0010) [2023-10-12 17:05:47,975][62635] Updated weights for policy 1, policy_version 31940 (0.0008) [2023-10-12 17:05:48,347][62635] Updated weights for policy 1, policy_version 31950 (0.0009) [2023-10-12 17:05:48,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 65437696. Throughput: 0: 1684.6, 1: 1683.5. Samples: 16375110. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:05:48,436][61643] Avg episode reward: [(0, '6.550'), (1, '8.800')] [2023-10-12 17:05:48,712][62635] Updated weights for policy 1, policy_version 31960 (0.0007) [2023-10-12 17:05:49,315][62634] Updated weights for policy 0, policy_version 31970 (0.0011) [2023-10-12 17:05:49,722][62634] Updated weights for policy 0, policy_version 31980 (0.0009) [2023-10-12 17:05:50,084][62634] Updated weights for policy 0, policy_version 31990 (0.0008) [2023-10-12 17:05:50,462][62634] Updated weights for policy 0, policy_version 32000 (0.0008) [2023-10-12 17:05:52,630][62635] Updated weights for policy 1, policy_version 31970 (0.0009) [2023-10-12 17:05:52,998][62635] Updated weights for policy 1, policy_version 31980 (0.0008) [2023-10-12 17:05:53,362][62635] Updated weights for policy 1, policy_version 31990 (0.0011) [2023-10-12 17:05:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65503232. Throughput: 0: 1666.0, 1: 1692.1. 
Samples: 16384272. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:05:53,435][61643] Avg episode reward: [(0, '6.580'), (1, '9.010')] [2023-10-12 17:05:53,725][62635] Updated weights for policy 1, policy_version 32000 (0.0008) [2023-10-12 17:05:54,706][62634] Updated weights for policy 0, policy_version 32010 (0.0011) [2023-10-12 17:05:55,070][62634] Updated weights for policy 0, policy_version 32020 (0.0010) [2023-10-12 17:05:55,454][62634] Updated weights for policy 0, policy_version 32030 (0.0009) [2023-10-12 17:05:57,791][62635] Updated weights for policy 1, policy_version 32010 (0.0007) [2023-10-12 17:05:58,164][62635] Updated weights for policy 1, policy_version 32020 (0.0007) [2023-10-12 17:05:58,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 65568768. Throughput: 0: 1678.7, 1: 1695.3. Samples: 16404954. Policy #0 lag: (min: 24.0, avg: 46.4, max: 56.0) [2023-10-12 17:05:58,435][61643] Avg episode reward: [(0, '6.490'), (1, '8.750')] [2023-10-12 17:05:58,536][62635] Updated weights for policy 1, policy_version 32030 (0.0007) [2023-10-12 17:05:59,563][62634] Updated weights for policy 0, policy_version 32040 (0.0009) [2023-10-12 17:05:59,944][62634] Updated weights for policy 0, policy_version 32050 (0.0008) [2023-10-12 17:06:00,321][62634] Updated weights for policy 0, policy_version 32060 (0.0007) [2023-10-12 17:06:02,702][62635] Updated weights for policy 1, policy_version 32040 (0.0007) [2023-10-12 17:06:03,073][62635] Updated weights for policy 1, policy_version 32050 (0.0007) [2023-10-12 17:06:03,431][62635] Updated weights for policy 1, policy_version 32060 (0.0007) [2023-10-12 17:06:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65634304. Throughput: 0: 1682.0, 1: 1678.6. Samples: 16425138. 
Policy #0 lag: (min: 24.0, avg: 46.4, max: 56.0) [2023-10-12 17:06:03,435][61643] Avg episode reward: [(0, '6.690'), (1, '8.780')] [2023-10-12 17:06:04,422][62634] Updated weights for policy 0, policy_version 32070 (0.0007) [2023-10-12 17:06:04,804][62634] Updated weights for policy 0, policy_version 32080 (0.0007) [2023-10-12 17:06:05,188][62634] Updated weights for policy 0, policy_version 32090 (0.0008) [2023-10-12 17:06:07,571][62635] Updated weights for policy 1, policy_version 32070 (0.0008) [2023-10-12 17:06:07,945][62635] Updated weights for policy 1, policy_version 32080 (0.0007) [2023-10-12 17:06:08,308][62635] Updated weights for policy 1, policy_version 32090 (0.0009) [2023-10-12 17:06:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65699840. Throughput: 0: 1672.4, 1: 1696.6. Samples: 16435004. Policy #0 lag: (min: 24.0, avg: 46.4, max: 56.0) [2023-10-12 17:06:08,435][61643] Avg episode reward: [(0, '6.490'), (1, '9.000')] [2023-10-12 17:06:09,203][62634] Updated weights for policy 0, policy_version 32100 (0.0009) [2023-10-12 17:06:09,572][62634] Updated weights for policy 0, policy_version 32110 (0.0009) [2023-10-12 17:06:09,944][62634] Updated weights for policy 0, policy_version 32120 (0.0010) [2023-10-12 17:06:12,335][62635] Updated weights for policy 1, policy_version 32100 (0.0008) [2023-10-12 17:06:12,700][62635] Updated weights for policy 1, policy_version 32110 (0.0008) [2023-10-12 17:06:13,079][62635] Updated weights for policy 1, policy_version 32120 (0.0008) [2023-10-12 17:06:13,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65798144. Throughput: 0: 1679.3, 1: 1696.9. Samples: 16455610. Policy #0 lag: (min: 24.0, avg: 46.4, max: 56.0) [2023-10-12 17:06:13,436][61643] Avg episode reward: [(0, '6.780'), (1, '8.860')] [2023-10-12 17:06:13,437][62354] Saving new best policy, reward=6.780! 
[2023-10-12 17:06:13,948][62634] Updated weights for policy 0, policy_version 32130 (0.0010) [2023-10-12 17:06:14,322][62634] Updated weights for policy 0, policy_version 32140 (0.0008) [2023-10-12 17:06:14,704][62634] Updated weights for policy 0, policy_version 32150 (0.0009) [2023-10-12 17:06:15,083][62634] Updated weights for policy 0, policy_version 32160 (0.0010) [2023-10-12 17:06:17,115][62635] Updated weights for policy 1, policy_version 32130 (0.0008) [2023-10-12 17:06:17,476][62635] Updated weights for policy 1, policy_version 32140 (0.0007) [2023-10-12 17:06:17,843][62635] Updated weights for policy 1, policy_version 32150 (0.0007) [2023-10-12 17:06:18,216][62635] Updated weights for policy 1, policy_version 32160 (0.0010) [2023-10-12 17:06:18,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65863680. Throughput: 0: 1686.3, 1: 1673.6. Samples: 16475454. Policy #0 lag: (min: 24.0, avg: 46.4, max: 56.0) [2023-10-12 17:06:18,435][61643] Avg episode reward: [(0, '6.430'), (1, '9.060')] [2023-10-12 17:06:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000032160_32931840.pth... [2023-10-12 17:06:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000032160_32931840.pth... 
[2023-10-12 17:06:18,474][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000030592_31326208.pth [2023-10-12 17:06:18,480][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000030592_31326208.pth [2023-10-12 17:06:19,088][62634] Updated weights for policy 0, policy_version 32170 (0.0007) [2023-10-12 17:06:19,461][62634] Updated weights for policy 0, policy_version 32180 (0.0007) [2023-10-12 17:06:19,841][62634] Updated weights for policy 0, policy_version 32190 (0.0009) [2023-10-12 17:06:22,297][62635] Updated weights for policy 1, policy_version 32170 (0.0008) [2023-10-12 17:06:22,664][62635] Updated weights for policy 1, policy_version 32180 (0.0007) [2023-10-12 17:06:23,036][62635] Updated weights for policy 1, policy_version 32190 (0.0008) [2023-10-12 17:06:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65929216. Throughput: 0: 1683.9, 1: 1696.2. Samples: 16485494. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 17:06:23,436][61643] Avg episode reward: [(0, '6.480'), (1, '9.270')] [2023-10-12 17:06:23,847][62634] Updated weights for policy 0, policy_version 32200 (0.0010) [2023-10-12 17:06:24,222][62634] Updated weights for policy 0, policy_version 32210 (0.0010) [2023-10-12 17:06:24,597][62634] Updated weights for policy 0, policy_version 32220 (0.0009) [2023-10-12 17:06:27,242][62635] Updated weights for policy 1, policy_version 32200 (0.0007) [2023-10-12 17:06:27,608][62635] Updated weights for policy 1, policy_version 32210 (0.0007) [2023-10-12 17:06:27,986][62635] Updated weights for policy 1, policy_version 32220 (0.0009) [2023-10-12 17:06:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65994752. Throughput: 0: 1687.3, 1: 1694.7. Samples: 16506448. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 17:06:28,436][61643] Avg episode reward: [(0, '6.550'), (1, '9.220')] [2023-10-12 17:06:28,669][62634] Updated weights for policy 0, policy_version 32230 (0.0007) [2023-10-12 17:06:29,042][62634] Updated weights for policy 0, policy_version 32240 (0.0010) [2023-10-12 17:06:29,420][62634] Updated weights for policy 0, policy_version 32250 (0.0009) [2023-10-12 17:06:31,882][62635] Updated weights for policy 1, policy_version 32230 (0.0009) [2023-10-12 17:06:32,246][62635] Updated weights for policy 1, policy_version 32240 (0.0009) [2023-10-12 17:06:32,606][62635] Updated weights for policy 1, policy_version 32250 (0.0008) [2023-10-12 17:06:33,368][62634] Updated weights for policy 0, policy_version 32260 (0.0008) [2023-10-12 17:06:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 66060288. Throughput: 0: 1688.0, 1: 1666.8. Samples: 16526076. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 17:06:33,436][61643] Avg episode reward: [(0, '6.470'), (1, '9.400')] [2023-10-12 17:06:33,754][62634] Updated weights for policy 0, policy_version 32270 (0.0009) [2023-10-12 17:06:34,127][62634] Updated weights for policy 0, policy_version 32280 (0.0007) [2023-10-12 17:06:36,639][62635] Updated weights for policy 1, policy_version 32260 (0.0007) [2023-10-12 17:06:37,005][62635] Updated weights for policy 1, policy_version 32270 (0.0010) [2023-10-12 17:06:37,371][62635] Updated weights for policy 1, policy_version 32280 (0.0011) [2023-10-12 17:06:38,255][62634] Updated weights for policy 0, policy_version 32290 (0.0010) [2023-10-12 17:06:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66125824. Throughput: 0: 1692.4, 1: 1689.5. Samples: 16536456. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 17:06:38,435][61643] Avg episode reward: [(0, '6.390'), (1, '9.260')] [2023-10-12 17:06:38,648][62634] Updated weights for policy 0, policy_version 32300 (0.0008) [2023-10-12 17:06:39,035][62634] Updated weights for policy 0, policy_version 32310 (0.0007) [2023-10-12 17:06:39,409][62634] Updated weights for policy 0, policy_version 32320 (0.0008) [2023-10-12 17:06:41,456][62635] Updated weights for policy 1, policy_version 32290 (0.0009) [2023-10-12 17:06:41,823][62635] Updated weights for policy 1, policy_version 32300 (0.0007) [2023-10-12 17:06:42,188][62635] Updated weights for policy 1, policy_version 32310 (0.0008) [2023-10-12 17:06:42,555][62635] Updated weights for policy 1, policy_version 32320 (0.0007) [2023-10-12 17:06:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66191360. Throughput: 0: 1695.2, 1: 1673.6. Samples: 16556552. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 17:06:43,435][61643] Avg episode reward: [(0, '6.470'), (1, '9.130')] [2023-10-12 17:06:43,511][62634] Updated weights for policy 0, policy_version 32330 (0.0008) [2023-10-12 17:06:43,876][62634] Updated weights for policy 0, policy_version 32340 (0.0010) [2023-10-12 17:06:44,256][62634] Updated weights for policy 0, policy_version 32350 (0.0008) [2023-10-12 17:06:46,599][62635] Updated weights for policy 1, policy_version 32330 (0.0008) [2023-10-12 17:06:46,973][62635] Updated weights for policy 1, policy_version 32340 (0.0009) [2023-10-12 17:06:47,348][62635] Updated weights for policy 1, policy_version 32350 (0.0007) [2023-10-12 17:06:48,313][62634] Updated weights for policy 0, policy_version 32360 (0.0009) [2023-10-12 17:06:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 66256896. Throughput: 0: 1694.1, 1: 1676.3. Samples: 16576806. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:06:48,435][61643] Avg episode reward: [(0, '6.390'), (1, '9.180')] [2023-10-12 17:06:48,690][62634] Updated weights for policy 0, policy_version 32370 (0.0009) [2023-10-12 17:06:49,065][62634] Updated weights for policy 0, policy_version 32380 (0.0010) [2023-10-12 17:06:51,471][62635] Updated weights for policy 1, policy_version 32360 (0.0008) [2023-10-12 17:06:51,836][62635] Updated weights for policy 1, policy_version 32370 (0.0009) [2023-10-12 17:06:52,215][62635] Updated weights for policy 1, policy_version 32380 (0.0008) [2023-10-12 17:06:53,084][62634] Updated weights for policy 0, policy_version 32390 (0.0009) [2023-10-12 17:06:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66322432. Throughput: 0: 1693.6, 1: 1688.0. Samples: 16587176. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:06:53,435][61643] Avg episode reward: [(0, '6.910'), (1, '9.200')] [2023-10-12 17:06:53,465][62634] Updated weights for policy 0, policy_version 32400 (0.0009) [2023-10-12 17:06:53,832][62634] Updated weights for policy 0, policy_version 32410 (0.0008) [2023-10-12 17:06:54,059][62354] Saving new best policy, reward=6.910! [2023-10-12 17:06:56,393][62635] Updated weights for policy 1, policy_version 32390 (0.0007) [2023-10-12 17:06:56,753][62635] Updated weights for policy 1, policy_version 32400 (0.0008) [2023-10-12 17:06:57,123][62635] Updated weights for policy 1, policy_version 32410 (0.0009) [2023-10-12 17:06:57,773][62634] Updated weights for policy 0, policy_version 32420 (0.0007) [2023-10-12 17:06:58,156][62634] Updated weights for policy 0, policy_version 32430 (0.0007) [2023-10-12 17:06:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66387968. Throughput: 0: 1701.0, 1: 1667.4. Samples: 16607188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:06:58,435][61643] Avg episode reward: [(0, '6.590'), (1, '9.090')] [2023-10-12 17:06:58,530][62634] Updated weights for policy 0, policy_version 32440 (0.0009) [2023-10-12 17:07:01,040][62635] Updated weights for policy 1, policy_version 32420 (0.0007) [2023-10-12 17:07:01,396][62635] Updated weights for policy 1, policy_version 32430 (0.0008) [2023-10-12 17:07:01,766][62635] Updated weights for policy 1, policy_version 32440 (0.0008) [2023-10-12 17:07:02,363][62634] Updated weights for policy 0, policy_version 32450 (0.0009) [2023-10-12 17:07:02,744][62634] Updated weights for policy 0, policy_version 32460 (0.0010) [2023-10-12 17:07:03,115][62634] Updated weights for policy 0, policy_version 32470 (0.0010) [2023-10-12 17:07:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66453504. Throughput: 0: 1686.9, 1: 1683.6. Samples: 16627130. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:07:03,436][61643] Avg episode reward: [(0, '6.660'), (1, '9.070')] [2023-10-12 17:07:03,490][62634] Updated weights for policy 0, policy_version 32480 (0.0008) [2023-10-12 17:07:05,737][62635] Updated weights for policy 1, policy_version 32450 (0.0007) [2023-10-12 17:07:06,108][62635] Updated weights for policy 1, policy_version 32460 (0.0008) [2023-10-12 17:07:06,476][62635] Updated weights for policy 1, policy_version 32470 (0.0012) [2023-10-12 17:07:06,837][62635] Updated weights for policy 1, policy_version 32480 (0.0009) [2023-10-12 17:07:07,461][62634] Updated weights for policy 0, policy_version 32490 (0.0008) [2023-10-12 17:07:07,835][62634] Updated weights for policy 0, policy_version 32500 (0.0008) [2023-10-12 17:07:08,211][62634] Updated weights for policy 0, policy_version 32510 (0.0008) [2023-10-12 17:07:08,435][61643] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 66551808. Throughput: 0: 1704.8, 1: 1681.6. 
Samples: 16637880. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 17:07:08,436][61643] Avg episode reward: [(0, '7.200'), (1, '8.930')] [2023-10-12 17:07:08,437][62354] Saving new best policy, reward=7.200! [2023-10-12 17:07:10,918][62635] Updated weights for policy 1, policy_version 32490 (0.0007) [2023-10-12 17:07:11,282][62635] Updated weights for policy 1, policy_version 32500 (0.0007) [2023-10-12 17:07:11,644][62635] Updated weights for policy 1, policy_version 32510 (0.0007) [2023-10-12 17:07:12,239][62634] Updated weights for policy 0, policy_version 32520 (0.0010) [2023-10-12 17:07:12,618][62634] Updated weights for policy 0, policy_version 32530 (0.0010) [2023-10-12 17:07:12,999][62634] Updated weights for policy 0, policy_version 32540 (0.0008) [2023-10-12 17:07:13,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 66617344. Throughput: 0: 1700.5, 1: 1660.6. Samples: 16657698. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 17:07:13,435][61643] Avg episode reward: [(0, '6.860'), (1, '8.880')] [2023-10-12 17:07:15,980][62635] Updated weights for policy 1, policy_version 32520 (0.0009) [2023-10-12 17:07:16,352][62635] Updated weights for policy 1, policy_version 32530 (0.0008) [2023-10-12 17:07:16,718][62635] Updated weights for policy 1, policy_version 32540 (0.0009) [2023-10-12 17:07:17,089][62634] Updated weights for policy 0, policy_version 32550 (0.0008) [2023-10-12 17:07:17,466][62634] Updated weights for policy 0, policy_version 32560 (0.0008) [2023-10-12 17:07:17,850][62634] Updated weights for policy 0, policy_version 32570 (0.0007) [2023-10-12 17:07:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 66682880. Throughput: 0: 1671.7, 1: 1689.1. Samples: 16677316. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 17:07:18,436][61643] Avg episode reward: [(0, '6.790'), (1, '8.780')] [2023-10-12 17:07:20,742][62635] Updated weights for policy 1, policy_version 32550 (0.0008) [2023-10-12 17:07:21,107][62635] Updated weights for policy 1, policy_version 32560 (0.0009) [2023-10-12 17:07:21,484][62635] Updated weights for policy 1, policy_version 32570 (0.0009) [2023-10-12 17:07:21,862][62634] Updated weights for policy 0, policy_version 32580 (0.0008) [2023-10-12 17:07:22,239][62634] Updated weights for policy 0, policy_version 32590 (0.0011) [2023-10-12 17:07:22,625][62634] Updated weights for policy 0, policy_version 32600 (0.0009) [2023-10-12 17:07:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 66748416. Throughput: 0: 1698.8, 1: 1676.8. Samples: 16688362. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 17:07:23,436][61643] Avg episode reward: [(0, '7.000'), (1, '8.610')] [2023-10-12 17:07:25,419][62635] Updated weights for policy 1, policy_version 32580 (0.0009) [2023-10-12 17:07:25,783][62635] Updated weights for policy 1, policy_version 32590 (0.0007) [2023-10-12 17:07:26,154][62635] Updated weights for policy 1, policy_version 32600 (0.0008) [2023-10-12 17:07:26,623][62634] Updated weights for policy 0, policy_version 32610 (0.0008) [2023-10-12 17:07:27,020][62634] Updated weights for policy 0, policy_version 32620 (0.0007) [2023-10-12 17:07:27,390][62634] Updated weights for policy 0, policy_version 32630 (0.0008) [2023-10-12 17:07:27,769][62634] Updated weights for policy 0, policy_version 32640 (0.0008) [2023-10-12 17:07:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 66813952. Throughput: 0: 1692.4, 1: 1675.7. Samples: 16708120. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-12 17:07:28,435][61643] Avg episode reward: [(0, '7.100'), (1, '8.720')] [2023-10-12 17:07:30,107][62635] Updated weights for policy 1, policy_version 32610 (0.0009) [2023-10-12 17:07:30,478][62635] Updated weights for policy 1, policy_version 32620 (0.0009) [2023-10-12 17:07:30,845][62635] Updated weights for policy 1, policy_version 32630 (0.0008) [2023-10-12 17:07:31,220][62635] Updated weights for policy 1, policy_version 32640 (0.0007) [2023-10-12 17:07:31,790][62634] Updated weights for policy 0, policy_version 32650 (0.0008) [2023-10-12 17:07:32,158][62634] Updated weights for policy 0, policy_version 32660 (0.0010) [2023-10-12 17:07:32,543][62634] Updated weights for policy 0, policy_version 32670 (0.0010) [2023-10-12 17:07:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 66879488. Throughput: 0: 1669.2, 1: 1692.9. Samples: 16728102. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 17:07:33,436][61643] Avg episode reward: [(0, '7.080'), (1, '8.830')] [2023-10-12 17:07:35,031][62635] Updated weights for policy 1, policy_version 32650 (0.0011) [2023-10-12 17:07:35,395][62635] Updated weights for policy 1, policy_version 32660 (0.0010) [2023-10-12 17:07:35,774][62635] Updated weights for policy 1, policy_version 32670 (0.0007) [2023-10-12 17:07:36,458][62634] Updated weights for policy 0, policy_version 32680 (0.0008) [2023-10-12 17:07:36,834][62634] Updated weights for policy 0, policy_version 32690 (0.0007) [2023-10-12 17:07:37,203][62634] Updated weights for policy 0, policy_version 32700 (0.0011) [2023-10-12 17:07:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 66945024. Throughput: 0: 1700.8, 1: 1664.1. Samples: 16738598. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 17:07:38,436][61643] Avg episode reward: [(0, '7.030'), (1, '8.590')] [2023-10-12 17:07:39,860][62635] Updated weights for policy 1, policy_version 32680 (0.0008) [2023-10-12 17:07:40,237][62635] Updated weights for policy 1, policy_version 32690 (0.0008) [2023-10-12 17:07:40,603][62635] Updated weights for policy 1, policy_version 32700 (0.0010) [2023-10-12 17:07:41,463][62634] Updated weights for policy 0, policy_version 32710 (0.0009) [2023-10-12 17:07:41,849][62634] Updated weights for policy 0, policy_version 32720 (0.0011) [2023-10-12 17:07:42,216][62634] Updated weights for policy 0, policy_version 32730 (0.0009) [2023-10-12 17:07:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 67010560. Throughput: 0: 1675.6, 1: 1689.0. Samples: 16758598. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 17:07:43,435][61643] Avg episode reward: [(0, '7.250'), (1, '8.800')] [2023-10-12 17:07:43,436][62354] Saving new best policy, reward=7.250! [2023-10-12 17:07:44,803][62635] Updated weights for policy 1, policy_version 32710 (0.0007) [2023-10-12 17:07:45,178][62635] Updated weights for policy 1, policy_version 32720 (0.0008) [2023-10-12 17:07:45,539][62635] Updated weights for policy 1, policy_version 32730 (0.0007) [2023-10-12 17:07:46,267][62634] Updated weights for policy 0, policy_version 32740 (0.0010) [2023-10-12 17:07:46,641][62634] Updated weights for policy 0, policy_version 32750 (0.0009) [2023-10-12 17:07:47,016][62634] Updated weights for policy 0, policy_version 32760 (0.0010) [2023-10-12 17:07:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 67076096. Throughput: 0: 1673.8, 1: 1695.9. Samples: 16778764. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 17:07:48,436][61643] Avg episode reward: [(0, '7.370'), (1, '9.010')] [2023-10-12 17:07:48,447][62354] Saving new best policy, reward=7.370! 
[2023-10-12 17:07:49,585][62635] Updated weights for policy 1, policy_version 32740 (0.0007) [2023-10-12 17:07:49,948][62635] Updated weights for policy 1, policy_version 32750 (0.0009) [2023-10-12 17:07:50,329][62635] Updated weights for policy 1, policy_version 32760 (0.0009) [2023-10-12 17:07:50,947][62634] Updated weights for policy 0, policy_version 32770 (0.0008) [2023-10-12 17:07:51,321][62634] Updated weights for policy 0, policy_version 32780 (0.0010) [2023-10-12 17:07:51,695][62634] Updated weights for policy 0, policy_version 32790 (0.0008) [2023-10-12 17:07:52,069][62634] Updated weights for policy 0, policy_version 32800 (0.0009) [2023-10-12 17:07:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67141632. Throughput: 0: 1684.3, 1: 1676.0. Samples: 16789092. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 17:07:53,435][61643] Avg episode reward: [(0, '7.350'), (1, '8.970')] [2023-10-12 17:07:54,290][62635] Updated weights for policy 1, policy_version 32770 (0.0007) [2023-10-12 17:07:54,665][62635] Updated weights for policy 1, policy_version 32780 (0.0007) [2023-10-12 17:07:55,024][62635] Updated weights for policy 1, policy_version 32790 (0.0009) [2023-10-12 17:07:55,396][62635] Updated weights for policy 1, policy_version 32800 (0.0007) [2023-10-12 17:07:56,216][62634] Updated weights for policy 0, policy_version 32810 (0.0008) [2023-10-12 17:07:56,597][62634] Updated weights for policy 0, policy_version 32820 (0.0007) [2023-10-12 17:07:56,978][62634] Updated weights for policy 0, policy_version 32830 (0.0008) [2023-10-12 17:07:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67207168. Throughput: 0: 1659.3, 1: 1699.5. Samples: 16808846. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:07:58,436][61643] Avg episode reward: [(0, '7.390'), (1, '8.980')] [2023-10-12 17:07:58,436][62354] Saving new best policy, reward=7.390! 
[2023-10-12 17:07:59,463][62635] Updated weights for policy 1, policy_version 32810 (0.0008) [2023-10-12 17:07:59,832][62635] Updated weights for policy 1, policy_version 32820 (0.0010) [2023-10-12 17:08:00,202][62635] Updated weights for policy 1, policy_version 32830 (0.0010) [2023-10-12 17:08:01,008][62634] Updated weights for policy 0, policy_version 32840 (0.0007) [2023-10-12 17:08:01,387][62634] Updated weights for policy 0, policy_version 32850 (0.0007) [2023-10-12 17:08:01,769][62634] Updated weights for policy 0, policy_version 32860 (0.0007) [2023-10-12 17:08:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67272704. Throughput: 0: 1682.2, 1: 1697.7. Samples: 16829414. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:08:03,435][61643] Avg episode reward: [(0, '7.580'), (1, '9.090')] [2023-10-12 17:08:03,443][62354] Saving new best policy, reward=7.580! [2023-10-12 17:08:04,420][62635] Updated weights for policy 1, policy_version 32840 (0.0008) [2023-10-12 17:08:04,795][62635] Updated weights for policy 1, policy_version 32850 (0.0009) [2023-10-12 17:08:05,162][62635] Updated weights for policy 1, policy_version 32860 (0.0009) [2023-10-12 17:08:05,839][62634] Updated weights for policy 0, policy_version 32870 (0.0008) [2023-10-12 17:08:06,214][62634] Updated weights for policy 0, policy_version 32880 (0.0010) [2023-10-12 17:08:06,592][62634] Updated weights for policy 0, policy_version 32890 (0.0008) [2023-10-12 17:08:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67338240. Throughput: 0: 1676.8, 1: 1680.3. Samples: 16839430. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:08:08,436][61643] Avg episode reward: [(0, '7.360'), (1, '8.830')] [2023-10-12 17:08:09,300][62635] Updated weights for policy 1, policy_version 32870 (0.0007) [2023-10-12 17:08:09,667][62635] Updated weights for policy 1, policy_version 32880 (0.0009) [2023-10-12 17:08:10,036][62635] Updated weights for policy 1, policy_version 32890 (0.0007) [2023-10-12 17:08:10,751][62634] Updated weights for policy 0, policy_version 32900 (0.0008) [2023-10-12 17:08:11,138][62634] Updated weights for policy 0, policy_version 32910 (0.0008) [2023-10-12 17:08:11,515][62634] Updated weights for policy 0, policy_version 32920 (0.0007) [2023-10-12 17:08:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67403776. Throughput: 0: 1660.7, 1: 1699.8. Samples: 16859342. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:08:13,435][61643] Avg episode reward: [(0, '7.720'), (1, '8.880')] [2023-10-12 17:08:13,436][62354] Saving new best policy, reward=7.720! [2023-10-12 17:08:14,032][62635] Updated weights for policy 1, policy_version 32900 (0.0008) [2023-10-12 17:08:14,398][62635] Updated weights for policy 1, policy_version 32910 (0.0011) [2023-10-12 17:08:14,761][62635] Updated weights for policy 1, policy_version 32920 (0.0009) [2023-10-12 17:08:15,667][62634] Updated weights for policy 0, policy_version 32930 (0.0008) [2023-10-12 17:08:16,057][62634] Updated weights for policy 0, policy_version 32940 (0.0011) [2023-10-12 17:08:16,443][62634] Updated weights for policy 0, policy_version 32950 (0.0011) [2023-10-12 17:08:16,809][62634] Updated weights for policy 0, policy_version 32960 (0.0011) [2023-10-12 17:08:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67469312. Throughput: 0: 1679.1, 1: 1695.4. Samples: 16879954. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:08:18,436][61643] Avg episode reward: [(0, '7.690'), (1, '9.190')] [2023-10-12 17:08:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000032960_33751040.pth... [2023-10-12 17:08:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000032928_33718272.pth... [2023-10-12 17:08:18,476][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000031392_32145408.pth [2023-10-12 17:08:18,486][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000031360_32112640.pth [2023-10-12 17:08:18,699][62635] Updated weights for policy 1, policy_version 32930 (0.0007) [2023-10-12 17:08:19,068][62635] Updated weights for policy 1, policy_version 32940 (0.0008) [2023-10-12 17:08:19,443][62635] Updated weights for policy 1, policy_version 32950 (0.0009) [2023-10-12 17:08:19,806][62635] Updated weights for policy 1, policy_version 32960 (0.0009) [2023-10-12 17:08:20,717][62634] Updated weights for policy 0, policy_version 32970 (0.0007) [2023-10-12 17:08:21,099][62634] Updated weights for policy 0, policy_version 32980 (0.0008) [2023-10-12 17:08:21,469][62634] Updated weights for policy 0, policy_version 32990 (0.0011) [2023-10-12 17:08:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67534848. Throughput: 0: 1662.3, 1: 1691.5. Samples: 16889516. Policy #0 lag: (min: 14.0, avg: 15.6, max: 42.0) [2023-10-12 17:08:23,436][61643] Avg episode reward: [(0, '7.760'), (1, '9.250')] [2023-10-12 17:08:23,437][62354] Saving new best policy, reward=7.760! 
[2023-10-12 17:08:23,699][62635] Updated weights for policy 1, policy_version 32970 (0.0010) [2023-10-12 17:08:24,070][62635] Updated weights for policy 1, policy_version 32980 (0.0010) [2023-10-12 17:08:24,436][62635] Updated weights for policy 1, policy_version 32990 (0.0010) [2023-10-12 17:08:25,610][62634] Updated weights for policy 0, policy_version 33000 (0.0008) [2023-10-12 17:08:25,988][62634] Updated weights for policy 0, policy_version 33010 (0.0009) [2023-10-12 17:08:26,363][62634] Updated weights for policy 0, policy_version 33020 (0.0010) [2023-10-12 17:08:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67600384. Throughput: 0: 1664.9, 1: 1691.4. Samples: 16909632. Policy #0 lag: (min: 14.0, avg: 15.6, max: 42.0) [2023-10-12 17:08:28,435][61643] Avg episode reward: [(0, '7.850'), (1, '9.130')] [2023-10-12 17:08:28,436][62354] Saving new best policy, reward=7.850! [2023-10-12 17:08:28,480][62635] Updated weights for policy 1, policy_version 33000 (0.0009) [2023-10-12 17:08:28,842][62635] Updated weights for policy 1, policy_version 33010 (0.0007) [2023-10-12 17:08:29,212][62635] Updated weights for policy 1, policy_version 33020 (0.0007) [2023-10-12 17:08:30,532][62634] Updated weights for policy 0, policy_version 33030 (0.0008) [2023-10-12 17:08:30,915][62634] Updated weights for policy 0, policy_version 33040 (0.0008) [2023-10-12 17:08:31,289][62634] Updated weights for policy 0, policy_version 33050 (0.0007) [2023-10-12 17:08:33,339][62635] Updated weights for policy 1, policy_version 33030 (0.0008) [2023-10-12 17:08:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67665920. Throughput: 0: 1678.4, 1: 1693.2. Samples: 16930484. Policy #0 lag: (min: 14.0, avg: 15.6, max: 42.0) [2023-10-12 17:08:33,435][61643] Avg episode reward: [(0, '7.910'), (1, '9.270')] [2023-10-12 17:08:33,443][62354] Saving new best policy, reward=7.910! 
[2023-10-12 17:08:33,707][62635] Updated weights for policy 1, policy_version 33040 (0.0011) [2023-10-12 17:08:34,082][62635] Updated weights for policy 1, policy_version 33050 (0.0011) [2023-10-12 17:08:35,249][62634] Updated weights for policy 0, policy_version 33060 (0.0008) [2023-10-12 17:08:35,630][62634] Updated weights for policy 0, policy_version 33070 (0.0008) [2023-10-12 17:08:35,995][62634] Updated weights for policy 0, policy_version 33080 (0.0008) [2023-10-12 17:08:37,962][62635] Updated weights for policy 1, policy_version 33060 (0.0010) [2023-10-12 17:08:38,336][62635] Updated weights for policy 1, policy_version 33070 (0.0009) [2023-10-12 17:08:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67731456. Throughput: 0: 1657.8, 1: 1694.8. Samples: 16939960. Policy #0 lag: (min: 14.0, avg: 15.6, max: 42.0) [2023-10-12 17:08:38,436][61643] Avg episode reward: [(0, '8.250'), (1, '9.270')] [2023-10-12 17:08:38,436][62354] Saving new best policy, reward=8.250! [2023-10-12 17:08:38,699][62635] Updated weights for policy 1, policy_version 33080 (0.0007) [2023-10-12 17:08:40,168][62634] Updated weights for policy 0, policy_version 33090 (0.0007) [2023-10-12 17:08:40,551][62634] Updated weights for policy 0, policy_version 33100 (0.0009) [2023-10-12 17:08:40,925][62634] Updated weights for policy 0, policy_version 33110 (0.0007) [2023-10-12 17:08:41,296][62634] Updated weights for policy 0, policy_version 33120 (0.0009) [2023-10-12 17:08:42,611][62635] Updated weights for policy 1, policy_version 33090 (0.0008) [2023-10-12 17:08:42,979][62635] Updated weights for policy 1, policy_version 33100 (0.0007) [2023-10-12 17:08:43,344][62635] Updated weights for policy 1, policy_version 33110 (0.0007) [2023-10-12 17:08:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67796992. Throughput: 0: 1665.4, 1: 1701.4. Samples: 16960352. 
Policy #0 lag: (min: 14.0, avg: 15.6, max: 42.0) [2023-10-12 17:08:43,435][61643] Avg episode reward: [(0, '8.080'), (1, '9.300')] [2023-10-12 17:08:43,725][62635] Updated weights for policy 1, policy_version 33120 (0.0009) [2023-10-12 17:08:45,415][62634] Updated weights for policy 0, policy_version 33130 (0.0010) [2023-10-12 17:08:45,781][62634] Updated weights for policy 0, policy_version 33140 (0.0007) [2023-10-12 17:08:46,165][62634] Updated weights for policy 0, policy_version 33150 (0.0007) [2023-10-12 17:08:47,726][62635] Updated weights for policy 1, policy_version 33130 (0.0010) [2023-10-12 17:08:48,096][62635] Updated weights for policy 1, policy_version 33140 (0.0010) [2023-10-12 17:08:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 67862528. Throughput: 0: 1666.8, 1: 1690.7. Samples: 16980504. Policy #0 lag: (min: 9.0, avg: 22.3, max: 41.0) [2023-10-12 17:08:48,435][61643] Avg episode reward: [(0, '7.930'), (1, '9.330')] [2023-10-12 17:08:48,471][62635] Updated weights for policy 1, policy_version 33150 (0.0008) [2023-10-12 17:08:50,265][62634] Updated weights for policy 0, policy_version 33160 (0.0009) [2023-10-12 17:08:50,653][62634] Updated weights for policy 0, policy_version 33170 (0.0008) [2023-10-12 17:08:51,034][62634] Updated weights for policy 0, policy_version 33180 (0.0010) [2023-10-12 17:08:52,707][62635] Updated weights for policy 1, policy_version 33160 (0.0007) [2023-10-12 17:08:53,078][62635] Updated weights for policy 1, policy_version 33170 (0.0008) [2023-10-12 17:08:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67928064. Throughput: 0: 1653.1, 1: 1704.8. Samples: 16990536. 
Policy #0 lag: (min: 9.0, avg: 22.3, max: 41.0) [2023-10-12 17:08:53,435][61643] Avg episode reward: [(0, '7.980'), (1, '9.120')] [2023-10-12 17:08:53,458][62635] Updated weights for policy 1, policy_version 33180 (0.0009) [2023-10-12 17:08:55,128][62634] Updated weights for policy 0, policy_version 33190 (0.0008) [2023-10-12 17:08:55,502][62634] Updated weights for policy 0, policy_version 33200 (0.0007) [2023-10-12 17:08:55,881][62634] Updated weights for policy 0, policy_version 33210 (0.0007) [2023-10-12 17:08:57,468][62635] Updated weights for policy 1, policy_version 33190 (0.0009) [2023-10-12 17:08:57,833][62635] Updated weights for policy 1, policy_version 33200 (0.0009) [2023-10-12 17:08:58,198][62635] Updated weights for policy 1, policy_version 33210 (0.0009) [2023-10-12 17:08:58,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68026368. Throughput: 0: 1665.0, 1: 1698.5. Samples: 17010698. Policy #0 lag: (min: 9.0, avg: 22.3, max: 41.0) [2023-10-12 17:08:58,435][61643] Avg episode reward: [(0, '7.990'), (1, '9.170')] [2023-10-12 17:09:00,033][62634] Updated weights for policy 0, policy_version 33220 (0.0008) [2023-10-12 17:09:00,409][62634] Updated weights for policy 0, policy_version 33230 (0.0008) [2023-10-12 17:09:00,783][62634] Updated weights for policy 0, policy_version 33240 (0.0007) [2023-10-12 17:09:02,203][62635] Updated weights for policy 1, policy_version 33220 (0.0008) [2023-10-12 17:09:02,583][62635] Updated weights for policy 1, policy_version 33230 (0.0008) [2023-10-12 17:09:02,943][62635] Updated weights for policy 1, policy_version 33240 (0.0007) [2023-10-12 17:09:03,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68091904. Throughput: 0: 1668.4, 1: 1679.8. Samples: 17030624. 
Policy #0 lag: (min: 9.0, avg: 22.3, max: 41.0) [2023-10-12 17:09:03,435][61643] Avg episode reward: [(0, '7.840'), (1, '9.250')] [2023-10-12 17:09:04,846][62634] Updated weights for policy 0, policy_version 33250 (0.0007) [2023-10-12 17:09:05,244][62634] Updated weights for policy 0, policy_version 33260 (0.0007) [2023-10-12 17:09:05,628][62634] Updated weights for policy 0, policy_version 33270 (0.0007) [2023-10-12 17:09:06,003][62634] Updated weights for policy 0, policy_version 33280 (0.0007) [2023-10-12 17:09:06,977][62635] Updated weights for policy 1, policy_version 33250 (0.0009) [2023-10-12 17:09:07,347][62635] Updated weights for policy 1, policy_version 33260 (0.0008) [2023-10-12 17:09:07,713][62635] Updated weights for policy 1, policy_version 33270 (0.0008) [2023-10-12 17:09:08,079][62635] Updated weights for policy 1, policy_version 33280 (0.0008) [2023-10-12 17:09:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68157440. Throughput: 0: 1653.7, 1: 1705.1. Samples: 17040662. Policy #0 lag: (min: 9.0, avg: 22.3, max: 41.0) [2023-10-12 17:09:08,435][61643] Avg episode reward: [(0, '8.030'), (1, '9.020')] [2023-10-12 17:09:10,069][62634] Updated weights for policy 0, policy_version 33290 (0.0010) [2023-10-12 17:09:10,441][62634] Updated weights for policy 0, policy_version 33300 (0.0009) [2023-10-12 17:09:10,825][62634] Updated weights for policy 0, policy_version 33310 (0.0008) [2023-10-12 17:09:12,236][62635] Updated weights for policy 1, policy_version 33290 (0.0009) [2023-10-12 17:09:12,602][62635] Updated weights for policy 1, policy_version 33300 (0.0009) [2023-10-12 17:09:12,967][62635] Updated weights for policy 1, policy_version 33310 (0.0009) [2023-10-12 17:09:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68222976. Throughput: 0: 1663.3, 1: 1696.6. Samples: 17060830. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:09:13,436][61643] Avg episode reward: [(0, '7.980'), (1, '9.210')] [2023-10-12 17:09:14,825][62634] Updated weights for policy 0, policy_version 33320 (0.0010) [2023-10-12 17:09:15,200][62634] Updated weights for policy 0, policy_version 33330 (0.0009) [2023-10-12 17:09:15,576][62634] Updated weights for policy 0, policy_version 33340 (0.0007) [2023-10-12 17:09:17,110][62635] Updated weights for policy 1, policy_version 33320 (0.0009) [2023-10-12 17:09:17,482][62635] Updated weights for policy 1, policy_version 33330 (0.0010) [2023-10-12 17:09:17,850][62635] Updated weights for policy 1, policy_version 33340 (0.0011) [2023-10-12 17:09:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68288512. Throughput: 0: 1667.3, 1: 1668.8. Samples: 17080608. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:09:18,435][61643] Avg episode reward: [(0, '8.100'), (1, '9.290')] [2023-10-12 17:09:19,626][62634] Updated weights for policy 0, policy_version 33350 (0.0009) [2023-10-12 17:09:20,007][62634] Updated weights for policy 0, policy_version 33360 (0.0010) [2023-10-12 17:09:20,389][62634] Updated weights for policy 0, policy_version 33370 (0.0008) [2023-10-12 17:09:21,878][62635] Updated weights for policy 1, policy_version 33350 (0.0009) [2023-10-12 17:09:22,235][62635] Updated weights for policy 1, policy_version 33360 (0.0010) [2023-10-12 17:09:22,604][62635] Updated weights for policy 1, policy_version 33370 (0.0009) [2023-10-12 17:09:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68354048. Throughput: 0: 1660.1, 1: 1694.0. Samples: 17090898. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:09:23,435][61643] Avg episode reward: [(0, '8.280'), (1, '9.180')] [2023-10-12 17:09:23,436][62354] Saving new best policy, reward=8.280! 
[2023-10-12 17:09:24,455][62634] Updated weights for policy 0, policy_version 33380 (0.0007) [2023-10-12 17:09:24,825][62634] Updated weights for policy 0, policy_version 33390 (0.0008) [2023-10-12 17:09:25,198][62634] Updated weights for policy 0, policy_version 33400 (0.0009) [2023-10-12 17:09:26,666][62635] Updated weights for policy 1, policy_version 33380 (0.0008) [2023-10-12 17:09:27,039][62635] Updated weights for policy 1, policy_version 33390 (0.0009) [2023-10-12 17:09:27,413][62635] Updated weights for policy 1, policy_version 33400 (0.0008) [2023-10-12 17:09:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68419584. Throughput: 0: 1676.2, 1: 1673.7. Samples: 17111098. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:09:28,435][61643] Avg episode reward: [(0, '8.420'), (1, '9.160')] [2023-10-12 17:09:28,436][62354] Saving new best policy, reward=8.420! [2023-10-12 17:09:29,228][62634] Updated weights for policy 0, policy_version 33410 (0.0008) [2023-10-12 17:09:29,611][62634] Updated weights for policy 0, policy_version 33420 (0.0007) [2023-10-12 17:09:29,995][62634] Updated weights for policy 0, policy_version 33430 (0.0010) [2023-10-12 17:09:30,371][62634] Updated weights for policy 0, policy_version 33440 (0.0010) [2023-10-12 17:09:31,459][62635] Updated weights for policy 1, policy_version 33410 (0.0008) [2023-10-12 17:09:31,827][62635] Updated weights for policy 1, policy_version 33420 (0.0009) [2023-10-12 17:09:32,193][62635] Updated weights for policy 1, policy_version 33430 (0.0008) [2023-10-12 17:09:32,567][62635] Updated weights for policy 1, policy_version 33440 (0.0007) [2023-10-12 17:09:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68485120. Throughput: 0: 1683.3, 1: 1665.2. Samples: 17131186. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:09:33,436][61643] Avg episode reward: [(0, '8.290'), (1, '9.250')] [2023-10-12 17:09:34,452][62634] Updated weights for policy 0, policy_version 33450 (0.0007) [2023-10-12 17:09:34,827][62634] Updated weights for policy 0, policy_version 33460 (0.0008) [2023-10-12 17:09:35,209][62634] Updated weights for policy 0, policy_version 33470 (0.0007) [2023-10-12 17:09:36,839][62635] Updated weights for policy 1, policy_version 33450 (0.0010) [2023-10-12 17:09:37,205][62635] Updated weights for policy 1, policy_version 33460 (0.0010) [2023-10-12 17:09:37,575][62635] Updated weights for policy 1, policy_version 33470 (0.0010) [2023-10-12 17:09:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68550656. Throughput: 0: 1675.0, 1: 1679.3. Samples: 17141480. Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) [2023-10-12 17:09:38,435][61643] Avg episode reward: [(0, '8.620'), (1, '9.090')] [2023-10-12 17:09:38,436][62354] Saving new best policy, reward=8.620! [2023-10-12 17:09:39,449][62634] Updated weights for policy 0, policy_version 33480 (0.0009) [2023-10-12 17:09:39,828][62634] Updated weights for policy 0, policy_version 33490 (0.0009) [2023-10-12 17:09:40,206][62634] Updated weights for policy 0, policy_version 33500 (0.0008) [2023-10-12 17:09:41,512][62635] Updated weights for policy 1, policy_version 33480 (0.0008) [2023-10-12 17:09:41,892][62635] Updated weights for policy 1, policy_version 33490 (0.0010) [2023-10-12 17:09:42,262][62635] Updated weights for policy 1, policy_version 33500 (0.0008) [2023-10-12 17:09:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68616192. Throughput: 0: 1678.6, 1: 1662.4. Samples: 17161046. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) [2023-10-12 17:09:43,436][61643] Avg episode reward: [(0, '8.530'), (1, '9.090')] [2023-10-12 17:09:43,931][62634] Updated weights for policy 0, policy_version 33510 (0.0011) [2023-10-12 17:09:44,303][62634] Updated weights for policy 0, policy_version 33520 (0.0007) [2023-10-12 17:09:44,684][62634] Updated weights for policy 0, policy_version 33530 (0.0007) [2023-10-12 17:09:46,350][62635] Updated weights for policy 1, policy_version 33510 (0.0009) [2023-10-12 17:09:46,727][62635] Updated weights for policy 1, policy_version 33520 (0.0010) [2023-10-12 17:09:47,099][62635] Updated weights for policy 1, policy_version 33530 (0.0009) [2023-10-12 17:09:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68681728. Throughput: 0: 1686.8, 1: 1672.2. Samples: 17181780. Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) [2023-10-12 17:09:48,435][61643] Avg episode reward: [(0, '8.600'), (1, '9.310')] [2023-10-12 17:09:48,806][62634] Updated weights for policy 0, policy_version 33540 (0.0009) [2023-10-12 17:09:49,181][62634] Updated weights for policy 0, policy_version 33550 (0.0009) [2023-10-12 17:09:49,548][62634] Updated weights for policy 0, policy_version 33560 (0.0009) [2023-10-12 17:09:51,104][62635] Updated weights for policy 1, policy_version 33540 (0.0010) [2023-10-12 17:09:51,474][62635] Updated weights for policy 1, policy_version 33550 (0.0010) [2023-10-12 17:09:51,834][62635] Updated weights for policy 1, policy_version 33560 (0.0008) [2023-10-12 17:09:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68747264. Throughput: 0: 1684.7, 1: 1678.6. Samples: 17192010. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) [2023-10-12 17:09:53,436][61643] Avg episode reward: [(0, '8.830'), (1, '9.160')] [2023-10-12 17:09:53,594][62634] Updated weights for policy 0, policy_version 33570 (0.0011) [2023-10-12 17:09:53,974][62634] Updated weights for policy 0, policy_version 33580 (0.0007) [2023-10-12 17:09:54,343][62634] Updated weights for policy 0, policy_version 33590 (0.0007) [2023-10-12 17:09:54,718][62354] Saving new best policy, reward=8.830! [2023-10-12 17:09:54,724][62634] Updated weights for policy 0, policy_version 33600 (0.0008) [2023-10-12 17:09:55,917][62635] Updated weights for policy 1, policy_version 33570 (0.0008) [2023-10-12 17:09:56,287][62635] Updated weights for policy 1, policy_version 33580 (0.0008) [2023-10-12 17:09:56,657][62635] Updated weights for policy 1, policy_version 33590 (0.0008) [2023-10-12 17:09:57,029][62635] Updated weights for policy 1, policy_version 33600 (0.0008) [2023-10-12 17:09:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68812800. Throughput: 0: 1693.6, 1: 1661.1. Samples: 17211790. Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) [2023-10-12 17:09:58,435][61643] Avg episode reward: [(0, '8.510'), (1, '9.160')] [2023-10-12 17:09:58,783][62634] Updated weights for policy 0, policy_version 33610 (0.0008) [2023-10-12 17:09:59,161][62634] Updated weights for policy 0, policy_version 33620 (0.0009) [2023-10-12 17:09:59,543][62634] Updated weights for policy 0, policy_version 33630 (0.0009) [2023-10-12 17:10:01,074][62635] Updated weights for policy 1, policy_version 33610 (0.0008) [2023-10-12 17:10:01,432][62635] Updated weights for policy 1, policy_version 33620 (0.0008) [2023-10-12 17:10:01,808][62635] Updated weights for policy 1, policy_version 33630 (0.0008) [2023-10-12 17:10:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68878336. Throughput: 0: 1699.6, 1: 1684.4. Samples: 17232892. 
Policy #0 lag: (min: 17.0, avg: 20.1, max: 49.0) [2023-10-12 17:10:03,435][61643] Avg episode reward: [(0, '8.520'), (1, '9.360')] [2023-10-12 17:10:03,528][62634] Updated weights for policy 0, policy_version 33640 (0.0008) [2023-10-12 17:10:03,894][62634] Updated weights for policy 0, policy_version 33650 (0.0007) [2023-10-12 17:10:04,268][62634] Updated weights for policy 0, policy_version 33660 (0.0007) [2023-10-12 17:10:05,847][62635] Updated weights for policy 1, policy_version 33640 (0.0009) [2023-10-12 17:10:06,211][62635] Updated weights for policy 1, policy_version 33650 (0.0009) [2023-10-12 17:10:06,582][62635] Updated weights for policy 1, policy_version 33660 (0.0009) [2023-10-12 17:10:08,166][62634] Updated weights for policy 0, policy_version 33670 (0.0008) [2023-10-12 17:10:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68943872. Throughput: 0: 1700.1, 1: 1679.0. Samples: 17242956. Policy #0 lag: (min: 17.0, avg: 20.1, max: 49.0) [2023-10-12 17:10:08,435][61643] Avg episode reward: [(0, '8.310'), (1, '9.390')] [2023-10-12 17:10:08,532][62634] Updated weights for policy 0, policy_version 33680 (0.0009) [2023-10-12 17:10:08,911][62634] Updated weights for policy 0, policy_version 33690 (0.0007) [2023-10-12 17:10:10,729][62635] Updated weights for policy 1, policy_version 33670 (0.0008) [2023-10-12 17:10:11,100][62635] Updated weights for policy 1, policy_version 33680 (0.0008) [2023-10-12 17:10:11,458][62635] Updated weights for policy 1, policy_version 33690 (0.0009) [2023-10-12 17:10:12,863][62634] Updated weights for policy 0, policy_version 33700 (0.0008) [2023-10-12 17:10:13,238][62634] Updated weights for policy 0, policy_version 33710 (0.0009) [2023-10-12 17:10:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69009408. Throughput: 0: 1704.7, 1: 1669.7. Samples: 17262946. 
Policy #0 lag: (min: 17.0, avg: 20.1, max: 49.0) [2023-10-12 17:10:13,435][61643] Avg episode reward: [(0, '8.180'), (1, '9.270')] [2023-10-12 17:10:13,615][62634] Updated weights for policy 0, policy_version 33720 (0.0007) [2023-10-12 17:10:15,569][62635] Updated weights for policy 1, policy_version 33700 (0.0008) [2023-10-12 17:10:15,945][62635] Updated weights for policy 1, policy_version 33710 (0.0007) [2023-10-12 17:10:16,317][62635] Updated weights for policy 1, policy_version 33720 (0.0007) [2023-10-12 17:10:17,658][62634] Updated weights for policy 0, policy_version 33730 (0.0009) [2023-10-12 17:10:18,037][62634] Updated weights for policy 0, policy_version 33740 (0.0008) [2023-10-12 17:10:18,413][62634] Updated weights for policy 0, policy_version 33750 (0.0008) [2023-10-12 17:10:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 69074944. Throughput: 0: 1692.2, 1: 1688.5. Samples: 17283318. Policy #0 lag: (min: 17.0, avg: 20.1, max: 49.0) [2023-10-12 17:10:18,436][61643] Avg episode reward: [(0, '8.230'), (1, '9.430')] [2023-10-12 17:10:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000033728_34537472.pth... [2023-10-12 17:10:18,478][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000032160_32931840.pth [2023-10-12 17:10:18,482][62495] Saving new best policy, reward=9.430! [2023-10-12 17:10:18,791][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000033760_34570240.pth... 
[2023-10-12 17:10:18,793][62634] Updated weights for policy 0, policy_version 33760 (0.0008) [2023-10-12 17:10:18,830][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000032160_32931840.pth [2023-10-12 17:10:20,339][62635] Updated weights for policy 1, policy_version 33730 (0.0008) [2023-10-12 17:10:20,701][62635] Updated weights for policy 1, policy_version 33740 (0.0007) [2023-10-12 17:10:21,066][62635] Updated weights for policy 1, policy_version 33750 (0.0008) [2023-10-12 17:10:21,428][62635] Updated weights for policy 1, policy_version 33760 (0.0007) [2023-10-12 17:10:22,978][62634] Updated weights for policy 0, policy_version 33770 (0.0007) [2023-10-12 17:10:23,347][62634] Updated weights for policy 0, policy_version 33780 (0.0008) [2023-10-12 17:10:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 69140480. Throughput: 0: 1698.1, 1: 1670.6. Samples: 17293072. Policy #0 lag: (min: 17.0, avg: 20.1, max: 49.0) [2023-10-12 17:10:23,436][61643] Avg episode reward: [(0, '8.280'), (1, '9.500')] [2023-10-12 17:10:23,437][62495] Saving new best policy, reward=9.500! [2023-10-12 17:10:23,730][62634] Updated weights for policy 0, policy_version 33790 (0.0009) [2023-10-12 17:10:25,432][62635] Updated weights for policy 1, policy_version 33770 (0.0008) [2023-10-12 17:10:25,797][62635] Updated weights for policy 1, policy_version 33780 (0.0008) [2023-10-12 17:10:26,169][62635] Updated weights for policy 1, policy_version 33790 (0.0008) [2023-10-12 17:10:27,763][62634] Updated weights for policy 0, policy_version 33800 (0.0010) [2023-10-12 17:10:28,144][62634] Updated weights for policy 0, policy_version 33810 (0.0009) [2023-10-12 17:10:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69206016. Throughput: 0: 1703.8, 1: 1681.5. Samples: 17313386. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:10:28,436][61643] Avg episode reward: [(0, '8.530'), (1, '9.060')] [2023-10-12 17:10:28,515][62634] Updated weights for policy 0, policy_version 33820 (0.0009) [2023-10-12 17:10:30,180][62635] Updated weights for policy 1, policy_version 33800 (0.0008) [2023-10-12 17:10:30,553][62635] Updated weights for policy 1, policy_version 33810 (0.0008) [2023-10-12 17:10:30,927][62635] Updated weights for policy 1, policy_version 33820 (0.0009) [2023-10-12 17:10:32,548][62634] Updated weights for policy 0, policy_version 33830 (0.0008) [2023-10-12 17:10:32,923][62634] Updated weights for policy 0, policy_version 33840 (0.0009) [2023-10-12 17:10:33,291][62634] Updated weights for policy 0, policy_version 33850 (0.0008) [2023-10-12 17:10:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69271552. Throughput: 0: 1678.2, 1: 1690.3. Samples: 17333360. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:10:33,435][61643] Avg episode reward: [(0, '8.440'), (1, '9.190')] [2023-10-12 17:10:35,088][62635] Updated weights for policy 1, policy_version 33830 (0.0009) [2023-10-12 17:10:35,450][62635] Updated weights for policy 1, policy_version 33840 (0.0009) [2023-10-12 17:10:35,820][62635] Updated weights for policy 1, policy_version 33850 (0.0011) [2023-10-12 17:10:37,299][62634] Updated weights for policy 0, policy_version 33860 (0.0008) [2023-10-12 17:10:37,678][62634] Updated weights for policy 0, policy_version 33870 (0.0010) [2023-10-12 17:10:38,059][62634] Updated weights for policy 0, policy_version 33880 (0.0011) [2023-10-12 17:10:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 69369856. Throughput: 0: 1695.5, 1: 1663.3. Samples: 17343156. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:10:38,436][61643] Avg episode reward: [(0, '8.460'), (1, '9.050')] [2023-10-12 17:10:39,988][62635] Updated weights for policy 1, policy_version 33860 (0.0009) [2023-10-12 17:10:40,350][62635] Updated weights for policy 1, policy_version 33870 (0.0007) [2023-10-12 17:10:40,719][62635] Updated weights for policy 1, policy_version 33880 (0.0008) [2023-10-12 17:10:42,190][62634] Updated weights for policy 0, policy_version 33890 (0.0008) [2023-10-12 17:10:42,577][62634] Updated weights for policy 0, policy_version 33900 (0.0007) [2023-10-12 17:10:42,955][62634] Updated weights for policy 0, policy_version 33910 (0.0008) [2023-10-12 17:10:43,330][62634] Updated weights for policy 0, policy_version 33920 (0.0010) [2023-10-12 17:10:43,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 69435392. Throughput: 0: 1695.0, 1: 1683.8. Samples: 17363834. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:10:43,435][61643] Avg episode reward: [(0, '8.550'), (1, '8.840')] [2023-10-12 17:10:44,655][62635] Updated weights for policy 1, policy_version 33890 (0.0008) [2023-10-12 17:10:45,020][62635] Updated weights for policy 1, policy_version 33900 (0.0007) [2023-10-12 17:10:45,393][62635] Updated weights for policy 1, policy_version 33910 (0.0008) [2023-10-12 17:10:45,766][62635] Updated weights for policy 1, policy_version 33920 (0.0008) [2023-10-12 17:10:47,309][62634] Updated weights for policy 0, policy_version 33930 (0.0009) [2023-10-12 17:10:47,687][62634] Updated weights for policy 0, policy_version 33940 (0.0008) [2023-10-12 17:10:48,052][62634] Updated weights for policy 0, policy_version 33950 (0.0008) [2023-10-12 17:10:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 69500928. Throughput: 0: 1661.7, 1: 1690.4. Samples: 17383738. 
Policy #0 lag: (min: 0.0, avg: 28.2, max: 32.0) [2023-10-12 17:10:48,436][61643] Avg episode reward: [(0, '9.010'), (1, '8.960')] [2023-10-12 17:10:48,449][62354] Saving new best policy, reward=9.010! [2023-10-12 17:10:49,825][62635] Updated weights for policy 1, policy_version 33930 (0.0008) [2023-10-12 17:10:50,197][62635] Updated weights for policy 1, policy_version 33940 (0.0010) [2023-10-12 17:10:50,569][62635] Updated weights for policy 1, policy_version 33950 (0.0011) [2023-10-12 17:10:52,114][62634] Updated weights for policy 0, policy_version 33960 (0.0009) [2023-10-12 17:10:52,498][62634] Updated weights for policy 0, policy_version 33970 (0.0010) [2023-10-12 17:10:52,876][62634] Updated weights for policy 0, policy_version 33980 (0.0009) [2023-10-12 17:10:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 69566464. Throughput: 0: 1682.7, 1: 1668.5. Samples: 17393758. Policy #0 lag: (min: 0.0, avg: 28.2, max: 32.0) [2023-10-12 17:10:53,436][61643] Avg episode reward: [(0, '8.630'), (1, '8.800')] [2023-10-12 17:10:54,732][62635] Updated weights for policy 1, policy_version 33960 (0.0008) [2023-10-12 17:10:55,095][62635] Updated weights for policy 1, policy_version 33970 (0.0009) [2023-10-12 17:10:55,467][62635] Updated weights for policy 1, policy_version 33980 (0.0007) [2023-10-12 17:10:56,894][62634] Updated weights for policy 0, policy_version 33990 (0.0008) [2023-10-12 17:10:57,285][62634] Updated weights for policy 0, policy_version 34000 (0.0010) [2023-10-12 17:10:57,663][62634] Updated weights for policy 0, policy_version 34010 (0.0010) [2023-10-12 17:10:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 69632000. Throughput: 0: 1673.8, 1: 1689.9. Samples: 17414310. 
Policy #0 lag: (min: 0.0, avg: 28.2, max: 32.0) [2023-10-12 17:10:58,436][61643] Avg episode reward: [(0, '8.680'), (1, '8.750')] [2023-10-12 17:10:59,361][62635] Updated weights for policy 1, policy_version 33990 (0.0011) [2023-10-12 17:10:59,724][62635] Updated weights for policy 1, policy_version 34000 (0.0010) [2023-10-12 17:11:00,099][62635] Updated weights for policy 1, policy_version 34010 (0.0009) [2023-10-12 17:11:01,642][62634] Updated weights for policy 0, policy_version 34020 (0.0007) [2023-10-12 17:11:02,018][62634] Updated weights for policy 0, policy_version 34030 (0.0010) [2023-10-12 17:11:02,403][62634] Updated weights for policy 0, policy_version 34040 (0.0009) [2023-10-12 17:11:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 69697536. Throughput: 0: 1662.4, 1: 1689.6. Samples: 17434156. Policy #0 lag: (min: 0.0, avg: 28.2, max: 32.0) [2023-10-12 17:11:03,435][61643] Avg episode reward: [(0, '8.760'), (1, '8.730')] [2023-10-12 17:11:04,253][62635] Updated weights for policy 1, policy_version 34020 (0.0009) [2023-10-12 17:11:04,617][62635] Updated weights for policy 1, policy_version 34030 (0.0012) [2023-10-12 17:11:04,988][62635] Updated weights for policy 1, policy_version 34040 (0.0009) [2023-10-12 17:11:06,439][62634] Updated weights for policy 0, policy_version 34050 (0.0009) [2023-10-12 17:11:06,822][62634] Updated weights for policy 0, policy_version 34060 (0.0007) [2023-10-12 17:11:07,194][62634] Updated weights for policy 0, policy_version 34070 (0.0009) [2023-10-12 17:11:07,564][62634] Updated weights for policy 0, policy_version 34080 (0.0007) [2023-10-12 17:11:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 69763072. Throughput: 0: 1687.3, 1: 1680.5. Samples: 17444624. 
Policy #0 lag: (min: 0.0, avg: 28.2, max: 32.0) [2023-10-12 17:11:08,435][61643] Avg episode reward: [(0, '8.760'), (1, '9.060')] [2023-10-12 17:11:08,898][62635] Updated weights for policy 1, policy_version 34050 (0.0008) [2023-10-12 17:11:09,264][62635] Updated weights for policy 1, policy_version 34060 (0.0011) [2023-10-12 17:11:09,625][62635] Updated weights for policy 1, policy_version 34070 (0.0011) [2023-10-12 17:11:09,993][62635] Updated weights for policy 1, policy_version 34080 (0.0011) [2023-10-12 17:11:11,582][62634] Updated weights for policy 0, policy_version 34090 (0.0007) [2023-10-12 17:11:11,957][62634] Updated weights for policy 0, policy_version 34100 (0.0008) [2023-10-12 17:11:12,328][62634] Updated weights for policy 0, policy_version 34110 (0.0007) [2023-10-12 17:11:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 69828608. Throughput: 0: 1669.3, 1: 1688.4. Samples: 17464486. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 17:11:13,436][61643] Avg episode reward: [(0, '8.770'), (1, '8.720')] [2023-10-12 17:11:14,110][62635] Updated weights for policy 1, policy_version 34090 (0.0009) [2023-10-12 17:11:14,478][62635] Updated weights for policy 1, policy_version 34100 (0.0009) [2023-10-12 17:11:14,856][62635] Updated weights for policy 1, policy_version 34110 (0.0010) [2023-10-12 17:11:16,393][62634] Updated weights for policy 0, policy_version 34120 (0.0007) [2023-10-12 17:11:16,768][62634] Updated weights for policy 0, policy_version 34130 (0.0007) [2023-10-12 17:11:17,140][62634] Updated weights for policy 0, policy_version 34140 (0.0008) [2023-10-12 17:11:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 69894144. Throughput: 0: 1677.4, 1: 1687.8. Samples: 17484794. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 17:11:18,435][61643] Avg episode reward: [(0, '8.630'), (1, '8.570')] [2023-10-12 17:11:19,174][62635] Updated weights for policy 1, policy_version 34120 (0.0009) [2023-10-12 17:11:19,550][62635] Updated weights for policy 1, policy_version 34130 (0.0010) [2023-10-12 17:11:19,917][62635] Updated weights for policy 1, policy_version 34140 (0.0008) [2023-10-12 17:11:21,176][62634] Updated weights for policy 0, policy_version 34150 (0.0008) [2023-10-12 17:11:21,560][62634] Updated weights for policy 0, policy_version 34160 (0.0007) [2023-10-12 17:11:21,936][62634] Updated weights for policy 0, policy_version 34170 (0.0008) [2023-10-12 17:11:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 69959680. Throughput: 0: 1695.4, 1: 1682.3. Samples: 17495152. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 17:11:23,436][61643] Avg episode reward: [(0, '8.750'), (1, '8.910')] [2023-10-12 17:11:23,832][62635] Updated weights for policy 1, policy_version 34150 (0.0009) [2023-10-12 17:11:24,194][62635] Updated weights for policy 1, policy_version 34160 (0.0008) [2023-10-12 17:11:24,574][62635] Updated weights for policy 1, policy_version 34170 (0.0008) [2023-10-12 17:11:25,926][62634] Updated weights for policy 0, policy_version 34180 (0.0009) [2023-10-12 17:11:26,297][62634] Updated weights for policy 0, policy_version 34190 (0.0009) [2023-10-12 17:11:26,673][62634] Updated weights for policy 0, policy_version 34200 (0.0008) [2023-10-12 17:11:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70025216. Throughput: 0: 1670.0, 1: 1685.3. Samples: 17514826. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 17:11:28,436][61643] Avg episode reward: [(0, '8.780'), (1, '8.980')] [2023-10-12 17:11:28,793][62635] Updated weights for policy 1, policy_version 34180 (0.0009) [2023-10-12 17:11:29,162][62635] Updated weights for policy 1, policy_version 34190 (0.0010) [2023-10-12 17:11:29,528][62635] Updated weights for policy 1, policy_version 34200 (0.0008) [2023-10-12 17:11:30,637][62634] Updated weights for policy 0, policy_version 34210 (0.0009) [2023-10-12 17:11:31,045][62634] Updated weights for policy 0, policy_version 34220 (0.0010) [2023-10-12 17:11:31,410][62634] Updated weights for policy 0, policy_version 34230 (0.0010) [2023-10-12 17:11:31,786][62634] Updated weights for policy 0, policy_version 34240 (0.0007) [2023-10-12 17:11:33,301][62635] Updated weights for policy 1, policy_version 34210 (0.0008) [2023-10-12 17:11:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70090752. Throughput: 0: 1693.5, 1: 1682.6. Samples: 17535662. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-12 17:11:33,435][61643] Avg episode reward: [(0, '8.900'), (1, '9.000')] [2023-10-12 17:11:33,669][62635] Updated weights for policy 1, policy_version 34220 (0.0009) [2023-10-12 17:11:34,046][62635] Updated weights for policy 1, policy_version 34230 (0.0010) [2023-10-12 17:11:34,409][62635] Updated weights for policy 1, policy_version 34240 (0.0008) [2023-10-12 17:11:35,739][62634] Updated weights for policy 0, policy_version 34250 (0.0010) [2023-10-12 17:11:36,118][62634] Updated weights for policy 0, policy_version 34260 (0.0009) [2023-10-12 17:11:36,497][62634] Updated weights for policy 0, policy_version 34270 (0.0008) [2023-10-12 17:11:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 70156288. Throughput: 0: 1689.2, 1: 1685.2. Samples: 17545606. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:11:38,436][61643] Avg episode reward: [(0, '8.780'), (1, '9.070')] [2023-10-12 17:11:38,538][62635] Updated weights for policy 1, policy_version 34250 (0.0008) [2023-10-12 17:11:38,901][62635] Updated weights for policy 1, policy_version 34260 (0.0008) [2023-10-12 17:11:39,265][62635] Updated weights for policy 1, policy_version 34270 (0.0009) [2023-10-12 17:11:40,560][62634] Updated weights for policy 0, policy_version 34280 (0.0007) [2023-10-12 17:11:40,937][62634] Updated weights for policy 0, policy_version 34290 (0.0007) [2023-10-12 17:11:41,314][62634] Updated weights for policy 0, policy_version 34300 (0.0007) [2023-10-12 17:11:43,412][62635] Updated weights for policy 1, policy_version 34280 (0.0008) [2023-10-12 17:11:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 70221824. Throughput: 0: 1676.1, 1: 1685.7. Samples: 17565588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:11:43,435][61643] Avg episode reward: [(0, '8.890'), (1, '9.100')] [2023-10-12 17:11:43,782][62635] Updated weights for policy 1, policy_version 34290 (0.0008) [2023-10-12 17:11:44,157][62635] Updated weights for policy 1, policy_version 34300 (0.0007) [2023-10-12 17:11:45,319][62634] Updated weights for policy 0, policy_version 34310 (0.0008) [2023-10-12 17:11:45,698][62634] Updated weights for policy 0, policy_version 34320 (0.0010) [2023-10-12 17:11:46,071][62634] Updated weights for policy 0, policy_version 34330 (0.0009) [2023-10-12 17:11:48,216][62635] Updated weights for policy 1, policy_version 34310 (0.0009) [2023-10-12 17:11:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 70287360. Throughput: 0: 1700.8, 1: 1684.0. Samples: 17586468. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:11:48,435][61643] Avg episode reward: [(0, '8.830'), (1, '9.270')] [2023-10-12 17:11:48,588][62635] Updated weights for policy 1, policy_version 34320 (0.0009) [2023-10-12 17:11:48,958][62635] Updated weights for policy 1, policy_version 34330 (0.0007) [2023-10-12 17:11:50,000][62634] Updated weights for policy 0, policy_version 34340 (0.0007) [2023-10-12 17:11:50,380][62634] Updated weights for policy 0, policy_version 34350 (0.0009) [2023-10-12 17:11:50,754][62634] Updated weights for policy 0, policy_version 34360 (0.0010) [2023-10-12 17:11:52,996][62635] Updated weights for policy 1, policy_version 34340 (0.0007) [2023-10-12 17:11:53,374][62635] Updated weights for policy 1, policy_version 34350 (0.0009) [2023-10-12 17:11:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 70352896. Throughput: 0: 1675.5, 1: 1684.1. Samples: 17595806. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:11:53,435][61643] Avg episode reward: [(0, '8.660'), (1, '9.190')] [2023-10-12 17:11:53,736][62635] Updated weights for policy 1, policy_version 34360 (0.0008) [2023-10-12 17:11:54,590][62634] Updated weights for policy 0, policy_version 34370 (0.0008) [2023-10-12 17:11:54,974][62634] Updated weights for policy 0, policy_version 34380 (0.0009) [2023-10-12 17:11:55,348][62634] Updated weights for policy 0, policy_version 34390 (0.0007) [2023-10-12 17:11:55,726][62634] Updated weights for policy 0, policy_version 34400 (0.0009) [2023-10-12 17:11:57,671][62635] Updated weights for policy 1, policy_version 34370 (0.0007) [2023-10-12 17:11:58,030][62635] Updated weights for policy 1, policy_version 34380 (0.0008) [2023-10-12 17:11:58,392][62635] Updated weights for policy 1, policy_version 34390 (0.0007) [2023-10-12 17:11:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 70418432. Throughput: 0: 1691.2, 1: 1690.3. 
Samples: 17616654. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:11:58,435][61643] Avg episode reward: [(0, '8.830'), (1, '9.130')] [2023-10-12 17:11:58,755][62635] Updated weights for policy 1, policy_version 34400 (0.0008) [2023-10-12 17:11:59,793][62634] Updated weights for policy 0, policy_version 34410 (0.0007) [2023-10-12 17:12:00,173][62634] Updated weights for policy 0, policy_version 34420 (0.0008) [2023-10-12 17:12:00,552][62634] Updated weights for policy 0, policy_version 34430 (0.0008) [2023-10-12 17:12:02,704][62635] Updated weights for policy 1, policy_version 34410 (0.0010) [2023-10-12 17:12:03,080][62635] Updated weights for policy 1, policy_version 34420 (0.0007) [2023-10-12 17:12:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 70483968. Throughput: 0: 1701.4, 1: 1678.1. Samples: 17636872. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 17:12:03,436][61643] Avg episode reward: [(0, '8.590'), (1, '9.430')] [2023-10-12 17:12:03,452][62635] Updated weights for policy 1, policy_version 34430 (0.0009) [2023-10-12 17:12:04,740][62634] Updated weights for policy 0, policy_version 34440 (0.0007) [2023-10-12 17:12:05,115][62634] Updated weights for policy 0, policy_version 34450 (0.0009) [2023-10-12 17:12:05,500][62634] Updated weights for policy 0, policy_version 34460 (0.0007) [2023-10-12 17:12:07,789][62635] Updated weights for policy 1, policy_version 34440 (0.0008) [2023-10-12 17:12:08,165][62635] Updated weights for policy 1, policy_version 34450 (0.0009) [2023-10-12 17:12:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 70549504. Throughput: 0: 1667.7, 1: 1699.5. Samples: 17646674. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 17:12:08,436][61643] Avg episode reward: [(0, '8.570'), (1, '9.400')] [2023-10-12 17:12:08,531][62635] Updated weights for policy 1, policy_version 34460 (0.0009) [2023-10-12 17:12:09,595][62634] Updated weights for policy 0, policy_version 34470 (0.0007) [2023-10-12 17:12:09,972][62634] Updated weights for policy 0, policy_version 34480 (0.0008) [2023-10-12 17:12:10,356][62634] Updated weights for policy 0, policy_version 34490 (0.0009) [2023-10-12 17:12:12,554][62635] Updated weights for policy 1, policy_version 34470 (0.0009) [2023-10-12 17:12:12,921][62635] Updated weights for policy 1, policy_version 34480 (0.0009) [2023-10-12 17:12:13,291][62635] Updated weights for policy 1, policy_version 34490 (0.0009) [2023-10-12 17:12:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 70615040. Throughput: 0: 1691.4, 1: 1693.5. Samples: 17667146. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 17:12:13,435][61643] Avg episode reward: [(0, '8.710'), (1, '9.300')] [2023-10-12 17:12:14,352][62634] Updated weights for policy 0, policy_version 34500 (0.0008) [2023-10-12 17:12:14,734][62634] Updated weights for policy 0, policy_version 34510 (0.0010) [2023-10-12 17:12:15,105][62634] Updated weights for policy 0, policy_version 34520 (0.0008) [2023-10-12 17:12:17,305][62635] Updated weights for policy 1, policy_version 34500 (0.0008) [2023-10-12 17:12:17,670][62635] Updated weights for policy 1, policy_version 34510 (0.0009) [2023-10-12 17:12:18,053][62635] Updated weights for policy 1, policy_version 34520 (0.0009) [2023-10-12 17:12:18,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70713344. Throughput: 0: 1691.5, 1: 1672.3. Samples: 17687034. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 17:12:18,436][61643] Avg episode reward: [(0, '8.530'), (1, '9.480')] [2023-10-12 17:12:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000034528_35356672.pth... [2023-10-12 17:12:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000034528_35356672.pth... [2023-10-12 17:12:18,481][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000032960_33751040.pth [2023-10-12 17:12:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000032928_33718272.pth [2023-10-12 17:12:19,332][62634] Updated weights for policy 0, policy_version 34530 (0.0008) [2023-10-12 17:12:19,746][62634] Updated weights for policy 0, policy_version 34540 (0.0007) [2023-10-12 17:12:20,117][62634] Updated weights for policy 0, policy_version 34550 (0.0010) [2023-10-12 17:12:20,503][62634] Updated weights for policy 0, policy_version 34560 (0.0008) [2023-10-12 17:12:22,135][62635] Updated weights for policy 1, policy_version 34530 (0.0008) [2023-10-12 17:12:22,500][62635] Updated weights for policy 1, policy_version 34540 (0.0008) [2023-10-12 17:12:22,863][62635] Updated weights for policy 1, policy_version 34550 (0.0007) [2023-10-12 17:12:23,237][62635] Updated weights for policy 1, policy_version 34560 (0.0007) [2023-10-12 17:12:23,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70778880. Throughput: 0: 1669.8, 1: 1687.2. Samples: 17696674. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-12 17:12:23,436][61643] Avg episode reward: [(0, '8.670'), (1, '9.400')] [2023-10-12 17:12:24,532][62634] Updated weights for policy 0, policy_version 34570 (0.0009) [2023-10-12 17:12:24,915][62634] Updated weights for policy 0, policy_version 34580 (0.0009) [2023-10-12 17:12:25,290][62634] Updated weights for policy 0, policy_version 34590 (0.0008) [2023-10-12 17:12:27,338][62635] Updated weights for policy 1, policy_version 34570 (0.0007) [2023-10-12 17:12:27,703][62635] Updated weights for policy 1, policy_version 34580 (0.0008) [2023-10-12 17:12:28,073][62635] Updated weights for policy 1, policy_version 34590 (0.0008) [2023-10-12 17:12:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70844416. Throughput: 0: 1685.5, 1: 1683.4. Samples: 17717186. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:12:28,436][61643] Avg episode reward: [(0, '8.700'), (1, '9.380')] [2023-10-12 17:12:29,298][62634] Updated weights for policy 0, policy_version 34600 (0.0009) [2023-10-12 17:12:29,682][62634] Updated weights for policy 0, policy_version 34610 (0.0007) [2023-10-12 17:12:30,065][62634] Updated weights for policy 0, policy_version 34620 (0.0007) [2023-10-12 17:12:32,171][62635] Updated weights for policy 1, policy_version 34600 (0.0008) [2023-10-12 17:12:32,530][62635] Updated weights for policy 1, policy_version 34610 (0.0007) [2023-10-12 17:12:32,909][62635] Updated weights for policy 1, policy_version 34620 (0.0008) [2023-10-12 17:12:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70909952. Throughput: 0: 1684.9, 1: 1659.2. Samples: 17736952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:12:33,435][61643] Avg episode reward: [(0, '8.770'), (1, '9.510')] [2023-10-12 17:12:33,441][62495] Saving new best policy, reward=9.510! 
[2023-10-12 17:12:34,100][62634] Updated weights for policy 0, policy_version 34630 (0.0009) [2023-10-12 17:12:34,478][62634] Updated weights for policy 0, policy_version 34640 (0.0009) [2023-10-12 17:12:34,851][62634] Updated weights for policy 0, policy_version 34650 (0.0010) [2023-10-12 17:12:36,939][62635] Updated weights for policy 1, policy_version 34630 (0.0008) [2023-10-12 17:12:37,305][62635] Updated weights for policy 1, policy_version 34640 (0.0008) [2023-10-12 17:12:37,680][62635] Updated weights for policy 1, policy_version 34650 (0.0009) [2023-10-12 17:12:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70975488. Throughput: 0: 1679.6, 1: 1689.2. Samples: 17747402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:12:38,436][61643] Avg episode reward: [(0, '8.950'), (1, '9.400')] [2023-10-12 17:12:38,792][62634] Updated weights for policy 0, policy_version 34660 (0.0010) [2023-10-12 17:12:39,165][62634] Updated weights for policy 0, policy_version 34670 (0.0009) [2023-10-12 17:12:39,551][62634] Updated weights for policy 0, policy_version 34680 (0.0009) [2023-10-12 17:12:41,679][62635] Updated weights for policy 1, policy_version 34660 (0.0009) [2023-10-12 17:12:42,040][62635] Updated weights for policy 1, policy_version 34670 (0.0008) [2023-10-12 17:12:42,408][62635] Updated weights for policy 1, policy_version 34680 (0.0010) [2023-10-12 17:12:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71041024. Throughput: 0: 1685.6, 1: 1673.2. Samples: 17767796. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:12:43,435][61643] Avg episode reward: [(0, '8.800'), (1, '9.460')] [2023-10-12 17:12:43,535][62634] Updated weights for policy 0, policy_version 34690 (0.0010) [2023-10-12 17:12:43,908][62634] Updated weights for policy 0, policy_version 34700 (0.0010) [2023-10-12 17:12:44,284][62634] Updated weights for policy 0, policy_version 34710 (0.0010) [2023-10-12 17:12:44,658][62634] Updated weights for policy 0, policy_version 34720 (0.0008) [2023-10-12 17:12:46,485][62635] Updated weights for policy 1, policy_version 34690 (0.0009) [2023-10-12 17:12:46,854][62635] Updated weights for policy 1, policy_version 34700 (0.0008) [2023-10-12 17:12:47,218][62635] Updated weights for policy 1, policy_version 34710 (0.0009) [2023-10-12 17:12:47,590][62635] Updated weights for policy 1, policy_version 34720 (0.0010) [2023-10-12 17:12:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71106560. Throughput: 0: 1680.9, 1: 1667.1. Samples: 17787532. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:12:48,436][61643] Avg episode reward: [(0, '8.640'), (1, '9.360')] [2023-10-12 17:12:48,755][62634] Updated weights for policy 0, policy_version 34730 (0.0008) [2023-10-12 17:12:49,137][62634] Updated weights for policy 0, policy_version 34740 (0.0011) [2023-10-12 17:12:49,520][62634] Updated weights for policy 0, policy_version 34750 (0.0010) [2023-10-12 17:12:51,628][62635] Updated weights for policy 1, policy_version 34730 (0.0008) [2023-10-12 17:12:52,000][62635] Updated weights for policy 1, policy_version 34740 (0.0008) [2023-10-12 17:12:52,373][62635] Updated weights for policy 1, policy_version 34750 (0.0008) [2023-10-12 17:12:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71172096. Throughput: 0: 1679.9, 1: 1680.5. Samples: 17797892. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 17:12:53,436][61643] Avg episode reward: [(0, '8.750'), (1, '9.480')] [2023-10-12 17:12:53,603][62634] Updated weights for policy 0, policy_version 34760 (0.0011) [2023-10-12 17:12:53,982][62634] Updated weights for policy 0, policy_version 34770 (0.0009) [2023-10-12 17:12:54,362][62634] Updated weights for policy 0, policy_version 34780 (0.0009) [2023-10-12 17:12:56,555][62635] Updated weights for policy 1, policy_version 34760 (0.0007) [2023-10-12 17:12:56,938][62635] Updated weights for policy 1, policy_version 34770 (0.0008) [2023-10-12 17:12:57,301][62635] Updated weights for policy 1, policy_version 34780 (0.0010) [2023-10-12 17:12:58,323][62634] Updated weights for policy 0, policy_version 34790 (0.0011) [2023-10-12 17:12:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71237632. Throughput: 0: 1682.7, 1: 1662.8. Samples: 17817698. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 17:12:58,436][61643] Avg episode reward: [(0, '8.720'), (1, '9.560')] [2023-10-12 17:12:58,437][62495] Saving new best policy, reward=9.560! [2023-10-12 17:12:58,695][62634] Updated weights for policy 0, policy_version 34800 (0.0011) [2023-10-12 17:12:59,068][62634] Updated weights for policy 0, policy_version 34810 (0.0008) [2023-10-12 17:13:01,358][62635] Updated weights for policy 1, policy_version 34790 (0.0008) [2023-10-12 17:13:01,722][62635] Updated weights for policy 1, policy_version 34800 (0.0008) [2023-10-12 17:13:02,101][62635] Updated weights for policy 1, policy_version 34810 (0.0008) [2023-10-12 17:13:03,138][62634] Updated weights for policy 0, policy_version 34820 (0.0007) [2023-10-12 17:13:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71303168. Throughput: 0: 1686.2, 1: 1670.0. Samples: 17838064. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 17:13:03,435][61643] Avg episode reward: [(0, '8.610'), (1, '9.540')] [2023-10-12 17:13:03,520][62634] Updated weights for policy 0, policy_version 34830 (0.0008) [2023-10-12 17:13:03,890][62634] Updated weights for policy 0, policy_version 34840 (0.0007) [2023-10-12 17:13:06,032][62635] Updated weights for policy 1, policy_version 34820 (0.0009) [2023-10-12 17:13:06,408][62635] Updated weights for policy 1, policy_version 34830 (0.0008) [2023-10-12 17:13:06,779][62635] Updated weights for policy 1, policy_version 34840 (0.0008) [2023-10-12 17:13:08,067][62634] Updated weights for policy 0, policy_version 34850 (0.0009) [2023-10-12 17:13:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71368704. Throughput: 0: 1688.7, 1: 1681.6. Samples: 17848338. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 17:13:08,435][61643] Avg episode reward: [(0, '8.600'), (1, '9.380')] [2023-10-12 17:13:08,465][62634] Updated weights for policy 0, policy_version 34860 (0.0008) [2023-10-12 17:13:08,842][62634] Updated weights for policy 0, policy_version 34870 (0.0009) [2023-10-12 17:13:09,221][62634] Updated weights for policy 0, policy_version 34880 (0.0009) [2023-10-12 17:13:10,897][62635] Updated weights for policy 1, policy_version 34850 (0.0009) [2023-10-12 17:13:11,261][62635] Updated weights for policy 1, policy_version 34860 (0.0009) [2023-10-12 17:13:11,637][62635] Updated weights for policy 1, policy_version 34870 (0.0009) [2023-10-12 17:13:11,995][62635] Updated weights for policy 1, policy_version 34880 (0.0009) [2023-10-12 17:13:13,369][62634] Updated weights for policy 0, policy_version 34890 (0.0008) [2023-10-12 17:13:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71434240. Throughput: 0: 1689.1, 1: 1662.0. Samples: 17867982. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 17:13:13,436][61643] Avg episode reward: [(0, '8.600'), (1, '9.500')] [2023-10-12 17:13:13,739][62634] Updated weights for policy 0, policy_version 34900 (0.0007) [2023-10-12 17:13:14,120][62634] Updated weights for policy 0, policy_version 34910 (0.0008) [2023-10-12 17:13:16,161][62635] Updated weights for policy 1, policy_version 34890 (0.0009) [2023-10-12 17:13:16,528][62635] Updated weights for policy 1, policy_version 34900 (0.0009) [2023-10-12 17:13:16,896][62635] Updated weights for policy 1, policy_version 34910 (0.0009) [2023-10-12 17:13:17,917][62634] Updated weights for policy 0, policy_version 34920 (0.0009) [2023-10-12 17:13:18,291][62634] Updated weights for policy 0, policy_version 34930 (0.0008) [2023-10-12 17:13:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71499776. Throughput: 0: 1683.1, 1: 1685.8. Samples: 17888552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:13:18,436][61643] Avg episode reward: [(0, '8.590'), (1, '9.640')] [2023-10-12 17:13:18,448][62495] Saving new best policy, reward=9.640! [2023-10-12 17:13:18,674][62634] Updated weights for policy 0, policy_version 34940 (0.0011) [2023-10-12 17:13:20,771][62635] Updated weights for policy 1, policy_version 34920 (0.0010) [2023-10-12 17:13:21,131][62635] Updated weights for policy 1, policy_version 34930 (0.0007) [2023-10-12 17:13:21,508][62635] Updated weights for policy 1, policy_version 34940 (0.0009) [2023-10-12 17:13:22,673][62634] Updated weights for policy 0, policy_version 34950 (0.0008) [2023-10-12 17:13:23,048][62634] Updated weights for policy 0, policy_version 34960 (0.0009) [2023-10-12 17:13:23,432][62634] Updated weights for policy 0, policy_version 34970 (0.0008) [2023-10-12 17:13:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71565312. Throughput: 0: 1691.8, 1: 1675.6. Samples: 17898936. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:13:23,436][61643] Avg episode reward: [(0, '8.580'), (1, '9.750')] [2023-10-12 17:13:23,437][62495] Saving new best policy, reward=9.750! [2023-10-12 17:13:25,732][62635] Updated weights for policy 1, policy_version 34950 (0.0009) [2023-10-12 17:13:26,097][62635] Updated weights for policy 1, policy_version 34960 (0.0007) [2023-10-12 17:13:26,465][62635] Updated weights for policy 1, policy_version 34970 (0.0009) [2023-10-12 17:13:27,536][62634] Updated weights for policy 0, policy_version 34980 (0.0009) [2023-10-12 17:13:27,916][62634] Updated weights for policy 0, policy_version 34990 (0.0009) [2023-10-12 17:13:28,301][62634] Updated weights for policy 0, policy_version 35000 (0.0008) [2023-10-12 17:13:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71630848. Throughput: 0: 1689.6, 1: 1667.2. Samples: 17918856. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:13:28,435][61643] Avg episode reward: [(0, '8.790'), (1, '9.660')] [2023-10-12 17:13:30,575][62635] Updated weights for policy 1, policy_version 34980 (0.0010) [2023-10-12 17:13:30,954][62635] Updated weights for policy 1, policy_version 34990 (0.0010) [2023-10-12 17:13:31,316][62635] Updated weights for policy 1, policy_version 35000 (0.0008) [2023-10-12 17:13:32,332][62634] Updated weights for policy 0, policy_version 35010 (0.0007) [2023-10-12 17:13:32,705][62634] Updated weights for policy 0, policy_version 35020 (0.0008) [2023-10-12 17:13:33,088][62634] Updated weights for policy 0, policy_version 35030 (0.0009) [2023-10-12 17:13:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71696384. Throughput: 0: 1678.5, 1: 1689.7. Samples: 17939104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:13:33,435][61643] Avg episode reward: [(0, '8.760'), (1, '9.760')] [2023-10-12 17:13:33,443][62495] Saving new best policy, reward=9.760! [2023-10-12 17:13:33,458][62634] Updated weights for policy 0, policy_version 35040 (0.0009) [2023-10-12 17:13:35,306][62635] Updated weights for policy 1, policy_version 35010 (0.0008) [2023-10-12 17:13:35,669][62635] Updated weights for policy 1, policy_version 35020 (0.0007) [2023-10-12 17:13:36,038][62635] Updated weights for policy 1, policy_version 35030 (0.0007) [2023-10-12 17:13:36,397][62635] Updated weights for policy 1, policy_version 35040 (0.0009) [2023-10-12 17:13:37,343][62634] Updated weights for policy 0, policy_version 35050 (0.0011) [2023-10-12 17:13:37,708][62634] Updated weights for policy 0, policy_version 35060 (0.0009) [2023-10-12 17:13:38,088][62634] Updated weights for policy 0, policy_version 35070 (0.0008) [2023-10-12 17:13:38,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 71794688. Throughput: 0: 1700.9, 1: 1668.3. Samples: 17949508. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 17:13:38,436][61643] Avg episode reward: [(0, '8.700'), (1, '9.780')] [2023-10-12 17:13:38,437][62495] Saving new best policy, reward=9.780! [2023-10-12 17:13:40,581][62635] Updated weights for policy 1, policy_version 35050 (0.0007) [2023-10-12 17:13:40,948][62635] Updated weights for policy 1, policy_version 35060 (0.0009) [2023-10-12 17:13:41,305][62635] Updated weights for policy 1, policy_version 35070 (0.0007) [2023-10-12 17:13:42,207][62634] Updated weights for policy 0, policy_version 35080 (0.0007) [2023-10-12 17:13:42,576][62634] Updated weights for policy 0, policy_version 35090 (0.0008) [2023-10-12 17:13:42,956][62634] Updated weights for policy 0, policy_version 35100 (0.0008) [2023-10-12 17:13:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). 
Total num frames: 71860224. Throughput: 0: 1696.5, 1: 1678.7. Samples: 17969580. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 17:13:43,435][61643] Avg episode reward: [(0, '8.810'), (1, '9.680')] [2023-10-12 17:13:45,435][62635] Updated weights for policy 1, policy_version 35080 (0.0007) [2023-10-12 17:13:45,822][62635] Updated weights for policy 1, policy_version 35090 (0.0008) [2023-10-12 17:13:46,188][62635] Updated weights for policy 1, policy_version 35100 (0.0008) [2023-10-12 17:13:47,067][62634] Updated weights for policy 0, policy_version 35110 (0.0008) [2023-10-12 17:13:47,444][62634] Updated weights for policy 0, policy_version 35120 (0.0008) [2023-10-12 17:13:47,819][62634] Updated weights for policy 0, policy_version 35130 (0.0010) [2023-10-12 17:13:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 71925760. Throughput: 0: 1667.6, 1: 1689.6. Samples: 17989138. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 17:13:48,436][61643] Avg episode reward: [(0, '8.840'), (1, '9.550')] [2023-10-12 17:13:50,313][62635] Updated weights for policy 1, policy_version 35110 (0.0009) [2023-10-12 17:13:50,678][62635] Updated weights for policy 1, policy_version 35120 (0.0008) [2023-10-12 17:13:51,047][62635] Updated weights for policy 1, policy_version 35130 (0.0007) [2023-10-12 17:13:51,753][62634] Updated weights for policy 0, policy_version 35140 (0.0011) [2023-10-12 17:13:52,129][62634] Updated weights for policy 0, policy_version 35150 (0.0010) [2023-10-12 17:13:52,508][62634] Updated weights for policy 0, policy_version 35160 (0.0008) [2023-10-12 17:13:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71991296. Throughput: 0: 1695.2, 1: 1668.7. Samples: 17999712. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 17:13:53,435][61643] Avg episode reward: [(0, '8.730'), (1, '9.620')] [2023-10-12 17:13:55,025][62635] Updated weights for policy 1, policy_version 35140 (0.0008) [2023-10-12 17:13:55,389][62635] Updated weights for policy 1, policy_version 35150 (0.0009) [2023-10-12 17:13:55,758][62635] Updated weights for policy 1, policy_version 35160 (0.0009) [2023-10-12 17:13:56,509][62634] Updated weights for policy 0, policy_version 35170 (0.0009) [2023-10-12 17:13:56,907][62634] Updated weights for policy 0, policy_version 35180 (0.0010) [2023-10-12 17:13:57,284][62634] Updated weights for policy 0, policy_version 35190 (0.0010) [2023-10-12 17:13:57,655][62634] Updated weights for policy 0, policy_version 35200 (0.0007) [2023-10-12 17:13:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 72056832. Throughput: 0: 1683.2, 1: 1684.7. Samples: 18019536. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-12 17:13:58,435][61643] Avg episode reward: [(0, '8.800'), (1, '9.610')] [2023-10-12 17:13:59,805][62635] Updated weights for policy 1, policy_version 35170 (0.0009) [2023-10-12 17:14:00,180][62635] Updated weights for policy 1, policy_version 35180 (0.0007) [2023-10-12 17:14:00,543][62635] Updated weights for policy 1, policy_version 35190 (0.0007) [2023-10-12 17:14:00,912][62635] Updated weights for policy 1, policy_version 35200 (0.0009) [2023-10-12 17:14:01,717][62634] Updated weights for policy 0, policy_version 35210 (0.0010) [2023-10-12 17:14:02,092][62634] Updated weights for policy 0, policy_version 35220 (0.0008) [2023-10-12 17:14:02,468][62634] Updated weights for policy 0, policy_version 35230 (0.0007) [2023-10-12 17:14:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72122368. Throughput: 0: 1667.4, 1: 1687.7. Samples: 18039532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:14:03,435][61643] Avg episode reward: [(0, '8.830'), (1, '9.470')] [2023-10-12 17:14:04,797][62635] Updated weights for policy 1, policy_version 35210 (0.0008) [2023-10-12 17:14:05,167][62635] Updated weights for policy 1, policy_version 35220 (0.0008) [2023-10-12 17:14:05,538][62635] Updated weights for policy 1, policy_version 35230 (0.0008) [2023-10-12 17:14:06,456][62634] Updated weights for policy 0, policy_version 35240 (0.0007) [2023-10-12 17:14:06,833][62634] Updated weights for policy 0, policy_version 35250 (0.0007) [2023-10-12 17:14:07,216][62634] Updated weights for policy 0, policy_version 35260 (0.0007) [2023-10-12 17:14:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72187904. Throughput: 0: 1690.2, 1: 1666.8. Samples: 18050002. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:14:08,435][61643] Avg episode reward: [(0, '8.680'), (1, '9.640')] [2023-10-12 17:14:09,606][62635] Updated weights for policy 1, policy_version 35240 (0.0011) [2023-10-12 17:14:09,971][62635] Updated weights for policy 1, policy_version 35250 (0.0010) [2023-10-12 17:14:10,340][62635] Updated weights for policy 1, policy_version 35260 (0.0009) [2023-10-12 17:14:11,462][62634] Updated weights for policy 0, policy_version 35270 (0.0008) [2023-10-12 17:14:11,840][62634] Updated weights for policy 0, policy_version 35280 (0.0007) [2023-10-12 17:14:12,207][62634] Updated weights for policy 0, policy_version 35290 (0.0007) [2023-10-12 17:14:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 72253440. Throughput: 0: 1672.5, 1: 1686.0. Samples: 18069988. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:14:13,435][61643] Avg episode reward: [(0, '8.710'), (1, '9.700')] [2023-10-12 17:14:14,400][62635] Updated weights for policy 1, policy_version 35270 (0.0007) [2023-10-12 17:14:14,765][62635] Updated weights for policy 1, policy_version 35280 (0.0008) [2023-10-12 17:14:15,133][62635] Updated weights for policy 1, policy_version 35290 (0.0008) [2023-10-12 17:14:16,144][62634] Updated weights for policy 0, policy_version 35300 (0.0010) [2023-10-12 17:14:16,526][62634] Updated weights for policy 0, policy_version 35310 (0.0010) [2023-10-12 17:14:16,906][62634] Updated weights for policy 0, policy_version 35320 (0.0010) [2023-10-12 17:14:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 72318976. Throughput: 0: 1675.5, 1: 1688.5. Samples: 18090482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:14:18,436][61643] Avg episode reward: [(0, '8.910'), (1, '9.610')] [2023-10-12 17:14:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000035296_36143104.pth... [2023-10-12 17:14:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000035328_36175872.pth... 
[2023-10-12 17:14:18,478][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000033728_34537472.pth [2023-10-12 17:14:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000033760_34570240.pth [2023-10-12 17:14:19,042][62635] Updated weights for policy 1, policy_version 35300 (0.0008) [2023-10-12 17:14:19,412][62635] Updated weights for policy 1, policy_version 35310 (0.0008) [2023-10-12 17:14:19,774][62635] Updated weights for policy 1, policy_version 35320 (0.0008) [2023-10-12 17:14:20,939][62634] Updated weights for policy 0, policy_version 35330 (0.0008) [2023-10-12 17:14:21,312][62634] Updated weights for policy 0, policy_version 35340 (0.0007) [2023-10-12 17:14:21,692][62634] Updated weights for policy 0, policy_version 35350 (0.0008) [2023-10-12 17:14:22,072][62634] Updated weights for policy 0, policy_version 35360 (0.0010) [2023-10-12 17:14:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72384512. Throughput: 0: 1681.7, 1: 1679.1. Samples: 18100746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:14:23,436][61643] Avg episode reward: [(0, '8.630'), (1, '9.720')] [2023-10-12 17:14:23,835][62635] Updated weights for policy 1, policy_version 35330 (0.0009) [2023-10-12 17:14:24,203][62635] Updated weights for policy 1, policy_version 35340 (0.0008) [2023-10-12 17:14:24,581][62635] Updated weights for policy 1, policy_version 35350 (0.0007) [2023-10-12 17:14:24,955][62635] Updated weights for policy 1, policy_version 35360 (0.0008) [2023-10-12 17:14:26,072][62634] Updated weights for policy 0, policy_version 35370 (0.0009) [2023-10-12 17:14:26,452][62634] Updated weights for policy 0, policy_version 35380 (0.0007) [2023-10-12 17:14:26,832][62634] Updated weights for policy 0, policy_version 35390 (0.0008) [2023-10-12 17:14:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72450048. 
Throughput: 0: 1660.1, 1: 1693.6. Samples: 18120498. Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-12 17:14:28,435][61643] Avg episode reward: [(0, '8.800'), (1, '9.750')] [2023-10-12 17:14:28,847][62635] Updated weights for policy 1, policy_version 35370 (0.0011) [2023-10-12 17:14:29,224][62635] Updated weights for policy 1, policy_version 35380 (0.0010) [2023-10-12 17:14:29,592][62635] Updated weights for policy 1, policy_version 35390 (0.0008) [2023-10-12 17:14:30,921][62634] Updated weights for policy 0, policy_version 35400 (0.0008) [2023-10-12 17:14:31,301][62634] Updated weights for policy 0, policy_version 35410 (0.0007) [2023-10-12 17:14:31,681][62634] Updated weights for policy 0, policy_version 35420 (0.0008) [2023-10-12 17:14:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72515584. Throughput: 0: 1683.3, 1: 1696.9. Samples: 18141250. Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-12 17:14:33,435][61643] Avg episode reward: [(0, '8.880'), (1, '9.910')] [2023-10-12 17:14:33,702][62635] Updated weights for policy 1, policy_version 35400 (0.0009) [2023-10-12 17:14:34,074][62635] Updated weights for policy 1, policy_version 35410 (0.0009) [2023-10-12 17:14:34,458][62635] Updated weights for policy 1, policy_version 35420 (0.0008) [2023-10-12 17:14:34,596][62495] Saving new best policy, reward=9.910! [2023-10-12 17:14:35,652][62634] Updated weights for policy 0, policy_version 35430 (0.0007) [2023-10-12 17:14:36,029][62634] Updated weights for policy 0, policy_version 35440 (0.0007) [2023-10-12 17:14:36,402][62634] Updated weights for policy 0, policy_version 35450 (0.0008) [2023-10-12 17:14:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 72581120. Throughput: 0: 1677.5, 1: 1686.6. Samples: 18151096. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-12 17:14:38,435][61643] Avg episode reward: [(0, '8.690'), (1, '9.780')] [2023-10-12 17:14:38,669][62635] Updated weights for policy 1, policy_version 35430 (0.0009) [2023-10-12 17:14:39,036][62635] Updated weights for policy 1, policy_version 35440 (0.0010) [2023-10-12 17:14:39,408][62635] Updated weights for policy 1, policy_version 35450 (0.0008) [2023-10-12 17:14:40,620][62634] Updated weights for policy 0, policy_version 35460 (0.0007) [2023-10-12 17:14:41,003][62634] Updated weights for policy 0, policy_version 35470 (0.0007) [2023-10-12 17:14:41,386][62634] Updated weights for policy 0, policy_version 35480 (0.0008) [2023-10-12 17:14:43,384][62635] Updated weights for policy 1, policy_version 35460 (0.0007) [2023-10-12 17:14:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 72646656. Throughput: 0: 1672.9, 1: 1698.7. Samples: 18171258. Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-12 17:14:43,435][61643] Avg episode reward: [(0, '8.890'), (1, '9.770')] [2023-10-12 17:14:43,752][62635] Updated weights for policy 1, policy_version 35470 (0.0008) [2023-10-12 17:14:44,123][62635] Updated weights for policy 1, policy_version 35480 (0.0009) [2023-10-12 17:14:45,275][62634] Updated weights for policy 0, policy_version 35490 (0.0010) [2023-10-12 17:14:45,667][62634] Updated weights for policy 0, policy_version 35500 (0.0009) [2023-10-12 17:14:46,038][62634] Updated weights for policy 0, policy_version 35510 (0.0007) [2023-10-12 17:14:46,421][62634] Updated weights for policy 0, policy_version 35520 (0.0009) [2023-10-12 17:14:48,191][62635] Updated weights for policy 1, policy_version 35490 (0.0008) [2023-10-12 17:14:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 72712192. Throughput: 0: 1693.2, 1: 1698.5. Samples: 18192162. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-12 17:14:48,435][61643] Avg episode reward: [(0, '8.970'), (1, '9.800')] [2023-10-12 17:14:48,553][62635] Updated weights for policy 1, policy_version 35500 (0.0009) [2023-10-12 17:14:48,934][62635] Updated weights for policy 1, policy_version 35510 (0.0008) [2023-10-12 17:14:49,302][62635] Updated weights for policy 1, policy_version 35520 (0.0009) [2023-10-12 17:14:50,474][62634] Updated weights for policy 0, policy_version 35530 (0.0009) [2023-10-12 17:14:50,848][62634] Updated weights for policy 0, policy_version 35540 (0.0009) [2023-10-12 17:14:51,220][62634] Updated weights for policy 0, policy_version 35550 (0.0008) [2023-10-12 17:14:53,274][62635] Updated weights for policy 1, policy_version 35530 (0.0010) [2023-10-12 17:14:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 72777728. Throughput: 0: 1667.9, 1: 1700.8. Samples: 18201594. Policy #0 lag: (min: 6.0, avg: 12.7, max: 38.0) [2023-10-12 17:14:53,435][61643] Avg episode reward: [(0, '8.740'), (1, '9.800')] [2023-10-12 17:14:53,640][62635] Updated weights for policy 1, policy_version 35540 (0.0008) [2023-10-12 17:14:54,000][62635] Updated weights for policy 1, policy_version 35550 (0.0009) [2023-10-12 17:14:55,317][62634] Updated weights for policy 0, policy_version 35560 (0.0009) [2023-10-12 17:14:55,699][62634] Updated weights for policy 0, policy_version 35570 (0.0010) [2023-10-12 17:14:56,083][62634] Updated weights for policy 0, policy_version 35580 (0.0007) [2023-10-12 17:14:58,004][62635] Updated weights for policy 1, policy_version 35560 (0.0008) [2023-10-12 17:14:58,375][62635] Updated weights for policy 1, policy_version 35570 (0.0008) [2023-10-12 17:14:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 72843264. Throughput: 0: 1677.9, 1: 1700.8. Samples: 18222030. 
Policy #0 lag: (min: 6.0, avg: 12.7, max: 38.0) [2023-10-12 17:14:58,435][61643] Avg episode reward: [(0, '8.970'), (1, '9.870')] [2023-10-12 17:14:58,731][62635] Updated weights for policy 1, policy_version 35580 (0.0008) [2023-10-12 17:15:00,073][62634] Updated weights for policy 0, policy_version 35590 (0.0008) [2023-10-12 17:15:00,446][62634] Updated weights for policy 0, policy_version 35600 (0.0010) [2023-10-12 17:15:00,823][62634] Updated weights for policy 0, policy_version 35610 (0.0009) [2023-10-12 17:15:02,710][62635] Updated weights for policy 1, policy_version 35590 (0.0007) [2023-10-12 17:15:03,079][62635] Updated weights for policy 1, policy_version 35600 (0.0009) [2023-10-12 17:15:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 72908800. Throughput: 0: 1687.5, 1: 1688.7. Samples: 18242412. Policy #0 lag: (min: 6.0, avg: 12.7, max: 38.0) [2023-10-12 17:15:03,435][61643] Avg episode reward: [(0, '8.950'), (1, '9.600')] [2023-10-12 17:15:03,459][62635] Updated weights for policy 1, policy_version 35610 (0.0009) [2023-10-12 17:15:04,891][62634] Updated weights for policy 0, policy_version 35620 (0.0007) [2023-10-12 17:15:05,267][62634] Updated weights for policy 0, policy_version 35630 (0.0008) [2023-10-12 17:15:05,650][62634] Updated weights for policy 0, policy_version 35640 (0.0008) [2023-10-12 17:15:07,349][62635] Updated weights for policy 1, policy_version 35620 (0.0008) [2023-10-12 17:15:07,720][62635] Updated weights for policy 1, policy_version 35630 (0.0007) [2023-10-12 17:15:08,074][62635] Updated weights for policy 1, policy_version 35640 (0.0009) [2023-10-12 17:15:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 73007104. Throughput: 0: 1662.2, 1: 1703.8. Samples: 18252218. 
Policy #0 lag: (min: 6.0, avg: 12.7, max: 38.0) [2023-10-12 17:15:08,435][61643] Avg episode reward: [(0, '8.790'), (1, '9.510')] [2023-10-12 17:15:09,881][62634] Updated weights for policy 0, policy_version 35650 (0.0010) [2023-10-12 17:15:10,257][62634] Updated weights for policy 0, policy_version 35660 (0.0010) [2023-10-12 17:15:10,634][62634] Updated weights for policy 0, policy_version 35670 (0.0010) [2023-10-12 17:15:11,014][62634] Updated weights for policy 0, policy_version 35680 (0.0010) [2023-10-12 17:15:12,094][62635] Updated weights for policy 1, policy_version 35650 (0.0008) [2023-10-12 17:15:12,457][62635] Updated weights for policy 1, policy_version 35660 (0.0010) [2023-10-12 17:15:12,822][62635] Updated weights for policy 1, policy_version 35670 (0.0009) [2023-10-12 17:15:13,193][62635] Updated weights for policy 1, policy_version 35680 (0.0009) [2023-10-12 17:15:13,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 73072640. Throughput: 0: 1679.7, 1: 1698.9. Samples: 18272538. Policy #0 lag: (min: 6.0, avg: 12.7, max: 38.0) [2023-10-12 17:15:13,435][61643] Avg episode reward: [(0, '8.970'), (1, '9.570')] [2023-10-12 17:15:14,941][62634] Updated weights for policy 0, policy_version 35690 (0.0008) [2023-10-12 17:15:15,323][62634] Updated weights for policy 0, policy_version 35700 (0.0008) [2023-10-12 17:15:15,704][62634] Updated weights for policy 0, policy_version 35710 (0.0010) [2023-10-12 17:15:17,273][62635] Updated weights for policy 1, policy_version 35690 (0.0009) [2023-10-12 17:15:17,637][62635] Updated weights for policy 1, policy_version 35700 (0.0008) [2023-10-12 17:15:18,000][62635] Updated weights for policy 1, policy_version 35710 (0.0009) [2023-10-12 17:15:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 73138176. Throughput: 0: 1684.0, 1: 1670.2. Samples: 18292190. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:15:18,435][61643] Avg episode reward: [(0, '8.890'), (1, '9.770')] [2023-10-12 17:15:19,755][62634] Updated weights for policy 0, policy_version 35720 (0.0010) [2023-10-12 17:15:20,127][62634] Updated weights for policy 0, policy_version 35730 (0.0009) [2023-10-12 17:15:20,512][62634] Updated weights for policy 0, policy_version 35740 (0.0010) [2023-10-12 17:15:22,369][62635] Updated weights for policy 1, policy_version 35720 (0.0009) [2023-10-12 17:15:22,742][62635] Updated weights for policy 1, policy_version 35730 (0.0007) [2023-10-12 17:15:23,113][62635] Updated weights for policy 1, policy_version 35740 (0.0008) [2023-10-12 17:15:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 73203712. Throughput: 0: 1662.4, 1: 1697.8. Samples: 18302306. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:15:23,437][61643] Avg episode reward: [(0, '8.880'), (1, '9.700')] [2023-10-12 17:15:24,667][62634] Updated weights for policy 0, policy_version 35750 (0.0008) [2023-10-12 17:15:25,040][62634] Updated weights for policy 0, policy_version 35760 (0.0008) [2023-10-12 17:15:25,420][62634] Updated weights for policy 0, policy_version 35770 (0.0009) [2023-10-12 17:15:27,044][62635] Updated weights for policy 1, policy_version 35750 (0.0008) [2023-10-12 17:15:27,405][62635] Updated weights for policy 1, policy_version 35760 (0.0007) [2023-10-12 17:15:27,787][62635] Updated weights for policy 1, policy_version 35770 (0.0008) [2023-10-12 17:15:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 73269248. Throughput: 0: 1681.0, 1: 1685.9. Samples: 18322770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:15:28,436][61643] Avg episode reward: [(0, '8.830'), (1, '9.600')] [2023-10-12 17:15:29,412][62634] Updated weights for policy 0, policy_version 35780 (0.0008) [2023-10-12 17:15:29,783][62634] Updated weights for policy 0, policy_version 35790 (0.0008) [2023-10-12 17:15:30,152][62634] Updated weights for policy 0, policy_version 35800 (0.0007) [2023-10-12 17:15:31,957][62635] Updated weights for policy 1, policy_version 35780 (0.0008) [2023-10-12 17:15:32,322][62635] Updated weights for policy 1, policy_version 35790 (0.0010) [2023-10-12 17:15:32,697][62635] Updated weights for policy 1, policy_version 35800 (0.0008) [2023-10-12 17:15:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73334784. Throughput: 0: 1677.2, 1: 1662.8. Samples: 18342466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:15:33,436][61643] Avg episode reward: [(0, '9.040'), (1, '9.900')] [2023-10-12 17:15:33,444][62354] Saving new best policy, reward=9.040! [2023-10-12 17:15:34,389][62634] Updated weights for policy 0, policy_version 35810 (0.0008) [2023-10-12 17:15:34,796][62634] Updated weights for policy 0, policy_version 35820 (0.0009) [2023-10-12 17:15:35,183][62634] Updated weights for policy 0, policy_version 35830 (0.0008) [2023-10-12 17:15:35,551][62634] Updated weights for policy 0, policy_version 35840 (0.0007) [2023-10-12 17:15:36,711][62635] Updated weights for policy 1, policy_version 35810 (0.0008) [2023-10-12 17:15:37,077][62635] Updated weights for policy 1, policy_version 35820 (0.0008) [2023-10-12 17:15:37,434][62635] Updated weights for policy 1, policy_version 35830 (0.0009) [2023-10-12 17:15:37,806][62635] Updated weights for policy 1, policy_version 35840 (0.0008) [2023-10-12 17:15:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73400320. Throughput: 0: 1666.8, 1: 1689.4. Samples: 18352620. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:15:38,435][61643] Avg episode reward: [(0, '8.860'), (1, '10.010')] [2023-10-12 17:15:38,436][62495] Saving new best policy, reward=10.010! [2023-10-12 17:15:39,456][62634] Updated weights for policy 0, policy_version 35850 (0.0009) [2023-10-12 17:15:39,836][62634] Updated weights for policy 0, policy_version 35860 (0.0009) [2023-10-12 17:15:40,209][62634] Updated weights for policy 0, policy_version 35870 (0.0007) [2023-10-12 17:15:41,956][62635] Updated weights for policy 1, policy_version 35850 (0.0008) [2023-10-12 17:15:42,318][62635] Updated weights for policy 1, policy_version 35860 (0.0007) [2023-10-12 17:15:42,685][62635] Updated weights for policy 1, policy_version 35870 (0.0008) [2023-10-12 17:15:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73465856. Throughput: 0: 1677.7, 1: 1678.1. Samples: 18373040. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-12 17:15:43,435][61643] Avg episode reward: [(0, '8.740'), (1, '9.830')] [2023-10-12 17:15:44,175][62634] Updated weights for policy 0, policy_version 35880 (0.0010) [2023-10-12 17:15:44,561][62634] Updated weights for policy 0, policy_version 35890 (0.0007) [2023-10-12 17:15:44,937][62634] Updated weights for policy 0, policy_version 35900 (0.0007) [2023-10-12 17:15:46,645][62635] Updated weights for policy 1, policy_version 35880 (0.0009) [2023-10-12 17:15:47,004][62635] Updated weights for policy 1, policy_version 35890 (0.0009) [2023-10-12 17:15:47,377][62635] Updated weights for policy 1, policy_version 35900 (0.0009) [2023-10-12 17:15:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73531392. Throughput: 0: 1681.8, 1: 1670.2. Samples: 18393250. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-12 17:15:48,435][61643] Avg episode reward: [(0, '9.110'), (1, '9.910')] [2023-10-12 17:15:48,445][62354] Saving new best policy, reward=9.110! [2023-10-12 17:15:49,069][62634] Updated weights for policy 0, policy_version 35910 (0.0008) [2023-10-12 17:15:49,429][62634] Updated weights for policy 0, policy_version 35920 (0.0008) [2023-10-12 17:15:49,800][62634] Updated weights for policy 0, policy_version 35930 (0.0008) [2023-10-12 17:15:51,492][62635] Updated weights for policy 1, policy_version 35910 (0.0009) [2023-10-12 17:15:51,844][62635] Updated weights for policy 1, policy_version 35920 (0.0009) [2023-10-12 17:15:52,211][62635] Updated weights for policy 1, policy_version 35930 (0.0009) [2023-10-12 17:15:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73596928. Throughput: 0: 1675.0, 1: 1684.2. Samples: 18403382. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-12 17:15:53,436][61643] Avg episode reward: [(0, '9.120'), (1, '9.990')] [2023-10-12 17:15:53,437][62354] Saving new best policy, reward=9.120! [2023-10-12 17:15:53,721][62634] Updated weights for policy 0, policy_version 35940 (0.0009) [2023-10-12 17:15:54,100][62634] Updated weights for policy 0, policy_version 35950 (0.0010) [2023-10-12 17:15:54,470][62634] Updated weights for policy 0, policy_version 35960 (0.0009) [2023-10-12 17:15:56,245][62635] Updated weights for policy 1, policy_version 35940 (0.0010) [2023-10-12 17:15:56,619][62635] Updated weights for policy 1, policy_version 35950 (0.0009) [2023-10-12 17:15:56,982][62635] Updated weights for policy 1, policy_version 35960 (0.0008) [2023-10-12 17:15:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73662464. Throughput: 0: 1686.4, 1: 1664.4. Samples: 18423326. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-12 17:15:58,435][61643] Avg episode reward: [(0, '9.080'), (1, '9.980')] [2023-10-12 17:15:58,510][62634] Updated weights for policy 0, policy_version 35970 (0.0009) [2023-10-12 17:15:58,878][62634] Updated weights for policy 0, policy_version 35980 (0.0009) [2023-10-12 17:15:59,253][62634] Updated weights for policy 0, policy_version 35990 (0.0010) [2023-10-12 17:15:59,631][62634] Updated weights for policy 0, policy_version 36000 (0.0010) [2023-10-12 17:16:01,144][62635] Updated weights for policy 1, policy_version 35970 (0.0010) [2023-10-12 17:16:01,510][62635] Updated weights for policy 1, policy_version 35980 (0.0011) [2023-10-12 17:16:01,876][62635] Updated weights for policy 1, policy_version 35990 (0.0008) [2023-10-12 17:16:02,240][62635] Updated weights for policy 1, policy_version 36000 (0.0008) [2023-10-12 17:16:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73728000. Throughput: 0: 1686.0, 1: 1680.3. Samples: 18443672. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-12 17:16:03,435][61643] Avg episode reward: [(0, '9.060'), (1, '9.960')] [2023-10-12 17:16:03,697][62634] Updated weights for policy 0, policy_version 36010 (0.0008) [2023-10-12 17:16:04,081][62634] Updated weights for policy 0, policy_version 36020 (0.0008) [2023-10-12 17:16:04,467][62634] Updated weights for policy 0, policy_version 36030 (0.0009) [2023-10-12 17:16:06,161][62635] Updated weights for policy 1, policy_version 36010 (0.0009) [2023-10-12 17:16:06,525][62635] Updated weights for policy 1, policy_version 36020 (0.0008) [2023-10-12 17:16:06,898][62635] Updated weights for policy 1, policy_version 36030 (0.0009) [2023-10-12 17:16:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73793536. Throughput: 0: 1687.4, 1: 1684.5. Samples: 18454038. 
Policy #0 lag: (min: 24.0, avg: 44.8, max: 56.0) [2023-10-12 17:16:08,435][61643] Avg episode reward: [(0, '9.360'), (1, '9.870')] [2023-10-12 17:16:08,482][62634] Updated weights for policy 0, policy_version 36040 (0.0008) [2023-10-12 17:16:08,879][62634] Updated weights for policy 0, policy_version 36050 (0.0010) [2023-10-12 17:16:09,248][62634] Updated weights for policy 0, policy_version 36060 (0.0011) [2023-10-12 17:16:09,397][62354] Saving new best policy, reward=9.360! [2023-10-12 17:16:11,156][62635] Updated weights for policy 1, policy_version 36040 (0.0009) [2023-10-12 17:16:11,526][62635] Updated weights for policy 1, policy_version 36050 (0.0007) [2023-10-12 17:16:11,899][62635] Updated weights for policy 1, policy_version 36060 (0.0008) [2023-10-12 17:16:13,328][62634] Updated weights for policy 0, policy_version 36070 (0.0008) [2023-10-12 17:16:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73859072. Throughput: 0: 1690.1, 1: 1667.7. Samples: 18473872. Policy #0 lag: (min: 24.0, avg: 44.8, max: 56.0) [2023-10-12 17:16:13,435][61643] Avg episode reward: [(0, '9.390'), (1, '9.790')] [2023-10-12 17:16:13,700][62634] Updated weights for policy 0, policy_version 36080 (0.0009) [2023-10-12 17:16:14,090][62634] Updated weights for policy 0, policy_version 36090 (0.0009) [2023-10-12 17:16:14,308][62354] Saving new best policy, reward=9.390! [2023-10-12 17:16:15,906][62635] Updated weights for policy 1, policy_version 36070 (0.0009) [2023-10-12 17:16:16,271][62635] Updated weights for policy 1, policy_version 36080 (0.0009) [2023-10-12 17:16:16,641][62635] Updated weights for policy 1, policy_version 36090 (0.0007) [2023-10-12 17:16:18,109][62634] Updated weights for policy 0, policy_version 36100 (0.0007) [2023-10-12 17:16:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73924608. Throughput: 0: 1689.0, 1: 1692.5. Samples: 18494636. 
Policy #0 lag: (min: 24.0, avg: 44.8, max: 56.0) [2023-10-12 17:16:18,435][61643] Avg episode reward: [(0, '9.320'), (1, '9.850')] [2023-10-12 17:16:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000036096_36962304.pth... [2023-10-12 17:16:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000034528_35356672.pth [2023-10-12 17:16:18,495][62634] Updated weights for policy 0, policy_version 36110 (0.0009) [2023-10-12 17:16:18,865][62634] Updated weights for policy 0, policy_version 36120 (0.0008) [2023-10-12 17:16:19,169][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000036128_36995072.pth... [2023-10-12 17:16:19,207][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000034528_35356672.pth [2023-10-12 17:16:20,774][62635] Updated weights for policy 1, policy_version 36100 (0.0009) [2023-10-12 17:16:21,144][62635] Updated weights for policy 1, policy_version 36110 (0.0008) [2023-10-12 17:16:21,509][62635] Updated weights for policy 1, policy_version 36120 (0.0008) [2023-10-12 17:16:23,095][62634] Updated weights for policy 0, policy_version 36130 (0.0010) [2023-10-12 17:16:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 73990144. Throughput: 0: 1690.0, 1: 1686.4. Samples: 18504556. Policy #0 lag: (min: 24.0, avg: 44.8, max: 56.0) [2023-10-12 17:16:23,435][61643] Avg episode reward: [(0, '9.410'), (1, '9.910')] [2023-10-12 17:16:23,497][62634] Updated weights for policy 0, policy_version 36140 (0.0010) [2023-10-12 17:16:23,875][62634] Updated weights for policy 0, policy_version 36150 (0.0010) [2023-10-12 17:16:24,259][62354] Saving new best policy, reward=9.410! 
[2023-10-12 17:16:24,260][62634] Updated weights for policy 0, policy_version 36160 (0.0008) [2023-10-12 17:16:25,560][62635] Updated weights for policy 1, policy_version 36130 (0.0008) [2023-10-12 17:16:25,934][62635] Updated weights for policy 1, policy_version 36140 (0.0007) [2023-10-12 17:16:26,301][62635] Updated weights for policy 1, policy_version 36150 (0.0009) [2023-10-12 17:16:26,661][62635] Updated weights for policy 1, policy_version 36160 (0.0007) [2023-10-12 17:16:28,403][62634] Updated weights for policy 0, policy_version 36170 (0.0008) [2023-10-12 17:16:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74055680. Throughput: 0: 1681.4, 1: 1680.3. Samples: 18524316. Policy #0 lag: (min: 24.0, avg: 44.8, max: 56.0) [2023-10-12 17:16:28,436][61643] Avg episode reward: [(0, '9.300'), (1, '9.820')] [2023-10-12 17:16:28,773][62634] Updated weights for policy 0, policy_version 36180 (0.0010) [2023-10-12 17:16:29,148][62634] Updated weights for policy 0, policy_version 36190 (0.0007) [2023-10-12 17:16:30,671][62635] Updated weights for policy 1, policy_version 36170 (0.0009) [2023-10-12 17:16:31,039][62635] Updated weights for policy 1, policy_version 36180 (0.0008) [2023-10-12 17:16:31,407][62635] Updated weights for policy 1, policy_version 36190 (0.0007) [2023-10-12 17:16:33,196][62634] Updated weights for policy 0, policy_version 36200 (0.0010) [2023-10-12 17:16:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74121216. Throughput: 0: 1678.0, 1: 1695.7. Samples: 18545066. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:16:33,435][61643] Avg episode reward: [(0, '9.220'), (1, '9.830')] [2023-10-12 17:16:33,578][62634] Updated weights for policy 0, policy_version 36210 (0.0009) [2023-10-12 17:16:33,947][62634] Updated weights for policy 0, policy_version 36220 (0.0010) [2023-10-12 17:16:35,315][62635] Updated weights for policy 1, policy_version 36200 (0.0008) [2023-10-12 17:16:35,684][62635] Updated weights for policy 1, policy_version 36210 (0.0009) [2023-10-12 17:16:36,053][62635] Updated weights for policy 1, policy_version 36220 (0.0009) [2023-10-12 17:16:37,981][62634] Updated weights for policy 0, policy_version 36230 (0.0009) [2023-10-12 17:16:38,346][62634] Updated weights for policy 0, policy_version 36240 (0.0008) [2023-10-12 17:16:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74186752. Throughput: 0: 1686.6, 1: 1674.4. Samples: 18554628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:16:38,435][61643] Avg episode reward: [(0, '9.440'), (1, '9.850')] [2023-10-12 17:16:38,729][62634] Updated weights for policy 0, policy_version 36250 (0.0008) [2023-10-12 17:16:38,946][62354] Saving new best policy, reward=9.440! [2023-10-12 17:16:40,279][62635] Updated weights for policy 1, policy_version 36230 (0.0009) [2023-10-12 17:16:40,650][62635] Updated weights for policy 1, policy_version 36240 (0.0007) [2023-10-12 17:16:41,030][62635] Updated weights for policy 1, policy_version 36250 (0.0008) [2023-10-12 17:16:42,717][62634] Updated weights for policy 0, policy_version 36260 (0.0009) [2023-10-12 17:16:43,092][62634] Updated weights for policy 0, policy_version 36270 (0.0009) [2023-10-12 17:16:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74252288. Throughput: 0: 1681.4, 1: 1688.3. Samples: 18574964. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:16:43,435][61643] Avg episode reward: [(0, '9.400'), (1, '9.890')] [2023-10-12 17:16:43,467][62634] Updated weights for policy 0, policy_version 36280 (0.0009) [2023-10-12 17:16:45,081][62635] Updated weights for policy 1, policy_version 36260 (0.0010) [2023-10-12 17:16:45,450][62635] Updated weights for policy 1, policy_version 36270 (0.0009) [2023-10-12 17:16:45,825][62635] Updated weights for policy 1, policy_version 36280 (0.0008) [2023-10-12 17:16:47,449][62634] Updated weights for policy 0, policy_version 36290 (0.0008) [2023-10-12 17:16:47,824][62634] Updated weights for policy 0, policy_version 36300 (0.0010) [2023-10-12 17:16:48,198][62634] Updated weights for policy 0, policy_version 36310 (0.0010) [2023-10-12 17:16:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 74317824. Throughput: 0: 1669.5, 1: 1692.3. Samples: 18594958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:16:48,436][61643] Avg episode reward: [(0, '9.050'), (1, '9.680')] [2023-10-12 17:16:48,572][62634] Updated weights for policy 0, policy_version 36320 (0.0011) [2023-10-12 17:16:49,908][62635] Updated weights for policy 1, policy_version 36290 (0.0011) [2023-10-12 17:16:50,264][62635] Updated weights for policy 1, policy_version 36300 (0.0008) [2023-10-12 17:16:50,635][62635] Updated weights for policy 1, policy_version 36310 (0.0007) [2023-10-12 17:16:51,002][62635] Updated weights for policy 1, policy_version 36320 (0.0007) [2023-10-12 17:16:52,682][62634] Updated weights for policy 0, policy_version 36330 (0.0008) [2023-10-12 17:16:53,059][62634] Updated weights for policy 0, policy_version 36340 (0.0009) [2023-10-12 17:16:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 74383360. Throughput: 0: 1681.6, 1: 1665.1. Samples: 18604640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:16:53,435][61643] Avg episode reward: [(0, '9.220'), (1, '9.850')] [2023-10-12 17:16:53,437][62634] Updated weights for policy 0, policy_version 36350 (0.0011) [2023-10-12 17:16:54,976][62635] Updated weights for policy 1, policy_version 36330 (0.0007) [2023-10-12 17:16:55,340][62635] Updated weights for policy 1, policy_version 36340 (0.0007) [2023-10-12 17:16:55,706][62635] Updated weights for policy 1, policy_version 36350 (0.0007) [2023-10-12 17:16:57,432][62634] Updated weights for policy 0, policy_version 36360 (0.0010) [2023-10-12 17:16:57,808][62634] Updated weights for policy 0, policy_version 36370 (0.0009) [2023-10-12 17:16:58,190][62634] Updated weights for policy 0, policy_version 36380 (0.0008) [2023-10-12 17:16:58,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 74481664. Throughput: 0: 1682.4, 1: 1688.6. Samples: 18625570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:16:58,435][61643] Avg episode reward: [(0, '9.330'), (1, '9.980')] [2023-10-12 17:16:59,794][62635] Updated weights for policy 1, policy_version 36360 (0.0007) [2023-10-12 17:17:00,180][62635] Updated weights for policy 1, policy_version 36370 (0.0008) [2023-10-12 17:17:00,544][62635] Updated weights for policy 1, policy_version 36380 (0.0010) [2023-10-12 17:17:02,214][62634] Updated weights for policy 0, policy_version 36390 (0.0007) [2023-10-12 17:17:02,599][62634] Updated weights for policy 0, policy_version 36400 (0.0008) [2023-10-12 17:17:02,979][62634] Updated weights for policy 0, policy_version 36410 (0.0009) [2023-10-12 17:17:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 74547200. Throughput: 0: 1665.4, 1: 1685.0. Samples: 18645404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:03,435][61643] Avg episode reward: [(0, '9.150'), (1, '9.710')] [2023-10-12 17:17:04,501][62635] Updated weights for policy 1, policy_version 36390 (0.0009) [2023-10-12 17:17:04,871][62635] Updated weights for policy 1, policy_version 36400 (0.0008) [2023-10-12 17:17:05,245][62635] Updated weights for policy 1, policy_version 36410 (0.0007) [2023-10-12 17:17:06,965][62634] Updated weights for policy 0, policy_version 36420 (0.0008) [2023-10-12 17:17:07,350][62634] Updated weights for policy 0, policy_version 36430 (0.0007) [2023-10-12 17:17:07,724][62634] Updated weights for policy 0, policy_version 36440 (0.0008) [2023-10-12 17:17:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 74612736. Throughput: 0: 1689.0, 1: 1666.1. Samples: 18655534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:08,435][61643] Avg episode reward: [(0, '9.180'), (1, '9.720')] [2023-10-12 17:17:09,426][62635] Updated weights for policy 1, policy_version 36420 (0.0009) [2023-10-12 17:17:09,799][62635] Updated weights for policy 1, policy_version 36430 (0.0008) [2023-10-12 17:17:10,175][62635] Updated weights for policy 1, policy_version 36440 (0.0008) [2023-10-12 17:17:11,848][62634] Updated weights for policy 0, policy_version 36450 (0.0007) [2023-10-12 17:17:12,247][62634] Updated weights for policy 0, policy_version 36460 (0.0009) [2023-10-12 17:17:12,626][62634] Updated weights for policy 0, policy_version 36470 (0.0010) [2023-10-12 17:17:13,010][62634] Updated weights for policy 0, policy_version 36480 (0.0010) [2023-10-12 17:17:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 74678272. Throughput: 0: 1689.0, 1: 1682.2. Samples: 18676020. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:13,435][61643] Avg episode reward: [(0, '9.300'), (1, '9.910')] [2023-10-12 17:17:14,416][62635] Updated weights for policy 1, policy_version 36450 (0.0007) [2023-10-12 17:17:14,791][62635] Updated weights for policy 1, policy_version 36460 (0.0008) [2023-10-12 17:17:15,158][62635] Updated weights for policy 1, policy_version 36470 (0.0008) [2023-10-12 17:17:15,526][62635] Updated weights for policy 1, policy_version 36480 (0.0009) [2023-10-12 17:17:17,037][62634] Updated weights for policy 0, policy_version 36490 (0.0007) [2023-10-12 17:17:17,423][62634] Updated weights for policy 0, policy_version 36500 (0.0007) [2023-10-12 17:17:17,802][62634] Updated weights for policy 0, policy_version 36510 (0.0007) [2023-10-12 17:17:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 74743808. Throughput: 0: 1665.1, 1: 1676.4. Samples: 18695432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:18,436][61643] Avg episode reward: [(0, '9.060'), (1, '9.660')] [2023-10-12 17:17:19,428][62635] Updated weights for policy 1, policy_version 36490 (0.0010) [2023-10-12 17:17:19,797][62635] Updated weights for policy 1, policy_version 36500 (0.0008) [2023-10-12 17:17:20,169][62635] Updated weights for policy 1, policy_version 36510 (0.0011) [2023-10-12 17:17:21,803][62634] Updated weights for policy 0, policy_version 36520 (0.0008) [2023-10-12 17:17:22,174][62634] Updated weights for policy 0, policy_version 36530 (0.0011) [2023-10-12 17:17:22,549][62634] Updated weights for policy 0, policy_version 36540 (0.0010) [2023-10-12 17:17:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 74809344. Throughput: 0: 1690.2, 1: 1670.3. Samples: 18705848. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:23,435][61643] Avg episode reward: [(0, '8.910'), (1, '9.710')] [2023-10-12 17:17:24,056][62635] Updated weights for policy 1, policy_version 36520 (0.0008) [2023-10-12 17:17:24,435][62635] Updated weights for policy 1, policy_version 36530 (0.0007) [2023-10-12 17:17:24,802][62635] Updated weights for policy 1, policy_version 36540 (0.0010) [2023-10-12 17:17:26,462][62634] Updated weights for policy 0, policy_version 36550 (0.0009) [2023-10-12 17:17:26,838][62634] Updated weights for policy 0, policy_version 36560 (0.0010) [2023-10-12 17:17:27,215][62634] Updated weights for policy 0, policy_version 36570 (0.0007) [2023-10-12 17:17:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 74874880. Throughput: 0: 1675.8, 1: 1676.4. Samples: 18725814. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:28,436][61643] Avg episode reward: [(0, '9.010'), (1, '9.840')] [2023-10-12 17:17:28,788][62635] Updated weights for policy 1, policy_version 36550 (0.0010) [2023-10-12 17:17:29,153][62635] Updated weights for policy 1, policy_version 36560 (0.0010) [2023-10-12 17:17:29,524][62635] Updated weights for policy 1, policy_version 36570 (0.0011) [2023-10-12 17:17:31,193][62634] Updated weights for policy 0, policy_version 36580 (0.0007) [2023-10-12 17:17:31,564][62634] Updated weights for policy 0, policy_version 36590 (0.0007) [2023-10-12 17:17:31,939][62634] Updated weights for policy 0, policy_version 36600 (0.0008) [2023-10-12 17:17:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 74940416. Throughput: 0: 1679.4, 1: 1681.8. Samples: 18746212. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:33,435][61643] Avg episode reward: [(0, '9.120'), (1, '9.920')] [2023-10-12 17:17:33,569][62635] Updated weights for policy 1, policy_version 36580 (0.0009) [2023-10-12 17:17:33,937][62635] Updated weights for policy 1, policy_version 36590 (0.0009) [2023-10-12 17:17:34,311][62635] Updated weights for policy 1, policy_version 36600 (0.0008) [2023-10-12 17:17:35,900][62634] Updated weights for policy 0, policy_version 36610 (0.0009) [2023-10-12 17:17:36,282][62634] Updated weights for policy 0, policy_version 36620 (0.0009) [2023-10-12 17:17:36,659][62634] Updated weights for policy 0, policy_version 36630 (0.0009) [2023-10-12 17:17:37,042][62634] Updated weights for policy 0, policy_version 36640 (0.0008) [2023-10-12 17:17:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75005952. Throughput: 0: 1698.0, 1: 1681.1. Samples: 18756696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:38,435][61643] Avg episode reward: [(0, '8.920'), (1, '9.690')] [2023-10-12 17:17:38,574][62635] Updated weights for policy 1, policy_version 36610 (0.0008) [2023-10-12 17:17:38,945][62635] Updated weights for policy 1, policy_version 36620 (0.0011) [2023-10-12 17:17:39,315][62635] Updated weights for policy 1, policy_version 36630 (0.0010) [2023-10-12 17:17:39,677][62635] Updated weights for policy 1, policy_version 36640 (0.0011) [2023-10-12 17:17:41,057][62634] Updated weights for policy 0, policy_version 36650 (0.0007) [2023-10-12 17:17:41,428][62634] Updated weights for policy 0, policy_version 36660 (0.0007) [2023-10-12 17:17:41,810][62634] Updated weights for policy 0, policy_version 36670 (0.0007) [2023-10-12 17:17:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75071488. Throughput: 0: 1666.7, 1: 1682.0. Samples: 18776266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:17:43,436][61643] Avg episode reward: [(0, '9.150'), (1, '9.830')] [2023-10-12 17:17:43,801][62635] Updated weights for policy 1, policy_version 36650 (0.0009) [2023-10-12 17:17:44,172][62635] Updated weights for policy 1, policy_version 36660 (0.0009) [2023-10-12 17:17:44,528][62635] Updated weights for policy 1, policy_version 36670 (0.0008) [2023-10-12 17:17:45,895][62634] Updated weights for policy 0, policy_version 36680 (0.0010) [2023-10-12 17:17:46,266][62634] Updated weights for policy 0, policy_version 36690 (0.0010) [2023-10-12 17:17:46,642][62634] Updated weights for policy 0, policy_version 36700 (0.0007) [2023-10-12 17:17:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 75137024. Throughput: 0: 1685.5, 1: 1684.1. Samples: 18797036. Policy #0 lag: (min: 19.0, avg: 26.6, max: 51.0) [2023-10-12 17:17:48,435][61643] Avg episode reward: [(0, '9.470'), (1, '9.850')] [2023-10-12 17:17:48,442][62354] Saving new best policy, reward=9.470! [2023-10-12 17:17:48,895][62635] Updated weights for policy 1, policy_version 36680 (0.0009) [2023-10-12 17:17:49,279][62635] Updated weights for policy 1, policy_version 36690 (0.0010) [2023-10-12 17:17:49,647][62635] Updated weights for policy 1, policy_version 36700 (0.0011) [2023-10-12 17:17:50,793][62634] Updated weights for policy 0, policy_version 36710 (0.0009) [2023-10-12 17:17:51,175][62634] Updated weights for policy 0, policy_version 36720 (0.0009) [2023-10-12 17:17:51,557][62634] Updated weights for policy 0, policy_version 36730 (0.0007) [2023-10-12 17:17:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75202560. Throughput: 0: 1683.5, 1: 1677.1. Samples: 18806760. 
Policy #0 lag: (min: 19.0, avg: 26.6, max: 51.0) [2023-10-12 17:17:53,435][61643] Avg episode reward: [(0, '9.230'), (1, '9.680')] [2023-10-12 17:17:53,728][62635] Updated weights for policy 1, policy_version 36710 (0.0010) [2023-10-12 17:17:54,085][62635] Updated weights for policy 1, policy_version 36720 (0.0008) [2023-10-12 17:17:54,455][62635] Updated weights for policy 1, policy_version 36730 (0.0008) [2023-10-12 17:17:55,734][62634] Updated weights for policy 0, policy_version 36740 (0.0007) [2023-10-12 17:17:56,117][62634] Updated weights for policy 0, policy_version 36750 (0.0007) [2023-10-12 17:17:56,494][62634] Updated weights for policy 0, policy_version 36760 (0.0008) [2023-10-12 17:17:58,267][62635] Updated weights for policy 1, policy_version 36740 (0.0008) [2023-10-12 17:17:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 75268096. Throughput: 0: 1668.6, 1: 1677.8. Samples: 18826606. Policy #0 lag: (min: 19.0, avg: 26.6, max: 51.0) [2023-10-12 17:17:58,435][61643] Avg episode reward: [(0, '9.810'), (1, '9.540')] [2023-10-12 17:17:58,436][62354] Saving new best policy, reward=9.810! [2023-10-12 17:17:58,633][62635] Updated weights for policy 1, policy_version 36750 (0.0009) [2023-10-12 17:17:58,995][62635] Updated weights for policy 1, policy_version 36760 (0.0009) [2023-10-12 17:18:00,447][62634] Updated weights for policy 0, policy_version 36770 (0.0009) [2023-10-12 17:18:00,845][62634] Updated weights for policy 0, policy_version 36780 (0.0007) [2023-10-12 17:18:01,221][62634] Updated weights for policy 0, policy_version 36790 (0.0009) [2023-10-12 17:18:01,602][62634] Updated weights for policy 0, policy_version 36800 (0.0008) [2023-10-12 17:18:03,088][62635] Updated weights for policy 1, policy_version 36770 (0.0010) [2023-10-12 17:18:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 75333632. Throughput: 0: 1690.3, 1: 1682.1. Samples: 18847190. 
Policy #0 lag: (min: 19.0, avg: 26.6, max: 51.0) [2023-10-12 17:18:03,435][61643] Avg episode reward: [(0, '9.930'), (1, '9.500')] [2023-10-12 17:18:03,442][62354] Saving new best policy, reward=9.930! [2023-10-12 17:18:03,460][62635] Updated weights for policy 1, policy_version 36780 (0.0008) [2023-10-12 17:18:03,830][62635] Updated weights for policy 1, policy_version 36790 (0.0009) [2023-10-12 17:18:04,194][62635] Updated weights for policy 1, policy_version 36800 (0.0009) [2023-10-12 17:18:05,722][62634] Updated weights for policy 0, policy_version 36810 (0.0009) [2023-10-12 17:18:06,088][62634] Updated weights for policy 0, policy_version 36820 (0.0010) [2023-10-12 17:18:06,464][62634] Updated weights for policy 0, policy_version 36830 (0.0009) [2023-10-12 17:18:08,245][62635] Updated weights for policy 1, policy_version 36810 (0.0007) [2023-10-12 17:18:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 75399168. Throughput: 0: 1677.6, 1: 1680.9. Samples: 18856982. Policy #0 lag: (min: 19.0, avg: 26.6, max: 51.0) [2023-10-12 17:18:08,435][61643] Avg episode reward: [(0, '9.960'), (1, '9.670')] [2023-10-12 17:18:08,436][62354] Saving new best policy, reward=9.960! [2023-10-12 17:18:08,616][62635] Updated weights for policy 1, policy_version 36820 (0.0007) [2023-10-12 17:18:08,982][62635] Updated weights for policy 1, policy_version 36830 (0.0007) [2023-10-12 17:18:10,553][62634] Updated weights for policy 0, policy_version 36840 (0.0008) [2023-10-12 17:18:10,931][62634] Updated weights for policy 0, policy_version 36850 (0.0009) [2023-10-12 17:18:11,315][62634] Updated weights for policy 0, policy_version 36860 (0.0011) [2023-10-12 17:18:13,001][62635] Updated weights for policy 1, policy_version 36840 (0.0010) [2023-10-12 17:18:13,372][62635] Updated weights for policy 1, policy_version 36850 (0.0009) [2023-10-12 17:18:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). 
Total num frames: 75464704. Throughput: 0: 1680.7, 1: 1682.8. Samples: 18877172. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:18:13,435][61643] Avg episode reward: [(0, '10.150'), (1, '9.630')] [2023-10-12 17:18:13,436][62354] Saving new best policy, reward=10.150! [2023-10-12 17:18:13,740][62635] Updated weights for policy 1, policy_version 36860 (0.0009) [2023-10-12 17:18:15,082][62634] Updated weights for policy 0, policy_version 36870 (0.0009) [2023-10-12 17:18:15,452][62634] Updated weights for policy 0, policy_version 36880 (0.0010) [2023-10-12 17:18:15,841][62634] Updated weights for policy 0, policy_version 36890 (0.0010) [2023-10-12 17:18:17,774][62635] Updated weights for policy 1, policy_version 36870 (0.0009) [2023-10-12 17:18:18,142][62635] Updated weights for policy 1, policy_version 36880 (0.0008) [2023-10-12 17:18:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 75530240. Throughput: 0: 1695.0, 1: 1673.3. Samples: 18897784. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:18:18,436][61643] Avg episode reward: [(0, '10.190'), (1, '9.820')] [2023-10-12 17:18:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000036896_37781504.pth... [2023-10-12 17:18:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000035328_36175872.pth [2023-10-12 17:18:18,482][62354] Saving new best policy, reward=10.190! [2023-10-12 17:18:18,514][62635] Updated weights for policy 1, policy_version 36890 (0.0008) [2023-10-12 17:18:18,724][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000036896_37781504.pth... 
[2023-10-12 17:18:18,753][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000035296_36143104.pth [2023-10-12 17:18:19,826][62634] Updated weights for policy 0, policy_version 36900 (0.0010) [2023-10-12 17:18:20,204][62634] Updated weights for policy 0, policy_version 36910 (0.0009) [2023-10-12 17:18:20,571][62634] Updated weights for policy 0, policy_version 36920 (0.0009) [2023-10-12 17:18:22,530][62635] Updated weights for policy 1, policy_version 36900 (0.0009) [2023-10-12 17:18:22,904][62635] Updated weights for policy 1, policy_version 36910 (0.0007) [2023-10-12 17:18:23,267][62635] Updated weights for policy 1, policy_version 36920 (0.0008) [2023-10-12 17:18:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 75595776. Throughput: 0: 1667.8, 1: 1687.4. Samples: 18907680. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:18:23,435][61643] Avg episode reward: [(0, '10.180'), (1, '9.560')] [2023-10-12 17:18:24,716][62634] Updated weights for policy 0, policy_version 36930 (0.0009) [2023-10-12 17:18:25,100][62634] Updated weights for policy 0, policy_version 36940 (0.0007) [2023-10-12 17:18:25,476][62634] Updated weights for policy 0, policy_version 36950 (0.0008) [2023-10-12 17:18:25,857][62634] Updated weights for policy 0, policy_version 36960 (0.0007) [2023-10-12 17:18:27,513][62635] Updated weights for policy 1, policy_version 36930 (0.0009) [2023-10-12 17:18:27,896][62635] Updated weights for policy 1, policy_version 36940 (0.0008) [2023-10-12 17:18:28,260][62635] Updated weights for policy 1, policy_version 36950 (0.0009) [2023-10-12 17:18:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 75661312. Throughput: 0: 1694.5, 1: 1681.7. Samples: 18928196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:18:28,436][61643] Avg episode reward: [(0, '10.260'), (1, '9.680')] [2023-10-12 17:18:28,437][62354] Saving new best policy, reward=10.260! [2023-10-12 17:18:28,626][62635] Updated weights for policy 1, policy_version 36960 (0.0007) [2023-10-12 17:18:29,941][62634] Updated weights for policy 0, policy_version 36970 (0.0011) [2023-10-12 17:18:30,313][62634] Updated weights for policy 0, policy_version 36980 (0.0011) [2023-10-12 17:18:30,694][62634] Updated weights for policy 0, policy_version 36990 (0.0011) [2023-10-12 17:18:32,617][62635] Updated weights for policy 1, policy_version 36970 (0.0008) [2023-10-12 17:18:32,992][62635] Updated weights for policy 1, policy_version 36980 (0.0011) [2023-10-12 17:18:33,367][62635] Updated weights for policy 1, policy_version 36990 (0.0009) [2023-10-12 17:18:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75726848. Throughput: 0: 1693.5, 1: 1660.4. Samples: 18947962. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:18:33,436][61643] Avg episode reward: [(0, '10.820'), (1, '9.730')] [2023-10-12 17:18:33,446][62354] Saving new best policy, reward=10.820! [2023-10-12 17:18:34,636][62634] Updated weights for policy 0, policy_version 37000 (0.0009) [2023-10-12 17:18:35,016][62634] Updated weights for policy 0, policy_version 37010 (0.0010) [2023-10-12 17:18:35,392][62634] Updated weights for policy 0, policy_version 37020 (0.0010) [2023-10-12 17:18:37,550][62635] Updated weights for policy 1, policy_version 37000 (0.0009) [2023-10-12 17:18:37,922][62635] Updated weights for policy 1, policy_version 37010 (0.0008) [2023-10-12 17:18:38,281][62635] Updated weights for policy 1, policy_version 37020 (0.0009) [2023-10-12 17:18:38,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75825152. Throughput: 0: 1674.4, 1: 1683.7. Samples: 18957876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:18:38,435][61643] Avg episode reward: [(0, '11.090'), (1, '9.590')] [2023-10-12 17:18:38,436][62354] Saving new best policy, reward=11.090! [2023-10-12 17:18:39,221][62634] Updated weights for policy 0, policy_version 37030 (0.0008) [2023-10-12 17:18:39,601][62634] Updated weights for policy 0, policy_version 37040 (0.0008) [2023-10-12 17:18:39,989][62634] Updated weights for policy 0, policy_version 37050 (0.0008) [2023-10-12 17:18:42,398][62635] Updated weights for policy 1, policy_version 37030 (0.0009) [2023-10-12 17:18:42,768][62635] Updated weights for policy 1, policy_version 37040 (0.0007) [2023-10-12 17:18:43,134][62635] Updated weights for policy 1, policy_version 37050 (0.0008) [2023-10-12 17:18:43,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 75890688. Throughput: 0: 1694.5, 1: 1681.9. Samples: 18978544. Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) [2023-10-12 17:18:43,435][61643] Avg episode reward: [(0, '11.290'), (1, '9.570')] [2023-10-12 17:18:43,436][62354] Saving new best policy, reward=11.290! [2023-10-12 17:18:44,130][62634] Updated weights for policy 0, policy_version 37060 (0.0009) [2023-10-12 17:18:44,507][62634] Updated weights for policy 0, policy_version 37070 (0.0008) [2023-10-12 17:18:44,884][62634] Updated weights for policy 0, policy_version 37080 (0.0011) [2023-10-12 17:18:47,162][62635] Updated weights for policy 1, policy_version 37060 (0.0010) [2023-10-12 17:18:47,517][62635] Updated weights for policy 1, policy_version 37070 (0.0011) [2023-10-12 17:18:47,887][62635] Updated weights for policy 1, policy_version 37080 (0.0007) [2023-10-12 17:18:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75956224. Throughput: 0: 1696.2, 1: 1663.0. Samples: 18998356. 
Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) [2023-10-12 17:18:48,436][61643] Avg episode reward: [(0, '11.100'), (1, '9.670')] [2023-10-12 17:18:49,113][62634] Updated weights for policy 0, policy_version 37090 (0.0010) [2023-10-12 17:18:49,518][62634] Updated weights for policy 0, policy_version 37100 (0.0009) [2023-10-12 17:18:49,895][62634] Updated weights for policy 0, policy_version 37110 (0.0007) [2023-10-12 17:18:50,272][62634] Updated weights for policy 0, policy_version 37120 (0.0007) [2023-10-12 17:18:52,060][62635] Updated weights for policy 1, policy_version 37090 (0.0009) [2023-10-12 17:18:52,428][62635] Updated weights for policy 1, policy_version 37100 (0.0009) [2023-10-12 17:18:52,806][62635] Updated weights for policy 1, policy_version 37110 (0.0009) [2023-10-12 17:18:53,182][62635] Updated weights for policy 1, policy_version 37120 (0.0010) [2023-10-12 17:18:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76021760. Throughput: 0: 1680.0, 1: 1686.1. Samples: 19008458. Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) [2023-10-12 17:18:53,435][61643] Avg episode reward: [(0, '12.260'), (1, '9.710')] [2023-10-12 17:18:53,436][62354] Saving new best policy, reward=12.260! [2023-10-12 17:18:54,040][62634] Updated weights for policy 0, policy_version 37130 (0.0007) [2023-10-12 17:18:54,415][62634] Updated weights for policy 0, policy_version 37140 (0.0008) [2023-10-12 17:18:54,793][62634] Updated weights for policy 0, policy_version 37150 (0.0007) [2023-10-12 17:18:57,396][62635] Updated weights for policy 1, policy_version 37130 (0.0010) [2023-10-12 17:18:57,758][62635] Updated weights for policy 1, policy_version 37140 (0.0009) [2023-10-12 17:18:58,126][62635] Updated weights for policy 1, policy_version 37150 (0.0008) [2023-10-12 17:18:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76087296. Throughput: 0: 1701.1, 1: 1681.1. Samples: 19029372. 
Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) [2023-10-12 17:18:58,436][61643] Avg episode reward: [(0, '12.150'), (1, '9.900')] [2023-10-12 17:18:58,775][62634] Updated weights for policy 0, policy_version 37160 (0.0010) [2023-10-12 17:18:59,157][62634] Updated weights for policy 0, policy_version 37170 (0.0008) [2023-10-12 17:18:59,546][62634] Updated weights for policy 0, policy_version 37180 (0.0007) [2023-10-12 17:19:02,221][62635] Updated weights for policy 1, policy_version 37160 (0.0010) [2023-10-12 17:19:02,590][62635] Updated weights for policy 1, policy_version 37170 (0.0007) [2023-10-12 17:19:02,964][62635] Updated weights for policy 1, policy_version 37180 (0.0009) [2023-10-12 17:19:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76152832. Throughput: 0: 1693.1, 1: 1662.4. Samples: 19048784. Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) [2023-10-12 17:19:03,436][61643] Avg episode reward: [(0, '12.470'), (1, '9.830')] [2023-10-12 17:19:03,602][62634] Updated weights for policy 0, policy_version 37190 (0.0009) [2023-10-12 17:19:03,972][62634] Updated weights for policy 0, policy_version 37200 (0.0010) [2023-10-12 17:19:04,346][62634] Updated weights for policy 0, policy_version 37210 (0.0008) [2023-10-12 17:19:04,561][62354] Saving new best policy, reward=12.470! [2023-10-12 17:19:07,098][62635] Updated weights for policy 1, policy_version 37190 (0.0008) [2023-10-12 17:19:07,470][62635] Updated weights for policy 1, policy_version 37200 (0.0009) [2023-10-12 17:19:07,833][62635] Updated weights for policy 1, policy_version 37210 (0.0011) [2023-10-12 17:19:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76218368. Throughput: 0: 1686.5, 1: 1670.8. Samples: 19058758. 
Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) [2023-10-12 17:19:08,435][61643] Avg episode reward: [(0, '12.590'), (1, '9.880')] [2023-10-12 17:19:08,533][62634] Updated weights for policy 0, policy_version 37220 (0.0009) [2023-10-12 17:19:08,911][62634] Updated weights for policy 0, policy_version 37230 (0.0011) [2023-10-12 17:19:09,294][62634] Updated weights for policy 0, policy_version 37240 (0.0008) [2023-10-12 17:19:09,590][62354] Saving new best policy, reward=12.590! [2023-10-12 17:19:11,756][62635] Updated weights for policy 1, policy_version 37220 (0.0009) [2023-10-12 17:19:12,124][62635] Updated weights for policy 1, policy_version 37230 (0.0007) [2023-10-12 17:19:12,488][62635] Updated weights for policy 1, policy_version 37240 (0.0007) [2023-10-12 17:19:13,396][62634] Updated weights for policy 0, policy_version 37250 (0.0008) [2023-10-12 17:19:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76283904. Throughput: 0: 1686.7, 1: 1666.1. Samples: 19079074. Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-12 17:19:13,435][61643] Avg episode reward: [(0, '12.880'), (1, '10.090')] [2023-10-12 17:19:13,436][62495] Saving new best policy, reward=10.090! [2023-10-12 17:19:13,779][62634] Updated weights for policy 0, policy_version 37260 (0.0010) [2023-10-12 17:19:14,147][62634] Updated weights for policy 0, policy_version 37270 (0.0010) [2023-10-12 17:19:14,524][62354] Saving new best policy, reward=12.880! 
[2023-10-12 17:19:14,524][62634] Updated weights for policy 0, policy_version 37280 (0.0007) [2023-10-12 17:19:16,578][62635] Updated weights for policy 1, policy_version 37250 (0.0009) [2023-10-12 17:19:16,951][62635] Updated weights for policy 1, policy_version 37260 (0.0007) [2023-10-12 17:19:17,314][62635] Updated weights for policy 1, policy_version 37270 (0.0009) [2023-10-12 17:19:17,682][62635] Updated weights for policy 1, policy_version 37280 (0.0010) [2023-10-12 17:19:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 76349440. Throughput: 0: 1687.9, 1: 1666.5. Samples: 19098912. Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-12 17:19:18,435][61643] Avg episode reward: [(0, '13.370'), (1, '10.000')] [2023-10-12 17:19:18,481][62634] Updated weights for policy 0, policy_version 37290 (0.0010) [2023-10-12 17:19:18,855][62634] Updated weights for policy 0, policy_version 37300 (0.0010) [2023-10-12 17:19:19,231][62634] Updated weights for policy 0, policy_version 37310 (0.0010) [2023-10-12 17:19:19,305][62354] Saving new best policy, reward=13.370! [2023-10-12 17:19:21,610][62635] Updated weights for policy 1, policy_version 37290 (0.0009) [2023-10-12 17:19:21,981][62635] Updated weights for policy 1, policy_version 37300 (0.0010) [2023-10-12 17:19:22,348][62635] Updated weights for policy 1, policy_version 37310 (0.0010) [2023-10-12 17:19:23,255][62634] Updated weights for policy 0, policy_version 37320 (0.0008) [2023-10-12 17:19:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 76414976. Throughput: 0: 1685.6, 1: 1677.6. Samples: 19109222. 
Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-12 17:19:23,435][61643] Avg episode reward: [(0, '13.900'), (1, '9.830')] [2023-10-12 17:19:23,623][62634] Updated weights for policy 0, policy_version 37330 (0.0007) [2023-10-12 17:19:24,007][62634] Updated weights for policy 0, policy_version 37340 (0.0007) [2023-10-12 17:19:24,151][62354] Saving new best policy, reward=13.900! [2023-10-12 17:19:26,506][62635] Updated weights for policy 1, policy_version 37320 (0.0009) [2023-10-12 17:19:26,881][62635] Updated weights for policy 1, policy_version 37330 (0.0008) [2023-10-12 17:19:27,250][62635] Updated weights for policy 1, policy_version 37340 (0.0010) [2023-10-12 17:19:28,010][62634] Updated weights for policy 0, policy_version 37350 (0.0007) [2023-10-12 17:19:28,387][62634] Updated weights for policy 0, policy_version 37360 (0.0010) [2023-10-12 17:19:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 76480512. Throughput: 0: 1687.2, 1: 1663.6. Samples: 19129326. Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-12 17:19:28,435][61643] Avg episode reward: [(0, '13.770'), (1, '9.910')] [2023-10-12 17:19:28,762][62634] Updated weights for policy 0, policy_version 37370 (0.0009) [2023-10-12 17:19:31,293][62635] Updated weights for policy 1, policy_version 37350 (0.0007) [2023-10-12 17:19:31,665][62635] Updated weights for policy 1, policy_version 37360 (0.0010) [2023-10-12 17:19:32,032][62635] Updated weights for policy 1, policy_version 37370 (0.0011) [2023-10-12 17:19:32,722][62634] Updated weights for policy 0, policy_version 37380 (0.0009) [2023-10-12 17:19:33,098][62634] Updated weights for policy 0, policy_version 37390 (0.0009) [2023-10-12 17:19:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76546048. Throughput: 0: 1684.4, 1: 1675.8. Samples: 19149564. 
Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-12 17:19:33,436][61643] Avg episode reward: [(0, '13.940'), (1, '9.740')] [2023-10-12 17:19:33,478][62634] Updated weights for policy 0, policy_version 37400 (0.0009) [2023-10-12 17:19:33,775][62354] Saving new best policy, reward=13.940! [2023-10-12 17:19:36,016][62635] Updated weights for policy 1, policy_version 37380 (0.0009) [2023-10-12 17:19:36,384][62635] Updated weights for policy 1, policy_version 37390 (0.0009) [2023-10-12 17:19:36,751][62635] Updated weights for policy 1, policy_version 37400 (0.0009) [2023-10-12 17:19:37,507][62634] Updated weights for policy 0, policy_version 37410 (0.0010) [2023-10-12 17:19:37,911][62634] Updated weights for policy 0, policy_version 37420 (0.0007) [2023-10-12 17:19:38,286][62634] Updated weights for policy 0, policy_version 37430 (0.0008) [2023-10-12 17:19:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76611584. Throughput: 0: 1696.9, 1: 1678.9. Samples: 19160372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:19:38,435][61643] Avg episode reward: [(0, '14.500'), (1, '9.570')] [2023-10-12 17:19:38,661][62354] Saving new best policy, reward=14.500! 
[2023-10-12 17:19:38,666][62634] Updated weights for policy 0, policy_version 37440 (0.0007) [2023-10-12 17:19:40,692][62635] Updated weights for policy 1, policy_version 37410 (0.0010) [2023-10-12 17:19:41,059][62635] Updated weights for policy 1, policy_version 37420 (0.0008) [2023-10-12 17:19:41,433][62635] Updated weights for policy 1, policy_version 37430 (0.0008) [2023-10-12 17:19:41,790][62635] Updated weights for policy 1, policy_version 37440 (0.0007) [2023-10-12 17:19:42,532][62634] Updated weights for policy 0, policy_version 37450 (0.0008) [2023-10-12 17:19:42,900][62634] Updated weights for policy 0, policy_version 37460 (0.0008) [2023-10-12 17:19:43,274][62634] Updated weights for policy 0, policy_version 37470 (0.0011) [2023-10-12 17:19:43,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 76709888. Throughput: 0: 1695.1, 1: 1662.5. Samples: 19180464. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:19:43,435][61643] Avg episode reward: [(0, '14.910'), (1, '9.550')] [2023-10-12 17:19:43,436][62354] Saving new best policy, reward=14.910! [2023-10-12 17:19:45,790][62635] Updated weights for policy 1, policy_version 37450 (0.0009) [2023-10-12 17:19:46,155][62635] Updated weights for policy 1, policy_version 37460 (0.0010) [2023-10-12 17:19:46,531][62635] Updated weights for policy 1, policy_version 37470 (0.0007) [2023-10-12 17:19:47,442][62634] Updated weights for policy 0, policy_version 37480 (0.0011) [2023-10-12 17:19:47,816][62634] Updated weights for policy 0, policy_version 37490 (0.0008) [2023-10-12 17:19:48,194][62634] Updated weights for policy 0, policy_version 37500 (0.0011) [2023-10-12 17:19:48,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 76775424. Throughput: 0: 1677.7, 1: 1690.1. Samples: 19200336. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:19:48,435][61643] Avg episode reward: [(0, '14.690'), (1, '9.330')] [2023-10-12 17:19:50,693][62635] Updated weights for policy 1, policy_version 37480 (0.0007) [2023-10-12 17:19:51,055][62635] Updated weights for policy 1, policy_version 37490 (0.0007) [2023-10-12 17:19:51,421][62635] Updated weights for policy 1, policy_version 37500 (0.0008) [2023-10-12 17:19:52,158][62634] Updated weights for policy 0, policy_version 37510 (0.0009) [2023-10-12 17:19:52,547][62634] Updated weights for policy 0, policy_version 37520 (0.0010) [2023-10-12 17:19:52,928][62634] Updated weights for policy 0, policy_version 37530 (0.0010) [2023-10-12 17:19:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 76840960. Throughput: 0: 1700.3, 1: 1680.8. Samples: 19210906. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:19:53,436][61643] Avg episode reward: [(0, '15.260'), (1, '9.320')] [2023-10-12 17:19:53,436][62354] Saving new best policy, reward=15.260! [2023-10-12 17:19:55,637][62635] Updated weights for policy 1, policy_version 37510 (0.0009) [2023-10-12 17:19:55,999][62635] Updated weights for policy 1, policy_version 37520 (0.0009) [2023-10-12 17:19:56,369][62635] Updated weights for policy 1, policy_version 37530 (0.0008) [2023-10-12 17:19:56,925][62634] Updated weights for policy 0, policy_version 37540 (0.0008) [2023-10-12 17:19:57,298][62634] Updated weights for policy 0, policy_version 37550 (0.0008) [2023-10-12 17:19:57,677][62634] Updated weights for policy 0, policy_version 37560 (0.0008) [2023-10-12 17:19:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 76906496. Throughput: 0: 1698.1, 1: 1674.0. Samples: 19230816. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:19:58,435][61643] Avg episode reward: [(0, '15.350'), (1, '9.560')] [2023-10-12 17:19:58,436][62354] Saving new best policy, reward=15.350! [2023-10-12 17:20:00,626][62635] Updated weights for policy 1, policy_version 37540 (0.0007) [2023-10-12 17:20:00,993][62635] Updated weights for policy 1, policy_version 37550 (0.0008) [2023-10-12 17:20:01,366][62635] Updated weights for policy 1, policy_version 37560 (0.0008) [2023-10-12 17:20:01,820][62634] Updated weights for policy 0, policy_version 37570 (0.0010) [2023-10-12 17:20:02,195][62634] Updated weights for policy 0, policy_version 37580 (0.0010) [2023-10-12 17:20:02,564][62634] Updated weights for policy 0, policy_version 37590 (0.0010) [2023-10-12 17:20:02,940][62634] Updated weights for policy 0, policy_version 37600 (0.0010) [2023-10-12 17:20:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 76972032. Throughput: 0: 1670.7, 1: 1693.0. Samples: 19250276. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:20:03,435][61643] Avg episode reward: [(0, '15.880'), (1, '9.520')] [2023-10-12 17:20:03,442][62354] Saving new best policy, reward=15.880! [2023-10-12 17:20:05,222][62635] Updated weights for policy 1, policy_version 37570 (0.0007) [2023-10-12 17:20:05,596][62635] Updated weights for policy 1, policy_version 37580 (0.0009) [2023-10-12 17:20:05,980][62635] Updated weights for policy 1, policy_version 37590 (0.0009) [2023-10-12 17:20:06,348][62635] Updated weights for policy 1, policy_version 37600 (0.0008) [2023-10-12 17:20:06,993][62634] Updated weights for policy 0, policy_version 37610 (0.0009) [2023-10-12 17:20:07,376][62634] Updated weights for policy 0, policy_version 37620 (0.0008) [2023-10-12 17:20:07,750][62634] Updated weights for policy 0, policy_version 37630 (0.0009) [2023-10-12 17:20:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 77037568. Throughput: 0: 1701.9, 1: 1671.0. Samples: 19261002. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:20:08,436][61643] Avg episode reward: [(0, '15.730'), (1, '9.430')] [2023-10-12 17:20:10,334][62635] Updated weights for policy 1, policy_version 37610 (0.0009) [2023-10-12 17:20:10,706][62635] Updated weights for policy 1, policy_version 37620 (0.0009) [2023-10-12 17:20:11,070][62635] Updated weights for policy 1, policy_version 37630 (0.0008) [2023-10-12 17:20:11,850][62634] Updated weights for policy 0, policy_version 37640 (0.0010) [2023-10-12 17:20:12,223][62634] Updated weights for policy 0, policy_version 37650 (0.0010) [2023-10-12 17:20:12,600][62634] Updated weights for policy 0, policy_version 37660 (0.0008) [2023-10-12 17:20:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77103104. Throughput: 0: 1689.9, 1: 1678.9. Samples: 19280922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:20:13,436][61643] Avg episode reward: [(0, '15.820'), (1, '9.410')] [2023-10-12 17:20:15,228][62635] Updated weights for policy 1, policy_version 37640 (0.0009) [2023-10-12 17:20:15,589][62635] Updated weights for policy 1, policy_version 37650 (0.0009) [2023-10-12 17:20:15,959][62635] Updated weights for policy 1, policy_version 37660 (0.0010) [2023-10-12 17:20:16,466][62634] Updated weights for policy 0, policy_version 37670 (0.0008) [2023-10-12 17:20:16,844][62634] Updated weights for policy 0, policy_version 37680 (0.0007) [2023-10-12 17:20:17,207][62634] Updated weights for policy 0, policy_version 37690 (0.0010) [2023-10-12 17:20:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77168640. Throughput: 0: 1676.1, 1: 1689.3. Samples: 19301010. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:20:18,436][61643] Avg episode reward: [(0, '15.680'), (1, '9.460')] [2023-10-12 17:20:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000037664_38567936.pth... [2023-10-12 17:20:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000037696_38600704.pth... [2023-10-12 17:20:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000036096_36962304.pth [2023-10-12 17:20:18,489][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000036128_36995072.pth [2023-10-12 17:20:19,941][62635] Updated weights for policy 1, policy_version 37670 (0.0010) [2023-10-12 17:20:20,314][62635] Updated weights for policy 1, policy_version 37680 (0.0009) [2023-10-12 17:20:20,688][62635] Updated weights for policy 1, policy_version 37690 (0.0010) [2023-10-12 17:20:21,322][62634] Updated weights for policy 0, policy_version 37700 (0.0009) [2023-10-12 17:20:21,699][62634] Updated weights for policy 0, policy_version 37710 (0.0007) [2023-10-12 17:20:22,075][62634] Updated weights for policy 0, policy_version 37720 (0.0008) [2023-10-12 17:20:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77234176. Throughput: 0: 1691.8, 1: 1661.0. Samples: 19311248. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:20:23,435][61643] Avg episode reward: [(0, '16.060'), (1, '9.590')] [2023-10-12 17:20:23,436][62354] Saving new best policy, reward=16.060! 
[2023-10-12 17:20:24,793][62635] Updated weights for policy 1, policy_version 37700 (0.0009) [2023-10-12 17:20:25,162][62635] Updated weights for policy 1, policy_version 37710 (0.0008) [2023-10-12 17:20:25,536][62635] Updated weights for policy 1, policy_version 37720 (0.0008) [2023-10-12 17:20:26,261][62634] Updated weights for policy 0, policy_version 37730 (0.0008) [2023-10-12 17:20:26,652][62634] Updated weights for policy 0, policy_version 37740 (0.0009) [2023-10-12 17:20:27,029][62634] Updated weights for policy 0, policy_version 37750 (0.0008) [2023-10-12 17:20:27,408][62634] Updated weights for policy 0, policy_version 37760 (0.0009) [2023-10-12 17:20:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77299712. Throughput: 0: 1662.5, 1: 1679.4. Samples: 19330850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:20:28,435][61643] Avg episode reward: [(0, '16.550'), (1, '9.600')] [2023-10-12 17:20:28,436][62354] Saving new best policy, reward=16.550! [2023-10-12 17:20:29,662][62635] Updated weights for policy 1, policy_version 37730 (0.0008) [2023-10-12 17:20:30,027][62635] Updated weights for policy 1, policy_version 37740 (0.0009) [2023-10-12 17:20:30,404][62635] Updated weights for policy 1, policy_version 37750 (0.0008) [2023-10-12 17:20:30,784][62635] Updated weights for policy 1, policy_version 37760 (0.0008) [2023-10-12 17:20:31,461][62634] Updated weights for policy 0, policy_version 37770 (0.0010) [2023-10-12 17:20:31,838][62634] Updated weights for policy 0, policy_version 37780 (0.0008) [2023-10-12 17:20:32,221][62634] Updated weights for policy 0, policy_version 37790 (0.0008) [2023-10-12 17:20:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77365248. Throughput: 0: 1671.1, 1: 1677.6. Samples: 19351026. 
Policy #0 lag: (min: 26.0, avg: 27.3, max: 48.0) [2023-10-12 17:20:33,436][61643] Avg episode reward: [(0, '16.810'), (1, '9.640')] [2023-10-12 17:20:33,445][62354] Saving new best policy, reward=16.810! [2023-10-12 17:20:34,830][62635] Updated weights for policy 1, policy_version 37770 (0.0008) [2023-10-12 17:20:35,201][62635] Updated weights for policy 1, policy_version 37780 (0.0009) [2023-10-12 17:20:35,569][62635] Updated weights for policy 1, policy_version 37790 (0.0009) [2023-10-12 17:20:36,235][62634] Updated weights for policy 0, policy_version 37800 (0.0009) [2023-10-12 17:20:36,614][62634] Updated weights for policy 0, policy_version 37810 (0.0010) [2023-10-12 17:20:36,986][62634] Updated weights for policy 0, policy_version 37820 (0.0008) [2023-10-12 17:20:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77430784. Throughput: 0: 1681.2, 1: 1662.4. Samples: 19361370. Policy #0 lag: (min: 26.0, avg: 27.3, max: 48.0) [2023-10-12 17:20:38,435][61643] Avg episode reward: [(0, '16.700'), (1, '9.670')] [2023-10-12 17:20:39,726][62635] Updated weights for policy 1, policy_version 37800 (0.0010) [2023-10-12 17:20:40,096][62635] Updated weights for policy 1, policy_version 37810 (0.0007) [2023-10-12 17:20:40,463][62635] Updated weights for policy 1, policy_version 37820 (0.0007) [2023-10-12 17:20:40,963][62634] Updated weights for policy 0, policy_version 37830 (0.0008) [2023-10-12 17:20:41,341][62634] Updated weights for policy 0, policy_version 37840 (0.0008) [2023-10-12 17:20:41,722][62634] Updated weights for policy 0, policy_version 37850 (0.0010) [2023-10-12 17:20:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77496320. Throughput: 0: 1656.9, 1: 1673.8. Samples: 19380698. 
Policy #0 lag: (min: 26.0, avg: 27.3, max: 48.0) [2023-10-12 17:20:43,435][61643] Avg episode reward: [(0, '16.770'), (1, '9.810')] [2023-10-12 17:20:44,553][62635] Updated weights for policy 1, policy_version 37830 (0.0009) [2023-10-12 17:20:44,921][62635] Updated weights for policy 1, policy_version 37840 (0.0008) [2023-10-12 17:20:45,291][62635] Updated weights for policy 1, policy_version 37850 (0.0008) [2023-10-12 17:20:45,766][62634] Updated weights for policy 0, policy_version 37860 (0.0009) [2023-10-12 17:20:46,139][62634] Updated weights for policy 0, policy_version 37870 (0.0008) [2023-10-12 17:20:46,507][62634] Updated weights for policy 0, policy_version 37880 (0.0008) [2023-10-12 17:20:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77561856. Throughput: 0: 1679.9, 1: 1673.6. Samples: 19401186. Policy #0 lag: (min: 26.0, avg: 27.3, max: 48.0) [2023-10-12 17:20:48,436][61643] Avg episode reward: [(0, '16.930'), (1, '9.900')] [2023-10-12 17:20:48,447][62354] Saving new best policy, reward=16.930! [2023-10-12 17:20:49,487][62635] Updated weights for policy 1, policy_version 37860 (0.0009) [2023-10-12 17:20:49,846][62635] Updated weights for policy 1, policy_version 37870 (0.0009) [2023-10-12 17:20:50,211][62635] Updated weights for policy 1, policy_version 37880 (0.0007) [2023-10-12 17:20:50,659][62634] Updated weights for policy 0, policy_version 37890 (0.0010) [2023-10-12 17:20:51,037][62634] Updated weights for policy 0, policy_version 37900 (0.0008) [2023-10-12 17:20:51,421][62634] Updated weights for policy 0, policy_version 37910 (0.0008) [2023-10-12 17:20:51,788][62634] Updated weights for policy 0, policy_version 37920 (0.0009) [2023-10-12 17:20:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77627392. Throughput: 0: 1674.0, 1: 1663.4. Samples: 19411184. 
Policy #0 lag: (min: 26.0, avg: 27.3, max: 48.0) [2023-10-12 17:20:53,435][61643] Avg episode reward: [(0, '16.750'), (1, '9.580')] [2023-10-12 17:20:54,347][62635] Updated weights for policy 1, policy_version 37890 (0.0009) [2023-10-12 17:20:54,708][62635] Updated weights for policy 1, policy_version 37900 (0.0007) [2023-10-12 17:20:55,070][62635] Updated weights for policy 1, policy_version 37910 (0.0008) [2023-10-12 17:20:55,439][62635] Updated weights for policy 1, policy_version 37920 (0.0008) [2023-10-12 17:20:55,793][62634] Updated weights for policy 0, policy_version 37930 (0.0009) [2023-10-12 17:20:56,169][62634] Updated weights for policy 0, policy_version 37940 (0.0009) [2023-10-12 17:20:56,545][62634] Updated weights for policy 0, policy_version 37950 (0.0007) [2023-10-12 17:20:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77692928. Throughput: 0: 1661.5, 1: 1670.6. Samples: 19430866. Policy #0 lag: (min: 26.0, avg: 27.3, max: 48.0) [2023-10-12 17:20:58,435][61643] Avg episode reward: [(0, '16.850'), (1, '9.710')] [2023-10-12 17:20:59,661][62635] Updated weights for policy 1, policy_version 37930 (0.0007) [2023-10-12 17:21:00,027][62635] Updated weights for policy 1, policy_version 37940 (0.0007) [2023-10-12 17:21:00,399][62635] Updated weights for policy 1, policy_version 37950 (0.0008) [2023-10-12 17:21:00,503][62634] Updated weights for policy 0, policy_version 37960 (0.0009) [2023-10-12 17:21:00,891][62634] Updated weights for policy 0, policy_version 37970 (0.0011) [2023-10-12 17:21:01,268][62634] Updated weights for policy 0, policy_version 37980 (0.0008) [2023-10-12 17:21:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77758464. Throughput: 0: 1682.4, 1: 1665.0. Samples: 19451640. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 20.0) [2023-10-12 17:21:03,435][61643] Avg episode reward: [(0, '16.790'), (1, '9.690')] [2023-10-12 17:21:04,499][62635] Updated weights for policy 1, policy_version 37960 (0.0009) [2023-10-12 17:21:04,876][62635] Updated weights for policy 1, policy_version 37970 (0.0007) [2023-10-12 17:21:05,253][62635] Updated weights for policy 1, policy_version 37980 (0.0007) [2023-10-12 17:21:05,319][62634] Updated weights for policy 0, policy_version 37990 (0.0009) [2023-10-12 17:21:05,693][62634] Updated weights for policy 0, policy_version 38000 (0.0010) [2023-10-12 17:21:06,077][62634] Updated weights for policy 0, policy_version 38010 (0.0009) [2023-10-12 17:21:08,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77824000. Throughput: 0: 1664.6, 1: 1669.4. Samples: 19461276. Policy #0 lag: (min: 18.0, avg: 18.0, max: 20.0) [2023-10-12 17:21:08,436][61643] Avg episode reward: [(0, '17.180'), (1, '9.680')] [2023-10-12 17:21:08,438][62354] Saving new best policy, reward=17.180! [2023-10-12 17:21:09,114][62635] Updated weights for policy 1, policy_version 37990 (0.0008) [2023-10-12 17:21:09,481][62635] Updated weights for policy 1, policy_version 38000 (0.0008) [2023-10-12 17:21:09,854][62635] Updated weights for policy 1, policy_version 38010 (0.0008) [2023-10-12 17:21:10,127][62634] Updated weights for policy 0, policy_version 38020 (0.0008) [2023-10-12 17:21:10,507][62634] Updated weights for policy 0, policy_version 38030 (0.0007) [2023-10-12 17:21:10,881][62634] Updated weights for policy 0, policy_version 38040 (0.0007) [2023-10-12 17:21:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77889536. Throughput: 0: 1675.2, 1: 1677.5. Samples: 19481724. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 20.0) [2023-10-12 17:21:13,435][61643] Avg episode reward: [(0, '17.350'), (1, '9.510')] [2023-10-12 17:21:13,436][62354] Saving new best policy, reward=17.350! [2023-10-12 17:21:13,933][62635] Updated weights for policy 1, policy_version 38020 (0.0009) [2023-10-12 17:21:14,310][62635] Updated weights for policy 1, policy_version 38030 (0.0010) [2023-10-12 17:21:14,671][62635] Updated weights for policy 1, policy_version 38040 (0.0008) [2023-10-12 17:21:14,957][62634] Updated weights for policy 0, policy_version 38050 (0.0008) [2023-10-12 17:21:15,350][62634] Updated weights for policy 0, policy_version 38060 (0.0008) [2023-10-12 17:21:15,725][62634] Updated weights for policy 0, policy_version 38070 (0.0007) [2023-10-12 17:21:16,109][62634] Updated weights for policy 0, policy_version 38080 (0.0007) [2023-10-12 17:21:18,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77955072. Throughput: 0: 1682.5, 1: 1674.6. Samples: 19502094. Policy #0 lag: (min: 18.0, avg: 18.0, max: 20.0) [2023-10-12 17:21:18,436][61643] Avg episode reward: [(0, '17.510'), (1, '9.460')] [2023-10-12 17:21:18,445][62354] Saving new best policy, reward=17.510! [2023-10-12 17:21:18,841][62635] Updated weights for policy 1, policy_version 38050 (0.0009) [2023-10-12 17:21:19,208][62635] Updated weights for policy 1, policy_version 38060 (0.0007) [2023-10-12 17:21:19,587][62635] Updated weights for policy 1, policy_version 38070 (0.0010) [2023-10-12 17:21:19,950][62635] Updated weights for policy 1, policy_version 38080 (0.0008) [2023-10-12 17:21:20,215][62634] Updated weights for policy 0, policy_version 38090 (0.0008) [2023-10-12 17:21:20,600][62634] Updated weights for policy 0, policy_version 38100 (0.0010) [2023-10-12 17:21:20,980][62634] Updated weights for policy 0, policy_version 38110 (0.0009) [2023-10-12 17:21:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). 
Total num frames: 78020608. Throughput: 0: 1658.2, 1: 1674.6. Samples: 19511346. Policy #0 lag: (min: 18.0, avg: 18.0, max: 20.0) [2023-10-12 17:21:23,435][61643] Avg episode reward: [(0, '17.400'), (1, '9.670')] [2023-10-12 17:21:23,825][62635] Updated weights for policy 1, policy_version 38090 (0.0009) [2023-10-12 17:21:24,187][62635] Updated weights for policy 1, policy_version 38100 (0.0008) [2023-10-12 17:21:24,559][62635] Updated weights for policy 1, policy_version 38110 (0.0007) [2023-10-12 17:21:25,004][62634] Updated weights for policy 0, policy_version 38120 (0.0010) [2023-10-12 17:21:25,392][62634] Updated weights for policy 0, policy_version 38130 (0.0008) [2023-10-12 17:21:25,765][62634] Updated weights for policy 0, policy_version 38140 (0.0008) [2023-10-12 17:21:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78086144. Throughput: 0: 1680.7, 1: 1682.8. Samples: 19532054. Policy #0 lag: (min: 18.0, avg: 18.0, max: 20.0) [2023-10-12 17:21:28,436][61643] Avg episode reward: [(0, '17.160'), (1, '9.710')] [2023-10-12 17:21:28,630][62635] Updated weights for policy 1, policy_version 38120 (0.0008) [2023-10-12 17:21:28,998][62635] Updated weights for policy 1, policy_version 38130 (0.0009) [2023-10-12 17:21:29,371][62635] Updated weights for policy 1, policy_version 38140 (0.0009) [2023-10-12 17:21:29,727][62634] Updated weights for policy 0, policy_version 38150 (0.0009) [2023-10-12 17:21:30,098][62634] Updated weights for policy 0, policy_version 38160 (0.0010) [2023-10-12 17:21:30,468][62634] Updated weights for policy 0, policy_version 38170 (0.0010) [2023-10-12 17:21:33,417][62635] Updated weights for policy 1, policy_version 38150 (0.0008) [2023-10-12 17:21:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78151680. Throughput: 0: 1688.7, 1: 1684.2. Samples: 19552966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:21:33,435][61643] Avg episode reward: [(0, '17.200'), (1, '9.860')] [2023-10-12 17:21:33,783][62635] Updated weights for policy 1, policy_version 38160 (0.0010) [2023-10-12 17:21:34,154][62635] Updated weights for policy 1, policy_version 38170 (0.0008) [2023-10-12 17:21:34,607][62634] Updated weights for policy 0, policy_version 38180 (0.0007) [2023-10-12 17:21:34,992][62634] Updated weights for policy 0, policy_version 38190 (0.0007) [2023-10-12 17:21:35,361][62634] Updated weights for policy 0, policy_version 38200 (0.0008) [2023-10-12 17:21:38,235][62635] Updated weights for policy 1, policy_version 38180 (0.0007) [2023-10-12 17:21:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78217216. Throughput: 0: 1667.3, 1: 1690.7. Samples: 19562294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:21:38,436][61643] Avg episode reward: [(0, '17.420'), (1, '9.690')] [2023-10-12 17:21:38,606][62635] Updated weights for policy 1, policy_version 38190 (0.0007) [2023-10-12 17:21:38,975][62635] Updated weights for policy 1, policy_version 38200 (0.0008) [2023-10-12 17:21:39,497][62634] Updated weights for policy 0, policy_version 38210 (0.0008) [2023-10-12 17:21:39,873][62634] Updated weights for policy 0, policy_version 38220 (0.0011) [2023-10-12 17:21:40,252][62634] Updated weights for policy 0, policy_version 38230 (0.0008) [2023-10-12 17:21:40,624][62634] Updated weights for policy 0, policy_version 38240 (0.0007) [2023-10-12 17:21:43,076][62635] Updated weights for policy 1, policy_version 38210 (0.0008) [2023-10-12 17:21:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78282752. Throughput: 0: 1690.4, 1: 1695.0. Samples: 19583208. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:21:43,435][61643] Avg episode reward: [(0, '17.290'), (1, '9.610')] [2023-10-12 17:21:43,437][62635] Updated weights for policy 1, policy_version 38220 (0.0011) [2023-10-12 17:21:43,804][62635] Updated weights for policy 1, policy_version 38230 (0.0010) [2023-10-12 17:21:44,179][62635] Updated weights for policy 1, policy_version 38240 (0.0007) [2023-10-12 17:21:44,596][62634] Updated weights for policy 0, policy_version 38250 (0.0010) [2023-10-12 17:21:44,969][62634] Updated weights for policy 0, policy_version 38260 (0.0009) [2023-10-12 17:21:45,352][62634] Updated weights for policy 0, policy_version 38270 (0.0008) [2023-10-12 17:21:48,318][62635] Updated weights for policy 1, policy_version 38250 (0.0008) [2023-10-12 17:21:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78348288. Throughput: 0: 1688.0, 1: 1692.0. Samples: 19603740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:21:48,435][61643] Avg episode reward: [(0, '17.290'), (1, '10.040')] [2023-10-12 17:21:48,692][62635] Updated weights for policy 1, policy_version 38260 (0.0009) [2023-10-12 17:21:49,062][62635] Updated weights for policy 1, policy_version 38270 (0.0009) [2023-10-12 17:21:49,389][62634] Updated weights for policy 0, policy_version 38280 (0.0010) [2023-10-12 17:21:49,765][62634] Updated weights for policy 0, policy_version 38290 (0.0010) [2023-10-12 17:21:50,135][62634] Updated weights for policy 0, policy_version 38300 (0.0010) [2023-10-12 17:21:52,990][62635] Updated weights for policy 1, policy_version 38280 (0.0007) [2023-10-12 17:21:53,355][62635] Updated weights for policy 1, policy_version 38290 (0.0007) [2023-10-12 17:21:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78413824. Throughput: 0: 1677.8, 1: 1690.5. Samples: 19612850. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:21:53,435][61643] Avg episode reward: [(0, '18.120'), (1, '9.790')] [2023-10-12 17:21:53,436][62354] Saving new best policy, reward=18.120! [2023-10-12 17:21:53,724][62635] Updated weights for policy 1, policy_version 38300 (0.0010) [2023-10-12 17:21:53,987][62634] Updated weights for policy 0, policy_version 38310 (0.0009) [2023-10-12 17:21:54,363][62634] Updated weights for policy 0, policy_version 38320 (0.0011) [2023-10-12 17:21:54,744][62634] Updated weights for policy 0, policy_version 38330 (0.0007) [2023-10-12 17:21:57,866][62635] Updated weights for policy 1, policy_version 38310 (0.0008) [2023-10-12 17:21:58,235][62635] Updated weights for policy 1, policy_version 38320 (0.0007) [2023-10-12 17:21:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78479360. Throughput: 0: 1690.8, 1: 1684.6. Samples: 19633618. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:21:58,435][61643] Avg episode reward: [(0, '18.400'), (1, '9.670')] [2023-10-12 17:21:58,436][62354] Saving new best policy, reward=18.400! [2023-10-12 17:21:58,607][62635] Updated weights for policy 1, policy_version 38330 (0.0010) [2023-10-12 17:21:58,834][62634] Updated weights for policy 0, policy_version 38340 (0.0010) [2023-10-12 17:21:59,218][62634] Updated weights for policy 0, policy_version 38350 (0.0011) [2023-10-12 17:21:59,592][62634] Updated weights for policy 0, policy_version 38360 (0.0007) [2023-10-12 17:22:02,516][62635] Updated weights for policy 1, policy_version 38340 (0.0008) [2023-10-12 17:22:02,881][62635] Updated weights for policy 1, policy_version 38350 (0.0010) [2023-10-12 17:22:03,246][62635] Updated weights for policy 1, policy_version 38360 (0.0009) [2023-10-12 17:22:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78544896. Throughput: 0: 1696.4, 1: 1677.8. Samples: 19653930. 
Policy #0 lag: (min: 16.0, avg: 32.3, max: 48.0) [2023-10-12 17:22:03,435][61643] Avg episode reward: [(0, '18.380'), (1, '9.420')] [2023-10-12 17:22:03,732][62634] Updated weights for policy 0, policy_version 38370 (0.0010) [2023-10-12 17:22:04,123][62634] Updated weights for policy 0, policy_version 38380 (0.0010) [2023-10-12 17:22:04,506][62634] Updated weights for policy 0, policy_version 38390 (0.0010) [2023-10-12 17:22:04,880][62634] Updated weights for policy 0, policy_version 38400 (0.0008) [2023-10-12 17:22:07,271][62635] Updated weights for policy 1, policy_version 38370 (0.0009) [2023-10-12 17:22:07,631][62635] Updated weights for policy 1, policy_version 38380 (0.0008) [2023-10-12 17:22:07,997][62635] Updated weights for policy 1, policy_version 38390 (0.0007) [2023-10-12 17:22:08,368][62635] Updated weights for policy 1, policy_version 38400 (0.0008) [2023-10-12 17:22:08,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 78643200. Throughput: 0: 1692.5, 1: 1697.2. Samples: 19663884. Policy #0 lag: (min: 16.0, avg: 32.3, max: 48.0) [2023-10-12 17:22:08,436][61643] Avg episode reward: [(0, '18.580'), (1, '9.130')] [2023-10-12 17:22:08,860][62634] Updated weights for policy 0, policy_version 38410 (0.0007) [2023-10-12 17:22:09,244][62634] Updated weights for policy 0, policy_version 38420 (0.0008) [2023-10-12 17:22:09,612][62634] Updated weights for policy 0, policy_version 38430 (0.0010) [2023-10-12 17:22:09,685][62354] Saving new best policy, reward=18.580! [2023-10-12 17:22:12,205][62635] Updated weights for policy 1, policy_version 38410 (0.0008) [2023-10-12 17:22:12,581][62635] Updated weights for policy 1, policy_version 38420 (0.0008) [2023-10-12 17:22:12,950][62635] Updated weights for policy 1, policy_version 38430 (0.0008) [2023-10-12 17:22:13,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 78708736. Throughput: 0: 1694.1, 1: 1694.5. Samples: 19684542. 
Policy #0 lag: (min: 16.0, avg: 32.3, max: 48.0) [2023-10-12 17:22:13,435][61643] Avg episode reward: [(0, '18.740'), (1, '9.060')] [2023-10-12 17:22:13,602][62634] Updated weights for policy 0, policy_version 38440 (0.0009) [2023-10-12 17:22:13,985][62634] Updated weights for policy 0, policy_version 38450 (0.0008) [2023-10-12 17:22:14,358][62634] Updated weights for policy 0, policy_version 38460 (0.0008) [2023-10-12 17:22:14,510][62354] Saving new best policy, reward=18.740! [2023-10-12 17:22:17,078][62635] Updated weights for policy 1, policy_version 38440 (0.0008) [2023-10-12 17:22:17,450][62635] Updated weights for policy 1, policy_version 38450 (0.0009) [2023-10-12 17:22:17,828][62635] Updated weights for policy 1, policy_version 38460 (0.0007) [2023-10-12 17:22:18,351][62634] Updated weights for policy 0, policy_version 38470 (0.0007) [2023-10-12 17:22:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 78774272. Throughput: 0: 1696.4, 1: 1672.4. Samples: 19704562. Policy #0 lag: (min: 16.0, avg: 32.3, max: 48.0) [2023-10-12 17:22:18,436][61643] Avg episode reward: [(0, '18.770'), (1, '9.040')] [2023-10-12 17:22:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000038464_39387136.pth... [2023-10-12 17:22:18,481][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000036896_37781504.pth [2023-10-12 17:22:18,734][62634] Updated weights for policy 0, policy_version 38480 (0.0007) [2023-10-12 17:22:19,120][62634] Updated weights for policy 0, policy_version 38490 (0.0008) [2023-10-12 17:22:19,339][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000038496_39419904.pth... [2023-10-12 17:22:19,377][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000036896_37781504.pth [2023-10-12 17:22:19,381][62354] Saving new best policy, reward=18.770! 
[2023-10-12 17:22:22,010][62635] Updated weights for policy 1, policy_version 38470 (0.0009) [2023-10-12 17:22:22,383][62635] Updated weights for policy 1, policy_version 38480 (0.0008) [2023-10-12 17:22:22,751][62635] Updated weights for policy 1, policy_version 38490 (0.0008) [2023-10-12 17:22:23,065][62634] Updated weights for policy 0, policy_version 38500 (0.0009) [2023-10-12 17:22:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 78839808. Throughput: 0: 1694.2, 1: 1697.9. Samples: 19714938. Policy #0 lag: (min: 16.0, avg: 32.3, max: 48.0) [2023-10-12 17:22:23,435][61643] Avg episode reward: [(0, '18.980'), (1, '8.940')] [2023-10-12 17:22:23,442][62634] Updated weights for policy 0, policy_version 38510 (0.0009) [2023-10-12 17:22:23,815][62634] Updated weights for policy 0, policy_version 38520 (0.0011) [2023-10-12 17:22:24,114][62354] Saving new best policy, reward=18.980! [2023-10-12 17:22:26,788][62635] Updated weights for policy 1, policy_version 38500 (0.0009) [2023-10-12 17:22:27,152][62635] Updated weights for policy 1, policy_version 38510 (0.0009) [2023-10-12 17:22:27,520][62635] Updated weights for policy 1, policy_version 38520 (0.0010) [2023-10-12 17:22:27,865][62634] Updated weights for policy 0, policy_version 38530 (0.0011) [2023-10-12 17:22:28,249][62634] Updated weights for policy 0, policy_version 38540 (0.0010) [2023-10-12 17:22:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 78905344. Throughput: 0: 1697.5, 1: 1683.5. Samples: 19735352. 
Policy #0 lag: (min: 16.0, avg: 32.3, max: 48.0) [2023-10-12 17:22:28,435][61643] Avg episode reward: [(0, '18.970'), (1, '8.960')] [2023-10-12 17:22:28,623][62634] Updated weights for policy 0, policy_version 38550 (0.0007) [2023-10-12 17:22:28,997][62634] Updated weights for policy 0, policy_version 38560 (0.0008) [2023-10-12 17:22:31,715][62635] Updated weights for policy 1, policy_version 38530 (0.0009) [2023-10-12 17:22:32,081][62635] Updated weights for policy 1, policy_version 38540 (0.0008) [2023-10-12 17:22:32,459][62635] Updated weights for policy 1, policy_version 38550 (0.0008) [2023-10-12 17:22:32,823][62635] Updated weights for policy 1, policy_version 38560 (0.0008) [2023-10-12 17:22:32,979][62634] Updated weights for policy 0, policy_version 38570 (0.0009) [2023-10-12 17:22:33,356][62634] Updated weights for policy 0, policy_version 38580 (0.0009) [2023-10-12 17:22:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 78970880. Throughput: 0: 1689.2, 1: 1665.0. Samples: 19754680. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:22:33,436][61643] Avg episode reward: [(0, '18.990'), (1, '9.080')] [2023-10-12 17:22:33,730][62634] Updated weights for policy 0, policy_version 38590 (0.0009) [2023-10-12 17:22:33,807][62354] Saving new best policy, reward=18.990! [2023-10-12 17:22:37,064][62635] Updated weights for policy 1, policy_version 38570 (0.0009) [2023-10-12 17:22:37,441][62635] Updated weights for policy 1, policy_version 38580 (0.0008) [2023-10-12 17:22:37,798][62634] Updated weights for policy 0, policy_version 38600 (0.0008) [2023-10-12 17:22:37,812][62635] Updated weights for policy 1, policy_version 38590 (0.0009) [2023-10-12 17:22:38,177][62634] Updated weights for policy 0, policy_version 38610 (0.0009) [2023-10-12 17:22:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 79036416. Throughput: 0: 1700.9, 1: 1690.0. Samples: 19765444. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:22:38,435][61643] Avg episode reward: [(0, '19.270'), (1, '9.410')] [2023-10-12 17:22:38,556][62634] Updated weights for policy 0, policy_version 38620 (0.0009) [2023-10-12 17:22:38,706][62354] Saving new best policy, reward=19.270! [2023-10-12 17:22:41,819][62635] Updated weights for policy 1, policy_version 38600 (0.0008) [2023-10-12 17:22:42,181][62635] Updated weights for policy 1, policy_version 38610 (0.0009) [2023-10-12 17:22:42,488][62634] Updated weights for policy 0, policy_version 38630 (0.0009) [2023-10-12 17:22:42,560][62635] Updated weights for policy 1, policy_version 38620 (0.0008) [2023-10-12 17:22:42,876][62634] Updated weights for policy 0, policy_version 38640 (0.0011) [2023-10-12 17:22:43,248][62634] Updated weights for policy 0, policy_version 38650 (0.0008) [2023-10-12 17:22:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79101952. Throughput: 0: 1702.1, 1: 1678.0. Samples: 19785720. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:22:43,436][61643] Avg episode reward: [(0, '18.950'), (1, '9.530')] [2023-10-12 17:22:46,691][62635] Updated weights for policy 1, policy_version 38630 (0.0009) [2023-10-12 17:22:47,066][62635] Updated weights for policy 1, policy_version 38640 (0.0008) [2023-10-12 17:22:47,320][62634] Updated weights for policy 0, policy_version 38660 (0.0010) [2023-10-12 17:22:47,436][62635] Updated weights for policy 1, policy_version 38650 (0.0007) [2023-10-12 17:22:47,698][62634] Updated weights for policy 0, policy_version 38670 (0.0009) [2023-10-12 17:22:48,071][62634] Updated weights for policy 0, policy_version 38680 (0.0010) [2023-10-12 17:22:48,435][61643] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 79200256. Throughput: 0: 1682.3, 1: 1673.0. Samples: 19804916. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:22:48,436][61643] Avg episode reward: [(0, '19.100'), (1, '9.580')] [2023-10-12 17:22:51,410][62635] Updated weights for policy 1, policy_version 38660 (0.0009) [2023-10-12 17:22:51,781][62635] Updated weights for policy 1, policy_version 38670 (0.0009) [2023-10-12 17:22:52,024][62634] Updated weights for policy 0, policy_version 38690 (0.0009) [2023-10-12 17:22:52,147][62635] Updated weights for policy 1, policy_version 38680 (0.0007) [2023-10-12 17:22:52,423][62634] Updated weights for policy 0, policy_version 38700 (0.0007) [2023-10-12 17:22:52,806][62634] Updated weights for policy 0, policy_version 38710 (0.0007) [2023-10-12 17:22:53,175][62634] Updated weights for policy 0, policy_version 38720 (0.0008) [2023-10-12 17:22:53,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 79265792. Throughput: 0: 1698.5, 1: 1683.3. Samples: 19816064. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:22:53,435][61643] Avg episode reward: [(0, '19.100'), (1, '9.480')] [2023-10-12 17:22:56,388][62635] Updated weights for policy 1, policy_version 38690 (0.0009) [2023-10-12 17:22:56,747][62635] Updated weights for policy 1, policy_version 38700 (0.0007) [2023-10-12 17:22:57,111][62635] Updated weights for policy 1, policy_version 38710 (0.0008) [2023-10-12 17:22:57,184][62634] Updated weights for policy 0, policy_version 38730 (0.0008) [2023-10-12 17:22:57,480][62635] Updated weights for policy 1, policy_version 38720 (0.0008) [2023-10-12 17:22:57,553][62634] Updated weights for policy 0, policy_version 38740 (0.0007) [2023-10-12 17:22:57,935][62634] Updated weights for policy 0, policy_version 38750 (0.0007) [2023-10-12 17:22:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 79331328. Throughput: 0: 1696.0, 1: 1662.0. Samples: 19835652. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 17:22:58,436][61643] Avg episode reward: [(0, '18.980'), (1, '9.740')] [2023-10-12 17:23:01,595][62635] Updated weights for policy 1, policy_version 38730 (0.0011) [2023-10-12 17:23:01,950][62635] Updated weights for policy 1, policy_version 38740 (0.0008) [2023-10-12 17:23:02,037][62634] Updated weights for policy 0, policy_version 38760 (0.0007) [2023-10-12 17:23:02,319][62635] Updated weights for policy 1, policy_version 38750 (0.0009) [2023-10-12 17:23:02,414][62634] Updated weights for policy 0, policy_version 38770 (0.0007) [2023-10-12 17:23:02,794][62634] Updated weights for policy 0, policy_version 38780 (0.0008) [2023-10-12 17:23:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 79396864. Throughput: 0: 1666.0, 1: 1669.5. Samples: 19854660. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 17:23:03,436][61643] Avg episode reward: [(0, '18.900'), (1, '9.800')] [2023-10-12 17:23:06,408][62635] Updated weights for policy 1, policy_version 38760 (0.0010) [2023-10-12 17:23:06,771][62635] Updated weights for policy 1, policy_version 38770 (0.0007) [2023-10-12 17:23:06,928][62634] Updated weights for policy 0, policy_version 38790 (0.0009) [2023-10-12 17:23:07,137][62635] Updated weights for policy 1, policy_version 38780 (0.0007) [2023-10-12 17:23:07,304][62634] Updated weights for policy 0, policy_version 38800 (0.0010) [2023-10-12 17:23:07,684][62634] Updated weights for policy 0, policy_version 38810 (0.0010) [2023-10-12 17:23:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79462400. Throughput: 0: 1693.8, 1: 1668.2. Samples: 19866230. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 17:23:08,436][61643] Avg episode reward: [(0, '19.140'), (1, '9.610')] [2023-10-12 17:23:10,975][62635] Updated weights for policy 1, policy_version 38790 (0.0007) [2023-10-12 17:23:11,341][62635] Updated weights for policy 1, policy_version 38800 (0.0009) [2023-10-12 17:23:11,708][62635] Updated weights for policy 1, policy_version 38810 (0.0009) [2023-10-12 17:23:11,770][62634] Updated weights for policy 0, policy_version 38820 (0.0010) [2023-10-12 17:23:12,146][62634] Updated weights for policy 0, policy_version 38830 (0.0008) [2023-10-12 17:23:12,520][62634] Updated weights for policy 0, policy_version 38840 (0.0007) [2023-10-12 17:23:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79527936. Throughput: 0: 1681.6, 1: 1654.9. Samples: 19885494. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 17:23:13,435][61643] Avg episode reward: [(0, '19.160'), (1, '9.800')] [2023-10-12 17:23:15,737][62635] Updated weights for policy 1, policy_version 38820 (0.0008) [2023-10-12 17:23:16,106][62635] Updated weights for policy 1, policy_version 38830 (0.0008) [2023-10-12 17:23:16,479][62635] Updated weights for policy 1, policy_version 38840 (0.0008) [2023-10-12 17:23:16,636][62634] Updated weights for policy 0, policy_version 38850 (0.0009) [2023-10-12 17:23:17,008][62634] Updated weights for policy 0, policy_version 38860 (0.0008) [2023-10-12 17:23:17,384][62634] Updated weights for policy 0, policy_version 38870 (0.0009) [2023-10-12 17:23:17,763][62634] Updated weights for policy 0, policy_version 38880 (0.0009) [2023-10-12 17:23:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79593472. Throughput: 0: 1664.4, 1: 1679.5. Samples: 19905152. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 17:23:18,436][61643] Avg episode reward: [(0, '19.210'), (1, '9.730')] [2023-10-12 17:23:20,692][62635] Updated weights for policy 1, policy_version 38850 (0.0007) [2023-10-12 17:23:21,054][62635] Updated weights for policy 1, policy_version 38860 (0.0008) [2023-10-12 17:23:21,430][62635] Updated weights for policy 1, policy_version 38870 (0.0007) [2023-10-12 17:23:21,783][62634] Updated weights for policy 0, policy_version 38890 (0.0007) [2023-10-12 17:23:21,794][62635] Updated weights for policy 1, policy_version 38880 (0.0009) [2023-10-12 17:23:22,157][62634] Updated weights for policy 0, policy_version 38900 (0.0008) [2023-10-12 17:23:22,535][62634] Updated weights for policy 0, policy_version 38910 (0.0009) [2023-10-12 17:23:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79659008. Throughput: 0: 1680.7, 1: 1671.5. Samples: 19916290. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-12 17:23:23,435][61643] Avg episode reward: [(0, '19.490'), (1, '9.580')] [2023-10-12 17:23:23,436][62354] Saving new best policy, reward=19.490! [2023-10-12 17:23:25,906][62635] Updated weights for policy 1, policy_version 38890 (0.0008) [2023-10-12 17:23:26,270][62635] Updated weights for policy 1, policy_version 38900 (0.0007) [2023-10-12 17:23:26,589][62634] Updated weights for policy 0, policy_version 38920 (0.0009) [2023-10-12 17:23:26,625][62635] Updated weights for policy 1, policy_version 38910 (0.0009) [2023-10-12 17:23:26,960][62634] Updated weights for policy 0, policy_version 38930 (0.0009) [2023-10-12 17:23:27,345][62634] Updated weights for policy 0, policy_version 38940 (0.0010) [2023-10-12 17:23:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79724544. Throughput: 0: 1666.4, 1: 1662.9. Samples: 19935538. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 17:23:28,435][61643] Avg episode reward: [(0, '19.450'), (1, '9.560')] [2023-10-12 17:23:30,638][62635] Updated weights for policy 1, policy_version 38920 (0.0008) [2023-10-12 17:23:31,012][62635] Updated weights for policy 1, policy_version 38930 (0.0009) [2023-10-12 17:23:31,388][62635] Updated weights for policy 1, policy_version 38940 (0.0010) [2023-10-12 17:23:31,521][62634] Updated weights for policy 0, policy_version 38950 (0.0007) [2023-10-12 17:23:31,889][62634] Updated weights for policy 0, policy_version 38960 (0.0008) [2023-10-12 17:23:32,268][62634] Updated weights for policy 0, policy_version 38970 (0.0009) [2023-10-12 17:23:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 79790080. Throughput: 0: 1662.5, 1: 1675.6. Samples: 19955128. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 17:23:33,436][61643] Avg episode reward: [(0, '19.260'), (1, '9.660')] [2023-10-12 17:23:35,571][62635] Updated weights for policy 1, policy_version 38950 (0.0008) [2023-10-12 17:23:35,941][62635] Updated weights for policy 1, policy_version 38960 (0.0007) [2023-10-12 17:23:36,299][62634] Updated weights for policy 0, policy_version 38980 (0.0011) [2023-10-12 17:23:36,306][62635] Updated weights for policy 1, policy_version 38970 (0.0008) [2023-10-12 17:23:36,676][62634] Updated weights for policy 0, policy_version 38990 (0.0008) [2023-10-12 17:23:37,046][62634] Updated weights for policy 0, policy_version 39000 (0.0009) [2023-10-12 17:23:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79855616. Throughput: 0: 1675.0, 1: 1658.7. Samples: 19966080. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 17:23:38,436][61643] Avg episode reward: [(0, '19.280'), (1, '9.510')] [2023-10-12 17:23:40,422][62635] Updated weights for policy 1, policy_version 38980 (0.0008) [2023-10-12 17:23:40,787][62635] Updated weights for policy 1, policy_version 38990 (0.0009) [2023-10-12 17:23:41,083][62634] Updated weights for policy 0, policy_version 39010 (0.0010) [2023-10-12 17:23:41,156][62635] Updated weights for policy 1, policy_version 39000 (0.0008) [2023-10-12 17:23:41,477][62634] Updated weights for policy 0, policy_version 39020 (0.0009) [2023-10-12 17:23:41,856][62634] Updated weights for policy 0, policy_version 39030 (0.0009) [2023-10-12 17:23:42,236][62634] Updated weights for policy 0, policy_version 39040 (0.0007) [2023-10-12 17:23:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79921152. Throughput: 0: 1655.9, 1: 1666.7. Samples: 19985168. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 17:23:43,436][61643] Avg episode reward: [(0, '19.560'), (1, '9.520')] [2023-10-12 17:23:43,437][62354] Saving new best policy, reward=19.560! [2023-10-12 17:23:45,339][62635] Updated weights for policy 1, policy_version 39010 (0.0009) [2023-10-12 17:23:45,708][62635] Updated weights for policy 1, policy_version 39020 (0.0008) [2023-10-12 17:23:46,084][62635] Updated weights for policy 1, policy_version 39030 (0.0008) [2023-10-12 17:23:46,388][62634] Updated weights for policy 0, policy_version 39050 (0.0008) [2023-10-12 17:23:46,439][62635] Updated weights for policy 1, policy_version 39040 (0.0008) [2023-10-12 17:23:46,759][62634] Updated weights for policy 0, policy_version 39060 (0.0007) [2023-10-12 17:23:47,148][62634] Updated weights for policy 0, policy_version 39070 (0.0007) [2023-10-12 17:23:48,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 79986688. Throughput: 0: 1674.4, 1: 1680.0. Samples: 20005606. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 17:23:48,435][61643] Avg episode reward: [(0, '19.530'), (1, '9.790')] [2023-10-12 17:23:50,362][62635] Updated weights for policy 1, policy_version 39050 (0.0009) [2023-10-12 17:23:50,720][62635] Updated weights for policy 1, policy_version 39060 (0.0008) [2023-10-12 17:23:51,095][62635] Updated weights for policy 1, policy_version 39070 (0.0009) [2023-10-12 17:23:51,295][62634] Updated weights for policy 0, policy_version 39080 (0.0007) [2023-10-12 17:23:51,670][62634] Updated weights for policy 0, policy_version 39090 (0.0007) [2023-10-12 17:23:52,052][62634] Updated weights for policy 0, policy_version 39100 (0.0010) [2023-10-12 17:23:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80052224. Throughput: 0: 1676.8, 1: 1655.4. Samples: 20016176. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-12 17:23:53,436][61643] Avg episode reward: [(0, '19.560'), (1, '9.560')] [2023-10-12 17:23:55,116][62635] Updated weights for policy 1, policy_version 39080 (0.0009) [2023-10-12 17:23:55,483][62635] Updated weights for policy 1, policy_version 39090 (0.0009) [2023-10-12 17:23:55,848][62635] Updated weights for policy 1, policy_version 39100 (0.0008) [2023-10-12 17:23:56,092][62634] Updated weights for policy 0, policy_version 39110 (0.0009) [2023-10-12 17:23:56,476][62634] Updated weights for policy 0, policy_version 39120 (0.0009) [2023-10-12 17:23:56,842][62634] Updated weights for policy 0, policy_version 39130 (0.0008) [2023-10-12 17:23:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80117760. Throughput: 0: 1662.5, 1: 1678.1. Samples: 20035820. Policy #0 lag: (min: 20.0, avg: 20.3, max: 32.0) [2023-10-12 17:23:58,435][61643] Avg episode reward: [(0, '19.830'), (1, '9.410')] [2023-10-12 17:23:58,436][62354] Saving new best policy, reward=19.830! 
[2023-10-12 17:23:59,781][62635] Updated weights for policy 1, policy_version 39110 (0.0007) [2023-10-12 17:24:00,150][62635] Updated weights for policy 1, policy_version 39120 (0.0007) [2023-10-12 17:24:00,520][62635] Updated weights for policy 1, policy_version 39130 (0.0008) [2023-10-12 17:24:00,818][62634] Updated weights for policy 0, policy_version 39140 (0.0009) [2023-10-12 17:24:01,182][62634] Updated weights for policy 0, policy_version 39150 (0.0010) [2023-10-12 17:24:01,564][62634] Updated weights for policy 0, policy_version 39160 (0.0008) [2023-10-12 17:24:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80183296. Throughput: 0: 1680.9, 1: 1687.0. Samples: 20056710. Policy #0 lag: (min: 20.0, avg: 20.3, max: 32.0) [2023-10-12 17:24:03,435][61643] Avg episode reward: [(0, '19.830'), (1, '9.390')] [2023-10-12 17:24:04,532][62635] Updated weights for policy 1, policy_version 39140 (0.0008) [2023-10-12 17:24:04,889][62635] Updated weights for policy 1, policy_version 39150 (0.0010) [2023-10-12 17:24:05,262][62635] Updated weights for policy 1, policy_version 39160 (0.0007) [2023-10-12 17:24:05,575][62634] Updated weights for policy 0, policy_version 39170 (0.0008) [2023-10-12 17:24:05,955][62634] Updated weights for policy 0, policy_version 39180 (0.0009) [2023-10-12 17:24:06,331][62634] Updated weights for policy 0, policy_version 39190 (0.0007) [2023-10-12 17:24:06,709][62634] Updated weights for policy 0, policy_version 39200 (0.0008) [2023-10-12 17:24:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80248832. Throughput: 0: 1674.3, 1: 1669.1. Samples: 20066742. Policy #0 lag: (min: 20.0, avg: 20.3, max: 32.0) [2023-10-12 17:24:08,436][61643] Avg episode reward: [(0, '19.910'), (1, '9.250')] [2023-10-12 17:24:08,437][62354] Saving new best policy, reward=19.910! 
[2023-10-12 17:24:09,345][62635] Updated weights for policy 1, policy_version 39170 (0.0008) [2023-10-12 17:24:09,719][62635] Updated weights for policy 1, policy_version 39180 (0.0010) [2023-10-12 17:24:10,083][62635] Updated weights for policy 1, policy_version 39190 (0.0010) [2023-10-12 17:24:10,457][62635] Updated weights for policy 1, policy_version 39200 (0.0009) [2023-10-12 17:24:10,837][62634] Updated weights for policy 0, policy_version 39210 (0.0008) [2023-10-12 17:24:11,221][62634] Updated weights for policy 0, policy_version 39220 (0.0008) [2023-10-12 17:24:11,589][62634] Updated weights for policy 0, policy_version 39230 (0.0008) [2023-10-12 17:24:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80314368. Throughput: 0: 1665.2, 1: 1689.4. Samples: 20086496. Policy #0 lag: (min: 20.0, avg: 20.3, max: 32.0) [2023-10-12 17:24:13,435][61643] Avg episode reward: [(0, '19.850'), (1, '9.350')] [2023-10-12 17:24:14,486][62635] Updated weights for policy 1, policy_version 39210 (0.0007) [2023-10-12 17:24:14,852][62635] Updated weights for policy 1, policy_version 39220 (0.0009) [2023-10-12 17:24:15,216][62635] Updated weights for policy 1, policy_version 39230 (0.0007) [2023-10-12 17:24:15,614][62634] Updated weights for policy 0, policy_version 39240 (0.0009) [2023-10-12 17:24:16,001][62634] Updated weights for policy 0, policy_version 39250 (0.0007) [2023-10-12 17:24:16,370][62634] Updated weights for policy 0, policy_version 39260 (0.0007) [2023-10-12 17:24:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80379904. Throughput: 0: 1687.9, 1: 1692.8. Samples: 20107258. Policy #0 lag: (min: 20.0, avg: 20.3, max: 32.0) [2023-10-12 17:24:18,436][61643] Avg episode reward: [(0, '19.670'), (1, '9.450')] [2023-10-12 17:24:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000039232_40173568.pth... 
[2023-10-12 17:24:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000039264_40206336.pth... [2023-10-12 17:24:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000037664_38567936.pth [2023-10-12 17:24:18,488][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000039232_40173568.pth [2023-10-12 17:24:18,489][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000037696_38600704.pth [2023-10-12 17:24:18,494][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000039264_40206336.pth [2023-10-12 17:24:19,455][62635] Updated weights for policy 1, policy_version 39240 (0.0008) [2023-10-12 17:24:19,827][62635] Updated weights for policy 1, policy_version 39250 (0.0008) [2023-10-12 17:24:20,190][62635] Updated weights for policy 1, policy_version 39260 (0.0008) [2023-10-12 17:24:20,428][62634] Updated weights for policy 0, policy_version 39270 (0.0007) [2023-10-12 17:24:20,812][62634] Updated weights for policy 0, policy_version 39280 (0.0007) [2023-10-12 17:24:21,192][62634] Updated weights for policy 0, policy_version 39290 (0.0010) [2023-10-12 17:24:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80445440. Throughput: 0: 1669.7, 1: 1678.0. Samples: 20116730. 
Policy #0 lag: (min: 20.0, avg: 20.3, max: 32.0) [2023-10-12 17:24:23,435][61643] Avg episode reward: [(0, '19.340'), (1, '9.570')] [2023-10-12 17:24:24,123][62635] Updated weights for policy 1, policy_version 39270 (0.0009) [2023-10-12 17:24:24,490][62635] Updated weights for policy 1, policy_version 39280 (0.0007) [2023-10-12 17:24:24,861][62635] Updated weights for policy 1, policy_version 39290 (0.0008) [2023-10-12 17:24:25,145][62634] Updated weights for policy 0, policy_version 39300 (0.0008) [2023-10-12 17:24:25,520][62634] Updated weights for policy 0, policy_version 39310 (0.0010) [2023-10-12 17:24:25,894][62634] Updated weights for policy 0, policy_version 39320 (0.0010) [2023-10-12 17:24:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80510976. Throughput: 0: 1681.5, 1: 1692.1. Samples: 20136982. Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 17:24:28,436][61643] Avg episode reward: [(0, '19.210'), (1, '9.700')] [2023-10-12 17:24:28,979][62635] Updated weights for policy 1, policy_version 39300 (0.0007) [2023-10-12 17:24:29,348][62635] Updated weights for policy 1, policy_version 39310 (0.0007) [2023-10-12 17:24:29,718][62635] Updated weights for policy 1, policy_version 39320 (0.0007) [2023-10-12 17:24:29,855][62634] Updated weights for policy 0, policy_version 39330 (0.0009) [2023-10-12 17:24:30,249][62634] Updated weights for policy 0, policy_version 39340 (0.0007) [2023-10-12 17:24:30,614][62634] Updated weights for policy 0, policy_version 39350 (0.0007) [2023-10-12 17:24:30,988][62634] Updated weights for policy 0, policy_version 39360 (0.0010) [2023-10-12 17:24:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80576512. Throughput: 0: 1688.9, 1: 1694.2. Samples: 20157844. 
Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 17:24:33,435][61643] Avg episode reward: [(0, '19.180'), (1, '9.950')] [2023-10-12 17:24:33,697][62635] Updated weights for policy 1, policy_version 39330 (0.0008) [2023-10-12 17:24:34,062][62635] Updated weights for policy 1, policy_version 39340 (0.0010) [2023-10-12 17:24:34,441][62635] Updated weights for policy 1, policy_version 39350 (0.0010) [2023-10-12 17:24:34,814][62635] Updated weights for policy 1, policy_version 39360 (0.0009) [2023-10-12 17:24:35,080][62634] Updated weights for policy 0, policy_version 39370 (0.0007) [2023-10-12 17:24:35,457][62634] Updated weights for policy 0, policy_version 39380 (0.0011) [2023-10-12 17:24:35,832][62634] Updated weights for policy 0, policy_version 39390 (0.0011) [2023-10-12 17:24:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 80642048. Throughput: 0: 1656.5, 1: 1690.1. Samples: 20166772. Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 17:24:38,436][61643] Avg episode reward: [(0, '19.400'), (1, '9.920')] [2023-10-12 17:24:39,063][62635] Updated weights for policy 1, policy_version 39370 (0.0009) [2023-10-12 17:24:39,433][62635] Updated weights for policy 1, policy_version 39380 (0.0008) [2023-10-12 17:24:39,796][62634] Updated weights for policy 0, policy_version 39400 (0.0008) [2023-10-12 17:24:39,801][62635] Updated weights for policy 1, policy_version 39390 (0.0011) [2023-10-12 17:24:40,175][62634] Updated weights for policy 0, policy_version 39410 (0.0007) [2023-10-12 17:24:40,561][62634] Updated weights for policy 0, policy_version 39420 (0.0007) [2023-10-12 17:24:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80707584. Throughput: 0: 1679.0, 1: 1692.5. Samples: 20187536. 
Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 17:24:43,436][61643] Avg episode reward: [(0, '19.250'), (1, '9.780')] [2023-10-12 17:24:43,900][62635] Updated weights for policy 1, policy_version 39400 (0.0009) [2023-10-12 17:24:44,269][62635] Updated weights for policy 1, policy_version 39410 (0.0007) [2023-10-12 17:24:44,596][62634] Updated weights for policy 0, policy_version 39430 (0.0009) [2023-10-12 17:24:44,635][62635] Updated weights for policy 1, policy_version 39420 (0.0007) [2023-10-12 17:24:44,970][62634] Updated weights for policy 0, policy_version 39440 (0.0009) [2023-10-12 17:24:45,344][62634] Updated weights for policy 0, policy_version 39450 (0.0009) [2023-10-12 17:24:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 80773120. Throughput: 0: 1692.7, 1: 1679.1. Samples: 20208446. Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 17:24:48,436][61643] Avg episode reward: [(0, '19.350'), (1, '9.870')] [2023-10-12 17:24:48,916][62635] Updated weights for policy 1, policy_version 39430 (0.0008) [2023-10-12 17:24:49,289][62635] Updated weights for policy 1, policy_version 39440 (0.0007) [2023-10-12 17:24:49,392][62634] Updated weights for policy 0, policy_version 39460 (0.0008) [2023-10-12 17:24:49,664][62635] Updated weights for policy 1, policy_version 39450 (0.0008) [2023-10-12 17:24:49,772][62634] Updated weights for policy 0, policy_version 39470 (0.0009) [2023-10-12 17:24:50,148][62634] Updated weights for policy 0, policy_version 39480 (0.0007) [2023-10-12 17:24:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80838656. Throughput: 0: 1668.5, 1: 1679.0. Samples: 20217376. 
Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-12 17:24:53,435][61643] Avg episode reward: [(0, '19.500'), (1, '9.880')] [2023-10-12 17:24:53,714][62635] Updated weights for policy 1, policy_version 39460 (0.0009) [2023-10-12 17:24:54,084][62635] Updated weights for policy 1, policy_version 39470 (0.0007) [2023-10-12 17:24:54,301][62634] Updated weights for policy 0, policy_version 39490 (0.0008) [2023-10-12 17:24:54,442][62635] Updated weights for policy 1, policy_version 39480 (0.0007) [2023-10-12 17:24:54,674][62634] Updated weights for policy 0, policy_version 39500 (0.0008) [2023-10-12 17:24:55,056][62634] Updated weights for policy 0, policy_version 39510 (0.0010) [2023-10-12 17:24:55,437][62634] Updated weights for policy 0, policy_version 39520 (0.0010) [2023-10-12 17:24:58,215][62635] Updated weights for policy 1, policy_version 39490 (0.0007) [2023-10-12 17:24:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80904192. Throughput: 0: 1686.6, 1: 1680.6. Samples: 20238022. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:24:58,436][61643] Avg episode reward: [(0, '19.790'), (1, '9.920')] [2023-10-12 17:24:58,584][62635] Updated weights for policy 1, policy_version 39500 (0.0008) [2023-10-12 17:24:58,955][62635] Updated weights for policy 1, policy_version 39510 (0.0007) [2023-10-12 17:24:59,315][62635] Updated weights for policy 1, policy_version 39520 (0.0007) [2023-10-12 17:24:59,505][62634] Updated weights for policy 0, policy_version 39530 (0.0008) [2023-10-12 17:24:59,873][62634] Updated weights for policy 0, policy_version 39540 (0.0009) [2023-10-12 17:25:00,245][62634] Updated weights for policy 0, policy_version 39550 (0.0010) [2023-10-12 17:25:03,151][62635] Updated weights for policy 1, policy_version 39530 (0.0009) [2023-10-12 17:25:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80969728. Throughput: 0: 1683.5, 1: 1678.9. 
Samples: 20258562. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:25:03,435][61643] Avg episode reward: [(0, '19.240'), (1, '10.100')] [2023-10-12 17:25:03,521][62635] Updated weights for policy 1, policy_version 39540 (0.0009) [2023-10-12 17:25:03,882][62635] Updated weights for policy 1, policy_version 39550 (0.0007) [2023-10-12 17:25:03,954][62495] Saving new best policy, reward=10.100! [2023-10-12 17:25:04,257][62634] Updated weights for policy 0, policy_version 39560 (0.0008) [2023-10-12 17:25:04,631][62634] Updated weights for policy 0, policy_version 39570 (0.0008) [2023-10-12 17:25:04,995][62634] Updated weights for policy 0, policy_version 39580 (0.0009) [2023-10-12 17:25:08,058][62635] Updated weights for policy 1, policy_version 39560 (0.0007) [2023-10-12 17:25:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81035264. Throughput: 0: 1669.8, 1: 1688.8. Samples: 20267868. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:25:08,435][61643] Avg episode reward: [(0, '19.170'), (1, '9.860')] [2023-10-12 17:25:08,437][62635] Updated weights for policy 1, policy_version 39570 (0.0007) [2023-10-12 17:25:08,799][62635] Updated weights for policy 1, policy_version 39580 (0.0007) [2023-10-12 17:25:09,084][62634] Updated weights for policy 0, policy_version 39590 (0.0009) [2023-10-12 17:25:09,467][62634] Updated weights for policy 0, policy_version 39600 (0.0007) [2023-10-12 17:25:09,840][62634] Updated weights for policy 0, policy_version 39610 (0.0010) [2023-10-12 17:25:12,863][62635] Updated weights for policy 1, policy_version 39590 (0.0008) [2023-10-12 17:25:13,229][62635] Updated weights for policy 1, policy_version 39600 (0.0010) [2023-10-12 17:25:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81100800. Throughput: 0: 1681.6, 1: 1687.7. Samples: 20288600. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:25:13,436][61643] Avg episode reward: [(0, '19.110'), (1, '9.820')] [2023-10-12 17:25:13,598][62635] Updated weights for policy 1, policy_version 39610 (0.0010) [2023-10-12 17:25:13,916][62634] Updated weights for policy 0, policy_version 39620 (0.0010) [2023-10-12 17:25:14,289][62634] Updated weights for policy 0, policy_version 39630 (0.0008) [2023-10-12 17:25:14,662][62634] Updated weights for policy 0, policy_version 39640 (0.0009) [2023-10-12 17:25:17,649][62635] Updated weights for policy 1, policy_version 39620 (0.0008) [2023-10-12 17:25:18,008][62635] Updated weights for policy 1, policy_version 39630 (0.0008) [2023-10-12 17:25:18,385][62635] Updated weights for policy 1, policy_version 39640 (0.0010) [2023-10-12 17:25:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 81166336. Throughput: 0: 1679.8, 1: 1677.9. Samples: 20308938. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:25:18,436][61643] Avg episode reward: [(0, '19.270'), (1, '9.990')] [2023-10-12 17:25:18,778][62634] Updated weights for policy 0, policy_version 39650 (0.0007) [2023-10-12 17:25:19,179][62634] Updated weights for policy 0, policy_version 39660 (0.0009) [2023-10-12 17:25:19,561][62634] Updated weights for policy 0, policy_version 39670 (0.0007) [2023-10-12 17:25:19,935][62634] Updated weights for policy 0, policy_version 39680 (0.0007) [2023-10-12 17:25:22,368][62635] Updated weights for policy 1, policy_version 39650 (0.0008) [2023-10-12 17:25:22,728][62635] Updated weights for policy 1, policy_version 39660 (0.0007) [2023-10-12 17:25:23,091][62635] Updated weights for policy 1, policy_version 39670 (0.0007) [2023-10-12 17:25:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81231872. Throughput: 0: 1683.1, 1: 1694.5. Samples: 20318760. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:25:23,435][61643] Avg episode reward: [(0, '19.420'), (1, '9.800')] [2023-10-12 17:25:23,467][62635] Updated weights for policy 1, policy_version 39680 (0.0009) [2023-10-12 17:25:23,865][62634] Updated weights for policy 0, policy_version 39690 (0.0007) [2023-10-12 17:25:24,259][62634] Updated weights for policy 0, policy_version 39700 (0.0010) [2023-10-12 17:25:24,633][62634] Updated weights for policy 0, policy_version 39710 (0.0008) [2023-10-12 17:25:27,561][62635] Updated weights for policy 1, policy_version 39690 (0.0008) [2023-10-12 17:25:27,937][62635] Updated weights for policy 1, policy_version 39700 (0.0009) [2023-10-12 17:25:28,302][62635] Updated weights for policy 1, policy_version 39710 (0.0007) [2023-10-12 17:25:28,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81330176. Throughput: 0: 1687.6, 1: 1690.4. Samples: 20339544. Policy #0 lag: (min: 14.0, avg: 15.7, max: 42.0) [2023-10-12 17:25:28,436][61643] Avg episode reward: [(0, '19.440'), (1, '9.790')] [2023-10-12 17:25:28,751][62634] Updated weights for policy 0, policy_version 39720 (0.0008) [2023-10-12 17:25:29,128][62634] Updated weights for policy 0, policy_version 39730 (0.0007) [2023-10-12 17:25:29,503][62634] Updated weights for policy 0, policy_version 39740 (0.0007) [2023-10-12 17:25:32,326][62635] Updated weights for policy 1, policy_version 39720 (0.0008) [2023-10-12 17:25:32,697][62635] Updated weights for policy 1, policy_version 39730 (0.0009) [2023-10-12 17:25:33,067][62635] Updated weights for policy 1, policy_version 39740 (0.0007) [2023-10-12 17:25:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81395712. Throughput: 0: 1678.9, 1: 1669.9. Samples: 20359142. 
Policy #0 lag: (min: 14.0, avg: 15.7, max: 42.0) [2023-10-12 17:25:33,435][61643] Avg episode reward: [(0, '19.630'), (1, '9.820')] [2023-10-12 17:25:33,532][62634] Updated weights for policy 0, policy_version 39750 (0.0008) [2023-10-12 17:25:33,901][62634] Updated weights for policy 0, policy_version 39760 (0.0008) [2023-10-12 17:25:34,280][62634] Updated weights for policy 0, policy_version 39770 (0.0009) [2023-10-12 17:25:37,163][62635] Updated weights for policy 1, policy_version 39750 (0.0008) [2023-10-12 17:25:37,530][62635] Updated weights for policy 1, policy_version 39760 (0.0008) [2023-10-12 17:25:37,893][62635] Updated weights for policy 1, policy_version 39770 (0.0008) [2023-10-12 17:25:38,325][62634] Updated weights for policy 0, policy_version 39780 (0.0009) [2023-10-12 17:25:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 81461248. Throughput: 0: 1680.6, 1: 1694.8. Samples: 20369268. Policy #0 lag: (min: 14.0, avg: 15.7, max: 42.0) [2023-10-12 17:25:38,435][61643] Avg episode reward: [(0, '19.470'), (1, '9.790')] [2023-10-12 17:25:38,706][62634] Updated weights for policy 0, policy_version 39790 (0.0008) [2023-10-12 17:25:39,077][62634] Updated weights for policy 0, policy_version 39800 (0.0007) [2023-10-12 17:25:41,934][62635] Updated weights for policy 1, policy_version 39780 (0.0008) [2023-10-12 17:25:42,301][62635] Updated weights for policy 1, policy_version 39790 (0.0008) [2023-10-12 17:25:42,667][62635] Updated weights for policy 1, policy_version 39800 (0.0009) [2023-10-12 17:25:43,115][62634] Updated weights for policy 0, policy_version 39810 (0.0010) [2023-10-12 17:25:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81526784. Throughput: 0: 1679.7, 1: 1688.8. Samples: 20389606. 
Policy #0 lag: (min: 14.0, avg: 15.7, max: 42.0) [2023-10-12 17:25:43,436][61643] Avg episode reward: [(0, '19.430'), (1, '9.740')] [2023-10-12 17:25:43,491][62634] Updated weights for policy 0, policy_version 39820 (0.0011) [2023-10-12 17:25:43,866][62634] Updated weights for policy 0, policy_version 39830 (0.0011) [2023-10-12 17:25:44,247][62634] Updated weights for policy 0, policy_version 39840 (0.0009) [2023-10-12 17:25:46,804][62635] Updated weights for policy 1, policy_version 39810 (0.0009) [2023-10-12 17:25:47,171][62635] Updated weights for policy 1, policy_version 39820 (0.0008) [2023-10-12 17:25:47,536][62635] Updated weights for policy 1, policy_version 39830 (0.0009) [2023-10-12 17:25:47,916][62635] Updated weights for policy 1, policy_version 39840 (0.0010) [2023-10-12 17:25:48,325][62634] Updated weights for policy 0, policy_version 39850 (0.0009) [2023-10-12 17:25:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 81592320. Throughput: 0: 1684.0, 1: 1669.8. Samples: 20409484. Policy #0 lag: (min: 14.0, avg: 15.7, max: 42.0) [2023-10-12 17:25:48,435][61643] Avg episode reward: [(0, '19.790'), (1, '9.700')] [2023-10-12 17:25:48,718][62634] Updated weights for policy 0, policy_version 39860 (0.0010) [2023-10-12 17:25:49,083][62634] Updated weights for policy 0, policy_version 39870 (0.0007) [2023-10-12 17:25:52,017][62635] Updated weights for policy 1, policy_version 39850 (0.0010) [2023-10-12 17:25:52,378][62635] Updated weights for policy 1, policy_version 39860 (0.0010) [2023-10-12 17:25:52,752][62635] Updated weights for policy 1, policy_version 39870 (0.0009) [2023-10-12 17:25:53,108][62634] Updated weights for policy 0, policy_version 39880 (0.0007) [2023-10-12 17:25:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81657856. Throughput: 0: 1683.3, 1: 1689.5. Samples: 20419642. 
Policy #0 lag: (min: 14.0, avg: 15.7, max: 42.0) [2023-10-12 17:25:53,435][61643] Avg episode reward: [(0, '19.720'), (1, '9.830')] [2023-10-12 17:25:53,476][62634] Updated weights for policy 0, policy_version 39890 (0.0008) [2023-10-12 17:25:53,854][62634] Updated weights for policy 0, policy_version 39900 (0.0009) [2023-10-12 17:25:57,022][62635] Updated weights for policy 1, policy_version 39880 (0.0010) [2023-10-12 17:25:57,401][62635] Updated weights for policy 1, policy_version 39890 (0.0008) [2023-10-12 17:25:57,776][62635] Updated weights for policy 1, policy_version 39900 (0.0008) [2023-10-12 17:25:57,949][62634] Updated weights for policy 0, policy_version 39910 (0.0007) [2023-10-12 17:25:58,316][62634] Updated weights for policy 0, policy_version 39920 (0.0007) [2023-10-12 17:25:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81723392. Throughput: 0: 1684.9, 1: 1676.8. Samples: 20439876. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:25:58,436][61643] Avg episode reward: [(0, '19.820'), (1, '9.600')] [2023-10-12 17:25:58,701][62634] Updated weights for policy 0, policy_version 39930 (0.0008) [2023-10-12 17:26:01,836][62635] Updated weights for policy 1, policy_version 39910 (0.0008) [2023-10-12 17:26:02,201][62635] Updated weights for policy 1, policy_version 39920 (0.0007) [2023-10-12 17:26:02,565][62634] Updated weights for policy 0, policy_version 39940 (0.0008) [2023-10-12 17:26:02,575][62635] Updated weights for policy 1, policy_version 39930 (0.0007) [2023-10-12 17:26:02,946][62634] Updated weights for policy 0, policy_version 39950 (0.0007) [2023-10-12 17:26:03,319][62634] Updated weights for policy 0, policy_version 39960 (0.0007) [2023-10-12 17:26:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81788928. Throughput: 0: 1679.3, 1: 1660.3. Samples: 20459218. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:26:03,435][61643] Avg episode reward: [(0, '19.650'), (1, '9.690')] [2023-10-12 17:26:06,699][62635] Updated weights for policy 1, policy_version 39940 (0.0008) [2023-10-12 17:26:07,069][62635] Updated weights for policy 1, policy_version 39950 (0.0010) [2023-10-12 17:26:07,431][62634] Updated weights for policy 0, policy_version 39970 (0.0009) [2023-10-12 17:26:07,435][62635] Updated weights for policy 1, policy_version 39960 (0.0009) [2023-10-12 17:26:07,827][62634] Updated weights for policy 0, policy_version 39980 (0.0009) [2023-10-12 17:26:08,203][62634] Updated weights for policy 0, policy_version 39990 (0.0009) [2023-10-12 17:26:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81854464. Throughput: 0: 1693.0, 1: 1672.0. Samples: 20470186. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:26:08,436][61643] Avg episode reward: [(0, '19.570'), (1, '9.890')] [2023-10-12 17:26:08,591][62634] Updated weights for policy 0, policy_version 40000 (0.0009) [2023-10-12 17:26:11,635][62635] Updated weights for policy 1, policy_version 39970 (0.0007) [2023-10-12 17:26:12,008][62635] Updated weights for policy 1, policy_version 39980 (0.0007) [2023-10-12 17:26:12,368][62635] Updated weights for policy 1, policy_version 39990 (0.0007) [2023-10-12 17:26:12,726][62634] Updated weights for policy 0, policy_version 40010 (0.0009) [2023-10-12 17:26:12,735][62635] Updated weights for policy 1, policy_version 40000 (0.0007) [2023-10-12 17:26:13,106][62634] Updated weights for policy 0, policy_version 40020 (0.0009) [2023-10-12 17:26:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 81920000. Throughput: 0: 1683.2, 1: 1666.7. Samples: 20490286. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:26:13,435][61643] Avg episode reward: [(0, '20.040'), (1, '9.740')] [2023-10-12 17:26:13,486][62634] Updated weights for policy 0, policy_version 40030 (0.0009) [2023-10-12 17:26:13,554][62354] Saving new best policy, reward=20.040! [2023-10-12 17:26:16,656][62635] Updated weights for policy 1, policy_version 40010 (0.0009) [2023-10-12 17:26:17,024][62635] Updated weights for policy 1, policy_version 40020 (0.0008) [2023-10-12 17:26:17,382][62635] Updated weights for policy 1, policy_version 40030 (0.0007) [2023-10-12 17:26:17,440][62634] Updated weights for policy 0, policy_version 40040 (0.0008) [2023-10-12 17:26:17,820][62634] Updated weights for policy 0, policy_version 40050 (0.0008) [2023-10-12 17:26:18,195][62634] Updated weights for policy 0, policy_version 40060 (0.0008) [2023-10-12 17:26:18,435][61643] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 82018304. Throughput: 0: 1665.2, 1: 1674.1. Samples: 20509414. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 17:26:18,435][61643] Avg episode reward: [(0, '20.040'), (1, '9.860')] [2023-10-12 17:26:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000040064_41025536.pth... [2023-10-12 17:26:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000040032_40992768.pth... 
[2023-10-12 17:26:18,479][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000038464_39387136.pth [2023-10-12 17:26:18,481][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000038496_39419904.pth [2023-10-12 17:26:21,489][62635] Updated weights for policy 1, policy_version 40040 (0.0010) [2023-10-12 17:26:21,858][62635] Updated weights for policy 1, policy_version 40050 (0.0010) [2023-10-12 17:26:22,232][62635] Updated weights for policy 1, policy_version 40060 (0.0010) [2023-10-12 17:26:22,363][62634] Updated weights for policy 0, policy_version 40070 (0.0008) [2023-10-12 17:26:22,730][62634] Updated weights for policy 0, policy_version 40080 (0.0007) [2023-10-12 17:26:23,102][62634] Updated weights for policy 0, policy_version 40090 (0.0008) [2023-10-12 17:26:23,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 82083840. Throughput: 0: 1684.5, 1: 1678.1. Samples: 20520584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:26:23,435][61643] Avg episode reward: [(0, '19.840'), (1, '10.070')] [2023-10-12 17:26:26,358][62635] Updated weights for policy 1, policy_version 40070 (0.0009) [2023-10-12 17:26:26,727][62635] Updated weights for policy 1, policy_version 40080 (0.0010) [2023-10-12 17:26:27,105][62635] Updated weights for policy 1, policy_version 40090 (0.0010) [2023-10-12 17:26:27,285][62634] Updated weights for policy 0, policy_version 40100 (0.0008) [2023-10-12 17:26:27,657][62634] Updated weights for policy 0, policy_version 40110 (0.0008) [2023-10-12 17:26:28,041][62634] Updated weights for policy 0, policy_version 40120 (0.0009) [2023-10-12 17:26:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 82149376. Throughput: 0: 1688.4, 1: 1662.8. Samples: 20540410. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:26:28,435][61643] Avg episode reward: [(0, '19.620'), (1, '9.870')] [2023-10-12 17:26:31,097][62635] Updated weights for policy 1, policy_version 40100 (0.0009) [2023-10-12 17:26:31,470][62635] Updated weights for policy 1, policy_version 40110 (0.0010) [2023-10-12 17:26:31,836][62635] Updated weights for policy 1, policy_version 40120 (0.0009) [2023-10-12 17:26:31,877][62634] Updated weights for policy 0, policy_version 40130 (0.0010) [2023-10-12 17:26:32,241][62634] Updated weights for policy 0, policy_version 40140 (0.0011) [2023-10-12 17:26:32,617][62634] Updated weights for policy 0, policy_version 40150 (0.0008) [2023-10-12 17:26:32,998][62634] Updated weights for policy 0, policy_version 40160 (0.0008) [2023-10-12 17:26:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82214912. Throughput: 0: 1659.6, 1: 1676.0. Samples: 20559584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:26:33,435][61643] Avg episode reward: [(0, '19.630'), (1, '9.950')] [2023-10-12 17:26:35,982][62635] Updated weights for policy 1, policy_version 40130 (0.0009) [2023-10-12 17:26:36,355][62635] Updated weights for policy 1, policy_version 40140 (0.0008) [2023-10-12 17:26:36,724][62635] Updated weights for policy 1, policy_version 40150 (0.0008) [2023-10-12 17:26:37,083][62635] Updated weights for policy 1, policy_version 40160 (0.0007) [2023-10-12 17:26:37,199][62634] Updated weights for policy 0, policy_version 40170 (0.0010) [2023-10-12 17:26:37,577][62634] Updated weights for policy 0, policy_version 40180 (0.0007) [2023-10-12 17:26:37,960][62634] Updated weights for policy 0, policy_version 40190 (0.0009) [2023-10-12 17:26:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82280448. Throughput: 0: 1684.4, 1: 1673.6. Samples: 20570756. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:26:38,435][61643] Avg episode reward: [(0, '19.820'), (1, '9.830')] [2023-10-12 17:26:41,067][62635] Updated weights for policy 1, policy_version 40170 (0.0007) [2023-10-12 17:26:41,428][62635] Updated weights for policy 1, policy_version 40180 (0.0007) [2023-10-12 17:26:41,798][62635] Updated weights for policy 1, policy_version 40190 (0.0007) [2023-10-12 17:26:42,087][62634] Updated weights for policy 0, policy_version 40200 (0.0007) [2023-10-12 17:26:42,475][62634] Updated weights for policy 0, policy_version 40210 (0.0007) [2023-10-12 17:26:42,851][62634] Updated weights for policy 0, policy_version 40220 (0.0009) [2023-10-12 17:26:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82345984. Throughput: 0: 1676.8, 1: 1660.4. Samples: 20590050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:26:43,436][61643] Avg episode reward: [(0, '20.000'), (1, '9.760')] [2023-10-12 17:26:46,068][62635] Updated weights for policy 1, policy_version 40200 (0.0007) [2023-10-12 17:26:46,443][62635] Updated weights for policy 1, policy_version 40210 (0.0007) [2023-10-12 17:26:46,808][62635] Updated weights for policy 1, policy_version 40220 (0.0007) [2023-10-12 17:26:46,877][62634] Updated weights for policy 0, policy_version 40230 (0.0007) [2023-10-12 17:26:47,256][62634] Updated weights for policy 0, policy_version 40240 (0.0011) [2023-10-12 17:26:47,624][62634] Updated weights for policy 0, policy_version 40250 (0.0008) [2023-10-12 17:26:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82411520. Throughput: 0: 1652.8, 1: 1684.8. Samples: 20609406. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:26:48,435][61643] Avg episode reward: [(0, '19.980'), (1, '9.940')] [2023-10-12 17:26:50,697][62635] Updated weights for policy 1, policy_version 40230 (0.0008) [2023-10-12 17:26:51,067][62635] Updated weights for policy 1, policy_version 40240 (0.0008) [2023-10-12 17:26:51,439][62635] Updated weights for policy 1, policy_version 40250 (0.0008) [2023-10-12 17:26:51,801][62634] Updated weights for policy 0, policy_version 40260 (0.0008) [2023-10-12 17:26:52,197][62634] Updated weights for policy 0, policy_version 40270 (0.0008) [2023-10-12 17:26:52,571][62634] Updated weights for policy 0, policy_version 40280 (0.0009) [2023-10-12 17:26:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82477056. Throughput: 0: 1666.6, 1: 1670.8. Samples: 20620370. Policy #0 lag: (min: 2.0, avg: 9.2, max: 34.0) [2023-10-12 17:26:53,435][61643] Avg episode reward: [(0, '20.060'), (1, '9.940')] [2023-10-12 17:26:53,436][62354] Saving new best policy, reward=20.060! [2023-10-12 17:26:55,346][62635] Updated weights for policy 1, policy_version 40260 (0.0007) [2023-10-12 17:26:55,717][62635] Updated weights for policy 1, policy_version 40270 (0.0007) [2023-10-12 17:26:56,078][62635] Updated weights for policy 1, policy_version 40280 (0.0007) [2023-10-12 17:26:56,707][62634] Updated weights for policy 0, policy_version 40290 (0.0007) [2023-10-12 17:26:57,082][62634] Updated weights for policy 0, policy_version 40300 (0.0009) [2023-10-12 17:26:57,458][62634] Updated weights for policy 0, policy_version 40310 (0.0007) [2023-10-12 17:26:57,838][62634] Updated weights for policy 0, policy_version 40320 (0.0009) [2023-10-12 17:26:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 82542592. Throughput: 0: 1664.2, 1: 1665.0. Samples: 20640102. 
Policy #0 lag: (min: 2.0, avg: 9.2, max: 34.0) [2023-10-12 17:26:58,435][61643] Avg episode reward: [(0, '19.970'), (1, '9.770')] [2023-10-12 17:27:00,143][62635] Updated weights for policy 1, policy_version 40290 (0.0008) [2023-10-12 17:27:00,509][62635] Updated weights for policy 1, policy_version 40300 (0.0009) [2023-10-12 17:27:00,877][62635] Updated weights for policy 1, policy_version 40310 (0.0007) [2023-10-12 17:27:01,240][62635] Updated weights for policy 1, policy_version 40320 (0.0007) [2023-10-12 17:27:01,816][62634] Updated weights for policy 0, policy_version 40330 (0.0007) [2023-10-12 17:27:02,183][62634] Updated weights for policy 0, policy_version 40340 (0.0007) [2023-10-12 17:27:02,566][62634] Updated weights for policy 0, policy_version 40350 (0.0007) [2023-10-12 17:27:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82608128. Throughput: 0: 1665.5, 1: 1685.5. Samples: 20660208. Policy #0 lag: (min: 2.0, avg: 9.2, max: 34.0) [2023-10-12 17:27:03,436][61643] Avg episode reward: [(0, '20.150'), (1, '9.890')] [2023-10-12 17:27:03,444][62354] Saving new best policy, reward=20.150! [2023-10-12 17:27:05,275][62635] Updated weights for policy 1, policy_version 40330 (0.0007) [2023-10-12 17:27:05,632][62635] Updated weights for policy 1, policy_version 40340 (0.0009) [2023-10-12 17:27:06,005][62635] Updated weights for policy 1, policy_version 40350 (0.0008) [2023-10-12 17:27:06,561][62634] Updated weights for policy 0, policy_version 40360 (0.0009) [2023-10-12 17:27:06,942][62634] Updated weights for policy 0, policy_version 40370 (0.0010) [2023-10-12 17:27:07,311][62634] Updated weights for policy 0, policy_version 40380 (0.0010) [2023-10-12 17:27:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82673664. Throughput: 0: 1678.2, 1: 1659.5. Samples: 20670780. 
Policy #0 lag: (min: 2.0, avg: 9.2, max: 34.0) [2023-10-12 17:27:08,436][61643] Avg episode reward: [(0, '20.180'), (1, '10.000')] [2023-10-12 17:27:08,437][62354] Saving new best policy, reward=20.180! [2023-10-12 17:27:10,149][62635] Updated weights for policy 1, policy_version 40360 (0.0009) [2023-10-12 17:27:10,512][62635] Updated weights for policy 1, policy_version 40370 (0.0008) [2023-10-12 17:27:10,886][62635] Updated weights for policy 1, policy_version 40380 (0.0009) [2023-10-12 17:27:11,335][62634] Updated weights for policy 0, policy_version 40390 (0.0007) [2023-10-12 17:27:11,711][62634] Updated weights for policy 0, policy_version 40400 (0.0009) [2023-10-12 17:27:12,095][62634] Updated weights for policy 0, policy_version 40410 (0.0009) [2023-10-12 17:27:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82739200. Throughput: 0: 1659.2, 1: 1677.7. Samples: 20690572. Policy #0 lag: (min: 2.0, avg: 9.2, max: 34.0) [2023-10-12 17:27:13,435][61643] Avg episode reward: [(0, '20.150'), (1, '9.820')] [2023-10-12 17:27:15,052][62635] Updated weights for policy 1, policy_version 40390 (0.0008) [2023-10-12 17:27:15,419][62635] Updated weights for policy 1, policy_version 40400 (0.0007) [2023-10-12 17:27:15,799][62635] Updated weights for policy 1, policy_version 40410 (0.0007) [2023-10-12 17:27:16,104][62634] Updated weights for policy 0, policy_version 40420 (0.0008) [2023-10-12 17:27:16,473][62634] Updated weights for policy 0, policy_version 40430 (0.0007) [2023-10-12 17:27:16,854][62634] Updated weights for policy 0, policy_version 40440 (0.0008) [2023-10-12 17:27:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82804736. Throughput: 0: 1678.2, 1: 1684.7. Samples: 20710914. 
Policy #0 lag: (min: 2.0, avg: 9.2, max: 34.0) [2023-10-12 17:27:18,435][61643] Avg episode reward: [(0, '20.380'), (1, '9.810')] [2023-10-12 17:27:18,443][62354] Saving new best policy, reward=20.380! [2023-10-12 17:27:19,852][62635] Updated weights for policy 1, policy_version 40420 (0.0008) [2023-10-12 17:27:20,217][62635] Updated weights for policy 1, policy_version 40430 (0.0007) [2023-10-12 17:27:20,582][62635] Updated weights for policy 1, policy_version 40440 (0.0007) [2023-10-12 17:27:20,973][62634] Updated weights for policy 0, policy_version 40450 (0.0009) [2023-10-12 17:27:21,346][62634] Updated weights for policy 0, policy_version 40460 (0.0007) [2023-10-12 17:27:21,733][62634] Updated weights for policy 0, policy_version 40470 (0.0008) [2023-10-12 17:27:22,099][62634] Updated weights for policy 0, policy_version 40480 (0.0008) [2023-10-12 17:27:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82870272. Throughput: 0: 1680.9, 1: 1662.2. Samples: 20721196. Policy #0 lag: (min: 17.0, avg: 20.2, max: 49.0) [2023-10-12 17:27:23,436][61643] Avg episode reward: [(0, '20.650'), (1, '9.980')] [2023-10-12 17:27:23,437][62354] Saving new best policy, reward=20.650! [2023-10-12 17:27:24,497][62635] Updated weights for policy 1, policy_version 40450 (0.0008) [2023-10-12 17:27:24,865][62635] Updated weights for policy 1, policy_version 40460 (0.0008) [2023-10-12 17:27:25,239][62635] Updated weights for policy 1, policy_version 40470 (0.0008) [2023-10-12 17:27:25,612][62635] Updated weights for policy 1, policy_version 40480 (0.0007) [2023-10-12 17:27:26,236][62634] Updated weights for policy 0, policy_version 40490 (0.0009) [2023-10-12 17:27:26,613][62634] Updated weights for policy 0, policy_version 40500 (0.0007) [2023-10-12 17:27:26,995][62634] Updated weights for policy 0, policy_version 40510 (0.0007) [2023-10-12 17:27:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). 
Total num frames: 82935808. Throughput: 0: 1661.4, 1: 1693.6. Samples: 20741026. Policy #0 lag: (min: 17.0, avg: 20.2, max: 49.0) [2023-10-12 17:27:28,435][61643] Avg episode reward: [(0, '20.810'), (1, '9.780')] [2023-10-12 17:27:28,436][62354] Saving new best policy, reward=20.810! [2023-10-12 17:27:29,618][62635] Updated weights for policy 1, policy_version 40490 (0.0008) [2023-10-12 17:27:29,981][62635] Updated weights for policy 1, policy_version 40500 (0.0010) [2023-10-12 17:27:30,363][62635] Updated weights for policy 1, policy_version 40510 (0.0009) [2023-10-12 17:27:30,952][62634] Updated weights for policy 0, policy_version 40520 (0.0007) [2023-10-12 17:27:31,330][62634] Updated weights for policy 0, policy_version 40530 (0.0008) [2023-10-12 17:27:31,712][62634] Updated weights for policy 0, policy_version 40540 (0.0008) [2023-10-12 17:27:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 83001344. Throughput: 0: 1689.0, 1: 1693.9. Samples: 20761634. Policy #0 lag: (min: 17.0, avg: 20.2, max: 49.0) [2023-10-12 17:27:33,435][61643] Avg episode reward: [(0, '20.920'), (1, '9.910')] [2023-10-12 17:27:33,444][62354] Saving new best policy, reward=20.920! [2023-10-12 17:27:34,480][62635] Updated weights for policy 1, policy_version 40520 (0.0008) [2023-10-12 17:27:34,859][62635] Updated weights for policy 1, policy_version 40530 (0.0007) [2023-10-12 17:27:35,226][62635] Updated weights for policy 1, policy_version 40540 (0.0008) [2023-10-12 17:27:35,688][62634] Updated weights for policy 0, policy_version 40550 (0.0010) [2023-10-12 17:27:36,072][62634] Updated weights for policy 0, policy_version 40560 (0.0007) [2023-10-12 17:27:36,443][62634] Updated weights for policy 0, policy_version 40570 (0.0008) [2023-10-12 17:27:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 83066880. Throughput: 0: 1681.4, 1: 1678.1. Samples: 20771548. 
Policy #0 lag: (min: 17.0, avg: 20.2, max: 49.0) [2023-10-12 17:27:38,436][61643] Avg episode reward: [(0, '20.980'), (1, '10.070')] [2023-10-12 17:27:38,437][62354] Saving new best policy, reward=20.980! [2023-10-12 17:27:38,989][62635] Updated weights for policy 1, policy_version 40550 (0.0009) [2023-10-12 17:27:39,354][62635] Updated weights for policy 1, policy_version 40560 (0.0008) [2023-10-12 17:27:39,717][62635] Updated weights for policy 1, policy_version 40570 (0.0008) [2023-10-12 17:27:40,684][62634] Updated weights for policy 0, policy_version 40580 (0.0010) [2023-10-12 17:27:41,074][62634] Updated weights for policy 0, policy_version 40590 (0.0007) [2023-10-12 17:27:41,461][62634] Updated weights for policy 0, policy_version 40600 (0.0007) [2023-10-12 17:27:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83132416. Throughput: 0: 1664.5, 1: 1698.0. Samples: 20791414. Policy #0 lag: (min: 17.0, avg: 20.2, max: 49.0) [2023-10-12 17:27:43,435][61643] Avg episode reward: [(0, '21.190'), (1, '9.800')] [2023-10-12 17:27:43,436][62354] Saving new best policy, reward=21.190! [2023-10-12 17:27:43,915][62635] Updated weights for policy 1, policy_version 40580 (0.0010) [2023-10-12 17:27:44,283][62635] Updated weights for policy 1, policy_version 40590 (0.0010) [2023-10-12 17:27:44,648][62635] Updated weights for policy 1, policy_version 40600 (0.0010) [2023-10-12 17:27:45,185][62634] Updated weights for policy 0, policy_version 40610 (0.0008) [2023-10-12 17:27:45,562][62634] Updated weights for policy 0, policy_version 40620 (0.0007) [2023-10-12 17:27:45,932][62634] Updated weights for policy 0, policy_version 40630 (0.0008) [2023-10-12 17:27:46,311][62634] Updated weights for policy 0, policy_version 40640 (0.0008) [2023-10-12 17:27:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83197952. Throughput: 0: 1687.5, 1: 1691.0. Samples: 20812242. 
Policy #0 lag: (min: 17.0, avg: 20.2, max: 49.0) [2023-10-12 17:27:48,436][61643] Avg episode reward: [(0, '21.150'), (1, '9.880')] [2023-10-12 17:27:48,842][62635] Updated weights for policy 1, policy_version 40610 (0.0010) [2023-10-12 17:27:49,209][62635] Updated weights for policy 1, policy_version 40620 (0.0008) [2023-10-12 17:27:49,572][62635] Updated weights for policy 1, policy_version 40630 (0.0007) [2023-10-12 17:27:49,940][62635] Updated weights for policy 1, policy_version 40640 (0.0009) [2023-10-12 17:27:50,516][62634] Updated weights for policy 0, policy_version 40650 (0.0008) [2023-10-12 17:27:50,908][62634] Updated weights for policy 0, policy_version 40660 (0.0007) [2023-10-12 17:27:51,290][62634] Updated weights for policy 0, policy_version 40670 (0.0008) [2023-10-12 17:27:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83263488. Throughput: 0: 1669.3, 1: 1686.0. Samples: 20821768. Policy #0 lag: (min: 18.0, avg: 18.5, max: 34.0) [2023-10-12 17:27:53,436][61643] Avg episode reward: [(0, '21.230'), (1, '9.960')] [2023-10-12 17:27:53,437][62354] Saving new best policy, reward=21.230! [2023-10-12 17:27:54,084][62635] Updated weights for policy 1, policy_version 40650 (0.0009) [2023-10-12 17:27:54,449][62635] Updated weights for policy 1, policy_version 40660 (0.0011) [2023-10-12 17:27:54,815][62635] Updated weights for policy 1, policy_version 40670 (0.0009) [2023-10-12 17:27:55,244][62634] Updated weights for policy 0, policy_version 40680 (0.0008) [2023-10-12 17:27:55,616][62634] Updated weights for policy 0, policy_version 40690 (0.0007) [2023-10-12 17:27:55,994][62634] Updated weights for policy 0, policy_version 40700 (0.0011) [2023-10-12 17:27:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83329024. Throughput: 0: 1676.1, 1: 1688.9. Samples: 20842000. 
Policy #0 lag: (min: 18.0, avg: 18.5, max: 34.0) [2023-10-12 17:27:58,435][61643] Avg episode reward: [(0, '21.400'), (1, '9.790')] [2023-10-12 17:27:58,436][62354] Saving new best policy, reward=21.400! [2023-10-12 17:27:59,046][62635] Updated weights for policy 1, policy_version 40680 (0.0008) [2023-10-12 17:27:59,406][62635] Updated weights for policy 1, policy_version 40690 (0.0008) [2023-10-12 17:27:59,778][62635] Updated weights for policy 1, policy_version 40700 (0.0011) [2023-10-12 17:28:00,322][62634] Updated weights for policy 0, policy_version 40710 (0.0009) [2023-10-12 17:28:00,713][62634] Updated weights for policy 0, policy_version 40720 (0.0009) [2023-10-12 17:28:01,086][62634] Updated weights for policy 0, policy_version 40730 (0.0009) [2023-10-12 17:28:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83394560. Throughput: 0: 1680.1, 1: 1689.4. Samples: 20862540. Policy #0 lag: (min: 18.0, avg: 18.5, max: 34.0) [2023-10-12 17:28:03,435][61643] Avg episode reward: [(0, '21.360'), (1, '9.890')] [2023-10-12 17:28:03,744][62635] Updated weights for policy 1, policy_version 40710 (0.0009) [2023-10-12 17:28:04,118][62635] Updated weights for policy 1, policy_version 40720 (0.0007) [2023-10-12 17:28:04,475][62635] Updated weights for policy 1, policy_version 40730 (0.0009) [2023-10-12 17:28:05,053][62634] Updated weights for policy 0, policy_version 40740 (0.0009) [2023-10-12 17:28:05,435][62634] Updated weights for policy 0, policy_version 40750 (0.0011) [2023-10-12 17:28:05,817][62634] Updated weights for policy 0, policy_version 40760 (0.0009) [2023-10-12 17:28:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83460096. Throughput: 0: 1663.2, 1: 1686.7. Samples: 20871940. 
Policy #0 lag: (min: 18.0, avg: 18.5, max: 34.0) [2023-10-12 17:28:08,436][61643] Avg episode reward: [(0, '21.650'), (1, '9.850')] [2023-10-12 17:28:08,436][62354] Saving new best policy, reward=21.650! [2023-10-12 17:28:08,626][62635] Updated weights for policy 1, policy_version 40740 (0.0010) [2023-10-12 17:28:08,990][62635] Updated weights for policy 1, policy_version 40750 (0.0007) [2023-10-12 17:28:09,352][62635] Updated weights for policy 1, policy_version 40760 (0.0008) [2023-10-12 17:28:09,839][62634] Updated weights for policy 0, policy_version 40770 (0.0010) [2023-10-12 17:28:10,213][62634] Updated weights for policy 0, policy_version 40780 (0.0009) [2023-10-12 17:28:10,588][62634] Updated weights for policy 0, policy_version 40790 (0.0007) [2023-10-12 17:28:10,967][62634] Updated weights for policy 0, policy_version 40800 (0.0009) [2023-10-12 17:28:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83525632. Throughput: 0: 1682.8, 1: 1679.8. Samples: 20892346. Policy #0 lag: (min: 18.0, avg: 18.5, max: 34.0) [2023-10-12 17:28:13,436][61643] Avg episode reward: [(0, '21.400'), (1, '9.770')] [2023-10-12 17:28:13,487][62635] Updated weights for policy 1, policy_version 40770 (0.0009) [2023-10-12 17:28:13,845][62635] Updated weights for policy 1, policy_version 40780 (0.0008) [2023-10-12 17:28:14,225][62635] Updated weights for policy 1, policy_version 40790 (0.0010) [2023-10-12 17:28:14,596][62635] Updated weights for policy 1, policy_version 40800 (0.0007) [2023-10-12 17:28:15,022][62634] Updated weights for policy 0, policy_version 40810 (0.0007) [2023-10-12 17:28:15,404][62634] Updated weights for policy 0, policy_version 40820 (0.0010) [2023-10-12 17:28:15,783][62634] Updated weights for policy 0, policy_version 40830 (0.0011) [2023-10-12 17:28:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83591168. Throughput: 0: 1685.2, 1: 1678.0. Samples: 20912978. 
Policy #0 lag: (min: 18.0, avg: 18.5, max: 34.0) [2023-10-12 17:28:18,435][61643] Avg episode reward: [(0, '21.160'), (1, '9.770')] [2023-10-12 17:28:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000040832_41811968.pth... [2023-10-12 17:28:18,476][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000039264_40206336.pth [2023-10-12 17:28:18,610][62635] Updated weights for policy 1, policy_version 40810 (0.0009) [2023-10-12 17:28:18,984][62635] Updated weights for policy 1, policy_version 40820 (0.0010) [2023-10-12 17:28:19,354][62635] Updated weights for policy 1, policy_version 40830 (0.0007) [2023-10-12 17:28:19,428][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000040832_41811968.pth... [2023-10-12 17:28:19,457][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000039232_40173568.pth [2023-10-12 17:28:19,857][62634] Updated weights for policy 0, policy_version 40840 (0.0010) [2023-10-12 17:28:20,236][62634] Updated weights for policy 0, policy_version 40850 (0.0007) [2023-10-12 17:28:20,613][62634] Updated weights for policy 0, policy_version 40860 (0.0009) [2023-10-12 17:28:23,334][62635] Updated weights for policy 1, policy_version 40840 (0.0009) [2023-10-12 17:28:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83656704. Throughput: 0: 1663.2, 1: 1676.7. Samples: 20921842. 
Policy #0 lag: (min: 26.0, avg: 31.9, max: 32.0) [2023-10-12 17:28:23,435][61643] Avg episode reward: [(0, '21.170'), (1, '9.860')] [2023-10-12 17:28:23,706][62635] Updated weights for policy 1, policy_version 40850 (0.0010) [2023-10-12 17:28:24,063][62635] Updated weights for policy 1, policy_version 40860 (0.0009) [2023-10-12 17:28:24,620][62634] Updated weights for policy 0, policy_version 40870 (0.0010) [2023-10-12 17:28:24,993][62634] Updated weights for policy 0, policy_version 40880 (0.0010) [2023-10-12 17:28:25,363][62634] Updated weights for policy 0, policy_version 40890 (0.0010) [2023-10-12 17:28:28,290][62635] Updated weights for policy 1, policy_version 40870 (0.0009) [2023-10-12 17:28:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83722240. Throughput: 0: 1685.6, 1: 1671.9. Samples: 20942504. Policy #0 lag: (min: 26.0, avg: 31.9, max: 32.0) [2023-10-12 17:28:28,436][61643] Avg episode reward: [(0, '21.260'), (1, '9.870')] [2023-10-12 17:28:28,662][62635] Updated weights for policy 1, policy_version 40880 (0.0008) [2023-10-12 17:28:29,028][62635] Updated weights for policy 1, policy_version 40890 (0.0009) [2023-10-12 17:28:29,454][62634] Updated weights for policy 0, policy_version 40900 (0.0008) [2023-10-12 17:28:29,834][62634] Updated weights for policy 0, policy_version 40910 (0.0008) [2023-10-12 17:28:30,217][62634] Updated weights for policy 0, policy_version 40920 (0.0008) [2023-10-12 17:28:33,003][62635] Updated weights for policy 1, policy_version 40900 (0.0008) [2023-10-12 17:28:33,377][62635] Updated weights for policy 1, policy_version 40910 (0.0009) [2023-10-12 17:28:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83787776. Throughput: 0: 1674.0, 1: 1672.7. Samples: 20962844. 
Policy #0 lag: (min: 26.0, avg: 31.9, max: 32.0) [2023-10-12 17:28:33,435][61643] Avg episode reward: [(0, '21.310'), (1, '9.980')] [2023-10-12 17:28:33,740][62635] Updated weights for policy 1, policy_version 40920 (0.0010) [2023-10-12 17:28:34,285][62634] Updated weights for policy 0, policy_version 40930 (0.0009) [2023-10-12 17:28:34,663][62634] Updated weights for policy 0, policy_version 40940 (0.0007) [2023-10-12 17:28:35,041][62634] Updated weights for policy 0, policy_version 40950 (0.0008) [2023-10-12 17:28:35,415][62634] Updated weights for policy 0, policy_version 40960 (0.0010) [2023-10-12 17:28:37,772][62635] Updated weights for policy 1, policy_version 40930 (0.0007) [2023-10-12 17:28:38,150][62635] Updated weights for policy 1, policy_version 40940 (0.0007) [2023-10-12 17:28:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83853312. Throughput: 0: 1662.1, 1: 1680.8. Samples: 20972198. Policy #0 lag: (min: 26.0, avg: 31.9, max: 32.0) [2023-10-12 17:28:38,435][61643] Avg episode reward: [(0, '21.100'), (1, '9.820')] [2023-10-12 17:28:38,518][62635] Updated weights for policy 1, policy_version 40950 (0.0009) [2023-10-12 17:28:38,885][62635] Updated weights for policy 1, policy_version 40960 (0.0008) [2023-10-12 17:28:39,584][62634] Updated weights for policy 0, policy_version 40970 (0.0009) [2023-10-12 17:28:39,964][62634] Updated weights for policy 0, policy_version 40980 (0.0010) [2023-10-12 17:28:40,338][62634] Updated weights for policy 0, policy_version 40990 (0.0008) [2023-10-12 17:28:43,042][62635] Updated weights for policy 1, policy_version 40970 (0.0008) [2023-10-12 17:28:43,410][62635] Updated weights for policy 1, policy_version 40980 (0.0009) [2023-10-12 17:28:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83918848. Throughput: 0: 1678.8, 1: 1676.5. Samples: 20992986. 
Policy #0 lag: (min: 26.0, avg: 31.9, max: 32.0) [2023-10-12 17:28:43,435][61643] Avg episode reward: [(0, '21.150'), (1, '9.840')] [2023-10-12 17:28:43,783][62635] Updated weights for policy 1, policy_version 40990 (0.0009) [2023-10-12 17:28:44,268][62634] Updated weights for policy 0, policy_version 41000 (0.0008) [2023-10-12 17:28:44,646][62634] Updated weights for policy 0, policy_version 41010 (0.0010) [2023-10-12 17:28:45,023][62634] Updated weights for policy 0, policy_version 41020 (0.0010) [2023-10-12 17:28:47,783][62635] Updated weights for policy 1, policy_version 41000 (0.0010) [2023-10-12 17:28:48,147][62635] Updated weights for policy 1, policy_version 41010 (0.0008) [2023-10-12 17:28:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83984384. Throughput: 0: 1683.2, 1: 1666.5. Samples: 21013278. Policy #0 lag: (min: 26.0, avg: 31.9, max: 32.0) [2023-10-12 17:28:48,435][61643] Avg episode reward: [(0, '21.140'), (1, '9.760')] [2023-10-12 17:28:48,515][62635] Updated weights for policy 1, policy_version 41020 (0.0010) [2023-10-12 17:28:49,042][62634] Updated weights for policy 0, policy_version 41030 (0.0010) [2023-10-12 17:28:49,411][62634] Updated weights for policy 0, policy_version 41040 (0.0011) [2023-10-12 17:28:49,799][62634] Updated weights for policy 0, policy_version 41050 (0.0008) [2023-10-12 17:28:52,582][62635] Updated weights for policy 1, policy_version 41030 (0.0008) [2023-10-12 17:28:52,960][62635] Updated weights for policy 1, policy_version 41040 (0.0008) [2023-10-12 17:28:53,328][62635] Updated weights for policy 1, policy_version 41050 (0.0008) [2023-10-12 17:28:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84049920. Throughput: 0: 1675.1, 1: 1677.4. Samples: 21022802. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:28:53,435][61643] Avg episode reward: [(0, '21.010'), (1, '9.590')] [2023-10-12 17:28:53,839][62634] Updated weights for policy 0, policy_version 41060 (0.0008) [2023-10-12 17:28:54,212][62634] Updated weights for policy 0, policy_version 41070 (0.0010) [2023-10-12 17:28:54,594][62634] Updated weights for policy 0, policy_version 41080 (0.0009) [2023-10-12 17:28:57,381][62635] Updated weights for policy 1, policy_version 41060 (0.0008) [2023-10-12 17:28:57,757][62635] Updated weights for policy 1, policy_version 41070 (0.0007) [2023-10-12 17:28:58,127][62635] Updated weights for policy 1, policy_version 41080 (0.0008) [2023-10-12 17:28:58,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84148224. Throughput: 0: 1681.2, 1: 1680.0. Samples: 21043602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:28:58,436][61643] Avg episode reward: [(0, '20.800'), (1, '9.540')] [2023-10-12 17:28:58,671][62634] Updated weights for policy 0, policy_version 41090 (0.0008) [2023-10-12 17:28:59,040][62634] Updated weights for policy 0, policy_version 41100 (0.0007) [2023-10-12 17:28:59,414][62634] Updated weights for policy 0, policy_version 41110 (0.0007) [2023-10-12 17:28:59,789][62634] Updated weights for policy 0, policy_version 41120 (0.0009) [2023-10-12 17:29:02,214][62635] Updated weights for policy 1, policy_version 41090 (0.0008) [2023-10-12 17:29:02,580][62635] Updated weights for policy 1, policy_version 41100 (0.0008) [2023-10-12 17:29:02,951][62635] Updated weights for policy 1, policy_version 41110 (0.0009) [2023-10-12 17:29:03,327][62635] Updated weights for policy 1, policy_version 41120 (0.0008) [2023-10-12 17:29:03,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84213760. Throughput: 0: 1685.1, 1: 1662.3. Samples: 21063612. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:29:03,435][61643] Avg episode reward: [(0, '20.830'), (1, '9.660')] [2023-10-12 17:29:03,761][62634] Updated weights for policy 0, policy_version 41130 (0.0008) [2023-10-12 17:29:04,145][62634] Updated weights for policy 0, policy_version 41140 (0.0007) [2023-10-12 17:29:04,520][62634] Updated weights for policy 0, policy_version 41150 (0.0009) [2023-10-12 17:29:07,583][62635] Updated weights for policy 1, policy_version 41130 (0.0008) [2023-10-12 17:29:07,954][62635] Updated weights for policy 1, policy_version 41140 (0.0009) [2023-10-12 17:29:08,319][62635] Updated weights for policy 1, policy_version 41150 (0.0008) [2023-10-12 17:29:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84279296. Throughput: 0: 1686.7, 1: 1688.5. Samples: 21073728. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:29:08,436][61643] Avg episode reward: [(0, '20.710'), (1, '9.600')] [2023-10-12 17:29:08,562][62634] Updated weights for policy 0, policy_version 41160 (0.0008) [2023-10-12 17:29:08,946][62634] Updated weights for policy 0, policy_version 41170 (0.0007) [2023-10-12 17:29:09,325][62634] Updated weights for policy 0, policy_version 41180 (0.0008) [2023-10-12 17:29:12,353][62635] Updated weights for policy 1, policy_version 41160 (0.0009) [2023-10-12 17:29:12,717][62635] Updated weights for policy 1, policy_version 41170 (0.0009) [2023-10-12 17:29:13,088][62635] Updated weights for policy 1, policy_version 41180 (0.0007) [2023-10-12 17:29:13,363][62634] Updated weights for policy 0, policy_version 41190 (0.0008) [2023-10-12 17:29:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 84344832. Throughput: 0: 1690.3, 1: 1681.9. Samples: 21094250. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:29:13,435][61643] Avg episode reward: [(0, '20.630'), (1, '9.510')] [2023-10-12 17:29:13,736][62634] Updated weights for policy 0, policy_version 41200 (0.0009) [2023-10-12 17:29:14,124][62634] Updated weights for policy 0, policy_version 41210 (0.0010) [2023-10-12 17:29:17,074][62635] Updated weights for policy 1, policy_version 41190 (0.0009) [2023-10-12 17:29:17,446][62635] Updated weights for policy 1, policy_version 41200 (0.0008) [2023-10-12 17:29:17,815][62635] Updated weights for policy 1, policy_version 41210 (0.0008) [2023-10-12 17:29:18,172][62634] Updated weights for policy 0, policy_version 41220 (0.0007) [2023-10-12 17:29:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84410368. Throughput: 0: 1699.7, 1: 1658.7. Samples: 21113974. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:29:18,436][61643] Avg episode reward: [(0, '20.860'), (1, '9.710')] [2023-10-12 17:29:18,557][62634] Updated weights for policy 0, policy_version 41230 (0.0008) [2023-10-12 17:29:18,935][62634] Updated weights for policy 0, policy_version 41240 (0.0009) [2023-10-12 17:29:21,951][62635] Updated weights for policy 1, policy_version 41220 (0.0008) [2023-10-12 17:29:22,313][62635] Updated weights for policy 1, policy_version 41230 (0.0008) [2023-10-12 17:29:22,680][62635] Updated weights for policy 1, policy_version 41240 (0.0007) [2023-10-12 17:29:22,854][62634] Updated weights for policy 0, policy_version 41250 (0.0008) [2023-10-12 17:29:23,236][62634] Updated weights for policy 0, policy_version 41260 (0.0007) [2023-10-12 17:29:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84475904. Throughput: 0: 1695.9, 1: 1678.8. Samples: 21124056. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:29:23,435][61643] Avg episode reward: [(0, '20.930'), (1, '9.750')] [2023-10-12 17:29:23,607][62634] Updated weights for policy 0, policy_version 41270 (0.0008) [2023-10-12 17:29:23,989][62634] Updated weights for policy 0, policy_version 41280 (0.0007) [2023-10-12 17:29:26,678][62635] Updated weights for policy 1, policy_version 41250 (0.0008) [2023-10-12 17:29:27,041][62635] Updated weights for policy 1, policy_version 41260 (0.0007) [2023-10-12 17:29:27,414][62635] Updated weights for policy 1, policy_version 41270 (0.0008) [2023-10-12 17:29:27,778][62635] Updated weights for policy 1, policy_version 41280 (0.0007) [2023-10-12 17:29:28,006][62634] Updated weights for policy 0, policy_version 41290 (0.0007) [2023-10-12 17:29:28,378][62634] Updated weights for policy 0, policy_version 41300 (0.0010) [2023-10-12 17:29:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 84541440. Throughput: 0: 1692.4, 1: 1677.7. Samples: 21144642. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:29:28,436][61643] Avg episode reward: [(0, '21.110'), (1, '9.660')] [2023-10-12 17:29:28,759][62634] Updated weights for policy 0, policy_version 41310 (0.0010) [2023-10-12 17:29:31,826][62635] Updated weights for policy 1, policy_version 41290 (0.0008) [2023-10-12 17:29:32,194][62635] Updated weights for policy 1, policy_version 41300 (0.0007) [2023-10-12 17:29:32,566][62635] Updated weights for policy 1, policy_version 41310 (0.0007) [2023-10-12 17:29:32,868][62634] Updated weights for policy 0, policy_version 41320 (0.0008) [2023-10-12 17:29:33,242][62634] Updated weights for policy 0, policy_version 41330 (0.0010) [2023-10-12 17:29:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84606976. Throughput: 0: 1684.5, 1: 1669.2. Samples: 21164196. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:29:33,435][61643] Avg episode reward: [(0, '21.340'), (1, '9.830')] [2023-10-12 17:29:33,614][62634] Updated weights for policy 0, policy_version 41340 (0.0010) [2023-10-12 17:29:36,516][62635] Updated weights for policy 1, policy_version 41320 (0.0008) [2023-10-12 17:29:36,878][62635] Updated weights for policy 1, policy_version 41330 (0.0008) [2023-10-12 17:29:37,245][62635] Updated weights for policy 1, policy_version 41340 (0.0008) [2023-10-12 17:29:37,676][62634] Updated weights for policy 0, policy_version 41350 (0.0007) [2023-10-12 17:29:38,057][62634] Updated weights for policy 0, policy_version 41360 (0.0007) [2023-10-12 17:29:38,434][62634] Updated weights for policy 0, policy_version 41370 (0.0009) [2023-10-12 17:29:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 84672512. Throughput: 0: 1693.2, 1: 1690.4. Samples: 21175062. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:29:38,435][61643] Avg episode reward: [(0, '21.250'), (1, '9.730')] [2023-10-12 17:29:41,275][62635] Updated weights for policy 1, policy_version 41350 (0.0008) [2023-10-12 17:29:41,653][62635] Updated weights for policy 1, policy_version 41360 (0.0008) [2023-10-12 17:29:42,023][62635] Updated weights for policy 1, policy_version 41370 (0.0008) [2023-10-12 17:29:42,489][62634] Updated weights for policy 0, policy_version 41380 (0.0009) [2023-10-12 17:29:42,861][62634] Updated weights for policy 0, policy_version 41390 (0.0010) [2023-10-12 17:29:43,241][62634] Updated weights for policy 0, policy_version 41400 (0.0010) [2023-10-12 17:29:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84738048. Throughput: 0: 1693.0, 1: 1667.2. Samples: 21194808. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:29:43,435][61643] Avg episode reward: [(0, '21.260'), (1, '9.510')] [2023-10-12 17:29:46,048][62635] Updated weights for policy 1, policy_version 41380 (0.0009) [2023-10-12 17:29:46,422][62635] Updated weights for policy 1, policy_version 41390 (0.0009) [2023-10-12 17:29:46,788][62635] Updated weights for policy 1, policy_version 41400 (0.0009) [2023-10-12 17:29:47,349][62634] Updated weights for policy 0, policy_version 41410 (0.0009) [2023-10-12 17:29:47,725][62634] Updated weights for policy 0, policy_version 41420 (0.0008) [2023-10-12 17:29:48,117][62634] Updated weights for policy 0, policy_version 41430 (0.0007) [2023-10-12 17:29:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84803584. Throughput: 0: 1669.8, 1: 1681.9. Samples: 21214442. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:29:48,436][61643] Avg episode reward: [(0, '21.230'), (1, '9.420')] [2023-10-12 17:29:48,487][62634] Updated weights for policy 0, policy_version 41440 (0.0007) [2023-10-12 17:29:50,985][62635] Updated weights for policy 1, policy_version 41410 (0.0009) [2023-10-12 17:29:51,342][62635] Updated weights for policy 1, policy_version 41420 (0.0008) [2023-10-12 17:29:51,718][62635] Updated weights for policy 1, policy_version 41430 (0.0007) [2023-10-12 17:29:52,078][62635] Updated weights for policy 1, policy_version 41440 (0.0008) [2023-10-12 17:29:52,606][62634] Updated weights for policy 0, policy_version 41450 (0.0009) [2023-10-12 17:29:52,987][62634] Updated weights for policy 0, policy_version 41460 (0.0010) [2023-10-12 17:29:53,360][62634] Updated weights for policy 0, policy_version 41470 (0.0008) [2023-10-12 17:29:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 84901888. Throughput: 0: 1684.1, 1: 1682.1. Samples: 21225204. 
Policy #0 lag: (min: 17.0, avg: 21.6, max: 49.0) [2023-10-12 17:29:53,435][61643] Avg episode reward: [(0, '21.170'), (1, '9.520')] [2023-10-12 17:29:56,120][62635] Updated weights for policy 1, policy_version 41450 (0.0010) [2023-10-12 17:29:56,484][62635] Updated weights for policy 1, policy_version 41460 (0.0010) [2023-10-12 17:29:56,854][62635] Updated weights for policy 1, policy_version 41470 (0.0009) [2023-10-12 17:29:57,226][62634] Updated weights for policy 0, policy_version 41480 (0.0010) [2023-10-12 17:29:57,614][62634] Updated weights for policy 0, policy_version 41490 (0.0008) [2023-10-12 17:29:57,996][62634] Updated weights for policy 0, policy_version 41500 (0.0007) [2023-10-12 17:29:58,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 84967424. Throughput: 0: 1682.6, 1: 1662.6. Samples: 21244786. Policy #0 lag: (min: 17.0, avg: 21.6, max: 49.0) [2023-10-12 17:29:58,436][61643] Avg episode reward: [(0, '21.360'), (1, '9.660')] [2023-10-12 17:30:01,130][62635] Updated weights for policy 1, policy_version 41480 (0.0008) [2023-10-12 17:30:01,494][62635] Updated weights for policy 1, policy_version 41490 (0.0007) [2023-10-12 17:30:01,861][62635] Updated weights for policy 1, policy_version 41500 (0.0007) [2023-10-12 17:30:02,070][62634] Updated weights for policy 0, policy_version 41510 (0.0008) [2023-10-12 17:30:02,445][62634] Updated weights for policy 0, policy_version 41520 (0.0009) [2023-10-12 17:30:02,821][62634] Updated weights for policy 0, policy_version 41530 (0.0009) [2023-10-12 17:30:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 85032960. Throughput: 0: 1654.1, 1: 1685.8. Samples: 21264270. 
Policy #0 lag: (min: 17.0, avg: 21.6, max: 49.0) [2023-10-12 17:30:03,435][61643] Avg episode reward: [(0, '21.180'), (1, '9.750')] [2023-10-12 17:30:05,955][62635] Updated weights for policy 1, policy_version 41510 (0.0008) [2023-10-12 17:30:06,319][62635] Updated weights for policy 1, policy_version 41520 (0.0009) [2023-10-12 17:30:06,686][62635] Updated weights for policy 1, policy_version 41530 (0.0009) [2023-10-12 17:30:07,071][62634] Updated weights for policy 0, policy_version 41540 (0.0009) [2023-10-12 17:30:07,452][62634] Updated weights for policy 0, policy_version 41550 (0.0008) [2023-10-12 17:30:07,838][62634] Updated weights for policy 0, policy_version 41560 (0.0007) [2023-10-12 17:30:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 85098496. Throughput: 0: 1684.3, 1: 1681.6. Samples: 21275522. Policy #0 lag: (min: 17.0, avg: 21.6, max: 49.0) [2023-10-12 17:30:08,436][61643] Avg episode reward: [(0, '21.180'), (1, '9.480')] [2023-10-12 17:30:10,674][62635] Updated weights for policy 1, policy_version 41540 (0.0008) [2023-10-12 17:30:11,037][62635] Updated weights for policy 1, policy_version 41550 (0.0008) [2023-10-12 17:30:11,401][62635] Updated weights for policy 1, policy_version 41560 (0.0009) [2023-10-12 17:30:11,837][62634] Updated weights for policy 0, policy_version 41570 (0.0009) [2023-10-12 17:30:12,210][62634] Updated weights for policy 0, policy_version 41580 (0.0011) [2023-10-12 17:30:12,583][62634] Updated weights for policy 0, policy_version 41590 (0.0009) [2023-10-12 17:30:12,960][62634] Updated weights for policy 0, policy_version 41600 (0.0009) [2023-10-12 17:30:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 85164032. Throughput: 0: 1678.3, 1: 1667.4. Samples: 21295198. 
Policy #0 lag: (min: 17.0, avg: 21.6, max: 49.0) [2023-10-12 17:30:13,435][61643] Avg episode reward: [(0, '21.410'), (1, '9.630')] [2023-10-12 17:30:15,445][62635] Updated weights for policy 1, policy_version 41570 (0.0008) [2023-10-12 17:30:15,812][62635] Updated weights for policy 1, policy_version 41580 (0.0011) [2023-10-12 17:30:16,171][62635] Updated weights for policy 1, policy_version 41590 (0.0011) [2023-10-12 17:30:16,541][62635] Updated weights for policy 1, policy_version 41600 (0.0010) [2023-10-12 17:30:17,180][62634] Updated weights for policy 0, policy_version 41610 (0.0007) [2023-10-12 17:30:17,571][62634] Updated weights for policy 0, policy_version 41620 (0.0009) [2023-10-12 17:30:17,951][62634] Updated weights for policy 0, policy_version 41630 (0.0009) [2023-10-12 17:30:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 85229568. Throughput: 0: 1659.5, 1: 1685.2. Samples: 21314708. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:30:18,436][61643] Avg episode reward: [(0, '21.410'), (1, '9.890')] [2023-10-12 17:30:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000041632_42631168.pth... [2023-10-12 17:30:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000041600_42598400.pth... 
[2023-10-12 17:30:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000040032_40992768.pth [2023-10-12 17:30:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000040064_41025536.pth [2023-10-12 17:30:20,552][62635] Updated weights for policy 1, policy_version 41610 (0.0009) [2023-10-12 17:30:20,907][62635] Updated weights for policy 1, policy_version 41620 (0.0010) [2023-10-12 17:30:21,283][62635] Updated weights for policy 1, policy_version 41630 (0.0008) [2023-10-12 17:30:21,819][62634] Updated weights for policy 0, policy_version 41640 (0.0007) [2023-10-12 17:30:22,198][62634] Updated weights for policy 0, policy_version 41650 (0.0007) [2023-10-12 17:30:22,579][62634] Updated weights for policy 0, policy_version 41660 (0.0010) [2023-10-12 17:30:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85295104. Throughput: 0: 1677.2, 1: 1663.8. Samples: 21325406. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:30:23,435][61643] Avg episode reward: [(0, '21.740'), (1, '9.760')] [2023-10-12 17:30:23,436][62354] Saving new best policy, reward=21.740! [2023-10-12 17:30:25,456][62635] Updated weights for policy 1, policy_version 41640 (0.0009) [2023-10-12 17:30:25,823][62635] Updated weights for policy 1, policy_version 41650 (0.0009) [2023-10-12 17:30:26,185][62635] Updated weights for policy 1, policy_version 41660 (0.0007) [2023-10-12 17:30:26,602][62634] Updated weights for policy 0, policy_version 41670 (0.0009) [2023-10-12 17:30:26,977][62634] Updated weights for policy 0, policy_version 41680 (0.0008) [2023-10-12 17:30:27,362][62634] Updated weights for policy 0, policy_version 41690 (0.0007) [2023-10-12 17:30:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85360640. Throughput: 0: 1665.5, 1: 1677.4. Samples: 21345238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:30:28,436][61643] Avg episode reward: [(0, '21.560'), (1, '9.750')] [2023-10-12 17:30:30,370][62635] Updated weights for policy 1, policy_version 41670 (0.0010) [2023-10-12 17:30:30,747][62635] Updated weights for policy 1, policy_version 41680 (0.0010) [2023-10-12 17:30:31,102][62635] Updated weights for policy 1, policy_version 41690 (0.0008) [2023-10-12 17:30:31,236][62634] Updated weights for policy 0, policy_version 41700 (0.0008) [2023-10-12 17:30:31,617][62634] Updated weights for policy 0, policy_version 41710 (0.0009) [2023-10-12 17:30:31,995][62634] Updated weights for policy 0, policy_version 41720 (0.0010) [2023-10-12 17:30:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85426176. Throughput: 0: 1667.1, 1: 1680.6. Samples: 21365086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:30:33,435][61643] Avg episode reward: [(0, '21.500'), (1, '9.880')] [2023-10-12 17:30:35,024][62635] Updated weights for policy 1, policy_version 41700 (0.0008) [2023-10-12 17:30:35,393][62635] Updated weights for policy 1, policy_version 41710 (0.0007) [2023-10-12 17:30:35,763][62635] Updated weights for policy 1, policy_version 41720 (0.0008) [2023-10-12 17:30:36,104][62634] Updated weights for policy 0, policy_version 41730 (0.0009) [2023-10-12 17:30:36,477][62634] Updated weights for policy 0, policy_version 41740 (0.0009) [2023-10-12 17:30:36,850][62634] Updated weights for policy 0, policy_version 41750 (0.0010) [2023-10-12 17:30:37,237][62634] Updated weights for policy 0, policy_version 41760 (0.0010) [2023-10-12 17:30:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85491712. Throughput: 0: 1682.8, 1: 1660.8. Samples: 21375666. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:30:38,435][61643] Avg episode reward: [(0, '21.510'), (1, '9.930')] [2023-10-12 17:30:39,770][62635] Updated weights for policy 1, policy_version 41730 (0.0009) [2023-10-12 17:30:40,143][62635] Updated weights for policy 1, policy_version 41740 (0.0008) [2023-10-12 17:30:40,505][62635] Updated weights for policy 1, policy_version 41750 (0.0009) [2023-10-12 17:30:40,877][62635] Updated weights for policy 1, policy_version 41760 (0.0010) [2023-10-12 17:30:41,257][62634] Updated weights for policy 0, policy_version 41770 (0.0010) [2023-10-12 17:30:41,631][62634] Updated weights for policy 0, policy_version 41780 (0.0007) [2023-10-12 17:30:42,007][62634] Updated weights for policy 0, policy_version 41790 (0.0009) [2023-10-12 17:30:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 85557248. Throughput: 0: 1660.2, 1: 1686.6. Samples: 21395390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:30:43,436][61643] Avg episode reward: [(0, '21.400'), (1, '10.020')] [2023-10-12 17:30:44,895][62635] Updated weights for policy 1, policy_version 41770 (0.0009) [2023-10-12 17:30:45,268][62635] Updated weights for policy 1, policy_version 41780 (0.0008) [2023-10-12 17:30:45,636][62635] Updated weights for policy 1, policy_version 41790 (0.0008) [2023-10-12 17:30:46,168][62634] Updated weights for policy 0, policy_version 41800 (0.0009) [2023-10-12 17:30:46,548][62634] Updated weights for policy 0, policy_version 41810 (0.0009) [2023-10-12 17:30:46,921][62634] Updated weights for policy 0, policy_version 41820 (0.0009) [2023-10-12 17:30:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 85622784. Throughput: 0: 1679.3, 1: 1691.1. Samples: 21415938. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:30:48,436][61643] Avg episode reward: [(0, '21.510'), (1, '9.860')] [2023-10-12 17:30:49,704][62635] Updated weights for policy 1, policy_version 41800 (0.0007) [2023-10-12 17:30:50,075][62635] Updated weights for policy 1, policy_version 41810 (0.0007) [2023-10-12 17:30:50,450][62635] Updated weights for policy 1, policy_version 41820 (0.0008) [2023-10-12 17:30:51,059][62634] Updated weights for policy 0, policy_version 41830 (0.0007) [2023-10-12 17:30:51,432][62634] Updated weights for policy 0, policy_version 41840 (0.0007) [2023-10-12 17:30:51,810][62634] Updated weights for policy 0, policy_version 41850 (0.0007) [2023-10-12 17:30:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85688320. Throughput: 0: 1679.4, 1: 1663.0. Samples: 21425932. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:30:53,436][61643] Avg episode reward: [(0, '21.560'), (1, '9.770')] [2023-10-12 17:30:54,484][62635] Updated weights for policy 1, policy_version 41830 (0.0008) [2023-10-12 17:30:54,860][62635] Updated weights for policy 1, policy_version 41840 (0.0007) [2023-10-12 17:30:55,231][62635] Updated weights for policy 1, policy_version 41850 (0.0009) [2023-10-12 17:30:55,882][62634] Updated weights for policy 0, policy_version 41860 (0.0008) [2023-10-12 17:30:56,257][62634] Updated weights for policy 0, policy_version 41870 (0.0010) [2023-10-12 17:30:56,633][62634] Updated weights for policy 0, policy_version 41880 (0.0009) [2023-10-12 17:30:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 85753856. Throughput: 0: 1657.8, 1: 1686.6. Samples: 21445698. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:30:58,435][61643] Avg episode reward: [(0, '21.610'), (1, '9.760')] [2023-10-12 17:30:59,340][62635] Updated weights for policy 1, policy_version 41860 (0.0009) [2023-10-12 17:30:59,716][62635] Updated weights for policy 1, policy_version 41870 (0.0008) [2023-10-12 17:31:00,082][62635] Updated weights for policy 1, policy_version 41880 (0.0009) [2023-10-12 17:31:00,686][62634] Updated weights for policy 0, policy_version 41890 (0.0009) [2023-10-12 17:31:01,074][62634] Updated weights for policy 0, policy_version 41900 (0.0011) [2023-10-12 17:31:01,450][62634] Updated weights for policy 0, policy_version 41910 (0.0008) [2023-10-12 17:31:01,825][62634] Updated weights for policy 0, policy_version 41920 (0.0008) [2023-10-12 17:31:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85819392. Throughput: 0: 1684.5, 1: 1688.5. Samples: 21466488. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:31:03,435][61643] Avg episode reward: [(0, '21.760'), (1, '9.880')] [2023-10-12 17:31:03,444][62354] Saving new best policy, reward=21.760! [2023-10-12 17:31:04,048][62635] Updated weights for policy 1, policy_version 41890 (0.0008) [2023-10-12 17:31:04,409][62635] Updated weights for policy 1, policy_version 41900 (0.0008) [2023-10-12 17:31:04,774][62635] Updated weights for policy 1, policy_version 41910 (0.0008) [2023-10-12 17:31:05,142][62635] Updated weights for policy 1, policy_version 41920 (0.0007) [2023-10-12 17:31:05,847][62634] Updated weights for policy 0, policy_version 41930 (0.0007) [2023-10-12 17:31:06,220][62634] Updated weights for policy 0, policy_version 41940 (0.0008) [2023-10-12 17:31:06,592][62634] Updated weights for policy 0, policy_version 41950 (0.0007) [2023-10-12 17:31:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85884928. Throughput: 0: 1675.1, 1: 1679.7. Samples: 21476370. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:31:08,435][61643] Avg episode reward: [(0, '21.660'), (1, '9.720')] [2023-10-12 17:31:09,257][62635] Updated weights for policy 1, policy_version 41930 (0.0007) [2023-10-12 17:31:09,631][62635] Updated weights for policy 1, policy_version 41940 (0.0007) [2023-10-12 17:31:10,003][62635] Updated weights for policy 1, policy_version 41950 (0.0008) [2023-10-12 17:31:10,493][62634] Updated weights for policy 0, policy_version 41960 (0.0008) [2023-10-12 17:31:10,867][62634] Updated weights for policy 0, policy_version 41970 (0.0008) [2023-10-12 17:31:11,241][62634] Updated weights for policy 0, policy_version 41980 (0.0007) [2023-10-12 17:31:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85950464. Throughput: 0: 1667.7, 1: 1686.8. Samples: 21496192. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:31:13,435][61643] Avg episode reward: [(0, '21.830'), (1, '9.560')] [2023-10-12 17:31:13,436][62354] Saving new best policy, reward=21.830! [2023-10-12 17:31:14,213][62635] Updated weights for policy 1, policy_version 41960 (0.0008) [2023-10-12 17:31:14,579][62635] Updated weights for policy 1, policy_version 41970 (0.0008) [2023-10-12 17:31:14,944][62635] Updated weights for policy 1, policy_version 41980 (0.0008) [2023-10-12 17:31:15,339][62634] Updated weights for policy 0, policy_version 41990 (0.0007) [2023-10-12 17:31:15,714][62634] Updated weights for policy 0, policy_version 42000 (0.0007) [2023-10-12 17:31:16,096][62634] Updated weights for policy 0, policy_version 42010 (0.0009) [2023-10-12 17:31:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86016000. Throughput: 0: 1685.9, 1: 1690.9. Samples: 21517042. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 17:31:18,436][61643] Avg episode reward: [(0, '21.830'), (1, '9.680')] [2023-10-12 17:31:18,794][62635] Updated weights for policy 1, policy_version 41990 (0.0007) [2023-10-12 17:31:19,160][62635] Updated weights for policy 1, policy_version 42000 (0.0011) [2023-10-12 17:31:19,533][62635] Updated weights for policy 1, policy_version 42010 (0.0010) [2023-10-12 17:31:20,236][62634] Updated weights for policy 0, policy_version 42020 (0.0010) [2023-10-12 17:31:20,612][62634] Updated weights for policy 0, policy_version 42030 (0.0008) [2023-10-12 17:31:20,979][62634] Updated weights for policy 0, policy_version 42040 (0.0009) [2023-10-12 17:31:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86081536. Throughput: 0: 1660.0, 1: 1687.9. Samples: 21526320. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 17:31:23,435][61643] Avg episode reward: [(0, '22.040'), (1, '9.760')] [2023-10-12 17:31:23,436][62354] Saving new best policy, reward=22.040! [2023-10-12 17:31:23,691][62635] Updated weights for policy 1, policy_version 42020 (0.0008) [2023-10-12 17:31:24,068][62635] Updated weights for policy 1, policy_version 42030 (0.0008) [2023-10-12 17:31:24,444][62635] Updated weights for policy 1, policy_version 42040 (0.0008) [2023-10-12 17:31:25,009][62634] Updated weights for policy 0, policy_version 42050 (0.0007) [2023-10-12 17:31:25,379][62634] Updated weights for policy 0, policy_version 42060 (0.0008) [2023-10-12 17:31:25,756][62634] Updated weights for policy 0, policy_version 42070 (0.0008) [2023-10-12 17:31:26,140][62634] Updated weights for policy 0, policy_version 42080 (0.0011) [2023-10-12 17:31:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86147072. Throughput: 0: 1673.9, 1: 1687.2. Samples: 21546638. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 17:31:28,435][61643] Avg episode reward: [(0, '21.810'), (1, '9.700')] [2023-10-12 17:31:28,472][62635] Updated weights for policy 1, policy_version 42050 (0.0009) [2023-10-12 17:31:28,836][62635] Updated weights for policy 1, policy_version 42060 (0.0008) [2023-10-12 17:31:29,204][62635] Updated weights for policy 1, policy_version 42070 (0.0008) [2023-10-12 17:31:29,573][62635] Updated weights for policy 1, policy_version 42080 (0.0010) [2023-10-12 17:31:30,232][62634] Updated weights for policy 0, policy_version 42090 (0.0007) [2023-10-12 17:31:30,608][62634] Updated weights for policy 0, policy_version 42100 (0.0010) [2023-10-12 17:31:30,984][62634] Updated weights for policy 0, policy_version 42110 (0.0007) [2023-10-12 17:31:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86212608. Throughput: 0: 1681.6, 1: 1682.7. Samples: 21567328. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 17:31:33,435][61643] Avg episode reward: [(0, '21.880'), (1, '9.590')] [2023-10-12 17:31:33,683][62635] Updated weights for policy 1, policy_version 42090 (0.0011) [2023-10-12 17:31:34,060][62635] Updated weights for policy 1, policy_version 42100 (0.0007) [2023-10-12 17:31:34,420][62635] Updated weights for policy 1, policy_version 42110 (0.0008) [2023-10-12 17:31:35,089][62634] Updated weights for policy 0, policy_version 42120 (0.0007) [2023-10-12 17:31:35,469][62634] Updated weights for policy 0, policy_version 42130 (0.0007) [2023-10-12 17:31:35,851][62634] Updated weights for policy 0, policy_version 42140 (0.0008) [2023-10-12 17:31:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86278144. Throughput: 0: 1660.9, 1: 1692.2. Samples: 21576820. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 17:31:38,436][61643] Avg episode reward: [(0, '21.720'), (1, '9.930')] [2023-10-12 17:31:38,500][62635] Updated weights for policy 1, policy_version 42120 (0.0009) [2023-10-12 17:31:38,867][62635] Updated weights for policy 1, policy_version 42130 (0.0010) [2023-10-12 17:31:39,234][62635] Updated weights for policy 1, policy_version 42140 (0.0010) [2023-10-12 17:31:39,916][62634] Updated weights for policy 0, policy_version 42150 (0.0009) [2023-10-12 17:31:40,295][62634] Updated weights for policy 0, policy_version 42160 (0.0008) [2023-10-12 17:31:40,686][62634] Updated weights for policy 0, policy_version 42170 (0.0008) [2023-10-12 17:31:43,210][62635] Updated weights for policy 1, policy_version 42150 (0.0009) [2023-10-12 17:31:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86343680. Throughput: 0: 1682.1, 1: 1693.2. Samples: 21597584. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-12 17:31:43,435][61643] Avg episode reward: [(0, '21.860'), (1, '9.810')] [2023-10-12 17:31:43,585][62635] Updated weights for policy 1, policy_version 42160 (0.0009) [2023-10-12 17:31:43,943][62635] Updated weights for policy 1, policy_version 42170 (0.0009) [2023-10-12 17:31:44,668][62634] Updated weights for policy 0, policy_version 42180 (0.0007) [2023-10-12 17:31:45,044][62634] Updated weights for policy 0, policy_version 42190 (0.0007) [2023-10-12 17:31:45,427][62634] Updated weights for policy 0, policy_version 42200 (0.0008) [2023-10-12 17:31:48,095][62635] Updated weights for policy 1, policy_version 42180 (0.0009) [2023-10-12 17:31:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86409216. Throughput: 0: 1678.1, 1: 1684.6. Samples: 21617810. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:31:48,435][61643] Avg episode reward: [(0, '21.650'), (1, '9.600')] [2023-10-12 17:31:48,473][62635] Updated weights for policy 1, policy_version 42190 (0.0008) [2023-10-12 17:31:48,838][62635] Updated weights for policy 1, policy_version 42200 (0.0009) [2023-10-12 17:31:49,465][62634] Updated weights for policy 0, policy_version 42210 (0.0009) [2023-10-12 17:31:49,852][62634] Updated weights for policy 0, policy_version 42220 (0.0008) [2023-10-12 17:31:50,227][62634] Updated weights for policy 0, policy_version 42230 (0.0008) [2023-10-12 17:31:50,606][62634] Updated weights for policy 0, policy_version 42240 (0.0008) [2023-10-12 17:31:52,771][62635] Updated weights for policy 1, policy_version 42210 (0.0008) [2023-10-12 17:31:53,131][62635] Updated weights for policy 1, policy_version 42220 (0.0007) [2023-10-12 17:31:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 86474752. Throughput: 0: 1658.7, 1: 1686.1. Samples: 21626886. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:31:53,436][61643] Avg episode reward: [(0, '21.660'), (1, '9.630')] [2023-10-12 17:31:53,508][62635] Updated weights for policy 1, policy_version 42230 (0.0007) [2023-10-12 17:31:53,871][62635] Updated weights for policy 1, policy_version 42240 (0.0007) [2023-10-12 17:31:54,622][62634] Updated weights for policy 0, policy_version 42250 (0.0008) [2023-10-12 17:31:55,011][62634] Updated weights for policy 0, policy_version 42260 (0.0010) [2023-10-12 17:31:55,393][62634] Updated weights for policy 0, policy_version 42270 (0.0009) [2023-10-12 17:31:57,894][62635] Updated weights for policy 1, policy_version 42250 (0.0009) [2023-10-12 17:31:58,257][62635] Updated weights for policy 1, policy_version 42260 (0.0009) [2023-10-12 17:31:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86540288. Throughput: 0: 1682.2, 1: 1695.0. 
Samples: 21648168. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:31:58,436][61643] Avg episode reward: [(0, '21.630'), (1, '9.650')] [2023-10-12 17:31:58,631][62635] Updated weights for policy 1, policy_version 42270 (0.0009) [2023-10-12 17:31:59,317][62634] Updated weights for policy 0, policy_version 42280 (0.0009) [2023-10-12 17:31:59,693][62634] Updated weights for policy 0, policy_version 42290 (0.0007) [2023-10-12 17:32:00,071][62634] Updated weights for policy 0, policy_version 42300 (0.0008) [2023-10-12 17:32:02,797][62635] Updated weights for policy 1, policy_version 42280 (0.0008) [2023-10-12 17:32:03,170][62635] Updated weights for policy 1, policy_version 42290 (0.0009) [2023-10-12 17:32:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86605824. Throughput: 0: 1686.0, 1: 1680.4. Samples: 21668530. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:32:03,436][61643] Avg episode reward: [(0, '21.670'), (1, '9.650')] [2023-10-12 17:32:03,543][62635] Updated weights for policy 1, policy_version 42300 (0.0009) [2023-10-12 17:32:04,121][62634] Updated weights for policy 0, policy_version 42310 (0.0008) [2023-10-12 17:32:04,494][62634] Updated weights for policy 0, policy_version 42320 (0.0008) [2023-10-12 17:32:04,872][62634] Updated weights for policy 0, policy_version 42330 (0.0009) [2023-10-12 17:32:07,514][62635] Updated weights for policy 1, policy_version 42310 (0.0008) [2023-10-12 17:32:07,883][62635] Updated weights for policy 1, policy_version 42320 (0.0009) [2023-10-12 17:32:08,251][62635] Updated weights for policy 1, policy_version 42330 (0.0009) [2023-10-12 17:32:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86671360. Throughput: 0: 1679.5, 1: 1695.8. Samples: 21678208. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:32:08,435][61643] Avg episode reward: [(0, '21.460'), (1, '9.620')] [2023-10-12 17:32:08,929][62634] Updated weights for policy 0, policy_version 42340 (0.0009) [2023-10-12 17:32:09,309][62634] Updated weights for policy 0, policy_version 42350 (0.0009) [2023-10-12 17:32:09,681][62634] Updated weights for policy 0, policy_version 42360 (0.0007) [2023-10-12 17:32:12,378][62635] Updated weights for policy 1, policy_version 42340 (0.0008) [2023-10-12 17:32:12,745][62635] Updated weights for policy 1, policy_version 42350 (0.0007) [2023-10-12 17:32:13,106][62635] Updated weights for policy 1, policy_version 42360 (0.0007) [2023-10-12 17:32:13,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86769664. Throughput: 0: 1688.2, 1: 1697.4. Samples: 21698992. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:32:13,436][61643] Avg episode reward: [(0, '21.510'), (1, '9.690')] [2023-10-12 17:32:13,926][62634] Updated weights for policy 0, policy_version 42370 (0.0008) [2023-10-12 17:32:14,309][62634] Updated weights for policy 0, policy_version 42380 (0.0010) [2023-10-12 17:32:14,690][62634] Updated weights for policy 0, policy_version 42390 (0.0010) [2023-10-12 17:32:15,064][62634] Updated weights for policy 0, policy_version 42400 (0.0009) [2023-10-12 17:32:16,959][62635] Updated weights for policy 1, policy_version 42370 (0.0008) [2023-10-12 17:32:17,327][62635] Updated weights for policy 1, policy_version 42380 (0.0007) [2023-10-12 17:32:17,702][62635] Updated weights for policy 1, policy_version 42390 (0.0007) [2023-10-12 17:32:18,064][62635] Updated weights for policy 1, policy_version 42400 (0.0009) [2023-10-12 17:32:18,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86835200. Throughput: 0: 1680.8, 1: 1677.3. Samples: 21718444. 
Policy #0 lag: (min: 2.0, avg: 8.2, max: 34.0) [2023-10-12 17:32:18,435][61643] Avg episode reward: [(0, '21.280'), (1, '9.810')] [2023-10-12 17:32:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000042400_43417600.pth... [2023-10-12 17:32:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000042400_43417600.pth... [2023-10-12 17:32:18,477][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000040832_41811968.pth [2023-10-12 17:32:18,482][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000040832_41811968.pth [2023-10-12 17:32:19,227][62634] Updated weights for policy 0, policy_version 42410 (0.0011) [2023-10-12 17:32:19,611][62634] Updated weights for policy 0, policy_version 42420 (0.0008) [2023-10-12 17:32:19,987][62634] Updated weights for policy 0, policy_version 42430 (0.0008) [2023-10-12 17:32:22,179][62635] Updated weights for policy 1, policy_version 42410 (0.0009) [2023-10-12 17:32:22,551][62635] Updated weights for policy 1, policy_version 42420 (0.0007) [2023-10-12 17:32:22,924][62635] Updated weights for policy 1, policy_version 42430 (0.0007) [2023-10-12 17:32:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86900736. Throughput: 0: 1673.2, 1: 1699.4. Samples: 21728588. 
Policy #0 lag: (min: 2.0, avg: 8.2, max: 34.0) [2023-10-12 17:32:23,435][61643] Avg episode reward: [(0, '21.220'), (1, '9.910')] [2023-10-12 17:32:23,997][62634] Updated weights for policy 0, policy_version 42440 (0.0008) [2023-10-12 17:32:24,370][62634] Updated weights for policy 0, policy_version 42450 (0.0007) [2023-10-12 17:32:24,738][62634] Updated weights for policy 0, policy_version 42460 (0.0007) [2023-10-12 17:32:27,194][62635] Updated weights for policy 1, policy_version 42440 (0.0008) [2023-10-12 17:32:27,576][62635] Updated weights for policy 1, policy_version 42450 (0.0007) [2023-10-12 17:32:27,948][62635] Updated weights for policy 1, policy_version 42460 (0.0007) [2023-10-12 17:32:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86966272. Throughput: 0: 1678.9, 1: 1687.8. Samples: 21749086. Policy #0 lag: (min: 2.0, avg: 8.2, max: 34.0) [2023-10-12 17:32:28,435][61643] Avg episode reward: [(0, '21.180'), (1, '9.820')] [2023-10-12 17:32:28,584][62634] Updated weights for policy 0, policy_version 42470 (0.0008) [2023-10-12 17:32:28,960][62634] Updated weights for policy 0, policy_version 42480 (0.0010) [2023-10-12 17:32:29,342][62634] Updated weights for policy 0, policy_version 42490 (0.0009) [2023-10-12 17:32:31,790][62635] Updated weights for policy 1, policy_version 42470 (0.0008) [2023-10-12 17:32:32,151][62635] Updated weights for policy 1, policy_version 42480 (0.0009) [2023-10-12 17:32:32,525][62635] Updated weights for policy 1, policy_version 42490 (0.0009) [2023-10-12 17:32:33,413][62634] Updated weights for policy 0, policy_version 42500 (0.0009) [2023-10-12 17:32:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87031808. Throughput: 0: 1685.2, 1: 1668.6. Samples: 21768732. 
Policy #0 lag: (min: 2.0, avg: 8.2, max: 34.0) [2023-10-12 17:32:33,435][61643] Avg episode reward: [(0, '20.820'), (1, '9.620')] [2023-10-12 17:32:33,780][62634] Updated weights for policy 0, policy_version 42510 (0.0011) [2023-10-12 17:32:34,164][62634] Updated weights for policy 0, policy_version 42520 (0.0010) [2023-10-12 17:32:36,594][62635] Updated weights for policy 1, policy_version 42500 (0.0009) [2023-10-12 17:32:36,960][62635] Updated weights for policy 1, policy_version 42510 (0.0008) [2023-10-12 17:32:37,323][62635] Updated weights for policy 1, policy_version 42520 (0.0008) [2023-10-12 17:32:38,338][62634] Updated weights for policy 0, policy_version 42530 (0.0010) [2023-10-12 17:32:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87097344. Throughput: 0: 1685.0, 1: 1696.4. Samples: 21779046. Policy #0 lag: (min: 2.0, avg: 8.2, max: 34.0) [2023-10-12 17:32:38,435][61643] Avg episode reward: [(0, '20.640'), (1, '9.730')] [2023-10-12 17:32:38,721][62634] Updated weights for policy 0, policy_version 42540 (0.0009) [2023-10-12 17:32:39,095][62634] Updated weights for policy 0, policy_version 42550 (0.0007) [2023-10-12 17:32:39,472][62634] Updated weights for policy 0, policy_version 42560 (0.0007) [2023-10-12 17:32:41,371][62635] Updated weights for policy 1, policy_version 42530 (0.0007) [2023-10-12 17:32:41,737][62635] Updated weights for policy 1, policy_version 42540 (0.0010) [2023-10-12 17:32:42,103][62635] Updated weights for policy 1, policy_version 42550 (0.0010) [2023-10-12 17:32:42,479][62635] Updated weights for policy 1, policy_version 42560 (0.0009) [2023-10-12 17:32:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87162880. Throughput: 0: 1680.0, 1: 1673.5. Samples: 21799074. 
Policy #0 lag: (min: 2.0, avg: 8.2, max: 34.0) [2023-10-12 17:32:43,435][61643] Avg episode reward: [(0, '20.660'), (1, '9.760')] [2023-10-12 17:32:43,559][62634] Updated weights for policy 0, policy_version 42570 (0.0007) [2023-10-12 17:32:43,935][62634] Updated weights for policy 0, policy_version 42580 (0.0008) [2023-10-12 17:32:44,308][62634] Updated weights for policy 0, policy_version 42590 (0.0009) [2023-10-12 17:32:46,488][62635] Updated weights for policy 1, policy_version 42570 (0.0010) [2023-10-12 17:32:46,852][62635] Updated weights for policy 1, policy_version 42580 (0.0009) [2023-10-12 17:32:47,231][62635] Updated weights for policy 1, policy_version 42590 (0.0010) [2023-10-12 17:32:48,322][62634] Updated weights for policy 0, policy_version 42600 (0.0008) [2023-10-12 17:32:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87228416. Throughput: 0: 1678.2, 1: 1672.9. Samples: 21819330. Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) [2023-10-12 17:32:48,435][61643] Avg episode reward: [(0, '20.770'), (1, '9.740')] [2023-10-12 17:32:48,701][62634] Updated weights for policy 0, policy_version 42610 (0.0010) [2023-10-12 17:32:49,080][62634] Updated weights for policy 0, policy_version 42620 (0.0009) [2023-10-12 17:32:51,348][62635] Updated weights for policy 1, policy_version 42600 (0.0008) [2023-10-12 17:32:51,712][62635] Updated weights for policy 1, policy_version 42610 (0.0007) [2023-10-12 17:32:52,071][62635] Updated weights for policy 1, policy_version 42620 (0.0009) [2023-10-12 17:32:53,089][62634] Updated weights for policy 0, policy_version 42630 (0.0007) [2023-10-12 17:32:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87293952. Throughput: 0: 1678.6, 1: 1687.8. Samples: 21829696. 
Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) [2023-10-12 17:32:53,435][61643] Avg episode reward: [(0, '20.840'), (1, '9.930')] [2023-10-12 17:32:53,467][62634] Updated weights for policy 0, policy_version 42640 (0.0009) [2023-10-12 17:32:53,844][62634] Updated weights for policy 0, policy_version 42650 (0.0011) [2023-10-12 17:32:56,358][62635] Updated weights for policy 1, policy_version 42630 (0.0008) [2023-10-12 17:32:56,721][62635] Updated weights for policy 1, policy_version 42640 (0.0010) [2023-10-12 17:32:57,084][62635] Updated weights for policy 1, policy_version 42650 (0.0008) [2023-10-12 17:32:57,753][62634] Updated weights for policy 0, policy_version 42660 (0.0010) [2023-10-12 17:32:58,136][62634] Updated weights for policy 0, policy_version 42670 (0.0008) [2023-10-12 17:32:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87359488. Throughput: 0: 1685.0, 1: 1661.3. Samples: 21849574. Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) [2023-10-12 17:32:58,435][61643] Avg episode reward: [(0, '20.760'), (1, '9.600')] [2023-10-12 17:32:58,521][62634] Updated weights for policy 0, policy_version 42680 (0.0007) [2023-10-12 17:33:01,174][62635] Updated weights for policy 1, policy_version 42660 (0.0008) [2023-10-12 17:33:01,535][62635] Updated weights for policy 1, policy_version 42670 (0.0007) [2023-10-12 17:33:01,906][62635] Updated weights for policy 1, policy_version 42680 (0.0009) [2023-10-12 17:33:02,643][62634] Updated weights for policy 0, policy_version 42690 (0.0007) [2023-10-12 17:33:03,013][62634] Updated weights for policy 0, policy_version 42700 (0.0007) [2023-10-12 17:33:03,393][62634] Updated weights for policy 0, policy_version 42710 (0.0007) [2023-10-12 17:33:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87425024. Throughput: 0: 1679.2, 1: 1674.3. Samples: 21869352. 
Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) [2023-10-12 17:33:03,436][61643] Avg episode reward: [(0, '20.810'), (1, '9.720')] [2023-10-12 17:33:03,761][62634] Updated weights for policy 0, policy_version 42720 (0.0008) [2023-10-12 17:33:06,043][62635] Updated weights for policy 1, policy_version 42690 (0.0010) [2023-10-12 17:33:06,422][62635] Updated weights for policy 1, policy_version 42700 (0.0009) [2023-10-12 17:33:06,791][62635] Updated weights for policy 1, policy_version 42710 (0.0007) [2023-10-12 17:33:07,160][62635] Updated weights for policy 1, policy_version 42720 (0.0007) [2023-10-12 17:33:07,884][62634] Updated weights for policy 0, policy_version 42730 (0.0009) [2023-10-12 17:33:08,257][62634] Updated weights for policy 0, policy_version 42740 (0.0012) [2023-10-12 17:33:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87490560. Throughput: 0: 1689.7, 1: 1677.3. Samples: 21880104. Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) [2023-10-12 17:33:08,435][61643] Avg episode reward: [(0, '20.810'), (1, '9.970')] [2023-10-12 17:33:08,642][62634] Updated weights for policy 0, policy_version 42750 (0.0010) [2023-10-12 17:33:11,225][62635] Updated weights for policy 1, policy_version 42730 (0.0007) [2023-10-12 17:33:11,597][62635] Updated weights for policy 1, policy_version 42740 (0.0007) [2023-10-12 17:33:11,967][62635] Updated weights for policy 1, policy_version 42750 (0.0010) [2023-10-12 17:33:12,757][62634] Updated weights for policy 0, policy_version 42760 (0.0008) [2023-10-12 17:33:13,143][62634] Updated weights for policy 0, policy_version 42770 (0.0007) [2023-10-12 17:33:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87556096. Throughput: 0: 1688.6, 1: 1658.6. Samples: 21899710. 
Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) [2023-10-12 17:33:13,435][61643] Avg episode reward: [(0, '20.730'), (1, '9.730')] [2023-10-12 17:33:13,525][62634] Updated weights for policy 0, policy_version 42780 (0.0008) [2023-10-12 17:33:16,066][62635] Updated weights for policy 1, policy_version 42760 (0.0010) [2023-10-12 17:33:16,452][62635] Updated weights for policy 1, policy_version 42770 (0.0007) [2023-10-12 17:33:16,815][62635] Updated weights for policy 1, policy_version 42780 (0.0007) [2023-10-12 17:33:17,540][62634] Updated weights for policy 0, policy_version 42790 (0.0008) [2023-10-12 17:33:17,916][62634] Updated weights for policy 0, policy_version 42800 (0.0007) [2023-10-12 17:33:18,288][62634] Updated weights for policy 0, policy_version 42810 (0.0007) [2023-10-12 17:33:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87621632. Throughput: 0: 1667.0, 1: 1683.4. Samples: 21919500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:33:18,435][61643] Avg episode reward: [(0, '20.970'), (1, '9.920')] [2023-10-12 17:33:20,900][62635] Updated weights for policy 1, policy_version 42790 (0.0007) [2023-10-12 17:33:21,270][62635] Updated weights for policy 1, policy_version 42800 (0.0007) [2023-10-12 17:33:21,647][62635] Updated weights for policy 1, policy_version 42810 (0.0008) [2023-10-12 17:33:22,363][62634] Updated weights for policy 0, policy_version 42820 (0.0009) [2023-10-12 17:33:22,733][62634] Updated weights for policy 0, policy_version 42830 (0.0007) [2023-10-12 17:33:23,107][62634] Updated weights for policy 0, policy_version 42840 (0.0007) [2023-10-12 17:33:23,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 87719936. Throughput: 0: 1680.5, 1: 1671.6. Samples: 21929892. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:33:23,436][61643] Avg episode reward: [(0, '20.920'), (1, '9.810')] [2023-10-12 17:33:25,699][62635] Updated weights for policy 1, policy_version 42820 (0.0010) [2023-10-12 17:33:26,073][62635] Updated weights for policy 1, policy_version 42830 (0.0010) [2023-10-12 17:33:26,445][62635] Updated weights for policy 1, policy_version 42840 (0.0011) [2023-10-12 17:33:27,252][62634] Updated weights for policy 0, policy_version 42850 (0.0008) [2023-10-12 17:33:27,641][62634] Updated weights for policy 0, policy_version 42860 (0.0010) [2023-10-12 17:33:28,015][62634] Updated weights for policy 0, policy_version 42870 (0.0008) [2023-10-12 17:33:28,383][62634] Updated weights for policy 0, policy_version 42880 (0.0008) [2023-10-12 17:33:28,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 87785472. Throughput: 0: 1679.6, 1: 1667.6. Samples: 21949698. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:33:28,435][61643] Avg episode reward: [(0, '21.030'), (1, '9.820')] [2023-10-12 17:33:30,374][62635] Updated weights for policy 1, policy_version 42850 (0.0008) [2023-10-12 17:33:30,739][62635] Updated weights for policy 1, policy_version 42860 (0.0011) [2023-10-12 17:33:31,109][62635] Updated weights for policy 1, policy_version 42870 (0.0010) [2023-10-12 17:33:31,474][62635] Updated weights for policy 1, policy_version 42880 (0.0009) [2023-10-12 17:33:32,462][62634] Updated weights for policy 0, policy_version 42890 (0.0007) [2023-10-12 17:33:32,841][62634] Updated weights for policy 0, policy_version 42900 (0.0007) [2023-10-12 17:33:33,225][62634] Updated weights for policy 0, policy_version 42910 (0.0007) [2023-10-12 17:33:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 87851008. Throughput: 0: 1652.6, 1: 1683.4. Samples: 21969450. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:33:33,435][61643] Avg episode reward: [(0, '21.110'), (1, '9.990')] [2023-10-12 17:33:35,556][62635] Updated weights for policy 1, policy_version 42890 (0.0011) [2023-10-12 17:33:35,927][62635] Updated weights for policy 1, policy_version 42900 (0.0008) [2023-10-12 17:33:36,309][62635] Updated weights for policy 1, policy_version 42910 (0.0010) [2023-10-12 17:33:37,243][62634] Updated weights for policy 0, policy_version 42920 (0.0010) [2023-10-12 17:33:37,614][62634] Updated weights for policy 0, policy_version 42930 (0.0009) [2023-10-12 17:33:38,002][62634] Updated weights for policy 0, policy_version 42940 (0.0010) [2023-10-12 17:33:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 87916544. Throughput: 0: 1678.1, 1: 1660.7. Samples: 21979942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:33:38,436][61643] Avg episode reward: [(0, '21.380'), (1, '10.010')] [2023-10-12 17:33:40,275][62635] Updated weights for policy 1, policy_version 42920 (0.0009) [2023-10-12 17:33:40,641][62635] Updated weights for policy 1, policy_version 42930 (0.0007) [2023-10-12 17:33:41,015][62635] Updated weights for policy 1, policy_version 42940 (0.0008) [2023-10-12 17:33:42,140][62634] Updated weights for policy 0, policy_version 42950 (0.0010) [2023-10-12 17:33:42,519][62634] Updated weights for policy 0, policy_version 42960 (0.0008) [2023-10-12 17:33:42,892][62634] Updated weights for policy 0, policy_version 42970 (0.0007) [2023-10-12 17:33:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 87982080. Throughput: 0: 1669.7, 1: 1677.2. Samples: 22000186. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:33:43,436][61643] Avg episode reward: [(0, '21.370'), (1, '9.780')] [2023-10-12 17:33:45,014][62635] Updated weights for policy 1, policy_version 42950 (0.0007) [2023-10-12 17:33:45,389][62635] Updated weights for policy 1, policy_version 42960 (0.0010) [2023-10-12 17:33:45,765][62635] Updated weights for policy 1, policy_version 42970 (0.0010) [2023-10-12 17:33:46,965][62634] Updated weights for policy 0, policy_version 42980 (0.0008) [2023-10-12 17:33:47,347][62634] Updated weights for policy 0, policy_version 42990 (0.0007) [2023-10-12 17:33:47,724][62634] Updated weights for policy 0, policy_version 43000 (0.0008) [2023-10-12 17:33:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 88047616. Throughput: 0: 1656.9, 1: 1687.1. Samples: 22019830. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:33:48,436][61643] Avg episode reward: [(0, '21.680'), (1, '9.870')] [2023-10-12 17:33:49,816][62635] Updated weights for policy 1, policy_version 42980 (0.0009) [2023-10-12 17:33:50,184][62635] Updated weights for policy 1, policy_version 42990 (0.0008) [2023-10-12 17:33:50,560][62635] Updated weights for policy 1, policy_version 43000 (0.0008) [2023-10-12 17:33:51,821][62634] Updated weights for policy 0, policy_version 43010 (0.0009) [2023-10-12 17:33:52,196][62634] Updated weights for policy 0, policy_version 43020 (0.0007) [2023-10-12 17:33:52,570][62634] Updated weights for policy 0, policy_version 43030 (0.0007) [2023-10-12 17:33:52,947][62634] Updated weights for policy 0, policy_version 43040 (0.0007) [2023-10-12 17:33:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88113152. Throughput: 0: 1671.2, 1: 1656.4. Samples: 22029848. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:33:53,436][61643] Avg episode reward: [(0, '22.100'), (1, '9.810')] [2023-10-12 17:33:53,437][62354] Saving new best policy, reward=22.100! [2023-10-12 17:33:54,666][62635] Updated weights for policy 1, policy_version 43010 (0.0008) [2023-10-12 17:33:55,040][62635] Updated weights for policy 1, policy_version 43020 (0.0010) [2023-10-12 17:33:55,399][62635] Updated weights for policy 1, policy_version 43030 (0.0010) [2023-10-12 17:33:55,775][62635] Updated weights for policy 1, policy_version 43040 (0.0010) [2023-10-12 17:33:56,920][62634] Updated weights for policy 0, policy_version 43050 (0.0010) [2023-10-12 17:33:57,288][62634] Updated weights for policy 0, policy_version 43060 (0.0011) [2023-10-12 17:33:57,661][62634] Updated weights for policy 0, policy_version 43070 (0.0009) [2023-10-12 17:33:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88178688. Throughput: 0: 1665.5, 1: 1677.3. Samples: 22050136. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:33:58,436][61643] Avg episode reward: [(0, '22.130'), (1, '9.640')] [2023-10-12 17:33:58,437][62354] Saving new best policy, reward=22.130! [2023-10-12 17:33:59,849][62635] Updated weights for policy 1, policy_version 43050 (0.0008) [2023-10-12 17:34:00,223][62635] Updated weights for policy 1, policy_version 43060 (0.0008) [2023-10-12 17:34:00,598][62635] Updated weights for policy 1, policy_version 43070 (0.0009) [2023-10-12 17:34:01,647][62634] Updated weights for policy 0, policy_version 43080 (0.0007) [2023-10-12 17:34:02,034][62634] Updated weights for policy 0, policy_version 43090 (0.0008) [2023-10-12 17:34:02,404][62634] Updated weights for policy 0, policy_version 43100 (0.0009) [2023-10-12 17:34:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88244224. Throughput: 0: 1663.2, 1: 1680.4. Samples: 22069966. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:34:03,435][61643] Avg episode reward: [(0, '22.150'), (1, '9.820')] [2023-10-12 17:34:03,445][62354] Saving new best policy, reward=22.150! [2023-10-12 17:34:04,723][62635] Updated weights for policy 1, policy_version 43080 (0.0009) [2023-10-12 17:34:05,084][62635] Updated weights for policy 1, policy_version 43090 (0.0008) [2023-10-12 17:34:05,461][62635] Updated weights for policy 1, policy_version 43100 (0.0009) [2023-10-12 17:34:06,459][62634] Updated weights for policy 0, policy_version 43110 (0.0010) [2023-10-12 17:34:06,846][62634] Updated weights for policy 0, policy_version 43120 (0.0009) [2023-10-12 17:34:07,216][62634] Updated weights for policy 0, policy_version 43130 (0.0008) [2023-10-12 17:34:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88309760. Throughput: 0: 1683.2, 1: 1659.2. Samples: 22080296. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:34:08,435][61643] Avg episode reward: [(0, '22.220'), (1, '9.640')] [2023-10-12 17:34:08,436][62354] Saving new best policy, reward=22.220! [2023-10-12 17:34:09,540][62635] Updated weights for policy 1, policy_version 43110 (0.0008) [2023-10-12 17:34:09,908][62635] Updated weights for policy 1, policy_version 43120 (0.0010) [2023-10-12 17:34:10,276][62635] Updated weights for policy 1, policy_version 43130 (0.0011) [2023-10-12 17:34:11,294][62634] Updated weights for policy 0, policy_version 43140 (0.0008) [2023-10-12 17:34:11,671][62634] Updated weights for policy 0, policy_version 43150 (0.0008) [2023-10-12 17:34:12,036][62634] Updated weights for policy 0, policy_version 43160 (0.0008) [2023-10-12 17:34:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88375296. Throughput: 0: 1666.5, 1: 1681.6. Samples: 22100364. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 17:34:13,435][61643] Avg episode reward: [(0, '21.990'), (1, '9.680')] [2023-10-12 17:34:14,519][62635] Updated weights for policy 1, policy_version 43140 (0.0009) [2023-10-12 17:34:14,893][62635] Updated weights for policy 1, policy_version 43150 (0.0007) [2023-10-12 17:34:15,253][62635] Updated weights for policy 1, policy_version 43160 (0.0008) [2023-10-12 17:34:16,055][62634] Updated weights for policy 0, policy_version 43170 (0.0008) [2023-10-12 17:34:16,446][62634] Updated weights for policy 0, policy_version 43180 (0.0009) [2023-10-12 17:34:16,828][62634] Updated weights for policy 0, policy_version 43190 (0.0008) [2023-10-12 17:34:17,198][62634] Updated weights for policy 0, policy_version 43200 (0.0009) [2023-10-12 17:34:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88440832. Throughput: 0: 1681.4, 1: 1678.5. Samples: 22120646. Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 17:34:18,436][61643] Avg episode reward: [(0, '22.000'), (1, '9.870')] [2023-10-12 17:34:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000043200_44236800.pth... [2023-10-12 17:34:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000043168_44204032.pth... 
[2023-10-12 17:34:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000041600_42598400.pth [2023-10-12 17:34:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000041632_42631168.pth [2023-10-12 17:34:19,356][62635] Updated weights for policy 1, policy_version 43170 (0.0009) [2023-10-12 17:34:19,725][62635] Updated weights for policy 1, policy_version 43180 (0.0008) [2023-10-12 17:34:20,091][62635] Updated weights for policy 1, policy_version 43190 (0.0008) [2023-10-12 17:34:20,462][62635] Updated weights for policy 1, policy_version 43200 (0.0008) [2023-10-12 17:34:21,204][62634] Updated weights for policy 0, policy_version 43210 (0.0010) [2023-10-12 17:34:21,590][62634] Updated weights for policy 0, policy_version 43220 (0.0010) [2023-10-12 17:34:21,965][62634] Updated weights for policy 0, policy_version 43230 (0.0010) [2023-10-12 17:34:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88506368. Throughput: 0: 1686.2, 1: 1668.9. Samples: 22130922. Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 17:34:23,435][61643] Avg episode reward: [(0, '22.030'), (1, '9.800')] [2023-10-12 17:34:24,561][62635] Updated weights for policy 1, policy_version 43210 (0.0008) [2023-10-12 17:34:24,930][62635] Updated weights for policy 1, policy_version 43220 (0.0007) [2023-10-12 17:34:25,304][62635] Updated weights for policy 1, policy_version 43230 (0.0007) [2023-10-12 17:34:26,002][62634] Updated weights for policy 0, policy_version 43240 (0.0009) [2023-10-12 17:34:26,377][62634] Updated weights for policy 0, policy_version 43250 (0.0009) [2023-10-12 17:34:26,757][62634] Updated weights for policy 0, policy_version 43260 (0.0007) [2023-10-12 17:34:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88571904. Throughput: 0: 1659.9, 1: 1675.9. Samples: 22150298. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 17:34:28,435][61643] Avg episode reward: [(0, '22.260'), (1, '10.120')] [2023-10-12 17:34:28,436][62354] Saving new best policy, reward=22.260! [2023-10-12 17:34:28,436][62495] Saving new best policy, reward=10.120! [2023-10-12 17:34:29,499][62635] Updated weights for policy 1, policy_version 43240 (0.0007) [2023-10-12 17:34:29,856][62635] Updated weights for policy 1, policy_version 43250 (0.0009) [2023-10-12 17:34:30,234][62635] Updated weights for policy 1, policy_version 43260 (0.0008) [2023-10-12 17:34:30,744][62634] Updated weights for policy 0, policy_version 43270 (0.0008) [2023-10-12 17:34:31,132][62634] Updated weights for policy 0, policy_version 43280 (0.0009) [2023-10-12 17:34:31,501][62634] Updated weights for policy 0, policy_version 43290 (0.0008) [2023-10-12 17:34:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88637440. Throughput: 0: 1690.2, 1: 1672.9. Samples: 22171168. Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 17:34:33,435][61643] Avg episode reward: [(0, '21.980'), (1, '10.140')] [2023-10-12 17:34:33,443][62495] Saving new best policy, reward=10.140! [2023-10-12 17:34:34,249][62635] Updated weights for policy 1, policy_version 43270 (0.0008) [2023-10-12 17:34:34,625][62635] Updated weights for policy 1, policy_version 43280 (0.0008) [2023-10-12 17:34:34,999][62635] Updated weights for policy 1, policy_version 43290 (0.0007) [2023-10-12 17:34:35,437][62634] Updated weights for policy 0, policy_version 43300 (0.0007) [2023-10-12 17:34:35,813][62634] Updated weights for policy 0, policy_version 43310 (0.0011) [2023-10-12 17:34:36,195][62634] Updated weights for policy 0, policy_version 43320 (0.0008) [2023-10-12 17:34:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88702976. Throughput: 0: 1681.3, 1: 1675.4. Samples: 22180898. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) [2023-10-12 17:34:38,436][61643] Avg episode reward: [(0, '22.070'), (1, '9.870')] [2023-10-12 17:34:39,036][62635] Updated weights for policy 1, policy_version 43300 (0.0007) [2023-10-12 17:34:39,401][62635] Updated weights for policy 1, policy_version 43310 (0.0007) [2023-10-12 17:34:39,761][62635] Updated weights for policy 1, policy_version 43320 (0.0007) [2023-10-12 17:34:40,226][62634] Updated weights for policy 0, policy_version 43330 (0.0009) [2023-10-12 17:34:40,601][62634] Updated weights for policy 0, policy_version 43340 (0.0008) [2023-10-12 17:34:40,978][62634] Updated weights for policy 0, policy_version 43350 (0.0008) [2023-10-12 17:34:41,351][62634] Updated weights for policy 0, policy_version 43360 (0.0007) [2023-10-12 17:34:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88768512. Throughput: 0: 1674.1, 1: 1678.8. Samples: 22201014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:34:43,436][61643] Avg episode reward: [(0, '22.010'), (1, '10.050')] [2023-10-12 17:34:43,858][62635] Updated weights for policy 1, policy_version 43330 (0.0007) [2023-10-12 17:34:44,231][62635] Updated weights for policy 1, policy_version 43340 (0.0007) [2023-10-12 17:34:44,596][62635] Updated weights for policy 1, policy_version 43350 (0.0008) [2023-10-12 17:34:44,971][62635] Updated weights for policy 1, policy_version 43360 (0.0007) [2023-10-12 17:34:45,466][62634] Updated weights for policy 0, policy_version 43370 (0.0007) [2023-10-12 17:34:45,848][62634] Updated weights for policy 0, policy_version 43380 (0.0007) [2023-10-12 17:34:46,225][62634] Updated weights for policy 0, policy_version 43390 (0.0011) [2023-10-12 17:34:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 88834048. Throughput: 0: 1696.2, 1: 1676.3. Samples: 22221728. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:34:48,436][61643] Avg episode reward: [(0, '21.800'), (1, '10.020')] [2023-10-12 17:34:48,986][62635] Updated weights for policy 1, policy_version 43370 (0.0008) [2023-10-12 17:34:49,354][62635] Updated weights for policy 1, policy_version 43380 (0.0010) [2023-10-12 17:34:49,721][62635] Updated weights for policy 1, policy_version 43390 (0.0008) [2023-10-12 17:34:50,382][62634] Updated weights for policy 0, policy_version 43400 (0.0009) [2023-10-12 17:34:50,758][62634] Updated weights for policy 0, policy_version 43410 (0.0009) [2023-10-12 17:34:51,142][62634] Updated weights for policy 0, policy_version 43420 (0.0008) [2023-10-12 17:34:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 88899584. Throughput: 0: 1673.1, 1: 1677.8. Samples: 22231086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:34:53,435][61643] Avg episode reward: [(0, '21.800'), (1, '9.940')] [2023-10-12 17:34:53,855][62635] Updated weights for policy 1, policy_version 43400 (0.0010) [2023-10-12 17:34:54,224][62635] Updated weights for policy 1, policy_version 43410 (0.0009) [2023-10-12 17:34:54,590][62635] Updated weights for policy 1, policy_version 43420 (0.0007) [2023-10-12 17:34:55,008][62634] Updated weights for policy 0, policy_version 43430 (0.0009) [2023-10-12 17:34:55,390][62634] Updated weights for policy 0, policy_version 43440 (0.0010) [2023-10-12 17:34:55,771][62634] Updated weights for policy 0, policy_version 43450 (0.0008) [2023-10-12 17:34:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 88965120. Throughput: 0: 1682.7, 1: 1671.5. Samples: 22251300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:34:58,436][61643] Avg episode reward: [(0, '21.760'), (1, '10.060')] [2023-10-12 17:34:58,636][62635] Updated weights for policy 1, policy_version 43430 (0.0010) [2023-10-12 17:34:59,007][62635] Updated weights for policy 1, policy_version 43440 (0.0009) [2023-10-12 17:34:59,376][62635] Updated weights for policy 1, policy_version 43450 (0.0009) [2023-10-12 17:34:59,803][62634] Updated weights for policy 0, policy_version 43460 (0.0010) [2023-10-12 17:35:00,175][62634] Updated weights for policy 0, policy_version 43470 (0.0009) [2023-10-12 17:35:00,551][62634] Updated weights for policy 0, policy_version 43480 (0.0007) [2023-10-12 17:35:03,375][62635] Updated weights for policy 1, policy_version 43460 (0.0010) [2023-10-12 17:35:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 89030656. Throughput: 0: 1692.0, 1: 1674.1. Samples: 22272120. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:35:03,436][61643] Avg episode reward: [(0, '21.650'), (1, '9.980')] [2023-10-12 17:35:03,739][62635] Updated weights for policy 1, policy_version 43470 (0.0011) [2023-10-12 17:35:04,121][62635] Updated weights for policy 1, policy_version 43480 (0.0009) [2023-10-12 17:35:04,769][62634] Updated weights for policy 0, policy_version 43490 (0.0007) [2023-10-12 17:35:05,176][62634] Updated weights for policy 0, policy_version 43500 (0.0010) [2023-10-12 17:35:05,545][62634] Updated weights for policy 0, policy_version 43510 (0.0009) [2023-10-12 17:35:05,922][62634] Updated weights for policy 0, policy_version 43520 (0.0010) [2023-10-12 17:35:08,256][62635] Updated weights for policy 1, policy_version 43490 (0.0009) [2023-10-12 17:35:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89096192. Throughput: 0: 1662.4, 1: 1673.6. Samples: 22281042. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:35:08,435][61643] Avg episode reward: [(0, '21.330'), (1, '9.800')] [2023-10-12 17:35:08,633][62635] Updated weights for policy 1, policy_version 43500 (0.0009) [2023-10-12 17:35:08,992][62635] Updated weights for policy 1, policy_version 43510 (0.0007) [2023-10-12 17:35:09,366][62635] Updated weights for policy 1, policy_version 43520 (0.0007) [2023-10-12 17:35:10,020][62634] Updated weights for policy 0, policy_version 43530 (0.0010) [2023-10-12 17:35:10,385][62634] Updated weights for policy 0, policy_version 43540 (0.0009) [2023-10-12 17:35:10,770][62634] Updated weights for policy 0, policy_version 43550 (0.0007) [2023-10-12 17:35:13,364][62635] Updated weights for policy 1, policy_version 43530 (0.0009) [2023-10-12 17:35:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89161728. Throughput: 0: 1690.4, 1: 1677.2. Samples: 22301844. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:35:13,435][61643] Avg episode reward: [(0, '21.100'), (1, '9.730')] [2023-10-12 17:35:13,738][62635] Updated weights for policy 1, policy_version 43540 (0.0011) [2023-10-12 17:35:14,096][62635] Updated weights for policy 1, policy_version 43550 (0.0010) [2023-10-12 17:35:14,650][62634] Updated weights for policy 0, policy_version 43560 (0.0009) [2023-10-12 17:35:15,037][62634] Updated weights for policy 0, policy_version 43570 (0.0009) [2023-10-12 17:35:15,410][62634] Updated weights for policy 0, policy_version 43580 (0.0007) [2023-10-12 17:35:18,277][62635] Updated weights for policy 1, policy_version 43560 (0.0009) [2023-10-12 17:35:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89227264. Throughput: 0: 1683.7, 1: 1680.5. Samples: 22322560. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:35:18,435][61643] Avg episode reward: [(0, '20.950'), (1, '9.670')] [2023-10-12 17:35:18,647][62635] Updated weights for policy 1, policy_version 43570 (0.0008) [2023-10-12 17:35:19,010][62635] Updated weights for policy 1, policy_version 43580 (0.0009) [2023-10-12 17:35:19,527][62634] Updated weights for policy 0, policy_version 43590 (0.0008) [2023-10-12 17:35:19,902][62634] Updated weights for policy 0, policy_version 43600 (0.0009) [2023-10-12 17:35:20,263][62634] Updated weights for policy 0, policy_version 43610 (0.0011) [2023-10-12 17:35:23,082][62635] Updated weights for policy 1, policy_version 43590 (0.0008) [2023-10-12 17:35:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89292800. Throughput: 0: 1666.9, 1: 1686.7. Samples: 22331812. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:35:23,435][61643] Avg episode reward: [(0, '21.210'), (1, '9.690')] [2023-10-12 17:35:23,450][62635] Updated weights for policy 1, policy_version 43600 (0.0008) [2023-10-12 17:35:23,828][62635] Updated weights for policy 1, policy_version 43610 (0.0009) [2023-10-12 17:35:24,374][62634] Updated weights for policy 0, policy_version 43620 (0.0009) [2023-10-12 17:35:24,741][62634] Updated weights for policy 0, policy_version 43630 (0.0008) [2023-10-12 17:35:25,116][62634] Updated weights for policy 0, policy_version 43640 (0.0010) [2023-10-12 17:35:27,881][62635] Updated weights for policy 1, policy_version 43620 (0.0009) [2023-10-12 17:35:28,248][62635] Updated weights for policy 1, policy_version 43630 (0.0009) [2023-10-12 17:35:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89358336. Throughput: 0: 1674.5, 1: 1688.2. Samples: 22352332. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:35:28,435][61643] Avg episode reward: [(0, '20.920'), (1, '9.650')] [2023-10-12 17:35:28,611][62635] Updated weights for policy 1, policy_version 43640 (0.0007) [2023-10-12 17:35:29,276][62634] Updated weights for policy 0, policy_version 43650 (0.0010) [2023-10-12 17:35:29,654][62634] Updated weights for policy 0, policy_version 43660 (0.0009) [2023-10-12 17:35:30,031][62634] Updated weights for policy 0, policy_version 43670 (0.0009) [2023-10-12 17:35:30,407][62634] Updated weights for policy 0, policy_version 43680 (0.0008) [2023-10-12 17:35:32,617][62635] Updated weights for policy 1, policy_version 43650 (0.0007) [2023-10-12 17:35:32,990][62635] Updated weights for policy 1, policy_version 43660 (0.0009) [2023-10-12 17:35:33,358][62635] Updated weights for policy 1, policy_version 43670 (0.0009) [2023-10-12 17:35:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89423872. Throughput: 0: 1672.9, 1: 1678.1. Samples: 22372518. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:35:33,435][61643] Avg episode reward: [(0, '20.950'), (1, '9.720')] [2023-10-12 17:35:33,728][62635] Updated weights for policy 1, policy_version 43680 (0.0008) [2023-10-12 17:35:34,543][62634] Updated weights for policy 0, policy_version 43690 (0.0007) [2023-10-12 17:35:34,916][62634] Updated weights for policy 0, policy_version 43700 (0.0010) [2023-10-12 17:35:35,298][62634] Updated weights for policy 0, policy_version 43710 (0.0010) [2023-10-12 17:35:37,833][62635] Updated weights for policy 1, policy_version 43690 (0.0009) [2023-10-12 17:35:38,210][62635] Updated weights for policy 1, policy_version 43700 (0.0011) [2023-10-12 17:35:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 89489408. Throughput: 0: 1662.1, 1: 1689.6. Samples: 22381914. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-12 17:35:38,435][61643] Avg episode reward: [(0, '20.960'), (1, '9.650')] [2023-10-12 17:35:38,585][62635] Updated weights for policy 1, policy_version 43710 (0.0011) [2023-10-12 17:35:39,573][62634] Updated weights for policy 0, policy_version 43720 (0.0012) [2023-10-12 17:35:39,952][62634] Updated weights for policy 0, policy_version 43730 (0.0008) [2023-10-12 17:35:40,328][62634] Updated weights for policy 0, policy_version 43740 (0.0011) [2023-10-12 17:35:42,745][62635] Updated weights for policy 1, policy_version 43720 (0.0008) [2023-10-12 17:35:43,133][62635] Updated weights for policy 1, policy_version 43730 (0.0007) [2023-10-12 17:35:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89554944. Throughput: 0: 1666.2, 1: 1688.3. Samples: 22402252. Policy #0 lag: (min: 8.0, avg: 31.7, max: 40.0) [2023-10-12 17:35:43,435][61643] Avg episode reward: [(0, '21.220'), (1, '9.720')] [2023-10-12 17:35:43,495][62635] Updated weights for policy 1, policy_version 43740 (0.0009) [2023-10-12 17:35:44,411][62634] Updated weights for policy 0, policy_version 43750 (0.0010) [2023-10-12 17:35:44,793][62634] Updated weights for policy 0, policy_version 43760 (0.0008) [2023-10-12 17:35:45,165][62634] Updated weights for policy 0, policy_version 43770 (0.0009) [2023-10-12 17:35:47,548][62635] Updated weights for policy 1, policy_version 43750 (0.0009) [2023-10-12 17:35:47,914][62635] Updated weights for policy 1, policy_version 43760 (0.0009) [2023-10-12 17:35:48,278][62635] Updated weights for policy 1, policy_version 43770 (0.0009) [2023-10-12 17:35:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 89620480. Throughput: 0: 1669.8, 1: 1665.2. Samples: 22422192. 
Policy #0 lag: (min: 8.0, avg: 31.7, max: 40.0) [2023-10-12 17:35:48,435][61643] Avg episode reward: [(0, '20.930'), (1, '9.620')] [2023-10-12 17:35:49,281][62634] Updated weights for policy 0, policy_version 43780 (0.0009) [2023-10-12 17:35:49,668][62634] Updated weights for policy 0, policy_version 43790 (0.0009) [2023-10-12 17:35:50,033][62634] Updated weights for policy 0, policy_version 43800 (0.0010) [2023-10-12 17:35:52,275][62635] Updated weights for policy 1, policy_version 43780 (0.0008) [2023-10-12 17:35:52,642][62635] Updated weights for policy 1, policy_version 43790 (0.0008) [2023-10-12 17:35:53,019][62635] Updated weights for policy 1, policy_version 43800 (0.0010) [2023-10-12 17:35:53,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 89718784. Throughput: 0: 1670.9, 1: 1685.5. Samples: 22432080. Policy #0 lag: (min: 8.0, avg: 31.7, max: 40.0) [2023-10-12 17:35:53,436][61643] Avg episode reward: [(0, '21.000'), (1, '9.880')] [2023-10-12 17:35:54,156][62634] Updated weights for policy 0, policy_version 43810 (0.0007) [2023-10-12 17:35:54,537][62634] Updated weights for policy 0, policy_version 43820 (0.0008) [2023-10-12 17:35:54,913][62634] Updated weights for policy 0, policy_version 43830 (0.0009) [2023-10-12 17:35:55,282][62634] Updated weights for policy 0, policy_version 43840 (0.0009) [2023-10-12 17:35:56,925][62635] Updated weights for policy 1, policy_version 43810 (0.0010) [2023-10-12 17:35:57,289][62635] Updated weights for policy 1, policy_version 43820 (0.0010) [2023-10-12 17:35:57,657][62635] Updated weights for policy 1, policy_version 43830 (0.0008) [2023-10-12 17:35:58,024][62635] Updated weights for policy 1, policy_version 43840 (0.0007) [2023-10-12 17:35:58,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89784320. Throughput: 0: 1671.1, 1: 1683.5. Samples: 22452798. 
Policy #0 lag: (min: 8.0, avg: 31.7, max: 40.0) [2023-10-12 17:35:58,436][61643] Avg episode reward: [(0, '21.100'), (1, '9.870')] [2023-10-12 17:35:59,242][62634] Updated weights for policy 0, policy_version 43850 (0.0010) [2023-10-12 17:35:59,623][62634] Updated weights for policy 0, policy_version 43860 (0.0008) [2023-10-12 17:36:00,000][62634] Updated weights for policy 0, policy_version 43870 (0.0008) [2023-10-12 17:36:02,156][62635] Updated weights for policy 1, policy_version 43850 (0.0010) [2023-10-12 17:36:02,509][62635] Updated weights for policy 1, policy_version 43860 (0.0010) [2023-10-12 17:36:02,892][62635] Updated weights for policy 1, policy_version 43870 (0.0009) [2023-10-12 17:36:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89849856. Throughput: 0: 1678.6, 1: 1658.3. Samples: 22472720. Policy #0 lag: (min: 8.0, avg: 31.7, max: 40.0) [2023-10-12 17:36:03,436][61643] Avg episode reward: [(0, '20.750'), (1, '9.780')] [2023-10-12 17:36:03,860][62634] Updated weights for policy 0, policy_version 43880 (0.0007) [2023-10-12 17:36:04,243][62634] Updated weights for policy 0, policy_version 43890 (0.0009) [2023-10-12 17:36:04,609][62634] Updated weights for policy 0, policy_version 43900 (0.0007) [2023-10-12 17:36:07,080][62635] Updated weights for policy 1, policy_version 43880 (0.0009) [2023-10-12 17:36:07,444][62635] Updated weights for policy 1, policy_version 43890 (0.0011) [2023-10-12 17:36:07,814][62635] Updated weights for policy 1, policy_version 43900 (0.0009) [2023-10-12 17:36:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 89915392. Throughput: 0: 1681.8, 1: 1677.6. Samples: 22482982. 
Policy #0 lag: (min: 8.0, avg: 31.7, max: 40.0) [2023-10-12 17:36:08,435][61643] Avg episode reward: [(0, '20.720'), (1, '9.930')] [2023-10-12 17:36:08,542][62634] Updated weights for policy 0, policy_version 43910 (0.0007) [2023-10-12 17:36:08,912][62634] Updated weights for policy 0, policy_version 43920 (0.0007) [2023-10-12 17:36:09,293][62634] Updated weights for policy 0, policy_version 43930 (0.0008) [2023-10-12 17:36:11,944][62635] Updated weights for policy 1, policy_version 43910 (0.0009) [2023-10-12 17:36:12,312][62635] Updated weights for policy 1, policy_version 43920 (0.0007) [2023-10-12 17:36:12,687][62635] Updated weights for policy 1, policy_version 43930 (0.0007) [2023-10-12 17:36:13,269][62634] Updated weights for policy 0, policy_version 43940 (0.0009) [2023-10-12 17:36:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 89980928. Throughput: 0: 1687.9, 1: 1671.4. Samples: 22503500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:13,435][61643] Avg episode reward: [(0, '20.840'), (1, '9.720')] [2023-10-12 17:36:13,652][62634] Updated weights for policy 0, policy_version 43950 (0.0009) [2023-10-12 17:36:14,019][62634] Updated weights for policy 0, policy_version 43960 (0.0008) [2023-10-12 17:36:16,785][62635] Updated weights for policy 1, policy_version 43940 (0.0007) [2023-10-12 17:36:17,147][62635] Updated weights for policy 1, policy_version 43950 (0.0009) [2023-10-12 17:36:17,526][62635] Updated weights for policy 1, policy_version 43960 (0.0009) [2023-10-12 17:36:18,145][62634] Updated weights for policy 0, policy_version 43970 (0.0007) [2023-10-12 17:36:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90046464. Throughput: 0: 1688.0, 1: 1661.4. Samples: 22523240. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:18,435][61643] Avg episode reward: [(0, '20.940'), (1, '9.730')] [2023-10-12 17:36:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000043968_45023232.pth... [2023-10-12 17:36:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000042400_43417600.pth [2023-10-12 17:36:18,529][62634] Updated weights for policy 0, policy_version 43980 (0.0008) [2023-10-12 17:36:18,905][62634] Updated weights for policy 0, policy_version 43990 (0.0007) [2023-10-12 17:36:19,281][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000044000_45056000.pth... [2023-10-12 17:36:19,281][62634] Updated weights for policy 0, policy_version 44000 (0.0007) [2023-10-12 17:36:19,310][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000042400_43417600.pth [2023-10-12 17:36:21,634][62635] Updated weights for policy 1, policy_version 43970 (0.0008) [2023-10-12 17:36:22,000][62635] Updated weights for policy 1, policy_version 43980 (0.0008) [2023-10-12 17:36:22,362][62635] Updated weights for policy 1, policy_version 43990 (0.0007) [2023-10-12 17:36:22,736][62635] Updated weights for policy 1, policy_version 44000 (0.0008) [2023-10-12 17:36:23,246][62634] Updated weights for policy 0, policy_version 44010 (0.0007) [2023-10-12 17:36:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90112000. Throughput: 0: 1691.8, 1: 1678.0. Samples: 22533552. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:23,435][61643] Avg episode reward: [(0, '20.870'), (1, '9.820')] [2023-10-12 17:36:23,608][62634] Updated weights for policy 0, policy_version 44020 (0.0008) [2023-10-12 17:36:23,987][62634] Updated weights for policy 0, policy_version 44030 (0.0007) [2023-10-12 17:36:26,860][62635] Updated weights for policy 1, policy_version 44010 (0.0008) [2023-10-12 17:36:27,236][62635] Updated weights for policy 1, policy_version 44020 (0.0007) [2023-10-12 17:36:27,595][62635] Updated weights for policy 1, policy_version 44030 (0.0009) [2023-10-12 17:36:28,016][62634] Updated weights for policy 0, policy_version 44040 (0.0007) [2023-10-12 17:36:28,399][62634] Updated weights for policy 0, policy_version 44050 (0.0007) [2023-10-12 17:36:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90177536. Throughput: 0: 1697.7, 1: 1669.3. Samples: 22553766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:28,435][61643] Avg episode reward: [(0, '21.080'), (1, '9.710')] [2023-10-12 17:36:28,771][62634] Updated weights for policy 0, policy_version 44060 (0.0008) [2023-10-12 17:36:31,637][62635] Updated weights for policy 1, policy_version 44040 (0.0010) [2023-10-12 17:36:32,005][62635] Updated weights for policy 1, policy_version 44050 (0.0010) [2023-10-12 17:36:32,375][62635] Updated weights for policy 1, policy_version 44060 (0.0010) [2023-10-12 17:36:33,056][62634] Updated weights for policy 0, policy_version 44070 (0.0008) [2023-10-12 17:36:33,430][62634] Updated weights for policy 0, policy_version 44080 (0.0010) [2023-10-12 17:36:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90243072. Throughput: 0: 1692.0, 1: 1672.4. Samples: 22573594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:33,435][61643] Avg episode reward: [(0, '21.070'), (1, '9.730')] [2023-10-12 17:36:33,801][62634] Updated weights for policy 0, policy_version 44090 (0.0010) [2023-10-12 17:36:36,570][62635] Updated weights for policy 1, policy_version 44070 (0.0009) [2023-10-12 17:36:36,938][62635] Updated weights for policy 1, policy_version 44080 (0.0008) [2023-10-12 17:36:37,312][62635] Updated weights for policy 1, policy_version 44090 (0.0008) [2023-10-12 17:36:37,805][62634] Updated weights for policy 0, policy_version 44100 (0.0007) [2023-10-12 17:36:38,191][62634] Updated weights for policy 0, policy_version 44110 (0.0007) [2023-10-12 17:36:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90308608. Throughput: 0: 1695.8, 1: 1682.5. Samples: 22584102. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:38,436][61643] Avg episode reward: [(0, '21.480'), (1, '9.480')] [2023-10-12 17:36:38,576][62634] Updated weights for policy 0, policy_version 44120 (0.0007) [2023-10-12 17:36:41,300][62635] Updated weights for policy 1, policy_version 44100 (0.0012) [2023-10-12 17:36:41,675][62635] Updated weights for policy 1, policy_version 44110 (0.0010) [2023-10-12 17:36:42,042][62635] Updated weights for policy 1, policy_version 44120 (0.0008) [2023-10-12 17:36:42,455][62634] Updated weights for policy 0, policy_version 44130 (0.0008) [2023-10-12 17:36:42,828][62634] Updated weights for policy 0, policy_version 44140 (0.0007) [2023-10-12 17:36:43,208][62634] Updated weights for policy 0, policy_version 44150 (0.0007) [2023-10-12 17:36:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90374144. Throughput: 0: 1695.8, 1: 1660.0. Samples: 22603810. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:43,435][61643] Avg episode reward: [(0, '21.430'), (1, '9.800')] [2023-10-12 17:36:43,582][62634] Updated weights for policy 0, policy_version 44160 (0.0009) [2023-10-12 17:36:46,014][62635] Updated weights for policy 1, policy_version 44130 (0.0008) [2023-10-12 17:36:46,377][62635] Updated weights for policy 1, policy_version 44140 (0.0008) [2023-10-12 17:36:46,742][62635] Updated weights for policy 1, policy_version 44150 (0.0007) [2023-10-12 17:36:47,113][62635] Updated weights for policy 1, policy_version 44160 (0.0009) [2023-10-12 17:36:47,552][62634] Updated weights for policy 0, policy_version 44170 (0.0009) [2023-10-12 17:36:47,936][62634] Updated weights for policy 0, policy_version 44180 (0.0008) [2023-10-12 17:36:48,315][62634] Updated weights for policy 0, policy_version 44190 (0.0009) [2023-10-12 17:36:48,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 90472448. Throughput: 0: 1678.3, 1: 1679.5. Samples: 22623820. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:48,435][61643] Avg episode reward: [(0, '21.390'), (1, '9.750')] [2023-10-12 17:36:51,193][62635] Updated weights for policy 1, policy_version 44170 (0.0008) [2023-10-12 17:36:51,566][62635] Updated weights for policy 1, policy_version 44180 (0.0007) [2023-10-12 17:36:51,935][62635] Updated weights for policy 1, policy_version 44190 (0.0007) [2023-10-12 17:36:52,034][62634] Updated weights for policy 0, policy_version 44200 (0.0009) [2023-10-12 17:36:52,407][62634] Updated weights for policy 0, policy_version 44210 (0.0008) [2023-10-12 17:36:52,785][62634] Updated weights for policy 0, policy_version 44220 (0.0009) [2023-10-12 17:36:53,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90537984. Throughput: 0: 1694.4, 1: 1674.8. Samples: 22634596. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:53,436][61643] Avg episode reward: [(0, '21.530'), (1, '9.790')] [2023-10-12 17:36:56,238][62635] Updated weights for policy 1, policy_version 44200 (0.0010) [2023-10-12 17:36:56,610][62635] Updated weights for policy 1, policy_version 44210 (0.0008) [2023-10-12 17:36:56,858][62634] Updated weights for policy 0, policy_version 44230 (0.0009) [2023-10-12 17:36:56,983][62635] Updated weights for policy 1, policy_version 44220 (0.0009) [2023-10-12 17:36:57,231][62634] Updated weights for policy 0, policy_version 44240 (0.0008) [2023-10-12 17:36:57,615][62634] Updated weights for policy 0, policy_version 44250 (0.0010) [2023-10-12 17:36:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90603520. Throughput: 0: 1691.2, 1: 1655.0. Samples: 22654082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:36:58,436][61643] Avg episode reward: [(0, '21.660'), (1, '9.880')] [2023-10-12 17:37:00,951][62635] Updated weights for policy 1, policy_version 44230 (0.0007) [2023-10-12 17:37:01,322][62635] Updated weights for policy 1, policy_version 44240 (0.0008) [2023-10-12 17:37:01,690][62635] Updated weights for policy 1, policy_version 44250 (0.0007) [2023-10-12 17:37:01,760][62634] Updated weights for policy 0, policy_version 44260 (0.0009) [2023-10-12 17:37:02,138][62634] Updated weights for policy 0, policy_version 44270 (0.0008) [2023-10-12 17:37:02,515][62634] Updated weights for policy 0, policy_version 44280 (0.0010) [2023-10-12 17:37:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 90669056. Throughput: 0: 1667.3, 1: 1675.7. Samples: 22673676. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:37:03,435][61643] Avg episode reward: [(0, '21.910'), (1, '9.690')] [2023-10-12 17:37:05,899][62635] Updated weights for policy 1, policy_version 44260 (0.0010) [2023-10-12 17:37:06,264][62635] Updated weights for policy 1, policy_version 44270 (0.0009) [2023-10-12 17:37:06,572][62634] Updated weights for policy 0, policy_version 44290 (0.0010) [2023-10-12 17:37:06,634][62635] Updated weights for policy 1, policy_version 44280 (0.0010) [2023-10-12 17:37:06,936][62634] Updated weights for policy 0, policy_version 44300 (0.0009) [2023-10-12 17:37:07,310][62634] Updated weights for policy 0, policy_version 44310 (0.0009) [2023-10-12 17:37:07,689][62634] Updated weights for policy 0, policy_version 44320 (0.0008) [2023-10-12 17:37:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90734592. Throughput: 0: 1698.5, 1: 1672.4. Samples: 22685246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:37:08,436][61643] Avg episode reward: [(0, '21.800'), (1, '9.780')] [2023-10-12 17:37:10,760][62635] Updated weights for policy 1, policy_version 44290 (0.0008) [2023-10-12 17:37:11,129][62635] Updated weights for policy 1, policy_version 44300 (0.0009) [2023-10-12 17:37:11,508][62635] Updated weights for policy 1, policy_version 44310 (0.0009) [2023-10-12 17:37:11,666][62634] Updated weights for policy 0, policy_version 44330 (0.0008) [2023-10-12 17:37:11,875][62635] Updated weights for policy 1, policy_version 44320 (0.0008) [2023-10-12 17:37:12,045][62634] Updated weights for policy 0, policy_version 44340 (0.0010) [2023-10-12 17:37:12,432][62634] Updated weights for policy 0, policy_version 44350 (0.0007) [2023-10-12 17:37:13,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90800128. Throughput: 0: 1682.8, 1: 1657.3. Samples: 22704070. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) [2023-10-12 17:37:13,436][61643] Avg episode reward: [(0, '22.160'), (1, '10.040')] [2023-10-12 17:37:16,002][62635] Updated weights for policy 1, policy_version 44330 (0.0007) [2023-10-12 17:37:16,347][62634] Updated weights for policy 0, policy_version 44360 (0.0008) [2023-10-12 17:37:16,370][62635] Updated weights for policy 1, policy_version 44340 (0.0007) [2023-10-12 17:37:16,718][62634] Updated weights for policy 0, policy_version 44370 (0.0009) [2023-10-12 17:37:16,733][62635] Updated weights for policy 1, policy_version 44350 (0.0007) [2023-10-12 17:37:17,099][62634] Updated weights for policy 0, policy_version 44380 (0.0009) [2023-10-12 17:37:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90865664. Throughput: 0: 1672.6, 1: 1675.1. Samples: 22724238. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) [2023-10-12 17:37:18,436][61643] Avg episode reward: [(0, '22.250'), (1, '9.900')] [2023-10-12 17:37:20,866][62635] Updated weights for policy 1, policy_version 44360 (0.0008) [2023-10-12 17:37:21,147][62634] Updated weights for policy 0, policy_version 44390 (0.0009) [2023-10-12 17:37:21,244][62635] Updated weights for policy 1, policy_version 44370 (0.0008) [2023-10-12 17:37:21,528][62634] Updated weights for policy 0, policy_version 44400 (0.0009) [2023-10-12 17:37:21,612][62635] Updated weights for policy 1, policy_version 44380 (0.0008) [2023-10-12 17:37:21,897][62634] Updated weights for policy 0, policy_version 44410 (0.0007) [2023-10-12 17:37:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90931200. Throughput: 0: 1701.0, 1: 1662.1. Samples: 22735440. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) [2023-10-12 17:37:23,436][61643] Avg episode reward: [(0, '22.390'), (1, '9.990')] [2023-10-12 17:37:23,437][62354] Saving new best policy, reward=22.390! 
[2023-10-12 17:37:25,570][62635] Updated weights for policy 1, policy_version 44390 (0.0008) [2023-10-12 17:37:25,932][62635] Updated weights for policy 1, policy_version 44400 (0.0007) [2023-10-12 17:37:26,065][62634] Updated weights for policy 0, policy_version 44420 (0.0007) [2023-10-12 17:37:26,298][62635] Updated weights for policy 1, policy_version 44410 (0.0008) [2023-10-12 17:37:26,444][62634] Updated weights for policy 0, policy_version 44430 (0.0008) [2023-10-12 17:37:26,807][62634] Updated weights for policy 0, policy_version 44440 (0.0007) [2023-10-12 17:37:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90996736. Throughput: 0: 1673.5, 1: 1665.5. Samples: 22754066. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) [2023-10-12 17:37:28,436][61643] Avg episode reward: [(0, '22.510'), (1, '9.900')] [2023-10-12 17:37:28,438][62354] Saving new best policy, reward=22.510! [2023-10-12 17:37:30,294][62635] Updated weights for policy 1, policy_version 44420 (0.0007) [2023-10-12 17:37:30,661][62635] Updated weights for policy 1, policy_version 44430 (0.0009) [2023-10-12 17:37:30,988][62634] Updated weights for policy 0, policy_version 44450 (0.0010) [2023-10-12 17:37:31,029][62635] Updated weights for policy 1, policy_version 44440 (0.0008) [2023-10-12 17:37:31,365][62634] Updated weights for policy 0, policy_version 44460 (0.0008) [2023-10-12 17:37:31,743][62634] Updated weights for policy 0, policy_version 44470 (0.0008) [2023-10-12 17:37:32,123][62634] Updated weights for policy 0, policy_version 44480 (0.0009) [2023-10-12 17:37:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91062272. Throughput: 0: 1677.0, 1: 1673.8. Samples: 22774606. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) [2023-10-12 17:37:33,436][61643] Avg episode reward: [(0, '22.680'), (1, '9.960')] [2023-10-12 17:37:33,445][62354] Saving new best policy, reward=22.680! 
[2023-10-12 17:37:35,006][62635] Updated weights for policy 1, policy_version 44450 (0.0007) [2023-10-12 17:37:35,373][62635] Updated weights for policy 1, policy_version 44460 (0.0007) [2023-10-12 17:37:35,740][62635] Updated weights for policy 1, policy_version 44470 (0.0009) [2023-10-12 17:37:36,106][62635] Updated weights for policy 1, policy_version 44480 (0.0009) [2023-10-12 17:37:36,379][62634] Updated weights for policy 0, policy_version 44490 (0.0008) [2023-10-12 17:37:36,755][62634] Updated weights for policy 0, policy_version 44500 (0.0010) [2023-10-12 17:37:37,141][62634] Updated weights for policy 0, policy_version 44510 (0.0010) [2023-10-12 17:37:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91127808. Throughput: 0: 1688.0, 1: 1654.8. Samples: 22785022. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) [2023-10-12 17:37:38,436][61643] Avg episode reward: [(0, '22.360'), (1, '9.920')] [2023-10-12 17:37:40,097][62635] Updated weights for policy 1, policy_version 44490 (0.0009) [2023-10-12 17:37:40,456][62635] Updated weights for policy 1, policy_version 44500 (0.0009) [2023-10-12 17:37:40,823][62635] Updated weights for policy 1, policy_version 44510 (0.0008) [2023-10-12 17:37:41,256][62634] Updated weights for policy 0, policy_version 44520 (0.0008) [2023-10-12 17:37:41,639][62634] Updated weights for policy 0, policy_version 44530 (0.0009) [2023-10-12 17:37:42,009][62634] Updated weights for policy 0, policy_version 44540 (0.0011) [2023-10-12 17:37:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 91193344. Throughput: 0: 1666.0, 1: 1678.6. Samples: 22804586. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:37:43,435][61643] Avg episode reward: [(0, '22.210'), (1, '9.810')] [2023-10-12 17:37:44,962][62635] Updated weights for policy 1, policy_version 44520 (0.0009) [2023-10-12 17:37:45,326][62635] Updated weights for policy 1, policy_version 44530 (0.0009) [2023-10-12 17:37:45,694][62635] Updated weights for policy 1, policy_version 44540 (0.0008) [2023-10-12 17:37:45,959][62634] Updated weights for policy 0, policy_version 44550 (0.0011) [2023-10-12 17:37:46,332][62634] Updated weights for policy 0, policy_version 44560 (0.0009) [2023-10-12 17:37:46,704][62634] Updated weights for policy 0, policy_version 44570 (0.0009) [2023-10-12 17:37:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91258880. Throughput: 0: 1686.6, 1: 1681.1. Samples: 22825222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:37:48,436][61643] Avg episode reward: [(0, '22.250'), (1, '9.890')] [2023-10-12 17:37:49,787][62635] Updated weights for policy 1, policy_version 44550 (0.0009) [2023-10-12 17:37:50,158][62635] Updated weights for policy 1, policy_version 44560 (0.0008) [2023-10-12 17:37:50,521][62635] Updated weights for policy 1, policy_version 44570 (0.0007) [2023-10-12 17:37:50,692][62634] Updated weights for policy 0, policy_version 44580 (0.0008) [2023-10-12 17:37:51,069][62634] Updated weights for policy 0, policy_version 44590 (0.0010) [2023-10-12 17:37:51,450][62634] Updated weights for policy 0, policy_version 44600 (0.0008) [2023-10-12 17:37:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91324416. Throughput: 0: 1674.8, 1: 1657.5. Samples: 22835198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:37:53,436][61643] Avg episode reward: [(0, '22.170'), (1, '9.890')] [2023-10-12 17:37:54,473][62635] Updated weights for policy 1, policy_version 44580 (0.0009) [2023-10-12 17:37:54,827][62635] Updated weights for policy 1, policy_version 44590 (0.0009) [2023-10-12 17:37:55,194][62635] Updated weights for policy 1, policy_version 44600 (0.0010) [2023-10-12 17:37:55,355][62634] Updated weights for policy 0, policy_version 44610 (0.0009) [2023-10-12 17:37:55,726][62634] Updated weights for policy 0, policy_version 44620 (0.0010) [2023-10-12 17:37:56,102][62634] Updated weights for policy 0, policy_version 44630 (0.0008) [2023-10-12 17:37:56,472][62634] Updated weights for policy 0, policy_version 44640 (0.0008) [2023-10-12 17:37:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91389952. Throughput: 0: 1672.8, 1: 1688.2. Samples: 22855314. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:37:58,435][61643] Avg episode reward: [(0, '22.230'), (1, '9.850')] [2023-10-12 17:37:59,345][62635] Updated weights for policy 1, policy_version 44610 (0.0009) [2023-10-12 17:37:59,711][62635] Updated weights for policy 1, policy_version 44620 (0.0010) [2023-10-12 17:38:00,079][62635] Updated weights for policy 1, policy_version 44630 (0.0009) [2023-10-12 17:38:00,451][62635] Updated weights for policy 1, policy_version 44640 (0.0009) [2023-10-12 17:38:00,750][62634] Updated weights for policy 0, policy_version 44650 (0.0010) [2023-10-12 17:38:01,129][62634] Updated weights for policy 0, policy_version 44660 (0.0010) [2023-10-12 17:38:01,496][62634] Updated weights for policy 0, policy_version 44670 (0.0011) [2023-10-12 17:38:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91455488. Throughput: 0: 1682.0, 1: 1684.9. Samples: 22875750. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:03,436][61643] Avg episode reward: [(0, '22.340'), (1, '9.770')] [2023-10-12 17:38:04,670][62635] Updated weights for policy 1, policy_version 44650 (0.0008) [2023-10-12 17:38:05,035][62635] Updated weights for policy 1, policy_version 44660 (0.0009) [2023-10-12 17:38:05,408][62635] Updated weights for policy 1, policy_version 44670 (0.0008) [2023-10-12 17:38:05,606][62634] Updated weights for policy 0, policy_version 44680 (0.0009) [2023-10-12 17:38:05,987][62634] Updated weights for policy 0, policy_version 44690 (0.0010) [2023-10-12 17:38:06,359][62634] Updated weights for policy 0, policy_version 44700 (0.0011) [2023-10-12 17:38:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91521024. Throughput: 0: 1665.6, 1: 1670.4. Samples: 22885564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:08,436][61643] Avg episode reward: [(0, '22.380'), (1, '9.900')] [2023-10-12 17:38:09,467][62635] Updated weights for policy 1, policy_version 44680 (0.0009) [2023-10-12 17:38:09,830][62635] Updated weights for policy 1, policy_version 44690 (0.0009) [2023-10-12 17:38:10,197][62635] Updated weights for policy 1, policy_version 44700 (0.0007) [2023-10-12 17:38:10,281][62634] Updated weights for policy 0, policy_version 44710 (0.0008) [2023-10-12 17:38:10,666][62634] Updated weights for policy 0, policy_version 44720 (0.0009) [2023-10-12 17:38:11,038][62634] Updated weights for policy 0, policy_version 44730 (0.0009) [2023-10-12 17:38:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91586560. Throughput: 0: 1680.4, 1: 1690.9. Samples: 22905776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:13,436][61643] Avg episode reward: [(0, '21.800'), (1, '9.760')] [2023-10-12 17:38:14,078][62635] Updated weights for policy 1, policy_version 44710 (0.0007) [2023-10-12 17:38:14,447][62635] Updated weights for policy 1, policy_version 44720 (0.0011) [2023-10-12 17:38:14,807][62635] Updated weights for policy 1, policy_version 44730 (0.0010) [2023-10-12 17:38:15,185][62634] Updated weights for policy 0, policy_version 44740 (0.0009) [2023-10-12 17:38:15,579][62634] Updated weights for policy 0, policy_version 44750 (0.0009) [2023-10-12 17:38:15,964][62634] Updated weights for policy 0, policy_version 44760 (0.0009) [2023-10-12 17:38:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91652096. Throughput: 0: 1685.7, 1: 1688.8. Samples: 22926462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:18,436][61643] Avg episode reward: [(0, '21.900'), (1, '9.780')] [2023-10-12 17:38:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000044768_45842432.pth... [2023-10-12 17:38:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000044736_45809664.pth... 
[2023-10-12 17:38:18,474][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000043200_44236800.pth [2023-10-12 17:38:18,480][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000043168_44204032.pth [2023-10-12 17:38:18,867][62635] Updated weights for policy 1, policy_version 44740 (0.0009) [2023-10-12 17:38:19,231][62635] Updated weights for policy 1, policy_version 44750 (0.0010) [2023-10-12 17:38:19,594][62635] Updated weights for policy 1, policy_version 44760 (0.0010) [2023-10-12 17:38:19,957][62634] Updated weights for policy 0, policy_version 44770 (0.0008) [2023-10-12 17:38:20,332][62634] Updated weights for policy 0, policy_version 44780 (0.0010) [2023-10-12 17:38:20,708][62634] Updated weights for policy 0, policy_version 44790 (0.0007) [2023-10-12 17:38:21,087][62634] Updated weights for policy 0, policy_version 44800 (0.0008) [2023-10-12 17:38:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 91717632. Throughput: 0: 1662.0, 1: 1686.0. Samples: 22935680. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:23,436][61643] Avg episode reward: [(0, '22.240'), (1, '9.780')] [2023-10-12 17:38:23,692][62635] Updated weights for policy 1, policy_version 44770 (0.0011) [2023-10-12 17:38:24,053][62635] Updated weights for policy 1, policy_version 44780 (0.0009) [2023-10-12 17:38:24,422][62635] Updated weights for policy 1, policy_version 44790 (0.0008) [2023-10-12 17:38:24,791][62635] Updated weights for policy 1, policy_version 44800 (0.0008) [2023-10-12 17:38:25,109][62634] Updated weights for policy 0, policy_version 44810 (0.0008) [2023-10-12 17:38:25,486][62634] Updated weights for policy 0, policy_version 44820 (0.0010) [2023-10-12 17:38:25,864][62634] Updated weights for policy 0, policy_version 44830 (0.0009) [2023-10-12 17:38:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91783168. 
Throughput: 0: 1683.2, 1: 1685.3. Samples: 22956172. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:28,435][61643] Avg episode reward: [(0, '22.170'), (1, '9.770')] [2023-10-12 17:38:28,952][62635] Updated weights for policy 1, policy_version 44810 (0.0007) [2023-10-12 17:38:29,318][62635] Updated weights for policy 1, policy_version 44820 (0.0007) [2023-10-12 17:38:29,682][62635] Updated weights for policy 1, policy_version 44830 (0.0009) [2023-10-12 17:38:29,879][62634] Updated weights for policy 0, policy_version 44840 (0.0008) [2023-10-12 17:38:30,254][62634] Updated weights for policy 0, policy_version 44850 (0.0008) [2023-10-12 17:38:30,628][62634] Updated weights for policy 0, policy_version 44860 (0.0009) [2023-10-12 17:38:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 91848704. Throughput: 0: 1692.8, 1: 1682.3. Samples: 22977100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:33,436][61643] Avg episode reward: [(0, '22.610'), (1, '9.900')] [2023-10-12 17:38:33,775][62635] Updated weights for policy 1, policy_version 44840 (0.0009) [2023-10-12 17:38:34,142][62635] Updated weights for policy 1, policy_version 44850 (0.0009) [2023-10-12 17:38:34,517][62635] Updated weights for policy 1, policy_version 44860 (0.0008) [2023-10-12 17:38:34,613][62634] Updated weights for policy 0, policy_version 44870 (0.0009) [2023-10-12 17:38:34,989][62634] Updated weights for policy 0, policy_version 44880 (0.0008) [2023-10-12 17:38:35,370][62634] Updated weights for policy 0, policy_version 44890 (0.0007) [2023-10-12 17:38:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91914240. Throughput: 0: 1670.6, 1: 1684.3. Samples: 22986166. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:38,435][61643] Avg episode reward: [(0, '22.550'), (1, '9.930')] [2023-10-12 17:38:38,601][62635] Updated weights for policy 1, policy_version 44870 (0.0009) [2023-10-12 17:38:38,972][62635] Updated weights for policy 1, policy_version 44880 (0.0010) [2023-10-12 17:38:39,228][62634] Updated weights for policy 0, policy_version 44900 (0.0008) [2023-10-12 17:38:39,343][62635] Updated weights for policy 1, policy_version 44890 (0.0010) [2023-10-12 17:38:39,602][62634] Updated weights for policy 0, policy_version 44910 (0.0008) [2023-10-12 17:38:39,982][62634] Updated weights for policy 0, policy_version 44920 (0.0009) [2023-10-12 17:38:43,416][62635] Updated weights for policy 1, policy_version 44900 (0.0010) [2023-10-12 17:38:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91979776. Throughput: 0: 1689.0, 1: 1682.2. Samples: 23007018. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:43,435][61643] Avg episode reward: [(0, '22.530'), (1, '9.930')] [2023-10-12 17:38:43,777][62635] Updated weights for policy 1, policy_version 44910 (0.0010) [2023-10-12 17:38:44,139][62635] Updated weights for policy 1, policy_version 44920 (0.0009) [2023-10-12 17:38:44,158][62634] Updated weights for policy 0, policy_version 44930 (0.0008) [2023-10-12 17:38:44,532][62634] Updated weights for policy 0, policy_version 44940 (0.0008) [2023-10-12 17:38:44,910][62634] Updated weights for policy 0, policy_version 44950 (0.0008) [2023-10-12 17:38:45,278][62634] Updated weights for policy 0, policy_version 44960 (0.0007) [2023-10-12 17:38:48,256][62635] Updated weights for policy 1, policy_version 44930 (0.0008) [2023-10-12 17:38:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92045312. Throughput: 0: 1691.4, 1: 1686.7. Samples: 23027764. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:38:48,435][61643] Avg episode reward: [(0, '22.620'), (1, '9.960')] [2023-10-12 17:38:48,630][62635] Updated weights for policy 1, policy_version 44940 (0.0012) [2023-10-12 17:38:48,996][62635] Updated weights for policy 1, policy_version 44950 (0.0009) [2023-10-12 17:38:49,153][62634] Updated weights for policy 0, policy_version 44970 (0.0007) [2023-10-12 17:38:49,365][62635] Updated weights for policy 1, policy_version 44960 (0.0008) [2023-10-12 17:38:49,529][62634] Updated weights for policy 0, policy_version 44980 (0.0007) [2023-10-12 17:38:49,903][62634] Updated weights for policy 0, policy_version 44990 (0.0007) [2023-10-12 17:38:53,272][62635] Updated weights for policy 1, policy_version 44970 (0.0007) [2023-10-12 17:38:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92110848. Throughput: 0: 1675.0, 1: 1684.0. Samples: 23036718. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:38:53,435][61643] Avg episode reward: [(0, '23.100'), (1, '9.810')] [2023-10-12 17:38:53,436][62354] Saving new best policy, reward=23.100! [2023-10-12 17:38:53,633][62635] Updated weights for policy 1, policy_version 44980 (0.0010) [2023-10-12 17:38:54,002][62635] Updated weights for policy 1, policy_version 44990 (0.0009) [2023-10-12 17:38:54,048][62634] Updated weights for policy 0, policy_version 45000 (0.0008) [2023-10-12 17:38:54,432][62634] Updated weights for policy 0, policy_version 45010 (0.0010) [2023-10-12 17:38:54,809][62634] Updated weights for policy 0, policy_version 45020 (0.0011) [2023-10-12 17:38:58,176][62635] Updated weights for policy 1, policy_version 45000 (0.0007) [2023-10-12 17:38:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92176384. Throughput: 0: 1682.4, 1: 1688.1. Samples: 23057452. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:38:58,435][61643] Avg episode reward: [(0, '23.060'), (1, '9.660')] [2023-10-12 17:38:58,545][62635] Updated weights for policy 1, policy_version 45010 (0.0008) [2023-10-12 17:38:58,772][62634] Updated weights for policy 0, policy_version 45030 (0.0009) [2023-10-12 17:38:58,913][62635] Updated weights for policy 1, policy_version 45020 (0.0008) [2023-10-12 17:38:59,145][62634] Updated weights for policy 0, policy_version 45040 (0.0007) [2023-10-12 17:38:59,510][62634] Updated weights for policy 0, policy_version 45050 (0.0008) [2023-10-12 17:39:02,874][62635] Updated weights for policy 1, policy_version 45030 (0.0008) [2023-10-12 17:39:03,230][62635] Updated weights for policy 1, policy_version 45040 (0.0009) [2023-10-12 17:39:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92241920. Throughput: 0: 1682.7, 1: 1678.8. Samples: 23077730. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:39:03,435][61643] Avg episode reward: [(0, '23.130'), (1, '9.900')] [2023-10-12 17:39:03,441][62354] Saving new best policy, reward=23.130! [2023-10-12 17:39:03,595][62635] Updated weights for policy 1, policy_version 45050 (0.0009) [2023-10-12 17:39:03,840][62634] Updated weights for policy 0, policy_version 45060 (0.0009) [2023-10-12 17:39:04,226][62634] Updated weights for policy 0, policy_version 45070 (0.0008) [2023-10-12 17:39:04,602][62634] Updated weights for policy 0, policy_version 45080 (0.0007) [2023-10-12 17:39:07,636][62635] Updated weights for policy 1, policy_version 45060 (0.0007) [2023-10-12 17:39:08,019][62635] Updated weights for policy 1, policy_version 45070 (0.0009) [2023-10-12 17:39:08,376][62635] Updated weights for policy 1, policy_version 45080 (0.0008) [2023-10-12 17:39:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92307456. Throughput: 0: 1677.7, 1: 1690.9. Samples: 23087266. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:39:08,435][61643] Avg episode reward: [(0, '23.410'), (1, '9.660')] [2023-10-12 17:39:08,436][62354] Saving new best policy, reward=23.410! [2023-10-12 17:39:08,687][62634] Updated weights for policy 0, policy_version 45090 (0.0010) [2023-10-12 17:39:09,060][62634] Updated weights for policy 0, policy_version 45100 (0.0009) [2023-10-12 17:39:09,441][62634] Updated weights for policy 0, policy_version 45110 (0.0007) [2023-10-12 17:39:09,822][62634] Updated weights for policy 0, policy_version 45120 (0.0008) [2023-10-12 17:39:12,478][62635] Updated weights for policy 1, policy_version 45090 (0.0009) [2023-10-12 17:39:12,852][62635] Updated weights for policy 1, policy_version 45100 (0.0009) [2023-10-12 17:39:13,220][62635] Updated weights for policy 1, policy_version 45110 (0.0008) [2023-10-12 17:39:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92372992. Throughput: 0: 1682.7, 1: 1691.0. Samples: 23107988. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:39:13,435][61643] Avg episode reward: [(0, '22.740'), (1, '9.660')] [2023-10-12 17:39:13,579][62635] Updated weights for policy 1, policy_version 45120 (0.0008) [2023-10-12 17:39:13,991][62634] Updated weights for policy 0, policy_version 45130 (0.0011) [2023-10-12 17:39:14,359][62634] Updated weights for policy 0, policy_version 45140 (0.0010) [2023-10-12 17:39:14,750][62634] Updated weights for policy 0, policy_version 45150 (0.0010) [2023-10-12 17:39:17,771][62635] Updated weights for policy 1, policy_version 45130 (0.0009) [2023-10-12 17:39:18,148][62635] Updated weights for policy 1, policy_version 45140 (0.0008) [2023-10-12 17:39:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 92438528. Throughput: 0: 1675.1, 1: 1674.1. Samples: 23127812. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:39:18,436][61643] Avg episode reward: [(0, '22.830'), (1, '9.650')] [2023-10-12 17:39:18,520][62635] Updated weights for policy 1, policy_version 45150 (0.0009) [2023-10-12 17:39:18,887][62634] Updated weights for policy 0, policy_version 45160 (0.0009) [2023-10-12 17:39:19,269][62634] Updated weights for policy 0, policy_version 45170 (0.0008) [2023-10-12 17:39:19,653][62634] Updated weights for policy 0, policy_version 45180 (0.0009) [2023-10-12 17:39:22,452][62635] Updated weights for policy 1, policy_version 45160 (0.0009) [2023-10-12 17:39:22,825][62635] Updated weights for policy 1, policy_version 45170 (0.0008) [2023-10-12 17:39:23,202][62635] Updated weights for policy 1, policy_version 45180 (0.0008) [2023-10-12 17:39:23,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92536832. Throughput: 0: 1678.2, 1: 1689.9. Samples: 23137732. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:23,435][61643] Avg episode reward: [(0, '23.090'), (1, '9.550')] [2023-10-12 17:39:23,708][62634] Updated weights for policy 0, policy_version 45190 (0.0010) [2023-10-12 17:39:24,079][62634] Updated weights for policy 0, policy_version 45200 (0.0011) [2023-10-12 17:39:24,458][62634] Updated weights for policy 0, policy_version 45210 (0.0011) [2023-10-12 17:39:27,065][62635] Updated weights for policy 1, policy_version 45190 (0.0008) [2023-10-12 17:39:27,429][62635] Updated weights for policy 1, policy_version 45200 (0.0007) [2023-10-12 17:39:27,792][62635] Updated weights for policy 1, policy_version 45210 (0.0010) [2023-10-12 17:39:28,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92602368. Throughput: 0: 1674.2, 1: 1691.1. Samples: 23158458. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:28,435][61643] Avg episode reward: [(0, '23.000'), (1, '9.730')] [2023-10-12 17:39:28,543][62634] Updated weights for policy 0, policy_version 45220 (0.0010) [2023-10-12 17:39:28,917][62634] Updated weights for policy 0, policy_version 45230 (0.0010) [2023-10-12 17:39:29,292][62634] Updated weights for policy 0, policy_version 45240 (0.0008) [2023-10-12 17:39:31,901][62635] Updated weights for policy 1, policy_version 45220 (0.0008) [2023-10-12 17:39:32,272][62635] Updated weights for policy 1, policy_version 45230 (0.0010) [2023-10-12 17:39:32,652][62635] Updated weights for policy 1, policy_version 45240 (0.0009) [2023-10-12 17:39:33,180][62634] Updated weights for policy 0, policy_version 45250 (0.0009) [2023-10-12 17:39:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92667904. Throughput: 0: 1672.8, 1: 1666.6. Samples: 23178040. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:33,436][61643] Avg episode reward: [(0, '22.980'), (1, '9.920')] [2023-10-12 17:39:33,558][62634] Updated weights for policy 0, policy_version 45260 (0.0008) [2023-10-12 17:39:33,941][62634] Updated weights for policy 0, policy_version 45270 (0.0011) [2023-10-12 17:39:34,312][62634] Updated weights for policy 0, policy_version 45280 (0.0011) [2023-10-12 17:39:36,954][62635] Updated weights for policy 1, policy_version 45250 (0.0008) [2023-10-12 17:39:37,319][62635] Updated weights for policy 1, policy_version 45260 (0.0008) [2023-10-12 17:39:37,689][62635] Updated weights for policy 1, policy_version 45270 (0.0008) [2023-10-12 17:39:38,046][62635] Updated weights for policy 1, policy_version 45280 (0.0009) [2023-10-12 17:39:38,358][62634] Updated weights for policy 0, policy_version 45290 (0.0009) [2023-10-12 17:39:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92733440. Throughput: 0: 1675.3, 1: 1695.8. 
Samples: 23188418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:38,435][61643] Avg episode reward: [(0, '22.700'), (1, '9.710')] [2023-10-12 17:39:38,732][62634] Updated weights for policy 0, policy_version 45300 (0.0007) [2023-10-12 17:39:39,111][62634] Updated weights for policy 0, policy_version 45310 (0.0008) [2023-10-12 17:39:42,056][62635] Updated weights for policy 1, policy_version 45290 (0.0008) [2023-10-12 17:39:42,424][62635] Updated weights for policy 1, policy_version 45300 (0.0008) [2023-10-12 17:39:42,786][62635] Updated weights for policy 1, policy_version 45310 (0.0008) [2023-10-12 17:39:43,208][62634] Updated weights for policy 0, policy_version 45320 (0.0007) [2023-10-12 17:39:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92798976. Throughput: 0: 1680.1, 1: 1682.4. Samples: 23208766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:43,435][61643] Avg episode reward: [(0, '22.490'), (1, '9.740')] [2023-10-12 17:39:43,582][62634] Updated weights for policy 0, policy_version 45330 (0.0008) [2023-10-12 17:39:43,969][62634] Updated weights for policy 0, policy_version 45340 (0.0010) [2023-10-12 17:39:46,846][62635] Updated weights for policy 1, policy_version 45320 (0.0009) [2023-10-12 17:39:47,223][62635] Updated weights for policy 1, policy_version 45330 (0.0009) [2023-10-12 17:39:47,587][62635] Updated weights for policy 1, policy_version 45340 (0.0009) [2023-10-12 17:39:47,985][62634] Updated weights for policy 0, policy_version 45350 (0.0009) [2023-10-12 17:39:48,372][62634] Updated weights for policy 0, policy_version 45360 (0.0009) [2023-10-12 17:39:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92864512. Throughput: 0: 1676.8, 1: 1669.9. Samples: 23228334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:48,435][61643] Avg episode reward: [(0, '22.400'), (1, '10.020')] [2023-10-12 17:39:48,741][62634] Updated weights for policy 0, policy_version 45370 (0.0008) [2023-10-12 17:39:51,489][62635] Updated weights for policy 1, policy_version 45350 (0.0008) [2023-10-12 17:39:51,861][62635] Updated weights for policy 1, policy_version 45360 (0.0008) [2023-10-12 17:39:52,241][62635] Updated weights for policy 1, policy_version 45370 (0.0008) [2023-10-12 17:39:52,585][62634] Updated weights for policy 0, policy_version 45380 (0.0008) [2023-10-12 17:39:52,963][62634] Updated weights for policy 0, policy_version 45390 (0.0008) [2023-10-12 17:39:53,340][62634] Updated weights for policy 0, policy_version 45400 (0.0008) [2023-10-12 17:39:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92930048. Throughput: 0: 1681.2, 1: 1691.5. Samples: 23239040. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:39:53,435][61643] Avg episode reward: [(0, '22.560'), (1, '9.750')] [2023-10-12 17:39:56,263][62635] Updated weights for policy 1, policy_version 45380 (0.0007) [2023-10-12 17:39:56,629][62635] Updated weights for policy 1, policy_version 45390 (0.0009) [2023-10-12 17:39:56,994][62635] Updated weights for policy 1, policy_version 45400 (0.0010) [2023-10-12 17:39:57,339][62634] Updated weights for policy 0, policy_version 45410 (0.0009) [2023-10-12 17:39:57,716][62634] Updated weights for policy 0, policy_version 45420 (0.0009) [2023-10-12 17:39:58,091][62634] Updated weights for policy 0, policy_version 45430 (0.0008) [2023-10-12 17:39:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92995584. Throughput: 0: 1681.3, 1: 1672.8. Samples: 23258926. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-12 17:39:58,435][61643] Avg episode reward: [(0, '22.740'), (1, '9.910')] [2023-10-12 17:39:58,476][62634] Updated weights for policy 0, policy_version 45440 (0.0008) [2023-10-12 17:40:00,979][62635] Updated weights for policy 1, policy_version 45410 (0.0008) [2023-10-12 17:40:01,352][62635] Updated weights for policy 1, policy_version 45420 (0.0007) [2023-10-12 17:40:01,715][62635] Updated weights for policy 1, policy_version 45430 (0.0007) [2023-10-12 17:40:02,084][62635] Updated weights for policy 1, policy_version 45440 (0.0010) [2023-10-12 17:40:02,618][62634] Updated weights for policy 0, policy_version 45450 (0.0009) [2023-10-12 17:40:03,003][62634] Updated weights for policy 0, policy_version 45460 (0.0008) [2023-10-12 17:40:03,375][62634] Updated weights for policy 0, policy_version 45470 (0.0009) [2023-10-12 17:40:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93061120. Throughput: 0: 1667.8, 1: 1685.7. Samples: 23278722. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-12 17:40:03,435][61643] Avg episode reward: [(0, '22.720'), (1, '10.040')] [2023-10-12 17:40:06,061][62635] Updated weights for policy 1, policy_version 45450 (0.0008) [2023-10-12 17:40:06,434][62635] Updated weights for policy 1, policy_version 45460 (0.0007) [2023-10-12 17:40:06,801][62635] Updated weights for policy 1, policy_version 45470 (0.0008) [2023-10-12 17:40:07,562][62634] Updated weights for policy 0, policy_version 45480 (0.0008) [2023-10-12 17:40:07,941][62634] Updated weights for policy 0, policy_version 45490 (0.0009) [2023-10-12 17:40:08,321][62634] Updated weights for policy 0, policy_version 45500 (0.0008) [2023-10-12 17:40:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93126656. Throughput: 0: 1683.1, 1: 1692.8. Samples: 23289650. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-12 17:40:08,436][61643] Avg episode reward: [(0, '22.180'), (1, '9.880')] [2023-10-12 17:40:10,783][62635] Updated weights for policy 1, policy_version 45480 (0.0009) [2023-10-12 17:40:11,151][62635] Updated weights for policy 1, policy_version 45490 (0.0008) [2023-10-12 17:40:11,515][62635] Updated weights for policy 1, policy_version 45500 (0.0009) [2023-10-12 17:40:12,112][62634] Updated weights for policy 0, policy_version 45510 (0.0009) [2023-10-12 17:40:12,498][62634] Updated weights for policy 0, policy_version 45520 (0.0008) [2023-10-12 17:40:12,873][62634] Updated weights for policy 0, policy_version 45530 (0.0009) [2023-10-12 17:40:13,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 93224960. Throughput: 0: 1686.8, 1: 1671.9. Samples: 23309596. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-12 17:40:13,435][61643] Avg episode reward: [(0, '21.510'), (1, '9.760')] [2023-10-12 17:40:15,665][62635] Updated weights for policy 1, policy_version 45510 (0.0008) [2023-10-12 17:40:16,022][62635] Updated weights for policy 1, policy_version 45520 (0.0008) [2023-10-12 17:40:16,384][62635] Updated weights for policy 1, policy_version 45530 (0.0010) [2023-10-12 17:40:16,831][62634] Updated weights for policy 0, policy_version 45540 (0.0007) [2023-10-12 17:40:17,203][62634] Updated weights for policy 0, policy_version 45550 (0.0007) [2023-10-12 17:40:17,579][62634] Updated weights for policy 0, policy_version 45560 (0.0007) [2023-10-12 17:40:18,435][61643] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 93290496. Throughput: 0: 1664.0, 1: 1700.9. Samples: 23329462. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-12 17:40:18,435][61643] Avg episode reward: [(0, '21.880'), (1, '10.110')] [2023-10-12 17:40:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000045536_46628864.pth... 
[2023-10-12 17:40:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000045568_46661632.pth... [2023-10-12 17:40:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000044000_45056000.pth [2023-10-12 17:40:18,486][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000043968_45023232.pth [2023-10-12 17:40:20,338][62635] Updated weights for policy 1, policy_version 45540 (0.0009) [2023-10-12 17:40:20,713][62635] Updated weights for policy 1, policy_version 45550 (0.0009) [2023-10-12 17:40:21,073][62635] Updated weights for policy 1, policy_version 45560 (0.0009) [2023-10-12 17:40:21,717][62634] Updated weights for policy 0, policy_version 45570 (0.0008) [2023-10-12 17:40:22,089][62634] Updated weights for policy 0, policy_version 45580 (0.0011) [2023-10-12 17:40:22,471][62634] Updated weights for policy 0, policy_version 45590 (0.0007) [2023-10-12 17:40:22,859][62634] Updated weights for policy 0, policy_version 45600 (0.0010) [2023-10-12 17:40:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93356032. Throughput: 0: 1688.3, 1: 1684.1. Samples: 23340174. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-12 17:40:23,435][61643] Avg episode reward: [(0, '22.190'), (1, '9.800')] [2023-10-12 17:40:25,042][62635] Updated weights for policy 1, policy_version 45570 (0.0009) [2023-10-12 17:40:25,409][62635] Updated weights for policy 1, policy_version 45580 (0.0007) [2023-10-12 17:40:25,776][62635] Updated weights for policy 1, policy_version 45590 (0.0008) [2023-10-12 17:40:26,136][62635] Updated weights for policy 1, policy_version 45600 (0.0007) [2023-10-12 17:40:27,025][62634] Updated weights for policy 0, policy_version 45610 (0.0007) [2023-10-12 17:40:27,410][62634] Updated weights for policy 0, policy_version 45620 (0.0008) [2023-10-12 17:40:27,775][62634] Updated weights for policy 0, policy_version 45630 (0.0007) [2023-10-12 17:40:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93421568. Throughput: 0: 1684.6, 1: 1683.7. Samples: 23360340. Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-12 17:40:28,435][61643] Avg episode reward: [(0, '22.050'), (1, '9.800')] [2023-10-12 17:40:30,114][62635] Updated weights for policy 1, policy_version 45610 (0.0008) [2023-10-12 17:40:30,486][62635] Updated weights for policy 1, policy_version 45620 (0.0007) [2023-10-12 17:40:30,851][62635] Updated weights for policy 1, policy_version 45630 (0.0008) [2023-10-12 17:40:31,828][62634] Updated weights for policy 0, policy_version 45640 (0.0007) [2023-10-12 17:40:32,197][62634] Updated weights for policy 0, policy_version 45650 (0.0007) [2023-10-12 17:40:32,585][62634] Updated weights for policy 0, policy_version 45660 (0.0007) [2023-10-12 17:40:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93487104. Throughput: 0: 1668.0, 1: 1708.2. Samples: 23380262. 
Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-12 17:40:33,435][61643] Avg episode reward: [(0, '22.350'), (1, '10.020')] [2023-10-12 17:40:34,951][62635] Updated weights for policy 1, policy_version 45640 (0.0010) [2023-10-12 17:40:35,324][62635] Updated weights for policy 1, policy_version 45650 (0.0008) [2023-10-12 17:40:35,686][62635] Updated weights for policy 1, policy_version 45660 (0.0008) [2023-10-12 17:40:36,860][62634] Updated weights for policy 0, policy_version 45670 (0.0009) [2023-10-12 17:40:37,242][62634] Updated weights for policy 0, policy_version 45680 (0.0009) [2023-10-12 17:40:37,614][62634] Updated weights for policy 0, policy_version 45690 (0.0010) [2023-10-12 17:40:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93552640. Throughput: 0: 1695.3, 1: 1674.0. Samples: 23390660. Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-12 17:40:38,435][61643] Avg episode reward: [(0, '22.450'), (1, '9.800')] [2023-10-12 17:40:39,846][62635] Updated weights for policy 1, policy_version 45670 (0.0007) [2023-10-12 17:40:40,217][62635] Updated weights for policy 1, policy_version 45680 (0.0009) [2023-10-12 17:40:40,580][62635] Updated weights for policy 1, policy_version 45690 (0.0008) [2023-10-12 17:40:41,548][62634] Updated weights for policy 0, policy_version 45700 (0.0009) [2023-10-12 17:40:41,916][62634] Updated weights for policy 0, policy_version 45710 (0.0009) [2023-10-12 17:40:42,304][62634] Updated weights for policy 0, policy_version 45720 (0.0010) [2023-10-12 17:40:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93618176. Throughput: 0: 1672.1, 1: 1695.8. Samples: 23410480. 
Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-12 17:40:43,436][61643] Avg episode reward: [(0, '22.370'), (1, '9.780')] [2023-10-12 17:40:44,688][62635] Updated weights for policy 1, policy_version 45700 (0.0008) [2023-10-12 17:40:45,058][62635] Updated weights for policy 1, policy_version 45710 (0.0009) [2023-10-12 17:40:45,435][62635] Updated weights for policy 1, policy_version 45720 (0.0007) [2023-10-12 17:40:46,373][62634] Updated weights for policy 0, policy_version 45730 (0.0008) [2023-10-12 17:40:46,745][62634] Updated weights for policy 0, policy_version 45740 (0.0010) [2023-10-12 17:40:47,121][62634] Updated weights for policy 0, policy_version 45750 (0.0009) [2023-10-12 17:40:47,500][62634] Updated weights for policy 0, policy_version 45760 (0.0007) [2023-10-12 17:40:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93683712. Throughput: 0: 1671.5, 1: 1700.9. Samples: 23430482. Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-12 17:40:48,436][61643] Avg episode reward: [(0, '22.430'), (1, '10.020')] [2023-10-12 17:40:49,491][62635] Updated weights for policy 1, policy_version 45730 (0.0007) [2023-10-12 17:40:49,855][62635] Updated weights for policy 1, policy_version 45740 (0.0007) [2023-10-12 17:40:50,227][62635] Updated weights for policy 1, policy_version 45750 (0.0007) [2023-10-12 17:40:50,594][62635] Updated weights for policy 1, policy_version 45760 (0.0008) [2023-10-12 17:40:51,506][62634] Updated weights for policy 0, policy_version 45770 (0.0009) [2023-10-12 17:40:51,879][62634] Updated weights for policy 0, policy_version 45780 (0.0008) [2023-10-12 17:40:52,247][62634] Updated weights for policy 0, policy_version 45790 (0.0007) [2023-10-12 17:40:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93749248. Throughput: 0: 1686.5, 1: 1673.9. Samples: 23440868. 
Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-12 17:40:53,435][61643] Avg episode reward: [(0, '22.460'), (1, '9.810')] [2023-10-12 17:40:54,667][62635] Updated weights for policy 1, policy_version 45770 (0.0009) [2023-10-12 17:40:55,035][62635] Updated weights for policy 1, policy_version 45780 (0.0008) [2023-10-12 17:40:55,402][62635] Updated weights for policy 1, policy_version 45790 (0.0009) [2023-10-12 17:40:56,216][62634] Updated weights for policy 0, policy_version 45800 (0.0008) [2023-10-12 17:40:56,594][62634] Updated weights for policy 0, policy_version 45810 (0.0007) [2023-10-12 17:40:56,971][62634] Updated weights for policy 0, policy_version 45820 (0.0008) [2023-10-12 17:40:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93814784. Throughput: 0: 1662.8, 1: 1694.7. Samples: 23460684. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:40:58,436][61643] Avg episode reward: [(0, '22.610'), (1, '9.750')] [2023-10-12 17:40:59,440][62635] Updated weights for policy 1, policy_version 45800 (0.0008) [2023-10-12 17:40:59,810][62635] Updated weights for policy 1, policy_version 45810 (0.0008) [2023-10-12 17:41:00,183][62635] Updated weights for policy 1, policy_version 45820 (0.0007) [2023-10-12 17:41:00,866][62634] Updated weights for policy 0, policy_version 45830 (0.0009) [2023-10-12 17:41:01,241][62634] Updated weights for policy 0, policy_version 45840 (0.0008) [2023-10-12 17:41:01,627][62634] Updated weights for policy 0, policy_version 45850 (0.0007) [2023-10-12 17:41:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93880320. Throughput: 0: 1683.3, 1: 1688.3. Samples: 23481182. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:41:03,436][61643] Avg episode reward: [(0, '22.690'), (1, '9.880')] [2023-10-12 17:41:04,180][62635] Updated weights for policy 1, policy_version 45830 (0.0009) [2023-10-12 17:41:04,547][62635] Updated weights for policy 1, policy_version 45840 (0.0010) [2023-10-12 17:41:04,922][62635] Updated weights for policy 1, policy_version 45850 (0.0008) [2023-10-12 17:41:05,903][62634] Updated weights for policy 0, policy_version 45860 (0.0008) [2023-10-12 17:41:06,265][62634] Updated weights for policy 0, policy_version 45870 (0.0008) [2023-10-12 17:41:06,650][62634] Updated weights for policy 0, policy_version 45880 (0.0008) [2023-10-12 17:41:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 93945856. Throughput: 0: 1680.1, 1: 1674.2. Samples: 23491118. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:41:08,435][61643] Avg episode reward: [(0, '22.950'), (1, '9.750')] [2023-10-12 17:41:09,056][62635] Updated weights for policy 1, policy_version 45860 (0.0008) [2023-10-12 17:41:09,415][62635] Updated weights for policy 1, policy_version 45870 (0.0010) [2023-10-12 17:41:09,781][62635] Updated weights for policy 1, policy_version 45880 (0.0008) [2023-10-12 17:41:10,700][62634] Updated weights for policy 0, policy_version 45890 (0.0008) [2023-10-12 17:41:11,080][62634] Updated weights for policy 0, policy_version 45900 (0.0007) [2023-10-12 17:41:11,465][62634] Updated weights for policy 0, policy_version 45910 (0.0008) [2023-10-12 17:41:11,842][62634] Updated weights for policy 0, policy_version 45920 (0.0007) [2023-10-12 17:41:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94011392. Throughput: 0: 1655.6, 1: 1682.5. Samples: 23510558. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:41:13,435][61643] Avg episode reward: [(0, '22.830'), (1, '9.820')] [2023-10-12 17:41:13,855][62635] Updated weights for policy 1, policy_version 45890 (0.0008) [2023-10-12 17:41:14,222][62635] Updated weights for policy 1, policy_version 45900 (0.0009) [2023-10-12 17:41:14,587][62635] Updated weights for policy 1, policy_version 45910 (0.0008) [2023-10-12 17:41:14,957][62635] Updated weights for policy 1, policy_version 45920 (0.0009) [2023-10-12 17:41:15,711][62634] Updated weights for policy 0, policy_version 45930 (0.0008) [2023-10-12 17:41:16,096][62634] Updated weights for policy 0, policy_version 45940 (0.0007) [2023-10-12 17:41:16,469][62634] Updated weights for policy 0, policy_version 45950 (0.0007) [2023-10-12 17:41:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94076928. Throughput: 0: 1682.5, 1: 1678.4. Samples: 23531500. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:41:18,435][61643] Avg episode reward: [(0, '23.180'), (1, '10.000')] [2023-10-12 17:41:19,044][62635] Updated weights for policy 1, policy_version 45930 (0.0007) [2023-10-12 17:41:19,419][62635] Updated weights for policy 1, policy_version 45940 (0.0009) [2023-10-12 17:41:19,781][62635] Updated weights for policy 1, policy_version 45950 (0.0007) [2023-10-12 17:41:20,505][62634] Updated weights for policy 0, policy_version 45960 (0.0008) [2023-10-12 17:41:20,885][62634] Updated weights for policy 0, policy_version 45970 (0.0008) [2023-10-12 17:41:21,270][62634] Updated weights for policy 0, policy_version 45980 (0.0008) [2023-10-12 17:41:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94142464. Throughput: 0: 1665.3, 1: 1678.1. Samples: 23541112. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:41:23,435][61643] Avg episode reward: [(0, '23.360'), (1, '9.870')] [2023-10-12 17:41:24,023][62635] Updated weights for policy 1, policy_version 45960 (0.0008) [2023-10-12 17:41:24,389][62635] Updated weights for policy 1, policy_version 45970 (0.0009) [2023-10-12 17:41:24,749][62635] Updated weights for policy 1, policy_version 45980 (0.0008) [2023-10-12 17:41:25,338][62634] Updated weights for policy 0, policy_version 45990 (0.0007) [2023-10-12 17:41:25,725][62634] Updated weights for policy 0, policy_version 46000 (0.0007) [2023-10-12 17:41:26,097][62634] Updated weights for policy 0, policy_version 46010 (0.0007) [2023-10-12 17:41:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94208000. Throughput: 0: 1673.1, 1: 1673.0. Samples: 23561054. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 17:41:28,435][61643] Avg episode reward: [(0, '23.710'), (1, '9.770')] [2023-10-12 17:41:28,436][62354] Saving new best policy, reward=23.710! [2023-10-12 17:41:28,848][62635] Updated weights for policy 1, policy_version 45990 (0.0010) [2023-10-12 17:41:29,222][62635] Updated weights for policy 1, policy_version 46000 (0.0008) [2023-10-12 17:41:29,590][62635] Updated weights for policy 1, policy_version 46010 (0.0009) [2023-10-12 17:41:30,156][62634] Updated weights for policy 0, policy_version 46020 (0.0008) [2023-10-12 17:41:30,552][62634] Updated weights for policy 0, policy_version 46030 (0.0009) [2023-10-12 17:41:30,927][62634] Updated weights for policy 0, policy_version 46040 (0.0009) [2023-10-12 17:41:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94273536. Throughput: 0: 1688.8, 1: 1674.5. Samples: 23581826. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:41:33,435][61643] Avg episode reward: [(0, '23.500'), (1, '9.980')] [2023-10-12 17:41:33,630][62635] Updated weights for policy 1, policy_version 46020 (0.0010) [2023-10-12 17:41:34,000][62635] Updated weights for policy 1, policy_version 46030 (0.0010) [2023-10-12 17:41:34,367][62635] Updated weights for policy 1, policy_version 46040 (0.0009) [2023-10-12 17:41:35,044][62634] Updated weights for policy 0, policy_version 46050 (0.0008) [2023-10-12 17:41:35,419][62634] Updated weights for policy 0, policy_version 46060 (0.0007) [2023-10-12 17:41:35,796][62634] Updated weights for policy 0, policy_version 46070 (0.0008) [2023-10-12 17:41:36,176][62634] Updated weights for policy 0, policy_version 46080 (0.0009) [2023-10-12 17:41:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94339072. Throughput: 0: 1662.7, 1: 1679.4. Samples: 23591262. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:41:38,435][61643] Avg episode reward: [(0, '23.630'), (1, '9.830')] [2023-10-12 17:41:38,547][62635] Updated weights for policy 1, policy_version 46050 (0.0008) [2023-10-12 17:41:38,914][62635] Updated weights for policy 1, policy_version 46060 (0.0009) [2023-10-12 17:41:39,281][62635] Updated weights for policy 1, policy_version 46070 (0.0009) [2023-10-12 17:41:39,643][62635] Updated weights for policy 1, policy_version 46080 (0.0009) [2023-10-12 17:41:40,314][62634] Updated weights for policy 0, policy_version 46090 (0.0010) [2023-10-12 17:41:40,691][62634] Updated weights for policy 0, policy_version 46100 (0.0009) [2023-10-12 17:41:41,067][62634] Updated weights for policy 0, policy_version 46110 (0.0008) [2023-10-12 17:41:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94404608. Throughput: 0: 1678.1, 1: 1678.3. Samples: 23611718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:41:43,435][61643] Avg episode reward: [(0, '23.580'), (1, '9.680')] [2023-10-12 17:41:43,815][62635] Updated weights for policy 1, policy_version 46090 (0.0008) [2023-10-12 17:41:44,181][62635] Updated weights for policy 1, policy_version 46100 (0.0008) [2023-10-12 17:41:44,557][62635] Updated weights for policy 1, policy_version 46110 (0.0008) [2023-10-12 17:41:44,961][62634] Updated weights for policy 0, policy_version 46120 (0.0009) [2023-10-12 17:41:45,338][62634] Updated weights for policy 0, policy_version 46130 (0.0008) [2023-10-12 17:41:45,716][62634] Updated weights for policy 0, policy_version 46140 (0.0009) [2023-10-12 17:41:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 94470144. Throughput: 0: 1689.6, 1: 1675.8. Samples: 23632622. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:41:48,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.850')] [2023-10-12 17:41:48,445][62354] Saving new best policy, reward=23.800! [2023-10-12 17:41:48,647][62635] Updated weights for policy 1, policy_version 46120 (0.0008) [2023-10-12 17:41:49,021][62635] Updated weights for policy 1, policy_version 46130 (0.0008) [2023-10-12 17:41:49,391][62635] Updated weights for policy 1, policy_version 46140 (0.0009) [2023-10-12 17:41:49,695][62634] Updated weights for policy 0, policy_version 46150 (0.0008) [2023-10-12 17:41:50,069][62634] Updated weights for policy 0, policy_version 46160 (0.0011) [2023-10-12 17:41:50,457][62634] Updated weights for policy 0, policy_version 46170 (0.0010) [2023-10-12 17:41:53,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 94535680. Throughput: 0: 1663.4, 1: 1679.8. Samples: 23641562. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:41:53,436][61643] Avg episode reward: [(0, '23.920'), (1, '9.780')] [2023-10-12 17:41:53,438][62354] Saving new best policy, reward=23.920! [2023-10-12 17:41:53,499][62635] Updated weights for policy 1, policy_version 46150 (0.0008) [2023-10-12 17:41:53,859][62635] Updated weights for policy 1, policy_version 46160 (0.0008) [2023-10-12 17:41:54,235][62635] Updated weights for policy 1, policy_version 46170 (0.0009) [2023-10-12 17:41:54,324][62634] Updated weights for policy 0, policy_version 46180 (0.0007) [2023-10-12 17:41:54,702][62634] Updated weights for policy 0, policy_version 46190 (0.0007) [2023-10-12 17:41:55,087][62634] Updated weights for policy 0, policy_version 46200 (0.0007) [2023-10-12 17:41:58,133][62635] Updated weights for policy 1, policy_version 46180 (0.0009) [2023-10-12 17:41:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94601216. Throughput: 0: 1696.0, 1: 1681.8. Samples: 23662560. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:41:58,435][61643] Avg episode reward: [(0, '23.540'), (1, '9.590')] [2023-10-12 17:41:58,506][62635] Updated weights for policy 1, policy_version 46190 (0.0008) [2023-10-12 17:41:58,871][62635] Updated weights for policy 1, policy_version 46200 (0.0011) [2023-10-12 17:41:59,122][62634] Updated weights for policy 0, policy_version 46210 (0.0007) [2023-10-12 17:41:59,497][62634] Updated weights for policy 0, policy_version 46220 (0.0009) [2023-10-12 17:41:59,868][62634] Updated weights for policy 0, policy_version 46230 (0.0008) [2023-10-12 17:42:00,237][62634] Updated weights for policy 0, policy_version 46240 (0.0009) [2023-10-12 17:42:02,808][62635] Updated weights for policy 1, policy_version 46210 (0.0009) [2023-10-12 17:42:03,173][62635] Updated weights for policy 1, policy_version 46220 (0.0009) [2023-10-12 17:42:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 94666752. Throughput: 0: 1687.9, 1: 1675.7. Samples: 23682862. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:42:03,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.680')] [2023-10-12 17:42:03,448][62354] Saving new best policy, reward=23.970! [2023-10-12 17:42:03,540][62635] Updated weights for policy 1, policy_version 46230 (0.0010) [2023-10-12 17:42:03,909][62635] Updated weights for policy 1, policy_version 46240 (0.0008) [2023-10-12 17:42:04,378][62634] Updated weights for policy 0, policy_version 46250 (0.0007) [2023-10-12 17:42:04,764][62634] Updated weights for policy 0, policy_version 46260 (0.0010) [2023-10-12 17:42:05,140][62634] Updated weights for policy 0, policy_version 46270 (0.0009) [2023-10-12 17:42:08,228][62635] Updated weights for policy 1, policy_version 46250 (0.0008) [2023-10-12 17:42:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94732288. Throughput: 0: 1674.8, 1: 1680.4. Samples: 23692094. Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:08,435][61643] Avg episode reward: [(0, '24.070'), (1, '9.670')] [2023-10-12 17:42:08,436][62354] Saving new best policy, reward=24.070! [2023-10-12 17:42:08,592][62635] Updated weights for policy 1, policy_version 46260 (0.0008) [2023-10-12 17:42:08,960][62635] Updated weights for policy 1, policy_version 46270 (0.0008) [2023-10-12 17:42:09,020][62634] Updated weights for policy 0, policy_version 46280 (0.0007) [2023-10-12 17:42:09,400][62634] Updated weights for policy 0, policy_version 46290 (0.0008) [2023-10-12 17:42:09,767][62634] Updated weights for policy 0, policy_version 46300 (0.0008) [2023-10-12 17:42:13,123][62635] Updated weights for policy 1, policy_version 46280 (0.0007) [2023-10-12 17:42:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94797824. Throughput: 0: 1691.6, 1: 1686.5. Samples: 23713070. 
Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:13,436][61643] Avg episode reward: [(0, '24.120'), (1, '9.380')] [2023-10-12 17:42:13,437][62354] Saving new best policy, reward=24.120! [2023-10-12 17:42:13,500][62635] Updated weights for policy 1, policy_version 46290 (0.0007) [2023-10-12 17:42:13,863][62635] Updated weights for policy 1, policy_version 46300 (0.0009) [2023-10-12 17:42:14,132][62634] Updated weights for policy 0, policy_version 46310 (0.0011) [2023-10-12 17:42:14,505][62634] Updated weights for policy 0, policy_version 46320 (0.0009) [2023-10-12 17:42:14,882][62634] Updated weights for policy 0, policy_version 46330 (0.0011) [2023-10-12 17:42:17,765][62635] Updated weights for policy 1, policy_version 46310 (0.0007) [2023-10-12 17:42:18,128][62635] Updated weights for policy 1, policy_version 46320 (0.0007) [2023-10-12 17:42:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94863360. Throughput: 0: 1690.5, 1: 1674.0. Samples: 23733230. Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:18,435][61643] Avg episode reward: [(0, '24.040'), (1, '9.400')] [2023-10-12 17:42:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000046336_47448064.pth... [2023-10-12 17:42:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000044768_45842432.pth [2023-10-12 17:42:18,502][62635] Updated weights for policy 1, policy_version 46330 (0.0007) [2023-10-12 17:42:18,712][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000046336_47448064.pth... 
[2023-10-12 17:42:18,748][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000044736_45809664.pth [2023-10-12 17:42:18,946][62634] Updated weights for policy 0, policy_version 46340 (0.0009) [2023-10-12 17:42:19,337][62634] Updated weights for policy 0, policy_version 46350 (0.0007) [2023-10-12 17:42:19,703][62634] Updated weights for policy 0, policy_version 46360 (0.0007) [2023-10-12 17:42:22,466][62635] Updated weights for policy 1, policy_version 46340 (0.0007) [2023-10-12 17:42:22,845][62635] Updated weights for policy 1, policy_version 46350 (0.0007) [2023-10-12 17:42:23,209][62635] Updated weights for policy 1, policy_version 46360 (0.0008) [2023-10-12 17:42:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 94928896. Throughput: 0: 1681.8, 1: 1683.4. Samples: 23742694. Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:23,436][61643] Avg episode reward: [(0, '24.150'), (1, '9.620')] [2023-10-12 17:42:23,436][62354] Saving new best policy, reward=24.150! [2023-10-12 17:42:23,789][62634] Updated weights for policy 0, policy_version 46370 (0.0010) [2023-10-12 17:42:24,164][62634] Updated weights for policy 0, policy_version 46380 (0.0008) [2023-10-12 17:42:24,535][62634] Updated weights for policy 0, policy_version 46390 (0.0009) [2023-10-12 17:42:24,911][62634] Updated weights for policy 0, policy_version 46400 (0.0009) [2023-10-12 17:42:27,160][62635] Updated weights for policy 1, policy_version 46370 (0.0010) [2023-10-12 17:42:27,522][62635] Updated weights for policy 1, policy_version 46380 (0.0011) [2023-10-12 17:42:27,883][62635] Updated weights for policy 1, policy_version 46390 (0.0010) [2023-10-12 17:42:28,250][62635] Updated weights for policy 1, policy_version 46400 (0.0010) [2023-10-12 17:42:28,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 95027200. Throughput: 0: 1683.6, 1: 1687.3. Samples: 23763410. 
Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:28,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.530')] [2023-10-12 17:42:28,856][62634] Updated weights for policy 0, policy_version 46410 (0.0010) [2023-10-12 17:42:29,229][62634] Updated weights for policy 0, policy_version 46420 (0.0009) [2023-10-12 17:42:29,606][62634] Updated weights for policy 0, policy_version 46430 (0.0010) [2023-10-12 17:42:29,681][62354] Saving new best policy, reward=24.270! [2023-10-12 17:42:32,516][62635] Updated weights for policy 1, policy_version 46410 (0.0008) [2023-10-12 17:42:32,885][62635] Updated weights for policy 1, policy_version 46420 (0.0007) [2023-10-12 17:42:33,251][62635] Updated weights for policy 1, policy_version 46430 (0.0008) [2023-10-12 17:42:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95092736. Throughput: 0: 1679.6, 1: 1665.4. Samples: 23783146. Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:33,436][61643] Avg episode reward: [(0, '24.110'), (1, '9.420')] [2023-10-12 17:42:33,680][62634] Updated weights for policy 0, policy_version 46440 (0.0008) [2023-10-12 17:42:34,052][62634] Updated weights for policy 0, policy_version 46450 (0.0010) [2023-10-12 17:42:34,433][62634] Updated weights for policy 0, policy_version 46460 (0.0009) [2023-10-12 17:42:37,228][62635] Updated weights for policy 1, policy_version 46440 (0.0009) [2023-10-12 17:42:37,598][62635] Updated weights for policy 1, policy_version 46450 (0.0007) [2023-10-12 17:42:37,970][62635] Updated weights for policy 1, policy_version 46460 (0.0010) [2023-10-12 17:42:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95158272. Throughput: 0: 1683.3, 1: 1682.9. Samples: 23793042. 
Policy #0 lag: (min: 14.0, avg: 24.3, max: 46.0) [2023-10-12 17:42:38,435][61643] Avg episode reward: [(0, '23.980'), (1, '9.760')] [2023-10-12 17:42:38,582][62634] Updated weights for policy 0, policy_version 46470 (0.0008) [2023-10-12 17:42:38,961][62634] Updated weights for policy 0, policy_version 46480 (0.0009) [2023-10-12 17:42:39,346][62634] Updated weights for policy 0, policy_version 46490 (0.0008) [2023-10-12 17:42:42,136][62635] Updated weights for policy 1, policy_version 46470 (0.0009) [2023-10-12 17:42:42,508][62635] Updated weights for policy 1, policy_version 46480 (0.0010) [2023-10-12 17:42:42,873][62635] Updated weights for policy 1, policy_version 46490 (0.0009) [2023-10-12 17:42:43,391][62634] Updated weights for policy 0, policy_version 46500 (0.0009) [2023-10-12 17:42:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95223808. Throughput: 0: 1676.1, 1: 1673.9. Samples: 23813310. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:42:43,435][61643] Avg episode reward: [(0, '23.990'), (1, '9.710')] [2023-10-12 17:42:43,775][62634] Updated weights for policy 0, policy_version 46510 (0.0008) [2023-10-12 17:42:44,148][62634] Updated weights for policy 0, policy_version 46520 (0.0009) [2023-10-12 17:42:46,874][62635] Updated weights for policy 1, policy_version 46500 (0.0009) [2023-10-12 17:42:47,248][62635] Updated weights for policy 1, policy_version 46510 (0.0009) [2023-10-12 17:42:47,626][62635] Updated weights for policy 1, policy_version 46520 (0.0009) [2023-10-12 17:42:48,209][62634] Updated weights for policy 0, policy_version 46530 (0.0008) [2023-10-12 17:42:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95289344. Throughput: 0: 1685.5, 1: 1658.2. Samples: 23833326. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:42:48,436][61643] Avg episode reward: [(0, '24.210'), (1, '9.850')] [2023-10-12 17:42:48,591][62634] Updated weights for policy 0, policy_version 46540 (0.0009) [2023-10-12 17:42:48,976][62634] Updated weights for policy 0, policy_version 46550 (0.0008) [2023-10-12 17:42:49,354][62634] Updated weights for policy 0, policy_version 46560 (0.0007) [2023-10-12 17:42:51,711][62635] Updated weights for policy 1, policy_version 46530 (0.0009) [2023-10-12 17:42:52,086][62635] Updated weights for policy 1, policy_version 46540 (0.0009) [2023-10-12 17:42:52,458][62635] Updated weights for policy 1, policy_version 46550 (0.0008) [2023-10-12 17:42:52,831][62635] Updated weights for policy 1, policy_version 46560 (0.0008) [2023-10-12 17:42:53,287][62634] Updated weights for policy 0, policy_version 46570 (0.0009) [2023-10-12 17:42:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 95354880. Throughput: 0: 1684.0, 1: 1682.1. Samples: 23843566. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:42:53,436][61643] Avg episode reward: [(0, '24.130'), (1, '10.010')] [2023-10-12 17:42:53,673][62634] Updated weights for policy 0, policy_version 46580 (0.0009) [2023-10-12 17:42:54,041][62634] Updated weights for policy 0, policy_version 46590 (0.0009) [2023-10-12 17:42:56,876][62635] Updated weights for policy 1, policy_version 46570 (0.0008) [2023-10-12 17:42:57,248][62635] Updated weights for policy 1, policy_version 46580 (0.0008) [2023-10-12 17:42:57,616][62635] Updated weights for policy 1, policy_version 46590 (0.0008) [2023-10-12 17:42:58,233][62634] Updated weights for policy 0, policy_version 46600 (0.0009) [2023-10-12 17:42:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95420416. Throughput: 0: 1682.3, 1: 1669.5. Samples: 23863900. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:42:58,436][61643] Avg episode reward: [(0, '24.200'), (1, '9.720')] [2023-10-12 17:42:58,614][62634] Updated weights for policy 0, policy_version 46610 (0.0011) [2023-10-12 17:42:58,981][62634] Updated weights for policy 0, policy_version 46620 (0.0010) [2023-10-12 17:43:01,752][62635] Updated weights for policy 1, policy_version 46600 (0.0008) [2023-10-12 17:43:02,122][62635] Updated weights for policy 1, policy_version 46610 (0.0010) [2023-10-12 17:43:02,494][62635] Updated weights for policy 1, policy_version 46620 (0.0007) [2023-10-12 17:43:02,966][62634] Updated weights for policy 0, policy_version 46630 (0.0009) [2023-10-12 17:43:03,341][62634] Updated weights for policy 0, policy_version 46640 (0.0011) [2023-10-12 17:43:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 95485952. Throughput: 0: 1681.1, 1: 1662.5. Samples: 23883690. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:43:03,435][61643] Avg episode reward: [(0, '23.910'), (1, '9.820')] [2023-10-12 17:43:03,724][62634] Updated weights for policy 0, policy_version 46650 (0.0008) [2023-10-12 17:43:06,391][62635] Updated weights for policy 1, policy_version 46630 (0.0009) [2023-10-12 17:43:06,762][62635] Updated weights for policy 1, policy_version 46640 (0.0011) [2023-10-12 17:43:07,122][62635] Updated weights for policy 1, policy_version 46650 (0.0010) [2023-10-12 17:43:07,909][62634] Updated weights for policy 0, policy_version 46660 (0.0007) [2023-10-12 17:43:08,309][62634] Updated weights for policy 0, policy_version 46670 (0.0009) [2023-10-12 17:43:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95551488. Throughput: 0: 1689.9, 1: 1684.5. Samples: 23894542. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:43:08,435][61643] Avg episode reward: [(0, '23.920'), (1, '10.090')] [2023-10-12 17:43:08,687][62634] Updated weights for policy 0, policy_version 46680 (0.0010) [2023-10-12 17:43:11,214][62635] Updated weights for policy 1, policy_version 46660 (0.0009) [2023-10-12 17:43:11,581][62635] Updated weights for policy 1, policy_version 46670 (0.0008) [2023-10-12 17:43:11,940][62635] Updated weights for policy 1, policy_version 46680 (0.0009) [2023-10-12 17:43:12,764][62634] Updated weights for policy 0, policy_version 46690 (0.0010) [2023-10-12 17:43:13,144][62634] Updated weights for policy 0, policy_version 46700 (0.0009) [2023-10-12 17:43:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 95617024. Throughput: 0: 1688.6, 1: 1658.1. Samples: 23914010. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:43:13,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.820')] [2023-10-12 17:43:13,528][62634] Updated weights for policy 0, policy_version 46710 (0.0009) [2023-10-12 17:43:13,899][62634] Updated weights for policy 0, policy_version 46720 (0.0010) [2023-10-12 17:43:15,962][62635] Updated weights for policy 1, policy_version 46690 (0.0008) [2023-10-12 17:43:16,330][62635] Updated weights for policy 1, policy_version 46700 (0.0010) [2023-10-12 17:43:16,709][62635] Updated weights for policy 1, policy_version 46710 (0.0007) [2023-10-12 17:43:17,079][62635] Updated weights for policy 1, policy_version 46720 (0.0008) [2023-10-12 17:43:17,849][62634] Updated weights for policy 0, policy_version 46730 (0.0007) [2023-10-12 17:43:18,233][62634] Updated weights for policy 0, policy_version 46740 (0.0008) [2023-10-12 17:43:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95682560. Throughput: 0: 1678.8, 1: 1681.1. Samples: 23934342. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 17:43:18,436][61643] Avg episode reward: [(0, '23.950'), (1, '9.950')] [2023-10-12 17:43:18,608][62634] Updated weights for policy 0, policy_version 46750 (0.0008) [2023-10-12 17:43:21,172][62635] Updated weights for policy 1, policy_version 46730 (0.0010) [2023-10-12 17:43:21,548][62635] Updated weights for policy 1, policy_version 46740 (0.0011) [2023-10-12 17:43:21,922][62635] Updated weights for policy 1, policy_version 46750 (0.0010) [2023-10-12 17:43:22,629][62634] Updated weights for policy 0, policy_version 46760 (0.0009) [2023-10-12 17:43:23,000][62634] Updated weights for policy 0, policy_version 46770 (0.0007) [2023-10-12 17:43:23,387][62634] Updated weights for policy 0, policy_version 46780 (0.0007) [2023-10-12 17:43:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 95748096. Throughput: 0: 1690.0, 1: 1685.8. Samples: 23944952. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 17:43:23,435][61643] Avg episode reward: [(0, '23.990'), (1, '10.100')] [2023-10-12 17:43:25,924][62635] Updated weights for policy 1, policy_version 46760 (0.0007) [2023-10-12 17:43:26,293][62635] Updated weights for policy 1, policy_version 46770 (0.0008) [2023-10-12 17:43:26,658][62635] Updated weights for policy 1, policy_version 46780 (0.0007) [2023-10-12 17:43:27,362][62634] Updated weights for policy 0, policy_version 46790 (0.0010) [2023-10-12 17:43:27,748][62634] Updated weights for policy 0, policy_version 46800 (0.0007) [2023-10-12 17:43:28,127][62634] Updated weights for policy 0, policy_version 46810 (0.0009) [2023-10-12 17:43:28,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95846400. Throughput: 0: 1699.3, 1: 1671.0. Samples: 23964974. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 17:43:28,436][61643] Avg episode reward: [(0, '24.000'), (1, '9.890')] [2023-10-12 17:43:30,662][62635] Updated weights for policy 1, policy_version 46790 (0.0007) [2023-10-12 17:43:31,036][62635] Updated weights for policy 1, policy_version 46800 (0.0007) [2023-10-12 17:43:31,395][62635] Updated weights for policy 1, policy_version 46810 (0.0008) [2023-10-12 17:43:32,013][62634] Updated weights for policy 0, policy_version 46820 (0.0010) [2023-10-12 17:43:32,389][62634] Updated weights for policy 0, policy_version 46830 (0.0007) [2023-10-12 17:43:32,763][62634] Updated weights for policy 0, policy_version 46840 (0.0007) [2023-10-12 17:43:33,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 95911936. Throughput: 0: 1671.8, 1: 1696.4. Samples: 23984894. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 17:43:33,435][61643] Avg episode reward: [(0, '23.720'), (1, '9.980')] [2023-10-12 17:43:35,438][62635] Updated weights for policy 1, policy_version 46820 (0.0008) [2023-10-12 17:43:35,807][62635] Updated weights for policy 1, policy_version 46830 (0.0009) [2023-10-12 17:43:36,167][62635] Updated weights for policy 1, policy_version 46840 (0.0008) [2023-10-12 17:43:36,821][62634] Updated weights for policy 0, policy_version 46850 (0.0009) [2023-10-12 17:43:37,195][62634] Updated weights for policy 0, policy_version 46860 (0.0007) [2023-10-12 17:43:37,578][62634] Updated weights for policy 0, policy_version 46870 (0.0007) [2023-10-12 17:43:37,954][62634] Updated weights for policy 0, policy_version 46880 (0.0007) [2023-10-12 17:43:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95977472. Throughput: 0: 1696.8, 1: 1682.4. Samples: 23995632. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 17:43:38,435][61643] Avg episode reward: [(0, '23.510'), (1, '9.990')] [2023-10-12 17:43:40,233][62635] Updated weights for policy 1, policy_version 46850 (0.0007) [2023-10-12 17:43:40,594][62635] Updated weights for policy 1, policy_version 46860 (0.0008) [2023-10-12 17:43:40,959][62635] Updated weights for policy 1, policy_version 46870 (0.0009) [2023-10-12 17:43:41,326][62635] Updated weights for policy 1, policy_version 46880 (0.0009) [2023-10-12 17:43:41,912][62634] Updated weights for policy 0, policy_version 46890 (0.0011) [2023-10-12 17:43:42,297][62634] Updated weights for policy 0, policy_version 46900 (0.0010) [2023-10-12 17:43:42,671][62634] Updated weights for policy 0, policy_version 46910 (0.0009) [2023-10-12 17:43:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96043008. Throughput: 0: 1688.0, 1: 1678.8. Samples: 24015408. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-12 17:43:43,436][61643] Avg episode reward: [(0, '23.310'), (1, '9.900')] [2023-10-12 17:43:45,407][62635] Updated weights for policy 1, policy_version 46890 (0.0007) [2023-10-12 17:43:45,773][62635] Updated weights for policy 1, policy_version 46900 (0.0010) [2023-10-12 17:43:46,140][62635] Updated weights for policy 1, policy_version 46910 (0.0009) [2023-10-12 17:43:46,634][62634] Updated weights for policy 0, policy_version 46920 (0.0008) [2023-10-12 17:43:47,013][62634] Updated weights for policy 0, policy_version 46930 (0.0009) [2023-10-12 17:43:47,402][62634] Updated weights for policy 0, policy_version 46940 (0.0009) [2023-10-12 17:43:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 96108544. Throughput: 0: 1671.2, 1: 1698.7. Samples: 24035338. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:43:48,435][61643] Avg episode reward: [(0, '23.420'), (1, '9.740')] [2023-10-12 17:43:50,293][62635] Updated weights for policy 1, policy_version 46920 (0.0008) [2023-10-12 17:43:50,667][62635] Updated weights for policy 1, policy_version 46930 (0.0010) [2023-10-12 17:43:51,034][62635] Updated weights for policy 1, policy_version 46940 (0.0011) [2023-10-12 17:43:51,495][62634] Updated weights for policy 0, policy_version 46950 (0.0009) [2023-10-12 17:43:51,872][62634] Updated weights for policy 0, policy_version 46960 (0.0008) [2023-10-12 17:43:52,254][62634] Updated weights for policy 0, policy_version 46970 (0.0007) [2023-10-12 17:43:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96174080. Throughput: 0: 1694.8, 1: 1664.7. Samples: 24045716. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:43:53,435][61643] Avg episode reward: [(0, '23.130'), (1, '9.920')] [2023-10-12 17:43:55,103][62635] Updated weights for policy 1, policy_version 46950 (0.0008) [2023-10-12 17:43:55,462][62635] Updated weights for policy 1, policy_version 46960 (0.0008) [2023-10-12 17:43:55,838][62635] Updated weights for policy 1, policy_version 46970 (0.0008) [2023-10-12 17:43:56,325][62634] Updated weights for policy 0, policy_version 46980 (0.0008) [2023-10-12 17:43:56,715][62634] Updated weights for policy 0, policy_version 46990 (0.0007) [2023-10-12 17:43:57,100][62634] Updated weights for policy 0, policy_version 47000 (0.0008) [2023-10-12 17:43:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96239616. Throughput: 0: 1679.3, 1: 1683.2. Samples: 24065322. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:43:58,436][61643] Avg episode reward: [(0, '23.180'), (1, '10.000')] [2023-10-12 17:43:59,856][62635] Updated weights for policy 1, policy_version 46980 (0.0010) [2023-10-12 17:44:00,218][62635] Updated weights for policy 1, policy_version 46990 (0.0010) [2023-10-12 17:44:00,589][62635] Updated weights for policy 1, policy_version 47000 (0.0007) [2023-10-12 17:44:01,044][62634] Updated weights for policy 0, policy_version 47010 (0.0009) [2023-10-12 17:44:01,420][62634] Updated weights for policy 0, policy_version 47020 (0.0008) [2023-10-12 17:44:01,791][62634] Updated weights for policy 0, policy_version 47030 (0.0009) [2023-10-12 17:44:02,170][62634] Updated weights for policy 0, policy_version 47040 (0.0010) [2023-10-12 17:44:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96305152. Throughput: 0: 1678.6, 1: 1686.4. Samples: 24085766. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:44:03,436][61643] Avg episode reward: [(0, '23.140'), (1, '9.720')] [2023-10-12 17:44:04,853][62635] Updated weights for policy 1, policy_version 47010 (0.0008) [2023-10-12 17:44:05,223][62635] Updated weights for policy 1, policy_version 47020 (0.0010) [2023-10-12 17:44:05,593][62635] Updated weights for policy 1, policy_version 47030 (0.0008) [2023-10-12 17:44:05,961][62635] Updated weights for policy 1, policy_version 47040 (0.0007) [2023-10-12 17:44:06,199][62634] Updated weights for policy 0, policy_version 47050 (0.0007) [2023-10-12 17:44:06,569][62634] Updated weights for policy 0, policy_version 47060 (0.0008) [2023-10-12 17:44:06,946][62634] Updated weights for policy 0, policy_version 47070 (0.0008) [2023-10-12 17:44:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96370688. Throughput: 0: 1694.0, 1: 1659.2. Samples: 24095844. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:44:08,436][61643] Avg episode reward: [(0, '23.540'), (1, '9.970')] [2023-10-12 17:44:10,099][62635] Updated weights for policy 1, policy_version 47050 (0.0008) [2023-10-12 17:44:10,467][62635] Updated weights for policy 1, policy_version 47060 (0.0007) [2023-10-12 17:44:10,840][62635] Updated weights for policy 1, policy_version 47070 (0.0008) [2023-10-12 17:44:10,988][62634] Updated weights for policy 0, policy_version 47080 (0.0009) [2023-10-12 17:44:11,371][62634] Updated weights for policy 0, policy_version 47090 (0.0008) [2023-10-12 17:44:11,747][62634] Updated weights for policy 0, policy_version 47100 (0.0008) [2023-10-12 17:44:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96436224. Throughput: 0: 1663.7, 1: 1683.9. Samples: 24115618. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:44:13,436][61643] Avg episode reward: [(0, '23.610'), (1, '10.060')] [2023-10-12 17:44:14,768][62635] Updated weights for policy 1, policy_version 47080 (0.0007) [2023-10-12 17:44:15,146][62635] Updated weights for policy 1, policy_version 47090 (0.0008) [2023-10-12 17:44:15,516][62635] Updated weights for policy 1, policy_version 47100 (0.0009) [2023-10-12 17:44:15,843][62634] Updated weights for policy 0, policy_version 47110 (0.0007) [2023-10-12 17:44:16,218][62634] Updated weights for policy 0, policy_version 47120 (0.0007) [2023-10-12 17:44:16,595][62634] Updated weights for policy 0, policy_version 47130 (0.0008) [2023-10-12 17:44:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96501760. Throughput: 0: 1681.1, 1: 1679.1. Samples: 24136104. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-12 17:44:18,436][61643] Avg episode reward: [(0, '23.340'), (1, '9.750')] [2023-10-12 17:44:18,450][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000047104_48234496.pth... 
[2023-10-12 17:44:18,450][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000047136_48267264.pth... [2023-10-12 17:44:18,486][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000045536_46628864.pth [2023-10-12 17:44:18,491][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000045568_46661632.pth [2023-10-12 17:44:18,491][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000047104_48234496.pth [2023-10-12 17:44:18,496][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000047136_48267264.pth [2023-10-12 17:44:19,594][62635] Updated weights for policy 1, policy_version 47110 (0.0007) [2023-10-12 17:44:19,953][62635] Updated weights for policy 1, policy_version 47120 (0.0007) [2023-10-12 17:44:20,320][62635] Updated weights for policy 1, policy_version 47130 (0.0007) [2023-10-12 17:44:20,636][62634] Updated weights for policy 0, policy_version 47140 (0.0009) [2023-10-12 17:44:21,011][62634] Updated weights for policy 0, policy_version 47150 (0.0007) [2023-10-12 17:44:21,390][62634] Updated weights for policy 0, policy_version 47160 (0.0010) [2023-10-12 17:44:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 96567296. Throughput: 0: 1678.4, 1: 1663.0. Samples: 24145996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:23,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.850')] [2023-10-12 17:44:24,441][62635] Updated weights for policy 1, policy_version 47140 (0.0008) [2023-10-12 17:44:24,818][62635] Updated weights for policy 1, policy_version 47150 (0.0008) [2023-10-12 17:44:25,190][62635] Updated weights for policy 1, policy_version 47160 (0.0010) [2023-10-12 17:44:25,446][62634] Updated weights for policy 0, policy_version 47170 (0.0008) [2023-10-12 17:44:25,817][62634] Updated weights for policy 0, policy_version 47180 (0.0007) [2023-10-12 17:44:26,195][62634] Updated weights for policy 0, policy_version 47190 (0.0009) [2023-10-12 17:44:26,572][62634] Updated weights for policy 0, policy_version 47200 (0.0009) [2023-10-12 17:44:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 96632832. Throughput: 0: 1670.9, 1: 1676.8. Samples: 24166056. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:28,435][61643] Avg episode reward: [(0, '23.520'), (1, '10.120')] [2023-10-12 17:44:29,139][62635] Updated weights for policy 1, policy_version 47170 (0.0010) [2023-10-12 17:44:29,511][62635] Updated weights for policy 1, policy_version 47180 (0.0009) [2023-10-12 17:44:29,885][62635] Updated weights for policy 1, policy_version 47190 (0.0009) [2023-10-12 17:44:30,246][62635] Updated weights for policy 1, policy_version 47200 (0.0008) [2023-10-12 17:44:30,442][62634] Updated weights for policy 0, policy_version 47210 (0.0008) [2023-10-12 17:44:30,824][62634] Updated weights for policy 0, policy_version 47220 (0.0007) [2023-10-12 17:44:31,199][62634] Updated weights for policy 0, policy_version 47230 (0.0007) [2023-10-12 17:44:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96698368. Throughput: 0: 1689.7, 1: 1679.2. Samples: 24186938. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:33,435][61643] Avg episode reward: [(0, '23.690'), (1, '9.860')] [2023-10-12 17:44:34,421][62635] Updated weights for policy 1, policy_version 47210 (0.0011) [2023-10-12 17:44:34,782][62635] Updated weights for policy 1, policy_version 47220 (0.0010) [2023-10-12 17:44:35,149][62635] Updated weights for policy 1, policy_version 47230 (0.0010) [2023-10-12 17:44:35,378][62634] Updated weights for policy 0, policy_version 47240 (0.0009) [2023-10-12 17:44:35,755][62634] Updated weights for policy 0, policy_version 47250 (0.0010) [2023-10-12 17:44:36,127][62634] Updated weights for policy 0, policy_version 47260 (0.0010) [2023-10-12 17:44:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96763904. Throughput: 0: 1670.4, 1: 1676.4. Samples: 24196324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:38,435][61643] Avg episode reward: [(0, '23.780'), (1, '9.700')] [2023-10-12 17:44:39,359][62635] Updated weights for policy 1, policy_version 47240 (0.0009) [2023-10-12 17:44:39,737][62635] Updated weights for policy 1, policy_version 47250 (0.0009) [2023-10-12 17:44:40,049][62634] Updated weights for policy 0, policy_version 47270 (0.0008) [2023-10-12 17:44:40,096][62635] Updated weights for policy 1, policy_version 47260 (0.0008) [2023-10-12 17:44:40,426][62634] Updated weights for policy 0, policy_version 47280 (0.0007) [2023-10-12 17:44:40,795][62634] Updated weights for policy 0, policy_version 47290 (0.0008) [2023-10-12 17:44:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96829440. Throughput: 0: 1684.5, 1: 1679.7. Samples: 24216712. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:43,435][61643] Avg episode reward: [(0, '23.770'), (1, '9.760')] [2023-10-12 17:44:44,073][62635] Updated weights for policy 1, policy_version 47270 (0.0010) [2023-10-12 17:44:44,437][62635] Updated weights for policy 1, policy_version 47280 (0.0010) [2023-10-12 17:44:44,809][62635] Updated weights for policy 1, policy_version 47290 (0.0010) [2023-10-12 17:44:45,081][62634] Updated weights for policy 0, policy_version 47300 (0.0009) [2023-10-12 17:44:45,465][62634] Updated weights for policy 0, policy_version 47310 (0.0007) [2023-10-12 17:44:45,841][62634] Updated weights for policy 0, policy_version 47320 (0.0007) [2023-10-12 17:44:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96894976. Throughput: 0: 1686.4, 1: 1675.5. Samples: 24237054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:48,436][61643] Avg episode reward: [(0, '24.040'), (1, '10.040')] [2023-10-12 17:44:49,071][62635] Updated weights for policy 1, policy_version 47300 (0.0008) [2023-10-12 17:44:49,434][62635] Updated weights for policy 1, policy_version 47310 (0.0011) [2023-10-12 17:44:49,804][62635] Updated weights for policy 1, policy_version 47320 (0.0010) [2023-10-12 17:44:49,996][62634] Updated weights for policy 0, policy_version 47330 (0.0007) [2023-10-12 17:44:50,375][62634] Updated weights for policy 0, policy_version 47340 (0.0008) [2023-10-12 17:44:50,750][62634] Updated weights for policy 0, policy_version 47350 (0.0010) [2023-10-12 17:44:51,131][62634] Updated weights for policy 0, policy_version 47360 (0.0007) [2023-10-12 17:44:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96960512. Throughput: 0: 1665.1, 1: 1677.6. Samples: 24246264. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:44:53,436][61643] Avg episode reward: [(0, '24.020'), (1, '9.710')] [2023-10-12 17:44:53,891][62635] Updated weights for policy 1, policy_version 47330 (0.0008) [2023-10-12 17:44:54,256][62635] Updated weights for policy 1, policy_version 47340 (0.0009) [2023-10-12 17:44:54,634][62635] Updated weights for policy 1, policy_version 47350 (0.0009) [2023-10-12 17:44:54,996][62635] Updated weights for policy 1, policy_version 47360 (0.0008) [2023-10-12 17:44:55,154][62634] Updated weights for policy 0, policy_version 47370 (0.0007) [2023-10-12 17:44:55,542][62634] Updated weights for policy 0, policy_version 47380 (0.0008) [2023-10-12 17:44:55,913][62634] Updated weights for policy 0, policy_version 47390 (0.0007) [2023-10-12 17:44:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 97026048. Throughput: 0: 1683.8, 1: 1674.6. Samples: 24266748. Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:44:58,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.560')] [2023-10-12 17:44:58,436][62354] Saving new best policy, reward=24.340! [2023-10-12 17:44:59,189][62635] Updated weights for policy 1, policy_version 47370 (0.0009) [2023-10-12 17:44:59,545][62635] Updated weights for policy 1, policy_version 47380 (0.0011) [2023-10-12 17:44:59,910][62635] Updated weights for policy 1, policy_version 47390 (0.0010) [2023-10-12 17:45:00,198][62634] Updated weights for policy 0, policy_version 47400 (0.0009) [2023-10-12 17:45:00,580][62634] Updated weights for policy 0, policy_version 47410 (0.0008) [2023-10-12 17:45:00,955][62634] Updated weights for policy 0, policy_version 47420 (0.0008) [2023-10-12 17:45:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97091584. Throughput: 0: 1682.4, 1: 1676.2. Samples: 24287240. 
Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:45:03,436][61643] Avg episode reward: [(0, '24.520'), (1, '9.610')] [2023-10-12 17:45:03,444][62354] Saving new best policy, reward=24.520! [2023-10-12 17:45:03,944][62635] Updated weights for policy 1, policy_version 47400 (0.0007) [2023-10-12 17:45:04,310][62635] Updated weights for policy 1, policy_version 47410 (0.0007) [2023-10-12 17:45:04,676][62635] Updated weights for policy 1, policy_version 47420 (0.0008) [2023-10-12 17:45:04,938][62634] Updated weights for policy 0, policy_version 47430 (0.0010) [2023-10-12 17:45:05,322][62634] Updated weights for policy 0, policy_version 47440 (0.0009) [2023-10-12 17:45:05,692][62634] Updated weights for policy 0, policy_version 47450 (0.0010) [2023-10-12 17:45:08,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 97157120. Throughput: 0: 1662.8, 1: 1679.8. Samples: 24296412. Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:45:08,436][61643] Avg episode reward: [(0, '25.000'), (1, '9.640')] [2023-10-12 17:45:08,437][62354] Saving new best policy, reward=25.000! [2023-10-12 17:45:08,524][62635] Updated weights for policy 1, policy_version 47430 (0.0007) [2023-10-12 17:45:08,890][62635] Updated weights for policy 1, policy_version 47440 (0.0007) [2023-10-12 17:45:09,256][62635] Updated weights for policy 1, policy_version 47450 (0.0009) [2023-10-12 17:45:09,703][62634] Updated weights for policy 0, policy_version 47460 (0.0009) [2023-10-12 17:45:10,079][62634] Updated weights for policy 0, policy_version 47470 (0.0010) [2023-10-12 17:45:10,467][62634] Updated weights for policy 0, policy_version 47480 (0.0008) [2023-10-12 17:45:13,226][62635] Updated weights for policy 1, policy_version 47460 (0.0010) [2023-10-12 17:45:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97222656. Throughput: 0: 1676.6, 1: 1683.1. Samples: 24317244. 
Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:45:13,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.650')] [2023-10-12 17:45:13,598][62635] Updated weights for policy 1, policy_version 47470 (0.0007) [2023-10-12 17:45:13,963][62635] Updated weights for policy 1, policy_version 47480 (0.0008) [2023-10-12 17:45:14,509][62634] Updated weights for policy 0, policy_version 47490 (0.0008) [2023-10-12 17:45:14,885][62634] Updated weights for policy 0, policy_version 47500 (0.0010) [2023-10-12 17:45:15,264][62634] Updated weights for policy 0, policy_version 47510 (0.0010) [2023-10-12 17:45:15,647][62634] Updated weights for policy 0, policy_version 47520 (0.0008) [2023-10-12 17:45:17,891][62635] Updated weights for policy 1, policy_version 47490 (0.0008) [2023-10-12 17:45:18,263][62635] Updated weights for policy 1, policy_version 47500 (0.0008) [2023-10-12 17:45:18,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 97288192. Throughput: 0: 1678.8, 1: 1678.7. Samples: 24338026. Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:45:18,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.640')] [2023-10-12 17:45:18,638][62635] Updated weights for policy 1, policy_version 47510 (0.0009) [2023-10-12 17:45:18,999][62635] Updated weights for policy 1, policy_version 47520 (0.0010) [2023-10-12 17:45:19,691][62634] Updated weights for policy 0, policy_version 47530 (0.0007) [2023-10-12 17:45:20,067][62634] Updated weights for policy 0, policy_version 47540 (0.0007) [2023-10-12 17:45:20,435][62634] Updated weights for policy 0, policy_version 47550 (0.0007) [2023-10-12 17:45:23,239][62635] Updated weights for policy 1, policy_version 47530 (0.0007) [2023-10-12 17:45:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97353728. Throughput: 0: 1668.4, 1: 1686.4. Samples: 24347290. 
Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:45:23,435][61643] Avg episode reward: [(0, '24.540'), (1, '9.660')] [2023-10-12 17:45:23,601][62635] Updated weights for policy 1, policy_version 47540 (0.0009) [2023-10-12 17:45:23,957][62635] Updated weights for policy 1, policy_version 47550 (0.0011) [2023-10-12 17:45:24,406][62634] Updated weights for policy 0, policy_version 47560 (0.0008) [2023-10-12 17:45:24,781][62634] Updated weights for policy 0, policy_version 47570 (0.0007) [2023-10-12 17:45:25,160][62634] Updated weights for policy 0, policy_version 47580 (0.0008) [2023-10-12 17:45:28,302][62635] Updated weights for policy 1, policy_version 47560 (0.0007) [2023-10-12 17:45:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97419264. Throughput: 0: 1676.9, 1: 1681.5. Samples: 24367840. Policy #0 lag: (min: 20.0, avg: 33.4, max: 52.0) [2023-10-12 17:45:28,435][61643] Avg episode reward: [(0, '24.130'), (1, '9.820')] [2023-10-12 17:45:28,683][62635] Updated weights for policy 1, policy_version 47570 (0.0008) [2023-10-12 17:45:29,053][62635] Updated weights for policy 1, policy_version 47580 (0.0009) [2023-10-12 17:45:29,218][62634] Updated weights for policy 0, policy_version 47590 (0.0008) [2023-10-12 17:45:29,598][62634] Updated weights for policy 0, policy_version 47600 (0.0009) [2023-10-12 17:45:29,974][62634] Updated weights for policy 0, policy_version 47610 (0.0010) [2023-10-12 17:45:32,936][62635] Updated weights for policy 1, policy_version 47590 (0.0011) [2023-10-12 17:45:33,300][62635] Updated weights for policy 1, policy_version 47600 (0.0010) [2023-10-12 17:45:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97484800. Throughput: 0: 1678.5, 1: 1676.7. Samples: 24388036. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:45:33,435][61643] Avg episode reward: [(0, '24.420'), (1, '9.890')] [2023-10-12 17:45:33,673][62635] Updated weights for policy 1, policy_version 47610 (0.0010) [2023-10-12 17:45:34,031][62634] Updated weights for policy 0, policy_version 47620 (0.0010) [2023-10-12 17:45:34,402][62634] Updated weights for policy 0, policy_version 47630 (0.0007) [2023-10-12 17:45:34,784][62634] Updated weights for policy 0, policy_version 47640 (0.0007) [2023-10-12 17:45:37,759][62635] Updated weights for policy 1, policy_version 47620 (0.0010) [2023-10-12 17:45:38,131][62635] Updated weights for policy 1, policy_version 47630 (0.0008) [2023-10-12 17:45:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97550336. Throughput: 0: 1674.7, 1: 1690.4. Samples: 24397690. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:45:38,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.790')] [2023-10-12 17:45:38,502][62635] Updated weights for policy 1, policy_version 47640 (0.0007) [2023-10-12 17:45:38,837][62634] Updated weights for policy 0, policy_version 47650 (0.0009) [2023-10-12 17:45:39,204][62634] Updated weights for policy 0, policy_version 47660 (0.0009) [2023-10-12 17:45:39,576][62634] Updated weights for policy 0, policy_version 47670 (0.0010) [2023-10-12 17:45:39,949][62634] Updated weights for policy 0, policy_version 47680 (0.0010) [2023-10-12 17:45:42,552][62635] Updated weights for policy 1, policy_version 47650 (0.0008) [2023-10-12 17:45:42,930][62635] Updated weights for policy 1, policy_version 47660 (0.0007) [2023-10-12 17:45:43,288][62635] Updated weights for policy 1, policy_version 47670 (0.0007) [2023-10-12 17:45:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97615872. Throughput: 0: 1675.5, 1: 1689.6. Samples: 24418176. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:45:43,435][61643] Avg episode reward: [(0, '24.210'), (1, '10.020')] [2023-10-12 17:45:43,657][62635] Updated weights for policy 1, policy_version 47680 (0.0007) [2023-10-12 17:45:44,167][62634] Updated weights for policy 0, policy_version 47690 (0.0008) [2023-10-12 17:45:44,553][62634] Updated weights for policy 0, policy_version 47700 (0.0007) [2023-10-12 17:45:44,926][62634] Updated weights for policy 0, policy_version 47710 (0.0007) [2023-10-12 17:45:47,679][62635] Updated weights for policy 1, policy_version 47690 (0.0008) [2023-10-12 17:45:48,059][62635] Updated weights for policy 1, policy_version 47700 (0.0011) [2023-10-12 17:45:48,427][62635] Updated weights for policy 1, policy_version 47710 (0.0007) [2023-10-12 17:45:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97681408. Throughput: 0: 1680.0, 1: 1672.0. Samples: 24438080. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:45:48,436][61643] Avg episode reward: [(0, '24.140'), (1, '9.960')] [2023-10-12 17:45:48,825][62634] Updated weights for policy 0, policy_version 47720 (0.0009) [2023-10-12 17:45:49,198][62634] Updated weights for policy 0, policy_version 47730 (0.0010) [2023-10-12 17:45:49,581][62634] Updated weights for policy 0, policy_version 47740 (0.0010) [2023-10-12 17:45:52,432][62635] Updated weights for policy 1, policy_version 47720 (0.0009) [2023-10-12 17:45:52,802][62635] Updated weights for policy 1, policy_version 47730 (0.0007) [2023-10-12 17:45:53,175][62635] Updated weights for policy 1, policy_version 47740 (0.0009) [2023-10-12 17:45:53,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97779712. Throughput: 0: 1674.0, 1: 1692.2. Samples: 24447890. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:45:53,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.870')] [2023-10-12 17:45:53,683][62634] Updated weights for policy 0, policy_version 47750 (0.0010) [2023-10-12 17:45:54,051][62634] Updated weights for policy 0, policy_version 47760 (0.0008) [2023-10-12 17:45:54,438][62634] Updated weights for policy 0, policy_version 47770 (0.0007) [2023-10-12 17:45:57,302][62635] Updated weights for policy 1, policy_version 47750 (0.0007) [2023-10-12 17:45:57,666][62635] Updated weights for policy 1, policy_version 47760 (0.0008) [2023-10-12 17:45:58,033][62635] Updated weights for policy 1, policy_version 47770 (0.0008) [2023-10-12 17:45:58,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97845248. Throughput: 0: 1675.2, 1: 1685.2. Samples: 24468464. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:45:58,435][61643] Avg episode reward: [(0, '23.930'), (1, '10.030')] [2023-10-12 17:45:58,550][62634] Updated weights for policy 0, policy_version 47780 (0.0007) [2023-10-12 17:45:58,929][62634] Updated weights for policy 0, policy_version 47790 (0.0012) [2023-10-12 17:45:59,300][62634] Updated weights for policy 0, policy_version 47800 (0.0010) [2023-10-12 17:46:02,196][62635] Updated weights for policy 1, policy_version 47780 (0.0008) [2023-10-12 17:46:02,573][62635] Updated weights for policy 1, policy_version 47790 (0.0012) [2023-10-12 17:46:02,940][62635] Updated weights for policy 1, policy_version 47800 (0.0010) [2023-10-12 17:46:03,136][62634] Updated weights for policy 0, policy_version 47810 (0.0008) [2023-10-12 17:46:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97910784. Throughput: 0: 1680.0, 1: 1661.4. Samples: 24488388. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 17:46:03,435][61643] Avg episode reward: [(0, '23.890'), (1, '10.050')] [2023-10-12 17:46:03,504][62634] Updated weights for policy 0, policy_version 47820 (0.0008) [2023-10-12 17:46:03,878][62634] Updated weights for policy 0, policy_version 47830 (0.0008) [2023-10-12 17:46:04,259][62634] Updated weights for policy 0, policy_version 47840 (0.0008) [2023-10-12 17:46:06,990][62635] Updated weights for policy 1, policy_version 47810 (0.0008) [2023-10-12 17:46:07,358][62635] Updated weights for policy 1, policy_version 47820 (0.0009) [2023-10-12 17:46:07,735][62635] Updated weights for policy 1, policy_version 47830 (0.0007) [2023-10-12 17:46:08,105][62635] Updated weights for policy 1, policy_version 47840 (0.0007) [2023-10-12 17:46:08,308][62634] Updated weights for policy 0, policy_version 47850 (0.0007) [2023-10-12 17:46:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97976320. Throughput: 0: 1683.9, 1: 1678.3. Samples: 24498588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:08,435][61643] Avg episode reward: [(0, '23.990'), (1, '10.040')] [2023-10-12 17:46:08,681][62634] Updated weights for policy 0, policy_version 47860 (0.0008) [2023-10-12 17:46:09,062][62634] Updated weights for policy 0, policy_version 47870 (0.0008) [2023-10-12 17:46:12,191][62635] Updated weights for policy 1, policy_version 47850 (0.0008) [2023-10-12 17:46:12,554][62635] Updated weights for policy 1, policy_version 47860 (0.0008) [2023-10-12 17:46:12,926][62635] Updated weights for policy 1, policy_version 47870 (0.0008) [2023-10-12 17:46:13,121][62634] Updated weights for policy 0, policy_version 47880 (0.0008) [2023-10-12 17:46:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98041856. Throughput: 0: 1685.0, 1: 1676.7. Samples: 24519118. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:13,435][61643] Avg episode reward: [(0, '23.920'), (1, '9.910')] [2023-10-12 17:46:13,491][62634] Updated weights for policy 0, policy_version 47890 (0.0010) [2023-10-12 17:46:13,866][62634] Updated weights for policy 0, policy_version 47900 (0.0011) [2023-10-12 17:46:17,100][62635] Updated weights for policy 1, policy_version 47880 (0.0009) [2023-10-12 17:46:17,470][62635] Updated weights for policy 1, policy_version 47890 (0.0008) [2023-10-12 17:46:17,839][62635] Updated weights for policy 1, policy_version 47900 (0.0009) [2023-10-12 17:46:18,031][62634] Updated weights for policy 0, policy_version 47910 (0.0009) [2023-10-12 17:46:18,420][62634] Updated weights for policy 0, policy_version 47920 (0.0008) [2023-10-12 17:46:18,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98107392. Throughput: 0: 1684.5, 1: 1666.1. Samples: 24538814. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:18,436][61643] Avg episode reward: [(0, '23.930'), (1, '10.020')] [2023-10-12 17:46:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000047904_49053696.pth... [2023-10-12 17:46:18,474][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000046336_47448064.pth [2023-10-12 17:46:18,791][62634] Updated weights for policy 0, policy_version 47930 (0.0007) [2023-10-12 17:46:19,017][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000047936_49086464.pth... 
[2023-10-12 17:46:19,056][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000046336_47448064.pth [2023-10-12 17:46:21,979][62635] Updated weights for policy 1, policy_version 47910 (0.0009) [2023-10-12 17:46:22,341][62635] Updated weights for policy 1, policy_version 47920 (0.0009) [2023-10-12 17:46:22,708][62635] Updated weights for policy 1, policy_version 47930 (0.0009) [2023-10-12 17:46:22,710][62634] Updated weights for policy 0, policy_version 47940 (0.0009) [2023-10-12 17:46:23,099][62634] Updated weights for policy 0, policy_version 47950 (0.0007) [2023-10-12 17:46:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98172928. Throughput: 0: 1689.5, 1: 1680.8. Samples: 24549352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:23,435][61643] Avg episode reward: [(0, '23.850'), (1, '9.930')] [2023-10-12 17:46:23,472][62634] Updated weights for policy 0, policy_version 47960 (0.0010) [2023-10-12 17:46:26,913][62635] Updated weights for policy 1, policy_version 47940 (0.0008) [2023-10-12 17:46:27,284][62635] Updated weights for policy 1, policy_version 47950 (0.0010) [2023-10-12 17:46:27,373][62634] Updated weights for policy 0, policy_version 47970 (0.0007) [2023-10-12 17:46:27,642][62635] Updated weights for policy 1, policy_version 47960 (0.0008) [2023-10-12 17:46:27,754][62634] Updated weights for policy 0, policy_version 47980 (0.0008) [2023-10-12 17:46:28,121][62634] Updated weights for policy 0, policy_version 47990 (0.0008) [2023-10-12 17:46:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98238464. Throughput: 0: 1698.6, 1: 1670.6. Samples: 24569790. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:28,435][61643] Avg episode reward: [(0, '23.900'), (1, '9.860')] [2023-10-12 17:46:28,503][62634] Updated weights for policy 0, policy_version 48000 (0.0007) [2023-10-12 17:46:31,701][62635] Updated weights for policy 1, policy_version 47970 (0.0008) [2023-10-12 17:46:32,065][62635] Updated weights for policy 1, policy_version 47980 (0.0009) [2023-10-12 17:46:32,432][62635] Updated weights for policy 1, policy_version 47990 (0.0008) [2023-10-12 17:46:32,645][62634] Updated weights for policy 0, policy_version 48010 (0.0010) [2023-10-12 17:46:32,802][62635] Updated weights for policy 1, policy_version 48000 (0.0008) [2023-10-12 17:46:33,021][62634] Updated weights for policy 0, policy_version 48020 (0.0008) [2023-10-12 17:46:33,401][62634] Updated weights for policy 0, policy_version 48030 (0.0008) [2023-10-12 17:46:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98304000. Throughput: 0: 1681.3, 1: 1665.5. Samples: 24588688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:33,435][61643] Avg episode reward: [(0, '24.000'), (1, '9.860')] [2023-10-12 17:46:36,835][62635] Updated weights for policy 1, policy_version 48010 (0.0009) [2023-10-12 17:46:37,202][62635] Updated weights for policy 1, policy_version 48020 (0.0008) [2023-10-12 17:46:37,387][62634] Updated weights for policy 0, policy_version 48040 (0.0008) [2023-10-12 17:46:37,566][62635] Updated weights for policy 1, policy_version 48030 (0.0009) [2023-10-12 17:46:37,761][62634] Updated weights for policy 0, policy_version 48050 (0.0008) [2023-10-12 17:46:38,147][62634] Updated weights for policy 0, policy_version 48060 (0.0010) [2023-10-12 17:46:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98402304. Throughput: 0: 1700.6, 1: 1673.6. Samples: 24599728. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:38,435][61643] Avg episode reward: [(0, '23.990'), (1, '9.740')] [2023-10-12 17:46:41,592][62635] Updated weights for policy 1, policy_version 48040 (0.0007) [2023-10-12 17:46:41,962][62635] Updated weights for policy 1, policy_version 48050 (0.0007) [2023-10-12 17:46:42,168][62634] Updated weights for policy 0, policy_version 48070 (0.0008) [2023-10-12 17:46:42,324][62635] Updated weights for policy 1, policy_version 48060 (0.0008) [2023-10-12 17:46:42,537][62634] Updated weights for policy 0, policy_version 48080 (0.0009) [2023-10-12 17:46:42,916][62634] Updated weights for policy 0, policy_version 48090 (0.0009) [2023-10-12 17:46:43,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98467840. Throughput: 0: 1700.0, 1: 1660.5. Samples: 24619684. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:43,435][61643] Avg episode reward: [(0, '24.370'), (1, '9.730')] [2023-10-12 17:46:46,374][62635] Updated weights for policy 1, policy_version 48070 (0.0009) [2023-10-12 17:46:46,748][62635] Updated weights for policy 1, policy_version 48080 (0.0009) [2023-10-12 17:46:47,085][62634] Updated weights for policy 0, policy_version 48100 (0.0010) [2023-10-12 17:46:47,119][62635] Updated weights for policy 1, policy_version 48090 (0.0007) [2023-10-12 17:46:47,458][62634] Updated weights for policy 0, policy_version 48110 (0.0008) [2023-10-12 17:46:47,836][62634] Updated weights for policy 0, policy_version 48120 (0.0008) [2023-10-12 17:46:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 98533376. Throughput: 0: 1667.3, 1: 1673.1. Samples: 24638704. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:48,436][61643] Avg episode reward: [(0, '24.510'), (1, '9.890')] [2023-10-12 17:46:51,092][62635] Updated weights for policy 1, policy_version 48100 (0.0008) [2023-10-12 17:46:51,460][62635] Updated weights for policy 1, policy_version 48110 (0.0009) [2023-10-12 17:46:51,742][62634] Updated weights for policy 0, policy_version 48130 (0.0009) [2023-10-12 17:46:51,833][62635] Updated weights for policy 1, policy_version 48120 (0.0009) [2023-10-12 17:46:52,128][62634] Updated weights for policy 0, policy_version 48140 (0.0008) [2023-10-12 17:46:52,501][62634] Updated weights for policy 0, policy_version 48150 (0.0007) [2023-10-12 17:46:52,889][62634] Updated weights for policy 0, policy_version 48160 (0.0007) [2023-10-12 17:46:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98598912. Throughput: 0: 1690.1, 1: 1676.3. Samples: 24650076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:53,435][61643] Avg episode reward: [(0, '24.430'), (1, '9.530')] [2023-10-12 17:46:55,826][62635] Updated weights for policy 1, policy_version 48130 (0.0009) [2023-10-12 17:46:56,187][62635] Updated weights for policy 1, policy_version 48140 (0.0009) [2023-10-12 17:46:56,556][62635] Updated weights for policy 1, policy_version 48150 (0.0008) [2023-10-12 17:46:56,742][62634] Updated weights for policy 0, policy_version 48170 (0.0008) [2023-10-12 17:46:56,922][62635] Updated weights for policy 1, policy_version 48160 (0.0007) [2023-10-12 17:46:57,118][62634] Updated weights for policy 0, policy_version 48180 (0.0007) [2023-10-12 17:46:57,499][62634] Updated weights for policy 0, policy_version 48190 (0.0008) [2023-10-12 17:46:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98664448. Throughput: 0: 1677.0, 1: 1660.0. Samples: 24669284. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:46:58,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.690')] [2023-10-12 17:47:01,019][62635] Updated weights for policy 1, policy_version 48170 (0.0007) [2023-10-12 17:47:01,387][62635] Updated weights for policy 1, policy_version 48180 (0.0007) [2023-10-12 17:47:01,648][62634] Updated weights for policy 0, policy_version 48200 (0.0008) [2023-10-12 17:47:01,756][62635] Updated weights for policy 1, policy_version 48190 (0.0007) [2023-10-12 17:47:02,018][62634] Updated weights for policy 0, policy_version 48210 (0.0007) [2023-10-12 17:47:02,403][62634] Updated weights for policy 0, policy_version 48220 (0.0008) [2023-10-12 17:47:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98729984. Throughput: 0: 1665.7, 1: 1677.8. Samples: 24689270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:03,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.750')] [2023-10-12 17:47:05,923][62635] Updated weights for policy 1, policy_version 48200 (0.0008) [2023-10-12 17:47:06,299][62635] Updated weights for policy 1, policy_version 48210 (0.0008) [2023-10-12 17:47:06,453][62634] Updated weights for policy 0, policy_version 48230 (0.0009) [2023-10-12 17:47:06,671][62635] Updated weights for policy 1, policy_version 48220 (0.0007) [2023-10-12 17:47:06,836][62634] Updated weights for policy 0, policy_version 48240 (0.0008) [2023-10-12 17:47:07,207][62634] Updated weights for policy 0, policy_version 48250 (0.0008) [2023-10-12 17:47:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98795520. Throughput: 0: 1689.5, 1: 1666.2. Samples: 24700360. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:08,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.650')] [2023-10-12 17:47:10,726][62635] Updated weights for policy 1, policy_version 48230 (0.0008) [2023-10-12 17:47:11,092][62635] Updated weights for policy 1, policy_version 48240 (0.0007) [2023-10-12 17:47:11,349][62634] Updated weights for policy 0, policy_version 48260 (0.0008) [2023-10-12 17:47:11,467][62635] Updated weights for policy 1, policy_version 48250 (0.0007) [2023-10-12 17:47:11,723][62634] Updated weights for policy 0, policy_version 48270 (0.0008) [2023-10-12 17:47:12,088][62634] Updated weights for policy 0, policy_version 48280 (0.0010) [2023-10-12 17:47:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98861056. Throughput: 0: 1666.0, 1: 1659.1. Samples: 24719418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:13,435][61643] Avg episode reward: [(0, '24.890'), (1, '9.650')] [2023-10-12 17:47:15,577][62635] Updated weights for policy 1, policy_version 48260 (0.0010) [2023-10-12 17:47:15,941][62635] Updated weights for policy 1, policy_version 48270 (0.0010) [2023-10-12 17:47:16,177][62634] Updated weights for policy 0, policy_version 48290 (0.0010) [2023-10-12 17:47:16,308][62635] Updated weights for policy 1, policy_version 48280 (0.0009) [2023-10-12 17:47:16,557][62634] Updated weights for policy 0, policy_version 48300 (0.0007) [2023-10-12 17:47:16,932][62634] Updated weights for policy 0, policy_version 48310 (0.0007) [2023-10-12 17:47:17,303][62634] Updated weights for policy 0, policy_version 48320 (0.0008) [2023-10-12 17:47:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 98926592. Throughput: 0: 1676.6, 1: 1680.9. Samples: 24739776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:18,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.810')] [2023-10-12 17:47:20,440][62635] Updated weights for policy 1, policy_version 48290 (0.0008) [2023-10-12 17:47:20,812][62635] Updated weights for policy 1, policy_version 48300 (0.0007) [2023-10-12 17:47:21,180][62635] Updated weights for policy 1, policy_version 48310 (0.0008) [2023-10-12 17:47:21,207][62634] Updated weights for policy 0, policy_version 48330 (0.0007) [2023-10-12 17:47:21,543][62635] Updated weights for policy 1, policy_version 48320 (0.0010) [2023-10-12 17:47:21,588][62634] Updated weights for policy 0, policy_version 48340 (0.0007) [2023-10-12 17:47:21,961][62634] Updated weights for policy 0, policy_version 48350 (0.0009) [2023-10-12 17:47:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 98992128. Throughput: 0: 1690.9, 1: 1664.3. Samples: 24750710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:23,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.830')] [2023-10-12 17:47:25,638][62635] Updated weights for policy 1, policy_version 48330 (0.0010) [2023-10-12 17:47:26,014][62635] Updated weights for policy 1, policy_version 48340 (0.0008) [2023-10-12 17:47:26,098][62634] Updated weights for policy 0, policy_version 48360 (0.0007) [2023-10-12 17:47:26,379][62635] Updated weights for policy 1, policy_version 48350 (0.0008) [2023-10-12 17:47:26,462][62634] Updated weights for policy 0, policy_version 48370 (0.0009) [2023-10-12 17:47:26,836][62634] Updated weights for policy 0, policy_version 48380 (0.0007) [2023-10-12 17:47:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 99057664. Throughput: 0: 1663.7, 1: 1666.1. Samples: 24769526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:28,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.960')] [2023-10-12 17:47:30,363][62635] Updated weights for policy 1, policy_version 48360 (0.0010) [2023-10-12 17:47:30,735][62635] Updated weights for policy 1, policy_version 48370 (0.0007) [2023-10-12 17:47:31,004][62634] Updated weights for policy 0, policy_version 48390 (0.0008) [2023-10-12 17:47:31,093][62635] Updated weights for policy 1, policy_version 48380 (0.0007) [2023-10-12 17:47:31,377][62634] Updated weights for policy 0, policy_version 48400 (0.0007) [2023-10-12 17:47:31,752][62634] Updated weights for policy 0, policy_version 48410 (0.0008) [2023-10-12 17:47:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 99123200. Throughput: 0: 1687.8, 1: 1676.3. Samples: 24790090. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:33,435][61643] Avg episode reward: [(0, '24.690'), (1, '9.880')] [2023-10-12 17:47:35,164][62635] Updated weights for policy 1, policy_version 48390 (0.0007) [2023-10-12 17:47:35,533][62635] Updated weights for policy 1, policy_version 48400 (0.0008) [2023-10-12 17:47:35,702][62634] Updated weights for policy 0, policy_version 48420 (0.0008) [2023-10-12 17:47:35,900][62635] Updated weights for policy 1, policy_version 48410 (0.0010) [2023-10-12 17:47:36,094][62634] Updated weights for policy 0, policy_version 48430 (0.0009) [2023-10-12 17:47:36,468][62634] Updated weights for policy 0, policy_version 48440 (0.0009) [2023-10-12 17:47:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99188736. Throughput: 0: 1683.6, 1: 1656.0. Samples: 24800356. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:38,435][61643] Avg episode reward: [(0, '24.520'), (1, '9.800')] [2023-10-12 17:47:39,997][62635] Updated weights for policy 1, policy_version 48420 (0.0010) [2023-10-12 17:47:40,366][62635] Updated weights for policy 1, policy_version 48430 (0.0009) [2023-10-12 17:47:40,738][62635] Updated weights for policy 1, policy_version 48440 (0.0008) [2023-10-12 17:47:40,781][62634] Updated weights for policy 0, policy_version 48450 (0.0010) [2023-10-12 17:47:41,155][62634] Updated weights for policy 0, policy_version 48460 (0.0008) [2023-10-12 17:47:41,535][62634] Updated weights for policy 0, policy_version 48470 (0.0010) [2023-10-12 17:47:41,916][62634] Updated weights for policy 0, policy_version 48480 (0.0009) [2023-10-12 17:47:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99254272. Throughput: 0: 1667.7, 1: 1678.5. Samples: 24819864. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:47:43,435][61643] Avg episode reward: [(0, '24.650'), (1, '10.130')] [2023-10-12 17:47:44,809][62635] Updated weights for policy 1, policy_version 48450 (0.0009) [2023-10-12 17:47:45,176][62635] Updated weights for policy 1, policy_version 48460 (0.0009) [2023-10-12 17:47:45,544][62635] Updated weights for policy 1, policy_version 48470 (0.0008) [2023-10-12 17:47:45,907][62635] Updated weights for policy 1, policy_version 48480 (0.0007) [2023-10-12 17:47:45,923][62634] Updated weights for policy 0, policy_version 48490 (0.0007) [2023-10-12 17:47:46,291][62634] Updated weights for policy 0, policy_version 48500 (0.0009) [2023-10-12 17:47:46,674][62634] Updated weights for policy 0, policy_version 48510 (0.0008) [2023-10-12 17:47:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99319808. Throughput: 0: 1679.4, 1: 1681.6. Samples: 24840516. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:47:48,436][61643] Avg episode reward: [(0, '24.550'), (1, '10.050')] [2023-10-12 17:47:49,966][62635] Updated weights for policy 1, policy_version 48490 (0.0007) [2023-10-12 17:47:50,330][62635] Updated weights for policy 1, policy_version 48500 (0.0007) [2023-10-12 17:47:50,692][62635] Updated weights for policy 1, policy_version 48510 (0.0009) [2023-10-12 17:47:50,715][62634] Updated weights for policy 0, policy_version 48520 (0.0007) [2023-10-12 17:47:51,082][62634] Updated weights for policy 0, policy_version 48530 (0.0007) [2023-10-12 17:47:51,458][62634] Updated weights for policy 0, policy_version 48540 (0.0008) [2023-10-12 17:47:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99385344. Throughput: 0: 1667.1, 1: 1665.1. Samples: 24850310. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:47:53,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.910')] [2023-10-12 17:47:54,719][62635] Updated weights for policy 1, policy_version 48520 (0.0008) [2023-10-12 17:47:55,083][62635] Updated weights for policy 1, policy_version 48530 (0.0009) [2023-10-12 17:47:55,463][62635] Updated weights for policy 1, policy_version 48540 (0.0010) [2023-10-12 17:47:55,708][62634] Updated weights for policy 0, policy_version 48550 (0.0008) [2023-10-12 17:47:56,083][62634] Updated weights for policy 0, policy_version 48560 (0.0009) [2023-10-12 17:47:56,464][62634] Updated weights for policy 0, policy_version 48570 (0.0009) [2023-10-12 17:47:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 99450880. Throughput: 0: 1664.7, 1: 1682.6. Samples: 24870048. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:47:58,436][61643] Avg episode reward: [(0, '24.480'), (1, '9.950')] [2023-10-12 17:47:59,602][62635] Updated weights for policy 1, policy_version 48550 (0.0007) [2023-10-12 17:47:59,987][62635] Updated weights for policy 1, policy_version 48560 (0.0009) [2023-10-12 17:48:00,338][62634] Updated weights for policy 0, policy_version 48580 (0.0009) [2023-10-12 17:48:00,349][62635] Updated weights for policy 1, policy_version 48570 (0.0009) [2023-10-12 17:48:00,716][62634] Updated weights for policy 0, policy_version 48590 (0.0007) [2023-10-12 17:48:01,092][62634] Updated weights for policy 0, policy_version 48600 (0.0010) [2023-10-12 17:48:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99516416. Throughput: 0: 1675.8, 1: 1682.1. Samples: 24890882. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:48:03,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.910')] [2023-10-12 17:48:04,476][62635] Updated weights for policy 1, policy_version 48580 (0.0007) [2023-10-12 17:48:04,844][62635] Updated weights for policy 1, policy_version 48590 (0.0007) [2023-10-12 17:48:05,213][62635] Updated weights for policy 1, policy_version 48600 (0.0008) [2023-10-12 17:48:05,257][62634] Updated weights for policy 0, policy_version 48610 (0.0007) [2023-10-12 17:48:05,639][62634] Updated weights for policy 0, policy_version 48620 (0.0007) [2023-10-12 17:48:06,017][62634] Updated weights for policy 0, policy_version 48630 (0.0010) [2023-10-12 17:48:06,383][62634] Updated weights for policy 0, policy_version 48640 (0.0011) [2023-10-12 17:48:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99581952. Throughput: 0: 1657.2, 1: 1671.8. Samples: 24900514. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:48:08,435][61643] Avg episode reward: [(0, '24.550'), (1, '9.810')] [2023-10-12 17:48:09,267][62635] Updated weights for policy 1, policy_version 48610 (0.0009) [2023-10-12 17:48:09,647][62635] Updated weights for policy 1, policy_version 48620 (0.0011) [2023-10-12 17:48:10,016][62635] Updated weights for policy 1, policy_version 48630 (0.0010) [2023-10-12 17:48:10,379][62635] Updated weights for policy 1, policy_version 48640 (0.0008) [2023-10-12 17:48:10,480][62634] Updated weights for policy 0, policy_version 48650 (0.0007) [2023-10-12 17:48:10,862][62634] Updated weights for policy 0, policy_version 48660 (0.0009) [2023-10-12 17:48:11,233][62634] Updated weights for policy 0, policy_version 48670 (0.0008) [2023-10-12 17:48:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99647488. Throughput: 0: 1671.6, 1: 1679.8. Samples: 24920342. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:48:13,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.990')] [2023-10-12 17:48:14,736][62635] Updated weights for policy 1, policy_version 48650 (0.0011) [2023-10-12 17:48:15,101][62635] Updated weights for policy 1, policy_version 48660 (0.0010) [2023-10-12 17:48:15,232][62634] Updated weights for policy 0, policy_version 48680 (0.0008) [2023-10-12 17:48:15,469][62635] Updated weights for policy 1, policy_version 48670 (0.0008) [2023-10-12 17:48:15,611][62634] Updated weights for policy 0, policy_version 48690 (0.0008) [2023-10-12 17:48:15,995][62634] Updated weights for policy 0, policy_version 48700 (0.0009) [2023-10-12 17:48:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99713024. Throughput: 0: 1670.5, 1: 1678.8. Samples: 24940808. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:48:18,436][61643] Avg episode reward: [(0, '24.420'), (1, '9.720')] [2023-10-12 17:48:18,450][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000048704_49872896.pth... [2023-10-12 17:48:18,450][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000048672_49840128.pth... [2023-10-12 17:48:18,483][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000047136_48267264.pth [2023-10-12 17:48:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000047104_48234496.pth [2023-10-12 17:48:19,595][62635] Updated weights for policy 1, policy_version 48680 (0.0008) [2023-10-12 17:48:19,957][62635] Updated weights for policy 1, policy_version 48690 (0.0010) [2023-10-12 17:48:20,113][62634] Updated weights for policy 0, policy_version 48710 (0.0008) [2023-10-12 17:48:20,324][62635] Updated weights for policy 1, policy_version 48700 (0.0007) [2023-10-12 17:48:20,487][62634] Updated weights for policy 0, policy_version 48720 (0.0008) [2023-10-12 17:48:20,864][62634] Updated weights for policy 0, policy_version 48730 (0.0009) [2023-10-12 17:48:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99778560. Throughput: 0: 1654.0, 1: 1673.7. Samples: 24950104. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:23,435][61643] Avg episode reward: [(0, '24.420'), (1, '9.760')] [2023-10-12 17:48:24,344][62635] Updated weights for policy 1, policy_version 48710 (0.0008) [2023-10-12 17:48:24,717][62635] Updated weights for policy 1, policy_version 48720 (0.0007) [2023-10-12 17:48:24,975][62634] Updated weights for policy 0, policy_version 48740 (0.0007) [2023-10-12 17:48:25,090][62635] Updated weights for policy 1, policy_version 48730 (0.0007) [2023-10-12 17:48:25,343][62634] Updated weights for policy 0, policy_version 48750 (0.0007) [2023-10-12 17:48:25,714][62634] Updated weights for policy 0, policy_version 48760 (0.0011) [2023-10-12 17:48:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99844096. Throughput: 0: 1673.7, 1: 1673.8. Samples: 24970502. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:28,435][61643] Avg episode reward: [(0, '24.210'), (1, '9.820')] [2023-10-12 17:48:29,125][62635] Updated weights for policy 1, policy_version 48740 (0.0009) [2023-10-12 17:48:29,503][62635] Updated weights for policy 1, policy_version 48750 (0.0010) [2023-10-12 17:48:29,861][62635] Updated weights for policy 1, policy_version 48760 (0.0010) [2023-10-12 17:48:29,895][62634] Updated weights for policy 0, policy_version 48770 (0.0007) [2023-10-12 17:48:30,270][62634] Updated weights for policy 0, policy_version 48780 (0.0008) [2023-10-12 17:48:30,654][62634] Updated weights for policy 0, policy_version 48790 (0.0008) [2023-10-12 17:48:31,027][62634] Updated weights for policy 0, policy_version 48800 (0.0007) [2023-10-12 17:48:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99909632. Throughput: 0: 1675.1, 1: 1670.5. Samples: 24991066. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:33,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.760')] [2023-10-12 17:48:33,823][62635] Updated weights for policy 1, policy_version 48770 (0.0008) [2023-10-12 17:48:34,184][62635] Updated weights for policy 1, policy_version 48780 (0.0010) [2023-10-12 17:48:34,568][62635] Updated weights for policy 1, policy_version 48790 (0.0009) [2023-10-12 17:48:34,926][62635] Updated weights for policy 1, policy_version 48800 (0.0008) [2023-10-12 17:48:35,058][62634] Updated weights for policy 0, policy_version 48810 (0.0010) [2023-10-12 17:48:35,446][62634] Updated weights for policy 0, policy_version 48820 (0.0010) [2023-10-12 17:48:35,821][62634] Updated weights for policy 0, policy_version 48830 (0.0008) [2023-10-12 17:48:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 99975168. Throughput: 0: 1656.7, 1: 1675.9. Samples: 25000276. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:38,435][61643] Avg episode reward: [(0, '24.130'), (1, '9.760')] [2023-10-12 17:48:39,103][62635] Updated weights for policy 1, policy_version 48810 (0.0007) [2023-10-12 17:48:39,468][62635] Updated weights for policy 1, policy_version 48820 (0.0009) [2023-10-12 17:48:39,829][62635] Updated weights for policy 1, policy_version 48830 (0.0008) [2023-10-12 17:48:39,982][62634] Updated weights for policy 0, policy_version 48840 (0.0009) [2023-10-12 17:48:40,361][62634] Updated weights for policy 0, policy_version 48850 (0.0010) [2023-10-12 17:48:40,731][62634] Updated weights for policy 0, policy_version 48860 (0.0009) [2023-10-12 17:48:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100040704. Throughput: 0: 1676.4, 1: 1674.6. Samples: 25020842. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:43,435][61643] Avg episode reward: [(0, '23.770'), (1, '9.720')] [2023-10-12 17:48:43,893][62635] Updated weights for policy 1, policy_version 48840 (0.0010) [2023-10-12 17:48:44,269][62635] Updated weights for policy 1, policy_version 48850 (0.0007) [2023-10-12 17:48:44,637][62635] Updated weights for policy 1, policy_version 48860 (0.0008) [2023-10-12 17:48:44,751][62634] Updated weights for policy 0, policy_version 48870 (0.0009) [2023-10-12 17:48:45,129][62634] Updated weights for policy 0, policy_version 48880 (0.0009) [2023-10-12 17:48:45,501][62634] Updated weights for policy 0, policy_version 48890 (0.0008) [2023-10-12 17:48:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100106240. Throughput: 0: 1668.0, 1: 1675.7. Samples: 25041348. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:48,435][61643] Avg episode reward: [(0, '23.830'), (1, '9.900')] [2023-10-12 17:48:48,850][62635] Updated weights for policy 1, policy_version 48870 (0.0009) [2023-10-12 17:48:49,228][62635] Updated weights for policy 1, policy_version 48880 (0.0008) [2023-10-12 17:48:49,605][62635] Updated weights for policy 1, policy_version 48890 (0.0010) [2023-10-12 17:48:49,679][62634] Updated weights for policy 0, policy_version 48900 (0.0007) [2023-10-12 17:48:50,059][62634] Updated weights for policy 0, policy_version 48910 (0.0009) [2023-10-12 17:48:50,424][62634] Updated weights for policy 0, policy_version 48920 (0.0012) [2023-10-12 17:48:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100171776. Throughput: 0: 1655.1, 1: 1674.4. Samples: 25050342. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 17:48:53,436][61643] Avg episode reward: [(0, '24.130'), (1, '9.660')] [2023-10-12 17:48:53,567][62635] Updated weights for policy 1, policy_version 48900 (0.0009) [2023-10-12 17:48:53,947][62635] Updated weights for policy 1, policy_version 48910 (0.0011) [2023-10-12 17:48:54,310][62635] Updated weights for policy 1, policy_version 48920 (0.0007) [2023-10-12 17:48:54,328][62634] Updated weights for policy 0, policy_version 48930 (0.0009) [2023-10-12 17:48:54,704][62634] Updated weights for policy 0, policy_version 48940 (0.0011) [2023-10-12 17:48:55,087][62634] Updated weights for policy 0, policy_version 48950 (0.0010) [2023-10-12 17:48:55,474][62634] Updated weights for policy 0, policy_version 48960 (0.0007) [2023-10-12 17:48:58,394][62635] Updated weights for policy 1, policy_version 48930 (0.0008) [2023-10-12 17:48:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100237312. Throughput: 0: 1671.1, 1: 1681.8. Samples: 25071222. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:48:58,435][61643] Avg episode reward: [(0, '24.090'), (1, '9.590')] [2023-10-12 17:48:58,763][62635] Updated weights for policy 1, policy_version 48940 (0.0010) [2023-10-12 17:48:59,143][62635] Updated weights for policy 1, policy_version 48950 (0.0007) [2023-10-12 17:48:59,508][62635] Updated weights for policy 1, policy_version 48960 (0.0008) [2023-10-12 17:48:59,578][62634] Updated weights for policy 0, policy_version 48970 (0.0008) [2023-10-12 17:48:59,961][62634] Updated weights for policy 0, policy_version 48980 (0.0009) [2023-10-12 17:49:00,330][62634] Updated weights for policy 0, policy_version 48990 (0.0009) [2023-10-12 17:49:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100302848. Throughput: 0: 1675.4, 1: 1682.3. Samples: 25091906. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:49:03,435][61643] Avg episode reward: [(0, '24.120'), (1, '9.810')] [2023-10-12 17:49:03,654][62635] Updated weights for policy 1, policy_version 48970 (0.0010) [2023-10-12 17:49:04,024][62635] Updated weights for policy 1, policy_version 48980 (0.0010) [2023-10-12 17:49:04,394][62635] Updated weights for policy 1, policy_version 48990 (0.0007) [2023-10-12 17:49:04,408][62634] Updated weights for policy 0, policy_version 49000 (0.0009) [2023-10-12 17:49:04,775][62634] Updated weights for policy 0, policy_version 49010 (0.0010) [2023-10-12 17:49:05,156][62634] Updated weights for policy 0, policy_version 49020 (0.0008) [2023-10-12 17:49:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100368384. Throughput: 0: 1666.4, 1: 1683.8. Samples: 25100862. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:49:08,435][61643] Avg episode reward: [(0, '24.200'), (1, '9.950')] [2023-10-12 17:49:08,516][62635] Updated weights for policy 1, policy_version 49000 (0.0008) [2023-10-12 17:49:08,877][62635] Updated weights for policy 1, policy_version 49010 (0.0009) [2023-10-12 17:49:09,193][62634] Updated weights for policy 0, policy_version 49030 (0.0010) [2023-10-12 17:49:09,245][62635] Updated weights for policy 1, policy_version 49020 (0.0008) [2023-10-12 17:49:09,572][62634] Updated weights for policy 0, policy_version 49040 (0.0007) [2023-10-12 17:49:09,953][62634] Updated weights for policy 0, policy_version 49050 (0.0009) [2023-10-12 17:49:13,111][62635] Updated weights for policy 1, policy_version 49030 (0.0008) [2023-10-12 17:49:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100433920. Throughput: 0: 1675.8, 1: 1688.2. Samples: 25121884. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:49:13,436][61643] Avg episode reward: [(0, '24.180'), (1, '9.830')] [2023-10-12 17:49:13,471][62635] Updated weights for policy 1, policy_version 49040 (0.0008) [2023-10-12 17:49:13,844][62635] Updated weights for policy 1, policy_version 49050 (0.0007) [2023-10-12 17:49:13,897][62634] Updated weights for policy 0, policy_version 49060 (0.0009) [2023-10-12 17:49:14,273][62634] Updated weights for policy 0, policy_version 49070 (0.0008) [2023-10-12 17:49:14,641][62634] Updated weights for policy 0, policy_version 49080 (0.0009) [2023-10-12 17:49:18,090][62635] Updated weights for policy 1, policy_version 49060 (0.0009) [2023-10-12 17:49:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 100499456. Throughput: 0: 1678.8, 1: 1683.9. Samples: 25142386. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:49:18,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.760')] [2023-10-12 17:49:18,453][62635] Updated weights for policy 1, policy_version 49070 (0.0009) [2023-10-12 17:49:18,809][62634] Updated weights for policy 0, policy_version 49090 (0.0010) [2023-10-12 17:49:18,821][62635] Updated weights for policy 1, policy_version 49080 (0.0008) [2023-10-12 17:49:19,174][62634] Updated weights for policy 0, policy_version 49100 (0.0007) [2023-10-12 17:49:19,553][62634] Updated weights for policy 0, policy_version 49110 (0.0011) [2023-10-12 17:49:19,933][62634] Updated weights for policy 0, policy_version 49120 (0.0011) [2023-10-12 17:49:22,802][62635] Updated weights for policy 1, policy_version 49090 (0.0008) [2023-10-12 17:49:23,174][62635] Updated weights for policy 1, policy_version 49100 (0.0011) [2023-10-12 17:49:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 100564992. Throughput: 0: 1678.9, 1: 1681.8. Samples: 25151508. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:49:23,436][61643] Avg episode reward: [(0, '23.980'), (1, '9.850')] [2023-10-12 17:49:23,550][62635] Updated weights for policy 1, policy_version 49110 (0.0009) [2023-10-12 17:49:23,767][62634] Updated weights for policy 0, policy_version 49130 (0.0009) [2023-10-12 17:49:23,917][62635] Updated weights for policy 1, policy_version 49120 (0.0007) [2023-10-12 17:49:24,139][62634] Updated weights for policy 0, policy_version 49140 (0.0007) [2023-10-12 17:49:24,512][62634] Updated weights for policy 0, policy_version 49150 (0.0009) [2023-10-12 17:49:27,877][62635] Updated weights for policy 1, policy_version 49130 (0.0009) [2023-10-12 17:49:28,247][62635] Updated weights for policy 1, policy_version 49140 (0.0007) [2023-10-12 17:49:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100630528. Throughput: 0: 1683.2, 1: 1685.2. Samples: 25172418. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-12 17:49:28,435][61643] Avg episode reward: [(0, '23.920'), (1, '9.970')] [2023-10-12 17:49:28,590][62634] Updated weights for policy 0, policy_version 49160 (0.0009) [2023-10-12 17:49:28,617][62635] Updated weights for policy 1, policy_version 49150 (0.0008) [2023-10-12 17:49:28,962][62634] Updated weights for policy 0, policy_version 49170 (0.0008) [2023-10-12 17:49:29,339][62634] Updated weights for policy 0, policy_version 49180 (0.0009) [2023-10-12 17:49:32,657][62635] Updated weights for policy 1, policy_version 49160 (0.0007) [2023-10-12 17:49:33,028][62635] Updated weights for policy 1, policy_version 49170 (0.0009) [2023-10-12 17:49:33,395][62635] Updated weights for policy 1, policy_version 49180 (0.0010) [2023-10-12 17:49:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100696064. Throughput: 0: 1683.5, 1: 1669.2. Samples: 25192220. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:49:33,435][61643] Avg episode reward: [(0, '23.750'), (1, '9.830')] [2023-10-12 17:49:33,535][62634] Updated weights for policy 0, policy_version 49190 (0.0007) [2023-10-12 17:49:33,913][62634] Updated weights for policy 0, policy_version 49200 (0.0009) [2023-10-12 17:49:34,293][62634] Updated weights for policy 0, policy_version 49210 (0.0009) [2023-10-12 17:49:37,444][62635] Updated weights for policy 1, policy_version 49190 (0.0009) [2023-10-12 17:49:37,817][62635] Updated weights for policy 1, policy_version 49200 (0.0008) [2023-10-12 17:49:38,185][62635] Updated weights for policy 1, policy_version 49210 (0.0010) [2023-10-12 17:49:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100794368. Throughput: 0: 1683.4, 1: 1685.2. Samples: 25201928. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:49:38,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.920')] [2023-10-12 17:49:38,553][62634] Updated weights for policy 0, policy_version 49220 (0.0009) [2023-10-12 17:49:38,923][62634] Updated weights for policy 0, policy_version 49230 (0.0010) [2023-10-12 17:49:39,305][62634] Updated weights for policy 0, policy_version 49240 (0.0008) [2023-10-12 17:49:42,099][62635] Updated weights for policy 1, policy_version 49220 (0.0008) [2023-10-12 17:49:42,465][62635] Updated weights for policy 1, policy_version 49230 (0.0009) [2023-10-12 17:49:42,832][62635] Updated weights for policy 1, policy_version 49240 (0.0008) [2023-10-12 17:49:43,197][62634] Updated weights for policy 0, policy_version 49250 (0.0008) [2023-10-12 17:49:43,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100859904. Throughput: 0: 1675.9, 1: 1683.8. Samples: 25222412. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:49:43,436][61643] Avg episode reward: [(0, '24.000'), (1, '9.990')] [2023-10-12 17:49:43,580][62634] Updated weights for policy 0, policy_version 49260 (0.0008) [2023-10-12 17:49:43,957][62634] Updated weights for policy 0, policy_version 49270 (0.0007) [2023-10-12 17:49:44,329][62634] Updated weights for policy 0, policy_version 49280 (0.0009) [2023-10-12 17:49:46,955][62635] Updated weights for policy 1, policy_version 49250 (0.0009) [2023-10-12 17:49:47,334][62635] Updated weights for policy 1, policy_version 49260 (0.0010) [2023-10-12 17:49:47,694][62635] Updated weights for policy 1, policy_version 49270 (0.0010) [2023-10-12 17:49:48,063][62635] Updated weights for policy 1, policy_version 49280 (0.0008) [2023-10-12 17:49:48,328][62634] Updated weights for policy 0, policy_version 49290 (0.0009) [2023-10-12 17:49:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100925440. Throughput: 0: 1679.2, 1: 1658.5. Samples: 25242106. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:49:48,435][61643] Avg episode reward: [(0, '24.000'), (1, '9.780')] [2023-10-12 17:49:48,708][62634] Updated weights for policy 0, policy_version 49300 (0.0008) [2023-10-12 17:49:49,076][62634] Updated weights for policy 0, policy_version 49310 (0.0008) [2023-10-12 17:49:52,113][62635] Updated weights for policy 1, policy_version 49290 (0.0007) [2023-10-12 17:49:52,479][62635] Updated weights for policy 1, policy_version 49300 (0.0007) [2023-10-12 17:49:52,853][62635] Updated weights for policy 1, policy_version 49310 (0.0007) [2023-10-12 17:49:53,088][62634] Updated weights for policy 0, policy_version 49320 (0.0009) [2023-10-12 17:49:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100990976. Throughput: 0: 1684.0, 1: 1685.0. Samples: 25252468. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:49:53,435][61643] Avg episode reward: [(0, '24.080'), (1, '9.750')] [2023-10-12 17:49:53,468][62634] Updated weights for policy 0, policy_version 49330 (0.0010) [2023-10-12 17:49:53,850][62634] Updated weights for policy 0, policy_version 49340 (0.0010) [2023-10-12 17:49:56,786][62635] Updated weights for policy 1, policy_version 49320 (0.0009) [2023-10-12 17:49:57,160][62635] Updated weights for policy 1, policy_version 49330 (0.0008) [2023-10-12 17:49:57,525][62635] Updated weights for policy 1, policy_version 49340 (0.0008) [2023-10-12 17:49:57,949][62634] Updated weights for policy 0, policy_version 49350 (0.0008) [2023-10-12 17:49:58,326][62634] Updated weights for policy 0, policy_version 49360 (0.0009) [2023-10-12 17:49:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101056512. Throughput: 0: 1676.3, 1: 1669.6. Samples: 25272446. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:49:58,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.840')] [2023-10-12 17:49:58,710][62634] Updated weights for policy 0, policy_version 49370 (0.0007) [2023-10-12 17:50:01,600][62635] Updated weights for policy 1, policy_version 49350 (0.0007) [2023-10-12 17:50:01,973][62635] Updated weights for policy 1, policy_version 49360 (0.0008) [2023-10-12 17:50:02,345][62635] Updated weights for policy 1, policy_version 49370 (0.0008) [2023-10-12 17:50:02,656][62634] Updated weights for policy 0, policy_version 49380 (0.0007) [2023-10-12 17:50:03,033][62634] Updated weights for policy 0, policy_version 49390 (0.0009) [2023-10-12 17:50:03,411][62634] Updated weights for policy 0, policy_version 49400 (0.0009) [2023-10-12 17:50:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101122048. Throughput: 0: 1669.2, 1: 1657.2. Samples: 25292076. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 17:50:03,435][61643] Avg episode reward: [(0, '24.080'), (1, '9.840')] [2023-10-12 17:50:06,433][62635] Updated weights for policy 1, policy_version 49380 (0.0008) [2023-10-12 17:50:06,804][62635] Updated weights for policy 1, policy_version 49390 (0.0007) [2023-10-12 17:50:07,176][62635] Updated weights for policy 1, policy_version 49400 (0.0007) [2023-10-12 17:50:07,583][62634] Updated weights for policy 0, policy_version 49410 (0.0008) [2023-10-12 17:50:07,968][62634] Updated weights for policy 0, policy_version 49420 (0.0010) [2023-10-12 17:50:08,335][62634] Updated weights for policy 0, policy_version 49430 (0.0009) [2023-10-12 17:50:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101187584. Throughput: 0: 1684.4, 1: 1685.7. Samples: 25303164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:08,436][61643] Avg episode reward: [(0, '24.070'), (1, '9.670')] [2023-10-12 17:50:08,713][62634] Updated weights for policy 0, policy_version 49440 (0.0009) [2023-10-12 17:50:11,319][62635] Updated weights for policy 1, policy_version 49410 (0.0008) [2023-10-12 17:50:11,686][62635] Updated weights for policy 1, policy_version 49420 (0.0011) [2023-10-12 17:50:12,061][62635] Updated weights for policy 1, policy_version 49430 (0.0008) [2023-10-12 17:50:12,425][62635] Updated weights for policy 1, policy_version 49440 (0.0007) [2023-10-12 17:50:12,664][62634] Updated weights for policy 0, policy_version 49450 (0.0011) [2023-10-12 17:50:13,053][62634] Updated weights for policy 0, policy_version 49460 (0.0009) [2023-10-12 17:50:13,423][62634] Updated weights for policy 0, policy_version 49470 (0.0008) [2023-10-12 17:50:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101253120. Throughput: 0: 1689.3, 1: 1664.0. Samples: 25323316. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:13,435][61643] Avg episode reward: [(0, '24.060'), (1, '9.780')] [2023-10-12 17:50:16,496][62635] Updated weights for policy 1, policy_version 49450 (0.0008) [2023-10-12 17:50:16,870][62635] Updated weights for policy 1, policy_version 49460 (0.0009) [2023-10-12 17:50:17,233][62635] Updated weights for policy 1, policy_version 49470 (0.0007) [2023-10-12 17:50:17,518][62634] Updated weights for policy 0, policy_version 49480 (0.0008) [2023-10-12 17:50:17,887][62634] Updated weights for policy 0, policy_version 49490 (0.0007) [2023-10-12 17:50:18,265][62634] Updated weights for policy 0, policy_version 49500 (0.0010) [2023-10-12 17:50:18,435][61643] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 101351424. Throughput: 0: 1675.4, 1: 1674.5. Samples: 25342968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:18,436][61643] Avg episode reward: [(0, '24.090'), (1, '9.880')] [2023-10-12 17:50:18,451][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000049504_50692096.pth... [2023-10-12 17:50:18,451][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000049472_50659328.pth... 
[2023-10-12 17:50:18,487][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000047936_49086464.pth [2023-10-12 17:50:18,489][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000047904_49053696.pth [2023-10-12 17:50:21,294][62635] Updated weights for policy 1, policy_version 49480 (0.0008) [2023-10-12 17:50:21,659][62635] Updated weights for policy 1, policy_version 49490 (0.0008) [2023-10-12 17:50:22,024][62635] Updated weights for policy 1, policy_version 49500 (0.0008) [2023-10-12 17:50:22,387][62634] Updated weights for policy 0, policy_version 49510 (0.0008) [2023-10-12 17:50:22,765][62634] Updated weights for policy 0, policy_version 49520 (0.0010) [2023-10-12 17:50:23,142][62634] Updated weights for policy 0, policy_version 49530 (0.0009) [2023-10-12 17:50:23,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 101416960. Throughput: 0: 1694.7, 1: 1685.5. Samples: 25354038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:23,435][61643] Avg episode reward: [(0, '24.210'), (1, '9.880')] [2023-10-12 17:50:26,292][62635] Updated weights for policy 1, policy_version 49510 (0.0010) [2023-10-12 17:50:26,681][62635] Updated weights for policy 1, policy_version 49520 (0.0008) [2023-10-12 17:50:27,036][62635] Updated weights for policy 1, policy_version 49530 (0.0007) [2023-10-12 17:50:27,044][62634] Updated weights for policy 0, policy_version 49540 (0.0008) [2023-10-12 17:50:27,418][62634] Updated weights for policy 0, policy_version 49550 (0.0007) [2023-10-12 17:50:27,795][62634] Updated weights for policy 0, policy_version 49560 (0.0008) [2023-10-12 17:50:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 101482496. Throughput: 0: 1696.9, 1: 1662.4. Samples: 25373582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:28,436][61643] Avg episode reward: [(0, '24.360'), (1, '9.810')] [2023-10-12 17:50:31,240][62635] Updated weights for policy 1, policy_version 49540 (0.0010) [2023-10-12 17:50:31,624][62635] Updated weights for policy 1, policy_version 49550 (0.0009) [2023-10-12 17:50:31,871][62634] Updated weights for policy 0, policy_version 49570 (0.0009) [2023-10-12 17:50:31,981][62635] Updated weights for policy 1, policy_version 49560 (0.0009) [2023-10-12 17:50:32,244][62634] Updated weights for policy 0, policy_version 49580 (0.0007) [2023-10-12 17:50:32,627][62634] Updated weights for policy 0, policy_version 49590 (0.0008) [2023-10-12 17:50:33,014][62634] Updated weights for policy 0, policy_version 49600 (0.0010) [2023-10-12 17:50:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 101548032. Throughput: 0: 1664.9, 1: 1680.1. Samples: 25392630. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:33,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.810')] [2023-10-12 17:50:35,913][62635] Updated weights for policy 1, policy_version 49570 (0.0009) [2023-10-12 17:50:36,277][62635] Updated weights for policy 1, policy_version 49580 (0.0007) [2023-10-12 17:50:36,651][62635] Updated weights for policy 1, policy_version 49590 (0.0009) [2023-10-12 17:50:37,021][62635] Updated weights for policy 1, policy_version 49600 (0.0008) [2023-10-12 17:50:37,092][62634] Updated weights for policy 0, policy_version 49610 (0.0009) [2023-10-12 17:50:37,476][62634] Updated weights for policy 0, policy_version 49620 (0.0008) [2023-10-12 17:50:37,846][62634] Updated weights for policy 0, policy_version 49630 (0.0007) [2023-10-12 17:50:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101613568. Throughput: 0: 1688.7, 1: 1679.9. Samples: 25404060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:38,436][61643] Avg episode reward: [(0, '24.000'), (1, '9.980')] [2023-10-12 17:50:41,286][62635] Updated weights for policy 1, policy_version 49610 (0.0007) [2023-10-12 17:50:41,661][62635] Updated weights for policy 1, policy_version 49620 (0.0007) [2023-10-12 17:50:41,790][62634] Updated weights for policy 0, policy_version 49640 (0.0007) [2023-10-12 17:50:42,034][62635] Updated weights for policy 1, policy_version 49630 (0.0008) [2023-10-12 17:50:42,164][62634] Updated weights for policy 0, policy_version 49650 (0.0009) [2023-10-12 17:50:42,538][62634] Updated weights for policy 0, policy_version 49660 (0.0011) [2023-10-12 17:50:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 101679104. Throughput: 0: 1683.6, 1: 1667.8. Samples: 25423260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:43,435][61643] Avg episode reward: [(0, '24.110'), (1, '9.930')] [2023-10-12 17:50:46,119][62635] Updated weights for policy 1, policy_version 49640 (0.0010) [2023-10-12 17:50:46,481][62635] Updated weights for policy 1, policy_version 49650 (0.0010) [2023-10-12 17:50:46,835][62634] Updated weights for policy 0, policy_version 49670 (0.0009) [2023-10-12 17:50:46,855][62635] Updated weights for policy 1, policy_version 49660 (0.0007) [2023-10-12 17:50:47,217][62634] Updated weights for policy 0, policy_version 49680 (0.0009) [2023-10-12 17:50:47,600][62634] Updated weights for policy 0, policy_version 49690 (0.0008) [2023-10-12 17:50:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101744640. Throughput: 0: 1670.6, 1: 1677.9. Samples: 25442760. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:48,436][61643] Avg episode reward: [(0, '23.980'), (1, '9.850')] [2023-10-12 17:50:50,971][62635] Updated weights for policy 1, policy_version 49670 (0.0007) [2023-10-12 17:50:51,336][62635] Updated weights for policy 1, policy_version 49680 (0.0007) [2023-10-12 17:50:51,559][62634] Updated weights for policy 0, policy_version 49700 (0.0009) [2023-10-12 17:50:51,706][62635] Updated weights for policy 1, policy_version 49690 (0.0008) [2023-10-12 17:50:51,930][62634] Updated weights for policy 0, policy_version 49710 (0.0007) [2023-10-12 17:50:52,304][62634] Updated weights for policy 0, policy_version 49720 (0.0007) [2023-10-12 17:50:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101810176. Throughput: 0: 1686.4, 1: 1668.8. Samples: 25454144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:53,435][61643] Avg episode reward: [(0, '24.100'), (1, '9.850')] [2023-10-12 17:50:55,820][62635] Updated weights for policy 1, policy_version 49700 (0.0009) [2023-10-12 17:50:56,186][62635] Updated weights for policy 1, policy_version 49710 (0.0009) [2023-10-12 17:50:56,305][62634] Updated weights for policy 0, policy_version 49730 (0.0009) [2023-10-12 17:50:56,552][62635] Updated weights for policy 1, policy_version 49720 (0.0007) [2023-10-12 17:50:56,672][62634] Updated weights for policy 0, policy_version 49740 (0.0008) [2023-10-12 17:50:57,054][62634] Updated weights for policy 0, policy_version 49750 (0.0007) [2023-10-12 17:50:57,420][62634] Updated weights for policy 0, policy_version 49760 (0.0008) [2023-10-12 17:50:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101875712. Throughput: 0: 1661.9, 1: 1667.5. Samples: 25473144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:50:58,436][61643] Avg episode reward: [(0, '24.300'), (1, '9.830')] [2023-10-12 17:51:00,498][62635] Updated weights for policy 1, policy_version 49730 (0.0007) [2023-10-12 17:51:00,866][62635] Updated weights for policy 1, policy_version 49740 (0.0009) [2023-10-12 17:51:01,228][62635] Updated weights for policy 1, policy_version 49750 (0.0009) [2023-10-12 17:51:01,439][62634] Updated weights for policy 0, policy_version 49770 (0.0008) [2023-10-12 17:51:01,600][62635] Updated weights for policy 1, policy_version 49760 (0.0008) [2023-10-12 17:51:01,819][62634] Updated weights for policy 0, policy_version 49780 (0.0008) [2023-10-12 17:51:02,200][62634] Updated weights for policy 0, policy_version 49790 (0.0009) [2023-10-12 17:51:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 101941248. Throughput: 0: 1669.1, 1: 1673.9. Samples: 25493404. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:51:03,435][61643] Avg episode reward: [(0, '24.280'), (1, '9.880')] [2023-10-12 17:51:05,513][62635] Updated weights for policy 1, policy_version 49770 (0.0009) [2023-10-12 17:51:05,883][62635] Updated weights for policy 1, policy_version 49780 (0.0007) [2023-10-12 17:51:06,252][62635] Updated weights for policy 1, policy_version 49790 (0.0007) [2023-10-12 17:51:06,275][62634] Updated weights for policy 0, policy_version 49800 (0.0009) [2023-10-12 17:51:06,658][62634] Updated weights for policy 0, policy_version 49810 (0.0008) [2023-10-12 17:51:07,036][62634] Updated weights for policy 0, policy_version 49820 (0.0010) [2023-10-12 17:51:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102006784. Throughput: 0: 1679.0, 1: 1656.7. Samples: 25504146. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:51:08,436][61643] Avg episode reward: [(0, '24.480'), (1, '9.890')] [2023-10-12 17:51:10,290][62635] Updated weights for policy 1, policy_version 49800 (0.0007) [2023-10-12 17:51:10,651][62635] Updated weights for policy 1, policy_version 49810 (0.0007) [2023-10-12 17:51:11,024][62635] Updated weights for policy 1, policy_version 49820 (0.0008) [2023-10-12 17:51:11,230][62634] Updated weights for policy 0, policy_version 49830 (0.0011) [2023-10-12 17:51:11,616][62634] Updated weights for policy 0, policy_version 49840 (0.0009) [2023-10-12 17:51:11,992][62634] Updated weights for policy 0, policy_version 49850 (0.0008) [2023-10-12 17:51:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102072320. Throughput: 0: 1659.2, 1: 1674.5. Samples: 25523594. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:13,435][61643] Avg episode reward: [(0, '24.280'), (1, '9.920')] [2023-10-12 17:51:15,026][62635] Updated weights for policy 1, policy_version 49830 (0.0008) [2023-10-12 17:51:15,386][62635] Updated weights for policy 1, policy_version 49840 (0.0009) [2023-10-12 17:51:15,754][62635] Updated weights for policy 1, policy_version 49850 (0.0009) [2023-10-12 17:51:16,035][62634] Updated weights for policy 0, policy_version 49860 (0.0009) [2023-10-12 17:51:16,416][62634] Updated weights for policy 0, policy_version 49870 (0.0007) [2023-10-12 17:51:16,788][62634] Updated weights for policy 0, policy_version 49880 (0.0008) [2023-10-12 17:51:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 102137856. Throughput: 0: 1676.8, 1: 1687.6. Samples: 25544030. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:18,435][61643] Avg episode reward: [(0, '23.840'), (1, '9.940')] [2023-10-12 17:51:20,090][62635] Updated weights for policy 1, policy_version 49860 (0.0010) [2023-10-12 17:51:20,452][62635] Updated weights for policy 1, policy_version 49870 (0.0009) [2023-10-12 17:51:20,772][62634] Updated weights for policy 0, policy_version 49890 (0.0008) [2023-10-12 17:51:20,826][62635] Updated weights for policy 1, policy_version 49880 (0.0008) [2023-10-12 17:51:21,141][62634] Updated weights for policy 0, policy_version 49900 (0.0008) [2023-10-12 17:51:21,515][62634] Updated weights for policy 0, policy_version 49910 (0.0010) [2023-10-12 17:51:21,892][62634] Updated weights for policy 0, policy_version 49920 (0.0010) [2023-10-12 17:51:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102203392. Throughput: 0: 1677.2, 1: 1661.4. Samples: 25554300. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:23,436][61643] Avg episode reward: [(0, '23.740'), (1, '9.950')] [2023-10-12 17:51:24,865][62635] Updated weights for policy 1, policy_version 49890 (0.0009) [2023-10-12 17:51:25,230][62635] Updated weights for policy 1, policy_version 49900 (0.0007) [2023-10-12 17:51:25,604][62635] Updated weights for policy 1, policy_version 49910 (0.0008) [2023-10-12 17:51:25,972][62634] Updated weights for policy 0, policy_version 49930 (0.0007) [2023-10-12 17:51:25,974][62635] Updated weights for policy 1, policy_version 49920 (0.0008) [2023-10-12 17:51:26,341][62634] Updated weights for policy 0, policy_version 49940 (0.0010) [2023-10-12 17:51:26,721][62634] Updated weights for policy 0, policy_version 49950 (0.0010) [2023-10-12 17:51:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 102268928. Throughput: 0: 1663.6, 1: 1679.7. Samples: 25573706. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:28,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.900')] [2023-10-12 17:51:30,076][62635] Updated weights for policy 1, policy_version 49930 (0.0010) [2023-10-12 17:51:30,443][62635] Updated weights for policy 1, policy_version 49940 (0.0009) [2023-10-12 17:51:30,627][62634] Updated weights for policy 0, policy_version 49960 (0.0007) [2023-10-12 17:51:30,807][62635] Updated weights for policy 1, policy_version 49950 (0.0007) [2023-10-12 17:51:31,003][62634] Updated weights for policy 0, policy_version 49970 (0.0007) [2023-10-12 17:51:31,377][62634] Updated weights for policy 0, policy_version 49980 (0.0008) [2023-10-12 17:51:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102334464. Throughput: 0: 1684.4, 1: 1683.3. Samples: 25594310. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:33,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.920')] [2023-10-12 17:51:34,939][62635] Updated weights for policy 1, policy_version 49960 (0.0008) [2023-10-12 17:51:35,305][62635] Updated weights for policy 1, policy_version 49970 (0.0008) [2023-10-12 17:51:35,450][62634] Updated weights for policy 0, policy_version 49990 (0.0011) [2023-10-12 17:51:35,678][62635] Updated weights for policy 1, policy_version 49980 (0.0008) [2023-10-12 17:51:35,828][62634] Updated weights for policy 0, policy_version 50000 (0.0007) [2023-10-12 17:51:36,202][62634] Updated weights for policy 0, policy_version 50010 (0.0007) [2023-10-12 17:51:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102400000. Throughput: 0: 1668.0, 1: 1664.8. Samples: 25604120. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:38,435][61643] Avg episode reward: [(0, '23.510'), (1, '9.610')] [2023-10-12 17:51:39,799][62635] Updated weights for policy 1, policy_version 49990 (0.0010) [2023-10-12 17:51:40,176][62635] Updated weights for policy 1, policy_version 50000 (0.0007) [2023-10-12 17:51:40,216][62634] Updated weights for policy 0, policy_version 50020 (0.0009) [2023-10-12 17:51:40,549][62635] Updated weights for policy 1, policy_version 50010 (0.0008) [2023-10-12 17:51:40,599][62634] Updated weights for policy 0, policy_version 50030 (0.0007) [2023-10-12 17:51:40,974][62634] Updated weights for policy 0, policy_version 50040 (0.0008) [2023-10-12 17:51:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102465536. Throughput: 0: 1673.3, 1: 1686.7. Samples: 25624344. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-12 17:51:43,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.550')] [2023-10-12 17:51:44,508][62635] Updated weights for policy 1, policy_version 50020 (0.0008) [2023-10-12 17:51:44,878][62635] Updated weights for policy 1, policy_version 50030 (0.0008) [2023-10-12 17:51:45,089][62634] Updated weights for policy 0, policy_version 50050 (0.0008) [2023-10-12 17:51:45,249][62635] Updated weights for policy 1, policy_version 50040 (0.0008) [2023-10-12 17:51:45,471][62634] Updated weights for policy 0, policy_version 50060 (0.0009) [2023-10-12 17:51:45,847][62634] Updated weights for policy 0, policy_version 50070 (0.0007) [2023-10-12 17:51:46,218][62634] Updated weights for policy 0, policy_version 50080 (0.0009) [2023-10-12 17:51:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102531072. Throughput: 0: 1683.0, 1: 1684.2. Samples: 25644930. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:51:48,435][61643] Avg episode reward: [(0, '23.860'), (1, '9.610')] [2023-10-12 17:51:49,385][62635] Updated weights for policy 1, policy_version 50050 (0.0009) [2023-10-12 17:51:49,755][62635] Updated weights for policy 1, policy_version 50060 (0.0008) [2023-10-12 17:51:50,122][62635] Updated weights for policy 1, policy_version 50070 (0.0007) [2023-10-12 17:51:50,369][62634] Updated weights for policy 0, policy_version 50090 (0.0009) [2023-10-12 17:51:50,500][62635] Updated weights for policy 1, policy_version 50080 (0.0007) [2023-10-12 17:51:50,744][62634] Updated weights for policy 0, policy_version 50100 (0.0008) [2023-10-12 17:51:51,120][62634] Updated weights for policy 0, policy_version 50110 (0.0009) [2023-10-12 17:51:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102596608. Throughput: 0: 1661.9, 1: 1673.4. Samples: 25654234. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:51:53,435][61643] Avg episode reward: [(0, '24.060'), (1, '9.570')] [2023-10-12 17:51:54,490][62635] Updated weights for policy 1, policy_version 50090 (0.0008) [2023-10-12 17:51:54,860][62635] Updated weights for policy 1, policy_version 50100 (0.0009) [2023-10-12 17:51:55,225][62635] Updated weights for policy 1, policy_version 50110 (0.0007) [2023-10-12 17:51:55,243][62634] Updated weights for policy 0, policy_version 50120 (0.0008) [2023-10-12 17:51:55,615][62634] Updated weights for policy 0, policy_version 50130 (0.0009) [2023-10-12 17:51:55,996][62634] Updated weights for policy 0, policy_version 50140 (0.0007) [2023-10-12 17:51:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102662144. Throughput: 0: 1676.1, 1: 1680.5. Samples: 25674644. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:51:58,436][61643] Avg episode reward: [(0, '24.070'), (1, '9.570')] [2023-10-12 17:51:59,205][62635] Updated weights for policy 1, policy_version 50120 (0.0009) [2023-10-12 17:51:59,571][62635] Updated weights for policy 1, policy_version 50130 (0.0008) [2023-10-12 17:51:59,941][62635] Updated weights for policy 1, policy_version 50140 (0.0008) [2023-10-12 17:52:00,142][62634] Updated weights for policy 0, policy_version 50150 (0.0010) [2023-10-12 17:52:00,527][62634] Updated weights for policy 0, policy_version 50160 (0.0009) [2023-10-12 17:52:00,908][62634] Updated weights for policy 0, policy_version 50170 (0.0007) [2023-10-12 17:52:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102727680. Throughput: 0: 1681.2, 1: 1679.7. Samples: 25695274. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:52:03,435][61643] Avg episode reward: [(0, '24.180'), (1, '9.790')] [2023-10-12 17:52:04,208][62635] Updated weights for policy 1, policy_version 50150 (0.0008) [2023-10-12 17:52:04,591][62635] Updated weights for policy 1, policy_version 50160 (0.0007) [2023-10-12 17:52:04,916][62634] Updated weights for policy 0, policy_version 50180 (0.0010) [2023-10-12 17:52:04,956][62635] Updated weights for policy 1, policy_version 50170 (0.0008) [2023-10-12 17:52:05,291][62634] Updated weights for policy 0, policy_version 50190 (0.0008) [2023-10-12 17:52:05,662][62634] Updated weights for policy 0, policy_version 50200 (0.0008) [2023-10-12 17:52:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102793216. Throughput: 0: 1658.8, 1: 1674.7. Samples: 25704308. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:52:08,435][61643] Avg episode reward: [(0, '24.180'), (1, '9.760')] [2023-10-12 17:52:09,022][62635] Updated weights for policy 1, policy_version 50180 (0.0007) [2023-10-12 17:52:09,383][62635] Updated weights for policy 1, policy_version 50190 (0.0008) [2023-10-12 17:52:09,752][62635] Updated weights for policy 1, policy_version 50200 (0.0007) [2023-10-12 17:52:09,764][62634] Updated weights for policy 0, policy_version 50210 (0.0007) [2023-10-12 17:52:10,138][62634] Updated weights for policy 0, policy_version 50220 (0.0008) [2023-10-12 17:52:10,509][62634] Updated weights for policy 0, policy_version 50230 (0.0007) [2023-10-12 17:52:10,890][62634] Updated weights for policy 0, policy_version 50240 (0.0007) [2023-10-12 17:52:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102858752. Throughput: 0: 1683.0, 1: 1680.9. Samples: 25725084. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:52:13,435][61643] Avg episode reward: [(0, '24.420'), (1, '9.790')] [2023-10-12 17:52:13,791][62635] Updated weights for policy 1, policy_version 50210 (0.0008) [2023-10-12 17:52:14,159][62635] Updated weights for policy 1, policy_version 50220 (0.0007) [2023-10-12 17:52:14,534][62635] Updated weights for policy 1, policy_version 50230 (0.0008) [2023-10-12 17:52:14,870][62634] Updated weights for policy 0, policy_version 50250 (0.0010) [2023-10-12 17:52:14,900][62635] Updated weights for policy 1, policy_version 50240 (0.0007) [2023-10-12 17:52:15,239][62634] Updated weights for policy 0, policy_version 50260 (0.0011) [2023-10-12 17:52:15,617][62634] Updated weights for policy 0, policy_version 50270 (0.0008) [2023-10-12 17:52:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 102924288. Throughput: 0: 1682.3, 1: 1685.0. Samples: 25745842. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) [2023-10-12 17:52:18,436][61643] Avg episode reward: [(0, '24.330'), (1, '9.930')] [2023-10-12 17:52:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000050272_51478528.pth... [2023-10-12 17:52:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000050240_51445760.pth... [2023-10-12 17:52:18,478][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000048704_49872896.pth [2023-10-12 17:52:18,491][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000048672_49840128.pth [2023-10-12 17:52:18,997][62635] Updated weights for policy 1, policy_version 50250 (0.0010) [2023-10-12 17:52:19,365][62635] Updated weights for policy 1, policy_version 50260 (0.0007) [2023-10-12 17:52:19,552][62634] Updated weights for policy 0, policy_version 50280 (0.0009) [2023-10-12 17:52:19,731][62635] Updated weights for policy 1, policy_version 50270 (0.0008) [2023-10-12 17:52:19,928][62634] Updated weights for policy 0, policy_version 50290 (0.0009) [2023-10-12 17:52:20,309][62634] Updated weights for policy 0, policy_version 50300 (0.0009) [2023-10-12 17:52:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102989824. Throughput: 0: 1669.7, 1: 1681.1. Samples: 25754904. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:23,435][61643] Avg episode reward: [(0, '24.370'), (1, '9.550')] [2023-10-12 17:52:23,724][62635] Updated weights for policy 1, policy_version 50280 (0.0010) [2023-10-12 17:52:24,084][62635] Updated weights for policy 1, policy_version 50290 (0.0007) [2023-10-12 17:52:24,338][62634] Updated weights for policy 0, policy_version 50310 (0.0008) [2023-10-12 17:52:24,455][62635] Updated weights for policy 1, policy_version 50300 (0.0008) [2023-10-12 17:52:24,712][62634] Updated weights for policy 0, policy_version 50320 (0.0008) [2023-10-12 17:52:25,092][62634] Updated weights for policy 0, policy_version 50330 (0.0007) [2023-10-12 17:52:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103055360. Throughput: 0: 1678.0, 1: 1681.6. Samples: 25775530. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:28,435][61643] Avg episode reward: [(0, '24.320'), (1, '9.730')] [2023-10-12 17:52:28,561][62635] Updated weights for policy 1, policy_version 50310 (0.0008) [2023-10-12 17:52:28,928][62635] Updated weights for policy 1, policy_version 50320 (0.0009) [2023-10-12 17:52:29,239][62634] Updated weights for policy 0, policy_version 50340 (0.0009) [2023-10-12 17:52:29,301][62635] Updated weights for policy 1, policy_version 50330 (0.0007) [2023-10-12 17:52:29,615][62634] Updated weights for policy 0, policy_version 50350 (0.0010) [2023-10-12 17:52:29,996][62634] Updated weights for policy 0, policy_version 50360 (0.0007) [2023-10-12 17:52:33,418][62635] Updated weights for policy 1, policy_version 50340 (0.0008) [2023-10-12 17:52:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103120896. Throughput: 0: 1678.3, 1: 1683.4. Samples: 25796204. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:33,436][61643] Avg episode reward: [(0, '24.730'), (1, '9.590')] [2023-10-12 17:52:33,791][62635] Updated weights for policy 1, policy_version 50350 (0.0009) [2023-10-12 17:52:34,166][62635] Updated weights for policy 1, policy_version 50360 (0.0008) [2023-10-12 17:52:34,192][62634] Updated weights for policy 0, policy_version 50370 (0.0007) [2023-10-12 17:52:34,567][62634] Updated weights for policy 0, policy_version 50380 (0.0008) [2023-10-12 17:52:34,942][62634] Updated weights for policy 0, policy_version 50390 (0.0008) [2023-10-12 17:52:35,327][62634] Updated weights for policy 0, policy_version 50400 (0.0007) [2023-10-12 17:52:38,250][62635] Updated weights for policy 1, policy_version 50370 (0.0008) [2023-10-12 17:52:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103186432. Throughput: 0: 1671.6, 1: 1683.3. Samples: 25805206. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:38,435][61643] Avg episode reward: [(0, '25.030'), (1, '9.610')] [2023-10-12 17:52:38,436][62354] Saving new best policy, reward=25.030! [2023-10-12 17:52:38,610][62635] Updated weights for policy 1, policy_version 50380 (0.0010) [2023-10-12 17:52:38,980][62635] Updated weights for policy 1, policy_version 50390 (0.0008) [2023-10-12 17:52:39,343][62635] Updated weights for policy 1, policy_version 50400 (0.0007) [2023-10-12 17:52:39,479][62634] Updated weights for policy 0, policy_version 50410 (0.0008) [2023-10-12 17:52:39,859][62634] Updated weights for policy 0, policy_version 50420 (0.0008) [2023-10-12 17:52:40,228][62634] Updated weights for policy 0, policy_version 50430 (0.0008) [2023-10-12 17:52:43,354][62635] Updated weights for policy 1, policy_version 50410 (0.0009) [2023-10-12 17:52:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103251968. Throughput: 0: 1676.8, 1: 1685.8. Samples: 25825960. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:43,436][61643] Avg episode reward: [(0, '25.280'), (1, '9.780')] [2023-10-12 17:52:43,436][62354] Saving new best policy, reward=25.280! [2023-10-12 17:52:43,717][62635] Updated weights for policy 1, policy_version 50420 (0.0009) [2023-10-12 17:52:44,085][62635] Updated weights for policy 1, policy_version 50430 (0.0007) [2023-10-12 17:52:44,308][62634] Updated weights for policy 0, policy_version 50440 (0.0007) [2023-10-12 17:52:44,699][62634] Updated weights for policy 0, policy_version 50450 (0.0007) [2023-10-12 17:52:45,085][62634] Updated weights for policy 0, policy_version 50460 (0.0008) [2023-10-12 17:52:48,179][62635] Updated weights for policy 1, policy_version 50440 (0.0010) [2023-10-12 17:52:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103317504. Throughput: 0: 1681.8, 1: 1680.2. Samples: 25846564. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:48,435][61643] Avg episode reward: [(0, '25.360'), (1, '9.650')] [2023-10-12 17:52:48,444][62354] Saving new best policy, reward=25.360! [2023-10-12 17:52:48,540][62635] Updated weights for policy 1, policy_version 50450 (0.0008) [2023-10-12 17:52:48,915][62635] Updated weights for policy 1, policy_version 50460 (0.0007) [2023-10-12 17:52:49,136][62634] Updated weights for policy 0, policy_version 50470 (0.0009) [2023-10-12 17:52:49,521][62634] Updated weights for policy 0, policy_version 50480 (0.0008) [2023-10-12 17:52:49,898][62634] Updated weights for policy 0, policy_version 50490 (0.0008) [2023-10-12 17:52:52,970][62635] Updated weights for policy 1, policy_version 50470 (0.0009) [2023-10-12 17:52:53,349][62635] Updated weights for policy 1, policy_version 50480 (0.0011) [2023-10-12 17:52:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103383040. Throughput: 0: 1676.7, 1: 1689.7. Samples: 25855796. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:52:53,435][61643] Avg episode reward: [(0, '25.400'), (1, '9.440')] [2023-10-12 17:52:53,436][62354] Saving new best policy, reward=25.400! [2023-10-12 17:52:53,719][62635] Updated weights for policy 1, policy_version 50490 (0.0008) [2023-10-12 17:52:53,969][62634] Updated weights for policy 0, policy_version 50500 (0.0008) [2023-10-12 17:52:54,340][62634] Updated weights for policy 0, policy_version 50510 (0.0009) [2023-10-12 17:52:54,723][62634] Updated weights for policy 0, policy_version 50520 (0.0009) [2023-10-12 17:52:57,598][62635] Updated weights for policy 1, policy_version 50500 (0.0007) [2023-10-12 17:52:57,970][62635] Updated weights for policy 1, policy_version 50510 (0.0007) [2023-10-12 17:52:58,341][62635] Updated weights for policy 1, policy_version 50520 (0.0011) [2023-10-12 17:52:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103448576. Throughput: 0: 1677.3, 1: 1689.0. Samples: 25876570. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:52:58,436][61643] Avg episode reward: [(0, '25.230'), (1, '10.030')] [2023-10-12 17:52:58,823][62634] Updated weights for policy 0, policy_version 50530 (0.0009) [2023-10-12 17:52:59,197][62634] Updated weights for policy 0, policy_version 50540 (0.0010) [2023-10-12 17:52:59,574][62634] Updated weights for policy 0, policy_version 50550 (0.0008) [2023-10-12 17:52:59,955][62634] Updated weights for policy 0, policy_version 50560 (0.0010) [2023-10-12 17:53:02,377][62635] Updated weights for policy 1, policy_version 50530 (0.0009) [2023-10-12 17:53:02,751][62635] Updated weights for policy 1, policy_version 50540 (0.0008) [2023-10-12 17:53:03,135][62635] Updated weights for policy 1, policy_version 50550 (0.0009) [2023-10-12 17:53:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103514112. Throughput: 0: 1677.7, 1: 1676.0. Samples: 25896760. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:53:03,435][61643] Avg episode reward: [(0, '25.000'), (1, '9.720')] [2023-10-12 17:53:03,504][62635] Updated weights for policy 1, policy_version 50560 (0.0007) [2023-10-12 17:53:03,888][62634] Updated weights for policy 0, policy_version 50570 (0.0010) [2023-10-12 17:53:04,253][62634] Updated weights for policy 0, policy_version 50580 (0.0010) [2023-10-12 17:53:04,638][62634] Updated weights for policy 0, policy_version 50590 (0.0011) [2023-10-12 17:53:07,356][62635] Updated weights for policy 1, policy_version 50570 (0.0008) [2023-10-12 17:53:07,717][62635] Updated weights for policy 1, policy_version 50580 (0.0009) [2023-10-12 17:53:08,084][62635] Updated weights for policy 1, policy_version 50590 (0.0008) [2023-10-12 17:53:08,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103612416. Throughput: 0: 1674.7, 1: 1697.3. Samples: 25906644. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:53:08,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.730')] [2023-10-12 17:53:08,762][62634] Updated weights for policy 0, policy_version 50600 (0.0009) [2023-10-12 17:53:09,133][62634] Updated weights for policy 0, policy_version 50610 (0.0009) [2023-10-12 17:53:09,509][62634] Updated weights for policy 0, policy_version 50620 (0.0009) [2023-10-12 17:53:12,234][62635] Updated weights for policy 1, policy_version 50600 (0.0007) [2023-10-12 17:53:12,607][62635] Updated weights for policy 1, policy_version 50610 (0.0007) [2023-10-12 17:53:12,978][62635] Updated weights for policy 1, policy_version 50620 (0.0007) [2023-10-12 17:53:13,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103677952. Throughput: 0: 1678.3, 1: 1693.6. Samples: 25927268. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:53:13,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.870')] [2023-10-12 17:53:13,460][62634] Updated weights for policy 0, policy_version 50630 (0.0008) [2023-10-12 17:53:13,836][62634] Updated weights for policy 0, policy_version 50640 (0.0010) [2023-10-12 17:53:14,213][62634] Updated weights for policy 0, policy_version 50650 (0.0007) [2023-10-12 17:53:17,100][62635] Updated weights for policy 1, policy_version 50630 (0.0008) [2023-10-12 17:53:17,468][62635] Updated weights for policy 1, policy_version 50640 (0.0008) [2023-10-12 17:53:17,836][62635] Updated weights for policy 1, policy_version 50650 (0.0007) [2023-10-12 17:53:18,278][62634] Updated weights for policy 0, policy_version 50660 (0.0008) [2023-10-12 17:53:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 103743488. Throughput: 0: 1681.3, 1: 1668.6. Samples: 25946948. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:53:18,435][61643] Avg episode reward: [(0, '24.930'), (1, '9.570')] [2023-10-12 17:53:18,643][62634] Updated weights for policy 0, policy_version 50670 (0.0008) [2023-10-12 17:53:19,019][62634] Updated weights for policy 0, policy_version 50680 (0.0009) [2023-10-12 17:53:21,998][62635] Updated weights for policy 1, policy_version 50660 (0.0009) [2023-10-12 17:53:22,370][62635] Updated weights for policy 1, policy_version 50670 (0.0008) [2023-10-12 17:53:22,733][62635] Updated weights for policy 1, policy_version 50680 (0.0007) [2023-10-12 17:53:22,914][62634] Updated weights for policy 0, policy_version 50690 (0.0008) [2023-10-12 17:53:23,281][62634] Updated weights for policy 0, policy_version 50700 (0.0009) [2023-10-12 17:53:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103809024. Throughput: 0: 1685.6, 1: 1697.1. Samples: 25957432. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:53:23,436][61643] Avg episode reward: [(0, '24.780'), (1, '9.670')] [2023-10-12 17:53:23,666][62634] Updated weights for policy 0, policy_version 50710 (0.0010) [2023-10-12 17:53:24,040][62634] Updated weights for policy 0, policy_version 50720 (0.0009) [2023-10-12 17:53:26,817][62635] Updated weights for policy 1, policy_version 50690 (0.0008) [2023-10-12 17:53:27,188][62635] Updated weights for policy 1, policy_version 50700 (0.0009) [2023-10-12 17:53:27,559][62635] Updated weights for policy 1, policy_version 50710 (0.0007) [2023-10-12 17:53:27,937][62635] Updated weights for policy 1, policy_version 50720 (0.0007) [2023-10-12 17:53:27,943][62634] Updated weights for policy 0, policy_version 50730 (0.0008) [2023-10-12 17:53:28,330][62634] Updated weights for policy 0, policy_version 50740 (0.0009) [2023-10-12 17:53:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103874560. Throughput: 0: 1691.4, 1: 1685.2. Samples: 25977908. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-12 17:53:28,436][61643] Avg episode reward: [(0, '24.640'), (1, '9.740')] [2023-10-12 17:53:28,708][62634] Updated weights for policy 0, policy_version 50750 (0.0009) [2023-10-12 17:53:31,876][62635] Updated weights for policy 1, policy_version 50730 (0.0008) [2023-10-12 17:53:32,252][62635] Updated weights for policy 1, policy_version 50740 (0.0007) [2023-10-12 17:53:32,624][62635] Updated weights for policy 1, policy_version 50750 (0.0008) [2023-10-12 17:53:32,807][62634] Updated weights for policy 0, policy_version 50760 (0.0008) [2023-10-12 17:53:33,174][62634] Updated weights for policy 0, policy_version 50770 (0.0007) [2023-10-12 17:53:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103940096. Throughput: 0: 1682.0, 1: 1665.7. Samples: 25997212. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:53:33,435][61643] Avg episode reward: [(0, '24.370'), (1, '9.660')] [2023-10-12 17:53:33,551][62634] Updated weights for policy 0, policy_version 50780 (0.0009) [2023-10-12 17:53:36,810][62635] Updated weights for policy 1, policy_version 50760 (0.0008) [2023-10-12 17:53:37,171][62635] Updated weights for policy 1, policy_version 50770 (0.0009) [2023-10-12 17:53:37,538][62635] Updated weights for policy 1, policy_version 50780 (0.0008) [2023-10-12 17:53:37,652][62634] Updated weights for policy 0, policy_version 50790 (0.0007) [2023-10-12 17:53:38,036][62634] Updated weights for policy 0, policy_version 50800 (0.0009) [2023-10-12 17:53:38,401][62634] Updated weights for policy 0, policy_version 50810 (0.0008) [2023-10-12 17:53:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104005632. Throughput: 0: 1692.6, 1: 1689.2. Samples: 26007980. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:53:38,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.440')] [2023-10-12 17:53:41,624][62635] Updated weights for policy 1, policy_version 50790 (0.0010) [2023-10-12 17:53:42,004][62635] Updated weights for policy 1, policy_version 50800 (0.0009) [2023-10-12 17:53:42,374][62635] Updated weights for policy 1, policy_version 50810 (0.0008) [2023-10-12 17:53:42,578][62634] Updated weights for policy 0, policy_version 50820 (0.0008) [2023-10-12 17:53:42,955][62634] Updated weights for policy 0, policy_version 50830 (0.0008) [2023-10-12 17:53:43,336][62634] Updated weights for policy 0, policy_version 50840 (0.0009) [2023-10-12 17:53:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104071168. Throughput: 0: 1692.7, 1: 1674.1. Samples: 26028076. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:53:43,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.590')] [2023-10-12 17:53:46,305][62635] Updated weights for policy 1, policy_version 50820 (0.0008) [2023-10-12 17:53:46,674][62635] Updated weights for policy 1, policy_version 50830 (0.0009) [2023-10-12 17:53:47,049][62635] Updated weights for policy 1, policy_version 50840 (0.0008) [2023-10-12 17:53:47,332][62634] Updated weights for policy 0, policy_version 50850 (0.0009) [2023-10-12 17:53:47,709][62634] Updated weights for policy 0, policy_version 50860 (0.0008) [2023-10-12 17:53:48,092][62634] Updated weights for policy 0, policy_version 50870 (0.0008) [2023-10-12 17:53:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104136704. Throughput: 0: 1675.5, 1: 1676.2. Samples: 26047586. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:53:48,436][61643] Avg episode reward: [(0, '24.370'), (1, '9.630')] [2023-10-12 17:53:48,464][62634] Updated weights for policy 0, policy_version 50880 (0.0008) [2023-10-12 17:53:51,157][62635] Updated weights for policy 1, policy_version 50850 (0.0008) [2023-10-12 17:53:51,529][62635] Updated weights for policy 1, policy_version 50860 (0.0008) [2023-10-12 17:53:51,901][62635] Updated weights for policy 1, policy_version 50870 (0.0007) [2023-10-12 17:53:52,268][62635] Updated weights for policy 1, policy_version 50880 (0.0009) [2023-10-12 17:53:52,404][62634] Updated weights for policy 0, policy_version 50890 (0.0009) [2023-10-12 17:53:52,775][62634] Updated weights for policy 0, policy_version 50900 (0.0010) [2023-10-12 17:53:53,148][62634] Updated weights for policy 0, policy_version 50910 (0.0011) [2023-10-12 17:53:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 104235008. Throughput: 0: 1695.2, 1: 1683.7. Samples: 26058696. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:53:53,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.490')] [2023-10-12 17:53:56,317][62635] Updated weights for policy 1, policy_version 50890 (0.0008) [2023-10-12 17:53:56,695][62635] Updated weights for policy 1, policy_version 50900 (0.0008) [2023-10-12 17:53:57,063][62635] Updated weights for policy 1, policy_version 50910 (0.0009) [2023-10-12 17:53:57,467][62634] Updated weights for policy 0, policy_version 50920 (0.0008) [2023-10-12 17:53:57,843][62634] Updated weights for policy 0, policy_version 50930 (0.0009) [2023-10-12 17:53:58,226][62634] Updated weights for policy 0, policy_version 50940 (0.0010) [2023-10-12 17:53:58,435][61643] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 104300544. Throughput: 0: 1690.1, 1: 1665.9. Samples: 26078288. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 17:53:58,435][61643] Avg episode reward: [(0, '24.260'), (1, '9.470')] [2023-10-12 17:54:00,963][62635] Updated weights for policy 1, policy_version 50920 (0.0008) [2023-10-12 17:54:01,335][62635] Updated weights for policy 1, policy_version 50930 (0.0010) [2023-10-12 17:54:01,700][62635] Updated weights for policy 1, policy_version 50940 (0.0008) [2023-10-12 17:54:02,258][62634] Updated weights for policy 0, policy_version 50950 (0.0009) [2023-10-12 17:54:02,635][62634] Updated weights for policy 0, policy_version 50960 (0.0007) [2023-10-12 17:54:03,017][62634] Updated weights for policy 0, policy_version 50970 (0.0010) [2023-10-12 17:54:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 104366080. Throughput: 0: 1664.8, 1: 1696.7. Samples: 26098214. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:03,435][61643] Avg episode reward: [(0, '24.230'), (1, '9.240')] [2023-10-12 17:54:05,613][62635] Updated weights for policy 1, policy_version 50950 (0.0008) [2023-10-12 17:54:05,982][62635] Updated weights for policy 1, policy_version 50960 (0.0007) [2023-10-12 17:54:06,364][62635] Updated weights for policy 1, policy_version 50970 (0.0008) [2023-10-12 17:54:07,092][62634] Updated weights for policy 0, policy_version 50980 (0.0010) [2023-10-12 17:54:07,468][62634] Updated weights for policy 0, policy_version 50990 (0.0008) [2023-10-12 17:54:07,847][62634] Updated weights for policy 0, policy_version 51000 (0.0007) [2023-10-12 17:54:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 104431616. Throughput: 0: 1682.2, 1: 1685.4. Samples: 26108972. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:08,436][61643] Avg episode reward: [(0, '24.130'), (1, '9.130')] [2023-10-12 17:54:10,304][62635] Updated weights for policy 1, policy_version 50980 (0.0009) [2023-10-12 17:54:10,676][62635] Updated weights for policy 1, policy_version 50990 (0.0008) [2023-10-12 17:54:11,055][62635] Updated weights for policy 1, policy_version 51000 (0.0008) [2023-10-12 17:54:11,870][62634] Updated weights for policy 0, policy_version 51010 (0.0008) [2023-10-12 17:54:12,246][62634] Updated weights for policy 0, policy_version 51020 (0.0010) [2023-10-12 17:54:12,623][62634] Updated weights for policy 0, policy_version 51030 (0.0008) [2023-10-12 17:54:13,000][62634] Updated weights for policy 0, policy_version 51040 (0.0007) [2023-10-12 17:54:13,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 104497152. Throughput: 0: 1676.3, 1: 1680.8. Samples: 26128976. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:13,436][61643] Avg episode reward: [(0, '24.220'), (1, '9.240')] [2023-10-12 17:54:14,912][62635] Updated weights for policy 1, policy_version 51010 (0.0008) [2023-10-12 17:54:15,281][62635] Updated weights for policy 1, policy_version 51020 (0.0008) [2023-10-12 17:54:15,655][62635] Updated weights for policy 1, policy_version 51030 (0.0008) [2023-10-12 17:54:16,014][62635] Updated weights for policy 1, policy_version 51040 (0.0009) [2023-10-12 17:54:17,132][62634] Updated weights for policy 0, policy_version 51050 (0.0008) [2023-10-12 17:54:17,506][62634] Updated weights for policy 0, policy_version 51060 (0.0007) [2023-10-12 17:54:17,885][62634] Updated weights for policy 0, policy_version 51070 (0.0007) [2023-10-12 17:54:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 104562688. Throughput: 0: 1660.9, 1: 1706.5. Samples: 26148748. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:18,436][61643] Avg episode reward: [(0, '24.640'), (1, '9.170')] [2023-10-12 17:54:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000051040_52264960.pth... [2023-10-12 17:54:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000051072_52297728.pth... 
[2023-10-12 17:54:18,480][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000049472_50659328.pth [2023-10-12 17:54:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000049504_50692096.pth [2023-10-12 17:54:20,219][62635] Updated weights for policy 1, policy_version 51050 (0.0010) [2023-10-12 17:54:20,586][62635] Updated weights for policy 1, policy_version 51060 (0.0010) [2023-10-12 17:54:20,951][62635] Updated weights for policy 1, policy_version 51070 (0.0008) [2023-10-12 17:54:22,145][62634] Updated weights for policy 0, policy_version 51080 (0.0010) [2023-10-12 17:54:22,533][62634] Updated weights for policy 0, policy_version 51090 (0.0008) [2023-10-12 17:54:22,917][62634] Updated weights for policy 0, policy_version 51100 (0.0007) [2023-10-12 17:54:23,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 104628224. Throughput: 0: 1676.0, 1: 1678.3. Samples: 26158922. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:23,435][61643] Avg episode reward: [(0, '24.760'), (1, '9.280')] [2023-10-12 17:54:25,136][62635] Updated weights for policy 1, policy_version 51080 (0.0008) [2023-10-12 17:54:25,505][62635] Updated weights for policy 1, policy_version 51090 (0.0007) [2023-10-12 17:54:25,878][62635] Updated weights for policy 1, policy_version 51100 (0.0007) [2023-10-12 17:54:26,801][62634] Updated weights for policy 0, policy_version 51110 (0.0010) [2023-10-12 17:54:27,183][62634] Updated weights for policy 0, policy_version 51120 (0.0009) [2023-10-12 17:54:27,567][62634] Updated weights for policy 0, policy_version 51130 (0.0007) [2023-10-12 17:54:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 104693760. Throughput: 0: 1662.3, 1: 1691.4. Samples: 26178992. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:28,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.520')] [2023-10-12 17:54:30,056][62635] Updated weights for policy 1, policy_version 51110 (0.0009) [2023-10-12 17:54:30,445][62635] Updated weights for policy 1, policy_version 51120 (0.0009) [2023-10-12 17:54:30,813][62635] Updated weights for policy 1, policy_version 51130 (0.0009) [2023-10-12 17:54:31,732][62634] Updated weights for policy 0, policy_version 51140 (0.0010) [2023-10-12 17:54:32,110][62634] Updated weights for policy 0, policy_version 51150 (0.0008) [2023-10-12 17:54:32,493][62634] Updated weights for policy 0, policy_version 51160 (0.0008) [2023-10-12 17:54:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104759296. Throughput: 0: 1657.6, 1: 1698.7. Samples: 26198618. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:54:33,436][61643] Avg episode reward: [(0, '24.870'), (1, '9.540')] [2023-10-12 17:54:34,835][62635] Updated weights for policy 1, policy_version 51140 (0.0007) [2023-10-12 17:54:35,209][62635] Updated weights for policy 1, policy_version 51150 (0.0007) [2023-10-12 17:54:35,579][62635] Updated weights for policy 1, policy_version 51160 (0.0007) [2023-10-12 17:54:36,638][62634] Updated weights for policy 0, policy_version 51170 (0.0010) [2023-10-12 17:54:37,013][62634] Updated weights for policy 0, policy_version 51180 (0.0008) [2023-10-12 17:54:37,400][62634] Updated weights for policy 0, policy_version 51190 (0.0007) [2023-10-12 17:54:37,781][62634] Updated weights for policy 0, policy_version 51200 (0.0008) [2023-10-12 17:54:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104824832. Throughput: 0: 1666.8, 1: 1668.5. Samples: 26208784. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:54:38,436][61643] Avg episode reward: [(0, '24.990'), (1, '9.550')] [2023-10-12 17:54:39,547][62635] Updated weights for policy 1, policy_version 51170 (0.0008) [2023-10-12 17:54:39,914][62635] Updated weights for policy 1, policy_version 51180 (0.0007) [2023-10-12 17:54:40,275][62635] Updated weights for policy 1, policy_version 51190 (0.0010) [2023-10-12 17:54:40,646][62635] Updated weights for policy 1, policy_version 51200 (0.0007) [2023-10-12 17:54:41,631][62634] Updated weights for policy 0, policy_version 51210 (0.0008) [2023-10-12 17:54:42,017][62634] Updated weights for policy 0, policy_version 51220 (0.0007) [2023-10-12 17:54:42,387][62634] Updated weights for policy 0, policy_version 51230 (0.0007) [2023-10-12 17:54:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104890368. Throughput: 0: 1660.1, 1: 1692.1. Samples: 26229140. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:54:43,435][61643] Avg episode reward: [(0, '25.060'), (1, '9.800')] [2023-10-12 17:54:44,800][62635] Updated weights for policy 1, policy_version 51210 (0.0009) [2023-10-12 17:54:45,171][62635] Updated weights for policy 1, policy_version 51220 (0.0007) [2023-10-12 17:54:45,541][62635] Updated weights for policy 1, policy_version 51230 (0.0008) [2023-10-12 17:54:46,465][62634] Updated weights for policy 0, policy_version 51240 (0.0008) [2023-10-12 17:54:46,850][62634] Updated weights for policy 0, policy_version 51250 (0.0009) [2023-10-12 17:54:47,224][62634] Updated weights for policy 0, policy_version 51260 (0.0008) [2023-10-12 17:54:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104955904. Throughput: 0: 1672.4, 1: 1680.4. Samples: 26249092. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:54:48,436][61643] Avg episode reward: [(0, '25.100'), (1, '9.900')] [2023-10-12 17:54:49,702][62635] Updated weights for policy 1, policy_version 51240 (0.0008) [2023-10-12 17:54:50,062][62635] Updated weights for policy 1, policy_version 51250 (0.0008) [2023-10-12 17:54:50,432][62635] Updated weights for policy 1, policy_version 51260 (0.0009) [2023-10-12 17:54:51,131][62634] Updated weights for policy 0, policy_version 51270 (0.0009) [2023-10-12 17:54:51,508][62634] Updated weights for policy 0, policy_version 51280 (0.0008) [2023-10-12 17:54:51,894][62634] Updated weights for policy 0, policy_version 51290 (0.0009) [2023-10-12 17:54:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 105021440. Throughput: 0: 1679.3, 1: 1661.0. Samples: 26259286. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:54:53,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.880')] [2023-10-12 17:54:54,603][62635] Updated weights for policy 1, policy_version 51270 (0.0007) [2023-10-12 17:54:54,978][62635] Updated weights for policy 1, policy_version 51280 (0.0007) [2023-10-12 17:54:55,342][62635] Updated weights for policy 1, policy_version 51290 (0.0008) [2023-10-12 17:54:56,135][62634] Updated weights for policy 0, policy_version 51300 (0.0009) [2023-10-12 17:54:56,509][62634] Updated weights for policy 0, policy_version 51310 (0.0008) [2023-10-12 17:54:56,897][62634] Updated weights for policy 0, policy_version 51320 (0.0007) [2023-10-12 17:54:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 105086976. Throughput: 0: 1657.8, 1: 1671.1. Samples: 26278774. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:54:58,436][61643] Avg episode reward: [(0, '25.110'), (1, '9.880')] [2023-10-12 17:54:59,286][62635] Updated weights for policy 1, policy_version 51300 (0.0010) [2023-10-12 17:54:59,659][62635] Updated weights for policy 1, policy_version 51310 (0.0011) [2023-10-12 17:55:00,031][62635] Updated weights for policy 1, policy_version 51320 (0.0008) [2023-10-12 17:55:00,984][62634] Updated weights for policy 0, policy_version 51330 (0.0009) [2023-10-12 17:55:01,372][62634] Updated weights for policy 0, policy_version 51340 (0.0009) [2023-10-12 17:55:01,740][62634] Updated weights for policy 0, policy_version 51350 (0.0008) [2023-10-12 17:55:02,118][62634] Updated weights for policy 0, policy_version 51360 (0.0008) [2023-10-12 17:55:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 105152512. Throughput: 0: 1675.2, 1: 1670.9. Samples: 26299318. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:55:03,435][61643] Avg episode reward: [(0, '25.250'), (1, '9.880')] [2023-10-12 17:55:04,179][62635] Updated weights for policy 1, policy_version 51330 (0.0011) [2023-10-12 17:55:04,549][62635] Updated weights for policy 1, policy_version 51340 (0.0009) [2023-10-12 17:55:04,921][62635] Updated weights for policy 1, policy_version 51350 (0.0007) [2023-10-12 17:55:05,286][62635] Updated weights for policy 1, policy_version 51360 (0.0010) [2023-10-12 17:55:06,125][62634] Updated weights for policy 0, policy_version 51370 (0.0010) [2023-10-12 17:55:06,504][62634] Updated weights for policy 0, policy_version 51380 (0.0009) [2023-10-12 17:55:06,885][62634] Updated weights for policy 0, policy_version 51390 (0.0010) [2023-10-12 17:55:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 105218048. Throughput: 0: 1680.3, 1: 1666.8. Samples: 26309544. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-12 17:55:08,435][61643] Avg episode reward: [(0, '24.600'), (1, '9.840')] [2023-10-12 17:55:09,523][62635] Updated weights for policy 1, policy_version 51370 (0.0007) [2023-10-12 17:55:09,878][62635] Updated weights for policy 1, policy_version 51380 (0.0011) [2023-10-12 17:55:10,249][62635] Updated weights for policy 1, policy_version 51390 (0.0009) [2023-10-12 17:55:10,957][62634] Updated weights for policy 0, policy_version 51400 (0.0009) [2023-10-12 17:55:11,351][62634] Updated weights for policy 0, policy_version 51410 (0.0008) [2023-10-12 17:55:11,725][62634] Updated weights for policy 0, policy_version 51420 (0.0007) [2023-10-12 17:55:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105283584. Throughput: 0: 1665.6, 1: 1670.9. Samples: 26329136. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:13,435][61643] Avg episode reward: [(0, '24.610'), (1, '10.110')] [2023-10-12 17:55:14,133][62635] Updated weights for policy 1, policy_version 51400 (0.0009) [2023-10-12 17:55:14,499][62635] Updated weights for policy 1, policy_version 51410 (0.0008) [2023-10-12 17:55:14,877][62635] Updated weights for policy 1, policy_version 51420 (0.0007) [2023-10-12 17:55:15,618][62634] Updated weights for policy 0, policy_version 51430 (0.0010) [2023-10-12 17:55:16,005][62634] Updated weights for policy 0, policy_version 51440 (0.0009) [2023-10-12 17:55:16,371][62634] Updated weights for policy 0, policy_version 51450 (0.0007) [2023-10-12 17:55:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.3, 300 sec: 13329.3). Total num frames: 105349120. Throughput: 0: 1687.0, 1: 1671.2. Samples: 26349734. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:18,436][61643] Avg episode reward: [(0, '24.740'), (1, '10.110')] [2023-10-12 17:55:19,179][62635] Updated weights for policy 1, policy_version 51430 (0.0008) [2023-10-12 17:55:19,579][62635] Updated weights for policy 1, policy_version 51440 (0.0009) [2023-10-12 17:55:19,949][62635] Updated weights for policy 1, policy_version 51450 (0.0009) [2023-10-12 17:55:20,504][62634] Updated weights for policy 0, policy_version 51460 (0.0009) [2023-10-12 17:55:20,891][62634] Updated weights for policy 0, policy_version 51470 (0.0010) [2023-10-12 17:55:21,272][62634] Updated weights for policy 0, policy_version 51480 (0.0010) [2023-10-12 17:55:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105414656. Throughput: 0: 1675.1, 1: 1670.0. Samples: 26359316. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:23,436][61643] Avg episode reward: [(0, '24.510'), (1, '9.970')] [2023-10-12 17:55:24,125][62635] Updated weights for policy 1, policy_version 51460 (0.0008) [2023-10-12 17:55:24,493][62635] Updated weights for policy 1, policy_version 51470 (0.0008) [2023-10-12 17:55:24,855][62635] Updated weights for policy 1, policy_version 51480 (0.0010) [2023-10-12 17:55:25,346][62634] Updated weights for policy 0, policy_version 51490 (0.0010) [2023-10-12 17:55:25,720][62634] Updated weights for policy 0, policy_version 51500 (0.0007) [2023-10-12 17:55:26,105][62634] Updated weights for policy 0, policy_version 51510 (0.0008) [2023-10-12 17:55:26,474][62634] Updated weights for policy 0, policy_version 51520 (0.0007) [2023-10-12 17:55:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105480192. Throughput: 0: 1667.5, 1: 1667.2. Samples: 26379200. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:28,435][61643] Avg episode reward: [(0, '24.540'), (1, '9.970')] [2023-10-12 17:55:28,847][62635] Updated weights for policy 1, policy_version 51490 (0.0008) [2023-10-12 17:55:29,214][62635] Updated weights for policy 1, policy_version 51500 (0.0008) [2023-10-12 17:55:29,593][62635] Updated weights for policy 1, policy_version 51510 (0.0008) [2023-10-12 17:55:29,957][62635] Updated weights for policy 1, policy_version 51520 (0.0008) [2023-10-12 17:55:30,451][62634] Updated weights for policy 0, policy_version 51530 (0.0008) [2023-10-12 17:55:30,831][62634] Updated weights for policy 0, policy_version 51540 (0.0007) [2023-10-12 17:55:31,218][62634] Updated weights for policy 0, policy_version 51550 (0.0007) [2023-10-12 17:55:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105545728. Throughput: 0: 1676.9, 1: 1672.2. Samples: 26399798. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:33,436][61643] Avg episode reward: [(0, '24.370'), (1, '9.960')] [2023-10-12 17:55:34,018][62635] Updated weights for policy 1, policy_version 51530 (0.0009) [2023-10-12 17:55:34,402][62635] Updated weights for policy 1, policy_version 51540 (0.0008) [2023-10-12 17:55:34,778][62635] Updated weights for policy 1, policy_version 51550 (0.0011) [2023-10-12 17:55:35,218][62634] Updated weights for policy 0, policy_version 51560 (0.0007) [2023-10-12 17:55:35,595][62634] Updated weights for policy 0, policy_version 51570 (0.0008) [2023-10-12 17:55:35,976][62634] Updated weights for policy 0, policy_version 51580 (0.0009) [2023-10-12 17:55:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105611264. Throughput: 0: 1654.6, 1: 1673.4. Samples: 26409048. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:38,435][61643] Avg episode reward: [(0, '24.310'), (1, '9.760')] [2023-10-12 17:55:38,991][62635] Updated weights for policy 1, policy_version 51560 (0.0010) [2023-10-12 17:55:39,371][62635] Updated weights for policy 1, policy_version 51570 (0.0008) [2023-10-12 17:55:39,744][62635] Updated weights for policy 1, policy_version 51580 (0.0007) [2023-10-12 17:55:40,098][62634] Updated weights for policy 0, policy_version 51590 (0.0007) [2023-10-12 17:55:40,479][62634] Updated weights for policy 0, policy_version 51600 (0.0010) [2023-10-12 17:55:40,855][62634] Updated weights for policy 0, policy_version 51610 (0.0011) [2023-10-12 17:55:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105676800. Throughput: 0: 1670.2, 1: 1673.5. Samples: 26429242. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-12 17:55:43,435][61643] Avg episode reward: [(0, '24.170'), (1, '9.950')] [2023-10-12 17:55:43,784][62635] Updated weights for policy 1, policy_version 51590 (0.0009) [2023-10-12 17:55:44,158][62635] Updated weights for policy 1, policy_version 51600 (0.0011) [2023-10-12 17:55:44,529][62635] Updated weights for policy 1, policy_version 51610 (0.0011) [2023-10-12 17:55:44,970][62634] Updated weights for policy 0, policy_version 51620 (0.0007) [2023-10-12 17:55:45,345][62634] Updated weights for policy 0, policy_version 51630 (0.0007) [2023-10-12 17:55:45,723][62634] Updated weights for policy 0, policy_version 51640 (0.0010) [2023-10-12 17:55:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105742336. Throughput: 0: 1674.8, 1: 1670.4. Samples: 26449854. 
Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:55:48,435][61643] Avg episode reward: [(0, '24.080'), (1, '9.980')] [2023-10-12 17:55:48,597][62635] Updated weights for policy 1, policy_version 51620 (0.0009) [2023-10-12 17:55:48,971][62635] Updated weights for policy 1, policy_version 51630 (0.0008) [2023-10-12 17:55:49,337][62635] Updated weights for policy 1, policy_version 51640 (0.0007) [2023-10-12 17:55:49,815][62634] Updated weights for policy 0, policy_version 51650 (0.0008) [2023-10-12 17:55:50,189][62634] Updated weights for policy 0, policy_version 51660 (0.0008) [2023-10-12 17:55:50,564][62634] Updated weights for policy 0, policy_version 51670 (0.0008) [2023-10-12 17:55:50,936][62634] Updated weights for policy 0, policy_version 51680 (0.0009) [2023-10-12 17:55:53,395][62635] Updated weights for policy 1, policy_version 51650 (0.0008) [2023-10-12 17:55:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105807872. Throughput: 0: 1647.5, 1: 1671.0. Samples: 26458876. Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:55:53,435][61643] Avg episode reward: [(0, '23.880'), (1, '9.710')] [2023-10-12 17:55:53,761][62635] Updated weights for policy 1, policy_version 51660 (0.0007) [2023-10-12 17:55:54,127][62635] Updated weights for policy 1, policy_version 51670 (0.0008) [2023-10-12 17:55:54,502][62635] Updated weights for policy 1, policy_version 51680 (0.0007) [2023-10-12 17:55:54,918][62634] Updated weights for policy 0, policy_version 51690 (0.0010) [2023-10-12 17:55:55,307][62634] Updated weights for policy 0, policy_version 51700 (0.0010) [2023-10-12 17:55:55,676][62634] Updated weights for policy 0, policy_version 51710 (0.0011) [2023-10-12 17:55:58,367][62635] Updated weights for policy 1, policy_version 51690 (0.0009) [2023-10-12 17:55:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105873408. Throughput: 0: 1671.2, 1: 1672.0. 
Samples: 26479580. Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:55:58,435][61643] Avg episode reward: [(0, '23.410'), (1, '9.890')] [2023-10-12 17:55:58,737][62635] Updated weights for policy 1, policy_version 51700 (0.0009) [2023-10-12 17:55:59,113][62635] Updated weights for policy 1, policy_version 51710 (0.0010) [2023-10-12 17:55:59,870][62634] Updated weights for policy 0, policy_version 51720 (0.0008) [2023-10-12 17:56:00,254][62634] Updated weights for policy 0, policy_version 51730 (0.0007) [2023-10-12 17:56:00,626][62634] Updated weights for policy 0, policy_version 51740 (0.0007) [2023-10-12 17:56:03,304][62635] Updated weights for policy 1, policy_version 51720 (0.0009) [2023-10-12 17:56:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105938944. Throughput: 0: 1665.1, 1: 1676.2. Samples: 26500092. Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:56:03,435][61643] Avg episode reward: [(0, '23.480'), (1, '9.980')] [2023-10-12 17:56:03,671][62635] Updated weights for policy 1, policy_version 51730 (0.0009) [2023-10-12 17:56:04,042][62635] Updated weights for policy 1, policy_version 51740 (0.0008) [2023-10-12 17:56:04,801][62634] Updated weights for policy 0, policy_version 51750 (0.0008) [2023-10-12 17:56:05,177][62634] Updated weights for policy 0, policy_version 51760 (0.0008) [2023-10-12 17:56:05,556][62634] Updated weights for policy 0, policy_version 51770 (0.0010) [2023-10-12 17:56:08,212][62635] Updated weights for policy 1, policy_version 51750 (0.0009) [2023-10-12 17:56:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106004480. Throughput: 0: 1649.3, 1: 1681.5. Samples: 26509200. 
Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:56:08,435][61643] Avg episode reward: [(0, '23.810'), (1, '10.010')] [2023-10-12 17:56:08,588][62635] Updated weights for policy 1, policy_version 51760 (0.0011) [2023-10-12 17:56:08,957][62635] Updated weights for policy 1, policy_version 51770 (0.0011) [2023-10-12 17:56:09,596][62634] Updated weights for policy 0, policy_version 51780 (0.0009) [2023-10-12 17:56:09,976][62634] Updated weights for policy 0, policy_version 51790 (0.0008) [2023-10-12 17:56:10,355][62634] Updated weights for policy 0, policy_version 51800 (0.0009) [2023-10-12 17:56:12,995][62635] Updated weights for policy 1, policy_version 51780 (0.0009) [2023-10-12 17:56:13,364][62635] Updated weights for policy 1, policy_version 51790 (0.0010) [2023-10-12 17:56:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106070016. Throughput: 0: 1664.2, 1: 1681.0. Samples: 26529736. Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:56:13,436][61643] Avg episode reward: [(0, '23.630'), (1, '10.100')] [2023-10-12 17:56:13,730][62635] Updated weights for policy 1, policy_version 51800 (0.0009) [2023-10-12 17:56:14,426][62634] Updated weights for policy 0, policy_version 51810 (0.0008) [2023-10-12 17:56:14,797][62634] Updated weights for policy 0, policy_version 51820 (0.0008) [2023-10-12 17:56:15,186][62634] Updated weights for policy 0, policy_version 51830 (0.0008) [2023-10-12 17:56:15,556][62634] Updated weights for policy 0, policy_version 51840 (0.0008) [2023-10-12 17:56:17,579][62635] Updated weights for policy 1, policy_version 51810 (0.0007) [2023-10-12 17:56:17,943][62635] Updated weights for policy 1, policy_version 51820 (0.0007) [2023-10-12 17:56:18,317][62635] Updated weights for policy 1, policy_version 51830 (0.0009) [2023-10-12 17:56:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 106135552. Throughput: 0: 1664.5, 1: 1676.7. 
Samples: 26550150. Policy #0 lag: (min: 1.0, avg: 1.2, max: 10.0) [2023-10-12 17:56:18,436][61643] Avg episode reward: [(0, '23.650'), (1, '10.100')] [2023-10-12 17:56:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000051840_53084160.pth... [2023-10-12 17:56:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000050272_51478528.pth [2023-10-12 17:56:18,692][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000051840_53084160.pth... [2023-10-12 17:56:18,699][62635] Updated weights for policy 1, policy_version 51840 (0.0010) [2023-10-12 17:56:18,721][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000050240_51445760.pth [2023-10-12 17:56:19,781][62634] Updated weights for policy 0, policy_version 51850 (0.0009) [2023-10-12 17:56:20,168][62634] Updated weights for policy 0, policy_version 51860 (0.0008) [2023-10-12 17:56:20,544][62634] Updated weights for policy 0, policy_version 51870 (0.0007) [2023-10-12 17:56:22,804][62635] Updated weights for policy 1, policy_version 51850 (0.0008) [2023-10-12 17:56:23,177][62635] Updated weights for policy 1, policy_version 51860 (0.0007) [2023-10-12 17:56:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 106201088. Throughput: 0: 1657.5, 1: 1689.9. Samples: 26559680. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:23,436][61643] Avg episode reward: [(0, '24.100'), (1, '9.960')] [2023-10-12 17:56:23,545][62635] Updated weights for policy 1, policy_version 51870 (0.0009) [2023-10-12 17:56:24,631][62634] Updated weights for policy 0, policy_version 51880 (0.0008) [2023-10-12 17:56:25,000][62634] Updated weights for policy 0, policy_version 51890 (0.0008) [2023-10-12 17:56:25,390][62634] Updated weights for policy 0, policy_version 51900 (0.0008) [2023-10-12 17:56:27,556][62635] Updated weights for policy 1, policy_version 51880 (0.0009) [2023-10-12 17:56:27,921][62635] Updated weights for policy 1, policy_version 51890 (0.0007) [2023-10-12 17:56:28,284][62635] Updated weights for policy 1, policy_version 51900 (0.0007) [2023-10-12 17:56:28,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106299392. Throughput: 0: 1668.7, 1: 1696.1. Samples: 26580656. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:28,436][61643] Avg episode reward: [(0, '23.880'), (1, '9.960')] [2023-10-12 17:56:29,364][62634] Updated weights for policy 0, policy_version 51910 (0.0010) [2023-10-12 17:56:29,741][62634] Updated weights for policy 0, policy_version 51920 (0.0009) [2023-10-12 17:56:30,115][62634] Updated weights for policy 0, policy_version 51930 (0.0007) [2023-10-12 17:56:32,291][62635] Updated weights for policy 1, policy_version 51910 (0.0008) [2023-10-12 17:56:32,660][62635] Updated weights for policy 1, policy_version 51920 (0.0008) [2023-10-12 17:56:33,023][62635] Updated weights for policy 1, policy_version 51930 (0.0008) [2023-10-12 17:56:33,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106364928. Throughput: 0: 1673.1, 1: 1674.0. Samples: 26600476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:33,435][61643] Avg episode reward: [(0, '23.740'), (1, '10.140')] [2023-10-12 17:56:34,166][62634] Updated weights for policy 0, policy_version 51940 (0.0010) [2023-10-12 17:56:34,539][62634] Updated weights for policy 0, policy_version 51950 (0.0008) [2023-10-12 17:56:34,909][62634] Updated weights for policy 0, policy_version 51960 (0.0008) [2023-10-12 17:56:36,922][62635] Updated weights for policy 1, policy_version 51940 (0.0007) [2023-10-12 17:56:37,300][62635] Updated weights for policy 1, policy_version 51950 (0.0009) [2023-10-12 17:56:37,669][62635] Updated weights for policy 1, policy_version 51960 (0.0010) [2023-10-12 17:56:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106430464. Throughput: 0: 1673.3, 1: 1700.9. Samples: 26610716. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:38,436][61643] Avg episode reward: [(0, '23.900'), (1, '9.890')] [2023-10-12 17:56:38,947][62634] Updated weights for policy 0, policy_version 51970 (0.0009) [2023-10-12 17:56:39,320][62634] Updated weights for policy 0, policy_version 51980 (0.0008) [2023-10-12 17:56:39,693][62634] Updated weights for policy 0, policy_version 51990 (0.0007) [2023-10-12 17:56:40,076][62634] Updated weights for policy 0, policy_version 52000 (0.0008) [2023-10-12 17:56:41,768][62635] Updated weights for policy 1, policy_version 51970 (0.0011) [2023-10-12 17:56:42,144][62635] Updated weights for policy 1, policy_version 51980 (0.0007) [2023-10-12 17:56:42,518][62635] Updated weights for policy 1, policy_version 51990 (0.0007) [2023-10-12 17:56:42,889][62635] Updated weights for policy 1, policy_version 52000 (0.0007) [2023-10-12 17:56:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106496000. Throughput: 0: 1679.4, 1: 1687.6. Samples: 26631094. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:43,436][61643] Avg episode reward: [(0, '24.000'), (1, '9.980')] [2023-10-12 17:56:44,018][62634] Updated weights for policy 0, policy_version 52010 (0.0008) [2023-10-12 17:56:44,396][62634] Updated weights for policy 0, policy_version 52020 (0.0007) [2023-10-12 17:56:44,774][62634] Updated weights for policy 0, policy_version 52030 (0.0007) [2023-10-12 17:56:47,142][62635] Updated weights for policy 1, policy_version 52010 (0.0010) [2023-10-12 17:56:47,521][62635] Updated weights for policy 1, policy_version 52020 (0.0009) [2023-10-12 17:56:47,883][62635] Updated weights for policy 1, policy_version 52030 (0.0009) [2023-10-12 17:56:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106561536. Throughput: 0: 1684.7, 1: 1665.0. Samples: 26650830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:48,435][61643] Avg episode reward: [(0, '23.870'), (1, '9.870')] [2023-10-12 17:56:48,917][62634] Updated weights for policy 0, policy_version 52040 (0.0009) [2023-10-12 17:56:49,287][62634] Updated weights for policy 0, policy_version 52050 (0.0009) [2023-10-12 17:56:49,665][62634] Updated weights for policy 0, policy_version 52060 (0.0009) [2023-10-12 17:56:51,883][62635] Updated weights for policy 1, policy_version 52040 (0.0007) [2023-10-12 17:56:52,256][62635] Updated weights for policy 1, policy_version 52050 (0.0007) [2023-10-12 17:56:52,611][62635] Updated weights for policy 1, policy_version 52060 (0.0007) [2023-10-12 17:56:53,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106627072. Throughput: 0: 1680.4, 1: 1693.6. Samples: 26661030. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:56:53,435][61643] Avg episode reward: [(0, '23.220'), (1, '9.850')] [2023-10-12 17:56:53,548][62634] Updated weights for policy 0, policy_version 52070 (0.0008) [2023-10-12 17:56:53,936][62634] Updated weights for policy 0, policy_version 52080 (0.0008) [2023-10-12 17:56:54,305][62634] Updated weights for policy 0, policy_version 52090 (0.0007) [2023-10-12 17:56:56,699][62635] Updated weights for policy 1, policy_version 52070 (0.0008) [2023-10-12 17:56:57,066][62635] Updated weights for policy 1, policy_version 52080 (0.0008) [2023-10-12 17:56:57,434][62635] Updated weights for policy 1, policy_version 52090 (0.0008) [2023-10-12 17:56:58,386][62634] Updated weights for policy 0, policy_version 52100 (0.0008) [2023-10-12 17:56:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106692608. Throughput: 0: 1686.5, 1: 1679.0. Samples: 26681186. Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:56:58,436][61643] Avg episode reward: [(0, '23.600'), (1, '9.670')] [2023-10-12 17:56:58,770][62634] Updated weights for policy 0, policy_version 52110 (0.0010) [2023-10-12 17:56:59,154][62634] Updated weights for policy 0, policy_version 52120 (0.0009) [2023-10-12 17:57:01,434][62635] Updated weights for policy 1, policy_version 52100 (0.0007) [2023-10-12 17:57:01,801][62635] Updated weights for policy 1, policy_version 52110 (0.0009) [2023-10-12 17:57:02,173][62635] Updated weights for policy 1, policy_version 52120 (0.0009) [2023-10-12 17:57:03,023][62634] Updated weights for policy 0, policy_version 52130 (0.0007) [2023-10-12 17:57:03,406][62634] Updated weights for policy 0, policy_version 52140 (0.0009) [2023-10-12 17:57:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106758144. Throughput: 0: 1690.7, 1: 1669.4. Samples: 26701354. 
Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:57:03,435][61643] Avg episode reward: [(0, '23.480'), (1, '9.850')] [2023-10-12 17:57:03,779][62634] Updated weights for policy 0, policy_version 52150 (0.0009) [2023-10-12 17:57:04,157][62634] Updated weights for policy 0, policy_version 52160 (0.0010) [2023-10-12 17:57:06,237][62635] Updated weights for policy 1, policy_version 52130 (0.0009) [2023-10-12 17:57:06,601][62635] Updated weights for policy 1, policy_version 52140 (0.0007) [2023-10-12 17:57:06,965][62635] Updated weights for policy 1, policy_version 52150 (0.0007) [2023-10-12 17:57:07,332][62635] Updated weights for policy 1, policy_version 52160 (0.0007) [2023-10-12 17:57:08,269][62634] Updated weights for policy 0, policy_version 52170 (0.0007) [2023-10-12 17:57:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106823680. Throughput: 0: 1694.5, 1: 1684.3. Samples: 26711724. Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:57:08,435][61643] Avg episode reward: [(0, '23.910'), (1, '9.590')] [2023-10-12 17:57:08,660][62634] Updated weights for policy 0, policy_version 52180 (0.0008) [2023-10-12 17:57:09,038][62634] Updated weights for policy 0, policy_version 52190 (0.0008) [2023-10-12 17:57:11,346][62635] Updated weights for policy 1, policy_version 52170 (0.0008) [2023-10-12 17:57:11,716][62635] Updated weights for policy 1, policy_version 52180 (0.0010) [2023-10-12 17:57:12,083][62635] Updated weights for policy 1, policy_version 52190 (0.0011) [2023-10-12 17:57:13,143][62634] Updated weights for policy 0, policy_version 52200 (0.0008) [2023-10-12 17:57:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 106889216. Throughput: 0: 1693.1, 1: 1661.3. Samples: 26731604. 
Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:57:13,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.500')] [2023-10-12 17:57:13,529][62634] Updated weights for policy 0, policy_version 52210 (0.0009) [2023-10-12 17:57:13,909][62634] Updated weights for policy 0, policy_version 52220 (0.0010) [2023-10-12 17:57:16,091][62635] Updated weights for policy 1, policy_version 52200 (0.0009) [2023-10-12 17:57:16,460][62635] Updated weights for policy 1, policy_version 52210 (0.0008) [2023-10-12 17:57:16,827][62635] Updated weights for policy 1, policy_version 52220 (0.0008) [2023-10-12 17:57:17,773][62634] Updated weights for policy 0, policy_version 52230 (0.0008) [2023-10-12 17:57:18,156][62634] Updated weights for policy 0, policy_version 52240 (0.0009) [2023-10-12 17:57:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 106954752. Throughput: 0: 1681.7, 1: 1679.4. Samples: 26751728. Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:57:18,436][61643] Avg episode reward: [(0, '24.420'), (1, '9.770')] [2023-10-12 17:57:18,538][62634] Updated weights for policy 0, policy_version 52250 (0.0010) [2023-10-12 17:57:20,907][62635] Updated weights for policy 1, policy_version 52230 (0.0008) [2023-10-12 17:57:21,274][62635] Updated weights for policy 1, policy_version 52240 (0.0008) [2023-10-12 17:57:21,636][62635] Updated weights for policy 1, policy_version 52250 (0.0010) [2023-10-12 17:57:22,686][62634] Updated weights for policy 0, policy_version 52260 (0.0010) [2023-10-12 17:57:23,065][62634] Updated weights for policy 0, policy_version 52270 (0.0010) [2023-10-12 17:57:23,432][62634] Updated weights for policy 0, policy_version 52280 (0.0010) [2023-10-12 17:57:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 107020288. Throughput: 0: 1687.8, 1: 1675.1. Samples: 26762046. 
Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:57:23,435][61643] Avg episode reward: [(0, '24.390'), (1, '9.610')] [2023-10-12 17:57:25,776][62635] Updated weights for policy 1, policy_version 52260 (0.0008) [2023-10-12 17:57:26,145][62635] Updated weights for policy 1, policy_version 52270 (0.0008) [2023-10-12 17:57:26,509][62635] Updated weights for policy 1, policy_version 52280 (0.0010) [2023-10-12 17:57:27,614][62634] Updated weights for policy 0, policy_version 52290 (0.0010) [2023-10-12 17:57:27,996][62634] Updated weights for policy 0, policy_version 52300 (0.0010) [2023-10-12 17:57:28,368][62634] Updated weights for policy 0, policy_version 52310 (0.0010) [2023-10-12 17:57:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107085824. Throughput: 0: 1686.1, 1: 1664.7. Samples: 26781876. Policy #0 lag: (min: 16.0, avg: 40.3, max: 48.0) [2023-10-12 17:57:28,436][61643] Avg episode reward: [(0, '24.330'), (1, '9.520')] [2023-10-12 17:57:28,736][62634] Updated weights for policy 0, policy_version 52320 (0.0011) [2023-10-12 17:57:30,512][62635] Updated weights for policy 1, policy_version 52290 (0.0010) [2023-10-12 17:57:30,877][62635] Updated weights for policy 1, policy_version 52300 (0.0007) [2023-10-12 17:57:31,245][62635] Updated weights for policy 1, policy_version 52310 (0.0010) [2023-10-12 17:57:31,615][62635] Updated weights for policy 1, policy_version 52320 (0.0008) [2023-10-12 17:57:32,763][62634] Updated weights for policy 0, policy_version 52330 (0.0007) [2023-10-12 17:57:33,148][62634] Updated weights for policy 0, policy_version 52340 (0.0008) [2023-10-12 17:57:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107151360. Throughput: 0: 1673.6, 1: 1689.6. Samples: 26802174. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:57:33,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.690')] [2023-10-12 17:57:33,523][62634] Updated weights for policy 0, policy_version 52350 (0.0009) [2023-10-12 17:57:35,624][62635] Updated weights for policy 1, policy_version 52330 (0.0008) [2023-10-12 17:57:35,997][62635] Updated weights for policy 1, policy_version 52340 (0.0008) [2023-10-12 17:57:36,369][62635] Updated weights for policy 1, policy_version 52350 (0.0008) [2023-10-12 17:57:37,529][62634] Updated weights for policy 0, policy_version 52360 (0.0009) [2023-10-12 17:57:37,910][62634] Updated weights for policy 0, policy_version 52370 (0.0009) [2023-10-12 17:57:38,283][62634] Updated weights for policy 0, policy_version 52380 (0.0008) [2023-10-12 17:57:38,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107249664. Throughput: 0: 1695.6, 1: 1673.3. Samples: 26812630. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:57:38,436][61643] Avg episode reward: [(0, '23.900'), (1, '9.640')] [2023-10-12 17:57:40,402][62635] Updated weights for policy 1, policy_version 52360 (0.0008) [2023-10-12 17:57:40,775][62635] Updated weights for policy 1, policy_version 52370 (0.0007) [2023-10-12 17:57:41,140][62635] Updated weights for policy 1, policy_version 52380 (0.0011) [2023-10-12 17:57:42,355][62634] Updated weights for policy 0, policy_version 52390 (0.0011) [2023-10-12 17:57:42,741][62634] Updated weights for policy 0, policy_version 52400 (0.0008) [2023-10-12 17:57:43,116][62634] Updated weights for policy 0, policy_version 52410 (0.0008) [2023-10-12 17:57:43,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 107315200. Throughput: 0: 1701.6, 1: 1676.3. Samples: 26833194. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:57:43,435][61643] Avg episode reward: [(0, '23.980'), (1, '9.640')] [2023-10-12 17:57:45,189][62635] Updated weights for policy 1, policy_version 52390 (0.0008) [2023-10-12 17:57:45,553][62635] Updated weights for policy 1, policy_version 52400 (0.0007) [2023-10-12 17:57:45,918][62635] Updated weights for policy 1, policy_version 52410 (0.0009) [2023-10-12 17:57:47,020][62634] Updated weights for policy 0, policy_version 52420 (0.0009) [2023-10-12 17:57:47,393][62634] Updated weights for policy 0, policy_version 52430 (0.0009) [2023-10-12 17:57:47,776][62634] Updated weights for policy 0, policy_version 52440 (0.0008) [2023-10-12 17:57:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107380736. Throughput: 0: 1674.6, 1: 1691.6. Samples: 26852834. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:57:48,435][61643] Avg episode reward: [(0, '24.100'), (1, '9.830')] [2023-10-12 17:57:50,114][62635] Updated weights for policy 1, policy_version 52420 (0.0008) [2023-10-12 17:57:50,493][62635] Updated weights for policy 1, policy_version 52430 (0.0010) [2023-10-12 17:57:50,857][62635] Updated weights for policy 1, policy_version 52440 (0.0008) [2023-10-12 17:57:51,848][62634] Updated weights for policy 0, policy_version 52450 (0.0008) [2023-10-12 17:57:52,230][62634] Updated weights for policy 0, policy_version 52460 (0.0010) [2023-10-12 17:57:52,597][62634] Updated weights for policy 0, policy_version 52470 (0.0010) [2023-10-12 17:57:52,977][62634] Updated weights for policy 0, policy_version 52480 (0.0010) [2023-10-12 17:57:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107446272. Throughput: 0: 1697.4, 1: 1665.4. Samples: 26863052. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:57:53,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.890')] [2023-10-12 17:57:54,841][62635] Updated weights for policy 1, policy_version 52450 (0.0007) [2023-10-12 17:57:55,210][62635] Updated weights for policy 1, policy_version 52460 (0.0007) [2023-10-12 17:57:55,577][62635] Updated weights for policy 1, policy_version 52470 (0.0008) [2023-10-12 17:57:55,950][62635] Updated weights for policy 1, policy_version 52480 (0.0007) [2023-10-12 17:57:57,141][62634] Updated weights for policy 0, policy_version 52490 (0.0009) [2023-10-12 17:57:57,523][62634] Updated weights for policy 0, policy_version 52500 (0.0009) [2023-10-12 17:57:57,893][62634] Updated weights for policy 0, policy_version 52510 (0.0009) [2023-10-12 17:57:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 107511808. Throughput: 0: 1690.0, 1: 1681.9. Samples: 26883342. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:57:58,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.720')] [2023-10-12 17:58:00,033][62635] Updated weights for policy 1, policy_version 52490 (0.0010) [2023-10-12 17:58:00,402][62635] Updated weights for policy 1, policy_version 52500 (0.0008) [2023-10-12 17:58:00,757][62635] Updated weights for policy 1, policy_version 52510 (0.0008) [2023-10-12 17:58:01,639][62634] Updated weights for policy 0, policy_version 52520 (0.0008) [2023-10-12 17:58:02,018][62634] Updated weights for policy 0, policy_version 52530 (0.0008) [2023-10-12 17:58:02,405][62634] Updated weights for policy 0, policy_version 52540 (0.0009) [2023-10-12 17:58:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107577344. Throughput: 0: 1674.6, 1: 1688.2. Samples: 26903052. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:03,436][61643] Avg episode reward: [(0, '24.460'), (1, '9.950')] [2023-10-12 17:58:04,825][62635] Updated weights for policy 1, policy_version 52520 (0.0010) [2023-10-12 17:58:05,195][62635] Updated weights for policy 1, policy_version 52530 (0.0009) [2023-10-12 17:58:05,562][62635] Updated weights for policy 1, policy_version 52540 (0.0008) [2023-10-12 17:58:06,524][62634] Updated weights for policy 0, policy_version 52550 (0.0008) [2023-10-12 17:58:06,904][62634] Updated weights for policy 0, policy_version 52560 (0.0008) [2023-10-12 17:58:07,281][62634] Updated weights for policy 0, policy_version 52570 (0.0008) [2023-10-12 17:58:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107642880. Throughput: 0: 1696.9, 1: 1667.9. Samples: 26913462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:08,436][61643] Avg episode reward: [(0, '24.800'), (1, '9.920')] [2023-10-12 17:58:09,581][62635] Updated weights for policy 1, policy_version 52550 (0.0007) [2023-10-12 17:58:09,946][62635] Updated weights for policy 1, policy_version 52560 (0.0008) [2023-10-12 17:58:10,319][62635] Updated weights for policy 1, policy_version 52570 (0.0009) [2023-10-12 17:58:11,390][62634] Updated weights for policy 0, policy_version 52580 (0.0007) [2023-10-12 17:58:11,764][62634] Updated weights for policy 0, policy_version 52590 (0.0008) [2023-10-12 17:58:12,143][62634] Updated weights for policy 0, policy_version 52600 (0.0007) [2023-10-12 17:58:13,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107708416. Throughput: 0: 1679.3, 1: 1689.5. Samples: 26933474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:13,435][61643] Avg episode reward: [(0, '24.760'), (1, '9.860')] [2023-10-12 17:58:14,576][62635] Updated weights for policy 1, policy_version 52580 (0.0009) [2023-10-12 17:58:14,943][62635] Updated weights for policy 1, policy_version 52590 (0.0008) [2023-10-12 17:58:15,317][62635] Updated weights for policy 1, policy_version 52600 (0.0009) [2023-10-12 17:58:16,320][62634] Updated weights for policy 0, policy_version 52610 (0.0008) [2023-10-12 17:58:16,695][62634] Updated weights for policy 0, policy_version 52620 (0.0008) [2023-10-12 17:58:17,068][62634] Updated weights for policy 0, policy_version 52630 (0.0008) [2023-10-12 17:58:17,444][62634] Updated weights for policy 0, policy_version 52640 (0.0008) [2023-10-12 17:58:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107773952. Throughput: 0: 1681.5, 1: 1684.2. Samples: 26953632. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:18,436][61643] Avg episode reward: [(0, '24.520'), (1, '9.940')] [2023-10-12 17:58:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000052640_53903360.pth... [2023-10-12 17:58:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000052608_53870592.pth... 
[2023-10-12 17:58:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000051072_52297728.pth [2023-10-12 17:58:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000051040_52264960.pth [2023-10-12 17:58:19,294][62635] Updated weights for policy 1, policy_version 52610 (0.0008) [2023-10-12 17:58:19,660][62635] Updated weights for policy 1, policy_version 52620 (0.0007) [2023-10-12 17:58:20,029][62635] Updated weights for policy 1, policy_version 52630 (0.0007) [2023-10-12 17:58:20,395][62635] Updated weights for policy 1, policy_version 52640 (0.0007) [2023-10-12 17:58:21,285][62634] Updated weights for policy 0, policy_version 52650 (0.0007) [2023-10-12 17:58:21,662][62634] Updated weights for policy 0, policy_version 52660 (0.0007) [2023-10-12 17:58:22,029][62634] Updated weights for policy 0, policy_version 52670 (0.0007) [2023-10-12 17:58:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107839488. Throughput: 0: 1693.2, 1: 1672.4. Samples: 26964082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:23,436][61643] Avg episode reward: [(0, '24.460'), (1, '9.750')] [2023-10-12 17:58:24,488][62635] Updated weights for policy 1, policy_version 52650 (0.0009) [2023-10-12 17:58:24,852][62635] Updated weights for policy 1, policy_version 52660 (0.0010) [2023-10-12 17:58:25,217][62635] Updated weights for policy 1, policy_version 52670 (0.0011) [2023-10-12 17:58:26,191][62634] Updated weights for policy 0, policy_version 52680 (0.0007) [2023-10-12 17:58:26,566][62634] Updated weights for policy 0, policy_version 52690 (0.0009) [2023-10-12 17:58:26,948][62634] Updated weights for policy 0, policy_version 52700 (0.0009) [2023-10-12 17:58:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107905024. Throughput: 0: 1662.1, 1: 1682.8. Samples: 26983718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:28,436][61643] Avg episode reward: [(0, '24.250'), (1, '9.940')] [2023-10-12 17:58:29,348][62635] Updated weights for policy 1, policy_version 52680 (0.0009) [2023-10-12 17:58:29,714][62635] Updated weights for policy 1, policy_version 52690 (0.0008) [2023-10-12 17:58:30,079][62635] Updated weights for policy 1, policy_version 52700 (0.0007) [2023-10-12 17:58:30,721][62634] Updated weights for policy 0, policy_version 52710 (0.0008) [2023-10-12 17:58:31,102][62634] Updated weights for policy 0, policy_version 52720 (0.0009) [2023-10-12 17:58:31,478][62634] Updated weights for policy 0, policy_version 52730 (0.0007) [2023-10-12 17:58:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107970560. Throughput: 0: 1686.5, 1: 1685.0. Samples: 27004554. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 17:58:33,435][61643] Avg episode reward: [(0, '24.120'), (1, '9.960')] [2023-10-12 17:58:34,159][62635] Updated weights for policy 1, policy_version 52710 (0.0008) [2023-10-12 17:58:34,540][62635] Updated weights for policy 1, policy_version 52720 (0.0009) [2023-10-12 17:58:34,914][62635] Updated weights for policy 1, policy_version 52730 (0.0008) [2023-10-12 17:58:35,531][62634] Updated weights for policy 0, policy_version 52740 (0.0008) [2023-10-12 17:58:35,911][62634] Updated weights for policy 0, policy_version 52750 (0.0007) [2023-10-12 17:58:36,277][62634] Updated weights for policy 0, policy_version 52760 (0.0007) [2023-10-12 17:58:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108036096. Throughput: 0: 1678.0, 1: 1680.6. Samples: 27014192. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:58:38,436][61643] Avg episode reward: [(0, '24.050'), (1, '9.840')] [2023-10-12 17:58:38,820][62635] Updated weights for policy 1, policy_version 52740 (0.0010) [2023-10-12 17:58:39,182][62635] Updated weights for policy 1, policy_version 52750 (0.0007) [2023-10-12 17:58:39,557][62635] Updated weights for policy 1, policy_version 52760 (0.0007) [2023-10-12 17:58:40,351][62634] Updated weights for policy 0, policy_version 52770 (0.0007) [2023-10-12 17:58:40,735][62634] Updated weights for policy 0, policy_version 52780 (0.0007) [2023-10-12 17:58:41,113][62634] Updated weights for policy 0, policy_version 52790 (0.0008) [2023-10-12 17:58:41,484][62634] Updated weights for policy 0, policy_version 52800 (0.0009) [2023-10-12 17:58:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108101632. Throughput: 0: 1667.2, 1: 1684.7. Samples: 27034178. Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:58:43,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.930')] [2023-10-12 17:58:43,673][62635] Updated weights for policy 1, policy_version 52770 (0.0009) [2023-10-12 17:58:44,040][62635] Updated weights for policy 1, policy_version 52780 (0.0010) [2023-10-12 17:58:44,412][62635] Updated weights for policy 1, policy_version 52790 (0.0008) [2023-10-12 17:58:44,783][62635] Updated weights for policy 1, policy_version 52800 (0.0007) [2023-10-12 17:58:45,533][62634] Updated weights for policy 0, policy_version 52810 (0.0008) [2023-10-12 17:58:45,907][62634] Updated weights for policy 0, policy_version 52820 (0.0007) [2023-10-12 17:58:46,283][62634] Updated weights for policy 0, policy_version 52830 (0.0008) [2023-10-12 17:58:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 108167168. Throughput: 0: 1692.3, 1: 1683.1. Samples: 27054944. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:58:48,436][61643] Avg episode reward: [(0, '24.260'), (1, '10.050')] [2023-10-12 17:58:48,902][62635] Updated weights for policy 1, policy_version 52810 (0.0008) [2023-10-12 17:58:49,259][62635] Updated weights for policy 1, policy_version 52820 (0.0008) [2023-10-12 17:58:49,634][62635] Updated weights for policy 1, policy_version 52830 (0.0009) [2023-10-12 17:58:50,260][62634] Updated weights for policy 0, policy_version 52840 (0.0007) [2023-10-12 17:58:50,635][62634] Updated weights for policy 0, policy_version 52850 (0.0008) [2023-10-12 17:58:51,013][62634] Updated weights for policy 0, policy_version 52860 (0.0008) [2023-10-12 17:58:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108232704. Throughput: 0: 1669.2, 1: 1683.8. Samples: 27064348. Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:58:53,436][61643] Avg episode reward: [(0, '24.670'), (1, '9.790')] [2023-10-12 17:58:53,750][62635] Updated weights for policy 1, policy_version 52840 (0.0011) [2023-10-12 17:58:54,121][62635] Updated weights for policy 1, policy_version 52850 (0.0007) [2023-10-12 17:58:54,488][62635] Updated weights for policy 1, policy_version 52860 (0.0010) [2023-10-12 17:58:54,918][62634] Updated weights for policy 0, policy_version 52870 (0.0010) [2023-10-12 17:58:55,296][62634] Updated weights for policy 0, policy_version 52880 (0.0009) [2023-10-12 17:58:55,673][62634] Updated weights for policy 0, policy_version 52890 (0.0007) [2023-10-12 17:58:58,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108298240. Throughput: 0: 1683.2, 1: 1681.5. Samples: 27084886. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:58:58,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.880')] [2023-10-12 17:58:58,646][62635] Updated weights for policy 1, policy_version 52870 (0.0007) [2023-10-12 17:58:59,022][62635] Updated weights for policy 1, policy_version 52880 (0.0010) [2023-10-12 17:58:59,393][62635] Updated weights for policy 1, policy_version 52890 (0.0009) [2023-10-12 17:58:59,804][62634] Updated weights for policy 0, policy_version 52900 (0.0007) [2023-10-12 17:59:00,183][62634] Updated weights for policy 0, policy_version 52910 (0.0008) [2023-10-12 17:59:00,560][62634] Updated weights for policy 0, policy_version 52920 (0.0009) [2023-10-12 17:59:03,352][62635] Updated weights for policy 1, policy_version 52900 (0.0009) [2023-10-12 17:59:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 108363776. Throughput: 0: 1692.9, 1: 1684.2. Samples: 27105604. Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:59:03,435][61643] Avg episode reward: [(0, '24.630'), (1, '9.970')] [2023-10-12 17:59:03,715][62635] Updated weights for policy 1, policy_version 52910 (0.0010) [2023-10-12 17:59:04,078][62635] Updated weights for policy 1, policy_version 52920 (0.0008) [2023-10-12 17:59:04,520][62634] Updated weights for policy 0, policy_version 52930 (0.0008) [2023-10-12 17:59:04,898][62634] Updated weights for policy 0, policy_version 52940 (0.0007) [2023-10-12 17:59:05,279][62634] Updated weights for policy 0, policy_version 52950 (0.0008) [2023-10-12 17:59:05,661][62634] Updated weights for policy 0, policy_version 52960 (0.0009) [2023-10-12 17:59:08,216][62635] Updated weights for policy 1, policy_version 52930 (0.0007) [2023-10-12 17:59:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108429312. Throughput: 0: 1666.6, 1: 1682.6. Samples: 27114794. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) [2023-10-12 17:59:08,436][61643] Avg episode reward: [(0, '24.490'), (1, '9.950')] [2023-10-12 17:59:08,573][62635] Updated weights for policy 1, policy_version 52940 (0.0007) [2023-10-12 17:59:08,934][62635] Updated weights for policy 1, policy_version 52950 (0.0008) [2023-10-12 17:59:09,307][62635] Updated weights for policy 1, policy_version 52960 (0.0007) [2023-10-12 17:59:09,718][62634] Updated weights for policy 0, policy_version 52970 (0.0008) [2023-10-12 17:59:10,101][62634] Updated weights for policy 0, policy_version 52980 (0.0008) [2023-10-12 17:59:10,486][62634] Updated weights for policy 0, policy_version 52990 (0.0010) [2023-10-12 17:59:13,304][62635] Updated weights for policy 1, policy_version 52970 (0.0007) [2023-10-12 17:59:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108494848. Throughput: 0: 1684.7, 1: 1686.9. Samples: 27135442. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:13,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.970')] [2023-10-12 17:59:13,667][62635] Updated weights for policy 1, policy_version 52980 (0.0007) [2023-10-12 17:59:14,044][62635] Updated weights for policy 1, policy_version 52990 (0.0007) [2023-10-12 17:59:14,498][62634] Updated weights for policy 0, policy_version 53000 (0.0010) [2023-10-12 17:59:14,877][62634] Updated weights for policy 0, policy_version 53010 (0.0008) [2023-10-12 17:59:15,254][62634] Updated weights for policy 0, policy_version 53020 (0.0007) [2023-10-12 17:59:18,075][62635] Updated weights for policy 1, policy_version 53000 (0.0007) [2023-10-12 17:59:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 108560384. Throughput: 0: 1686.3, 1: 1681.0. Samples: 27156082. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:18,436][61643] Avg episode reward: [(0, '24.550'), (1, '9.960')] [2023-10-12 17:59:18,440][62635] Updated weights for policy 1, policy_version 53010 (0.0008) [2023-10-12 17:59:18,813][62635] Updated weights for policy 1, policy_version 53020 (0.0007) [2023-10-12 17:59:19,468][62634] Updated weights for policy 0, policy_version 53030 (0.0007) [2023-10-12 17:59:19,845][62634] Updated weights for policy 0, policy_version 53040 (0.0008) [2023-10-12 17:59:20,228][62634] Updated weights for policy 0, policy_version 53050 (0.0009) [2023-10-12 17:59:23,001][62635] Updated weights for policy 1, policy_version 53030 (0.0008) [2023-10-12 17:59:23,387][62635] Updated weights for policy 1, policy_version 53040 (0.0008) [2023-10-12 17:59:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108625920. Throughput: 0: 1667.8, 1: 1689.5. Samples: 27165272. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:23,435][61643] Avg episode reward: [(0, '24.780'), (1, '9.950')] [2023-10-12 17:59:23,745][62635] Updated weights for policy 1, policy_version 53050 (0.0009) [2023-10-12 17:59:24,432][62634] Updated weights for policy 0, policy_version 53060 (0.0009) [2023-10-12 17:59:24,805][62634] Updated weights for policy 0, policy_version 53070 (0.0008) [2023-10-12 17:59:25,191][62634] Updated weights for policy 0, policy_version 53080 (0.0009) [2023-10-12 17:59:27,875][62635] Updated weights for policy 1, policy_version 53060 (0.0008) [2023-10-12 17:59:28,243][62635] Updated weights for policy 1, policy_version 53070 (0.0008) [2023-10-12 17:59:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108691456. Throughput: 0: 1687.5, 1: 1682.3. Samples: 27185820. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:28,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.870')] [2023-10-12 17:59:28,607][62635] Updated weights for policy 1, policy_version 53080 (0.0010) [2023-10-12 17:59:29,235][62634] Updated weights for policy 0, policy_version 53090 (0.0010) [2023-10-12 17:59:29,610][62634] Updated weights for policy 0, policy_version 53100 (0.0008) [2023-10-12 17:59:29,985][62634] Updated weights for policy 0, policy_version 53110 (0.0011) [2023-10-12 17:59:30,359][62634] Updated weights for policy 0, policy_version 53120 (0.0010) [2023-10-12 17:59:32,648][62635] Updated weights for policy 1, policy_version 53090 (0.0008) [2023-10-12 17:59:33,029][62635] Updated weights for policy 1, policy_version 53100 (0.0009) [2023-10-12 17:59:33,402][62635] Updated weights for policy 1, policy_version 53110 (0.0008) [2023-10-12 17:59:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108756992. Throughput: 0: 1688.5, 1: 1674.6. Samples: 27206284. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:33,435][61643] Avg episode reward: [(0, '24.950'), (1, '9.870')] [2023-10-12 17:59:33,758][62635] Updated weights for policy 1, policy_version 53120 (0.0009) [2023-10-12 17:59:34,314][62634] Updated weights for policy 0, policy_version 53130 (0.0008) [2023-10-12 17:59:34,705][62634] Updated weights for policy 0, policy_version 53140 (0.0008) [2023-10-12 17:59:35,083][62634] Updated weights for policy 0, policy_version 53150 (0.0010) [2023-10-12 17:59:37,842][62635] Updated weights for policy 1, policy_version 53130 (0.0007) [2023-10-12 17:59:38,205][62635] Updated weights for policy 1, policy_version 53140 (0.0008) [2023-10-12 17:59:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 108822528. Throughput: 0: 1680.1, 1: 1684.0. Samples: 27215732. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:38,435][61643] Avg episode reward: [(0, '24.730'), (1, '9.940')] [2023-10-12 17:59:38,569][62635] Updated weights for policy 1, policy_version 53150 (0.0009) [2023-10-12 17:59:39,117][62634] Updated weights for policy 0, policy_version 53160 (0.0011) [2023-10-12 17:59:39,496][62634] Updated weights for policy 0, policy_version 53170 (0.0009) [2023-10-12 17:59:39,875][62634] Updated weights for policy 0, policy_version 53180 (0.0008) [2023-10-12 17:59:42,684][62635] Updated weights for policy 1, policy_version 53160 (0.0007) [2023-10-12 17:59:43,050][62635] Updated weights for policy 1, policy_version 53170 (0.0007) [2023-10-12 17:59:43,432][62635] Updated weights for policy 1, policy_version 53180 (0.0008) [2023-10-12 17:59:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108888064. Throughput: 0: 1678.4, 1: 1684.8. Samples: 27236230. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 17:59:43,435][61643] Avg episode reward: [(0, '24.530'), (1, '10.020')] [2023-10-12 17:59:44,018][62634] Updated weights for policy 0, policy_version 53190 (0.0007) [2023-10-12 17:59:44,404][62634] Updated weights for policy 0, policy_version 53200 (0.0009) [2023-10-12 17:59:44,786][62634] Updated weights for policy 0, policy_version 53210 (0.0009) [2023-10-12 17:59:47,573][62635] Updated weights for policy 1, policy_version 53190 (0.0008) [2023-10-12 17:59:47,949][62635] Updated weights for policy 1, policy_version 53200 (0.0008) [2023-10-12 17:59:48,321][62635] Updated weights for policy 1, policy_version 53210 (0.0009) [2023-10-12 17:59:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 108953600. Throughput: 0: 1677.9, 1: 1666.0. Samples: 27256080. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:59:48,436][61643] Avg episode reward: [(0, '24.800'), (1, '10.100')] [2023-10-12 17:59:48,735][62634] Updated weights for policy 0, policy_version 53220 (0.0008) [2023-10-12 17:59:49,111][62634] Updated weights for policy 0, policy_version 53230 (0.0010) [2023-10-12 17:59:49,486][62634] Updated weights for policy 0, policy_version 53240 (0.0010) [2023-10-12 17:59:52,404][62635] Updated weights for policy 1, policy_version 53220 (0.0008) [2023-10-12 17:59:52,777][62635] Updated weights for policy 1, policy_version 53230 (0.0007) [2023-10-12 17:59:53,134][62635] Updated weights for policy 1, policy_version 53240 (0.0008) [2023-10-12 17:59:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109051904. Throughput: 0: 1674.4, 1: 1681.3. Samples: 27265796. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:59:53,435][61643] Avg episode reward: [(0, '24.930'), (1, '9.920')] [2023-10-12 17:59:53,742][62634] Updated weights for policy 0, policy_version 53250 (0.0007) [2023-10-12 17:59:54,109][62634] Updated weights for policy 0, policy_version 53260 (0.0009) [2023-10-12 17:59:54,487][62634] Updated weights for policy 0, policy_version 53270 (0.0008) [2023-10-12 17:59:54,864][62634] Updated weights for policy 0, policy_version 53280 (0.0008) [2023-10-12 17:59:57,259][62635] Updated weights for policy 1, policy_version 53250 (0.0008) [2023-10-12 17:59:57,623][62635] Updated weights for policy 1, policy_version 53260 (0.0009) [2023-10-12 17:59:57,997][62635] Updated weights for policy 1, policy_version 53270 (0.0010) [2023-10-12 17:59:58,359][62635] Updated weights for policy 1, policy_version 53280 (0.0008) [2023-10-12 17:59:58,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109117440. Throughput: 0: 1680.9, 1: 1672.2. Samples: 27286332. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 17:59:58,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.830')] [2023-10-12 17:59:58,851][62634] Updated weights for policy 0, policy_version 53290 (0.0009) [2023-10-12 17:59:59,229][62634] Updated weights for policy 0, policy_version 53300 (0.0009) [2023-10-12 17:59:59,601][62634] Updated weights for policy 0, policy_version 53310 (0.0009) [2023-10-12 18:00:02,378][62635] Updated weights for policy 1, policy_version 53290 (0.0009) [2023-10-12 18:00:02,754][62635] Updated weights for policy 1, policy_version 53300 (0.0009) [2023-10-12 18:00:03,114][62635] Updated weights for policy 1, policy_version 53310 (0.0007) [2023-10-12 18:00:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109182976. Throughput: 0: 1686.3, 1: 1653.1. Samples: 27306354. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 18:00:03,435][61643] Avg episode reward: [(0, '24.880'), (1, '9.950')] [2023-10-12 18:00:03,625][62634] Updated weights for policy 0, policy_version 53320 (0.0009) [2023-10-12 18:00:03,999][62634] Updated weights for policy 0, policy_version 53330 (0.0010) [2023-10-12 18:00:04,388][62634] Updated weights for policy 0, policy_version 53340 (0.0009) [2023-10-12 18:00:07,221][62635] Updated weights for policy 1, policy_version 53320 (0.0008) [2023-10-12 18:00:07,600][62635] Updated weights for policy 1, policy_version 53330 (0.0007) [2023-10-12 18:00:07,962][62635] Updated weights for policy 1, policy_version 53340 (0.0008) [2023-10-12 18:00:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109248512. Throughput: 0: 1684.3, 1: 1671.3. Samples: 27316274. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 18:00:08,435][61643] Avg episode reward: [(0, '24.760'), (1, '9.740')] [2023-10-12 18:00:08,529][62634] Updated weights for policy 0, policy_version 53350 (0.0008) [2023-10-12 18:00:08,906][62634] Updated weights for policy 0, policy_version 53360 (0.0008) [2023-10-12 18:00:09,285][62634] Updated weights for policy 0, policy_version 53370 (0.0009) [2023-10-12 18:00:12,030][62635] Updated weights for policy 1, policy_version 53350 (0.0009) [2023-10-12 18:00:12,409][62635] Updated weights for policy 1, policy_version 53360 (0.0010) [2023-10-12 18:00:12,784][62635] Updated weights for policy 1, policy_version 53370 (0.0007) [2023-10-12 18:00:13,227][62634] Updated weights for policy 0, policy_version 53380 (0.0009) [2023-10-12 18:00:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109314048. Throughput: 0: 1676.8, 1: 1669.6. Samples: 27336406. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 18:00:13,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.790')] [2023-10-12 18:00:13,601][62634] Updated weights for policy 0, policy_version 53390 (0.0008) [2023-10-12 18:00:13,981][62634] Updated weights for policy 0, policy_version 53400 (0.0007) [2023-10-12 18:00:16,858][62635] Updated weights for policy 1, policy_version 53380 (0.0009) [2023-10-12 18:00:17,229][62635] Updated weights for policy 1, policy_version 53390 (0.0009) [2023-10-12 18:00:17,593][62635] Updated weights for policy 1, policy_version 53400 (0.0009) [2023-10-12 18:00:18,072][62634] Updated weights for policy 0, policy_version 53410 (0.0009) [2023-10-12 18:00:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109379584. Throughput: 0: 1673.4, 1: 1651.2. Samples: 27355890. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-12 18:00:18,436][61643] Avg episode reward: [(0, '24.890'), (1, '9.860')] [2023-10-12 18:00:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000053408_54689792.pth... [2023-10-12 18:00:18,454][62634] Updated weights for policy 0, policy_version 53420 (0.0010) [2023-10-12 18:00:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000051840_53084160.pth [2023-10-12 18:00:18,823][62634] Updated weights for policy 0, policy_version 53430 (0.0009) [2023-10-12 18:00:19,209][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000053440_54722560.pth... [2023-10-12 18:00:19,213][62634] Updated weights for policy 0, policy_version 53440 (0.0009) [2023-10-12 18:00:19,244][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000051840_53084160.pth [2023-10-12 18:00:21,719][62635] Updated weights for policy 1, policy_version 53410 (0.0008) [2023-10-12 18:00:22,086][62635] Updated weights for policy 1, policy_version 53420 (0.0010) [2023-10-12 18:00:22,455][62635] Updated weights for policy 1, policy_version 53430 (0.0010) [2023-10-12 18:00:22,832][62635] Updated weights for policy 1, policy_version 53440 (0.0007) [2023-10-12 18:00:23,211][62634] Updated weights for policy 0, policy_version 53450 (0.0009) [2023-10-12 18:00:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109445120. Throughput: 0: 1673.1, 1: 1673.1. Samples: 27366308. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:23,435][61643] Avg episode reward: [(0, '25.090'), (1, '9.870')] [2023-10-12 18:00:23,589][62634] Updated weights for policy 0, policy_version 53460 (0.0009) [2023-10-12 18:00:23,973][62634] Updated weights for policy 0, policy_version 53470 (0.0007) [2023-10-12 18:00:26,994][62635] Updated weights for policy 1, policy_version 53450 (0.0011) [2023-10-12 18:00:27,363][62635] Updated weights for policy 1, policy_version 53460 (0.0010) [2023-10-12 18:00:27,737][62635] Updated weights for policy 1, policy_version 53470 (0.0009) [2023-10-12 18:00:28,050][62634] Updated weights for policy 0, policy_version 53480 (0.0007) [2023-10-12 18:00:28,428][62634] Updated weights for policy 0, policy_version 53490 (0.0008) [2023-10-12 18:00:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109510656. Throughput: 0: 1676.1, 1: 1660.8. Samples: 27386392. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:28,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.740')] [2023-10-12 18:00:28,795][62634] Updated weights for policy 0, policy_version 53500 (0.0007) [2023-10-12 18:00:31,738][62635] Updated weights for policy 1, policy_version 53480 (0.0007) [2023-10-12 18:00:32,094][62635] Updated weights for policy 1, policy_version 53490 (0.0009) [2023-10-12 18:00:32,474][62635] Updated weights for policy 1, policy_version 53500 (0.0009) [2023-10-12 18:00:32,850][62634] Updated weights for policy 0, policy_version 53510 (0.0009) [2023-10-12 18:00:33,222][62634] Updated weights for policy 0, policy_version 53520 (0.0007) [2023-10-12 18:00:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109576192. Throughput: 0: 1674.0, 1: 1660.1. Samples: 27406116. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:33,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.740')] [2023-10-12 18:00:33,612][62634] Updated weights for policy 0, policy_version 53530 (0.0008) [2023-10-12 18:00:36,630][62635] Updated weights for policy 1, policy_version 53510 (0.0008) [2023-10-12 18:00:36,993][62635] Updated weights for policy 1, policy_version 53520 (0.0008) [2023-10-12 18:00:37,364][62635] Updated weights for policy 1, policy_version 53530 (0.0008) [2023-10-12 18:00:37,556][62634] Updated weights for policy 0, policy_version 53540 (0.0008) [2023-10-12 18:00:37,930][62634] Updated weights for policy 0, policy_version 53550 (0.0007) [2023-10-12 18:00:38,311][62634] Updated weights for policy 0, policy_version 53560 (0.0008) [2023-10-12 18:00:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109641728. Throughput: 0: 1684.4, 1: 1674.9. Samples: 27416966. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:38,436][61643] Avg episode reward: [(0, '24.610'), (1, '9.900')] [2023-10-12 18:00:41,459][62635] Updated weights for policy 1, policy_version 53540 (0.0008) [2023-10-12 18:00:41,826][62635] Updated weights for policy 1, policy_version 53550 (0.0007) [2023-10-12 18:00:42,187][62635] Updated weights for policy 1, policy_version 53560 (0.0009) [2023-10-12 18:00:42,232][62634] Updated weights for policy 0, policy_version 53570 (0.0008) [2023-10-12 18:00:42,603][62634] Updated weights for policy 0, policy_version 53580 (0.0008) [2023-10-12 18:00:42,981][62634] Updated weights for policy 0, policy_version 53590 (0.0010) [2023-10-12 18:00:43,360][62634] Updated weights for policy 0, policy_version 53600 (0.0010) [2023-10-12 18:00:43,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 109740032. Throughput: 0: 1689.4, 1: 1662.5. Samples: 27437168. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:43,436][61643] Avg episode reward: [(0, '24.590'), (1, '9.750')] [2023-10-12 18:00:46,171][62635] Updated weights for policy 1, policy_version 53570 (0.0008) [2023-10-12 18:00:46,534][62635] Updated weights for policy 1, policy_version 53580 (0.0010) [2023-10-12 18:00:46,902][62635] Updated weights for policy 1, policy_version 53590 (0.0008) [2023-10-12 18:00:47,267][62635] Updated weights for policy 1, policy_version 53600 (0.0010) [2023-10-12 18:00:47,550][62634] Updated weights for policy 0, policy_version 53610 (0.0010) [2023-10-12 18:00:47,929][62634] Updated weights for policy 0, policy_version 53620 (0.0009) [2023-10-12 18:00:48,310][62634] Updated weights for policy 0, policy_version 53630 (0.0009) [2023-10-12 18:00:48,435][61643] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 109805568. Throughput: 0: 1662.0, 1: 1676.4. Samples: 27456584. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:48,435][61643] Avg episode reward: [(0, '24.390'), (1, '9.690')] [2023-10-12 18:00:51,451][62635] Updated weights for policy 1, policy_version 53610 (0.0008) [2023-10-12 18:00:51,817][62635] Updated weights for policy 1, policy_version 53620 (0.0008) [2023-10-12 18:00:52,183][62635] Updated weights for policy 1, policy_version 53630 (0.0009) [2023-10-12 18:00:52,317][62634] Updated weights for policy 0, policy_version 53640 (0.0009) [2023-10-12 18:00:52,696][62634] Updated weights for policy 0, policy_version 53650 (0.0008) [2023-10-12 18:00:53,080][62634] Updated weights for policy 0, policy_version 53660 (0.0009) [2023-10-12 18:00:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109871104. Throughput: 0: 1685.5, 1: 1679.9. Samples: 27467718. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:00:53,436][61643] Avg episode reward: [(0, '24.700'), (1, '9.760')] [2023-10-12 18:00:56,313][62635] Updated weights for policy 1, policy_version 53640 (0.0009) [2023-10-12 18:00:56,685][62635] Updated weights for policy 1, policy_version 53650 (0.0009) [2023-10-12 18:00:57,062][62635] Updated weights for policy 1, policy_version 53660 (0.0009) [2023-10-12 18:00:57,197][62634] Updated weights for policy 0, policy_version 53670 (0.0009) [2023-10-12 18:00:57,577][62634] Updated weights for policy 0, policy_version 53680 (0.0007) [2023-10-12 18:00:57,947][62634] Updated weights for policy 0, policy_version 53690 (0.0007) [2023-10-12 18:00:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109936640. Throughput: 0: 1688.5, 1: 1663.5. Samples: 27487244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:00:58,435][61643] Avg episode reward: [(0, '24.800'), (1, '9.660')] [2023-10-12 18:01:01,140][62635] Updated weights for policy 1, policy_version 53670 (0.0008) [2023-10-12 18:01:01,516][62635] Updated weights for policy 1, policy_version 53680 (0.0009) [2023-10-12 18:01:01,881][62635] Updated weights for policy 1, policy_version 53690 (0.0008) [2023-10-12 18:01:02,063][62634] Updated weights for policy 0, policy_version 53700 (0.0008) [2023-10-12 18:01:02,440][62634] Updated weights for policy 0, policy_version 53710 (0.0008) [2023-10-12 18:01:02,819][62634] Updated weights for policy 0, policy_version 53720 (0.0007) [2023-10-12 18:01:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 110002176. Throughput: 0: 1665.5, 1: 1686.2. Samples: 27506718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:03,435][61643] Avg episode reward: [(0, '24.960'), (1, '9.750')] [2023-10-12 18:01:06,000][62635] Updated weights for policy 1, policy_version 53700 (0.0008) [2023-10-12 18:01:06,369][62635] Updated weights for policy 1, policy_version 53710 (0.0010) [2023-10-12 18:01:06,737][62635] Updated weights for policy 1, policy_version 53720 (0.0008) [2023-10-12 18:01:06,970][62634] Updated weights for policy 0, policy_version 53730 (0.0007) [2023-10-12 18:01:07,333][62634] Updated weights for policy 0, policy_version 53740 (0.0008) [2023-10-12 18:01:07,715][62634] Updated weights for policy 0, policy_version 53750 (0.0009) [2023-10-12 18:01:08,095][62634] Updated weights for policy 0, policy_version 53760 (0.0009) [2023-10-12 18:01:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 110067712. Throughput: 0: 1688.6, 1: 1676.8. Samples: 27517750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:08,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.880')] [2023-10-12 18:01:10,781][62635] Updated weights for policy 1, policy_version 53730 (0.0009) [2023-10-12 18:01:11,143][62635] Updated weights for policy 1, policy_version 53740 (0.0007) [2023-10-12 18:01:11,513][62635] Updated weights for policy 1, policy_version 53750 (0.0007) [2023-10-12 18:01:11,881][62635] Updated weights for policy 1, policy_version 53760 (0.0007) [2023-10-12 18:01:12,138][62634] Updated weights for policy 0, policy_version 53770 (0.0009) [2023-10-12 18:01:12,506][62634] Updated weights for policy 0, policy_version 53780 (0.0011) [2023-10-12 18:01:12,896][62634] Updated weights for policy 0, policy_version 53790 (0.0010) [2023-10-12 18:01:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 110133248. Throughput: 0: 1683.0, 1: 1666.7. Samples: 27537130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:13,435][61643] Avg episode reward: [(0, '25.090'), (1, '9.650')] [2023-10-12 18:01:15,729][62635] Updated weights for policy 1, policy_version 53770 (0.0010) [2023-10-12 18:01:16,098][62635] Updated weights for policy 1, policy_version 53780 (0.0010) [2023-10-12 18:01:16,463][62635] Updated weights for policy 1, policy_version 53790 (0.0008) [2023-10-12 18:01:16,947][62634] Updated weights for policy 0, policy_version 53800 (0.0009) [2023-10-12 18:01:17,324][62634] Updated weights for policy 0, policy_version 53810 (0.0010) [2023-10-12 18:01:17,709][62634] Updated weights for policy 0, policy_version 53820 (0.0010) [2023-10-12 18:01:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 110198784. Throughput: 0: 1661.8, 1: 1689.3. Samples: 27556914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:18,435][61643] Avg episode reward: [(0, '25.130'), (1, '9.750')] [2023-10-12 18:01:20,358][62635] Updated weights for policy 1, policy_version 53800 (0.0007) [2023-10-12 18:01:20,722][62635] Updated weights for policy 1, policy_version 53810 (0.0008) [2023-10-12 18:01:21,089][62635] Updated weights for policy 1, policy_version 53820 (0.0009) [2023-10-12 18:01:21,814][62634] Updated weights for policy 0, policy_version 53830 (0.0009) [2023-10-12 18:01:22,188][62634] Updated weights for policy 0, policy_version 53840 (0.0007) [2023-10-12 18:01:22,557][62634] Updated weights for policy 0, policy_version 53850 (0.0007) [2023-10-12 18:01:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110264320. Throughput: 0: 1683.7, 1: 1665.2. Samples: 27567664. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:23,436][61643] Avg episode reward: [(0, '24.840'), (1, '9.860')] [2023-10-12 18:01:25,123][62635] Updated weights for policy 1, policy_version 53830 (0.0011) [2023-10-12 18:01:25,499][62635] Updated weights for policy 1, policy_version 53840 (0.0010) [2023-10-12 18:01:25,864][62635] Updated weights for policy 1, policy_version 53850 (0.0007) [2023-10-12 18:01:26,534][62634] Updated weights for policy 0, policy_version 53860 (0.0007) [2023-10-12 18:01:26,915][62634] Updated weights for policy 0, policy_version 53870 (0.0008) [2023-10-12 18:01:27,292][62634] Updated weights for policy 0, policy_version 53880 (0.0008) [2023-10-12 18:01:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110329856. Throughput: 0: 1668.4, 1: 1679.6. Samples: 27587824. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:28,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.740')] [2023-10-12 18:01:29,909][62635] Updated weights for policy 1, policy_version 53860 (0.0008) [2023-10-12 18:01:30,272][62635] Updated weights for policy 1, policy_version 53870 (0.0011) [2023-10-12 18:01:30,645][62635] Updated weights for policy 1, policy_version 53880 (0.0007) [2023-10-12 18:01:31,243][62634] Updated weights for policy 0, policy_version 53890 (0.0007) [2023-10-12 18:01:31,621][62634] Updated weights for policy 0, policy_version 53900 (0.0007) [2023-10-12 18:01:32,003][62634] Updated weights for policy 0, policy_version 53910 (0.0008) [2023-10-12 18:01:32,370][62634] Updated weights for policy 0, policy_version 53920 (0.0009) [2023-10-12 18:01:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110395392. Throughput: 0: 1673.9, 1: 1694.1. Samples: 27608144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:01:33,435][61643] Avg episode reward: [(0, '24.740'), (1, '9.740')] [2023-10-12 18:01:34,693][62635] Updated weights for policy 1, policy_version 53890 (0.0007) [2023-10-12 18:01:35,059][62635] Updated weights for policy 1, policy_version 53900 (0.0007) [2023-10-12 18:01:35,430][62635] Updated weights for policy 1, policy_version 53910 (0.0007) [2023-10-12 18:01:35,803][62635] Updated weights for policy 1, policy_version 53920 (0.0009) [2023-10-12 18:01:36,252][62634] Updated weights for policy 0, policy_version 53930 (0.0009) [2023-10-12 18:01:36,631][62634] Updated weights for policy 0, policy_version 53940 (0.0010) [2023-10-12 18:01:37,005][62634] Updated weights for policy 0, policy_version 53950 (0.0010) [2023-10-12 18:01:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110460928. Throughput: 0: 1682.4, 1: 1668.1. Samples: 27618494. Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:01:38,436][61643] Avg episode reward: [(0, '24.590'), (1, '9.790')] [2023-10-12 18:01:39,882][62635] Updated weights for policy 1, policy_version 53930 (0.0010) [2023-10-12 18:01:40,253][62635] Updated weights for policy 1, policy_version 53940 (0.0010) [2023-10-12 18:01:40,625][62635] Updated weights for policy 1, policy_version 53950 (0.0009) [2023-10-12 18:01:41,313][62634] Updated weights for policy 0, policy_version 53960 (0.0009) [2023-10-12 18:01:41,690][62634] Updated weights for policy 0, policy_version 53970 (0.0008) [2023-10-12 18:01:42,067][62634] Updated weights for policy 0, policy_version 53980 (0.0007) [2023-10-12 18:01:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110526464. Throughput: 0: 1658.4, 1: 1696.5. Samples: 27638212. 
Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:01:43,435][61643] Avg episode reward: [(0, '24.640'), (1, '9.840')] [2023-10-12 18:01:44,814][62635] Updated weights for policy 1, policy_version 53960 (0.0011) [2023-10-12 18:01:45,191][62635] Updated weights for policy 1, policy_version 53970 (0.0008) [2023-10-12 18:01:45,555][62635] Updated weights for policy 1, policy_version 53980 (0.0008) [2023-10-12 18:01:46,159][62634] Updated weights for policy 0, policy_version 53990 (0.0008) [2023-10-12 18:01:46,544][62634] Updated weights for policy 0, policy_version 54000 (0.0009) [2023-10-12 18:01:46,923][62634] Updated weights for policy 0, policy_version 54010 (0.0007) [2023-10-12 18:01:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110592000. Throughput: 0: 1678.7, 1: 1700.0. Samples: 27658760. Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:01:48,436][61643] Avg episode reward: [(0, '24.860'), (1, '9.920')] [2023-10-12 18:01:49,596][62635] Updated weights for policy 1, policy_version 53990 (0.0008) [2023-10-12 18:01:49,960][62635] Updated weights for policy 1, policy_version 54000 (0.0009) [2023-10-12 18:01:50,330][62635] Updated weights for policy 1, policy_version 54010 (0.0010) [2023-10-12 18:01:50,832][62634] Updated weights for policy 0, policy_version 54020 (0.0008) [2023-10-12 18:01:51,205][62634] Updated weights for policy 0, policy_version 54030 (0.0010) [2023-10-12 18:01:51,588][62634] Updated weights for policy 0, policy_version 54040 (0.0010) [2023-10-12 18:01:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110657536. Throughput: 0: 1680.1, 1: 1676.7. Samples: 27668808. 
Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:01:53,435][61643] Avg episode reward: [(0, '24.720'), (1, '10.090')] [2023-10-12 18:01:54,362][62635] Updated weights for policy 1, policy_version 54020 (0.0008) [2023-10-12 18:01:54,739][62635] Updated weights for policy 1, policy_version 54030 (0.0010) [2023-10-12 18:01:55,096][62635] Updated weights for policy 1, policy_version 54040 (0.0011) [2023-10-12 18:01:55,902][62634] Updated weights for policy 0, policy_version 54050 (0.0008) [2023-10-12 18:01:56,287][62634] Updated weights for policy 0, policy_version 54060 (0.0007) [2023-10-12 18:01:56,667][62634] Updated weights for policy 0, policy_version 54070 (0.0008) [2023-10-12 18:01:57,043][62634] Updated weights for policy 0, policy_version 54080 (0.0009) [2023-10-12 18:01:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110723072. Throughput: 0: 1667.4, 1: 1697.2. Samples: 27688538. Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:01:58,436][61643] Avg episode reward: [(0, '24.810'), (1, '9.910')] [2023-10-12 18:01:59,156][62635] Updated weights for policy 1, policy_version 54050 (0.0009) [2023-10-12 18:01:59,534][62635] Updated weights for policy 1, policy_version 54060 (0.0007) [2023-10-12 18:01:59,900][62635] Updated weights for policy 1, policy_version 54070 (0.0007) [2023-10-12 18:02:00,274][62635] Updated weights for policy 1, policy_version 54080 (0.0010) [2023-10-12 18:02:01,005][62634] Updated weights for policy 0, policy_version 54090 (0.0007) [2023-10-12 18:02:01,380][62634] Updated weights for policy 0, policy_version 54100 (0.0007) [2023-10-12 18:02:01,758][62634] Updated weights for policy 0, policy_version 54110 (0.0009) [2023-10-12 18:02:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110788608. Throughput: 0: 1691.6, 1: 1693.2. Samples: 27709232. 
Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:02:03,435][61643] Avg episode reward: [(0, '24.740'), (1, '9.920')] [2023-10-12 18:02:04,295][62635] Updated weights for policy 1, policy_version 54090 (0.0008) [2023-10-12 18:02:04,667][62635] Updated weights for policy 1, policy_version 54100 (0.0008) [2023-10-12 18:02:05,029][62635] Updated weights for policy 1, policy_version 54110 (0.0007) [2023-10-12 18:02:05,576][62634] Updated weights for policy 0, policy_version 54120 (0.0011) [2023-10-12 18:02:05,957][62634] Updated weights for policy 0, policy_version 54130 (0.0010) [2023-10-12 18:02:06,334][62634] Updated weights for policy 0, policy_version 54140 (0.0011) [2023-10-12 18:02:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110854144. Throughput: 0: 1678.5, 1: 1684.6. Samples: 27719006. Policy #0 lag: (min: 8.0, avg: 22.8, max: 40.0) [2023-10-12 18:02:08,436][61643] Avg episode reward: [(0, '24.680'), (1, '10.000')] [2023-10-12 18:02:09,213][62635] Updated weights for policy 1, policy_version 54120 (0.0010) [2023-10-12 18:02:09,580][62635] Updated weights for policy 1, policy_version 54130 (0.0011) [2023-10-12 18:02:09,944][62635] Updated weights for policy 1, policy_version 54140 (0.0010) [2023-10-12 18:02:10,414][62634] Updated weights for policy 0, policy_version 54150 (0.0009) [2023-10-12 18:02:10,793][62634] Updated weights for policy 0, policy_version 54160 (0.0010) [2023-10-12 18:02:11,160][62634] Updated weights for policy 0, policy_version 54170 (0.0011) [2023-10-12 18:02:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110919680. Throughput: 0: 1668.8, 1: 1689.0. Samples: 27738924. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:13,435][61643] Avg episode reward: [(0, '24.810'), (1, '9.950')] [2023-10-12 18:02:13,911][62635] Updated weights for policy 1, policy_version 54150 (0.0011) [2023-10-12 18:02:14,280][62635] Updated weights for policy 1, policy_version 54160 (0.0008) [2023-10-12 18:02:14,646][62635] Updated weights for policy 1, policy_version 54170 (0.0010) [2023-10-12 18:02:15,315][62634] Updated weights for policy 0, policy_version 54180 (0.0010) [2023-10-12 18:02:15,700][62634] Updated weights for policy 0, policy_version 54190 (0.0007) [2023-10-12 18:02:16,082][62634] Updated weights for policy 0, policy_version 54200 (0.0008) [2023-10-12 18:02:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 110985216. Throughput: 0: 1687.0, 1: 1687.4. Samples: 27759992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:18,436][61643] Avg episode reward: [(0, '24.790'), (1, '9.950')] [2023-10-12 18:02:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000054208_55508992.pth... [2023-10-12 18:02:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000052640_53903360.pth [2023-10-12 18:02:18,605][62635] Updated weights for policy 1, policy_version 54180 (0.0009) [2023-10-12 18:02:18,974][62635] Updated weights for policy 1, policy_version 54190 (0.0009) [2023-10-12 18:02:19,344][62635] Updated weights for policy 1, policy_version 54200 (0.0007) [2023-10-12 18:02:19,625][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000054208_55508992.pth... 
[2023-10-12 18:02:19,656][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000052608_53870592.pth [2023-10-12 18:02:20,024][62634] Updated weights for policy 0, policy_version 54210 (0.0009) [2023-10-12 18:02:20,402][62634] Updated weights for policy 0, policy_version 54220 (0.0009) [2023-10-12 18:02:20,780][62634] Updated weights for policy 0, policy_version 54230 (0.0007) [2023-10-12 18:02:21,150][62634] Updated weights for policy 0, policy_version 54240 (0.0007) [2023-10-12 18:02:23,401][62635] Updated weights for policy 1, policy_version 54210 (0.0008) [2023-10-12 18:02:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111050752. Throughput: 0: 1663.2, 1: 1687.9. Samples: 27769292. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:23,435][61643] Avg episode reward: [(0, '24.590'), (1, '10.040')] [2023-10-12 18:02:23,770][62635] Updated weights for policy 1, policy_version 54220 (0.0011) [2023-10-12 18:02:24,134][62635] Updated weights for policy 1, policy_version 54230 (0.0011) [2023-10-12 18:02:24,511][62635] Updated weights for policy 1, policy_version 54240 (0.0009) [2023-10-12 18:02:25,087][62634] Updated weights for policy 0, policy_version 54250 (0.0009) [2023-10-12 18:02:25,462][62634] Updated weights for policy 0, policy_version 54260 (0.0010) [2023-10-12 18:02:25,837][62634] Updated weights for policy 0, policy_version 54270 (0.0008) [2023-10-12 18:02:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111116288. Throughput: 0: 1685.2, 1: 1682.6. Samples: 27789766. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:28,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.980')] [2023-10-12 18:02:28,615][62635] Updated weights for policy 1, policy_version 54250 (0.0008) [2023-10-12 18:02:28,990][62635] Updated weights for policy 1, policy_version 54260 (0.0008) [2023-10-12 18:02:29,359][62635] Updated weights for policy 1, policy_version 54270 (0.0009) [2023-10-12 18:02:29,947][62634] Updated weights for policy 0, policy_version 54280 (0.0010) [2023-10-12 18:02:30,323][62634] Updated weights for policy 0, policy_version 54290 (0.0008) [2023-10-12 18:02:30,699][62634] Updated weights for policy 0, policy_version 54300 (0.0008) [2023-10-12 18:02:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111181824. Throughput: 0: 1685.8, 1: 1686.0. Samples: 27810490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:33,436][61643] Avg episode reward: [(0, '24.460'), (1, '9.860')] [2023-10-12 18:02:33,537][62635] Updated weights for policy 1, policy_version 54280 (0.0008) [2023-10-12 18:02:33,920][62635] Updated weights for policy 1, policy_version 54290 (0.0008) [2023-10-12 18:02:34,288][62635] Updated weights for policy 1, policy_version 54300 (0.0009) [2023-10-12 18:02:34,868][62634] Updated weights for policy 0, policy_version 54310 (0.0008) [2023-10-12 18:02:35,245][62634] Updated weights for policy 0, policy_version 54320 (0.0009) [2023-10-12 18:02:35,627][62634] Updated weights for policy 0, policy_version 54330 (0.0009) [2023-10-12 18:02:38,324][62635] Updated weights for policy 1, policy_version 54310 (0.0008) [2023-10-12 18:02:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 111247360. Throughput: 0: 1663.6, 1: 1686.3. Samples: 27819550. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:38,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.950')] [2023-10-12 18:02:38,695][62635] Updated weights for policy 1, policy_version 54320 (0.0010) [2023-10-12 18:02:39,061][62635] Updated weights for policy 1, policy_version 54330 (0.0008) [2023-10-12 18:02:39,632][62634] Updated weights for policy 0, policy_version 54340 (0.0009) [2023-10-12 18:02:40,013][62634] Updated weights for policy 0, policy_version 54350 (0.0009) [2023-10-12 18:02:40,377][62634] Updated weights for policy 0, policy_version 54360 (0.0010) [2023-10-12 18:02:43,177][62635] Updated weights for policy 1, policy_version 54340 (0.0008) [2023-10-12 18:02:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111312896. Throughput: 0: 1681.4, 1: 1689.0. Samples: 27840204. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:43,435][61643] Avg episode reward: [(0, '24.880'), (1, '9.940')] [2023-10-12 18:02:43,550][62635] Updated weights for policy 1, policy_version 54350 (0.0008) [2023-10-12 18:02:43,917][62635] Updated weights for policy 1, policy_version 54360 (0.0009) [2023-10-12 18:02:44,586][62634] Updated weights for policy 0, policy_version 54370 (0.0011) [2023-10-12 18:02:44,957][62634] Updated weights for policy 0, policy_version 54380 (0.0009) [2023-10-12 18:02:45,339][62634] Updated weights for policy 0, policy_version 54390 (0.0010) [2023-10-12 18:02:45,714][62634] Updated weights for policy 0, policy_version 54400 (0.0008) [2023-10-12 18:02:47,818][62635] Updated weights for policy 1, policy_version 54370 (0.0008) [2023-10-12 18:02:48,178][62635] Updated weights for policy 1, policy_version 54380 (0.0009) [2023-10-12 18:02:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111378432. Throughput: 0: 1676.2, 1: 1685.9. Samples: 27860524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:02:48,435][61643] Avg episode reward: [(0, '24.950'), (1, '9.850')] [2023-10-12 18:02:48,554][62635] Updated weights for policy 1, policy_version 54390 (0.0010) [2023-10-12 18:02:48,917][62635] Updated weights for policy 1, policy_version 54400 (0.0009) [2023-10-12 18:02:49,681][62634] Updated weights for policy 0, policy_version 54410 (0.0011) [2023-10-12 18:02:50,057][62634] Updated weights for policy 0, policy_version 54420 (0.0008) [2023-10-12 18:02:50,444][62634] Updated weights for policy 0, policy_version 54430 (0.0007) [2023-10-12 18:02:52,981][62635] Updated weights for policy 1, policy_version 54410 (0.0008) [2023-10-12 18:02:53,349][62635] Updated weights for policy 1, policy_version 54420 (0.0009) [2023-10-12 18:02:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111443968. Throughput: 0: 1657.8, 1: 1697.3. Samples: 27869988. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:02:53,435][61643] Avg episode reward: [(0, '25.110'), (1, '10.020')] [2023-10-12 18:02:53,718][62635] Updated weights for policy 1, policy_version 54430 (0.0007) [2023-10-12 18:02:54,417][62634] Updated weights for policy 0, policy_version 54440 (0.0009) [2023-10-12 18:02:54,793][62634] Updated weights for policy 0, policy_version 54450 (0.0010) [2023-10-12 18:02:55,172][62634] Updated weights for policy 0, policy_version 54460 (0.0009) [2023-10-12 18:02:57,867][62635] Updated weights for policy 1, policy_version 54440 (0.0009) [2023-10-12 18:02:58,235][62635] Updated weights for policy 1, policy_version 54450 (0.0008) [2023-10-12 18:02:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111509504. Throughput: 0: 1675.7, 1: 1691.7. Samples: 27890458. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:02:58,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.910')] [2023-10-12 18:02:58,609][62635] Updated weights for policy 1, policy_version 54460 (0.0008) [2023-10-12 18:02:59,425][62634] Updated weights for policy 0, policy_version 54470 (0.0008) [2023-10-12 18:02:59,798][62634] Updated weights for policy 0, policy_version 54480 (0.0009) [2023-10-12 18:03:00,180][62634] Updated weights for policy 0, policy_version 54490 (0.0009) [2023-10-12 18:03:02,662][62635] Updated weights for policy 1, policy_version 54470 (0.0008) [2023-10-12 18:03:03,021][62635] Updated weights for policy 1, policy_version 54480 (0.0008) [2023-10-12 18:03:03,387][62635] Updated weights for policy 1, policy_version 54490 (0.0008) [2023-10-12 18:03:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111575040. Throughput: 0: 1668.1, 1: 1671.3. Samples: 27910266. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:03:03,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.840')] [2023-10-12 18:03:04,389][62634] Updated weights for policy 0, policy_version 54500 (0.0007) [2023-10-12 18:03:04,763][62634] Updated weights for policy 0, policy_version 54510 (0.0007) [2023-10-12 18:03:05,142][62634] Updated weights for policy 0, policy_version 54520 (0.0010) [2023-10-12 18:03:07,485][62635] Updated weights for policy 1, policy_version 54500 (0.0010) [2023-10-12 18:03:07,855][62635] Updated weights for policy 1, policy_version 54510 (0.0009) [2023-10-12 18:03:08,227][62635] Updated weights for policy 1, policy_version 54520 (0.0008) [2023-10-12 18:03:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111640576. Throughput: 0: 1661.2, 1: 1682.6. Samples: 27919766. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:03:08,435][61643] Avg episode reward: [(0, '24.580'), (1, '10.050')] [2023-10-12 18:03:09,277][62634] Updated weights for policy 0, policy_version 54530 (0.0010) [2023-10-12 18:03:09,660][62634] Updated weights for policy 0, policy_version 54540 (0.0007) [2023-10-12 18:03:10,029][62634] Updated weights for policy 0, policy_version 54550 (0.0008) [2023-10-12 18:03:10,410][62634] Updated weights for policy 0, policy_version 54560 (0.0010) [2023-10-12 18:03:12,327][62635] Updated weights for policy 1, policy_version 54530 (0.0009) [2023-10-12 18:03:12,691][62635] Updated weights for policy 1, policy_version 54540 (0.0008) [2023-10-12 18:03:13,057][62635] Updated weights for policy 1, policy_version 54550 (0.0009) [2023-10-12 18:03:13,427][62635] Updated weights for policy 1, policy_version 54560 (0.0007) [2023-10-12 18:03:13,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111738880. Throughput: 0: 1658.7, 1: 1682.7. Samples: 27940132. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:03:13,436][61643] Avg episode reward: [(0, '24.530'), (1, '9.880')] [2023-10-12 18:03:14,738][62634] Updated weights for policy 0, policy_version 54570 (0.0010) [2023-10-12 18:03:15,120][62634] Updated weights for policy 0, policy_version 54580 (0.0011) [2023-10-12 18:03:15,512][62634] Updated weights for policy 0, policy_version 54590 (0.0011) [2023-10-12 18:03:17,220][62635] Updated weights for policy 1, policy_version 54570 (0.0008) [2023-10-12 18:03:17,589][62635] Updated weights for policy 1, policy_version 54580 (0.0008) [2023-10-12 18:03:17,961][62635] Updated weights for policy 1, policy_version 54590 (0.0009) [2023-10-12 18:03:18,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111804416. Throughput: 0: 1660.2, 1: 1659.9. Samples: 27959896. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:03:18,436][61643] Avg episode reward: [(0, '24.720'), (1, '9.800')] [2023-10-12 18:03:19,524][62634] Updated weights for policy 0, policy_version 54600 (0.0008) [2023-10-12 18:03:19,898][62634] Updated weights for policy 0, policy_version 54610 (0.0007) [2023-10-12 18:03:20,281][62634] Updated weights for policy 0, policy_version 54620 (0.0008) [2023-10-12 18:03:22,117][62635] Updated weights for policy 1, policy_version 54600 (0.0009) [2023-10-12 18:03:22,486][62635] Updated weights for policy 1, policy_version 54610 (0.0011) [2023-10-12 18:03:22,860][62635] Updated weights for policy 1, policy_version 54620 (0.0009) [2023-10-12 18:03:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111869952. Throughput: 0: 1658.6, 1: 1689.9. Samples: 27970234. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-12 18:03:23,435][61643] Avg episode reward: [(0, '24.590'), (1, '10.070')] [2023-10-12 18:03:24,190][62634] Updated weights for policy 0, policy_version 54630 (0.0007) [2023-10-12 18:03:24,581][62634] Updated weights for policy 0, policy_version 54640 (0.0010) [2023-10-12 18:03:24,945][62634] Updated weights for policy 0, policy_version 54650 (0.0009) [2023-10-12 18:03:26,888][62635] Updated weights for policy 1, policy_version 54630 (0.0009) [2023-10-12 18:03:27,247][62635] Updated weights for policy 1, policy_version 54640 (0.0010) [2023-10-12 18:03:27,611][62635] Updated weights for policy 1, policy_version 54650 (0.0008) [2023-10-12 18:03:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111935488. Throughput: 0: 1663.2, 1: 1677.9. Samples: 27990552. 
Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:28,436][61643] Avg episode reward: [(0, '24.480'), (1, '9.810')] [2023-10-12 18:03:29,100][62634] Updated weights for policy 0, policy_version 54660 (0.0009) [2023-10-12 18:03:29,485][62634] Updated weights for policy 0, policy_version 54670 (0.0008) [2023-10-12 18:03:29,862][62634] Updated weights for policy 0, policy_version 54680 (0.0008) [2023-10-12 18:03:31,707][62635] Updated weights for policy 1, policy_version 54660 (0.0009) [2023-10-12 18:03:32,074][62635] Updated weights for policy 1, policy_version 54670 (0.0008) [2023-10-12 18:03:32,439][62635] Updated weights for policy 1, policy_version 54680 (0.0007) [2023-10-12 18:03:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 112001024. Throughput: 0: 1667.4, 1: 1661.0. Samples: 28010302. Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:33,436][61643] Avg episode reward: [(0, '24.500'), (1, '9.970')] [2023-10-12 18:03:33,851][62634] Updated weights for policy 0, policy_version 54690 (0.0009) [2023-10-12 18:03:34,230][62634] Updated weights for policy 0, policy_version 54700 (0.0010) [2023-10-12 18:03:34,598][62634] Updated weights for policy 0, policy_version 54710 (0.0009) [2023-10-12 18:03:34,978][62634] Updated weights for policy 0, policy_version 54720 (0.0010) [2023-10-12 18:03:36,500][62635] Updated weights for policy 1, policy_version 54690 (0.0008) [2023-10-12 18:03:36,865][62635] Updated weights for policy 1, policy_version 54700 (0.0008) [2023-10-12 18:03:37,228][62635] Updated weights for policy 1, policy_version 54710 (0.0009) [2023-10-12 18:03:37,591][62635] Updated weights for policy 1, policy_version 54720 (0.0009) [2023-10-12 18:03:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112066560. Throughput: 0: 1665.4, 1: 1682.2. Samples: 28020630. 
Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:38,436][61643] Avg episode reward: [(0, '24.430'), (1, '10.050')] [2023-10-12 18:03:39,275][62634] Updated weights for policy 0, policy_version 54730 (0.0009) [2023-10-12 18:03:39,652][62634] Updated weights for policy 0, policy_version 54740 (0.0009) [2023-10-12 18:03:40,026][62634] Updated weights for policy 0, policy_version 54750 (0.0009) [2023-10-12 18:03:41,732][62635] Updated weights for policy 1, policy_version 54730 (0.0007) [2023-10-12 18:03:42,098][62635] Updated weights for policy 1, policy_version 54740 (0.0008) [2023-10-12 18:03:42,467][62635] Updated weights for policy 1, policy_version 54750 (0.0009) [2023-10-12 18:03:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112132096. Throughput: 0: 1665.2, 1: 1668.0. Samples: 28040456. Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:43,435][61643] Avg episode reward: [(0, '24.600'), (1, '9.900')] [2023-10-12 18:03:43,879][62634] Updated weights for policy 0, policy_version 54760 (0.0011) [2023-10-12 18:03:44,253][62634] Updated weights for policy 0, policy_version 54770 (0.0007) [2023-10-12 18:03:44,627][62634] Updated weights for policy 0, policy_version 54780 (0.0010) [2023-10-12 18:03:46,557][62635] Updated weights for policy 1, policy_version 54760 (0.0008) [2023-10-12 18:03:46,919][62635] Updated weights for policy 1, policy_version 54770 (0.0009) [2023-10-12 18:03:47,294][62635] Updated weights for policy 1, policy_version 54780 (0.0008) [2023-10-12 18:03:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 112197632. Throughput: 0: 1673.4, 1: 1672.1. Samples: 28060814. 
Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:48,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.850')] [2023-10-12 18:03:48,769][62634] Updated weights for policy 0, policy_version 54790 (0.0009) [2023-10-12 18:03:49,151][62634] Updated weights for policy 0, policy_version 54800 (0.0008) [2023-10-12 18:03:49,533][62634] Updated weights for policy 0, policy_version 54810 (0.0010) [2023-10-12 18:03:51,505][62635] Updated weights for policy 1, policy_version 54790 (0.0008) [2023-10-12 18:03:51,888][62635] Updated weights for policy 1, policy_version 54800 (0.0008) [2023-10-12 18:03:52,264][62635] Updated weights for policy 1, policy_version 54810 (0.0009) [2023-10-12 18:03:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112263168. Throughput: 0: 1673.9, 1: 1690.5. Samples: 28071166. Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:53,436][61643] Avg episode reward: [(0, '24.700'), (1, '9.910')] [2023-10-12 18:03:53,522][62634] Updated weights for policy 0, policy_version 54820 (0.0011) [2023-10-12 18:03:53,905][62634] Updated weights for policy 0, policy_version 54830 (0.0009) [2023-10-12 18:03:54,289][62634] Updated weights for policy 0, policy_version 54840 (0.0007) [2023-10-12 18:03:56,251][62635] Updated weights for policy 1, policy_version 54820 (0.0008) [2023-10-12 18:03:56,623][62635] Updated weights for policy 1, policy_version 54830 (0.0009) [2023-10-12 18:03:56,992][62635] Updated weights for policy 1, policy_version 54840 (0.0009) [2023-10-12 18:03:58,179][62634] Updated weights for policy 0, policy_version 54850 (0.0008) [2023-10-12 18:03:58,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112328704. Throughput: 0: 1687.0, 1: 1670.3. Samples: 28091212. 
Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:03:58,436][61643] Avg episode reward: [(0, '24.360'), (1, '9.650')] [2023-10-12 18:03:58,552][62634] Updated weights for policy 0, policy_version 54860 (0.0008) [2023-10-12 18:03:58,933][62634] Updated weights for policy 0, policy_version 54870 (0.0010) [2023-10-12 18:03:59,306][62634] Updated weights for policy 0, policy_version 54880 (0.0010) [2023-10-12 18:04:01,098][62635] Updated weights for policy 1, policy_version 54850 (0.0008) [2023-10-12 18:04:01,467][62635] Updated weights for policy 1, policy_version 54860 (0.0010) [2023-10-12 18:04:01,831][62635] Updated weights for policy 1, policy_version 54870 (0.0008) [2023-10-12 18:04:02,201][62635] Updated weights for policy 1, policy_version 54880 (0.0008) [2023-10-12 18:04:03,289][62634] Updated weights for policy 0, policy_version 54890 (0.0008) [2023-10-12 18:04:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112394240. Throughput: 0: 1689.9, 1: 1686.2. Samples: 28111818. Policy #0 lag: (min: 9.0, avg: 23.6, max: 41.0) [2023-10-12 18:04:03,436][61643] Avg episode reward: [(0, '24.280'), (1, '9.750')] [2023-10-12 18:04:03,654][62634] Updated weights for policy 0, policy_version 54900 (0.0009) [2023-10-12 18:04:04,034][62634] Updated weights for policy 0, policy_version 54910 (0.0010) [2023-10-12 18:04:06,137][62635] Updated weights for policy 1, policy_version 54890 (0.0008) [2023-10-12 18:04:06,504][62635] Updated weights for policy 1, policy_version 54900 (0.0007) [2023-10-12 18:04:06,864][62635] Updated weights for policy 1, policy_version 54910 (0.0009) [2023-10-12 18:04:08,092][62634] Updated weights for policy 0, policy_version 54920 (0.0011) [2023-10-12 18:04:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112459776. Throughput: 0: 1690.2, 1: 1677.9. Samples: 28121798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:08,436][61643] Avg episode reward: [(0, '24.280'), (1, '9.840')] [2023-10-12 18:04:08,471][62634] Updated weights for policy 0, policy_version 54930 (0.0009) [2023-10-12 18:04:08,850][62634] Updated weights for policy 0, policy_version 54940 (0.0008) [2023-10-12 18:04:11,021][62635] Updated weights for policy 1, policy_version 54920 (0.0010) [2023-10-12 18:04:11,389][62635] Updated weights for policy 1, policy_version 54930 (0.0011) [2023-10-12 18:04:11,750][62635] Updated weights for policy 1, policy_version 54940 (0.0008) [2023-10-12 18:04:12,927][62634] Updated weights for policy 0, policy_version 54950 (0.0007) [2023-10-12 18:04:13,303][62634] Updated weights for policy 0, policy_version 54960 (0.0007) [2023-10-12 18:04:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 112525312. Throughput: 0: 1691.6, 1: 1667.3. Samples: 28141706. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:13,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.760')] [2023-10-12 18:04:13,674][62634] Updated weights for policy 0, policy_version 54970 (0.0009) [2023-10-12 18:04:15,658][62635] Updated weights for policy 1, policy_version 54950 (0.0009) [2023-10-12 18:04:16,033][62635] Updated weights for policy 1, policy_version 54960 (0.0008) [2023-10-12 18:04:16,387][62635] Updated weights for policy 1, policy_version 54970 (0.0008) [2023-10-12 18:04:17,712][62634] Updated weights for policy 0, policy_version 54980 (0.0009) [2023-10-12 18:04:18,088][62634] Updated weights for policy 0, policy_version 54990 (0.0010) [2023-10-12 18:04:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112590848. Throughput: 0: 1687.6, 1: 1689.3. Samples: 28162262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:18,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.730')] [2023-10-12 18:04:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000054976_56295424.pth... [2023-10-12 18:04:18,469][62634] Updated weights for policy 0, policy_version 55000 (0.0010) [2023-10-12 18:04:18,479][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000053408_54689792.pth [2023-10-12 18:04:18,483][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000054976_56295424.pth [2023-10-12 18:04:18,760][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000055008_56328192.pth... [2023-10-12 18:04:18,789][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000053440_54722560.pth [2023-10-12 18:04:18,793][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000055008_56328192.pth [2023-10-12 18:04:20,542][62635] Updated weights for policy 1, policy_version 54980 (0.0007) [2023-10-12 18:04:20,905][62635] Updated weights for policy 1, policy_version 54990 (0.0007) [2023-10-12 18:04:21,277][62635] Updated weights for policy 1, policy_version 55000 (0.0008) [2023-10-12 18:04:22,421][62634] Updated weights for policy 0, policy_version 55010 (0.0008) [2023-10-12 18:04:22,799][62634] Updated weights for policy 0, policy_version 55020 (0.0007) [2023-10-12 18:04:23,176][62634] Updated weights for policy 0, policy_version 55030 (0.0007) [2023-10-12 18:04:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112656384. Throughput: 0: 1700.5, 1: 1672.7. Samples: 28172422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:23,435][61643] Avg episode reward: [(0, '24.980'), (1, '9.860')] [2023-10-12 18:04:23,553][62634] Updated weights for policy 0, policy_version 55040 (0.0007) [2023-10-12 18:04:25,201][62635] Updated weights for policy 1, policy_version 55010 (0.0008) [2023-10-12 18:04:25,569][62635] Updated weights for policy 1, policy_version 55020 (0.0007) [2023-10-12 18:04:25,941][62635] Updated weights for policy 1, policy_version 55030 (0.0007) [2023-10-12 18:04:26,309][62635] Updated weights for policy 1, policy_version 55040 (0.0009) [2023-10-12 18:04:27,594][62634] Updated weights for policy 0, policy_version 55050 (0.0009) [2023-10-12 18:04:27,975][62634] Updated weights for policy 0, policy_version 55060 (0.0010) [2023-10-12 18:04:28,351][62634] Updated weights for policy 0, policy_version 55070 (0.0009) [2023-10-12 18:04:28,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 112754688. Throughput: 0: 1703.3, 1: 1679.6. Samples: 28192688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:28,435][61643] Avg episode reward: [(0, '25.230'), (1, '10.030')] [2023-10-12 18:04:30,326][62635] Updated weights for policy 1, policy_version 55050 (0.0008) [2023-10-12 18:04:30,701][62635] Updated weights for policy 1, policy_version 55060 (0.0008) [2023-10-12 18:04:31,056][62635] Updated weights for policy 1, policy_version 55070 (0.0007) [2023-10-12 18:04:32,365][62634] Updated weights for policy 0, policy_version 55080 (0.0008) [2023-10-12 18:04:32,749][62634] Updated weights for policy 0, policy_version 55090 (0.0009) [2023-10-12 18:04:33,131][62634] Updated weights for policy 0, policy_version 55100 (0.0011) [2023-10-12 18:04:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 112820224. Throughput: 0: 1678.6, 1: 1691.4. Samples: 28212464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:33,435][61643] Avg episode reward: [(0, '25.330'), (1, '9.650')] [2023-10-12 18:04:35,189][62635] Updated weights for policy 1, policy_version 55080 (0.0008) [2023-10-12 18:04:35,553][62635] Updated weights for policy 1, policy_version 55090 (0.0008) [2023-10-12 18:04:35,913][62635] Updated weights for policy 1, policy_version 55100 (0.0009) [2023-10-12 18:04:37,238][62634] Updated weights for policy 0, policy_version 55110 (0.0007) [2023-10-12 18:04:37,610][62634] Updated weights for policy 0, policy_version 55120 (0.0010) [2023-10-12 18:04:38,002][62634] Updated weights for policy 0, policy_version 55130 (0.0011) [2023-10-12 18:04:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 112885760. Throughput: 0: 1701.9, 1: 1663.6. Samples: 28222616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:04:38,435][61643] Avg episode reward: [(0, '25.060'), (1, '9.800')] [2023-10-12 18:04:40,211][62635] Updated weights for policy 1, policy_version 55110 (0.0007) [2023-10-12 18:04:40,575][62635] Updated weights for policy 1, policy_version 55120 (0.0007) [2023-10-12 18:04:40,934][62635] Updated weights for policy 1, policy_version 55130 (0.0008) [2023-10-12 18:04:42,050][62634] Updated weights for policy 0, policy_version 55140 (0.0010) [2023-10-12 18:04:42,427][62634] Updated weights for policy 0, policy_version 55150 (0.0009) [2023-10-12 18:04:42,814][62634] Updated weights for policy 0, policy_version 55160 (0.0010) [2023-10-12 18:04:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112951296. Throughput: 0: 1692.8, 1: 1678.5. Samples: 28242918. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:04:43,435][61643] Avg episode reward: [(0, '24.800'), (1, '9.870')] [2023-10-12 18:04:44,769][62635] Updated weights for policy 1, policy_version 55140 (0.0008) [2023-10-12 18:04:45,138][62635] Updated weights for policy 1, policy_version 55150 (0.0008) [2023-10-12 18:04:45,514][62635] Updated weights for policy 1, policy_version 55160 (0.0008) [2023-10-12 18:04:46,718][62634] Updated weights for policy 0, policy_version 55170 (0.0007) [2023-10-12 18:04:47,092][62634] Updated weights for policy 0, policy_version 55180 (0.0008) [2023-10-12 18:04:47,469][62634] Updated weights for policy 0, policy_version 55190 (0.0008) [2023-10-12 18:04:47,851][62634] Updated weights for policy 0, policy_version 55200 (0.0009) [2023-10-12 18:04:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113016832. Throughput: 0: 1664.8, 1: 1685.8. Samples: 28262592. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:04:48,435][61643] Avg episode reward: [(0, '24.630'), (1, '9.800')] [2023-10-12 18:04:49,842][62635] Updated weights for policy 1, policy_version 55170 (0.0010) [2023-10-12 18:04:50,218][62635] Updated weights for policy 1, policy_version 55180 (0.0010) [2023-10-12 18:04:50,588][62635] Updated weights for policy 1, policy_version 55190 (0.0008) [2023-10-12 18:04:50,958][62635] Updated weights for policy 1, policy_version 55200 (0.0007) [2023-10-12 18:04:51,934][62634] Updated weights for policy 0, policy_version 55210 (0.0008) [2023-10-12 18:04:52,324][62634] Updated weights for policy 0, policy_version 55220 (0.0008) [2023-10-12 18:04:52,693][62634] Updated weights for policy 0, policy_version 55230 (0.0010) [2023-10-12 18:04:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 113082368. Throughput: 0: 1694.0, 1: 1661.8. Samples: 28272808. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:04:53,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.930')] [2023-10-12 18:04:55,153][62635] Updated weights for policy 1, policy_version 55210 (0.0009) [2023-10-12 18:04:55,525][62635] Updated weights for policy 1, policy_version 55220 (0.0010) [2023-10-12 18:04:55,893][62635] Updated weights for policy 1, policy_version 55230 (0.0007) [2023-10-12 18:04:56,690][62634] Updated weights for policy 0, policy_version 55240 (0.0009) [2023-10-12 18:04:57,072][62634] Updated weights for policy 0, policy_version 55250 (0.0009) [2023-10-12 18:04:57,448][62634] Updated weights for policy 0, policy_version 55260 (0.0009) [2023-10-12 18:04:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113147904. Throughput: 0: 1677.7, 1: 1683.7. Samples: 28292970. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:04:58,436][61643] Avg episode reward: [(0, '24.620'), (1, '9.750')] [2023-10-12 18:04:59,907][62635] Updated weights for policy 1, policy_version 55240 (0.0008) [2023-10-12 18:05:00,286][62635] Updated weights for policy 1, policy_version 55250 (0.0008) [2023-10-12 18:05:00,653][62635] Updated weights for policy 1, policy_version 55260 (0.0007) [2023-10-12 18:05:01,462][62634] Updated weights for policy 0, policy_version 55270 (0.0010) [2023-10-12 18:05:01,835][62634] Updated weights for policy 0, policy_version 55280 (0.0009) [2023-10-12 18:05:02,210][62634] Updated weights for policy 0, policy_version 55290 (0.0010) [2023-10-12 18:05:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113213440. Throughput: 0: 1670.4, 1: 1679.1. Samples: 28312990. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:05:03,436][61643] Avg episode reward: [(0, '24.450'), (1, '9.640')] [2023-10-12 18:05:04,486][62635] Updated weights for policy 1, policy_version 55270 (0.0008) [2023-10-12 18:05:04,844][62635] Updated weights for policy 1, policy_version 55280 (0.0008) [2023-10-12 18:05:05,205][62635] Updated weights for policy 1, policy_version 55290 (0.0010) [2023-10-12 18:05:06,322][62634] Updated weights for policy 0, policy_version 55300 (0.0010) [2023-10-12 18:05:06,700][62634] Updated weights for policy 0, policy_version 55310 (0.0010) [2023-10-12 18:05:07,075][62634] Updated weights for policy 0, policy_version 55320 (0.0010) [2023-10-12 18:05:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 113278976. Throughput: 0: 1689.1, 1: 1667.1. Samples: 28323450. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:05:08,435][61643] Avg episode reward: [(0, '24.520'), (1, '9.700')] [2023-10-12 18:05:09,094][62635] Updated weights for policy 1, policy_version 55300 (0.0007) [2023-10-12 18:05:09,461][62635] Updated weights for policy 1, policy_version 55310 (0.0008) [2023-10-12 18:05:09,820][62635] Updated weights for policy 1, policy_version 55320 (0.0009) [2023-10-12 18:05:11,229][62634] Updated weights for policy 0, policy_version 55330 (0.0008) [2023-10-12 18:05:11,592][62634] Updated weights for policy 0, policy_version 55340 (0.0008) [2023-10-12 18:05:11,966][62634] Updated weights for policy 0, policy_version 55350 (0.0009) [2023-10-12 18:05:12,349][62634] Updated weights for policy 0, policy_version 55360 (0.0007) [2023-10-12 18:05:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113344512. Throughput: 0: 1666.8, 1: 1687.2. Samples: 28343618. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:05:13,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.780')] [2023-10-12 18:05:13,965][62635] Updated weights for policy 1, policy_version 55330 (0.0007) [2023-10-12 18:05:14,325][62635] Updated weights for policy 1, policy_version 55340 (0.0008) [2023-10-12 18:05:14,696][62635] Updated weights for policy 1, policy_version 55350 (0.0007) [2023-10-12 18:05:15,054][62635] Updated weights for policy 1, policy_version 55360 (0.0007) [2023-10-12 18:05:16,414][62634] Updated weights for policy 0, policy_version 55370 (0.0011) [2023-10-12 18:05:16,798][62634] Updated weights for policy 0, policy_version 55380 (0.0010) [2023-10-12 18:05:17,176][62634] Updated weights for policy 0, policy_version 55390 (0.0007) [2023-10-12 18:05:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 113410048. Throughput: 0: 1677.6, 1: 1689.6. Samples: 28363986. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:05:18,436][61643] Avg episode reward: [(0, '24.360'), (1, '9.680')] [2023-10-12 18:05:18,968][62635] Updated weights for policy 1, policy_version 55370 (0.0010) [2023-10-12 18:05:19,350][62635] Updated weights for policy 1, policy_version 55380 (0.0010) [2023-10-12 18:05:19,719][62635] Updated weights for policy 1, policy_version 55390 (0.0008) [2023-10-12 18:05:21,108][62634] Updated weights for policy 0, policy_version 55400 (0.0009) [2023-10-12 18:05:21,484][62634] Updated weights for policy 0, policy_version 55410 (0.0011) [2023-10-12 18:05:21,855][62634] Updated weights for policy 0, policy_version 55420 (0.0011) [2023-10-12 18:05:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113475584. Throughput: 0: 1683.6, 1: 1686.8. Samples: 28374286. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:23,435][61643] Avg episode reward: [(0, '24.470'), (1, '9.770')] [2023-10-12 18:05:23,824][62635] Updated weights for policy 1, policy_version 55400 (0.0008) [2023-10-12 18:05:24,193][62635] Updated weights for policy 1, policy_version 55410 (0.0007) [2023-10-12 18:05:24,553][62635] Updated weights for policy 1, policy_version 55420 (0.0008) [2023-10-12 18:05:25,882][62634] Updated weights for policy 0, policy_version 55430 (0.0008) [2023-10-12 18:05:26,255][62634] Updated weights for policy 0, policy_version 55440 (0.0007) [2023-10-12 18:05:26,632][62634] Updated weights for policy 0, policy_version 55450 (0.0007) [2023-10-12 18:05:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113541120. Throughput: 0: 1662.1, 1: 1697.5. Samples: 28394098. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:28,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.910')] [2023-10-12 18:05:28,723][62635] Updated weights for policy 1, policy_version 55430 (0.0008) [2023-10-12 18:05:29,083][62635] Updated weights for policy 1, policy_version 55440 (0.0008) [2023-10-12 18:05:29,447][62635] Updated weights for policy 1, policy_version 55450 (0.0008) [2023-10-12 18:05:30,766][62634] Updated weights for policy 0, policy_version 55460 (0.0009) [2023-10-12 18:05:31,141][62634] Updated weights for policy 0, policy_version 55470 (0.0009) [2023-10-12 18:05:31,522][62634] Updated weights for policy 0, policy_version 55480 (0.0009) [2023-10-12 18:05:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113606656. Throughput: 0: 1687.6, 1: 1692.0. Samples: 28414676. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:33,435][61643] Avg episode reward: [(0, '24.350'), (1, '9.740')] [2023-10-12 18:05:33,578][62635] Updated weights for policy 1, policy_version 55460 (0.0009) [2023-10-12 18:05:33,944][62635] Updated weights for policy 1, policy_version 55470 (0.0007) [2023-10-12 18:05:34,307][62635] Updated weights for policy 1, policy_version 55480 (0.0007) [2023-10-12 18:05:35,541][62634] Updated weights for policy 0, policy_version 55490 (0.0008) [2023-10-12 18:05:35,911][62634] Updated weights for policy 0, policy_version 55500 (0.0007) [2023-10-12 18:05:36,293][62634] Updated weights for policy 0, policy_version 55510 (0.0008) [2023-10-12 18:05:36,660][62634] Updated weights for policy 0, policy_version 55520 (0.0008) [2023-10-12 18:05:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113672192. Throughput: 0: 1676.7, 1: 1695.0. Samples: 28424536. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:38,435][61643] Avg episode reward: [(0, '24.200'), (1, '9.880')] [2023-10-12 18:05:38,445][62635] Updated weights for policy 1, policy_version 55490 (0.0008) [2023-10-12 18:05:38,810][62635] Updated weights for policy 1, policy_version 55500 (0.0008) [2023-10-12 18:05:39,183][62635] Updated weights for policy 1, policy_version 55510 (0.0009) [2023-10-12 18:05:39,543][62635] Updated weights for policy 1, policy_version 55520 (0.0007) [2023-10-12 18:05:40,822][62634] Updated weights for policy 0, policy_version 55530 (0.0009) [2023-10-12 18:05:41,203][62634] Updated weights for policy 0, policy_version 55540 (0.0009) [2023-10-12 18:05:41,570][62634] Updated weights for policy 0, policy_version 55550 (0.0007) [2023-10-12 18:05:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113737728. Throughput: 0: 1662.6, 1: 1691.2. Samples: 28443890. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:43,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.950')] [2023-10-12 18:05:43,608][62635] Updated weights for policy 1, policy_version 55530 (0.0009) [2023-10-12 18:05:43,975][62635] Updated weights for policy 1, policy_version 55540 (0.0008) [2023-10-12 18:05:44,337][62635] Updated weights for policy 1, policy_version 55550 (0.0008) [2023-10-12 18:05:45,511][62634] Updated weights for policy 0, policy_version 55560 (0.0007) [2023-10-12 18:05:45,892][62634] Updated weights for policy 0, policy_version 55570 (0.0007) [2023-10-12 18:05:46,266][62634] Updated weights for policy 0, policy_version 55580 (0.0008) [2023-10-12 18:05:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 113803264. Throughput: 0: 1675.1, 1: 1691.1. Samples: 28464470. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:48,436][61643] Avg episode reward: [(0, '24.120'), (1, '9.730')] [2023-10-12 18:05:48,478][62635] Updated weights for policy 1, policy_version 55560 (0.0009) [2023-10-12 18:05:48,850][62635] Updated weights for policy 1, policy_version 55570 (0.0009) [2023-10-12 18:05:49,217][62635] Updated weights for policy 1, policy_version 55580 (0.0009) [2023-10-12 18:05:50,394][62634] Updated weights for policy 0, policy_version 55590 (0.0008) [2023-10-12 18:05:50,766][62634] Updated weights for policy 0, policy_version 55600 (0.0010) [2023-10-12 18:05:51,142][62634] Updated weights for policy 0, policy_version 55610 (0.0008) [2023-10-12 18:05:53,291][62635] Updated weights for policy 1, policy_version 55590 (0.0010) [2023-10-12 18:05:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113868800. Throughput: 0: 1658.0, 1: 1688.3. Samples: 28474032. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:53,435][61643] Avg episode reward: [(0, '24.330'), (1, '9.910')] [2023-10-12 18:05:53,655][62635] Updated weights for policy 1, policy_version 55600 (0.0010) [2023-10-12 18:05:54,033][62635] Updated weights for policy 1, policy_version 55610 (0.0011) [2023-10-12 18:05:55,333][62634] Updated weights for policy 0, policy_version 55620 (0.0010) [2023-10-12 18:05:55,720][62634] Updated weights for policy 0, policy_version 55630 (0.0009) [2023-10-12 18:05:56,108][62634] Updated weights for policy 0, policy_version 55640 (0.0011) [2023-10-12 18:05:58,205][62635] Updated weights for policy 1, policy_version 55620 (0.0007) [2023-10-12 18:05:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113934336. Throughput: 0: 1668.8, 1: 1679.9. Samples: 28494308. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-12 18:05:58,436][61643] Avg episode reward: [(0, '24.470'), (1, '9.830')] [2023-10-12 18:05:58,569][62635] Updated weights for policy 1, policy_version 55630 (0.0009) [2023-10-12 18:05:58,941][62635] Updated weights for policy 1, policy_version 55640 (0.0008) [2023-10-12 18:05:59,899][62634] Updated weights for policy 0, policy_version 55650 (0.0010) [2023-10-12 18:06:00,283][62634] Updated weights for policy 0, policy_version 55660 (0.0009) [2023-10-12 18:06:00,657][62634] Updated weights for policy 0, policy_version 55670 (0.0007) [2023-10-12 18:06:01,032][62634] Updated weights for policy 0, policy_version 55680 (0.0008) [2023-10-12 18:06:03,020][62635] Updated weights for policy 1, policy_version 55650 (0.0008) [2023-10-12 18:06:03,397][62635] Updated weights for policy 1, policy_version 55660 (0.0008) [2023-10-12 18:06:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113999872. Throughput: 0: 1685.7, 1: 1678.4. Samples: 28515370. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:03,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.850')] [2023-10-12 18:06:03,763][62635] Updated weights for policy 1, policy_version 55670 (0.0010) [2023-10-12 18:06:04,130][62635] Updated weights for policy 1, policy_version 55680 (0.0007) [2023-10-12 18:06:05,014][62634] Updated weights for policy 0, policy_version 55690 (0.0007) [2023-10-12 18:06:05,388][62634] Updated weights for policy 0, policy_version 55700 (0.0007) [2023-10-12 18:06:05,765][62634] Updated weights for policy 0, policy_version 55710 (0.0009) [2023-10-12 18:06:08,371][62635] Updated weights for policy 1, policy_version 55690 (0.0010) [2023-10-12 18:06:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114065408. Throughput: 0: 1658.5, 1: 1681.8. Samples: 28524602. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:08,435][61643] Avg episode reward: [(0, '24.480'), (1, '9.940')] [2023-10-12 18:06:08,734][62635] Updated weights for policy 1, policy_version 55700 (0.0007) [2023-10-12 18:06:09,104][62635] Updated weights for policy 1, policy_version 55710 (0.0009) [2023-10-12 18:06:09,973][62634] Updated weights for policy 0, policy_version 55720 (0.0010) [2023-10-12 18:06:10,340][62634] Updated weights for policy 0, policy_version 55730 (0.0007) [2023-10-12 18:06:10,721][62634] Updated weights for policy 0, policy_version 55740 (0.0009) [2023-10-12 18:06:13,000][62635] Updated weights for policy 1, policy_version 55720 (0.0007) [2023-10-12 18:06:13,374][62635] Updated weights for policy 1, policy_version 55730 (0.0007) [2023-10-12 18:06:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114130944. Throughput: 0: 1680.7, 1: 1676.5. Samples: 28545172. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:13,435][61643] Avg episode reward: [(0, '24.510'), (1, '9.980')] [2023-10-12 18:06:13,742][62635] Updated weights for policy 1, policy_version 55740 (0.0007) [2023-10-12 18:06:14,921][62634] Updated weights for policy 0, policy_version 55750 (0.0008) [2023-10-12 18:06:15,296][62634] Updated weights for policy 0, policy_version 55760 (0.0007) [2023-10-12 18:06:15,675][62634] Updated weights for policy 0, policy_version 55770 (0.0008) [2023-10-12 18:06:17,812][62635] Updated weights for policy 1, policy_version 55750 (0.0010) [2023-10-12 18:06:18,179][62635] Updated weights for policy 1, policy_version 55760 (0.0009) [2023-10-12 18:06:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114196480. Throughput: 0: 1681.6, 1: 1667.4. Samples: 28565380. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:18,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.930')] [2023-10-12 18:06:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000055776_57114624.pth... [2023-10-12 18:06:18,478][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000054208_55508992.pth [2023-10-12 18:06:18,546][62635] Updated weights for policy 1, policy_version 55770 (0.0009) [2023-10-12 18:06:18,764][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000055776_57114624.pth... 
[2023-10-12 18:06:18,798][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000054208_55508992.pth [2023-10-12 18:06:19,600][62634] Updated weights for policy 0, policy_version 55780 (0.0007) [2023-10-12 18:06:19,970][62634] Updated weights for policy 0, policy_version 55790 (0.0008) [2023-10-12 18:06:20,350][62634] Updated weights for policy 0, policy_version 55800 (0.0008) [2023-10-12 18:06:22,648][62635] Updated weights for policy 1, policy_version 55780 (0.0009) [2023-10-12 18:06:23,013][62635] Updated weights for policy 1, policy_version 55790 (0.0008) [2023-10-12 18:06:23,378][62635] Updated weights for policy 1, policy_version 55800 (0.0008) [2023-10-12 18:06:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114262016. Throughput: 0: 1663.8, 1: 1678.5. Samples: 28574942. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:23,436][61643] Avg episode reward: [(0, '24.580'), (1, '9.910')] [2023-10-12 18:06:24,515][62634] Updated weights for policy 0, policy_version 55810 (0.0009) [2023-10-12 18:06:24,902][62634] Updated weights for policy 0, policy_version 55820 (0.0010) [2023-10-12 18:06:25,280][62634] Updated weights for policy 0, policy_version 55830 (0.0010) [2023-10-12 18:06:25,666][62634] Updated weights for policy 0, policy_version 55840 (0.0010) [2023-10-12 18:06:27,241][62635] Updated weights for policy 1, policy_version 55810 (0.0007) [2023-10-12 18:06:27,608][62635] Updated weights for policy 1, policy_version 55820 (0.0007) [2023-10-12 18:06:27,972][62635] Updated weights for policy 1, policy_version 55830 (0.0007) [2023-10-12 18:06:28,342][62635] Updated weights for policy 1, policy_version 55840 (0.0008) [2023-10-12 18:06:28,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114360320. Throughput: 0: 1686.3, 1: 1688.3. Samples: 28595744. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:28,435][61643] Avg episode reward: [(0, '24.390'), (1, '9.990')] [2023-10-12 18:06:29,745][62634] Updated weights for policy 0, policy_version 55850 (0.0008) [2023-10-12 18:06:30,113][62634] Updated weights for policy 0, policy_version 55860 (0.0007) [2023-10-12 18:06:30,487][62634] Updated weights for policy 0, policy_version 55870 (0.0008) [2023-10-12 18:06:32,180][62635] Updated weights for policy 1, policy_version 55850 (0.0010) [2023-10-12 18:06:32,558][62635] Updated weights for policy 1, policy_version 55860 (0.0007) [2023-10-12 18:06:32,932][62635] Updated weights for policy 1, policy_version 55870 (0.0010) [2023-10-12 18:06:33,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114425856. Throughput: 0: 1684.4, 1: 1668.0. Samples: 28615332. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:33,436][61643] Avg episode reward: [(0, '24.650'), (1, '9.850')] [2023-10-12 18:06:34,456][62634] Updated weights for policy 0, policy_version 55880 (0.0008) [2023-10-12 18:06:34,828][62634] Updated weights for policy 0, policy_version 55890 (0.0007) [2023-10-12 18:06:35,211][62634] Updated weights for policy 0, policy_version 55900 (0.0008) [2023-10-12 18:06:37,032][62635] Updated weights for policy 1, policy_version 55880 (0.0009) [2023-10-12 18:06:37,406][62635] Updated weights for policy 1, policy_version 55890 (0.0009) [2023-10-12 18:06:37,773][62635] Updated weights for policy 1, policy_version 55900 (0.0010) [2023-10-12 18:06:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114491392. Throughput: 0: 1674.1, 1: 1696.8. Samples: 28625720. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-12 18:06:38,436][61643] Avg episode reward: [(0, '24.720'), (1, '9.850')] [2023-10-12 18:06:39,279][62634] Updated weights for policy 0, policy_version 55910 (0.0008) [2023-10-12 18:06:39,659][62634] Updated weights for policy 0, policy_version 55920 (0.0009) [2023-10-12 18:06:40,030][62634] Updated weights for policy 0, policy_version 55930 (0.0010) [2023-10-12 18:06:41,914][62635] Updated weights for policy 1, policy_version 55910 (0.0010) [2023-10-12 18:06:42,292][62635] Updated weights for policy 1, policy_version 55920 (0.0009) [2023-10-12 18:06:42,649][62635] Updated weights for policy 1, policy_version 55930 (0.0009) [2023-10-12 18:06:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114556928. Throughput: 0: 1683.3, 1: 1685.7. Samples: 28645914. Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:06:43,436][61643] Avg episode reward: [(0, '24.580'), (1, '9.770')] [2023-10-12 18:06:44,141][62634] Updated weights for policy 0, policy_version 55940 (0.0007) [2023-10-12 18:06:44,515][62634] Updated weights for policy 0, policy_version 55950 (0.0007) [2023-10-12 18:06:44,901][62634] Updated weights for policy 0, policy_version 55960 (0.0007) [2023-10-12 18:06:46,697][62635] Updated weights for policy 1, policy_version 55940 (0.0009) [2023-10-12 18:06:47,060][62635] Updated weights for policy 1, policy_version 55950 (0.0009) [2023-10-12 18:06:47,425][62635] Updated weights for policy 1, policy_version 55960 (0.0007) [2023-10-12 18:06:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114622464. Throughput: 0: 1679.5, 1: 1664.8. Samples: 28665864. 
Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:06:48,436][61643] Avg episode reward: [(0, '24.540'), (1, '9.770')] [2023-10-12 18:06:49,088][62634] Updated weights for policy 0, policy_version 55970 (0.0008) [2023-10-12 18:06:49,455][62634] Updated weights for policy 0, policy_version 55980 (0.0008) [2023-10-12 18:06:49,830][62634] Updated weights for policy 0, policy_version 55990 (0.0010) [2023-10-12 18:06:50,207][62634] Updated weights for policy 0, policy_version 56000 (0.0007) [2023-10-12 18:06:51,337][62635] Updated weights for policy 1, policy_version 55970 (0.0008) [2023-10-12 18:06:51,713][62635] Updated weights for policy 1, policy_version 55980 (0.0008) [2023-10-12 18:06:52,072][62635] Updated weights for policy 1, policy_version 55990 (0.0008) [2023-10-12 18:06:52,443][62635] Updated weights for policy 1, policy_version 56000 (0.0007) [2023-10-12 18:06:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114688000. Throughput: 0: 1678.6, 1: 1692.6. Samples: 28676308. Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:06:53,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.800')] [2023-10-12 18:06:54,169][62634] Updated weights for policy 0, policy_version 56010 (0.0007) [2023-10-12 18:06:54,541][62634] Updated weights for policy 0, policy_version 56020 (0.0007) [2023-10-12 18:06:54,924][62634] Updated weights for policy 0, policy_version 56030 (0.0009) [2023-10-12 18:06:56,555][62635] Updated weights for policy 1, policy_version 56010 (0.0008) [2023-10-12 18:06:56,913][62635] Updated weights for policy 1, policy_version 56020 (0.0011) [2023-10-12 18:06:57,282][62635] Updated weights for policy 1, policy_version 56030 (0.0010) [2023-10-12 18:06:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114753536. Throughput: 0: 1681.8, 1: 1672.9. Samples: 28696134. 
Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:06:58,436][61643] Avg episode reward: [(0, '24.740'), (1, '9.840')] [2023-10-12 18:06:58,982][62634] Updated weights for policy 0, policy_version 56040 (0.0010) [2023-10-12 18:06:59,365][62634] Updated weights for policy 0, policy_version 56050 (0.0010) [2023-10-12 18:06:59,746][62634] Updated weights for policy 0, policy_version 56060 (0.0009) [2023-10-12 18:07:01,437][62635] Updated weights for policy 1, policy_version 56040 (0.0009) [2023-10-12 18:07:01,808][62635] Updated weights for policy 1, policy_version 56050 (0.0007) [2023-10-12 18:07:02,188][62635] Updated weights for policy 1, policy_version 56060 (0.0008) [2023-10-12 18:07:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114819072. Throughput: 0: 1685.1, 1: 1677.6. Samples: 28716704. Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:07:03,436][61643] Avg episode reward: [(0, '24.720'), (1, '9.840')] [2023-10-12 18:07:03,607][62634] Updated weights for policy 0, policy_version 56070 (0.0010) [2023-10-12 18:07:03,973][62634] Updated weights for policy 0, policy_version 56080 (0.0007) [2023-10-12 18:07:04,354][62634] Updated weights for policy 0, policy_version 56090 (0.0009) [2023-10-12 18:07:06,249][62635] Updated weights for policy 1, policy_version 56070 (0.0009) [2023-10-12 18:07:06,609][62635] Updated weights for policy 1, policy_version 56080 (0.0007) [2023-10-12 18:07:06,977][62635] Updated weights for policy 1, policy_version 56090 (0.0007) [2023-10-12 18:07:08,417][62634] Updated weights for policy 0, policy_version 56100 (0.0010) [2023-10-12 18:07:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114884608. Throughput: 0: 1686.9, 1: 1693.7. Samples: 28727070. 
Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:07:08,436][61643] Avg episode reward: [(0, '24.800'), (1, '9.880')] [2023-10-12 18:07:08,795][62634] Updated weights for policy 0, policy_version 56110 (0.0008) [2023-10-12 18:07:09,175][62634] Updated weights for policy 0, policy_version 56120 (0.0009) [2023-10-12 18:07:11,059][62635] Updated weights for policy 1, policy_version 56100 (0.0007) [2023-10-12 18:07:11,426][62635] Updated weights for policy 1, policy_version 56110 (0.0007) [2023-10-12 18:07:11,793][62635] Updated weights for policy 1, policy_version 56120 (0.0008) [2023-10-12 18:07:13,199][62634] Updated weights for policy 0, policy_version 56130 (0.0009) [2023-10-12 18:07:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114950144. Throughput: 0: 1690.9, 1: 1664.7. Samples: 28746746. Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:07:13,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.880')] [2023-10-12 18:07:13,565][62634] Updated weights for policy 0, policy_version 56140 (0.0008) [2023-10-12 18:07:13,948][62634] Updated weights for policy 0, policy_version 56150 (0.0009) [2023-10-12 18:07:14,321][62634] Updated weights for policy 0, policy_version 56160 (0.0007) [2023-10-12 18:07:15,654][62635] Updated weights for policy 1, policy_version 56130 (0.0008) [2023-10-12 18:07:16,020][62635] Updated weights for policy 1, policy_version 56140 (0.0008) [2023-10-12 18:07:16,396][62635] Updated weights for policy 1, policy_version 56150 (0.0008) [2023-10-12 18:07:16,770][62635] Updated weights for policy 1, policy_version 56160 (0.0011) [2023-10-12 18:07:18,340][62634] Updated weights for policy 0, policy_version 56170 (0.0008) [2023-10-12 18:07:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 115015680. Throughput: 0: 1694.4, 1: 1692.9. Samples: 28767760. 
Policy #0 lag: (min: 28.0, avg: 28.8, max: 47.0) [2023-10-12 18:07:18,435][61643] Avg episode reward: [(0, '24.690'), (1, '9.850')] [2023-10-12 18:07:18,711][62634] Updated weights for policy 0, policy_version 56180 (0.0008) [2023-10-12 18:07:19,088][62634] Updated weights for policy 0, policy_version 56190 (0.0007) [2023-10-12 18:07:20,741][62635] Updated weights for policy 1, policy_version 56170 (0.0008) [2023-10-12 18:07:21,103][62635] Updated weights for policy 1, policy_version 56180 (0.0011) [2023-10-12 18:07:21,468][62635] Updated weights for policy 1, policy_version 56190 (0.0009) [2023-10-12 18:07:23,095][62634] Updated weights for policy 0, policy_version 56200 (0.0009) [2023-10-12 18:07:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 115081216. Throughput: 0: 1694.4, 1: 1681.3. Samples: 28777624. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:23,436][61643] Avg episode reward: [(0, '24.460'), (1, '9.800')] [2023-10-12 18:07:23,472][62634] Updated weights for policy 0, policy_version 56210 (0.0008) [2023-10-12 18:07:23,858][62634] Updated weights for policy 0, policy_version 56220 (0.0010) [2023-10-12 18:07:25,647][62635] Updated weights for policy 1, policy_version 56200 (0.0007) [2023-10-12 18:07:26,008][62635] Updated weights for policy 1, policy_version 56210 (0.0007) [2023-10-12 18:07:26,370][62635] Updated weights for policy 1, policy_version 56220 (0.0009) [2023-10-12 18:07:27,989][62634] Updated weights for policy 0, policy_version 56230 (0.0009) [2023-10-12 18:07:28,356][62634] Updated weights for policy 0, policy_version 56240 (0.0008) [2023-10-12 18:07:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115146752. Throughput: 0: 1694.0, 1: 1675.5. Samples: 28797540. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:28,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.720')] [2023-10-12 18:07:28,732][62634] Updated weights for policy 0, policy_version 56250 (0.0010) [2023-10-12 18:07:30,498][62635] Updated weights for policy 1, policy_version 56230 (0.0008) [2023-10-12 18:07:30,884][62635] Updated weights for policy 1, policy_version 56240 (0.0008) [2023-10-12 18:07:31,252][62635] Updated weights for policy 1, policy_version 56250 (0.0007) [2023-10-12 18:07:32,712][62634] Updated weights for policy 0, policy_version 56260 (0.0009) [2023-10-12 18:07:33,094][62634] Updated weights for policy 0, policy_version 56270 (0.0008) [2023-10-12 18:07:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115212288. Throughput: 0: 1678.9, 1: 1697.3. Samples: 28817790. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:33,435][61643] Avg episode reward: [(0, '24.050'), (1, '9.710')] [2023-10-12 18:07:33,478][62634] Updated weights for policy 0, policy_version 56280 (0.0008) [2023-10-12 18:07:35,134][62635] Updated weights for policy 1, policy_version 56260 (0.0008) [2023-10-12 18:07:35,500][62635] Updated weights for policy 1, policy_version 56270 (0.0010) [2023-10-12 18:07:35,864][62635] Updated weights for policy 1, policy_version 56280 (0.0008) [2023-10-12 18:07:37,560][62634] Updated weights for policy 0, policy_version 56290 (0.0009) [2023-10-12 18:07:37,936][62634] Updated weights for policy 0, policy_version 56300 (0.0008) [2023-10-12 18:07:38,315][62634] Updated weights for policy 0, policy_version 56310 (0.0008) [2023-10-12 18:07:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115277824. Throughput: 0: 1687.4, 1: 1674.3. Samples: 28827584. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:38,436][61643] Avg episode reward: [(0, '24.110'), (1, '9.990')] [2023-10-12 18:07:38,689][62634] Updated weights for policy 0, policy_version 56320 (0.0010) [2023-10-12 18:07:40,141][62635] Updated weights for policy 1, policy_version 56290 (0.0008) [2023-10-12 18:07:40,507][62635] Updated weights for policy 1, policy_version 56300 (0.0009) [2023-10-12 18:07:40,878][62635] Updated weights for policy 1, policy_version 56310 (0.0009) [2023-10-12 18:07:41,237][62635] Updated weights for policy 1, policy_version 56320 (0.0007) [2023-10-12 18:07:42,704][62634] Updated weights for policy 0, policy_version 56330 (0.0007) [2023-10-12 18:07:43,080][62634] Updated weights for policy 0, policy_version 56340 (0.0007) [2023-10-12 18:07:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115343360. Throughput: 0: 1687.7, 1: 1687.9. Samples: 28848038. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:43,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.860')] [2023-10-12 18:07:43,457][62634] Updated weights for policy 0, policy_version 56350 (0.0009) [2023-10-12 18:07:45,164][62635] Updated weights for policy 1, policy_version 56330 (0.0007) [2023-10-12 18:07:45,537][62635] Updated weights for policy 1, policy_version 56340 (0.0008) [2023-10-12 18:07:45,917][62635] Updated weights for policy 1, policy_version 56350 (0.0009) [2023-10-12 18:07:47,474][62634] Updated weights for policy 0, policy_version 56360 (0.0009) [2023-10-12 18:07:47,854][62634] Updated weights for policy 0, policy_version 56370 (0.0009) [2023-10-12 18:07:48,237][62634] Updated weights for policy 0, policy_version 56380 (0.0007) [2023-10-12 18:07:48,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115441664. Throughput: 0: 1665.2, 1: 1697.9. Samples: 28868042. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:48,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.700')] [2023-10-12 18:07:50,048][62635] Updated weights for policy 1, policy_version 56360 (0.0010) [2023-10-12 18:07:50,415][62635] Updated weights for policy 1, policy_version 56370 (0.0011) [2023-10-12 18:07:50,786][62635] Updated weights for policy 1, policy_version 56380 (0.0010) [2023-10-12 18:07:52,358][62634] Updated weights for policy 0, policy_version 56390 (0.0009) [2023-10-12 18:07:52,735][62634] Updated weights for policy 0, policy_version 56400 (0.0007) [2023-10-12 18:07:53,112][62634] Updated weights for policy 0, policy_version 56410 (0.0007) [2023-10-12 18:07:53,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115507200. Throughput: 0: 1682.5, 1: 1668.3. Samples: 28877858. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-12 18:07:53,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.830')] [2023-10-12 18:07:54,797][62635] Updated weights for policy 1, policy_version 56390 (0.0009) [2023-10-12 18:07:55,160][62635] Updated weights for policy 1, policy_version 56400 (0.0011) [2023-10-12 18:07:55,539][62635] Updated weights for policy 1, policy_version 56410 (0.0010) [2023-10-12 18:07:57,117][62634] Updated weights for policy 0, policy_version 56420 (0.0008) [2023-10-12 18:07:57,498][62634] Updated weights for policy 0, policy_version 56430 (0.0009) [2023-10-12 18:07:57,874][62634] Updated weights for policy 0, policy_version 56440 (0.0011) [2023-10-12 18:07:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115572736. Throughput: 0: 1686.6, 1: 1694.9. Samples: 28898914. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:07:58,436][61643] Avg episode reward: [(0, '24.510'), (1, '9.680')] [2023-10-12 18:07:59,408][62635] Updated weights for policy 1, policy_version 56420 (0.0009) [2023-10-12 18:07:59,762][62635] Updated weights for policy 1, policy_version 56430 (0.0008) [2023-10-12 18:08:00,132][62635] Updated weights for policy 1, policy_version 56440 (0.0008) [2023-10-12 18:08:01,900][62634] Updated weights for policy 0, policy_version 56450 (0.0008) [2023-10-12 18:08:02,280][62634] Updated weights for policy 0, policy_version 56460 (0.0007) [2023-10-12 18:08:02,663][62634] Updated weights for policy 0, policy_version 56470 (0.0007) [2023-10-12 18:08:03,032][62634] Updated weights for policy 0, policy_version 56480 (0.0008) [2023-10-12 18:08:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115638272. Throughput: 0: 1661.7, 1: 1687.4. Samples: 28918468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:03,435][61643] Avg episode reward: [(0, '24.500'), (1, '9.640')] [2023-10-12 18:08:04,474][62635] Updated weights for policy 1, policy_version 56450 (0.0007) [2023-10-12 18:08:04,843][62635] Updated weights for policy 1, policy_version 56460 (0.0007) [2023-10-12 18:08:05,210][62635] Updated weights for policy 1, policy_version 56470 (0.0008) [2023-10-12 18:08:05,590][62635] Updated weights for policy 1, policy_version 56480 (0.0007) [2023-10-12 18:08:07,126][62634] Updated weights for policy 0, policy_version 56490 (0.0008) [2023-10-12 18:08:07,489][62634] Updated weights for policy 0, policy_version 56500 (0.0009) [2023-10-12 18:08:07,864][62634] Updated weights for policy 0, policy_version 56510 (0.0007) [2023-10-12 18:08:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 115703808. Throughput: 0: 1686.0, 1: 1668.8. Samples: 28928588. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:08,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.480')] [2023-10-12 18:08:09,524][62635] Updated weights for policy 1, policy_version 56490 (0.0008) [2023-10-12 18:08:09,888][62635] Updated weights for policy 1, policy_version 56500 (0.0010) [2023-10-12 18:08:10,261][62635] Updated weights for policy 1, policy_version 56510 (0.0009) [2023-10-12 18:08:11,817][62634] Updated weights for policy 0, policy_version 56520 (0.0007) [2023-10-12 18:08:12,200][62634] Updated weights for policy 0, policy_version 56530 (0.0007) [2023-10-12 18:08:12,578][62634] Updated weights for policy 0, policy_version 56540 (0.0007) [2023-10-12 18:08:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 115769344. Throughput: 0: 1675.2, 1: 1687.4. Samples: 28948856. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:13,435][61643] Avg episode reward: [(0, '24.230'), (1, '9.210')] [2023-10-12 18:08:14,402][62635] Updated weights for policy 1, policy_version 56520 (0.0007) [2023-10-12 18:08:14,769][62635] Updated weights for policy 1, policy_version 56530 (0.0007) [2023-10-12 18:08:15,131][62635] Updated weights for policy 1, policy_version 56540 (0.0007) [2023-10-12 18:08:16,720][62634] Updated weights for policy 0, policy_version 56550 (0.0007) [2023-10-12 18:08:17,095][62634] Updated weights for policy 0, policy_version 56560 (0.0008) [2023-10-12 18:08:17,471][62634] Updated weights for policy 0, policy_version 56570 (0.0007) [2023-10-12 18:08:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 115834880. Throughput: 0: 1665.8, 1: 1690.4. Samples: 28968822. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:18,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.310')] [2023-10-12 18:08:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000056544_57901056.pth... 
[2023-10-12 18:08:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000056576_57933824.pth... [2023-10-12 18:08:18,483][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000055008_56328192.pth [2023-10-12 18:08:18,488][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000054976_56295424.pth [2023-10-12 18:08:19,274][62635] Updated weights for policy 1, policy_version 56550 (0.0008) [2023-10-12 18:08:19,655][62635] Updated weights for policy 1, policy_version 56560 (0.0007) [2023-10-12 18:08:20,011][62635] Updated weights for policy 1, policy_version 56570 (0.0008) [2023-10-12 18:08:21,376][62634] Updated weights for policy 0, policy_version 56580 (0.0007) [2023-10-12 18:08:21,763][62634] Updated weights for policy 0, policy_version 56590 (0.0008) [2023-10-12 18:08:22,135][62634] Updated weights for policy 0, policy_version 56600 (0.0008) [2023-10-12 18:08:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 115900416. Throughput: 0: 1685.5, 1: 1677.2. Samples: 28978902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:23,435][61643] Avg episode reward: [(0, '24.150'), (1, '9.430')] [2023-10-12 18:08:24,098][62635] Updated weights for policy 1, policy_version 56580 (0.0007) [2023-10-12 18:08:24,462][62635] Updated weights for policy 1, policy_version 56590 (0.0010) [2023-10-12 18:08:24,835][62635] Updated weights for policy 1, policy_version 56600 (0.0008) [2023-10-12 18:08:26,187][62634] Updated weights for policy 0, policy_version 56610 (0.0008) [2023-10-12 18:08:26,560][62634] Updated weights for policy 0, policy_version 56620 (0.0010) [2023-10-12 18:08:26,936][62634] Updated weights for policy 0, policy_version 56630 (0.0008) [2023-10-12 18:08:27,317][62634] Updated weights for policy 0, policy_version 56640 (0.0009) [2023-10-12 18:08:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 115965952. Throughput: 0: 1666.7, 1: 1686.2. Samples: 28998918. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:28,436][61643] Avg episode reward: [(0, '24.120'), (1, '9.670')] [2023-10-12 18:08:28,870][62635] Updated weights for policy 1, policy_version 56610 (0.0008) [2023-10-12 18:08:29,244][62635] Updated weights for policy 1, policy_version 56620 (0.0008) [2023-10-12 18:08:29,616][62635] Updated weights for policy 1, policy_version 56630 (0.0009) [2023-10-12 18:08:29,990][62635] Updated weights for policy 1, policy_version 56640 (0.0009) [2023-10-12 18:08:31,371][62634] Updated weights for policy 0, policy_version 56650 (0.0010) [2023-10-12 18:08:31,742][62634] Updated weights for policy 0, policy_version 56660 (0.0011) [2023-10-12 18:08:32,115][62634] Updated weights for policy 0, policy_version 56670 (0.0011) [2023-10-12 18:08:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 116031488. Throughput: 0: 1677.4, 1: 1680.7. Samples: 29019158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:08:33,436][61643] Avg episode reward: [(0, '24.180'), (1, '9.570')] [2023-10-12 18:08:33,998][62635] Updated weights for policy 1, policy_version 56650 (0.0008) [2023-10-12 18:08:34,371][62635] Updated weights for policy 1, policy_version 56660 (0.0007) [2023-10-12 18:08:34,729][62635] Updated weights for policy 1, policy_version 56670 (0.0008) [2023-10-12 18:08:36,170][62634] Updated weights for policy 0, policy_version 56680 (0.0009) [2023-10-12 18:08:36,554][62634] Updated weights for policy 0, policy_version 56690 (0.0009) [2023-10-12 18:08:36,931][62634] Updated weights for policy 0, policy_version 56700 (0.0007) [2023-10-12 18:08:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 116097024. Throughput: 0: 1686.3, 1: 1684.2. Samples: 29029532. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:08:38,436][61643] Avg episode reward: [(0, '24.520'), (1, '9.990')] [2023-10-12 18:08:38,723][62635] Updated weights for policy 1, policy_version 56680 (0.0010) [2023-10-12 18:08:39,097][62635] Updated weights for policy 1, policy_version 56690 (0.0010) [2023-10-12 18:08:39,464][62635] Updated weights for policy 1, policy_version 56700 (0.0010) [2023-10-12 18:08:41,058][62634] Updated weights for policy 0, policy_version 56710 (0.0010) [2023-10-12 18:08:41,437][62634] Updated weights for policy 0, policy_version 56720 (0.0010) [2023-10-12 18:08:41,819][62634] Updated weights for policy 0, policy_version 56730 (0.0007) [2023-10-12 18:08:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 116162560. Throughput: 0: 1658.9, 1: 1679.6. Samples: 29049142. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:08:43,435][61643] Avg episode reward: [(0, '24.710'), (1, '10.000')] [2023-10-12 18:08:43,551][62635] Updated weights for policy 1, policy_version 56710 (0.0009) [2023-10-12 18:08:43,924][62635] Updated weights for policy 1, policy_version 56720 (0.0009) [2023-10-12 18:08:44,287][62635] Updated weights for policy 1, policy_version 56730 (0.0008) [2023-10-12 18:08:45,797][62634] Updated weights for policy 0, policy_version 56740 (0.0011) [2023-10-12 18:08:46,165][62634] Updated weights for policy 0, policy_version 56750 (0.0008) [2023-10-12 18:08:46,548][62634] Updated weights for policy 0, policy_version 56760 (0.0009) [2023-10-12 18:08:48,335][62635] Updated weights for policy 1, policy_version 56740 (0.0010) [2023-10-12 18:08:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116228096. Throughput: 0: 1681.3, 1: 1684.0. Samples: 29069910. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:08:48,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.850')] [2023-10-12 18:08:48,706][62635] Updated weights for policy 1, policy_version 56750 (0.0010) [2023-10-12 18:08:49,080][62635] Updated weights for policy 1, policy_version 56760 (0.0010) [2023-10-12 18:08:50,672][62634] Updated weights for policy 0, policy_version 56770 (0.0009) [2023-10-12 18:08:51,040][62634] Updated weights for policy 0, policy_version 56780 (0.0008) [2023-10-12 18:08:51,421][62634] Updated weights for policy 0, policy_version 56790 (0.0009) [2023-10-12 18:08:51,792][62634] Updated weights for policy 0, policy_version 56800 (0.0010) [2023-10-12 18:08:53,414][62635] Updated weights for policy 1, policy_version 56770 (0.0010) [2023-10-12 18:08:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116293632. Throughput: 0: 1677.2, 1: 1681.5. Samples: 29079728. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:08:53,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.870')] [2023-10-12 18:08:53,783][62635] Updated weights for policy 1, policy_version 56780 (0.0010) [2023-10-12 18:08:54,157][62635] Updated weights for policy 1, policy_version 56790 (0.0008) [2023-10-12 18:08:54,524][62635] Updated weights for policy 1, policy_version 56800 (0.0007) [2023-10-12 18:08:55,827][62634] Updated weights for policy 0, policy_version 56810 (0.0008) [2023-10-12 18:08:56,210][62634] Updated weights for policy 0, policy_version 56820 (0.0008) [2023-10-12 18:08:56,589][62634] Updated weights for policy 0, policy_version 56830 (0.0008) [2023-10-12 18:08:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116359168. Throughput: 0: 1672.7, 1: 1676.7. Samples: 29099578. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:08:58,435][61643] Avg episode reward: [(0, '24.610'), (1, '10.020')] [2023-10-12 18:08:58,543][62635] Updated weights for policy 1, policy_version 56810 (0.0009) [2023-10-12 18:08:58,911][62635] Updated weights for policy 1, policy_version 56820 (0.0008) [2023-10-12 18:08:59,276][62635] Updated weights for policy 1, policy_version 56830 (0.0011) [2023-10-12 18:09:00,687][62634] Updated weights for policy 0, policy_version 56840 (0.0010) [2023-10-12 18:09:01,058][62634] Updated weights for policy 0, policy_version 56850 (0.0009) [2023-10-12 18:09:01,442][62634] Updated weights for policy 0, policy_version 56860 (0.0007) [2023-10-12 18:09:03,364][62635] Updated weights for policy 1, policy_version 56840 (0.0008) [2023-10-12 18:09:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116424704. Throughput: 0: 1690.9, 1: 1669.1. Samples: 29120024. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:09:03,435][61643] Avg episode reward: [(0, '24.710'), (1, '9.840')] [2023-10-12 18:09:03,729][62635] Updated weights for policy 1, policy_version 56850 (0.0009) [2023-10-12 18:09:04,088][62635] Updated weights for policy 1, policy_version 56860 (0.0007) [2023-10-12 18:09:05,561][62634] Updated weights for policy 0, policy_version 56870 (0.0008) [2023-10-12 18:09:05,940][62634] Updated weights for policy 0, policy_version 56880 (0.0010) [2023-10-12 18:09:06,301][62634] Updated weights for policy 0, policy_version 56890 (0.0010) [2023-10-12 18:09:08,195][62635] Updated weights for policy 1, policy_version 56870 (0.0007) [2023-10-12 18:09:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116490240. Throughput: 0: 1676.0, 1: 1676.0. Samples: 29129742. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:09:08,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.840')] [2023-10-12 18:09:08,575][62635] Updated weights for policy 1, policy_version 56880 (0.0010) [2023-10-12 18:09:08,941][62635] Updated weights for policy 1, policy_version 56890 (0.0011) [2023-10-12 18:09:10,320][62634] Updated weights for policy 0, policy_version 56900 (0.0009) [2023-10-12 18:09:10,698][62634] Updated weights for policy 0, policy_version 56910 (0.0008) [2023-10-12 18:09:11,079][62634] Updated weights for policy 0, policy_version 56920 (0.0007) [2023-10-12 18:09:13,023][62635] Updated weights for policy 1, policy_version 56900 (0.0009) [2023-10-12 18:09:13,383][62635] Updated weights for policy 1, policy_version 56910 (0.0007) [2023-10-12 18:09:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116555776. Throughput: 0: 1677.1, 1: 1675.9. Samples: 29149802. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-12 18:09:13,436][61643] Avg episode reward: [(0, '24.760'), (1, '9.840')] [2023-10-12 18:09:13,741][62635] Updated weights for policy 1, policy_version 56920 (0.0009) [2023-10-12 18:09:15,165][62634] Updated weights for policy 0, policy_version 56930 (0.0007) [2023-10-12 18:09:15,546][62634] Updated weights for policy 0, policy_version 56940 (0.0008) [2023-10-12 18:09:15,915][62634] Updated weights for policy 0, policy_version 56950 (0.0008) [2023-10-12 18:09:16,298][62634] Updated weights for policy 0, policy_version 56960 (0.0007) [2023-10-12 18:09:17,640][62635] Updated weights for policy 1, policy_version 56930 (0.0008) [2023-10-12 18:09:18,003][62635] Updated weights for policy 1, policy_version 56940 (0.0008) [2023-10-12 18:09:18,374][62635] Updated weights for policy 1, policy_version 56950 (0.0008) [2023-10-12 18:09:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116621312. Throughput: 0: 1683.8, 1: 1672.8. 
Samples: 29170204. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:18,436][61643] Avg episode reward: [(0, '24.830'), (1, '9.750')] [2023-10-12 18:09:18,735][62635] Updated weights for policy 1, policy_version 56960 (0.0008) [2023-10-12 18:09:20,254][62634] Updated weights for policy 0, policy_version 56970 (0.0009) [2023-10-12 18:09:20,631][62634] Updated weights for policy 0, policy_version 56980 (0.0008) [2023-10-12 18:09:21,004][62634] Updated weights for policy 0, policy_version 56990 (0.0007) [2023-10-12 18:09:22,875][62635] Updated weights for policy 1, policy_version 56970 (0.0010) [2023-10-12 18:09:23,241][62635] Updated weights for policy 1, policy_version 56980 (0.0008) [2023-10-12 18:09:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 116686848. Throughput: 0: 1661.6, 1: 1679.9. Samples: 29179900. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:23,436][61643] Avg episode reward: [(0, '24.810'), (1, '9.910')] [2023-10-12 18:09:23,607][62635] Updated weights for policy 1, policy_version 56990 (0.0009) [2023-10-12 18:09:25,148][62634] Updated weights for policy 0, policy_version 57000 (0.0008) [2023-10-12 18:09:25,516][62634] Updated weights for policy 0, policy_version 57010 (0.0008) [2023-10-12 18:09:25,899][62634] Updated weights for policy 0, policy_version 57020 (0.0007) [2023-10-12 18:09:27,726][62635] Updated weights for policy 1, policy_version 57000 (0.0007) [2023-10-12 18:09:28,101][62635] Updated weights for policy 1, policy_version 57010 (0.0007) [2023-10-12 18:09:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116752384. Throughput: 0: 1680.3, 1: 1680.9. Samples: 29200396. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:28,435][61643] Avg episode reward: [(0, '24.820'), (1, '9.760')] [2023-10-12 18:09:28,478][62635] Updated weights for policy 1, policy_version 57020 (0.0009) [2023-10-12 18:09:29,768][62634] Updated weights for policy 0, policy_version 57030 (0.0009) [2023-10-12 18:09:30,139][62634] Updated weights for policy 0, policy_version 57040 (0.0007) [2023-10-12 18:09:30,512][62634] Updated weights for policy 0, policy_version 57050 (0.0009) [2023-10-12 18:09:32,489][62635] Updated weights for policy 1, policy_version 57030 (0.0008) [2023-10-12 18:09:32,865][62635] Updated weights for policy 1, policy_version 57040 (0.0007) [2023-10-12 18:09:33,231][62635] Updated weights for policy 1, policy_version 57050 (0.0010) [2023-10-12 18:09:33,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 116817920. Throughput: 0: 1687.9, 1: 1660.4. Samples: 29220586. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:33,435][61643] Avg episode reward: [(0, '24.650'), (1, '9.700')] [2023-10-12 18:09:34,571][62634] Updated weights for policy 0, policy_version 57060 (0.0008) [2023-10-12 18:09:34,941][62634] Updated weights for policy 0, policy_version 57070 (0.0007) [2023-10-12 18:09:35,321][62634] Updated weights for policy 0, policy_version 57080 (0.0010) [2023-10-12 18:09:37,202][62635] Updated weights for policy 1, policy_version 57060 (0.0008) [2023-10-12 18:09:37,577][62635] Updated weights for policy 1, policy_version 57070 (0.0007) [2023-10-12 18:09:37,950][62635] Updated weights for policy 1, policy_version 57080 (0.0008) [2023-10-12 18:09:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 116916224. Throughput: 0: 1666.4, 1: 1686.1. Samples: 29230590. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:38,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.780')] [2023-10-12 18:09:39,374][62634] Updated weights for policy 0, policy_version 57090 (0.0010) [2023-10-12 18:09:39,760][62634] Updated weights for policy 0, policy_version 57100 (0.0009) [2023-10-12 18:09:40,131][62634] Updated weights for policy 0, policy_version 57110 (0.0009) [2023-10-12 18:09:40,515][62634] Updated weights for policy 0, policy_version 57120 (0.0008) [2023-10-12 18:09:42,106][62635] Updated weights for policy 1, policy_version 57090 (0.0010) [2023-10-12 18:09:42,479][62635] Updated weights for policy 1, policy_version 57100 (0.0009) [2023-10-12 18:09:42,843][62635] Updated weights for policy 1, policy_version 57110 (0.0008) [2023-10-12 18:09:43,213][62635] Updated weights for policy 1, policy_version 57120 (0.0008) [2023-10-12 18:09:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 116981760. Throughput: 0: 1681.5, 1: 1690.9. Samples: 29251336. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:43,435][61643] Avg episode reward: [(0, '25.120'), (1, '9.710')] [2023-10-12 18:09:44,506][62634] Updated weights for policy 0, policy_version 57130 (0.0009) [2023-10-12 18:09:44,894][62634] Updated weights for policy 0, policy_version 57140 (0.0009) [2023-10-12 18:09:45,259][62634] Updated weights for policy 0, policy_version 57150 (0.0009) [2023-10-12 18:09:47,178][62635] Updated weights for policy 1, policy_version 57130 (0.0010) [2023-10-12 18:09:47,544][62635] Updated weights for policy 1, policy_version 57140 (0.0010) [2023-10-12 18:09:47,913][62635] Updated weights for policy 1, policy_version 57150 (0.0010) [2023-10-12 18:09:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117047296. Throughput: 0: 1686.9, 1: 1668.0. Samples: 29270996. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:48,436][61643] Avg episode reward: [(0, '24.740'), (1, '9.690')] [2023-10-12 18:09:49,362][62634] Updated weights for policy 0, policy_version 57160 (0.0010) [2023-10-12 18:09:49,746][62634] Updated weights for policy 0, policy_version 57170 (0.0007) [2023-10-12 18:09:50,126][62634] Updated weights for policy 0, policy_version 57180 (0.0008) [2023-10-12 18:09:51,977][62635] Updated weights for policy 1, policy_version 57160 (0.0010) [2023-10-12 18:09:52,343][62635] Updated weights for policy 1, policy_version 57170 (0.0010) [2023-10-12 18:09:52,714][62635] Updated weights for policy 1, policy_version 57180 (0.0007) [2023-10-12 18:09:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117112832. Throughput: 0: 1671.2, 1: 1697.8. Samples: 29281346. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:09:53,435][61643] Avg episode reward: [(0, '24.640'), (1, '9.780')] [2023-10-12 18:09:54,168][62634] Updated weights for policy 0, policy_version 57190 (0.0010) [2023-10-12 18:09:54,549][62634] Updated weights for policy 0, policy_version 57200 (0.0008) [2023-10-12 18:09:54,921][62634] Updated weights for policy 0, policy_version 57210 (0.0007) [2023-10-12 18:09:56,809][62635] Updated weights for policy 1, policy_version 57190 (0.0009) [2023-10-12 18:09:57,185][62635] Updated weights for policy 1, policy_version 57200 (0.0009) [2023-10-12 18:09:57,551][62635] Updated weights for policy 1, policy_version 57210 (0.0010) [2023-10-12 18:09:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117178368. Throughput: 0: 1687.6, 1: 1682.5. Samples: 29301454. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:09:58,436][61643] Avg episode reward: [(0, '24.480'), (1, '9.750')] [2023-10-12 18:09:59,067][62634] Updated weights for policy 0, policy_version 57220 (0.0009) [2023-10-12 18:09:59,449][62634] Updated weights for policy 0, policy_version 57230 (0.0007) [2023-10-12 18:09:59,819][62634] Updated weights for policy 0, policy_version 57240 (0.0007) [2023-10-12 18:10:01,557][62635] Updated weights for policy 1, policy_version 57220 (0.0008) [2023-10-12 18:10:01,923][62635] Updated weights for policy 1, policy_version 57230 (0.0007) [2023-10-12 18:10:02,292][62635] Updated weights for policy 1, policy_version 57240 (0.0007) [2023-10-12 18:10:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117243904. Throughput: 0: 1690.1, 1: 1671.7. Samples: 29321484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:03,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.860')] [2023-10-12 18:10:03,844][62634] Updated weights for policy 0, policy_version 57250 (0.0008) [2023-10-12 18:10:04,210][62634] Updated weights for policy 0, policy_version 57260 (0.0007) [2023-10-12 18:10:04,589][62634] Updated weights for policy 0, policy_version 57270 (0.0010) [2023-10-12 18:10:04,964][62634] Updated weights for policy 0, policy_version 57280 (0.0007) [2023-10-12 18:10:06,269][62635] Updated weights for policy 1, policy_version 57250 (0.0008) [2023-10-12 18:10:06,634][62635] Updated weights for policy 1, policy_version 57260 (0.0010) [2023-10-12 18:10:07,010][62635] Updated weights for policy 1, policy_version 57270 (0.0010) [2023-10-12 18:10:07,378][62635] Updated weights for policy 1, policy_version 57280 (0.0011) [2023-10-12 18:10:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117309440. Throughput: 0: 1683.5, 1: 1695.6. Samples: 29331960. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:08,435][61643] Avg episode reward: [(0, '24.070'), (1, '9.870')] [2023-10-12 18:10:08,927][62634] Updated weights for policy 0, policy_version 57290 (0.0010) [2023-10-12 18:10:09,305][62634] Updated weights for policy 0, policy_version 57300 (0.0009) [2023-10-12 18:10:09,677][62634] Updated weights for policy 0, policy_version 57310 (0.0009) [2023-10-12 18:10:11,331][62635] Updated weights for policy 1, policy_version 57290 (0.0008) [2023-10-12 18:10:11,703][62635] Updated weights for policy 1, policy_version 57300 (0.0009) [2023-10-12 18:10:12,081][62635] Updated weights for policy 1, policy_version 57310 (0.0008) [2023-10-12 18:10:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117374976. Throughput: 0: 1687.8, 1: 1672.0. Samples: 29351584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:13,436][61643] Avg episode reward: [(0, '24.110'), (1, '9.940')] [2023-10-12 18:10:13,734][62634] Updated weights for policy 0, policy_version 57320 (0.0010) [2023-10-12 18:10:14,115][62634] Updated weights for policy 0, policy_version 57330 (0.0009) [2023-10-12 18:10:14,500][62634] Updated weights for policy 0, policy_version 57340 (0.0009) [2023-10-12 18:10:16,003][62635] Updated weights for policy 1, policy_version 57320 (0.0009) [2023-10-12 18:10:16,364][62635] Updated weights for policy 1, policy_version 57330 (0.0010) [2023-10-12 18:10:16,732][62635] Updated weights for policy 1, policy_version 57340 (0.0008) [2023-10-12 18:10:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117440512. Throughput: 0: 1684.7, 1: 1687.2. Samples: 29372318. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:18,435][61643] Avg episode reward: [(0, '24.090'), (1, '10.030')] [2023-10-12 18:10:18,442][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000057344_58720256.pth... 
[2023-10-12 18:10:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000055776_57114624.pth [2023-10-12 18:10:18,513][62634] Updated weights for policy 0, policy_version 57350 (0.0008) [2023-10-12 18:10:18,890][62634] Updated weights for policy 0, policy_version 57360 (0.0007) [2023-10-12 18:10:19,261][62634] Updated weights for policy 0, policy_version 57370 (0.0009) [2023-10-12 18:10:19,484][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000057376_58753024.pth... [2023-10-12 18:10:19,517][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000055776_57114624.pth [2023-10-12 18:10:20,843][62635] Updated weights for policy 1, policy_version 57350 (0.0010) [2023-10-12 18:10:21,203][62635] Updated weights for policy 1, policy_version 57360 (0.0008) [2023-10-12 18:10:21,569][62635] Updated weights for policy 1, policy_version 57370 (0.0007) [2023-10-12 18:10:23,377][62634] Updated weights for policy 0, policy_version 57380 (0.0007) [2023-10-12 18:10:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 117506048. Throughput: 0: 1681.9, 1: 1686.7. Samples: 29382174. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:23,436][61643] Avg episode reward: [(0, '24.180'), (1, '9.970')] [2023-10-12 18:10:23,747][62634] Updated weights for policy 0, policy_version 57390 (0.0008) [2023-10-12 18:10:24,125][62634] Updated weights for policy 0, policy_version 57400 (0.0008) [2023-10-12 18:10:25,825][62635] Updated weights for policy 1, policy_version 57380 (0.0009) [2023-10-12 18:10:26,191][62635] Updated weights for policy 1, policy_version 57390 (0.0010) [2023-10-12 18:10:26,556][62635] Updated weights for policy 1, policy_version 57400 (0.0007) [2023-10-12 18:10:27,992][62634] Updated weights for policy 0, policy_version 57410 (0.0008) [2023-10-12 18:10:28,371][62634] Updated weights for policy 0, policy_version 57420 (0.0007) [2023-10-12 18:10:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117571584. Throughput: 0: 1685.3, 1: 1665.3. Samples: 29402116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:28,435][61643] Avg episode reward: [(0, '24.280'), (1, '9.880')] [2023-10-12 18:10:28,749][62634] Updated weights for policy 0, policy_version 57430 (0.0010) [2023-10-12 18:10:29,136][62634] Updated weights for policy 0, policy_version 57440 (0.0007) [2023-10-12 18:10:30,757][62635] Updated weights for policy 1, policy_version 57410 (0.0007) [2023-10-12 18:10:31,121][62635] Updated weights for policy 1, policy_version 57420 (0.0007) [2023-10-12 18:10:31,487][62635] Updated weights for policy 1, policy_version 57430 (0.0007) [2023-10-12 18:10:31,865][62635] Updated weights for policy 1, policy_version 57440 (0.0009) [2023-10-12 18:10:33,111][62634] Updated weights for policy 0, policy_version 57450 (0.0007) [2023-10-12 18:10:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117637120. Throughput: 0: 1681.1, 1: 1688.9. Samples: 29422644. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:10:33,436][61643] Avg episode reward: [(0, '23.930'), (1, '10.060')] [2023-10-12 18:10:33,493][62634] Updated weights for policy 0, policy_version 57460 (0.0009) [2023-10-12 18:10:33,870][62634] Updated weights for policy 0, policy_version 57470 (0.0008) [2023-10-12 18:10:35,840][62635] Updated weights for policy 1, policy_version 57450 (0.0008) [2023-10-12 18:10:36,211][62635] Updated weights for policy 1, policy_version 57460 (0.0008) [2023-10-12 18:10:36,586][62635] Updated weights for policy 1, policy_version 57470 (0.0010) [2023-10-12 18:10:38,187][62634] Updated weights for policy 0, policy_version 57480 (0.0009) [2023-10-12 18:10:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117702656. Throughput: 0: 1688.5, 1: 1675.2. Samples: 29432710. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:10:38,435][61643] Avg episode reward: [(0, '23.990'), (1, '9.960')] [2023-10-12 18:10:38,567][62634] Updated weights for policy 0, policy_version 57490 (0.0010) [2023-10-12 18:10:38,937][62634] Updated weights for policy 0, policy_version 57500 (0.0009) [2023-10-12 18:10:40,760][62635] Updated weights for policy 1, policy_version 57480 (0.0007) [2023-10-12 18:10:41,125][62635] Updated weights for policy 1, policy_version 57490 (0.0007) [2023-10-12 18:10:41,485][62635] Updated weights for policy 1, policy_version 57500 (0.0009) [2023-10-12 18:10:43,160][62634] Updated weights for policy 0, policy_version 57510 (0.0009) [2023-10-12 18:10:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117768192. Throughput: 0: 1686.0, 1: 1674.8. Samples: 29452688. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:10:43,436][61643] Avg episode reward: [(0, '24.090'), (1, '10.010')] [2023-10-12 18:10:43,533][62634] Updated weights for policy 0, policy_version 57520 (0.0009) [2023-10-12 18:10:43,912][62634] Updated weights for policy 0, policy_version 57530 (0.0009) [2023-10-12 18:10:45,546][62635] Updated weights for policy 1, policy_version 57510 (0.0009) [2023-10-12 18:10:45,921][62635] Updated weights for policy 1, policy_version 57520 (0.0009) [2023-10-12 18:10:46,286][62635] Updated weights for policy 1, policy_version 57530 (0.0009) [2023-10-12 18:10:47,899][62634] Updated weights for policy 0, policy_version 57540 (0.0009) [2023-10-12 18:10:48,281][62634] Updated weights for policy 0, policy_version 57550 (0.0008) [2023-10-12 18:10:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117833728. Throughput: 0: 1677.8, 1: 1693.7. Samples: 29473200. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:10:48,435][61643] Avg episode reward: [(0, '24.180'), (1, '9.920')] [2023-10-12 18:10:48,656][62634] Updated weights for policy 0, policy_version 57560 (0.0007) [2023-10-12 18:10:50,214][62635] Updated weights for policy 1, policy_version 57540 (0.0009) [2023-10-12 18:10:50,584][62635] Updated weights for policy 1, policy_version 57550 (0.0010) [2023-10-12 18:10:50,951][62635] Updated weights for policy 1, policy_version 57560 (0.0007) [2023-10-12 18:10:52,727][62634] Updated weights for policy 0, policy_version 57570 (0.0008) [2023-10-12 18:10:53,103][62634] Updated weights for policy 0, policy_version 57580 (0.0011) [2023-10-12 18:10:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117899264. Throughput: 0: 1687.0, 1: 1666.0. Samples: 29482848. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:10:53,435][61643] Avg episode reward: [(0, '23.970'), (1, '10.010')] [2023-10-12 18:10:53,482][62634] Updated weights for policy 0, policy_version 57590 (0.0010) [2023-10-12 18:10:53,859][62634] Updated weights for policy 0, policy_version 57600 (0.0007) [2023-10-12 18:10:55,142][62635] Updated weights for policy 1, policy_version 57570 (0.0007) [2023-10-12 18:10:55,500][62635] Updated weights for policy 1, policy_version 57580 (0.0007) [2023-10-12 18:10:55,874][62635] Updated weights for policy 1, policy_version 57590 (0.0007) [2023-10-12 18:10:56,241][62635] Updated weights for policy 1, policy_version 57600 (0.0007) [2023-10-12 18:10:57,937][62634] Updated weights for policy 0, policy_version 57610 (0.0009) [2023-10-12 18:10:58,311][62634] Updated weights for policy 0, policy_version 57620 (0.0008) [2023-10-12 18:10:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117964800. Throughput: 0: 1683.3, 1: 1685.8. Samples: 29503194. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:10:58,435][61643] Avg episode reward: [(0, '23.870'), (1, '9.890')] [2023-10-12 18:10:58,680][62634] Updated weights for policy 0, policy_version 57630 (0.0007) [2023-10-12 18:11:00,263][62635] Updated weights for policy 1, policy_version 57610 (0.0007) [2023-10-12 18:11:00,634][62635] Updated weights for policy 1, policy_version 57620 (0.0007) [2023-10-12 18:11:00,996][62635] Updated weights for policy 1, policy_version 57630 (0.0009) [2023-10-12 18:11:02,631][62634] Updated weights for policy 0, policy_version 57640 (0.0007) [2023-10-12 18:11:03,013][62634] Updated weights for policy 0, policy_version 57650 (0.0008) [2023-10-12 18:11:03,380][62634] Updated weights for policy 0, policy_version 57660 (0.0008) [2023-10-12 18:11:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118030336. Throughput: 0: 1667.0, 1: 1687.7. 
Samples: 29523280. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:11:03,436][61643] Avg episode reward: [(0, '23.950'), (1, '9.700')] [2023-10-12 18:11:05,027][62635] Updated weights for policy 1, policy_version 57640 (0.0009) [2023-10-12 18:11:05,401][62635] Updated weights for policy 1, policy_version 57650 (0.0007) [2023-10-12 18:11:05,764][62635] Updated weights for policy 1, policy_version 57660 (0.0007) [2023-10-12 18:11:07,323][62634] Updated weights for policy 0, policy_version 57670 (0.0008) [2023-10-12 18:11:07,703][62634] Updated weights for policy 0, policy_version 57680 (0.0007) [2023-10-12 18:11:08,076][62634] Updated weights for policy 0, policy_version 57690 (0.0007) [2023-10-12 18:11:08,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 118128640. Throughput: 0: 1689.8, 1: 1667.7. Samples: 29533262. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-12 18:11:08,435][61643] Avg episode reward: [(0, '23.950'), (1, '9.910')] [2023-10-12 18:11:09,852][62635] Updated weights for policy 1, policy_version 57670 (0.0009) [2023-10-12 18:11:10,217][62635] Updated weights for policy 1, policy_version 57680 (0.0011) [2023-10-12 18:11:10,586][62635] Updated weights for policy 1, policy_version 57690 (0.0011) [2023-10-12 18:11:12,106][62634] Updated weights for policy 0, policy_version 57700 (0.0007) [2023-10-12 18:11:12,490][62634] Updated weights for policy 0, policy_version 57710 (0.0007) [2023-10-12 18:11:12,867][62634] Updated weights for policy 0, policy_version 57720 (0.0009) [2023-10-12 18:11:13,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 118194176. Throughput: 0: 1686.4, 1: 1691.4. Samples: 29554120. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:13,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.760')] [2023-10-12 18:11:14,568][62635] Updated weights for policy 1, policy_version 57700 (0.0009) [2023-10-12 18:11:14,938][62635] Updated weights for policy 1, policy_version 57710 (0.0010) [2023-10-12 18:11:15,300][62635] Updated weights for policy 1, policy_version 57720 (0.0011) [2023-10-12 18:11:16,799][62634] Updated weights for policy 0, policy_version 57730 (0.0008) [2023-10-12 18:11:17,180][62634] Updated weights for policy 0, policy_version 57740 (0.0008) [2023-10-12 18:11:17,564][62634] Updated weights for policy 0, policy_version 57750 (0.0007) [2023-10-12 18:11:17,945][62634] Updated weights for policy 0, policy_version 57760 (0.0010) [2023-10-12 18:11:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 118259712. Throughput: 0: 1663.4, 1: 1696.3. Samples: 29573830. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:18,436][61643] Avg episode reward: [(0, '24.190'), (1, '9.790')] [2023-10-12 18:11:19,315][62635] Updated weights for policy 1, policy_version 57730 (0.0010) [2023-10-12 18:11:19,685][62635] Updated weights for policy 1, policy_version 57740 (0.0009) [2023-10-12 18:11:20,043][62635] Updated weights for policy 1, policy_version 57750 (0.0008) [2023-10-12 18:11:20,411][62635] Updated weights for policy 1, policy_version 57760 (0.0007) [2023-10-12 18:11:22,078][62634] Updated weights for policy 0, policy_version 57770 (0.0007) [2023-10-12 18:11:22,461][62634] Updated weights for policy 0, policy_version 57780 (0.0007) [2023-10-12 18:11:22,844][62634] Updated weights for policy 0, policy_version 57790 (0.0009) [2023-10-12 18:11:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 118325248. Throughput: 0: 1686.4, 1: 1679.0. Samples: 29584154. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:23,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.790')] [2023-10-12 18:11:24,358][62635] Updated weights for policy 1, policy_version 57770 (0.0007) [2023-10-12 18:11:24,718][62635] Updated weights for policy 1, policy_version 57780 (0.0009) [2023-10-12 18:11:25,081][62635] Updated weights for policy 1, policy_version 57790 (0.0009) [2023-10-12 18:11:27,135][62634] Updated weights for policy 0, policy_version 57800 (0.0008) [2023-10-12 18:11:27,516][62634] Updated weights for policy 0, policy_version 57810 (0.0009) [2023-10-12 18:11:27,901][62634] Updated weights for policy 0, policy_version 57820 (0.0011) [2023-10-12 18:11:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118390784. Throughput: 0: 1684.1, 1: 1692.4. Samples: 29604630. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:28,435][61643] Avg episode reward: [(0, '24.150'), (1, '9.870')] [2023-10-12 18:11:29,344][62635] Updated weights for policy 1, policy_version 57800 (0.0008) [2023-10-12 18:11:29,706][62635] Updated weights for policy 1, policy_version 57810 (0.0008) [2023-10-12 18:11:30,073][62635] Updated weights for policy 1, policy_version 57820 (0.0008) [2023-10-12 18:11:31,720][62634] Updated weights for policy 0, policy_version 57830 (0.0010) [2023-10-12 18:11:32,096][62634] Updated weights for policy 0, policy_version 57840 (0.0010) [2023-10-12 18:11:32,477][62634] Updated weights for policy 0, policy_version 57850 (0.0009) [2023-10-12 18:11:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 118456320. Throughput: 0: 1668.7, 1: 1694.1. Samples: 29624524. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:33,435][61643] Avg episode reward: [(0, '23.510'), (1, '9.930')] [2023-10-12 18:11:34,231][62635] Updated weights for policy 1, policy_version 57830 (0.0009) [2023-10-12 18:11:34,626][62635] Updated weights for policy 1, policy_version 57840 (0.0007) [2023-10-12 18:11:34,991][62635] Updated weights for policy 1, policy_version 57850 (0.0007) [2023-10-12 18:11:36,303][62634] Updated weights for policy 0, policy_version 57860 (0.0007) [2023-10-12 18:11:36,682][62634] Updated weights for policy 0, policy_version 57870 (0.0008) [2023-10-12 18:11:37,044][62634] Updated weights for policy 0, policy_version 57880 (0.0010) [2023-10-12 18:11:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118521856. Throughput: 0: 1698.6, 1: 1685.2. Samples: 29635116. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:38,436][61643] Avg episode reward: [(0, '23.200'), (1, '9.740')] [2023-10-12 18:11:38,885][62635] Updated weights for policy 1, policy_version 57860 (0.0008) [2023-10-12 18:11:39,249][62635] Updated weights for policy 1, policy_version 57870 (0.0011) [2023-10-12 18:11:39,627][62635] Updated weights for policy 1, policy_version 57880 (0.0011) [2023-10-12 18:11:41,241][62634] Updated weights for policy 0, policy_version 57890 (0.0010) [2023-10-12 18:11:41,622][62634] Updated weights for policy 0, policy_version 57900 (0.0008) [2023-10-12 18:11:42,005][62634] Updated weights for policy 0, policy_version 57910 (0.0009) [2023-10-12 18:11:42,379][62634] Updated weights for policy 0, policy_version 57920 (0.0009) [2023-10-12 18:11:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 118587392. Throughput: 0: 1680.5, 1: 1691.4. Samples: 29654930. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:43,435][61643] Avg episode reward: [(0, '23.420'), (1, '9.890')] [2023-10-12 18:11:43,711][62635] Updated weights for policy 1, policy_version 57890 (0.0010) [2023-10-12 18:11:44,073][62635] Updated weights for policy 1, policy_version 57900 (0.0010) [2023-10-12 18:11:44,439][62635] Updated weights for policy 1, policy_version 57910 (0.0008) [2023-10-12 18:11:44,807][62635] Updated weights for policy 1, policy_version 57920 (0.0010) [2023-10-12 18:11:46,454][62634] Updated weights for policy 0, policy_version 57930 (0.0010) [2023-10-12 18:11:46,828][62634] Updated weights for policy 0, policy_version 57940 (0.0009) [2023-10-12 18:11:47,202][62634] Updated weights for policy 0, policy_version 57950 (0.0011) [2023-10-12 18:11:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118652928. Throughput: 0: 1680.0, 1: 1689.6. Samples: 29674916. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-12 18:11:48,436][61643] Avg episode reward: [(0, '22.780'), (1, '9.910')] [2023-10-12 18:11:48,991][62635] Updated weights for policy 1, policy_version 57930 (0.0009) [2023-10-12 18:11:49,367][62635] Updated weights for policy 1, policy_version 57940 (0.0007) [2023-10-12 18:11:49,746][62635] Updated weights for policy 1, policy_version 57950 (0.0009) [2023-10-12 18:11:51,208][62634] Updated weights for policy 0, policy_version 57960 (0.0008) [2023-10-12 18:11:51,588][62634] Updated weights for policy 0, policy_version 57970 (0.0010) [2023-10-12 18:11:51,959][62634] Updated weights for policy 0, policy_version 57980 (0.0011) [2023-10-12 18:11:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118718464. Throughput: 0: 1690.1, 1: 1685.6. Samples: 29685170. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:11:53,436][61643] Avg episode reward: [(0, '22.680'), (1, '9.860')] [2023-10-12 18:11:53,788][62635] Updated weights for policy 1, policy_version 57960 (0.0009) [2023-10-12 18:11:54,167][62635] Updated weights for policy 1, policy_version 57970 (0.0010) [2023-10-12 18:11:54,534][62635] Updated weights for policy 1, policy_version 57980 (0.0009) [2023-10-12 18:11:55,841][62634] Updated weights for policy 0, policy_version 57990 (0.0009) [2023-10-12 18:11:56,214][62634] Updated weights for policy 0, policy_version 58000 (0.0010) [2023-10-12 18:11:56,603][62634] Updated weights for policy 0, policy_version 58010 (0.0008) [2023-10-12 18:11:58,341][62635] Updated weights for policy 1, policy_version 57990 (0.0007) [2023-10-12 18:11:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118784000. Throughput: 0: 1663.7, 1: 1692.7. Samples: 29705158. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:11:58,435][61643] Avg episode reward: [(0, '22.960'), (1, '9.860')] [2023-10-12 18:11:58,704][62635] Updated weights for policy 1, policy_version 58000 (0.0007) [2023-10-12 18:11:59,074][62635] Updated weights for policy 1, policy_version 58010 (0.0011) [2023-10-12 18:12:00,750][62634] Updated weights for policy 0, policy_version 58020 (0.0010) [2023-10-12 18:12:01,119][62634] Updated weights for policy 0, policy_version 58030 (0.0010) [2023-10-12 18:12:01,495][62634] Updated weights for policy 0, policy_version 58040 (0.0010) [2023-10-12 18:12:03,111][62635] Updated weights for policy 1, policy_version 58020 (0.0009) [2023-10-12 18:12:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118849536. Throughput: 0: 1687.4, 1: 1688.6. Samples: 29725748. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:12:03,436][61643] Avg episode reward: [(0, '22.730'), (1, '9.950')] [2023-10-12 18:12:03,472][62635] Updated weights for policy 1, policy_version 58030 (0.0009) [2023-10-12 18:12:03,845][62635] Updated weights for policy 1, policy_version 58040 (0.0008) [2023-10-12 18:12:05,585][62634] Updated weights for policy 0, policy_version 58050 (0.0010) [2023-10-12 18:12:05,965][62634] Updated weights for policy 0, policy_version 58060 (0.0009) [2023-10-12 18:12:06,346][62634] Updated weights for policy 0, policy_version 58070 (0.0010) [2023-10-12 18:12:06,727][62634] Updated weights for policy 0, policy_version 58080 (0.0008) [2023-10-12 18:12:07,735][62635] Updated weights for policy 1, policy_version 58050 (0.0008) [2023-10-12 18:12:08,100][62635] Updated weights for policy 1, policy_version 58060 (0.0009) [2023-10-12 18:12:08,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 118915072. Throughput: 0: 1679.2, 1: 1692.9. Samples: 29735900. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:12:08,436][61643] Avg episode reward: [(0, '22.930'), (1, '9.990')] [2023-10-12 18:12:08,469][62635] Updated weights for policy 1, policy_version 58070 (0.0007) [2023-10-12 18:12:08,829][62635] Updated weights for policy 1, policy_version 58080 (0.0009) [2023-10-12 18:12:10,716][62634] Updated weights for policy 0, policy_version 58090 (0.0008) [2023-10-12 18:12:11,100][62634] Updated weights for policy 0, policy_version 58100 (0.0007) [2023-10-12 18:12:11,471][62634] Updated weights for policy 0, policy_version 58110 (0.0007) [2023-10-12 18:12:12,714][62635] Updated weights for policy 1, policy_version 58090 (0.0007) [2023-10-12 18:12:13,089][62635] Updated weights for policy 1, policy_version 58100 (0.0007) [2023-10-12 18:12:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118980608. Throughput: 0: 1666.3, 1: 1701.6. 
Samples: 29756184. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:12:13,436][61643] Avg episode reward: [(0, '22.350'), (1, '9.900')] [2023-10-12 18:12:13,459][62635] Updated weights for policy 1, policy_version 58110 (0.0008) [2023-10-12 18:12:15,685][62634] Updated weights for policy 0, policy_version 58120 (0.0009) [2023-10-12 18:12:16,057][62634] Updated weights for policy 0, policy_version 58130 (0.0008) [2023-10-12 18:12:16,441][62634] Updated weights for policy 0, policy_version 58140 (0.0007) [2023-10-12 18:12:17,678][62635] Updated weights for policy 1, policy_version 58120 (0.0007) [2023-10-12 18:12:18,046][62635] Updated weights for policy 1, policy_version 58130 (0.0007) [2023-10-12 18:12:18,406][62635] Updated weights for policy 1, policy_version 58140 (0.0008) [2023-10-12 18:12:18,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 119046144. Throughput: 0: 1688.1, 1: 1679.3. Samples: 29776058. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:12:18,435][61643] Avg episode reward: [(0, '22.530'), (1, '9.810')] [2023-10-12 18:12:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000058144_59539456.pth... [2023-10-12 18:12:18,479][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000056576_57933824.pth [2023-10-12 18:12:18,554][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000058144_59539456.pth... 
[2023-10-12 18:12:18,594][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000056544_57901056.pth [2023-10-12 18:12:20,377][62634] Updated weights for policy 0, policy_version 58150 (0.0008) [2023-10-12 18:12:20,762][62634] Updated weights for policy 0, policy_version 58160 (0.0007) [2023-10-12 18:12:21,137][62634] Updated weights for policy 0, policy_version 58170 (0.0009) [2023-10-12 18:12:22,374][62635] Updated weights for policy 1, policy_version 58150 (0.0008) [2023-10-12 18:12:22,752][62635] Updated weights for policy 1, policy_version 58160 (0.0007) [2023-10-12 18:12:23,113][62635] Updated weights for policy 1, policy_version 58170 (0.0007) [2023-10-12 18:12:23,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 119144448. Throughput: 0: 1661.2, 1: 1700.2. Samples: 29786378. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:12:23,435][61643] Avg episode reward: [(0, '22.550'), (1, '9.870')] [2023-10-12 18:12:25,075][62634] Updated weights for policy 0, policy_version 58180 (0.0009) [2023-10-12 18:12:25,451][62634] Updated weights for policy 0, policy_version 58190 (0.0007) [2023-10-12 18:12:25,830][62634] Updated weights for policy 0, policy_version 58200 (0.0009) [2023-10-12 18:12:27,132][62635] Updated weights for policy 1, policy_version 58180 (0.0008) [2023-10-12 18:12:27,506][62635] Updated weights for policy 1, policy_version 58190 (0.0011) [2023-10-12 18:12:27,877][62635] Updated weights for policy 1, policy_version 58200 (0.0009) [2023-10-12 18:12:28,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 119209984. Throughput: 0: 1673.2, 1: 1702.1. Samples: 29806820. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:12:28,436][61643] Avg episode reward: [(0, '22.360'), (1, '9.920')] [2023-10-12 18:12:29,886][62634] Updated weights for policy 0, policy_version 58210 (0.0008) [2023-10-12 18:12:30,271][62634] Updated weights for policy 0, policy_version 58220 (0.0009) [2023-10-12 18:12:30,645][62634] Updated weights for policy 0, policy_version 58230 (0.0008) [2023-10-12 18:12:31,022][62634] Updated weights for policy 0, policy_version 58240 (0.0007) [2023-10-12 18:12:32,060][62635] Updated weights for policy 1, policy_version 58210 (0.0011) [2023-10-12 18:12:32,435][62635] Updated weights for policy 1, policy_version 58220 (0.0011) [2023-10-12 18:12:32,805][62635] Updated weights for policy 1, policy_version 58230 (0.0010) [2023-10-12 18:12:33,177][62635] Updated weights for policy 1, policy_version 58240 (0.0010) [2023-10-12 18:12:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 119275520. Throughput: 0: 1691.4, 1: 1678.5. Samples: 29826560. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:12:33,436][61643] Avg episode reward: [(0, '22.710'), (1, '9.940')] [2023-10-12 18:12:35,133][62634] Updated weights for policy 0, policy_version 58250 (0.0008) [2023-10-12 18:12:35,516][62634] Updated weights for policy 0, policy_version 58260 (0.0009) [2023-10-12 18:12:35,889][62634] Updated weights for policy 0, policy_version 58270 (0.0009) [2023-10-12 18:12:37,185][62635] Updated weights for policy 1, policy_version 58250 (0.0008) [2023-10-12 18:12:37,556][62635] Updated weights for policy 1, policy_version 58260 (0.0007) [2023-10-12 18:12:37,923][62635] Updated weights for policy 1, policy_version 58270 (0.0008) [2023-10-12 18:12:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 119341056. Throughput: 0: 1662.9, 1: 1707.7. Samples: 29836848. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:12:38,436][61643] Avg episode reward: [(0, '22.830'), (1, '9.740')] [2023-10-12 18:12:40,141][62634] Updated weights for policy 0, policy_version 58280 (0.0009) [2023-10-12 18:12:40,528][62634] Updated weights for policy 0, policy_version 58290 (0.0007) [2023-10-12 18:12:40,899][62634] Updated weights for policy 0, policy_version 58300 (0.0008) [2023-10-12 18:12:41,932][62635] Updated weights for policy 1, policy_version 58280 (0.0008) [2023-10-12 18:12:42,310][62635] Updated weights for policy 1, policy_version 58290 (0.0008) [2023-10-12 18:12:42,678][62635] Updated weights for policy 1, policy_version 58300 (0.0009) [2023-10-12 18:12:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 119406592. Throughput: 0: 1683.2, 1: 1688.5. Samples: 29856884. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:12:43,436][61643] Avg episode reward: [(0, '22.860'), (1, '9.770')] [2023-10-12 18:12:45,012][62634] Updated weights for policy 0, policy_version 58310 (0.0009) [2023-10-12 18:12:45,383][62634] Updated weights for policy 0, policy_version 58320 (0.0008) [2023-10-12 18:12:45,759][62634] Updated weights for policy 0, policy_version 58330 (0.0008) [2023-10-12 18:12:46,758][62635] Updated weights for policy 1, policy_version 58310 (0.0010) [2023-10-12 18:12:47,128][62635] Updated weights for policy 1, policy_version 58320 (0.0009) [2023-10-12 18:12:47,497][62635] Updated weights for policy 1, policy_version 58330 (0.0010) [2023-10-12 18:12:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 119472128. Throughput: 0: 1684.8, 1: 1669.4. Samples: 29876690. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:12:48,435][61643] Avg episode reward: [(0, '23.320'), (1, '9.550')] [2023-10-12 18:12:49,796][62634] Updated weights for policy 0, policy_version 58340 (0.0008) [2023-10-12 18:12:50,165][62634] Updated weights for policy 0, policy_version 58350 (0.0009) [2023-10-12 18:12:50,545][62634] Updated weights for policy 0, policy_version 58360 (0.0007) [2023-10-12 18:12:51,670][62635] Updated weights for policy 1, policy_version 58340 (0.0010) [2023-10-12 18:12:52,042][62635] Updated weights for policy 1, policy_version 58350 (0.0010) [2023-10-12 18:12:52,408][62635] Updated weights for policy 1, policy_version 58360 (0.0009) [2023-10-12 18:12:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 119537664. Throughput: 0: 1664.2, 1: 1693.4. Samples: 29886992. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:12:53,435][61643] Avg episode reward: [(0, '23.000'), (1, '9.350')] [2023-10-12 18:12:54,492][62634] Updated weights for policy 0, policy_version 58370 (0.0007) [2023-10-12 18:12:54,882][62634] Updated weights for policy 0, policy_version 58380 (0.0007) [2023-10-12 18:12:55,259][62634] Updated weights for policy 0, policy_version 58390 (0.0009) [2023-10-12 18:12:55,643][62634] Updated weights for policy 0, policy_version 58400 (0.0009) [2023-10-12 18:12:56,712][62635] Updated weights for policy 1, policy_version 58370 (0.0010) [2023-10-12 18:12:57,066][62635] Updated weights for policy 1, policy_version 58380 (0.0010) [2023-10-12 18:12:57,430][62635] Updated weights for policy 1, policy_version 58390 (0.0010) [2023-10-12 18:12:57,805][62635] Updated weights for policy 1, policy_version 58400 (0.0010) [2023-10-12 18:12:58,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 119603200. Throughput: 0: 1684.6, 1: 1672.5. Samples: 29907254. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:12:58,436][61643] Avg episode reward: [(0, '23.660'), (1, '9.640')] [2023-10-12 18:12:59,608][62634] Updated weights for policy 0, policy_version 58410 (0.0009) [2023-10-12 18:12:59,989][62634] Updated weights for policy 0, policy_version 58420 (0.0009) [2023-10-12 18:13:00,374][62634] Updated weights for policy 0, policy_version 58430 (0.0009) [2023-10-12 18:13:01,981][62635] Updated weights for policy 1, policy_version 58410 (0.0009) [2023-10-12 18:13:02,351][62635] Updated weights for policy 1, policy_version 58420 (0.0009) [2023-10-12 18:13:02,717][62635] Updated weights for policy 1, policy_version 58430 (0.0010) [2023-10-12 18:13:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 119668736. Throughput: 0: 1692.1, 1: 1665.5. Samples: 29927150. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:13:03,435][61643] Avg episode reward: [(0, '23.810'), (1, '9.440')] [2023-10-12 18:13:04,430][62634] Updated weights for policy 0, policy_version 58440 (0.0008) [2023-10-12 18:13:04,812][62634] Updated weights for policy 0, policy_version 58450 (0.0008) [2023-10-12 18:13:05,186][62634] Updated weights for policy 0, policy_version 58460 (0.0009) [2023-10-12 18:13:06,840][62635] Updated weights for policy 1, policy_version 58440 (0.0009) [2023-10-12 18:13:07,203][62635] Updated weights for policy 1, policy_version 58450 (0.0009) [2023-10-12 18:13:07,572][62635] Updated weights for policy 1, policy_version 58460 (0.0007) [2023-10-12 18:13:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 119734272. Throughput: 0: 1678.5, 1: 1680.2. Samples: 29937518. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) [2023-10-12 18:13:08,436][61643] Avg episode reward: [(0, '23.500'), (1, '9.500')] [2023-10-12 18:13:09,352][62634] Updated weights for policy 0, policy_version 58470 (0.0009) [2023-10-12 18:13:09,726][62634] Updated weights for policy 0, policy_version 58480 (0.0008) [2023-10-12 18:13:10,104][62634] Updated weights for policy 0, policy_version 58490 (0.0008) [2023-10-12 18:13:11,528][62635] Updated weights for policy 1, policy_version 58470 (0.0010) [2023-10-12 18:13:11,898][62635] Updated weights for policy 1, policy_version 58480 (0.0010) [2023-10-12 18:13:12,270][62635] Updated weights for policy 1, policy_version 58490 (0.0008) [2023-10-12 18:13:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 119799808. Throughput: 0: 1689.9, 1: 1658.9. Samples: 29957514. Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:13,435][61643] Avg episode reward: [(0, '23.790'), (1, '9.760')] [2023-10-12 18:13:14,231][62634] Updated weights for policy 0, policy_version 58500 (0.0008) [2023-10-12 18:13:14,616][62634] Updated weights for policy 0, policy_version 58510 (0.0007) [2023-10-12 18:13:14,985][62634] Updated weights for policy 0, policy_version 58520 (0.0007) [2023-10-12 18:13:16,233][62635] Updated weights for policy 1, policy_version 58500 (0.0010) [2023-10-12 18:13:16,602][62635] Updated weights for policy 1, policy_version 58510 (0.0010) [2023-10-12 18:13:16,968][62635] Updated weights for policy 1, policy_version 58520 (0.0009) [2023-10-12 18:13:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 119865344. Throughput: 0: 1684.0, 1: 1677.2. Samples: 29977814. 
Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:18,435][61643] Avg episode reward: [(0, '23.640'), (1, '9.650')] [2023-10-12 18:13:19,002][62634] Updated weights for policy 0, policy_version 58530 (0.0009) [2023-10-12 18:13:19,371][62634] Updated weights for policy 0, policy_version 58540 (0.0009) [2023-10-12 18:13:19,747][62634] Updated weights for policy 0, policy_version 58550 (0.0007) [2023-10-12 18:13:20,123][62634] Updated weights for policy 0, policy_version 58560 (0.0009) [2023-10-12 18:13:21,059][62635] Updated weights for policy 1, policy_version 58530 (0.0009) [2023-10-12 18:13:21,425][62635] Updated weights for policy 1, policy_version 58540 (0.0007) [2023-10-12 18:13:21,792][62635] Updated weights for policy 1, policy_version 58550 (0.0008) [2023-10-12 18:13:22,154][62635] Updated weights for policy 1, policy_version 58560 (0.0008) [2023-10-12 18:13:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 119930880. Throughput: 0: 1678.3, 1: 1679.6. Samples: 29987950. Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:23,436][61643] Avg episode reward: [(0, '23.620'), (1, '9.670')] [2023-10-12 18:13:24,164][62634] Updated weights for policy 0, policy_version 58570 (0.0008) [2023-10-12 18:13:24,547][62634] Updated weights for policy 0, policy_version 58580 (0.0010) [2023-10-12 18:13:24,918][62634] Updated weights for policy 0, policy_version 58590 (0.0007) [2023-10-12 18:13:26,085][62635] Updated weights for policy 1, policy_version 58570 (0.0008) [2023-10-12 18:13:26,459][62635] Updated weights for policy 1, policy_version 58580 (0.0007) [2023-10-12 18:13:26,816][62635] Updated weights for policy 1, policy_version 58590 (0.0007) [2023-10-12 18:13:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 119996416. Throughput: 0: 1687.2, 1: 1663.2. Samples: 30007648. 
Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:28,436][61643] Avg episode reward: [(0, '24.090'), (1, '9.680')] [2023-10-12 18:13:28,930][62634] Updated weights for policy 0, policy_version 58600 (0.0008) [2023-10-12 18:13:29,299][62634] Updated weights for policy 0, policy_version 58610 (0.0009) [2023-10-12 18:13:29,672][62634] Updated weights for policy 0, policy_version 58620 (0.0008) [2023-10-12 18:13:30,892][62635] Updated weights for policy 1, policy_version 58600 (0.0010) [2023-10-12 18:13:31,267][62635] Updated weights for policy 1, policy_version 58610 (0.0007) [2023-10-12 18:13:31,629][62635] Updated weights for policy 1, policy_version 58620 (0.0008) [2023-10-12 18:13:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120061952. Throughput: 0: 1681.6, 1: 1685.7. Samples: 30028216. Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:33,435][61643] Avg episode reward: [(0, '24.050'), (1, '9.590')] [2023-10-12 18:13:33,743][62634] Updated weights for policy 0, policy_version 58630 (0.0008) [2023-10-12 18:13:34,120][62634] Updated weights for policy 0, policy_version 58640 (0.0007) [2023-10-12 18:13:34,493][62634] Updated weights for policy 0, policy_version 58650 (0.0007) [2023-10-12 18:13:35,570][62635] Updated weights for policy 1, policy_version 58630 (0.0008) [2023-10-12 18:13:35,938][62635] Updated weights for policy 1, policy_version 58640 (0.0007) [2023-10-12 18:13:36,308][62635] Updated weights for policy 1, policy_version 58650 (0.0007) [2023-10-12 18:13:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120127488. Throughput: 0: 1682.1, 1: 1671.4. Samples: 30037900. 
Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:38,435][61643] Avg episode reward: [(0, '24.260'), (1, '9.690')] [2023-10-12 18:13:38,486][62634] Updated weights for policy 0, policy_version 58660 (0.0011) [2023-10-12 18:13:38,864][62634] Updated weights for policy 0, policy_version 58670 (0.0009) [2023-10-12 18:13:39,247][62634] Updated weights for policy 0, policy_version 58680 (0.0009) [2023-10-12 18:13:40,363][62635] Updated weights for policy 1, policy_version 58660 (0.0007) [2023-10-12 18:13:40,743][62635] Updated weights for policy 1, policy_version 58670 (0.0008) [2023-10-12 18:13:41,112][62635] Updated weights for policy 1, policy_version 58680 (0.0011) [2023-10-12 18:13:43,353][62634] Updated weights for policy 0, policy_version 58690 (0.0009) [2023-10-12 18:13:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120193024. Throughput: 0: 1676.5, 1: 1673.7. Samples: 30058008. Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:43,436][61643] Avg episode reward: [(0, '24.290'), (1, '9.940')] [2023-10-12 18:13:43,729][62634] Updated weights for policy 0, policy_version 58700 (0.0009) [2023-10-12 18:13:44,097][62634] Updated weights for policy 0, policy_version 58710 (0.0008) [2023-10-12 18:13:44,479][62634] Updated weights for policy 0, policy_version 58720 (0.0010) [2023-10-12 18:13:45,094][62635] Updated weights for policy 1, policy_version 58690 (0.0009) [2023-10-12 18:13:45,473][62635] Updated weights for policy 1, policy_version 58700 (0.0009) [2023-10-12 18:13:45,847][62635] Updated weights for policy 1, policy_version 58710 (0.0008) [2023-10-12 18:13:46,215][62635] Updated weights for policy 1, policy_version 58720 (0.0010) [2023-10-12 18:13:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120258560. Throughput: 0: 1675.2, 1: 1697.0. Samples: 30078898. 
Policy #0 lag: (min: 22.0, avg: 27.7, max: 54.0) [2023-10-12 18:13:48,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.770')] [2023-10-12 18:13:48,491][62634] Updated weights for policy 0, policy_version 58730 (0.0009) [2023-10-12 18:13:48,875][62634] Updated weights for policy 0, policy_version 58740 (0.0010) [2023-10-12 18:13:49,249][62634] Updated weights for policy 0, policy_version 58750 (0.0008) [2023-10-12 18:13:50,405][62635] Updated weights for policy 1, policy_version 58730 (0.0007) [2023-10-12 18:13:50,772][62635] Updated weights for policy 1, policy_version 58740 (0.0007) [2023-10-12 18:13:51,156][62635] Updated weights for policy 1, policy_version 58750 (0.0011) [2023-10-12 18:13:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120324096. Throughput: 0: 1676.9, 1: 1669.9. Samples: 30088126. Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:13:53,435][61643] Avg episode reward: [(0, '24.050'), (1, '9.920')] [2023-10-12 18:13:53,511][62634] Updated weights for policy 0, policy_version 58760 (0.0007) [2023-10-12 18:13:53,897][62634] Updated weights for policy 0, policy_version 58770 (0.0009) [2023-10-12 18:13:54,270][62634] Updated weights for policy 0, policy_version 58780 (0.0010) [2023-10-12 18:13:55,131][62635] Updated weights for policy 1, policy_version 58760 (0.0010) [2023-10-12 18:13:55,502][62635] Updated weights for policy 1, policy_version 58770 (0.0009) [2023-10-12 18:13:55,867][62635] Updated weights for policy 1, policy_version 58780 (0.0009) [2023-10-12 18:13:58,241][62634] Updated weights for policy 0, policy_version 58790 (0.0008) [2023-10-12 18:13:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120389632. Throughput: 0: 1678.3, 1: 1682.7. Samples: 30108756. 
Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:13:58,436][61643] Avg episode reward: [(0, '24.300'), (1, '10.070')] [2023-10-12 18:13:58,619][62634] Updated weights for policy 0, policy_version 58800 (0.0007) [2023-10-12 18:13:58,999][62634] Updated weights for policy 0, policy_version 58810 (0.0010) [2023-10-12 18:14:00,068][62635] Updated weights for policy 1, policy_version 58790 (0.0008) [2023-10-12 18:14:00,453][62635] Updated weights for policy 1, policy_version 58800 (0.0007) [2023-10-12 18:14:00,827][62635] Updated weights for policy 1, policy_version 58810 (0.0007) [2023-10-12 18:14:02,966][62634] Updated weights for policy 0, policy_version 58820 (0.0007) [2023-10-12 18:14:03,340][62634] Updated weights for policy 0, policy_version 58830 (0.0007) [2023-10-12 18:14:03,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 120455168. Throughput: 0: 1678.8, 1: 1689.6. Samples: 30129392. Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:14:03,436][61643] Avg episode reward: [(0, '24.310'), (1, '9.900')] [2023-10-12 18:14:03,715][62634] Updated weights for policy 0, policy_version 58840 (0.0009) [2023-10-12 18:14:04,685][62635] Updated weights for policy 1, policy_version 58820 (0.0009) [2023-10-12 18:14:05,063][62635] Updated weights for policy 1, policy_version 58830 (0.0008) [2023-10-12 18:14:05,435][62635] Updated weights for policy 1, policy_version 58840 (0.0009) [2023-10-12 18:14:07,873][62634] Updated weights for policy 0, policy_version 58850 (0.0009) [2023-10-12 18:14:08,252][62634] Updated weights for policy 0, policy_version 58860 (0.0009) [2023-10-12 18:14:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120520704. Throughput: 0: 1687.5, 1: 1663.0. Samples: 30138722. 
Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:14:08,436][61643] Avg episode reward: [(0, '24.650'), (1, '9.880')] [2023-10-12 18:14:08,634][62634] Updated weights for policy 0, policy_version 58870 (0.0008) [2023-10-12 18:14:09,018][62634] Updated weights for policy 0, policy_version 58880 (0.0009) [2023-10-12 18:14:09,478][62635] Updated weights for policy 1, policy_version 58850 (0.0007) [2023-10-12 18:14:09,837][62635] Updated weights for policy 1, policy_version 58860 (0.0009) [2023-10-12 18:14:10,209][62635] Updated weights for policy 1, policy_version 58870 (0.0010) [2023-10-12 18:14:10,572][62635] Updated weights for policy 1, policy_version 58880 (0.0008) [2023-10-12 18:14:13,050][62634] Updated weights for policy 0, policy_version 58890 (0.0007) [2023-10-12 18:14:13,421][62634] Updated weights for policy 0, policy_version 58900 (0.0007) [2023-10-12 18:14:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120586240. Throughput: 0: 1679.0, 1: 1688.4. Samples: 30159178. Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:14:13,435][61643] Avg episode reward: [(0, '24.670'), (1, '9.890')] [2023-10-12 18:14:13,800][62634] Updated weights for policy 0, policy_version 58910 (0.0009) [2023-10-12 18:14:14,624][62635] Updated weights for policy 1, policy_version 58890 (0.0008) [2023-10-12 18:14:14,992][62635] Updated weights for policy 1, policy_version 58900 (0.0007) [2023-10-12 18:14:15,355][62635] Updated weights for policy 1, policy_version 58910 (0.0008) [2023-10-12 18:14:17,937][62634] Updated weights for policy 0, policy_version 58920 (0.0009) [2023-10-12 18:14:18,304][62634] Updated weights for policy 0, policy_version 58930 (0.0007) [2023-10-12 18:14:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120651776. Throughput: 0: 1672.8, 1: 1687.1. Samples: 30179408. 
Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:14:18,436][61643] Avg episode reward: [(0, '24.750'), (1, '9.660')] [2023-10-12 18:14:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000058912_60325888.pth... [2023-10-12 18:14:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000057344_58720256.pth [2023-10-12 18:14:18,679][62634] Updated weights for policy 0, policy_version 58940 (0.0007) [2023-10-12 18:14:18,830][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000058944_60358656.pth... [2023-10-12 18:14:18,867][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000057376_58753024.pth [2023-10-12 18:14:19,625][62635] Updated weights for policy 1, policy_version 58920 (0.0009) [2023-10-12 18:14:20,000][62635] Updated weights for policy 1, policy_version 58930 (0.0009) [2023-10-12 18:14:20,362][62635] Updated weights for policy 1, policy_version 58940 (0.0008) [2023-10-12 18:14:22,660][62634] Updated weights for policy 0, policy_version 58950 (0.0008) [2023-10-12 18:14:23,034][62634] Updated weights for policy 0, policy_version 58960 (0.0007) [2023-10-12 18:14:23,408][62634] Updated weights for policy 0, policy_version 58970 (0.0008) [2023-10-12 18:14:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120717312. Throughput: 0: 1682.5, 1: 1672.6. Samples: 30188878. 
Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:14:23,436][61643] Avg episode reward: [(0, '24.830'), (1, '9.760')] [2023-10-12 18:14:24,312][62635] Updated weights for policy 1, policy_version 58950 (0.0009) [2023-10-12 18:14:24,690][62635] Updated weights for policy 1, policy_version 58960 (0.0010) [2023-10-12 18:14:25,055][62635] Updated weights for policy 1, policy_version 58970 (0.0009) [2023-10-12 18:14:27,431][62634] Updated weights for policy 0, policy_version 58980 (0.0009) [2023-10-12 18:14:27,796][62634] Updated weights for policy 0, policy_version 58990 (0.0007) [2023-10-12 18:14:28,175][62634] Updated weights for policy 0, policy_version 59000 (0.0007) [2023-10-12 18:14:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120782848. Throughput: 0: 1686.8, 1: 1687.2. Samples: 30209834. Policy #0 lag: (min: 2.0, avg: 4.0, max: 32.0) [2023-10-12 18:14:28,435][61643] Avg episode reward: [(0, '24.820'), (1, '9.990')] [2023-10-12 18:14:29,103][62635] Updated weights for policy 1, policy_version 58980 (0.0011) [2023-10-12 18:14:29,466][62635] Updated weights for policy 1, policy_version 58990 (0.0009) [2023-10-12 18:14:29,833][62635] Updated weights for policy 1, policy_version 59000 (0.0008) [2023-10-12 18:14:32,150][62634] Updated weights for policy 0, policy_version 59010 (0.0008) [2023-10-12 18:14:32,528][62634] Updated weights for policy 0, policy_version 59020 (0.0010) [2023-10-12 18:14:32,910][62634] Updated weights for policy 0, policy_version 59030 (0.0008) [2023-10-12 18:14:33,289][62634] Updated weights for policy 0, policy_version 59040 (0.0007) [2023-10-12 18:14:33,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 120881152. Throughput: 0: 1666.4, 1: 1688.2. Samples: 30229858. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:14:33,435][61643] Avg episode reward: [(0, '24.890'), (1, '9.860')] [2023-10-12 18:14:33,949][62635] Updated weights for policy 1, policy_version 59010 (0.0010) [2023-10-12 18:14:34,323][62635] Updated weights for policy 1, policy_version 59020 (0.0007) [2023-10-12 18:14:34,685][62635] Updated weights for policy 1, policy_version 59030 (0.0008) [2023-10-12 18:14:35,058][62635] Updated weights for policy 1, policy_version 59040 (0.0010) [2023-10-12 18:14:37,185][62634] Updated weights for policy 0, policy_version 59050 (0.0007) [2023-10-12 18:14:37,557][62634] Updated weights for policy 0, policy_version 59060 (0.0008) [2023-10-12 18:14:37,934][62634] Updated weights for policy 0, policy_version 59070 (0.0009) [2023-10-12 18:14:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 120946688. Throughput: 0: 1689.4, 1: 1680.8. Samples: 30239786. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:14:38,435][61643] Avg episode reward: [(0, '24.890'), (1, '9.880')] [2023-10-12 18:14:39,163][62635] Updated weights for policy 1, policy_version 59050 (0.0008) [2023-10-12 18:14:39,535][62635] Updated weights for policy 1, policy_version 59060 (0.0008) [2023-10-12 18:14:39,899][62635] Updated weights for policy 1, policy_version 59070 (0.0008) [2023-10-12 18:14:42,039][62634] Updated weights for policy 0, policy_version 59080 (0.0008) [2023-10-12 18:14:42,406][62634] Updated weights for policy 0, policy_version 59090 (0.0008) [2023-10-12 18:14:42,784][62634] Updated weights for policy 0, policy_version 59100 (0.0009) [2023-10-12 18:14:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121012224. Throughput: 0: 1677.3, 1: 1684.8. Samples: 30260052. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:14:43,435][61643] Avg episode reward: [(0, '24.590'), (1, '9.970')] [2023-10-12 18:14:43,901][62635] Updated weights for policy 1, policy_version 59080 (0.0007) [2023-10-12 18:14:44,275][62635] Updated weights for policy 1, policy_version 59090 (0.0010) [2023-10-12 18:14:44,648][62635] Updated weights for policy 1, policy_version 59100 (0.0009) [2023-10-12 18:14:46,806][62634] Updated weights for policy 0, policy_version 59110 (0.0010) [2023-10-12 18:14:47,180][62634] Updated weights for policy 0, policy_version 59120 (0.0009) [2023-10-12 18:14:47,559][62634] Updated weights for policy 0, policy_version 59130 (0.0010) [2023-10-12 18:14:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121077760. Throughput: 0: 1656.2, 1: 1687.3. Samples: 30279848. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:14:48,436][61643] Avg episode reward: [(0, '24.670'), (1, '9.720')] [2023-10-12 18:14:48,770][62635] Updated weights for policy 1, policy_version 59110 (0.0007) [2023-10-12 18:14:49,157][62635] Updated weights for policy 1, policy_version 59120 (0.0009) [2023-10-12 18:14:49,531][62635] Updated weights for policy 1, policy_version 59130 (0.0010) [2023-10-12 18:14:51,726][62634] Updated weights for policy 0, policy_version 59140 (0.0010) [2023-10-12 18:14:52,110][62634] Updated weights for policy 0, policy_version 59150 (0.0008) [2023-10-12 18:14:52,475][62634] Updated weights for policy 0, policy_version 59160 (0.0009) [2023-10-12 18:14:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121143296. Throughput: 0: 1685.8, 1: 1679.2. Samples: 30290146. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:14:53,436][61643] Avg episode reward: [(0, '24.750'), (1, '9.790')] [2023-10-12 18:14:53,674][62635] Updated weights for policy 1, policy_version 59140 (0.0010) [2023-10-12 18:14:54,041][62635] Updated weights for policy 1, policy_version 59150 (0.0009) [2023-10-12 18:14:54,406][62635] Updated weights for policy 1, policy_version 59160 (0.0010) [2023-10-12 18:14:56,473][62634] Updated weights for policy 0, policy_version 59170 (0.0009) [2023-10-12 18:14:56,847][62634] Updated weights for policy 0, policy_version 59180 (0.0011) [2023-10-12 18:14:57,229][62634] Updated weights for policy 0, policy_version 59190 (0.0007) [2023-10-12 18:14:57,602][62634] Updated weights for policy 0, policy_version 59200 (0.0007) [2023-10-12 18:14:58,427][62635] Updated weights for policy 1, policy_version 59170 (0.0012) [2023-10-12 18:14:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 121208832. Throughput: 0: 1676.6, 1: 1681.1. Samples: 30310274. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:14:58,435][61643] Avg episode reward: [(0, '24.580'), (1, '10.050')] [2023-10-12 18:14:58,791][62635] Updated weights for policy 1, policy_version 59180 (0.0009) [2023-10-12 18:14:59,166][62635] Updated weights for policy 1, policy_version 59190 (0.0009) [2023-10-12 18:14:59,541][62635] Updated weights for policy 1, policy_version 59200 (0.0010) [2023-10-12 18:15:01,661][62634] Updated weights for policy 0, policy_version 59210 (0.0008) [2023-10-12 18:15:02,039][62634] Updated weights for policy 0, policy_version 59220 (0.0009) [2023-10-12 18:15:02,415][62634] Updated weights for policy 0, policy_version 59230 (0.0011) [2023-10-12 18:15:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 121274368. Throughput: 0: 1674.2, 1: 1683.7. Samples: 30330512. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 18:15:03,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.890')] [2023-10-12 18:15:03,573][62635] Updated weights for policy 1, policy_version 59210 (0.0008) [2023-10-12 18:15:03,950][62635] Updated weights for policy 1, policy_version 59220 (0.0009) [2023-10-12 18:15:04,318][62635] Updated weights for policy 1, policy_version 59230 (0.0009) [2023-10-12 18:15:06,481][62634] Updated weights for policy 0, policy_version 59240 (0.0008) [2023-10-12 18:15:06,855][62634] Updated weights for policy 0, policy_version 59250 (0.0008) [2023-10-12 18:15:07,235][62634] Updated weights for policy 0, policy_version 59260 (0.0008) [2023-10-12 18:15:08,427][62635] Updated weights for policy 1, policy_version 59240 (0.0010) [2023-10-12 18:15:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 121339904. Throughput: 0: 1693.1, 1: 1683.5. Samples: 30340824. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:08,435][61643] Avg episode reward: [(0, '24.310'), (1, '9.730')] [2023-10-12 18:15:08,797][62635] Updated weights for policy 1, policy_version 59250 (0.0010) [2023-10-12 18:15:09,162][62635] Updated weights for policy 1, policy_version 59260 (0.0010) [2023-10-12 18:15:11,197][62634] Updated weights for policy 0, policy_version 59270 (0.0007) [2023-10-12 18:15:11,565][62634] Updated weights for policy 0, policy_version 59280 (0.0009) [2023-10-12 18:15:11,950][62634] Updated weights for policy 0, policy_version 59290 (0.0009) [2023-10-12 18:15:13,212][62635] Updated weights for policy 1, policy_version 59270 (0.0008) [2023-10-12 18:15:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121405440. Throughput: 0: 1667.2, 1: 1679.9. Samples: 30360456. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:13,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.940')] [2023-10-12 18:15:13,579][62635] Updated weights for policy 1, policy_version 59280 (0.0008) [2023-10-12 18:15:13,952][62635] Updated weights for policy 1, policy_version 59290 (0.0008) [2023-10-12 18:15:15,848][62634] Updated weights for policy 0, policy_version 59300 (0.0008) [2023-10-12 18:15:16,231][62634] Updated weights for policy 0, policy_version 59310 (0.0007) [2023-10-12 18:15:16,604][62634] Updated weights for policy 0, policy_version 59320 (0.0010) [2023-10-12 18:15:18,046][62635] Updated weights for policy 1, policy_version 59300 (0.0009) [2023-10-12 18:15:18,413][62635] Updated weights for policy 1, policy_version 59310 (0.0010) [2023-10-12 18:15:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 121470976. Throughput: 0: 1677.9, 1: 1675.6. Samples: 30380766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:18,435][61643] Avg episode reward: [(0, '24.540'), (1, '9.870')] [2023-10-12 18:15:18,776][62635] Updated weights for policy 1, policy_version 59320 (0.0007) [2023-10-12 18:15:20,793][62634] Updated weights for policy 0, policy_version 59330 (0.0010) [2023-10-12 18:15:21,164][62634] Updated weights for policy 0, policy_version 59340 (0.0010) [2023-10-12 18:15:21,540][62634] Updated weights for policy 0, policy_version 59350 (0.0010) [2023-10-12 18:15:21,916][62634] Updated weights for policy 0, policy_version 59360 (0.0009) [2023-10-12 18:15:22,940][62635] Updated weights for policy 1, policy_version 59330 (0.0008) [2023-10-12 18:15:23,298][62635] Updated weights for policy 1, policy_version 59340 (0.0009) [2023-10-12 18:15:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121536512. Throughput: 0: 1679.3, 1: 1680.3. Samples: 30390970. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:23,436][61643] Avg episode reward: [(0, '24.440'), (1, '9.820')] [2023-10-12 18:15:23,662][62635] Updated weights for policy 1, policy_version 59350 (0.0011) [2023-10-12 18:15:24,038][62635] Updated weights for policy 1, policy_version 59360 (0.0010) [2023-10-12 18:15:26,050][62634] Updated weights for policy 0, policy_version 59370 (0.0009) [2023-10-12 18:15:26,419][62634] Updated weights for policy 0, policy_version 59380 (0.0009) [2023-10-12 18:15:26,800][62634] Updated weights for policy 0, policy_version 59390 (0.0008) [2023-10-12 18:15:28,087][62635] Updated weights for policy 1, policy_version 59370 (0.0009) [2023-10-12 18:15:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121602048. Throughput: 0: 1663.5, 1: 1677.4. Samples: 30410392. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:28,436][61643] Avg episode reward: [(0, '24.540'), (1, '10.010')] [2023-10-12 18:15:28,459][62635] Updated weights for policy 1, policy_version 59380 (0.0007) [2023-10-12 18:15:28,833][62635] Updated weights for policy 1, policy_version 59390 (0.0008) [2023-10-12 18:15:30,849][62634] Updated weights for policy 0, policy_version 59400 (0.0009) [2023-10-12 18:15:31,226][62634] Updated weights for policy 0, policy_version 59410 (0.0007) [2023-10-12 18:15:31,612][62634] Updated weights for policy 0, policy_version 59420 (0.0009) [2023-10-12 18:15:32,954][62635] Updated weights for policy 1, policy_version 59400 (0.0009) [2023-10-12 18:15:33,317][62635] Updated weights for policy 1, policy_version 59410 (0.0008) [2023-10-12 18:15:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121667584. Throughput: 0: 1687.5, 1: 1667.6. Samples: 30430828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:33,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.820')] [2023-10-12 18:15:33,679][62635] Updated weights for policy 1, policy_version 59420 (0.0009) [2023-10-12 18:15:35,634][62634] Updated weights for policy 0, policy_version 59430 (0.0009) [2023-10-12 18:15:36,015][62634] Updated weights for policy 0, policy_version 59440 (0.0009) [2023-10-12 18:15:36,385][62634] Updated weights for policy 0, policy_version 59450 (0.0008) [2023-10-12 18:15:37,819][62635] Updated weights for policy 1, policy_version 59430 (0.0008) [2023-10-12 18:15:38,204][62635] Updated weights for policy 1, policy_version 59440 (0.0009) [2023-10-12 18:15:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121733120. Throughput: 0: 1667.5, 1: 1684.4. Samples: 30440980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:38,435][61643] Avg episode reward: [(0, '24.580'), (1, '9.790')] [2023-10-12 18:15:38,575][62635] Updated weights for policy 1, policy_version 59450 (0.0008) [2023-10-12 18:15:40,567][62634] Updated weights for policy 0, policy_version 59460 (0.0008) [2023-10-12 18:15:40,949][62634] Updated weights for policy 0, policy_version 59470 (0.0009) [2023-10-12 18:15:41,325][62634] Updated weights for policy 0, policy_version 59480 (0.0007) [2023-10-12 18:15:42,573][62635] Updated weights for policy 1, policy_version 59460 (0.0009) [2023-10-12 18:15:42,942][62635] Updated weights for policy 1, policy_version 59470 (0.0010) [2023-10-12 18:15:43,317][62635] Updated weights for policy 1, policy_version 59480 (0.0009) [2023-10-12 18:15:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121798656. Throughput: 0: 1661.8, 1: 1681.4. Samples: 30460718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:15:43,435][61643] Avg episode reward: [(0, '24.500'), (1, '10.060')] [2023-10-12 18:15:45,306][62634] Updated weights for policy 0, policy_version 59490 (0.0009) [2023-10-12 18:15:45,693][62634] Updated weights for policy 0, policy_version 59500 (0.0010) [2023-10-12 18:15:46,078][62634] Updated weights for policy 0, policy_version 59510 (0.0008) [2023-10-12 18:15:46,453][62634] Updated weights for policy 0, policy_version 59520 (0.0010) [2023-10-12 18:15:47,367][62635] Updated weights for policy 1, policy_version 59490 (0.0008) [2023-10-12 18:15:47,726][62635] Updated weights for policy 1, policy_version 59500 (0.0009) [2023-10-12 18:15:48,090][62635] Updated weights for policy 1, policy_version 59510 (0.0009) [2023-10-12 18:15:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121864192. Throughput: 0: 1677.9, 1: 1661.5. Samples: 30480788. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:15:48,436][61643] Avg episode reward: [(0, '24.800'), (1, '9.750')] [2023-10-12 18:15:48,462][62635] Updated weights for policy 1, policy_version 59520 (0.0008) [2023-10-12 18:15:50,557][62634] Updated weights for policy 0, policy_version 59530 (0.0011) [2023-10-12 18:15:50,939][62634] Updated weights for policy 0, policy_version 59540 (0.0008) [2023-10-12 18:15:51,321][62634] Updated weights for policy 0, policy_version 59550 (0.0008) [2023-10-12 18:15:52,511][62635] Updated weights for policy 1, policy_version 59530 (0.0007) [2023-10-12 18:15:52,880][62635] Updated weights for policy 1, policy_version 59540 (0.0008) [2023-10-12 18:15:53,255][62635] Updated weights for policy 1, policy_version 59550 (0.0008) [2023-10-12 18:15:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 121962496. Throughput: 0: 1659.9, 1: 1683.3. Samples: 30491266. 
Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:15:53,436][61643] Avg episode reward: [(0, '24.760'), (1, '9.660')] [2023-10-12 18:15:55,441][62634] Updated weights for policy 0, policy_version 59560 (0.0008) [2023-10-12 18:15:55,824][62634] Updated weights for policy 0, policy_version 59570 (0.0008) [2023-10-12 18:15:56,196][62634] Updated weights for policy 0, policy_version 59580 (0.0009) [2023-10-12 18:15:57,164][62635] Updated weights for policy 1, policy_version 59560 (0.0009) [2023-10-12 18:15:57,525][62635] Updated weights for policy 1, policy_version 59570 (0.0011) [2023-10-12 18:15:57,893][62635] Updated weights for policy 1, policy_version 59580 (0.0010) [2023-10-12 18:15:58,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122028032. Throughput: 0: 1669.0, 1: 1680.7. Samples: 30511192. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:15:58,436][61643] Avg episode reward: [(0, '25.110'), (1, '9.860')] [2023-10-12 18:16:00,320][62634] Updated weights for policy 0, policy_version 59590 (0.0007) [2023-10-12 18:16:00,707][62634] Updated weights for policy 0, policy_version 59600 (0.0008) [2023-10-12 18:16:01,097][62634] Updated weights for policy 0, policy_version 59610 (0.0007) [2023-10-12 18:16:02,187][62635] Updated weights for policy 1, policy_version 59590 (0.0010) [2023-10-12 18:16:02,565][62635] Updated weights for policy 1, policy_version 59600 (0.0008) [2023-10-12 18:16:02,936][62635] Updated weights for policy 1, policy_version 59610 (0.0007) [2023-10-12 18:16:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122093568. Throughput: 0: 1675.8, 1: 1662.6. Samples: 30530992. 
Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:16:03,435][61643] Avg episode reward: [(0, '24.950'), (1, '9.570')] [2023-10-12 18:16:05,092][62634] Updated weights for policy 0, policy_version 59620 (0.0008) [2023-10-12 18:16:05,466][62634] Updated weights for policy 0, policy_version 59630 (0.0009) [2023-10-12 18:16:05,851][62634] Updated weights for policy 0, policy_version 59640 (0.0008) [2023-10-12 18:16:07,068][62635] Updated weights for policy 1, policy_version 59620 (0.0008) [2023-10-12 18:16:07,428][62635] Updated weights for policy 1, policy_version 59630 (0.0010) [2023-10-12 18:16:07,799][62635] Updated weights for policy 1, policy_version 59640 (0.0010) [2023-10-12 18:16:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122159104. Throughput: 0: 1655.3, 1: 1683.0. Samples: 30541196. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:16:08,435][61643] Avg episode reward: [(0, '24.980'), (1, '9.410')] [2023-10-12 18:16:09,789][62634] Updated weights for policy 0, policy_version 59650 (0.0007) [2023-10-12 18:16:10,164][62634] Updated weights for policy 0, policy_version 59660 (0.0009) [2023-10-12 18:16:10,541][62634] Updated weights for policy 0, policy_version 59670 (0.0008) [2023-10-12 18:16:10,918][62634] Updated weights for policy 0, policy_version 59680 (0.0007) [2023-10-12 18:16:11,803][62635] Updated weights for policy 1, policy_version 59650 (0.0010) [2023-10-12 18:16:12,176][62635] Updated weights for policy 1, policy_version 59660 (0.0010) [2023-10-12 18:16:12,540][62635] Updated weights for policy 1, policy_version 59670 (0.0008) [2023-10-12 18:16:12,907][62635] Updated weights for policy 1, policy_version 59680 (0.0008) [2023-10-12 18:16:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122224640. Throughput: 0: 1676.6, 1: 1683.5. Samples: 30561598. 
Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:16:13,436][61643] Avg episode reward: [(0, '25.140'), (1, '9.900')] [2023-10-12 18:16:14,986][62634] Updated weights for policy 0, policy_version 59690 (0.0008) [2023-10-12 18:16:15,364][62634] Updated weights for policy 0, policy_version 59700 (0.0009) [2023-10-12 18:16:15,740][62634] Updated weights for policy 0, policy_version 59710 (0.0009) [2023-10-12 18:16:16,903][62635] Updated weights for policy 1, policy_version 59690 (0.0008) [2023-10-12 18:16:17,266][62635] Updated weights for policy 1, policy_version 59700 (0.0008) [2023-10-12 18:16:17,629][62635] Updated weights for policy 1, policy_version 59710 (0.0008) [2023-10-12 18:16:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122290176. Throughput: 0: 1674.3, 1: 1675.3. Samples: 30581560. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:16:18,436][61643] Avg episode reward: [(0, '25.090'), (1, '9.630')] [2023-10-12 18:16:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000059712_61145088.pth... [2023-10-12 18:16:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000059712_61145088.pth... 
[2023-10-12 18:16:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000058144_59539456.pth [2023-10-12 18:16:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000058144_59539456.pth [2023-10-12 18:16:19,863][62634] Updated weights for policy 0, policy_version 59720 (0.0009) [2023-10-12 18:16:20,241][62634] Updated weights for policy 0, policy_version 59730 (0.0010) [2023-10-12 18:16:20,622][62634] Updated weights for policy 0, policy_version 59740 (0.0010) [2023-10-12 18:16:21,646][62635] Updated weights for policy 1, policy_version 59720 (0.0008) [2023-10-12 18:16:22,016][62635] Updated weights for policy 1, policy_version 59730 (0.0009) [2023-10-12 18:16:22,388][62635] Updated weights for policy 1, policy_version 59740 (0.0011) [2023-10-12 18:16:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122355712. Throughput: 0: 1656.0, 1: 1697.9. Samples: 30591906. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:16:23,435][61643] Avg episode reward: [(0, '25.390'), (1, '9.930')] [2023-10-12 18:16:24,485][62634] Updated weights for policy 0, policy_version 59750 (0.0009) [2023-10-12 18:16:24,871][62634] Updated weights for policy 0, policy_version 59760 (0.0009) [2023-10-12 18:16:25,253][62634] Updated weights for policy 0, policy_version 59770 (0.0007) [2023-10-12 18:16:26,598][62635] Updated weights for policy 1, policy_version 59750 (0.0010) [2023-10-12 18:16:26,991][62635] Updated weights for policy 1, policy_version 59760 (0.0010) [2023-10-12 18:16:27,355][62635] Updated weights for policy 1, policy_version 59770 (0.0010) [2023-10-12 18:16:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122421248. Throughput: 0: 1681.1, 1: 1680.5. Samples: 30611990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:28,436][61643] Avg episode reward: [(0, '25.520'), (1, '9.980')] [2023-10-12 18:16:28,437][62354] Saving new best policy, reward=25.520! [2023-10-12 18:16:29,389][62634] Updated weights for policy 0, policy_version 59780 (0.0007) [2023-10-12 18:16:29,766][62634] Updated weights for policy 0, policy_version 59790 (0.0010) [2023-10-12 18:16:30,145][62634] Updated weights for policy 0, policy_version 59800 (0.0009) [2023-10-12 18:16:31,317][62635] Updated weights for policy 1, policy_version 59780 (0.0010) [2023-10-12 18:16:31,671][62635] Updated weights for policy 1, policy_version 59790 (0.0010) [2023-10-12 18:16:32,033][62635] Updated weights for policy 1, policy_version 59800 (0.0009) [2023-10-12 18:16:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122486784. Throughput: 0: 1673.0, 1: 1681.6. Samples: 30631742. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:33,435][61643] Avg episode reward: [(0, '25.360'), (1, '9.650')] [2023-10-12 18:16:34,334][62634] Updated weights for policy 0, policy_version 59810 (0.0008) [2023-10-12 18:16:34,702][62634] Updated weights for policy 0, policy_version 59820 (0.0007) [2023-10-12 18:16:35,073][62634] Updated weights for policy 0, policy_version 59830 (0.0008) [2023-10-12 18:16:35,459][62634] Updated weights for policy 0, policy_version 59840 (0.0009) [2023-10-12 18:16:35,903][62635] Updated weights for policy 1, policy_version 59810 (0.0007) [2023-10-12 18:16:36,269][62635] Updated weights for policy 1, policy_version 59820 (0.0010) [2023-10-12 18:16:36,642][62635] Updated weights for policy 1, policy_version 59830 (0.0011) [2023-10-12 18:16:37,010][62635] Updated weights for policy 1, policy_version 59840 (0.0008) [2023-10-12 18:16:38,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122552320. Throughput: 0: 1659.8, 1: 1685.4. Samples: 30641802. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:38,435][61643] Avg episode reward: [(0, '25.420'), (1, '9.790')] [2023-10-12 18:16:39,798][62634] Updated weights for policy 0, policy_version 59850 (0.0009) [2023-10-12 18:16:40,176][62634] Updated weights for policy 0, policy_version 59860 (0.0009) [2023-10-12 18:16:40,551][62634] Updated weights for policy 0, policy_version 59870 (0.0011) [2023-10-12 18:16:40,987][62635] Updated weights for policy 1, policy_version 59850 (0.0010) [2023-10-12 18:16:41,354][62635] Updated weights for policy 1, policy_version 59860 (0.0007) [2023-10-12 18:16:41,716][62635] Updated weights for policy 1, policy_version 59870 (0.0007) [2023-10-12 18:16:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122617856. Throughput: 0: 1669.6, 1: 1665.4. Samples: 30661270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:43,436][61643] Avg episode reward: [(0, '25.270'), (1, '9.810')] [2023-10-12 18:16:44,746][62634] Updated weights for policy 0, policy_version 59880 (0.0007) [2023-10-12 18:16:45,120][62634] Updated weights for policy 0, policy_version 59890 (0.0009) [2023-10-12 18:16:45,509][62634] Updated weights for policy 0, policy_version 59900 (0.0009) [2023-10-12 18:16:45,903][62635] Updated weights for policy 1, policy_version 59880 (0.0010) [2023-10-12 18:16:46,261][62635] Updated weights for policy 1, policy_version 59890 (0.0011) [2023-10-12 18:16:46,637][62635] Updated weights for policy 1, policy_version 59900 (0.0009) [2023-10-12 18:16:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122683392. Throughput: 0: 1668.1, 1: 1685.8. Samples: 30681916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:48,435][61643] Avg episode reward: [(0, '25.230'), (1, '9.630')] [2023-10-12 18:16:49,587][62634] Updated weights for policy 0, policy_version 59910 (0.0010) [2023-10-12 18:16:49,966][62634] Updated weights for policy 0, policy_version 59920 (0.0010) [2023-10-12 18:16:50,343][62634] Updated weights for policy 0, policy_version 59930 (0.0008) [2023-10-12 18:16:50,720][62635] Updated weights for policy 1, policy_version 59910 (0.0008) [2023-10-12 18:16:51,076][62635] Updated weights for policy 1, policy_version 59920 (0.0008) [2023-10-12 18:16:51,448][62635] Updated weights for policy 1, policy_version 59930 (0.0009) [2023-10-12 18:16:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 122748928. Throughput: 0: 1662.4, 1: 1678.9. Samples: 30691558. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:53,435][61643] Avg episode reward: [(0, '25.180'), (1, '9.620')] [2023-10-12 18:16:54,412][62634] Updated weights for policy 0, policy_version 59940 (0.0008) [2023-10-12 18:16:54,785][62634] Updated weights for policy 0, policy_version 59950 (0.0007) [2023-10-12 18:16:55,161][62634] Updated weights for policy 0, policy_version 59960 (0.0008) [2023-10-12 18:16:55,459][62635] Updated weights for policy 1, policy_version 59940 (0.0008) [2023-10-12 18:16:55,833][62635] Updated weights for policy 1, policy_version 59950 (0.0008) [2023-10-12 18:16:56,201][62635] Updated weights for policy 1, policy_version 59960 (0.0008) [2023-10-12 18:16:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 122814464. Throughput: 0: 1669.3, 1: 1666.3. Samples: 30711698. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:16:58,435][61643] Avg episode reward: [(0, '25.340'), (1, '9.700')] [2023-10-12 18:16:59,135][62634] Updated weights for policy 0, policy_version 59970 (0.0008) [2023-10-12 18:16:59,509][62634] Updated weights for policy 0, policy_version 59980 (0.0010) [2023-10-12 18:16:59,893][62634] Updated weights for policy 0, policy_version 59990 (0.0009) [2023-10-12 18:17:00,259][62634] Updated weights for policy 0, policy_version 60000 (0.0007) [2023-10-12 18:17:00,451][62635] Updated weights for policy 1, policy_version 59970 (0.0007) [2023-10-12 18:17:00,820][62635] Updated weights for policy 1, policy_version 59980 (0.0007) [2023-10-12 18:17:01,195][62635] Updated weights for policy 1, policy_version 59990 (0.0008) [2023-10-12 18:17:01,555][62635] Updated weights for policy 1, policy_version 60000 (0.0009) [2023-10-12 18:17:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 122880000. Throughput: 0: 1673.8, 1: 1681.6. Samples: 30732552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:17:03,435][61643] Avg episode reward: [(0, '24.950'), (1, '9.760')] [2023-10-12 18:17:04,387][62634] Updated weights for policy 0, policy_version 60010 (0.0008) [2023-10-12 18:17:04,772][62634] Updated weights for policy 0, policy_version 60020 (0.0010) [2023-10-12 18:17:05,143][62634] Updated weights for policy 0, policy_version 60030 (0.0008) [2023-10-12 18:17:05,748][62635] Updated weights for policy 1, policy_version 60010 (0.0007) [2023-10-12 18:17:06,118][62635] Updated weights for policy 1, policy_version 60020 (0.0008) [2023-10-12 18:17:06,489][62635] Updated weights for policy 1, policy_version 60030 (0.0009) [2023-10-12 18:17:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 122945536. Throughput: 0: 1676.2, 1: 1663.5. Samples: 30742192. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:08,436][61643] Avg episode reward: [(0, '24.950'), (1, '9.910')] [2023-10-12 18:17:09,306][62634] Updated weights for policy 0, policy_version 60040 (0.0010) [2023-10-12 18:17:09,674][62634] Updated weights for policy 0, policy_version 60050 (0.0009) [2023-10-12 18:17:10,061][62634] Updated weights for policy 0, policy_version 60060 (0.0010) [2023-10-12 18:17:10,402][62635] Updated weights for policy 1, policy_version 60040 (0.0008) [2023-10-12 18:17:10,766][62635] Updated weights for policy 1, policy_version 60050 (0.0007) [2023-10-12 18:17:11,138][62635] Updated weights for policy 1, policy_version 60060 (0.0007) [2023-10-12 18:17:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 123011072. Throughput: 0: 1667.6, 1: 1667.0. Samples: 30762046. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:13,436][61643] Avg episode reward: [(0, '24.850'), (1, '10.080')] [2023-10-12 18:17:14,109][62634] Updated weights for policy 0, policy_version 60070 (0.0008) [2023-10-12 18:17:14,485][62634] Updated weights for policy 0, policy_version 60080 (0.0007) [2023-10-12 18:17:14,868][62634] Updated weights for policy 0, policy_version 60090 (0.0009) [2023-10-12 18:17:15,336][62635] Updated weights for policy 1, policy_version 60070 (0.0009) [2023-10-12 18:17:15,722][62635] Updated weights for policy 1, policy_version 60080 (0.0008) [2023-10-12 18:17:16,089][62635] Updated weights for policy 1, policy_version 60090 (0.0008) [2023-10-12 18:17:18,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123076608. Throughput: 0: 1673.8, 1: 1682.1. Samples: 30782758. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:18,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.970')] [2023-10-12 18:17:18,814][62634] Updated weights for policy 0, policy_version 60100 (0.0009) [2023-10-12 18:17:19,194][62634] Updated weights for policy 0, policy_version 60110 (0.0009) [2023-10-12 18:17:19,563][62634] Updated weights for policy 0, policy_version 60120 (0.0008) [2023-10-12 18:17:20,057][62635] Updated weights for policy 1, policy_version 60100 (0.0007) [2023-10-12 18:17:20,426][62635] Updated weights for policy 1, policy_version 60110 (0.0007) [2023-10-12 18:17:20,790][62635] Updated weights for policy 1, policy_version 60120 (0.0008) [2023-10-12 18:17:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 123142144. Throughput: 0: 1674.9, 1: 1662.0. Samples: 30791962. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:23,436][61643] Avg episode reward: [(0, '24.890'), (1, '9.830')] [2023-10-12 18:17:23,738][62634] Updated weights for policy 0, policy_version 60130 (0.0008) [2023-10-12 18:17:24,110][62634] Updated weights for policy 0, policy_version 60140 (0.0010) [2023-10-12 18:17:24,491][62634] Updated weights for policy 0, policy_version 60150 (0.0010) [2023-10-12 18:17:24,856][62634] Updated weights for policy 0, policy_version 60160 (0.0011) [2023-10-12 18:17:24,888][62635] Updated weights for policy 1, policy_version 60130 (0.0008) [2023-10-12 18:17:25,261][62635] Updated weights for policy 1, policy_version 60140 (0.0009) [2023-10-12 18:17:25,639][62635] Updated weights for policy 1, policy_version 60150 (0.0009) [2023-10-12 18:17:26,003][62635] Updated weights for policy 1, policy_version 60160 (0.0007) [2023-10-12 18:17:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123207680. Throughput: 0: 1679.2, 1: 1678.4. Samples: 30812362. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:28,436][61643] Avg episode reward: [(0, '24.820'), (1, '9.840')] [2023-10-12 18:17:29,022][62634] Updated weights for policy 0, policy_version 60170 (0.0008) [2023-10-12 18:17:29,399][62634] Updated weights for policy 0, policy_version 60180 (0.0009) [2023-10-12 18:17:29,772][62634] Updated weights for policy 0, policy_version 60190 (0.0010) [2023-10-12 18:17:30,143][62635] Updated weights for policy 1, policy_version 60170 (0.0008) [2023-10-12 18:17:30,512][62635] Updated weights for policy 1, policy_version 60180 (0.0007) [2023-10-12 18:17:30,881][62635] Updated weights for policy 1, policy_version 60190 (0.0007) [2023-10-12 18:17:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123273216. Throughput: 0: 1674.3, 1: 1680.5. Samples: 30832884. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:33,435][61643] Avg episode reward: [(0, '24.530'), (1, '9.950')] [2023-10-12 18:17:33,853][62634] Updated weights for policy 0, policy_version 60200 (0.0009) [2023-10-12 18:17:34,231][62634] Updated weights for policy 0, policy_version 60210 (0.0008) [2023-10-12 18:17:34,613][62634] Updated weights for policy 0, policy_version 60220 (0.0007) [2023-10-12 18:17:34,760][62635] Updated weights for policy 1, policy_version 60200 (0.0007) [2023-10-12 18:17:35,123][62635] Updated weights for policy 1, policy_version 60210 (0.0007) [2023-10-12 18:17:35,484][62635] Updated weights for policy 1, policy_version 60220 (0.0007) [2023-10-12 18:17:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123338752. Throughput: 0: 1677.3, 1: 1667.6. Samples: 30842078. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:38,436][61643] Avg episode reward: [(0, '24.440'), (1, '9.720')] [2023-10-12 18:17:38,715][62634] Updated weights for policy 0, policy_version 60230 (0.0010) [2023-10-12 18:17:39,082][62634] Updated weights for policy 0, policy_version 60240 (0.0010) [2023-10-12 18:17:39,469][62634] Updated weights for policy 0, policy_version 60250 (0.0008) [2023-10-12 18:17:39,543][62635] Updated weights for policy 1, policy_version 60230 (0.0008) [2023-10-12 18:17:39,909][62635] Updated weights for policy 1, policy_version 60240 (0.0008) [2023-10-12 18:17:40,276][62635] Updated weights for policy 1, policy_version 60250 (0.0009) [2023-10-12 18:17:43,311][62634] Updated weights for policy 0, policy_version 60260 (0.0008) [2023-10-12 18:17:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123404288. Throughput: 0: 1673.8, 1: 1687.6. Samples: 30862960. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 18:17:43,435][61643] Avg episode reward: [(0, '24.470'), (1, '9.810')] [2023-10-12 18:17:43,690][62634] Updated weights for policy 0, policy_version 60270 (0.0009) [2023-10-12 18:17:44,061][62634] Updated weights for policy 0, policy_version 60280 (0.0007) [2023-10-12 18:17:44,181][62635] Updated weights for policy 1, policy_version 60260 (0.0008) [2023-10-12 18:17:44,544][62635] Updated weights for policy 1, policy_version 60270 (0.0007) [2023-10-12 18:17:44,908][62635] Updated weights for policy 1, policy_version 60280 (0.0009) [2023-10-12 18:17:48,006][62634] Updated weights for policy 0, policy_version 60290 (0.0009) [2023-10-12 18:17:48,386][62634] Updated weights for policy 0, policy_version 60300 (0.0009) [2023-10-12 18:17:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123469824. Throughput: 0: 1669.4, 1: 1691.5. Samples: 30883790. 
Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:17:48,435][61643] Avg episode reward: [(0, '24.260'), (1, '9.790')] [2023-10-12 18:17:48,767][62634] Updated weights for policy 0, policy_version 60310 (0.0010) [2023-10-12 18:17:48,905][62635] Updated weights for policy 1, policy_version 60290 (0.0007) [2023-10-12 18:17:49,133][62634] Updated weights for policy 0, policy_version 60320 (0.0007) [2023-10-12 18:17:49,282][62635] Updated weights for policy 1, policy_version 60300 (0.0008) [2023-10-12 18:17:49,649][62635] Updated weights for policy 1, policy_version 60310 (0.0010) [2023-10-12 18:17:50,024][62635] Updated weights for policy 1, policy_version 60320 (0.0011) [2023-10-12 18:17:53,107][62634] Updated weights for policy 0, policy_version 60330 (0.0008) [2023-10-12 18:17:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123535360. Throughput: 0: 1673.5, 1: 1676.1. Samples: 30892924. Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:17:53,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.770')] [2023-10-12 18:17:53,480][62634] Updated weights for policy 0, policy_version 60340 (0.0008) [2023-10-12 18:17:53,852][62634] Updated weights for policy 0, policy_version 60350 (0.0010) [2023-10-12 18:17:54,089][62635] Updated weights for policy 1, policy_version 60330 (0.0008) [2023-10-12 18:17:54,462][62635] Updated weights for policy 1, policy_version 60340 (0.0007) [2023-10-12 18:17:54,836][62635] Updated weights for policy 1, policy_version 60350 (0.0008) [2023-10-12 18:17:57,961][62634] Updated weights for policy 0, policy_version 60360 (0.0010) [2023-10-12 18:17:58,345][62634] Updated weights for policy 0, policy_version 60370 (0.0008) [2023-10-12 18:17:58,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 123600896. Throughput: 0: 1676.3, 1: 1693.1. Samples: 30913666. 
Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:17:58,436][61643] Avg episode reward: [(0, '24.440'), (1, '9.990')] [2023-10-12 18:17:58,726][62634] Updated weights for policy 0, policy_version 60380 (0.0008) [2023-10-12 18:17:58,868][62635] Updated weights for policy 1, policy_version 60360 (0.0007) [2023-10-12 18:17:59,232][62635] Updated weights for policy 1, policy_version 60370 (0.0008) [2023-10-12 18:17:59,611][62635] Updated weights for policy 1, policy_version 60380 (0.0010) [2023-10-12 18:18:02,888][62634] Updated weights for policy 0, policy_version 60390 (0.0008) [2023-10-12 18:18:03,262][62634] Updated weights for policy 0, policy_version 60400 (0.0008) [2023-10-12 18:18:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123666432. Throughput: 0: 1669.3, 1: 1698.8. Samples: 30934326. Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:18:03,435][61643] Avg episode reward: [(0, '24.520'), (1, '9.720')] [2023-10-12 18:18:03,637][62634] Updated weights for policy 0, policy_version 60410 (0.0008) [2023-10-12 18:18:03,758][62635] Updated weights for policy 1, policy_version 60390 (0.0008) [2023-10-12 18:18:04,144][62635] Updated weights for policy 1, policy_version 60400 (0.0009) [2023-10-12 18:18:04,512][62635] Updated weights for policy 1, policy_version 60410 (0.0010) [2023-10-12 18:18:07,685][62634] Updated weights for policy 0, policy_version 60420 (0.0009) [2023-10-12 18:18:08,068][62634] Updated weights for policy 0, policy_version 60430 (0.0010) [2023-10-12 18:18:08,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123731968. Throughput: 0: 1680.8, 1: 1692.5. Samples: 30943758. 
Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:18:08,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.880')] [2023-10-12 18:18:08,450][62634] Updated weights for policy 0, policy_version 60440 (0.0009) [2023-10-12 18:18:08,517][62635] Updated weights for policy 1, policy_version 60420 (0.0009) [2023-10-12 18:18:08,881][62635] Updated weights for policy 1, policy_version 60430 (0.0008) [2023-10-12 18:18:09,265][62635] Updated weights for policy 1, policy_version 60440 (0.0008) [2023-10-12 18:18:12,485][62634] Updated weights for policy 0, policy_version 60450 (0.0009) [2023-10-12 18:18:12,863][62634] Updated weights for policy 0, policy_version 60460 (0.0007) [2023-10-12 18:18:13,241][62634] Updated weights for policy 0, policy_version 60470 (0.0009) [2023-10-12 18:18:13,401][62635] Updated weights for policy 1, policy_version 60450 (0.0009) [2023-10-12 18:18:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123797504. Throughput: 0: 1682.9, 1: 1702.9. Samples: 30964720. Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:18:13,435][61643] Avg episode reward: [(0, '24.390'), (1, '10.050')] [2023-10-12 18:18:13,611][62634] Updated weights for policy 0, policy_version 60480 (0.0008) [2023-10-12 18:18:13,769][62635] Updated weights for policy 1, policy_version 60460 (0.0009) [2023-10-12 18:18:14,134][62635] Updated weights for policy 1, policy_version 60470 (0.0009) [2023-10-12 18:18:14,510][62635] Updated weights for policy 1, policy_version 60480 (0.0009) [2023-10-12 18:18:17,505][62634] Updated weights for policy 0, policy_version 60490 (0.0010) [2023-10-12 18:18:17,884][62634] Updated weights for policy 0, policy_version 60500 (0.0010) [2023-10-12 18:18:18,258][62634] Updated weights for policy 0, policy_version 60510 (0.0009) [2023-10-12 18:18:18,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123895808. Throughput: 0: 1670.9, 1: 1704.4. 
Samples: 30984774. Policy #0 lag: (min: 28.0, avg: 30.7, max: 60.0) [2023-10-12 18:18:18,436][61643] Avg episode reward: [(0, '24.670'), (1, '9.800')] [2023-10-12 18:18:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000060512_61964288.pth... [2023-10-12 18:18:18,482][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000058944_60358656.pth [2023-10-12 18:18:18,547][62635] Updated weights for policy 1, policy_version 60490 (0.0010) [2023-10-12 18:18:18,908][62635] Updated weights for policy 1, policy_version 60500 (0.0009) [2023-10-12 18:18:19,279][62635] Updated weights for policy 1, policy_version 60510 (0.0011) [2023-10-12 18:18:19,352][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000060512_61964288.pth... [2023-10-12 18:18:19,384][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000058912_60325888.pth [2023-10-12 18:18:22,321][62634] Updated weights for policy 0, policy_version 60520 (0.0009) [2023-10-12 18:18:22,709][62634] Updated weights for policy 0, policy_version 60530 (0.0007) [2023-10-12 18:18:23,082][62634] Updated weights for policy 0, policy_version 60540 (0.0008) [2023-10-12 18:18:23,218][62635] Updated weights for policy 1, policy_version 60520 (0.0010) [2023-10-12 18:18:23,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 123961344. Throughput: 0: 1688.3, 1: 1702.6. Samples: 30994666. 
Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:23,435][61643] Avg episode reward: [(0, '24.630'), (1, '9.810')] [2023-10-12 18:18:23,597][62635] Updated weights for policy 1, policy_version 60530 (0.0010) [2023-10-12 18:18:23,957][62635] Updated weights for policy 1, policy_version 60540 (0.0009) [2023-10-12 18:18:27,121][62634] Updated weights for policy 0, policy_version 60550 (0.0007) [2023-10-12 18:18:27,492][62634] Updated weights for policy 0, policy_version 60560 (0.0007) [2023-10-12 18:18:27,868][62634] Updated weights for policy 0, policy_version 60570 (0.0007) [2023-10-12 18:18:28,062][62635] Updated weights for policy 1, policy_version 60550 (0.0009) [2023-10-12 18:18:28,430][62635] Updated weights for policy 1, policy_version 60560 (0.0009) [2023-10-12 18:18:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124026880. Throughput: 0: 1685.0, 1: 1696.1. Samples: 31015108. Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:28,436][61643] Avg episode reward: [(0, '24.650'), (1, '10.080')] [2023-10-12 18:18:28,799][62635] Updated weights for policy 1, policy_version 60570 (0.0010) [2023-10-12 18:18:31,997][62634] Updated weights for policy 0, policy_version 60580 (0.0009) [2023-10-12 18:18:32,380][62634] Updated weights for policy 0, policy_version 60590 (0.0007) [2023-10-12 18:18:32,756][62634] Updated weights for policy 0, policy_version 60600 (0.0007) [2023-10-12 18:18:32,780][62635] Updated weights for policy 1, policy_version 60580 (0.0007) [2023-10-12 18:18:33,140][62635] Updated weights for policy 1, policy_version 60590 (0.0008) [2023-10-12 18:18:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124092416. Throughput: 0: 1663.5, 1: 1689.2. Samples: 31034666. 
Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:33,436][61643] Avg episode reward: [(0, '24.570'), (1, '9.780')] [2023-10-12 18:18:33,507][62635] Updated weights for policy 1, policy_version 60600 (0.0011) [2023-10-12 18:18:36,768][62634] Updated weights for policy 0, policy_version 60610 (0.0010) [2023-10-12 18:18:37,142][62634] Updated weights for policy 0, policy_version 60620 (0.0010) [2023-10-12 18:18:37,486][62635] Updated weights for policy 1, policy_version 60610 (0.0010) [2023-10-12 18:18:37,519][62634] Updated weights for policy 0, policy_version 60630 (0.0009) [2023-10-12 18:18:37,852][62635] Updated weights for policy 1, policy_version 60620 (0.0009) [2023-10-12 18:18:37,894][62634] Updated weights for policy 0, policy_version 60640 (0.0007) [2023-10-12 18:18:38,222][62635] Updated weights for policy 1, policy_version 60630 (0.0009) [2023-10-12 18:18:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 124157952. Throughput: 0: 1685.6, 1: 1701.0. Samples: 31045322. Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:38,435][61643] Avg episode reward: [(0, '24.430'), (1, '9.690')] [2023-10-12 18:18:38,588][62635] Updated weights for policy 1, policy_version 60640 (0.0007) [2023-10-12 18:18:42,009][62634] Updated weights for policy 0, policy_version 60650 (0.0009) [2023-10-12 18:18:42,393][62634] Updated weights for policy 0, policy_version 60660 (0.0009) [2023-10-12 18:18:42,718][62635] Updated weights for policy 1, policy_version 60650 (0.0007) [2023-10-12 18:18:42,772][62634] Updated weights for policy 0, policy_version 60670 (0.0010) [2023-10-12 18:18:43,090][62635] Updated weights for policy 1, policy_version 60660 (0.0007) [2023-10-12 18:18:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124223488. Throughput: 0: 1679.7, 1: 1697.4. Samples: 31065632. 
Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:43,436][61643] Avg episode reward: [(0, '24.490'), (1, '9.870')] [2023-10-12 18:18:43,463][62635] Updated weights for policy 1, policy_version 60670 (0.0010) [2023-10-12 18:18:46,809][62634] Updated weights for policy 0, policy_version 60680 (0.0010) [2023-10-12 18:18:47,191][62634] Updated weights for policy 0, policy_version 60690 (0.0010) [2023-10-12 18:18:47,433][62635] Updated weights for policy 1, policy_version 60680 (0.0009) [2023-10-12 18:18:47,569][62634] Updated weights for policy 0, policy_version 60700 (0.0008) [2023-10-12 18:18:47,793][62635] Updated weights for policy 1, policy_version 60690 (0.0009) [2023-10-12 18:18:48,155][62635] Updated weights for policy 1, policy_version 60700 (0.0010) [2023-10-12 18:18:48,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 124321792. Throughput: 0: 1663.7, 1: 1675.8. Samples: 31084606. Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:48,435][61643] Avg episode reward: [(0, '24.550'), (1, '9.820')] [2023-10-12 18:18:51,588][62634] Updated weights for policy 0, policy_version 60710 (0.0009) [2023-10-12 18:18:51,970][62634] Updated weights for policy 0, policy_version 60720 (0.0008) [2023-10-12 18:18:52,265][62635] Updated weights for policy 1, policy_version 60710 (0.0010) [2023-10-12 18:18:52,341][62634] Updated weights for policy 0, policy_version 60730 (0.0008) [2023-10-12 18:18:52,652][62635] Updated weights for policy 1, policy_version 60720 (0.0008) [2023-10-12 18:18:53,028][62635] Updated weights for policy 1, policy_version 60730 (0.0008) [2023-10-12 18:18:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 124387328. Throughput: 0: 1684.3, 1: 1700.7. Samples: 31096082. 
Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:53,436][61643] Avg episode reward: [(0, '24.410'), (1, '9.720')] [2023-10-12 18:18:56,364][62634] Updated weights for policy 0, policy_version 60740 (0.0008) [2023-10-12 18:18:56,739][62634] Updated weights for policy 0, policy_version 60750 (0.0007) [2023-10-12 18:18:57,114][62634] Updated weights for policy 0, policy_version 60760 (0.0010) [2023-10-12 18:18:57,208][62635] Updated weights for policy 1, policy_version 60740 (0.0007) [2023-10-12 18:18:57,579][62635] Updated weights for policy 1, policy_version 60750 (0.0009) [2023-10-12 18:18:57,945][62635] Updated weights for policy 1, policy_version 60760 (0.0008) [2023-10-12 18:18:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.6, 300 sec: 13551.5). Total num frames: 124452864. Throughput: 0: 1670.4, 1: 1693.1. Samples: 31116078. Policy #0 lag: (min: 2.0, avg: 2.5, max: 18.0) [2023-10-12 18:18:58,435][61643] Avg episode reward: [(0, '24.300'), (1, '9.780')] [2023-10-12 18:19:01,245][62634] Updated weights for policy 0, policy_version 60770 (0.0009) [2023-10-12 18:19:01,615][62634] Updated weights for policy 0, policy_version 60780 (0.0009) [2023-10-12 18:19:01,889][62635] Updated weights for policy 1, policy_version 60770 (0.0007) [2023-10-12 18:19:01,985][62634] Updated weights for policy 0, policy_version 60790 (0.0008) [2023-10-12 18:19:02,251][62635] Updated weights for policy 1, policy_version 60780 (0.0007) [2023-10-12 18:19:02,360][62634] Updated weights for policy 0, policy_version 60800 (0.0009) [2023-10-12 18:19:02,608][62635] Updated weights for policy 1, policy_version 60790 (0.0008) [2023-10-12 18:19:02,975][62635] Updated weights for policy 1, policy_version 60800 (0.0009) [2023-10-12 18:19:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 124518400. Throughput: 0: 1670.5, 1: 1664.8. Samples: 31134864. 
Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:03,435][61643] Avg episode reward: [(0, '24.420'), (1, '9.830')] [2023-10-12 18:19:06,492][62634] Updated weights for policy 0, policy_version 60810 (0.0007) [2023-10-12 18:19:06,871][62634] Updated weights for policy 0, policy_version 60820 (0.0007) [2023-10-12 18:19:06,969][62635] Updated weights for policy 1, policy_version 60810 (0.0007) [2023-10-12 18:19:07,254][62634] Updated weights for policy 0, policy_version 60830 (0.0007) [2023-10-12 18:19:07,338][62635] Updated weights for policy 1, policy_version 60820 (0.0008) [2023-10-12 18:19:07,706][62635] Updated weights for policy 1, policy_version 60830 (0.0009) [2023-10-12 18:19:08,435][61643] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 124583936. Throughput: 0: 1680.8, 1: 1691.5. Samples: 31146422. Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:08,436][61643] Avg episode reward: [(0, '24.040'), (1, '10.060')] [2023-10-12 18:19:11,391][62634] Updated weights for policy 0, policy_version 60840 (0.0007) [2023-10-12 18:19:11,770][62634] Updated weights for policy 0, policy_version 60850 (0.0010) [2023-10-12 18:19:11,915][62635] Updated weights for policy 1, policy_version 60840 (0.0008) [2023-10-12 18:19:12,153][62634] Updated weights for policy 0, policy_version 60860 (0.0009) [2023-10-12 18:19:12,282][62635] Updated weights for policy 1, policy_version 60850 (0.0008) [2023-10-12 18:19:12,648][62635] Updated weights for policy 1, policy_version 60860 (0.0007) [2023-10-12 18:19:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 124649472. Throughput: 0: 1665.1, 1: 1684.2. Samples: 31165826. 
Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:13,435][61643] Avg episode reward: [(0, '23.530'), (1, '9.910')] [2023-10-12 18:19:16,319][62634] Updated weights for policy 0, policy_version 60870 (0.0010) [2023-10-12 18:19:16,697][62634] Updated weights for policy 0, policy_version 60880 (0.0009) [2023-10-12 18:19:16,830][62635] Updated weights for policy 1, policy_version 60870 (0.0009) [2023-10-12 18:19:17,080][62634] Updated weights for policy 0, policy_version 60890 (0.0008) [2023-10-12 18:19:17,193][62635] Updated weights for policy 1, policy_version 60880 (0.0008) [2023-10-12 18:19:17,556][62635] Updated weights for policy 1, policy_version 60890 (0.0009) [2023-10-12 18:19:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 124715008. Throughput: 0: 1678.7, 1: 1668.3. Samples: 31185278. Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:18,436][61643] Avg episode reward: [(0, '23.560'), (1, '9.770')] [2023-10-12 18:19:21,057][62634] Updated weights for policy 0, policy_version 60900 (0.0009) [2023-10-12 18:19:21,436][62634] Updated weights for policy 0, policy_version 60910 (0.0009) [2023-10-12 18:19:21,510][62635] Updated weights for policy 1, policy_version 60900 (0.0009) [2023-10-12 18:19:21,821][62634] Updated weights for policy 0, policy_version 60920 (0.0007) [2023-10-12 18:19:21,876][62635] Updated weights for policy 1, policy_version 60910 (0.0009) [2023-10-12 18:19:22,253][62635] Updated weights for policy 1, policy_version 60920 (0.0009) [2023-10-12 18:19:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 124780544. Throughput: 0: 1683.4, 1: 1688.3. Samples: 31197050. 
Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:23,435][61643] Avg episode reward: [(0, '23.610'), (1, '9.960')] [2023-10-12 18:19:25,665][62634] Updated weights for policy 0, policy_version 60930 (0.0007) [2023-10-12 18:19:26,043][62634] Updated weights for policy 0, policy_version 60940 (0.0010) [2023-10-12 18:19:26,186][62635] Updated weights for policy 1, policy_version 60930 (0.0007) [2023-10-12 18:19:26,421][62634] Updated weights for policy 0, policy_version 60950 (0.0009) [2023-10-12 18:19:26,556][62635] Updated weights for policy 1, policy_version 60940 (0.0007) [2023-10-12 18:19:26,797][62634] Updated weights for policy 0, policy_version 60960 (0.0008) [2023-10-12 18:19:26,927][62635] Updated weights for policy 1, policy_version 60950 (0.0007) [2023-10-12 18:19:27,292][62635] Updated weights for policy 1, policy_version 60960 (0.0009) [2023-10-12 18:19:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 124846080. Throughput: 0: 1665.7, 1: 1677.1. Samples: 31216060. Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:28,438][61643] Avg episode reward: [(0, '23.220'), (1, '9.940')] [2023-10-12 18:19:30,841][62634] Updated weights for policy 0, policy_version 60970 (0.0007) [2023-10-12 18:19:31,210][62634] Updated weights for policy 0, policy_version 60980 (0.0007) [2023-10-12 18:19:31,314][62635] Updated weights for policy 1, policy_version 60970 (0.0009) [2023-10-12 18:19:31,580][62634] Updated weights for policy 0, policy_version 60990 (0.0007) [2023-10-12 18:19:31,678][62635] Updated weights for policy 1, policy_version 60980 (0.0009) [2023-10-12 18:19:32,045][62635] Updated weights for policy 1, policy_version 60990 (0.0010) [2023-10-12 18:19:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 124911616. Throughput: 0: 1684.9, 1: 1684.0. Samples: 31236210. 
Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:33,436][61643] Avg episode reward: [(0, '23.170'), (1, '9.850')] [2023-10-12 18:19:35,507][62634] Updated weights for policy 0, policy_version 61000 (0.0010) [2023-10-12 18:19:35,888][62634] Updated weights for policy 0, policy_version 61010 (0.0010) [2023-10-12 18:19:36,071][62635] Updated weights for policy 1, policy_version 61000 (0.0008) [2023-10-12 18:19:36,261][62634] Updated weights for policy 0, policy_version 61020 (0.0009) [2023-10-12 18:19:36,433][62635] Updated weights for policy 1, policy_version 61010 (0.0008) [2023-10-12 18:19:36,800][62635] Updated weights for policy 1, policy_version 61020 (0.0010) [2023-10-12 18:19:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124977152. Throughput: 0: 1666.2, 1: 1683.4. Samples: 31246816. Policy #0 lag: (min: 10.0, avg: 10.2, max: 21.0) [2023-10-12 18:19:38,436][61643] Avg episode reward: [(0, '22.910'), (1, '9.950')] [2023-10-12 18:19:40,429][62634] Updated weights for policy 0, policy_version 61030 (0.0007) [2023-10-12 18:19:40,812][62634] Updated weights for policy 0, policy_version 61040 (0.0007) [2023-10-12 18:19:40,894][62635] Updated weights for policy 1, policy_version 61030 (0.0009) [2023-10-12 18:19:41,194][62634] Updated weights for policy 0, policy_version 61050 (0.0007) [2023-10-12 18:19:41,269][62635] Updated weights for policy 1, policy_version 61040 (0.0007) [2023-10-12 18:19:41,646][62635] Updated weights for policy 1, policy_version 61050 (0.0009) [2023-10-12 18:19:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 125042688. Throughput: 0: 1667.7, 1: 1662.0. Samples: 31265916. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:19:43,435][61643] Avg episode reward: [(0, '23.230'), (1, '9.870')] [2023-10-12 18:19:45,195][62634] Updated weights for policy 0, policy_version 61060 (0.0008) [2023-10-12 18:19:45,576][62635] Updated weights for policy 1, policy_version 61060 (0.0009) [2023-10-12 18:19:45,576][62634] Updated weights for policy 0, policy_version 61070 (0.0007) [2023-10-12 18:19:45,938][62635] Updated weights for policy 1, policy_version 61070 (0.0008) [2023-10-12 18:19:45,953][62634] Updated weights for policy 0, policy_version 61080 (0.0008) [2023-10-12 18:19:46,306][62635] Updated weights for policy 1, policy_version 61080 (0.0010) [2023-10-12 18:19:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125108224. Throughput: 0: 1684.0, 1: 1686.7. Samples: 31286548. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:19:48,436][61643] Avg episode reward: [(0, '23.400'), (1, '9.670')] [2023-10-12 18:19:50,046][62634] Updated weights for policy 0, policy_version 61090 (0.0009) [2023-10-12 18:19:50,394][62635] Updated weights for policy 1, policy_version 61090 (0.0009) [2023-10-12 18:19:50,425][62634] Updated weights for policy 0, policy_version 61100 (0.0009) [2023-10-12 18:19:50,756][62635] Updated weights for policy 1, policy_version 61100 (0.0007) [2023-10-12 18:19:50,793][62634] Updated weights for policy 0, policy_version 61110 (0.0008) [2023-10-12 18:19:51,129][62635] Updated weights for policy 1, policy_version 61110 (0.0007) [2023-10-12 18:19:51,169][62634] Updated weights for policy 0, policy_version 61120 (0.0007) [2023-10-12 18:19:51,497][62635] Updated weights for policy 1, policy_version 61120 (0.0009) [2023-10-12 18:19:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125173760. Throughput: 0: 1662.0, 1: 1671.3. Samples: 31296420. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:19:53,436][61643] Avg episode reward: [(0, '23.460'), (1, '9.680')] [2023-10-12 18:19:55,324][62634] Updated weights for policy 0, policy_version 61130 (0.0007) [2023-10-12 18:19:55,514][62635] Updated weights for policy 1, policy_version 61130 (0.0008) [2023-10-12 18:19:55,701][62634] Updated weights for policy 0, policy_version 61140 (0.0007) [2023-10-12 18:19:55,879][62635] Updated weights for policy 1, policy_version 61140 (0.0007) [2023-10-12 18:19:56,065][62634] Updated weights for policy 0, policy_version 61150 (0.0008) [2023-10-12 18:19:56,249][62635] Updated weights for policy 1, policy_version 61150 (0.0007) [2023-10-12 18:19:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125239296. Throughput: 0: 1676.6, 1: 1670.1. Samples: 31316430. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:19:58,435][61643] Avg episode reward: [(0, '23.460'), (1, '9.870')] [2023-10-12 18:20:00,142][62634] Updated weights for policy 0, policy_version 61160 (0.0010) [2023-10-12 18:20:00,472][62635] Updated weights for policy 1, policy_version 61160 (0.0007) [2023-10-12 18:20:00,520][62634] Updated weights for policy 0, policy_version 61170 (0.0007) [2023-10-12 18:20:00,838][62635] Updated weights for policy 1, policy_version 61170 (0.0008) [2023-10-12 18:20:00,896][62634] Updated weights for policy 0, policy_version 61180 (0.0007) [2023-10-12 18:20:01,201][62635] Updated weights for policy 1, policy_version 61180 (0.0008) [2023-10-12 18:20:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125304832. Throughput: 0: 1682.4, 1: 1683.6. Samples: 31336752. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:20:03,435][61643] Avg episode reward: [(0, '23.850'), (1, '9.720')] [2023-10-12 18:20:04,988][62634] Updated weights for policy 0, policy_version 61190 (0.0007) [2023-10-12 18:20:05,371][62634] Updated weights for policy 0, policy_version 61200 (0.0007) [2023-10-12 18:20:05,403][62635] Updated weights for policy 1, policy_version 61190 (0.0009) [2023-10-12 18:20:05,739][62634] Updated weights for policy 0, policy_version 61210 (0.0009) [2023-10-12 18:20:05,764][62635] Updated weights for policy 1, policy_version 61200 (0.0007) [2023-10-12 18:20:06,136][62635] Updated weights for policy 1, policy_version 61210 (0.0008) [2023-10-12 18:20:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125370368. Throughput: 0: 1657.2, 1: 1659.7. Samples: 31346310. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:20:08,436][61643] Avg episode reward: [(0, '23.920'), (1, '9.830')] [2023-10-12 18:20:09,951][62634] Updated weights for policy 0, policy_version 61220 (0.0008) [2023-10-12 18:20:10,254][62635] Updated weights for policy 1, policy_version 61220 (0.0007) [2023-10-12 18:20:10,329][62634] Updated weights for policy 0, policy_version 61230 (0.0009) [2023-10-12 18:20:10,626][62635] Updated weights for policy 1, policy_version 61230 (0.0007) [2023-10-12 18:20:10,712][62634] Updated weights for policy 0, policy_version 61240 (0.0007) [2023-10-12 18:20:11,003][62635] Updated weights for policy 1, policy_version 61240 (0.0009) [2023-10-12 18:20:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125435904. Throughput: 0: 1676.5, 1: 1665.4. Samples: 31366448. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:20:13,435][61643] Avg episode reward: [(0, '23.660'), (1, '9.720')] [2023-10-12 18:20:14,563][62634] Updated weights for policy 0, policy_version 61250 (0.0007) [2023-10-12 18:20:14,939][62634] Updated weights for policy 0, policy_version 61260 (0.0007) [2023-10-12 18:20:15,065][62635] Updated weights for policy 1, policy_version 61250 (0.0009) [2023-10-12 18:20:15,325][62634] Updated weights for policy 0, policy_version 61270 (0.0009) [2023-10-12 18:20:15,432][62635] Updated weights for policy 1, policy_version 61260 (0.0009) [2023-10-12 18:20:15,695][62634] Updated weights for policy 0, policy_version 61280 (0.0009) [2023-10-12 18:20:15,798][62635] Updated weights for policy 1, policy_version 61270 (0.0008) [2023-10-12 18:20:16,178][62635] Updated weights for policy 1, policy_version 61280 (0.0008) [2023-10-12 18:20:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125501440. Throughput: 0: 1679.3, 1: 1673.2. Samples: 31387072. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 18:20:18,436][61643] Avg episode reward: [(0, '23.810'), (1, '9.760')] [2023-10-12 18:20:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000061280_62750720.pth... [2023-10-12 18:20:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000061280_62750720.pth... 
[2023-10-12 18:20:18,479][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000059712_61145088.pth [2023-10-12 18:20:18,488][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000059712_61145088.pth [2023-10-12 18:20:19,791][62634] Updated weights for policy 0, policy_version 61290 (0.0009) [2023-10-12 18:20:20,167][62634] Updated weights for policy 0, policy_version 61300 (0.0008) [2023-10-12 18:20:20,305][62635] Updated weights for policy 1, policy_version 61290 (0.0007) [2023-10-12 18:20:20,545][62634] Updated weights for policy 0, policy_version 61310 (0.0007) [2023-10-12 18:20:20,671][62635] Updated weights for policy 1, policy_version 61300 (0.0007) [2023-10-12 18:20:21,045][62635] Updated weights for policy 1, policy_version 61310 (0.0007) [2023-10-12 18:20:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125566976. Throughput: 0: 1666.4, 1: 1654.9. Samples: 31396274. Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:23,435][61643] Avg episode reward: [(0, '24.100'), (1, '9.700')] [2023-10-12 18:20:24,599][62634] Updated weights for policy 0, policy_version 61320 (0.0008) [2023-10-12 18:20:24,969][62634] Updated weights for policy 0, policy_version 61330 (0.0008) [2023-10-12 18:20:25,138][62635] Updated weights for policy 1, policy_version 61320 (0.0007) [2023-10-12 18:20:25,336][62634] Updated weights for policy 0, policy_version 61340 (0.0007) [2023-10-12 18:20:25,515][62635] Updated weights for policy 1, policy_version 61330 (0.0007) [2023-10-12 18:20:25,880][62635] Updated weights for policy 1, policy_version 61340 (0.0008) [2023-10-12 18:20:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125632512. Throughput: 0: 1676.6, 1: 1674.1. Samples: 31416698. 
Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:28,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.600')] [2023-10-12 18:20:29,444][62634] Updated weights for policy 0, policy_version 61350 (0.0009) [2023-10-12 18:20:29,828][62634] Updated weights for policy 0, policy_version 61360 (0.0008) [2023-10-12 18:20:30,070][62635] Updated weights for policy 1, policy_version 61350 (0.0010) [2023-10-12 18:20:30,198][62634] Updated weights for policy 0, policy_version 61370 (0.0008) [2023-10-12 18:20:30,458][62635] Updated weights for policy 1, policy_version 61360 (0.0008) [2023-10-12 18:20:30,828][62635] Updated weights for policy 1, policy_version 61370 (0.0009) [2023-10-12 18:20:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125698048. Throughput: 0: 1672.4, 1: 1675.6. Samples: 31437212. Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:33,436][61643] Avg episode reward: [(0, '24.410'), (1, '9.770')] [2023-10-12 18:20:34,490][62634] Updated weights for policy 0, policy_version 61380 (0.0009) [2023-10-12 18:20:34,788][62635] Updated weights for policy 1, policy_version 61380 (0.0008) [2023-10-12 18:20:34,863][62634] Updated weights for policy 0, policy_version 61390 (0.0008) [2023-10-12 18:20:35,156][62635] Updated weights for policy 1, policy_version 61390 (0.0007) [2023-10-12 18:20:35,236][62634] Updated weights for policy 0, policy_version 61400 (0.0008) [2023-10-12 18:20:35,530][62635] Updated weights for policy 1, policy_version 61400 (0.0008) [2023-10-12 18:20:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125763584. Throughput: 0: 1664.0, 1: 1664.5. Samples: 31446204. 
Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:38,436][61643] Avg episode reward: [(0, '24.130'), (1, '9.710')] [2023-10-12 18:20:39,466][62634] Updated weights for policy 0, policy_version 61410 (0.0009) [2023-10-12 18:20:39,705][62635] Updated weights for policy 1, policy_version 61410 (0.0007) [2023-10-12 18:20:39,852][62634] Updated weights for policy 0, policy_version 61420 (0.0009) [2023-10-12 18:20:40,063][62635] Updated weights for policy 1, policy_version 61420 (0.0008) [2023-10-12 18:20:40,231][62634] Updated weights for policy 0, policy_version 61430 (0.0009) [2023-10-12 18:20:40,435][62635] Updated weights for policy 1, policy_version 61430 (0.0009) [2023-10-12 18:20:40,599][62634] Updated weights for policy 0, policy_version 61440 (0.0008) [2023-10-12 18:20:40,805][62635] Updated weights for policy 1, policy_version 61440 (0.0007) [2023-10-12 18:20:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 125829120. Throughput: 0: 1664.5, 1: 1674.9. Samples: 31466702. Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:43,435][61643] Avg episode reward: [(0, '24.320'), (1, '9.650')] [2023-10-12 18:20:44,598][62634] Updated weights for policy 0, policy_version 61450 (0.0011) [2023-10-12 18:20:44,918][62635] Updated weights for policy 1, policy_version 61450 (0.0009) [2023-10-12 18:20:44,963][62634] Updated weights for policy 0, policy_version 61460 (0.0008) [2023-10-12 18:20:45,286][62635] Updated weights for policy 1, policy_version 61460 (0.0008) [2023-10-12 18:20:45,338][62634] Updated weights for policy 0, policy_version 61470 (0.0007) [2023-10-12 18:20:45,649][62635] Updated weights for policy 1, policy_version 61470 (0.0010) [2023-10-12 18:20:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125894656. Throughput: 0: 1662.3, 1: 1681.1. Samples: 31487204. 
Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:48,435][61643] Avg episode reward: [(0, '24.360'), (1, '9.650')] [2023-10-12 18:20:49,528][62634] Updated weights for policy 0, policy_version 61480 (0.0008) [2023-10-12 18:20:49,767][62635] Updated weights for policy 1, policy_version 61480 (0.0007) [2023-10-12 18:20:49,900][62634] Updated weights for policy 0, policy_version 61490 (0.0009) [2023-10-12 18:20:50,130][62635] Updated weights for policy 1, policy_version 61490 (0.0007) [2023-10-12 18:20:50,281][62634] Updated weights for policy 0, policy_version 61500 (0.0007) [2023-10-12 18:20:50,492][62635] Updated weights for policy 1, policy_version 61500 (0.0007) [2023-10-12 18:20:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 125960192. Throughput: 0: 1656.2, 1: 1671.6. Samples: 31496062. Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:53,436][61643] Avg episode reward: [(0, '24.560'), (1, '9.750')] [2023-10-12 18:20:54,382][62635] Updated weights for policy 1, policy_version 61510 (0.0007) [2023-10-12 18:20:54,467][62634] Updated weights for policy 0, policy_version 61510 (0.0008) [2023-10-12 18:20:54,751][62635] Updated weights for policy 1, policy_version 61520 (0.0008) [2023-10-12 18:20:54,844][62634] Updated weights for policy 0, policy_version 61520 (0.0008) [2023-10-12 18:20:55,112][62635] Updated weights for policy 1, policy_version 61530 (0.0009) [2023-10-12 18:20:55,221][62634] Updated weights for policy 0, policy_version 61530 (0.0007) [2023-10-12 18:20:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126025728. Throughput: 0: 1658.7, 1: 1682.7. Samples: 31516808. 
Policy #0 lag: (min: 5.0, avg: 5.7, max: 22.0) [2023-10-12 18:20:58,436][61643] Avg episode reward: [(0, '24.380'), (1, '9.900')] [2023-10-12 18:20:59,196][62635] Updated weights for policy 1, policy_version 61540 (0.0007) [2023-10-12 18:20:59,229][62634] Updated weights for policy 0, policy_version 61540 (0.0008) [2023-10-12 18:20:59,565][62635] Updated weights for policy 1, policy_version 61550 (0.0008) [2023-10-12 18:20:59,606][62634] Updated weights for policy 0, policy_version 61550 (0.0009) [2023-10-12 18:20:59,936][62635] Updated weights for policy 1, policy_version 61560 (0.0010) [2023-10-12 18:20:59,979][62634] Updated weights for policy 0, policy_version 61560 (0.0009) [2023-10-12 18:21:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126091264. Throughput: 0: 1663.1, 1: 1679.3. Samples: 31537478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:03,435][61643] Avg episode reward: [(0, '24.430'), (1, '9.980')] [2023-10-12 18:21:04,045][62634] Updated weights for policy 0, policy_version 61570 (0.0008) [2023-10-12 18:21:04,125][62635] Updated weights for policy 1, policy_version 61570 (0.0007) [2023-10-12 18:21:04,449][62634] Updated weights for policy 0, policy_version 61580 (0.0007) [2023-10-12 18:21:04,498][62635] Updated weights for policy 1, policy_version 61580 (0.0008) [2023-10-12 18:21:04,818][62634] Updated weights for policy 0, policy_version 61590 (0.0008) [2023-10-12 18:21:04,857][62635] Updated weights for policy 1, policy_version 61590 (0.0008) [2023-10-12 18:21:05,191][62634] Updated weights for policy 0, policy_version 61600 (0.0008) [2023-10-12 18:21:05,232][62635] Updated weights for policy 1, policy_version 61600 (0.0009) [2023-10-12 18:21:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126156800. Throughput: 0: 1660.7, 1: 1677.9. Samples: 31546510. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:08,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.920')] [2023-10-12 18:21:09,156][62634] Updated weights for policy 0, policy_version 61610 (0.0008) [2023-10-12 18:21:09,325][62635] Updated weights for policy 1, policy_version 61610 (0.0007) [2023-10-12 18:21:09,540][62634] Updated weights for policy 0, policy_version 61620 (0.0009) [2023-10-12 18:21:09,685][62635] Updated weights for policy 1, policy_version 61620 (0.0009) [2023-10-12 18:21:09,914][62634] Updated weights for policy 0, policy_version 61630 (0.0009) [2023-10-12 18:21:10,048][62635] Updated weights for policy 1, policy_version 61630 (0.0008) [2023-10-12 18:21:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126222336. Throughput: 0: 1661.0, 1: 1680.4. Samples: 31567064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:13,435][61643] Avg episode reward: [(0, '24.260'), (1, '10.020')] [2023-10-12 18:21:14,015][62634] Updated weights for policy 0, policy_version 61640 (0.0008) [2023-10-12 18:21:14,106][62635] Updated weights for policy 1, policy_version 61640 (0.0009) [2023-10-12 18:21:14,393][62634] Updated weights for policy 0, policy_version 61650 (0.0010) [2023-10-12 18:21:14,470][62635] Updated weights for policy 1, policy_version 61650 (0.0007) [2023-10-12 18:21:14,764][62634] Updated weights for policy 0, policy_version 61660 (0.0008) [2023-10-12 18:21:14,839][62635] Updated weights for policy 1, policy_version 61660 (0.0007) [2023-10-12 18:21:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 126287872. Throughput: 0: 1662.6, 1: 1677.4. Samples: 31587512. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:18,436][61643] Avg episode reward: [(0, '23.660'), (1, '9.940')] [2023-10-12 18:21:19,010][62634] Updated weights for policy 0, policy_version 61670 (0.0008) [2023-10-12 18:21:19,118][62635] Updated weights for policy 1, policy_version 61670 (0.0010) [2023-10-12 18:21:19,379][62634] Updated weights for policy 0, policy_version 61680 (0.0008) [2023-10-12 18:21:19,499][62635] Updated weights for policy 1, policy_version 61680 (0.0008) [2023-10-12 18:21:19,752][62634] Updated weights for policy 0, policy_version 61690 (0.0008) [2023-10-12 18:21:19,861][62635] Updated weights for policy 1, policy_version 61690 (0.0008) [2023-10-12 18:21:23,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 126353408. Throughput: 0: 1662.9, 1: 1673.0. Samples: 31596322. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:23,436][61643] Avg episode reward: [(0, '23.920'), (1, '9.850')] [2023-10-12 18:21:23,793][62635] Updated weights for policy 1, policy_version 61700 (0.0008) [2023-10-12 18:21:23,928][62634] Updated weights for policy 0, policy_version 61700 (0.0008) [2023-10-12 18:21:24,158][62635] Updated weights for policy 1, policy_version 61710 (0.0008) [2023-10-12 18:21:24,304][62634] Updated weights for policy 0, policy_version 61710 (0.0009) [2023-10-12 18:21:24,524][62635] Updated weights for policy 1, policy_version 61720 (0.0007) [2023-10-12 18:21:24,689][62634] Updated weights for policy 0, policy_version 61720 (0.0009) [2023-10-12 18:21:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126418944. Throughput: 0: 1664.0, 1: 1675.9. Samples: 31616994. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:28,435][61643] Avg episode reward: [(0, '23.820'), (1, '9.940')] [2023-10-12 18:21:28,615][62634] Updated weights for policy 0, policy_version 61730 (0.0010) [2023-10-12 18:21:28,655][62635] Updated weights for policy 1, policy_version 61730 (0.0008) [2023-10-12 18:21:28,987][62634] Updated weights for policy 0, policy_version 61740 (0.0009) [2023-10-12 18:21:29,021][62635] Updated weights for policy 1, policy_version 61740 (0.0010) [2023-10-12 18:21:29,372][62634] Updated weights for policy 0, policy_version 61750 (0.0008) [2023-10-12 18:21:29,391][62635] Updated weights for policy 1, policy_version 61750 (0.0008) [2023-10-12 18:21:29,738][62634] Updated weights for policy 0, policy_version 61760 (0.0008) [2023-10-12 18:21:29,761][62635] Updated weights for policy 1, policy_version 61760 (0.0008) [2023-10-12 18:21:33,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126484480. Throughput: 0: 1671.3, 1: 1675.3. Samples: 31637802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:33,435][61643] Avg episode reward: [(0, '23.790'), (1, '9.840')] [2023-10-12 18:21:33,860][62635] Updated weights for policy 1, policy_version 61770 (0.0009) [2023-10-12 18:21:33,903][62634] Updated weights for policy 0, policy_version 61770 (0.0008) [2023-10-12 18:21:34,232][62635] Updated weights for policy 1, policy_version 61780 (0.0009) [2023-10-12 18:21:34,275][62634] Updated weights for policy 0, policy_version 61780 (0.0009) [2023-10-12 18:21:34,589][62635] Updated weights for policy 1, policy_version 61790 (0.0007) [2023-10-12 18:21:34,649][62634] Updated weights for policy 0, policy_version 61790 (0.0009) [2023-10-12 18:21:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126550016. Throughput: 0: 1671.7, 1: 1677.3. Samples: 31646766. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:21:38,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.840')] [2023-10-12 18:21:38,821][62635] Updated weights for policy 1, policy_version 61800 (0.0010) [2023-10-12 18:21:38,875][62634] Updated weights for policy 0, policy_version 61800 (0.0010) [2023-10-12 18:21:39,188][62635] Updated weights for policy 1, policy_version 61810 (0.0009) [2023-10-12 18:21:39,260][62634] Updated weights for policy 0, policy_version 61810 (0.0008) [2023-10-12 18:21:39,554][62635] Updated weights for policy 1, policy_version 61820 (0.0008) [2023-10-12 18:21:39,643][62634] Updated weights for policy 0, policy_version 61820 (0.0009) [2023-10-12 18:21:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126615552. Throughput: 0: 1673.5, 1: 1671.4. Samples: 31667326. Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:21:43,435][61643] Avg episode reward: [(0, '23.430'), (1, '9.930')] [2023-10-12 18:21:43,494][62634] Updated weights for policy 0, policy_version 61830 (0.0009) [2023-10-12 18:21:43,666][62635] Updated weights for policy 1, policy_version 61830 (0.0007) [2023-10-12 18:21:43,863][62634] Updated weights for policy 0, policy_version 61840 (0.0009) [2023-10-12 18:21:44,041][62635] Updated weights for policy 1, policy_version 61840 (0.0008) [2023-10-12 18:21:44,247][62634] Updated weights for policy 0, policy_version 61850 (0.0009) [2023-10-12 18:21:44,405][62635] Updated weights for policy 1, policy_version 61850 (0.0008) [2023-10-12 18:21:48,162][62634] Updated weights for policy 0, policy_version 61860 (0.0009) [2023-10-12 18:21:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 126681088. Throughput: 0: 1671.7, 1: 1673.6. Samples: 31688016. 
Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:21:48,435][61643] Avg episode reward: [(0, '23.600'), (1, '9.930')] [2023-10-12 18:21:48,534][62634] Updated weights for policy 0, policy_version 61870 (0.0011) [2023-10-12 18:21:48,645][62635] Updated weights for policy 1, policy_version 61860 (0.0009) [2023-10-12 18:21:48,910][62634] Updated weights for policy 0, policy_version 61880 (0.0008) [2023-10-12 18:21:48,998][62635] Updated weights for policy 1, policy_version 61870 (0.0009) [2023-10-12 18:21:49,361][62635] Updated weights for policy 1, policy_version 61880 (0.0008) [2023-10-12 18:21:53,110][62634] Updated weights for policy 0, policy_version 61890 (0.0007) [2023-10-12 18:21:53,404][62635] Updated weights for policy 1, policy_version 61890 (0.0008) [2023-10-12 18:21:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126746624. Throughput: 0: 1672.0, 1: 1669.3. Samples: 31696868. Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:21:53,435][61643] Avg episode reward: [(0, '23.770'), (1, '9.930')] [2023-10-12 18:21:53,489][62634] Updated weights for policy 0, policy_version 61900 (0.0007) [2023-10-12 18:21:53,768][62635] Updated weights for policy 1, policy_version 61900 (0.0008) [2023-10-12 18:21:53,866][62634] Updated weights for policy 0, policy_version 61910 (0.0008) [2023-10-12 18:21:54,129][62635] Updated weights for policy 1, policy_version 61910 (0.0009) [2023-10-12 18:21:54,234][62634] Updated weights for policy 0, policy_version 61920 (0.0009) [2023-10-12 18:21:54,496][62635] Updated weights for policy 1, policy_version 61920 (0.0008) [2023-10-12 18:21:58,328][62634] Updated weights for policy 0, policy_version 61930 (0.0008) [2023-10-12 18:21:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 126812160. Throughput: 0: 1674.4, 1: 1667.4. Samples: 31717444. 
Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:21:58,436][61643] Avg episode reward: [(0, '23.740'), (1, '9.880')] [2023-10-12 18:21:58,634][62635] Updated weights for policy 1, policy_version 61930 (0.0009) [2023-10-12 18:21:58,705][62634] Updated weights for policy 0, policy_version 61940 (0.0008) [2023-10-12 18:21:59,001][62635] Updated weights for policy 1, policy_version 61940 (0.0009) [2023-10-12 18:21:59,087][62634] Updated weights for policy 0, policy_version 61950 (0.0009) [2023-10-12 18:21:59,375][62635] Updated weights for policy 1, policy_version 61950 (0.0009) [2023-10-12 18:22:02,918][62634] Updated weights for policy 0, policy_version 61960 (0.0009) [2023-10-12 18:22:03,306][62634] Updated weights for policy 0, policy_version 61970 (0.0009) [2023-10-12 18:22:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126877696. Throughput: 0: 1673.3, 1: 1667.7. Samples: 31737856. Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:22:03,436][61643] Avg episode reward: [(0, '23.820'), (1, '9.820')] [2023-10-12 18:22:03,563][62635] Updated weights for policy 1, policy_version 61960 (0.0007) [2023-10-12 18:22:03,678][62634] Updated weights for policy 0, policy_version 61980 (0.0009) [2023-10-12 18:22:03,932][62635] Updated weights for policy 1, policy_version 61970 (0.0007) [2023-10-12 18:22:04,300][62635] Updated weights for policy 1, policy_version 61980 (0.0007) [2023-10-12 18:22:07,658][62634] Updated weights for policy 0, policy_version 61990 (0.0010) [2023-10-12 18:22:08,030][62634] Updated weights for policy 0, policy_version 62000 (0.0008) [2023-10-12 18:22:08,349][62635] Updated weights for policy 1, policy_version 61990 (0.0008) [2023-10-12 18:22:08,410][62634] Updated weights for policy 0, policy_version 62010 (0.0009) [2023-10-12 18:22:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 126943232. Throughput: 0: 1684.8, 1: 1671.5. 
Samples: 31747352. Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:22:08,435][61643] Avg episode reward: [(0, '23.760'), (1, '9.820')] [2023-10-12 18:22:08,716][62635] Updated weights for policy 1, policy_version 62000 (0.0009) [2023-10-12 18:22:09,080][62635] Updated weights for policy 1, policy_version 62010 (0.0009) [2023-10-12 18:22:12,646][62634] Updated weights for policy 0, policy_version 62020 (0.0010) [2023-10-12 18:22:13,029][62634] Updated weights for policy 0, policy_version 62030 (0.0010) [2023-10-12 18:22:13,036][62635] Updated weights for policy 1, policy_version 62020 (0.0008) [2023-10-12 18:22:13,406][62634] Updated weights for policy 0, policy_version 62040 (0.0009) [2023-10-12 18:22:13,407][62635] Updated weights for policy 1, policy_version 62030 (0.0009) [2023-10-12 18:22:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 127008768. Throughput: 0: 1688.4, 1: 1672.3. Samples: 31768224. Policy #0 lag: (min: 11.0, avg: 20.7, max: 43.0) [2023-10-12 18:22:13,435][61643] Avg episode reward: [(0, '23.900'), (1, '9.950')] [2023-10-12 18:22:13,768][62635] Updated weights for policy 1, policy_version 62040 (0.0010) [2023-10-12 18:22:17,265][62634] Updated weights for policy 0, policy_version 62050 (0.0008) [2023-10-12 18:22:17,640][62634] Updated weights for policy 0, policy_version 62060 (0.0007) [2023-10-12 18:22:17,684][62635] Updated weights for policy 1, policy_version 62050 (0.0010) [2023-10-12 18:22:18,013][62634] Updated weights for policy 0, policy_version 62070 (0.0007) [2023-10-12 18:22:18,047][62635] Updated weights for policy 1, policy_version 62060 (0.0009) [2023-10-12 18:22:18,390][62634] Updated weights for policy 0, policy_version 62080 (0.0009) [2023-10-12 18:22:18,407][62635] Updated weights for policy 1, policy_version 62070 (0.0008) [2023-10-12 18:22:18,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 127107072. 
Throughput: 0: 1671.3, 1: 1665.4. Samples: 31787954. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:18,436][61643] Avg episode reward: [(0, '23.790'), (1, '9.850')] [2023-10-12 18:22:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000062080_63569920.pth... [2023-10-12 18:22:18,476][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000060512_61964288.pth [2023-10-12 18:22:18,770][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000062080_63569920.pth... [2023-10-12 18:22:18,776][62635] Updated weights for policy 1, policy_version 62080 (0.0007) [2023-10-12 18:22:18,812][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000060512_61964288.pth [2023-10-12 18:22:22,421][62634] Updated weights for policy 0, policy_version 62090 (0.0007) [2023-10-12 18:22:22,779][62635] Updated weights for policy 1, policy_version 62090 (0.0007) [2023-10-12 18:22:22,808][62634] Updated weights for policy 0, policy_version 62100 (0.0008) [2023-10-12 18:22:23,154][62635] Updated weights for policy 1, policy_version 62100 (0.0008) [2023-10-12 18:22:23,182][62634] Updated weights for policy 0, policy_version 62110 (0.0009) [2023-10-12 18:22:23,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 127172608. Throughput: 0: 1692.8, 1: 1677.4. Samples: 31798422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:23,435][61643] Avg episode reward: [(0, '23.850'), (1, '9.930')] [2023-10-12 18:22:23,509][62635] Updated weights for policy 1, policy_version 62110 (0.0009) [2023-10-12 18:22:27,298][62634] Updated weights for policy 0, policy_version 62120 (0.0009) [2023-10-12 18:22:27,670][62634] Updated weights for policy 0, policy_version 62130 (0.0008) [2023-10-12 18:22:27,677][62635] Updated weights for policy 1, policy_version 62120 (0.0007) [2023-10-12 18:22:28,047][62635] Updated weights for policy 1, policy_version 62130 (0.0007) [2023-10-12 18:22:28,059][62634] Updated weights for policy 0, policy_version 62140 (0.0009) [2023-10-12 18:22:28,413][62635] Updated weights for policy 1, policy_version 62140 (0.0009) [2023-10-12 18:22:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127238144. Throughput: 0: 1693.3, 1: 1675.6. Samples: 31818926. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:28,435][61643] Avg episode reward: [(0, '23.830'), (1, '9.790')] [2023-10-12 18:22:32,245][62634] Updated weights for policy 0, policy_version 62150 (0.0008) [2023-10-12 18:22:32,390][62635] Updated weights for policy 1, policy_version 62150 (0.0009) [2023-10-12 18:22:32,612][62634] Updated weights for policy 0, policy_version 62160 (0.0008) [2023-10-12 18:22:32,748][62635] Updated weights for policy 1, policy_version 62160 (0.0009) [2023-10-12 18:22:32,999][62634] Updated weights for policy 0, policy_version 62170 (0.0007) [2023-10-12 18:22:33,130][62635] Updated weights for policy 1, policy_version 62170 (0.0010) [2023-10-12 18:22:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 127336448. Throughput: 0: 1665.2, 1: 1658.5. Samples: 31837586. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:33,435][61643] Avg episode reward: [(0, '24.450'), (1, '9.740')] [2023-10-12 18:22:37,125][62635] Updated weights for policy 1, policy_version 62180 (0.0009) [2023-10-12 18:22:37,167][62634] Updated weights for policy 0, policy_version 62180 (0.0008) [2023-10-12 18:22:37,487][62635] Updated weights for policy 1, policy_version 62190 (0.0007) [2023-10-12 18:22:37,547][62634] Updated weights for policy 0, policy_version 62190 (0.0009) [2023-10-12 18:22:37,852][62635] Updated weights for policy 1, policy_version 62200 (0.0007) [2023-10-12 18:22:37,916][62634] Updated weights for policy 0, policy_version 62200 (0.0009) [2023-10-12 18:22:38,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 127401984. Throughput: 0: 1687.2, 1: 1680.8. Samples: 31848430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:38,435][61643] Avg episode reward: [(0, '24.570'), (1, '9.850')] [2023-10-12 18:22:41,952][62635] Updated weights for policy 1, policy_version 62210 (0.0007) [2023-10-12 18:22:41,986][62634] Updated weights for policy 0, policy_version 62210 (0.0008) [2023-10-12 18:22:42,326][62635] Updated weights for policy 1, policy_version 62220 (0.0007) [2023-10-12 18:22:42,396][62634] Updated weights for policy 0, policy_version 62220 (0.0008) [2023-10-12 18:22:42,686][62635] Updated weights for policy 1, policy_version 62230 (0.0008) [2023-10-12 18:22:42,778][62634] Updated weights for policy 0, policy_version 62230 (0.0007) [2023-10-12 18:22:43,053][62635] Updated weights for policy 1, policy_version 62240 (0.0010) [2023-10-12 18:22:43,152][62634] Updated weights for policy 0, policy_version 62240 (0.0007) [2023-10-12 18:22:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 127467520. Throughput: 0: 1681.5, 1: 1681.0. Samples: 31868754. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:43,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.680')] [2023-10-12 18:22:47,200][62635] Updated weights for policy 1, policy_version 62250 (0.0010) [2023-10-12 18:22:47,213][62634] Updated weights for policy 0, policy_version 62250 (0.0007) [2023-10-12 18:22:47,568][62635] Updated weights for policy 1, policy_version 62260 (0.0007) [2023-10-12 18:22:47,597][62634] Updated weights for policy 0, policy_version 62260 (0.0007) [2023-10-12 18:22:47,928][62635] Updated weights for policy 1, policy_version 62270 (0.0007) [2023-10-12 18:22:47,969][62634] Updated weights for policy 0, policy_version 62270 (0.0008) [2023-10-12 18:22:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 127533056. Throughput: 0: 1659.2, 1: 1659.2. Samples: 31887186. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:48,435][61643] Avg episode reward: [(0, '24.730'), (1, '9.890')] [2023-10-12 18:22:51,872][62634] Updated weights for policy 0, policy_version 62280 (0.0007) [2023-10-12 18:22:52,017][62635] Updated weights for policy 1, policy_version 62280 (0.0007) [2023-10-12 18:22:52,248][62634] Updated weights for policy 0, policy_version 62290 (0.0008) [2023-10-12 18:22:52,380][62635] Updated weights for policy 1, policy_version 62290 (0.0010) [2023-10-12 18:22:52,617][62634] Updated weights for policy 0, policy_version 62300 (0.0008) [2023-10-12 18:22:52,747][62635] Updated weights for policy 1, policy_version 62300 (0.0008) [2023-10-12 18:22:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 127598592. Throughput: 0: 1675.9, 1: 1686.0. Samples: 31898636. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:53,436][61643] Avg episode reward: [(0, '24.810'), (1, '9.610')] [2023-10-12 18:22:56,688][62634] Updated weights for policy 0, policy_version 62310 (0.0009) [2023-10-12 18:22:56,917][62635] Updated weights for policy 1, policy_version 62310 (0.0009) [2023-10-12 18:22:57,062][62634] Updated weights for policy 0, policy_version 62320 (0.0009) [2023-10-12 18:22:57,292][62635] Updated weights for policy 1, policy_version 62320 (0.0008) [2023-10-12 18:22:57,437][62634] Updated weights for policy 0, policy_version 62330 (0.0009) [2023-10-12 18:22:57,664][62635] Updated weights for policy 1, policy_version 62330 (0.0009) [2023-10-12 18:22:58,435][61643] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 127664128. Throughput: 0: 1665.5, 1: 1675.9. Samples: 31918590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:22:58,436][61643] Avg episode reward: [(0, '24.780'), (1, '9.680')] [2023-10-12 18:23:01,580][62634] Updated weights for policy 0, policy_version 62340 (0.0008) [2023-10-12 18:23:01,777][62635] Updated weights for policy 1, policy_version 62340 (0.0008) [2023-10-12 18:23:01,946][62634] Updated weights for policy 0, policy_version 62350 (0.0007) [2023-10-12 18:23:02,147][62635] Updated weights for policy 1, policy_version 62350 (0.0009) [2023-10-12 18:23:02,321][62634] Updated weights for policy 0, policy_version 62360 (0.0007) [2023-10-12 18:23:02,512][62635] Updated weights for policy 1, policy_version 62360 (0.0008) [2023-10-12 18:23:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 127729664. Throughput: 0: 1659.5, 1: 1660.9. Samples: 31937372. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:03,436][61643] Avg episode reward: [(0, '24.980'), (1, '9.810')] [2023-10-12 18:23:06,542][62634] Updated weights for policy 0, policy_version 62370 (0.0010) [2023-10-12 18:23:06,655][62635] Updated weights for policy 1, policy_version 62370 (0.0008) [2023-10-12 18:23:06,907][62634] Updated weights for policy 0, policy_version 62380 (0.0008) [2023-10-12 18:23:07,017][62635] Updated weights for policy 1, policy_version 62380 (0.0008) [2023-10-12 18:23:07,281][62634] Updated weights for policy 0, policy_version 62390 (0.0008) [2023-10-12 18:23:07,384][62635] Updated weights for policy 1, policy_version 62390 (0.0008) [2023-10-12 18:23:07,660][62634] Updated weights for policy 0, policy_version 62400 (0.0009) [2023-10-12 18:23:07,738][62635] Updated weights for policy 1, policy_version 62400 (0.0010) [2023-10-12 18:23:08,435][61643] Fps is (10 sec: 13107.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 127795200. Throughput: 0: 1669.4, 1: 1677.4. Samples: 31949026. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:08,436][61643] Avg episode reward: [(0, '25.160'), (1, '9.790')] [2023-10-12 18:23:11,586][62634] Updated weights for policy 0, policy_version 62410 (0.0009) [2023-10-12 18:23:11,834][62635] Updated weights for policy 1, policy_version 62410 (0.0009) [2023-10-12 18:23:11,953][62634] Updated weights for policy 0, policy_version 62420 (0.0008) [2023-10-12 18:23:12,199][62635] Updated weights for policy 1, policy_version 62420 (0.0007) [2023-10-12 18:23:12,326][62634] Updated weights for policy 0, policy_version 62430 (0.0007) [2023-10-12 18:23:12,561][62635] Updated weights for policy 1, policy_version 62430 (0.0007) [2023-10-12 18:23:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 127860736. Throughput: 0: 1657.7, 1: 1667.0. Samples: 31968536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:13,435][61643] Avg episode reward: [(0, '25.390'), (1, '9.670')] [2023-10-12 18:23:16,575][62634] Updated weights for policy 0, policy_version 62440 (0.0009) [2023-10-12 18:23:16,642][62635] Updated weights for policy 1, policy_version 62440 (0.0009) [2023-10-12 18:23:16,947][62634] Updated weights for policy 0, policy_version 62450 (0.0009) [2023-10-12 18:23:17,003][62635] Updated weights for policy 1, policy_version 62450 (0.0009) [2023-10-12 18:23:17,326][62634] Updated weights for policy 0, policy_version 62460 (0.0007) [2023-10-12 18:23:17,366][62635] Updated weights for policy 1, policy_version 62460 (0.0008) [2023-10-12 18:23:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127926272. Throughput: 0: 1666.9, 1: 1669.7. Samples: 31987734. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:18,436][61643] Avg episode reward: [(0, '25.350'), (1, '9.870')] [2023-10-12 18:23:21,442][62634] Updated weights for policy 0, policy_version 62470 (0.0007) [2023-10-12 18:23:21,585][62635] Updated weights for policy 1, policy_version 62470 (0.0008) [2023-10-12 18:23:21,814][62634] Updated weights for policy 0, policy_version 62480 (0.0008) [2023-10-12 18:23:21,941][62635] Updated weights for policy 1, policy_version 62480 (0.0008) [2023-10-12 18:23:22,196][62634] Updated weights for policy 0, policy_version 62490 (0.0007) [2023-10-12 18:23:22,311][62635] Updated weights for policy 1, policy_version 62490 (0.0008) [2023-10-12 18:23:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127991808. Throughput: 0: 1676.2, 1: 1674.4. Samples: 31999206. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:23,435][61643] Avg episode reward: [(0, '25.520'), (1, '9.880')] [2023-10-12 18:23:26,133][62634] Updated weights for policy 0, policy_version 62500 (0.0008) [2023-10-12 18:23:26,505][62635] Updated weights for policy 1, policy_version 62500 (0.0009) [2023-10-12 18:23:26,516][62634] Updated weights for policy 0, policy_version 62510 (0.0008) [2023-10-12 18:23:26,872][62635] Updated weights for policy 1, policy_version 62510 (0.0008) [2023-10-12 18:23:26,885][62634] Updated weights for policy 0, policy_version 62520 (0.0009) [2023-10-12 18:23:27,236][62635] Updated weights for policy 1, policy_version 62520 (0.0008) [2023-10-12 18:23:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 128057344. Throughput: 0: 1661.0, 1: 1659.2. Samples: 32018164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:28,436][61643] Avg episode reward: [(0, '25.150'), (1, '9.810')] [2023-10-12 18:23:31,063][62634] Updated weights for policy 0, policy_version 62530 (0.0009) [2023-10-12 18:23:31,249][62635] Updated weights for policy 1, policy_version 62530 (0.0009) [2023-10-12 18:23:31,463][62634] Updated weights for policy 0, policy_version 62540 (0.0007) [2023-10-12 18:23:31,620][62635] Updated weights for policy 1, policy_version 62540 (0.0009) [2023-10-12 18:23:31,832][62634] Updated weights for policy 0, policy_version 62550 (0.0008) [2023-10-12 18:23:31,974][62635] Updated weights for policy 1, policy_version 62550 (0.0009) [2023-10-12 18:23:32,207][62634] Updated weights for policy 0, policy_version 62560 (0.0008) [2023-10-12 18:23:32,348][62635] Updated weights for policy 1, policy_version 62560 (0.0007) [2023-10-12 18:23:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128122880. Throughput: 0: 1676.5, 1: 1669.7. Samples: 32037768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:33,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.900')] [2023-10-12 18:23:36,188][62634] Updated weights for policy 0, policy_version 62570 (0.0008) [2023-10-12 18:23:36,432][62635] Updated weights for policy 1, policy_version 62570 (0.0007) [2023-10-12 18:23:36,563][62634] Updated weights for policy 0, policy_version 62580 (0.0009) [2023-10-12 18:23:36,801][62635] Updated weights for policy 1, policy_version 62580 (0.0007) [2023-10-12 18:23:36,937][62634] Updated weights for policy 0, policy_version 62590 (0.0009) [2023-10-12 18:23:37,169][62635] Updated weights for policy 1, policy_version 62590 (0.0007) [2023-10-12 18:23:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128188416. Throughput: 0: 1678.5, 1: 1668.5. Samples: 32049252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:38,436][61643] Avg episode reward: [(0, '24.870'), (1, '9.890')] [2023-10-12 18:23:41,037][62634] Updated weights for policy 0, policy_version 62600 (0.0008) [2023-10-12 18:23:41,156][62635] Updated weights for policy 1, policy_version 62600 (0.0007) [2023-10-12 18:23:41,421][62634] Updated weights for policy 0, policy_version 62610 (0.0007) [2023-10-12 18:23:41,528][62635] Updated weights for policy 1, policy_version 62610 (0.0007) [2023-10-12 18:23:41,801][62634] Updated weights for policy 0, policy_version 62620 (0.0007) [2023-10-12 18:23:41,888][62635] Updated weights for policy 1, policy_version 62620 (0.0008) [2023-10-12 18:23:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 128253952. Throughput: 0: 1659.3, 1: 1649.5. Samples: 32067486. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:23:43,436][61643] Avg episode reward: [(0, '24.840'), (1, '9.780')] [2023-10-12 18:23:45,824][62634] Updated weights for policy 0, policy_version 62630 (0.0007) [2023-10-12 18:23:46,019][62635] Updated weights for policy 1, policy_version 62630 (0.0009) [2023-10-12 18:23:46,205][62634] Updated weights for policy 0, policy_version 62640 (0.0007) [2023-10-12 18:23:46,393][62635] Updated weights for policy 1, policy_version 62640 (0.0010) [2023-10-12 18:23:46,589][62634] Updated weights for policy 0, policy_version 62650 (0.0008) [2023-10-12 18:23:46,764][62635] Updated weights for policy 1, policy_version 62650 (0.0009) [2023-10-12 18:23:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128319488. Throughput: 0: 1677.1, 1: 1669.3. Samples: 32087960. Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:23:48,435][61643] Avg episode reward: [(0, '24.880'), (1, '10.100')] [2023-10-12 18:23:50,627][62634] Updated weights for policy 0, policy_version 62660 (0.0008) [2023-10-12 18:23:50,957][62635] Updated weights for policy 1, policy_version 62660 (0.0008) [2023-10-12 18:23:51,015][62634] Updated weights for policy 0, policy_version 62670 (0.0007) [2023-10-12 18:23:51,335][62635] Updated weights for policy 1, policy_version 62670 (0.0007) [2023-10-12 18:23:51,384][62634] Updated weights for policy 0, policy_version 62680 (0.0007) [2023-10-12 18:23:51,698][62635] Updated weights for policy 1, policy_version 62680 (0.0007) [2023-10-12 18:23:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 128385024. Throughput: 0: 1666.5, 1: 1658.6. Samples: 32098656. 
Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:23:53,436][61643] Avg episode reward: [(0, '24.830'), (1, '10.030')] [2023-10-12 18:23:55,482][62634] Updated weights for policy 0, policy_version 62690 (0.0008) [2023-10-12 18:23:55,820][62635] Updated weights for policy 1, policy_version 62690 (0.0007) [2023-10-12 18:23:55,849][62634] Updated weights for policy 0, policy_version 62700 (0.0007) [2023-10-12 18:23:56,189][62635] Updated weights for policy 1, policy_version 62700 (0.0007) [2023-10-12 18:23:56,218][62634] Updated weights for policy 0, policy_version 62710 (0.0009) [2023-10-12 18:23:56,555][62635] Updated weights for policy 1, policy_version 62710 (0.0008) [2023-10-12 18:23:56,591][62634] Updated weights for policy 0, policy_version 62720 (0.0009) [2023-10-12 18:23:56,915][62635] Updated weights for policy 1, policy_version 62720 (0.0008) [2023-10-12 18:23:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 128450560. Throughput: 0: 1658.8, 1: 1648.4. Samples: 32117362. Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:23:58,435][61643] Avg episode reward: [(0, '24.350'), (1, '9.940')] [2023-10-12 18:24:00,639][62634] Updated weights for policy 0, policy_version 62730 (0.0009) [2023-10-12 18:24:01,008][62635] Updated weights for policy 1, policy_version 62730 (0.0009) [2023-10-12 18:24:01,015][62634] Updated weights for policy 0, policy_version 62740 (0.0008) [2023-10-12 18:24:01,376][62635] Updated weights for policy 1, policy_version 62740 (0.0008) [2023-10-12 18:24:01,399][62634] Updated weights for policy 0, policy_version 62750 (0.0008) [2023-10-12 18:24:01,733][62635] Updated weights for policy 1, policy_version 62750 (0.0009) [2023-10-12 18:24:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128516096. Throughput: 0: 1674.1, 1: 1668.1. Samples: 32138130. 
Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:24:03,436][61643] Avg episode reward: [(0, '24.520'), (1, '9.940')] [2023-10-12 18:24:05,511][62634] Updated weights for policy 0, policy_version 62760 (0.0009) [2023-10-12 18:24:05,746][62635] Updated weights for policy 1, policy_version 62760 (0.0008) [2023-10-12 18:24:05,892][62634] Updated weights for policy 0, policy_version 62770 (0.0008) [2023-10-12 18:24:06,119][62635] Updated weights for policy 1, policy_version 62770 (0.0008) [2023-10-12 18:24:06,266][62634] Updated weights for policy 0, policy_version 62780 (0.0007) [2023-10-12 18:24:06,478][62635] Updated weights for policy 1, policy_version 62780 (0.0009) [2023-10-12 18:24:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 128581632. Throughput: 0: 1660.1, 1: 1656.7. Samples: 32148460. Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:24:08,436][61643] Avg episode reward: [(0, '24.430'), (1, '10.030')] [2023-10-12 18:24:10,294][62634] Updated weights for policy 0, policy_version 62790 (0.0008) [2023-10-12 18:24:10,623][62635] Updated weights for policy 1, policy_version 62790 (0.0009) [2023-10-12 18:24:10,677][62634] Updated weights for policy 0, policy_version 62800 (0.0008) [2023-10-12 18:24:10,999][62635] Updated weights for policy 1, policy_version 62800 (0.0007) [2023-10-12 18:24:11,052][62634] Updated weights for policy 0, policy_version 62810 (0.0007) [2023-10-12 18:24:11,368][62635] Updated weights for policy 1, policy_version 62810 (0.0009) [2023-10-12 18:24:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128647168. Throughput: 0: 1673.2, 1: 1660.7. Samples: 32168186. 
Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:24:13,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.990')] [2023-10-12 18:24:15,184][62634] Updated weights for policy 0, policy_version 62820 (0.0007) [2023-10-12 18:24:15,387][62635] Updated weights for policy 1, policy_version 62820 (0.0007) [2023-10-12 18:24:15,567][62634] Updated weights for policy 0, policy_version 62830 (0.0007) [2023-10-12 18:24:15,752][62635] Updated weights for policy 1, policy_version 62830 (0.0007) [2023-10-12 18:24:15,935][62634] Updated weights for policy 0, policy_version 62840 (0.0008) [2023-10-12 18:24:16,117][62635] Updated weights for policy 1, policy_version 62840 (0.0008) [2023-10-12 18:24:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 128712704. Throughput: 0: 1680.9, 1: 1676.0. Samples: 32188832. Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:24:18,436][61643] Avg episode reward: [(0, '24.460'), (1, '10.010')] [2023-10-12 18:24:18,451][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000062848_64356352.pth... [2023-10-12 18:24:18,451][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000062848_64356352.pth... 
[2023-10-12 18:24:18,481][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000061280_62750720.pth [2023-10-12 18:24:18,485][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000062848_64356352.pth [2023-10-12 18:24:18,487][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000061280_62750720.pth [2023-10-12 18:24:18,491][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000062848_64356352.pth [2023-10-12 18:24:19,997][62634] Updated weights for policy 0, policy_version 62850 (0.0008) [2023-10-12 18:24:20,169][62635] Updated weights for policy 1, policy_version 62850 (0.0008) [2023-10-12 18:24:20,381][62634] Updated weights for policy 0, policy_version 62860 (0.0007) [2023-10-12 18:24:20,531][62635] Updated weights for policy 1, policy_version 62860 (0.0008) [2023-10-12 18:24:20,760][62634] Updated weights for policy 0, policy_version 62870 (0.0008) [2023-10-12 18:24:20,903][62635] Updated weights for policy 1, policy_version 62870 (0.0008) [2023-10-12 18:24:21,143][62634] Updated weights for policy 0, policy_version 62880 (0.0009) [2023-10-12 18:24:21,273][62635] Updated weights for policy 1, policy_version 62880 (0.0007) [2023-10-12 18:24:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128778240. Throughput: 0: 1657.3, 1: 1659.3. Samples: 32198500. 
Policy #0 lag: (min: 6.0, avg: 12.1, max: 38.0) [2023-10-12 18:24:23,435][61643] Avg episode reward: [(0, '24.340'), (1, '10.010')] [2023-10-12 18:24:25,174][62634] Updated weights for policy 0, policy_version 62890 (0.0008) [2023-10-12 18:24:25,294][62635] Updated weights for policy 1, policy_version 62890 (0.0009) [2023-10-12 18:24:25,549][62634] Updated weights for policy 0, policy_version 62900 (0.0008) [2023-10-12 18:24:25,665][62635] Updated weights for policy 1, policy_version 62900 (0.0009) [2023-10-12 18:24:25,926][62634] Updated weights for policy 0, policy_version 62910 (0.0008) [2023-10-12 18:24:26,018][62635] Updated weights for policy 1, policy_version 62910 (0.0008) [2023-10-12 18:24:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128843776. Throughput: 0: 1675.6, 1: 1680.5. Samples: 32218512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:28,436][61643] Avg episode reward: [(0, '24.300'), (1, '9.920')] [2023-10-12 18:24:29,910][62634] Updated weights for policy 0, policy_version 62920 (0.0009) [2023-10-12 18:24:30,052][62635] Updated weights for policy 1, policy_version 62920 (0.0008) [2023-10-12 18:24:30,289][62634] Updated weights for policy 0, policy_version 62930 (0.0009) [2023-10-12 18:24:30,431][62635] Updated weights for policy 1, policy_version 62930 (0.0007) [2023-10-12 18:24:30,664][62634] Updated weights for policy 0, policy_version 62940 (0.0007) [2023-10-12 18:24:30,797][62635] Updated weights for policy 1, policy_version 62940 (0.0008) [2023-10-12 18:24:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 128909312. Throughput: 0: 1677.4, 1: 1683.1. Samples: 32239184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:33,435][61643] Avg episode reward: [(0, '24.400'), (1, '10.010')] [2023-10-12 18:24:34,745][62634] Updated weights for policy 0, policy_version 62950 (0.0009) [2023-10-12 18:24:35,042][62635] Updated weights for policy 1, policy_version 62950 (0.0008) [2023-10-12 18:24:35,113][62634] Updated weights for policy 0, policy_version 62960 (0.0007) [2023-10-12 18:24:35,416][62635] Updated weights for policy 1, policy_version 62960 (0.0007) [2023-10-12 18:24:35,485][62634] Updated weights for policy 0, policy_version 62970 (0.0007) [2023-10-12 18:24:35,785][62635] Updated weights for policy 1, policy_version 62970 (0.0008) [2023-10-12 18:24:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 128974848. Throughput: 0: 1659.0, 1: 1663.7. Samples: 32248176. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:38,435][61643] Avg episode reward: [(0, '24.400'), (1, '10.020')] [2023-10-12 18:24:39,595][62634] Updated weights for policy 0, policy_version 62980 (0.0010) [2023-10-12 18:24:39,946][62635] Updated weights for policy 1, policy_version 62980 (0.0007) [2023-10-12 18:24:39,967][62634] Updated weights for policy 0, policy_version 62990 (0.0010) [2023-10-12 18:24:40,306][62635] Updated weights for policy 1, policy_version 62990 (0.0007) [2023-10-12 18:24:40,351][62634] Updated weights for policy 0, policy_version 63000 (0.0007) [2023-10-12 18:24:40,678][62635] Updated weights for policy 1, policy_version 63000 (0.0008) [2023-10-12 18:24:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129040384. Throughput: 0: 1679.4, 1: 1686.4. Samples: 32268822. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:43,436][61643] Avg episode reward: [(0, '24.780'), (1, '10.030')] [2023-10-12 18:24:44,429][62634] Updated weights for policy 0, policy_version 63010 (0.0007) [2023-10-12 18:24:44,800][62634] Updated weights for policy 0, policy_version 63020 (0.0008) [2023-10-12 18:24:44,854][62635] Updated weights for policy 1, policy_version 63010 (0.0007) [2023-10-12 18:24:45,180][62634] Updated weights for policy 0, policy_version 63030 (0.0008) [2023-10-12 18:24:45,225][62635] Updated weights for policy 1, policy_version 63020 (0.0010) [2023-10-12 18:24:45,553][62634] Updated weights for policy 0, policy_version 63040 (0.0008) [2023-10-12 18:24:45,599][62635] Updated weights for policy 1, policy_version 63030 (0.0009) [2023-10-12 18:24:45,962][62635] Updated weights for policy 1, policy_version 63040 (0.0009) [2023-10-12 18:24:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129105920. Throughput: 0: 1678.9, 1: 1683.7. Samples: 32289446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:48,435][61643] Avg episode reward: [(0, '24.710'), (1, '9.880')] [2023-10-12 18:24:49,726][62634] Updated weights for policy 0, policy_version 63050 (0.0008) [2023-10-12 18:24:50,003][62635] Updated weights for policy 1, policy_version 63050 (0.0007) [2023-10-12 18:24:50,100][62634] Updated weights for policy 0, policy_version 63060 (0.0007) [2023-10-12 18:24:50,367][62635] Updated weights for policy 1, policy_version 63060 (0.0008) [2023-10-12 18:24:50,485][62634] Updated weights for policy 0, policy_version 63070 (0.0008) [2023-10-12 18:24:50,740][62635] Updated weights for policy 1, policy_version 63070 (0.0008) [2023-10-12 18:24:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 129171456. Throughput: 0: 1663.3, 1: 1669.4. Samples: 32298432. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:53,436][61643] Avg episode reward: [(0, '24.830'), (1, '9.880')] [2023-10-12 18:24:54,473][62634] Updated weights for policy 0, policy_version 63080 (0.0010) [2023-10-12 18:24:54,715][62635] Updated weights for policy 1, policy_version 63080 (0.0007) [2023-10-12 18:24:54,849][62634] Updated weights for policy 0, policy_version 63090 (0.0008) [2023-10-12 18:24:55,074][62635] Updated weights for policy 1, policy_version 63090 (0.0007) [2023-10-12 18:24:55,222][62634] Updated weights for policy 0, policy_version 63100 (0.0007) [2023-10-12 18:24:55,450][62635] Updated weights for policy 1, policy_version 63100 (0.0009) [2023-10-12 18:24:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129236992. Throughput: 0: 1665.7, 1: 1688.0. Samples: 32319102. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:24:58,436][61643] Avg episode reward: [(0, '24.970'), (1, '9.940')] [2023-10-12 18:24:59,409][62634] Updated weights for policy 0, policy_version 63110 (0.0008) [2023-10-12 18:24:59,547][62635] Updated weights for policy 1, policy_version 63110 (0.0010) [2023-10-12 18:24:59,781][62634] Updated weights for policy 0, policy_version 63120 (0.0007) [2023-10-12 18:24:59,918][62635] Updated weights for policy 1, policy_version 63120 (0.0007) [2023-10-12 18:25:00,157][62634] Updated weights for policy 0, policy_version 63130 (0.0009) [2023-10-12 18:25:00,288][62635] Updated weights for policy 1, policy_version 63130 (0.0009) [2023-10-12 18:25:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129302528. Throughput: 0: 1668.2, 1: 1683.0. Samples: 32339636. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:25:03,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.860')] [2023-10-12 18:25:04,257][62635] Updated weights for policy 1, policy_version 63140 (0.0007) [2023-10-12 18:25:04,269][62634] Updated weights for policy 0, policy_version 63140 (0.0008) [2023-10-12 18:25:04,626][62635] Updated weights for policy 1, policy_version 63150 (0.0007) [2023-10-12 18:25:04,650][62634] Updated weights for policy 0, policy_version 63150 (0.0007) [2023-10-12 18:25:04,992][62635] Updated weights for policy 1, policy_version 63160 (0.0008) [2023-10-12 18:25:05,027][62634] Updated weights for policy 0, policy_version 63160 (0.0009) [2023-10-12 18:25:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 129368064. Throughput: 0: 1659.7, 1: 1674.5. Samples: 32348540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:25:08,436][61643] Avg episode reward: [(0, '24.520'), (1, '9.860')] [2023-10-12 18:25:09,107][62634] Updated weights for policy 0, policy_version 63170 (0.0010) [2023-10-12 18:25:09,128][62635] Updated weights for policy 1, policy_version 63170 (0.0007) [2023-10-12 18:25:09,495][62635] Updated weights for policy 1, policy_version 63180 (0.0007) [2023-10-12 18:25:09,512][62634] Updated weights for policy 0, policy_version 63180 (0.0008) [2023-10-12 18:25:09,864][62635] Updated weights for policy 1, policy_version 63190 (0.0007) [2023-10-12 18:25:09,895][62634] Updated weights for policy 0, policy_version 63190 (0.0008) [2023-10-12 18:25:10,232][62635] Updated weights for policy 1, policy_version 63200 (0.0007) [2023-10-12 18:25:10,268][62634] Updated weights for policy 0, policy_version 63200 (0.0008) [2023-10-12 18:25:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129433600. Throughput: 0: 1672.4, 1: 1679.1. Samples: 32369328. 
Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:13,435][61643] Avg episode reward: [(0, '24.720'), (1, '10.030')] [2023-10-12 18:25:14,207][62634] Updated weights for policy 0, policy_version 63210 (0.0008) [2023-10-12 18:25:14,349][62635] Updated weights for policy 1, policy_version 63210 (0.0008) [2023-10-12 18:25:14,578][62634] Updated weights for policy 0, policy_version 63220 (0.0009) [2023-10-12 18:25:14,709][62635] Updated weights for policy 1, policy_version 63220 (0.0007) [2023-10-12 18:25:14,958][62634] Updated weights for policy 0, policy_version 63230 (0.0008) [2023-10-12 18:25:15,082][62635] Updated weights for policy 1, policy_version 63230 (0.0008) [2023-10-12 18:25:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 129499136. Throughput: 0: 1673.6, 1: 1681.7. Samples: 32390176. Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:18,436][61643] Avg episode reward: [(0, '24.950'), (1, '9.940')] [2023-10-12 18:25:18,933][62634] Updated weights for policy 0, policy_version 63240 (0.0009) [2023-10-12 18:25:19,176][62635] Updated weights for policy 1, policy_version 63240 (0.0009) [2023-10-12 18:25:19,308][62634] Updated weights for policy 0, policy_version 63250 (0.0009) [2023-10-12 18:25:19,535][62635] Updated weights for policy 1, policy_version 63250 (0.0008) [2023-10-12 18:25:19,680][62634] Updated weights for policy 0, policy_version 63260 (0.0010) [2023-10-12 18:25:19,906][62635] Updated weights for policy 1, policy_version 63260 (0.0009) [2023-10-12 18:25:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129564672. Throughput: 0: 1675.2, 1: 1681.7. Samples: 32399238. 
Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:23,435][61643] Avg episode reward: [(0, '25.050'), (1, '9.850')] [2023-10-12 18:25:23,697][62634] Updated weights for policy 0, policy_version 63270 (0.0010) [2023-10-12 18:25:23,874][62635] Updated weights for policy 1, policy_version 63270 (0.0008) [2023-10-12 18:25:24,065][62634] Updated weights for policy 0, policy_version 63280 (0.0007) [2023-10-12 18:25:24,250][62635] Updated weights for policy 1, policy_version 63280 (0.0008) [2023-10-12 18:25:24,445][62634] Updated weights for policy 0, policy_version 63290 (0.0009) [2023-10-12 18:25:24,613][62635] Updated weights for policy 1, policy_version 63290 (0.0007) [2023-10-12 18:25:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129630208. Throughput: 0: 1677.6, 1: 1682.5. Samples: 32420026. Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:28,436][61643] Avg episode reward: [(0, '25.100'), (1, '10.140')] [2023-10-12 18:25:28,603][62634] Updated weights for policy 0, policy_version 63300 (0.0008) [2023-10-12 18:25:28,650][62635] Updated weights for policy 1, policy_version 63300 (0.0007) [2023-10-12 18:25:28,974][62634] Updated weights for policy 0, policy_version 63310 (0.0008) [2023-10-12 18:25:29,011][62635] Updated weights for policy 1, policy_version 63310 (0.0009) [2023-10-12 18:25:29,339][62634] Updated weights for policy 0, policy_version 63320 (0.0007) [2023-10-12 18:25:29,376][62635] Updated weights for policy 1, policy_version 63320 (0.0008) [2023-10-12 18:25:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129695744. Throughput: 0: 1680.6, 1: 1682.1. Samples: 32440768. 
Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:33,435][61643] Avg episode reward: [(0, '24.790'), (1, '10.050')] [2023-10-12 18:25:33,459][62634] Updated weights for policy 0, policy_version 63330 (0.0008) [2023-10-12 18:25:33,500][62635] Updated weights for policy 1, policy_version 63330 (0.0010) [2023-10-12 18:25:33,835][62634] Updated weights for policy 0, policy_version 63340 (0.0007) [2023-10-12 18:25:33,860][62635] Updated weights for policy 1, policy_version 63340 (0.0008) [2023-10-12 18:25:34,217][62634] Updated weights for policy 0, policy_version 63350 (0.0007) [2023-10-12 18:25:34,231][62635] Updated weights for policy 1, policy_version 63350 (0.0010) [2023-10-12 18:25:34,589][62635] Updated weights for policy 1, policy_version 63360 (0.0009) [2023-10-12 18:25:34,596][62634] Updated weights for policy 0, policy_version 63360 (0.0008) [2023-10-12 18:25:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 129761280. Throughput: 0: 1682.2, 1: 1683.8. Samples: 32449902. Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:38,436][61643] Avg episode reward: [(0, '24.900'), (1, '9.860')] [2023-10-12 18:25:38,679][62635] Updated weights for policy 1, policy_version 63370 (0.0009) [2023-10-12 18:25:38,814][62634] Updated weights for policy 0, policy_version 63370 (0.0009) [2023-10-12 18:25:39,046][62635] Updated weights for policy 1, policy_version 63380 (0.0008) [2023-10-12 18:25:39,187][62634] Updated weights for policy 0, policy_version 63380 (0.0009) [2023-10-12 18:25:39,413][62635] Updated weights for policy 1, policy_version 63390 (0.0007) [2023-10-12 18:25:39,559][62634] Updated weights for policy 0, policy_version 63390 (0.0009) [2023-10-12 18:25:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129826816. Throughput: 0: 1683.0, 1: 1677.8. Samples: 32470338. 
Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:43,435][61643] Avg episode reward: [(0, '24.900'), (1, '9.870')] [2023-10-12 18:25:43,507][62635] Updated weights for policy 1, policy_version 63400 (0.0008) [2023-10-12 18:25:43,691][62634] Updated weights for policy 0, policy_version 63400 (0.0010) [2023-10-12 18:25:43,873][62635] Updated weights for policy 1, policy_version 63410 (0.0008) [2023-10-12 18:25:44,062][62634] Updated weights for policy 0, policy_version 63410 (0.0008) [2023-10-12 18:25:44,241][62635] Updated weights for policy 1, policy_version 63420 (0.0007) [2023-10-12 18:25:44,437][62634] Updated weights for policy 0, policy_version 63420 (0.0007) [2023-10-12 18:25:48,362][62635] Updated weights for policy 1, policy_version 63430 (0.0008) [2023-10-12 18:25:48,383][62634] Updated weights for policy 0, policy_version 63430 (0.0008) [2023-10-12 18:25:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129892352. Throughput: 0: 1677.8, 1: 1681.9. Samples: 32490822. Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:48,435][61643] Avg episode reward: [(0, '24.810'), (1, '9.960')] [2023-10-12 18:25:48,731][62635] Updated weights for policy 1, policy_version 63440 (0.0009) [2023-10-12 18:25:48,762][62634] Updated weights for policy 0, policy_version 63440 (0.0009) [2023-10-12 18:25:49,093][62635] Updated weights for policy 1, policy_version 63450 (0.0007) [2023-10-12 18:25:49,131][62634] Updated weights for policy 0, policy_version 63450 (0.0008) [2023-10-12 18:25:53,079][62634] Updated weights for policy 0, policy_version 63460 (0.0007) [2023-10-12 18:25:53,287][62635] Updated weights for policy 1, policy_version 63460 (0.0007) [2023-10-12 18:25:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 129957888. Throughput: 0: 1683.8, 1: 1678.9. Samples: 32499858. 
Policy #0 lag: (min: 12.0, avg: 21.9, max: 44.0) [2023-10-12 18:25:53,435][61643] Avg episode reward: [(0, '24.780'), (1, '9.960')] [2023-10-12 18:25:53,459][62634] Updated weights for policy 0, policy_version 63470 (0.0009) [2023-10-12 18:25:53,657][62635] Updated weights for policy 1, policy_version 63470 (0.0008) [2023-10-12 18:25:53,829][62634] Updated weights for policy 0, policy_version 63480 (0.0008) [2023-10-12 18:25:54,030][62635] Updated weights for policy 1, policy_version 63480 (0.0009) [2023-10-12 18:25:57,984][62634] Updated weights for policy 0, policy_version 63490 (0.0007) [2023-10-12 18:25:58,151][62635] Updated weights for policy 1, policy_version 63490 (0.0008) [2023-10-12 18:25:58,388][62634] Updated weights for policy 0, policy_version 63500 (0.0008) [2023-10-12 18:25:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 130023424. Throughput: 0: 1683.8, 1: 1678.1. Samples: 32520614. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:25:58,436][61643] Avg episode reward: [(0, '24.760'), (1, '9.870')] [2023-10-12 18:25:58,510][62635] Updated weights for policy 1, policy_version 63500 (0.0008) [2023-10-12 18:25:58,752][62634] Updated weights for policy 0, policy_version 63510 (0.0008) [2023-10-12 18:25:58,882][62635] Updated weights for policy 1, policy_version 63510 (0.0009) [2023-10-12 18:25:59,129][62634] Updated weights for policy 0, policy_version 63520 (0.0008) [2023-10-12 18:25:59,243][62635] Updated weights for policy 1, policy_version 63520 (0.0008) [2023-10-12 18:26:03,219][62634] Updated weights for policy 0, policy_version 63530 (0.0007) [2023-10-12 18:26:03,262][62635] Updated weights for policy 1, policy_version 63530 (0.0008) [2023-10-12 18:26:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130088960. Throughput: 0: 1675.3, 1: 1672.5. Samples: 32540826. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:03,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.880')] [2023-10-12 18:26:03,584][62634] Updated weights for policy 0, policy_version 63540 (0.0009) [2023-10-12 18:26:03,628][62635] Updated weights for policy 1, policy_version 63540 (0.0008) [2023-10-12 18:26:03,962][62634] Updated weights for policy 0, policy_version 63550 (0.0008) [2023-10-12 18:26:03,996][62635] Updated weights for policy 1, policy_version 63550 (0.0009) [2023-10-12 18:26:07,970][62634] Updated weights for policy 0, policy_version 63560 (0.0008) [2023-10-12 18:26:08,117][62635] Updated weights for policy 1, policy_version 63560 (0.0007) [2023-10-12 18:26:08,349][62634] Updated weights for policy 0, policy_version 63570 (0.0009) [2023-10-12 18:26:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130154496. Throughput: 0: 1674.4, 1: 1679.7. Samples: 32550174. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:08,435][61643] Avg episode reward: [(0, '24.500'), (1, '10.010')] [2023-10-12 18:26:08,477][62635] Updated weights for policy 1, policy_version 63570 (0.0009) [2023-10-12 18:26:08,723][62634] Updated weights for policy 0, policy_version 63580 (0.0008) [2023-10-12 18:26:08,849][62635] Updated weights for policy 1, policy_version 63580 (0.0008) [2023-10-12 18:26:12,876][62635] Updated weights for policy 1, policy_version 63590 (0.0008) [2023-10-12 18:26:12,917][62634] Updated weights for policy 0, policy_version 63590 (0.0007) [2023-10-12 18:26:13,248][62635] Updated weights for policy 1, policy_version 63600 (0.0008) [2023-10-12 18:26:13,300][62634] Updated weights for policy 0, policy_version 63600 (0.0010) [2023-10-12 18:26:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130220032. Throughput: 0: 1667.5, 1: 1677.9. Samples: 32570568. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:13,435][61643] Avg episode reward: [(0, '24.140'), (1, '9.790')] [2023-10-12 18:26:13,615][62635] Updated weights for policy 1, policy_version 63610 (0.0009) [2023-10-12 18:26:13,674][62634] Updated weights for policy 0, policy_version 63610 (0.0009) [2023-10-12 18:26:17,683][62635] Updated weights for policy 1, policy_version 63620 (0.0008) [2023-10-12 18:26:17,866][62634] Updated weights for policy 0, policy_version 63620 (0.0008) [2023-10-12 18:26:18,036][62635] Updated weights for policy 1, policy_version 63630 (0.0008) [2023-10-12 18:26:18,243][62634] Updated weights for policy 0, policy_version 63630 (0.0009) [2023-10-12 18:26:18,402][62635] Updated weights for policy 1, policy_version 63640 (0.0007) [2023-10-12 18:26:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130285568. Throughput: 0: 1658.9, 1: 1665.4. Samples: 32590364. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:18,436][61643] Avg episode reward: [(0, '24.230'), (1, '9.840')] [2023-10-12 18:26:18,605][62634] Updated weights for policy 0, policy_version 63640 (0.0008) [2023-10-12 18:26:18,692][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000063648_65175552.pth... [2023-10-12 18:26:18,731][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000062080_63569920.pth [2023-10-12 18:26:18,905][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000063648_65175552.pth... 
[2023-10-12 18:26:18,946][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000062080_63569920.pth [2023-10-12 18:26:22,304][62635] Updated weights for policy 1, policy_version 63650 (0.0008) [2023-10-12 18:26:22,632][62634] Updated weights for policy 0, policy_version 63650 (0.0008) [2023-10-12 18:26:22,675][62635] Updated weights for policy 1, policy_version 63660 (0.0007) [2023-10-12 18:26:23,005][62634] Updated weights for policy 0, policy_version 63660 (0.0011) [2023-10-12 18:26:23,050][62635] Updated weights for policy 1, policy_version 63670 (0.0007) [2023-10-12 18:26:23,375][62634] Updated weights for policy 0, policy_version 63670 (0.0009) [2023-10-12 18:26:23,408][62635] Updated weights for policy 1, policy_version 63680 (0.0008) [2023-10-12 18:26:23,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130383872. Throughput: 0: 1664.8, 1: 1679.0. Samples: 32600374. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:23,435][61643] Avg episode reward: [(0, '24.100'), (1, '9.890')] [2023-10-12 18:26:23,753][62634] Updated weights for policy 0, policy_version 63680 (0.0010) [2023-10-12 18:26:27,422][62635] Updated weights for policy 1, policy_version 63690 (0.0008) [2023-10-12 18:26:27,780][62635] Updated weights for policy 1, policy_version 63700 (0.0008) [2023-10-12 18:26:27,926][62634] Updated weights for policy 0, policy_version 63690 (0.0008) [2023-10-12 18:26:28,154][62635] Updated weights for policy 1, policy_version 63710 (0.0008) [2023-10-12 18:26:28,299][62634] Updated weights for policy 0, policy_version 63700 (0.0007) [2023-10-12 18:26:28,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130449408. Throughput: 0: 1665.3, 1: 1684.8. Samples: 32621096. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:28,436][61643] Avg episode reward: [(0, '23.940'), (1, '9.680')] [2023-10-12 18:26:28,684][62634] Updated weights for policy 0, policy_version 63710 (0.0007) [2023-10-12 18:26:32,156][62635] Updated weights for policy 1, policy_version 63720 (0.0008) [2023-10-12 18:26:32,526][62635] Updated weights for policy 1, policy_version 63730 (0.0008) [2023-10-12 18:26:32,584][62634] Updated weights for policy 0, policy_version 63720 (0.0008) [2023-10-12 18:26:32,903][62635] Updated weights for policy 1, policy_version 63740 (0.0007) [2023-10-12 18:26:32,968][62634] Updated weights for policy 0, policy_version 63730 (0.0007) [2023-10-12 18:26:33,337][62634] Updated weights for policy 0, policy_version 63740 (0.0010) [2023-10-12 18:26:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130514944. Throughput: 0: 1658.4, 1: 1660.0. Samples: 32640146. Policy #0 lag: (min: 31.0, avg: 32.1, max: 53.0) [2023-10-12 18:26:33,435][61643] Avg episode reward: [(0, '23.880'), (1, '9.860')] [2023-10-12 18:26:37,045][62635] Updated weights for policy 1, policy_version 63750 (0.0008) [2023-10-12 18:26:37,376][62634] Updated weights for policy 0, policy_version 63750 (0.0008) [2023-10-12 18:26:37,417][62635] Updated weights for policy 1, policy_version 63760 (0.0007) [2023-10-12 18:26:37,756][62634] Updated weights for policy 0, policy_version 63760 (0.0007) [2023-10-12 18:26:37,777][62635] Updated weights for policy 1, policy_version 63770 (0.0009) [2023-10-12 18:26:38,127][62634] Updated weights for policy 0, policy_version 63770 (0.0007) [2023-10-12 18:26:38,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130613248. Throughput: 0: 1677.5, 1: 1688.0. Samples: 32651304. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:26:38,436][61643] Avg episode reward: [(0, '23.980'), (1, '9.780')] [2023-10-12 18:26:41,847][62635] Updated weights for policy 1, policy_version 63780 (0.0007) [2023-10-12 18:26:42,099][62634] Updated weights for policy 0, policy_version 63780 (0.0008) [2023-10-12 18:26:42,213][62635] Updated weights for policy 1, policy_version 63790 (0.0007) [2023-10-12 18:26:42,474][62634] Updated weights for policy 0, policy_version 63790 (0.0009) [2023-10-12 18:26:42,579][62635] Updated weights for policy 1, policy_version 63800 (0.0007) [2023-10-12 18:26:42,847][62634] Updated weights for policy 0, policy_version 63800 (0.0007) [2023-10-12 18:26:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130678784. Throughput: 0: 1680.0, 1: 1682.0. Samples: 32671904. Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:26:43,435][61643] Avg episode reward: [(0, '23.960'), (1, '9.660')] [2023-10-12 18:26:46,687][62635] Updated weights for policy 1, policy_version 63810 (0.0008) [2023-10-12 18:26:47,047][62634] Updated weights for policy 0, policy_version 63810 (0.0008) [2023-10-12 18:26:47,054][62635] Updated weights for policy 1, policy_version 63820 (0.0009) [2023-10-12 18:26:47,421][62635] Updated weights for policy 1, policy_version 63830 (0.0007) [2023-10-12 18:26:47,460][62634] Updated weights for policy 0, policy_version 63820 (0.0007) [2023-10-12 18:26:47,786][62635] Updated weights for policy 1, policy_version 63840 (0.0010) [2023-10-12 18:26:47,830][62634] Updated weights for policy 0, policy_version 63830 (0.0009) [2023-10-12 18:26:48,210][62634] Updated weights for policy 0, policy_version 63840 (0.0010) [2023-10-12 18:26:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130744320. Throughput: 0: 1659.3, 1: 1665.1. Samples: 32690422. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:26:48,435][61643] Avg episode reward: [(0, '24.330'), (1, '10.080')] [2023-10-12 18:26:51,896][62635] Updated weights for policy 1, policy_version 63850 (0.0010) [2023-10-12 18:26:52,192][62634] Updated weights for policy 0, policy_version 63850 (0.0008) [2023-10-12 18:26:52,266][62635] Updated weights for policy 1, policy_version 63860 (0.0008) [2023-10-12 18:26:52,559][62634] Updated weights for policy 0, policy_version 63860 (0.0009) [2023-10-12 18:26:52,627][62635] Updated weights for policy 1, policy_version 63870 (0.0009) [2023-10-12 18:26:52,941][62634] Updated weights for policy 0, policy_version 63870 (0.0010) [2023-10-12 18:26:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130809856. Throughput: 0: 1681.4, 1: 1687.9. Samples: 32701792. Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:26:53,436][61643] Avg episode reward: [(0, '24.200'), (1, '10.070')] [2023-10-12 18:26:56,697][62635] Updated weights for policy 1, policy_version 63880 (0.0007) [2023-10-12 18:26:56,918][62634] Updated weights for policy 0, policy_version 63880 (0.0008) [2023-10-12 18:26:57,057][62635] Updated weights for policy 1, policy_version 63890 (0.0007) [2023-10-12 18:26:57,294][62634] Updated weights for policy 0, policy_version 63890 (0.0009) [2023-10-12 18:26:57,428][62635] Updated weights for policy 1, policy_version 63900 (0.0008) [2023-10-12 18:26:57,667][62634] Updated weights for policy 0, policy_version 63900 (0.0009) [2023-10-12 18:26:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 130875392. Throughput: 0: 1677.0, 1: 1678.5. Samples: 32721566. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:26:58,436][61643] Avg episode reward: [(0, '23.960'), (1, '9.780')] [2023-10-12 18:27:01,706][62635] Updated weights for policy 1, policy_version 63910 (0.0008) [2023-10-12 18:27:01,813][62634] Updated weights for policy 0, policy_version 63910 (0.0009) [2023-10-12 18:27:02,086][62635] Updated weights for policy 1, policy_version 63920 (0.0008) [2023-10-12 18:27:02,193][62634] Updated weights for policy 0, policy_version 63920 (0.0009) [2023-10-12 18:27:02,457][62635] Updated weights for policy 1, policy_version 63930 (0.0008) [2023-10-12 18:27:02,567][62634] Updated weights for policy 0, policy_version 63930 (0.0009) [2023-10-12 18:27:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 130940928. Throughput: 0: 1664.4, 1: 1675.6. Samples: 32740664. Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:27:03,435][61643] Avg episode reward: [(0, '24.060'), (1, '9.890')] [2023-10-12 18:27:06,551][62635] Updated weights for policy 1, policy_version 63940 (0.0008) [2023-10-12 18:27:06,709][62634] Updated weights for policy 0, policy_version 63940 (0.0007) [2023-10-12 18:27:06,916][62635] Updated weights for policy 1, policy_version 63950 (0.0008) [2023-10-12 18:27:07,081][62634] Updated weights for policy 0, policy_version 63950 (0.0007) [2023-10-12 18:27:07,276][62635] Updated weights for policy 1, policy_version 63960 (0.0009) [2023-10-12 18:27:07,455][62634] Updated weights for policy 0, policy_version 63960 (0.0009) [2023-10-12 18:27:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 131006464. Throughput: 0: 1686.8, 1: 1684.5. Samples: 32752084. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:27:08,436][61643] Avg episode reward: [(0, '24.130'), (1, '9.860')] [2023-10-12 18:27:11,325][62635] Updated weights for policy 1, policy_version 63970 (0.0009) [2023-10-12 18:27:11,491][62634] Updated weights for policy 0, policy_version 63970 (0.0009) [2023-10-12 18:27:11,687][62635] Updated weights for policy 1, policy_version 63980 (0.0010) [2023-10-12 18:27:11,871][62634] Updated weights for policy 0, policy_version 63980 (0.0007) [2023-10-12 18:27:12,057][62635] Updated weights for policy 1, policy_version 63990 (0.0008) [2023-10-12 18:27:12,253][62634] Updated weights for policy 0, policy_version 63990 (0.0008) [2023-10-12 18:27:12,419][62635] Updated weights for policy 1, policy_version 64000 (0.0007) [2023-10-12 18:27:12,629][62634] Updated weights for policy 0, policy_version 64000 (0.0009) [2023-10-12 18:27:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 131072000. Throughput: 0: 1675.4, 1: 1664.9. Samples: 32771408. Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:27:13,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.760')] [2023-10-12 18:27:16,421][62634] Updated weights for policy 0, policy_version 64010 (0.0009) [2023-10-12 18:27:16,611][62635] Updated weights for policy 1, policy_version 64010 (0.0007) [2023-10-12 18:27:16,805][62634] Updated weights for policy 0, policy_version 64020 (0.0010) [2023-10-12 18:27:16,974][62635] Updated weights for policy 1, policy_version 64020 (0.0008) [2023-10-12 18:27:17,192][62634] Updated weights for policy 0, policy_version 64030 (0.0007) [2023-10-12 18:27:17,350][62635] Updated weights for policy 1, policy_version 64030 (0.0008) [2023-10-12 18:27:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 131137536. Throughput: 0: 1671.4, 1: 1677.8. Samples: 32790862. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-12 18:27:18,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.800')] [2023-10-12 18:27:21,160][62634] Updated weights for policy 0, policy_version 64040 (0.0008) [2023-10-12 18:27:21,331][62635] Updated weights for policy 1, policy_version 64040 (0.0007) [2023-10-12 18:27:21,534][62634] Updated weights for policy 0, policy_version 64050 (0.0009) [2023-10-12 18:27:21,708][62635] Updated weights for policy 1, policy_version 64050 (0.0007) [2023-10-12 18:27:21,907][62634] Updated weights for policy 0, policy_version 64060 (0.0009) [2023-10-12 18:27:22,070][62635] Updated weights for policy 1, policy_version 64060 (0.0007) [2023-10-12 18:27:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131203072. Throughput: 0: 1678.8, 1: 1681.8. Samples: 32802532. Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:23,436][61643] Avg episode reward: [(0, '24.600'), (1, '9.900')] [2023-10-12 18:27:26,074][62634] Updated weights for policy 0, policy_version 64070 (0.0007) [2023-10-12 18:27:26,277][62635] Updated weights for policy 1, policy_version 64070 (0.0008) [2023-10-12 18:27:26,448][62634] Updated weights for policy 0, policy_version 64080 (0.0007) [2023-10-12 18:27:26,649][62635] Updated weights for policy 1, policy_version 64080 (0.0007) [2023-10-12 18:27:26,822][62634] Updated weights for policy 0, policy_version 64090 (0.0009) [2023-10-12 18:27:27,018][62635] Updated weights for policy 1, policy_version 64090 (0.0007) [2023-10-12 18:27:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 131268608. Throughput: 0: 1650.9, 1: 1663.6. Samples: 32821054. 
Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:28,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.840')] [2023-10-12 18:27:30,917][62634] Updated weights for policy 0, policy_version 64100 (0.0008) [2023-10-12 18:27:31,056][62635] Updated weights for policy 1, policy_version 64100 (0.0007) [2023-10-12 18:27:31,299][62634] Updated weights for policy 0, policy_version 64110 (0.0008) [2023-10-12 18:27:31,424][62635] Updated weights for policy 1, policy_version 64110 (0.0008) [2023-10-12 18:27:31,676][62634] Updated weights for policy 0, policy_version 64120 (0.0009) [2023-10-12 18:27:31,794][62635] Updated weights for policy 1, policy_version 64120 (0.0007) [2023-10-12 18:27:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 131334144. Throughput: 0: 1675.1, 1: 1677.9. Samples: 32841308. Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:33,436][61643] Avg episode reward: [(0, '24.620'), (1, '9.830')] [2023-10-12 18:27:35,789][62635] Updated weights for policy 1, policy_version 64130 (0.0010) [2023-10-12 18:27:35,916][62634] Updated weights for policy 0, policy_version 64130 (0.0010) [2023-10-12 18:27:36,156][62635] Updated weights for policy 1, policy_version 64140 (0.0007) [2023-10-12 18:27:36,321][62634] Updated weights for policy 0, policy_version 64140 (0.0008) [2023-10-12 18:27:36,534][62635] Updated weights for policy 1, policy_version 64150 (0.0007) [2023-10-12 18:27:36,687][62634] Updated weights for policy 0, policy_version 64150 (0.0010) [2023-10-12 18:27:36,894][62635] Updated weights for policy 1, policy_version 64160 (0.0008) [2023-10-12 18:27:37,060][62634] Updated weights for policy 0, policy_version 64160 (0.0008) [2023-10-12 18:27:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131399680. Throughput: 0: 1676.9, 1: 1671.0. Samples: 32852448. 
Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:38,435][61643] Avg episode reward: [(0, '25.050'), (1, '10.110')] [2023-10-12 18:27:41,051][62635] Updated weights for policy 1, policy_version 64170 (0.0009) [2023-10-12 18:27:41,235][62634] Updated weights for policy 0, policy_version 64170 (0.0009) [2023-10-12 18:27:41,420][62635] Updated weights for policy 1, policy_version 64180 (0.0009) [2023-10-12 18:27:41,621][62634] Updated weights for policy 0, policy_version 64180 (0.0009) [2023-10-12 18:27:41,792][62635] Updated weights for policy 1, policy_version 64190 (0.0008) [2023-10-12 18:27:41,994][62634] Updated weights for policy 0, policy_version 64190 (0.0009) [2023-10-12 18:27:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 131465216. Throughput: 0: 1659.4, 1: 1661.1. Samples: 32870988. Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:43,436][61643] Avg episode reward: [(0, '25.180'), (1, '10.020')] [2023-10-12 18:27:45,808][62635] Updated weights for policy 1, policy_version 64200 (0.0009) [2023-10-12 18:27:46,106][62634] Updated weights for policy 0, policy_version 64200 (0.0009) [2023-10-12 18:27:46,191][62635] Updated weights for policy 1, policy_version 64210 (0.0008) [2023-10-12 18:27:46,485][62634] Updated weights for policy 0, policy_version 64210 (0.0008) [2023-10-12 18:27:46,554][62635] Updated weights for policy 1, policy_version 64220 (0.0008) [2023-10-12 18:27:46,859][62634] Updated weights for policy 0, policy_version 64220 (0.0007) [2023-10-12 18:27:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131530752. Throughput: 0: 1670.0, 1: 1673.1. Samples: 32891104. 
Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:48,435][61643] Avg episode reward: [(0, '25.130'), (1, '9.760')] [2023-10-12 18:27:50,538][62635] Updated weights for policy 1, policy_version 64230 (0.0007) [2023-10-12 18:27:50,908][62635] Updated weights for policy 1, policy_version 64240 (0.0009) [2023-10-12 18:27:50,978][62634] Updated weights for policy 0, policy_version 64230 (0.0010) [2023-10-12 18:27:51,277][62635] Updated weights for policy 1, policy_version 64250 (0.0008) [2023-10-12 18:27:51,351][62634] Updated weights for policy 0, policy_version 64240 (0.0009) [2023-10-12 18:27:51,734][62634] Updated weights for policy 0, policy_version 64250 (0.0008) [2023-10-12 18:27:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131596288. Throughput: 0: 1666.5, 1: 1658.5. Samples: 32901710. Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:53,436][61643] Avg episode reward: [(0, '25.130'), (1, '9.990')] [2023-10-12 18:27:55,341][62635] Updated weights for policy 1, policy_version 64260 (0.0007) [2023-10-12 18:27:55,709][62635] Updated weights for policy 1, policy_version 64270 (0.0007) [2023-10-12 18:27:55,786][62634] Updated weights for policy 0, policy_version 64260 (0.0011) [2023-10-12 18:27:56,075][62635] Updated weights for policy 1, policy_version 64280 (0.0008) [2023-10-12 18:27:56,162][62634] Updated weights for policy 0, policy_version 64270 (0.0009) [2023-10-12 18:27:56,545][62634] Updated weights for policy 0, policy_version 64280 (0.0009) [2023-10-12 18:27:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131661824. Throughput: 0: 1653.5, 1: 1664.4. Samples: 32920716. 
Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:27:58,435][61643] Avg episode reward: [(0, '25.240'), (1, '10.070')] [2023-10-12 18:28:00,314][62635] Updated weights for policy 1, policy_version 64290 (0.0008) [2023-10-12 18:28:00,481][62634] Updated weights for policy 0, policy_version 64290 (0.0010) [2023-10-12 18:28:00,676][62635] Updated weights for policy 1, policy_version 64300 (0.0009) [2023-10-12 18:28:00,864][62634] Updated weights for policy 0, policy_version 64300 (0.0008) [2023-10-12 18:28:01,045][62635] Updated weights for policy 1, policy_version 64310 (0.0008) [2023-10-12 18:28:01,248][62634] Updated weights for policy 0, policy_version 64310 (0.0008) [2023-10-12 18:28:01,408][62635] Updated weights for policy 1, policy_version 64320 (0.0008) [2023-10-12 18:28:01,621][62634] Updated weights for policy 0, policy_version 64320 (0.0007) [2023-10-12 18:28:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131727360. Throughput: 0: 1675.9, 1: 1670.8. Samples: 32941464. Policy #0 lag: (min: 16.0, avg: 37.7, max: 48.0) [2023-10-12 18:28:03,435][61643] Avg episode reward: [(0, '25.160'), (1, '10.050')] [2023-10-12 18:28:05,564][62634] Updated weights for policy 0, policy_version 64330 (0.0007) [2023-10-12 18:28:05,566][62635] Updated weights for policy 1, policy_version 64330 (0.0009) [2023-10-12 18:28:05,929][62635] Updated weights for policy 1, policy_version 64340 (0.0009) [2023-10-12 18:28:05,947][62634] Updated weights for policy 0, policy_version 64340 (0.0007) [2023-10-12 18:28:06,293][62635] Updated weights for policy 1, policy_version 64350 (0.0007) [2023-10-12 18:28:06,328][62634] Updated weights for policy 0, policy_version 64350 (0.0009) [2023-10-12 18:28:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131792896. Throughput: 0: 1661.9, 1: 1648.5. Samples: 32951500. 
Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:08,435][61643] Avg episode reward: [(0, '25.330'), (1, '10.050')] [2023-10-12 18:28:10,293][62634] Updated weights for policy 0, policy_version 64360 (0.0007) [2023-10-12 18:28:10,380][62635] Updated weights for policy 1, policy_version 64360 (0.0007) [2023-10-12 18:28:10,663][62634] Updated weights for policy 0, policy_version 64370 (0.0009) [2023-10-12 18:28:10,740][62635] Updated weights for policy 1, policy_version 64370 (0.0008) [2023-10-12 18:28:11,038][62634] Updated weights for policy 0, policy_version 64380 (0.0008) [2023-10-12 18:28:11,113][62635] Updated weights for policy 1, policy_version 64380 (0.0007) [2023-10-12 18:28:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131858432. Throughput: 0: 1673.1, 1: 1665.1. Samples: 32971272. Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:13,436][61643] Avg episode reward: [(0, '25.350'), (1, '9.970')] [2023-10-12 18:28:15,227][62634] Updated weights for policy 0, policy_version 64390 (0.0008) [2023-10-12 18:28:15,276][62635] Updated weights for policy 1, policy_version 64390 (0.0008) [2023-10-12 18:28:15,608][62634] Updated weights for policy 0, policy_version 64400 (0.0008) [2023-10-12 18:28:15,645][62635] Updated weights for policy 1, policy_version 64400 (0.0007) [2023-10-12 18:28:15,987][62634] Updated weights for policy 0, policy_version 64410 (0.0009) [2023-10-12 18:28:16,013][62635] Updated weights for policy 1, policy_version 64410 (0.0007) [2023-10-12 18:28:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131923968. Throughput: 0: 1674.4, 1: 1670.4. Samples: 32991824. Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:18,435][61643] Avg episode reward: [(0, '25.530'), (1, '9.850')] [2023-10-12 18:28:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000064416_65961984.pth... 
[2023-10-12 18:28:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000064416_65961984.pth... [2023-10-12 18:28:18,474][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000062848_64356352.pth [2023-10-12 18:28:18,489][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000062848_64356352.pth [2023-10-12 18:28:18,493][62354] Saving new best policy, reward=25.530! [2023-10-12 18:28:20,069][62634] Updated weights for policy 0, policy_version 64420 (0.0009) [2023-10-12 18:28:20,101][62635] Updated weights for policy 1, policy_version 64420 (0.0008) [2023-10-12 18:28:20,441][62634] Updated weights for policy 0, policy_version 64430 (0.0009) [2023-10-12 18:28:20,459][62635] Updated weights for policy 1, policy_version 64430 (0.0009) [2023-10-12 18:28:20,814][62634] Updated weights for policy 0, policy_version 64440 (0.0008) [2023-10-12 18:28:20,824][62635] Updated weights for policy 1, policy_version 64440 (0.0008) [2023-10-12 18:28:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131989504. Throughput: 0: 1652.4, 1: 1651.7. Samples: 33001132. 
Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:23,435][61643] Avg episode reward: [(0, '25.360'), (1, '9.680')] [2023-10-12 18:28:24,842][62635] Updated weights for policy 1, policy_version 64450 (0.0008) [2023-10-12 18:28:24,966][62634] Updated weights for policy 0, policy_version 64450 (0.0009) [2023-10-12 18:28:25,209][62635] Updated weights for policy 1, policy_version 64460 (0.0008) [2023-10-12 18:28:25,339][62634] Updated weights for policy 0, policy_version 64460 (0.0008) [2023-10-12 18:28:25,588][62635] Updated weights for policy 1, policy_version 64470 (0.0009) [2023-10-12 18:28:25,721][62634] Updated weights for policy 0, policy_version 64470 (0.0008) [2023-10-12 18:28:25,951][62635] Updated weights for policy 1, policy_version 64480 (0.0007) [2023-10-12 18:28:26,089][62634] Updated weights for policy 0, policy_version 64480 (0.0008) [2023-10-12 18:28:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 132055040. Throughput: 0: 1669.6, 1: 1672.1. Samples: 33021364. Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:28,436][61643] Avg episode reward: [(0, '25.210'), (1, '9.840')] [2023-10-12 18:28:29,974][62635] Updated weights for policy 1, policy_version 64490 (0.0008) [2023-10-12 18:28:30,354][62635] Updated weights for policy 1, policy_version 64500 (0.0009) [2023-10-12 18:28:30,363][62634] Updated weights for policy 0, policy_version 64490 (0.0009) [2023-10-12 18:28:30,724][62635] Updated weights for policy 1, policy_version 64510 (0.0009) [2023-10-12 18:28:30,732][62634] Updated weights for policy 0, policy_version 64500 (0.0009) [2023-10-12 18:28:31,106][62634] Updated weights for policy 0, policy_version 64510 (0.0008) [2023-10-12 18:28:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132120576. Throughput: 0: 1677.8, 1: 1681.6. Samples: 33042278. 
Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:33,435][61643] Avg episode reward: [(0, '25.170'), (1, '9.840')] [2023-10-12 18:28:34,799][62635] Updated weights for policy 1, policy_version 64520 (0.0010) [2023-10-12 18:28:35,121][62634] Updated weights for policy 0, policy_version 64520 (0.0007) [2023-10-12 18:28:35,165][62635] Updated weights for policy 1, policy_version 64530 (0.0009) [2023-10-12 18:28:35,504][62634] Updated weights for policy 0, policy_version 64530 (0.0009) [2023-10-12 18:28:35,537][62635] Updated weights for policy 1, policy_version 64540 (0.0009) [2023-10-12 18:28:35,871][62634] Updated weights for policy 0, policy_version 64540 (0.0008) [2023-10-12 18:28:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132186112. Throughput: 0: 1654.6, 1: 1672.2. Samples: 33051416. Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:38,435][61643] Avg episode reward: [(0, '25.210'), (1, '10.020')] [2023-10-12 18:28:39,608][62635] Updated weights for policy 1, policy_version 64550 (0.0007) [2023-10-12 18:28:39,803][62634] Updated weights for policy 0, policy_version 64550 (0.0008) [2023-10-12 18:28:39,977][62635] Updated weights for policy 1, policy_version 64560 (0.0008) [2023-10-12 18:28:40,179][62634] Updated weights for policy 0, policy_version 64560 (0.0008) [2023-10-12 18:28:40,346][62635] Updated weights for policy 1, policy_version 64570 (0.0009) [2023-10-12 18:28:40,560][62634] Updated weights for policy 0, policy_version 64570 (0.0008) [2023-10-12 18:28:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132251648. Throughput: 0: 1676.4, 1: 1681.8. Samples: 33071836. 
Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:43,435][61643] Avg episode reward: [(0, '25.310'), (1, '9.840')] [2023-10-12 18:28:44,474][62635] Updated weights for policy 1, policy_version 64580 (0.0008) [2023-10-12 18:28:44,718][62634] Updated weights for policy 0, policy_version 64580 (0.0008) [2023-10-12 18:28:44,847][62635] Updated weights for policy 1, policy_version 64590 (0.0008) [2023-10-12 18:28:45,097][62634] Updated weights for policy 0, policy_version 64590 (0.0009) [2023-10-12 18:28:45,210][62635] Updated weights for policy 1, policy_version 64600 (0.0008) [2023-10-12 18:28:45,470][62634] Updated weights for policy 0, policy_version 64600 (0.0009) [2023-10-12 18:28:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 132317184. Throughput: 0: 1669.5, 1: 1690.7. Samples: 33092672. Policy #0 lag: (min: 17.0, avg: 19.8, max: 49.0) [2023-10-12 18:28:48,436][61643] Avg episode reward: [(0, '25.550'), (1, '9.860')] [2023-10-12 18:28:48,448][62354] Saving new best policy, reward=25.550! [2023-10-12 18:28:49,214][62635] Updated weights for policy 1, policy_version 64610 (0.0008) [2023-10-12 18:28:49,532][62634] Updated weights for policy 0, policy_version 64610 (0.0009) [2023-10-12 18:28:49,587][62635] Updated weights for policy 1, policy_version 64620 (0.0009) [2023-10-12 18:28:49,898][62634] Updated weights for policy 0, policy_version 64620 (0.0008) [2023-10-12 18:28:49,951][62635] Updated weights for policy 1, policy_version 64630 (0.0008) [2023-10-12 18:28:50,284][62634] Updated weights for policy 0, policy_version 64630 (0.0008) [2023-10-12 18:28:50,318][62635] Updated weights for policy 1, policy_version 64640 (0.0009) [2023-10-12 18:28:50,668][62634] Updated weights for policy 0, policy_version 64640 (0.0010) [2023-10-12 18:28:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132382720. Throughput: 0: 1653.5, 1: 1681.4. Samples: 33101570. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:28:53,436][61643] Avg episode reward: [(0, '25.690'), (1, '10.090')] [2023-10-12 18:28:53,437][62354] Saving new best policy, reward=25.690! [2023-10-12 18:28:54,301][62635] Updated weights for policy 1, policy_version 64650 (0.0008) [2023-10-12 18:28:54,659][62635] Updated weights for policy 1, policy_version 64660 (0.0007) [2023-10-12 18:28:54,767][62634] Updated weights for policy 0, policy_version 64650 (0.0010) [2023-10-12 18:28:55,027][62635] Updated weights for policy 1, policy_version 64670 (0.0008) [2023-10-12 18:28:55,140][62634] Updated weights for policy 0, policy_version 64660 (0.0009) [2023-10-12 18:28:55,523][62634] Updated weights for policy 0, policy_version 64670 (0.0009) [2023-10-12 18:28:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 132448256. Throughput: 0: 1665.0, 1: 1688.9. Samples: 33122196. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:28:58,436][61643] Avg episode reward: [(0, '25.610'), (1, '10.040')] [2023-10-12 18:28:59,106][62635] Updated weights for policy 1, policy_version 64680 (0.0010) [2023-10-12 18:28:59,474][62635] Updated weights for policy 1, policy_version 64690 (0.0010) [2023-10-12 18:28:59,657][62634] Updated weights for policy 0, policy_version 64680 (0.0008) [2023-10-12 18:28:59,847][62635] Updated weights for policy 1, policy_version 64700 (0.0009) [2023-10-12 18:29:00,033][62634] Updated weights for policy 0, policy_version 64690 (0.0008) [2023-10-12 18:29:00,410][62634] Updated weights for policy 0, policy_version 64700 (0.0007) [2023-10-12 18:29:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132513792. Throughput: 0: 1665.2, 1: 1692.9. Samples: 33142938. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:03,435][61643] Avg episode reward: [(0, '25.450'), (1, '9.700')] [2023-10-12 18:29:03,880][62635] Updated weights for policy 1, policy_version 64710 (0.0009) [2023-10-12 18:29:04,243][62635] Updated weights for policy 1, policy_version 64720 (0.0007) [2023-10-12 18:29:04,466][62634] Updated weights for policy 0, policy_version 64710 (0.0008) [2023-10-12 18:29:04,623][62635] Updated weights for policy 1, policy_version 64730 (0.0008) [2023-10-12 18:29:04,841][62634] Updated weights for policy 0, policy_version 64720 (0.0009) [2023-10-12 18:29:05,222][62634] Updated weights for policy 0, policy_version 64730 (0.0009) [2023-10-12 18:29:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 132579328. Throughput: 0: 1662.8, 1: 1692.4. Samples: 33152114. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:08,436][61643] Avg episode reward: [(0, '25.220'), (1, '10.160')] [2023-10-12 18:29:08,663][62635] Updated weights for policy 1, policy_version 64740 (0.0009) [2023-10-12 18:29:09,030][62635] Updated weights for policy 1, policy_version 64750 (0.0008) [2023-10-12 18:29:09,239][62634] Updated weights for policy 0, policy_version 64740 (0.0008) [2023-10-12 18:29:09,393][62635] Updated weights for policy 1, policy_version 64760 (0.0009) [2023-10-12 18:29:09,626][62634] Updated weights for policy 0, policy_version 64750 (0.0009) [2023-10-12 18:29:09,691][62495] Saving new best policy, reward=10.160! [2023-10-12 18:29:10,005][62634] Updated weights for policy 0, policy_version 64760 (0.0010) [2023-10-12 18:29:13,423][62635] Updated weights for policy 1, policy_version 64770 (0.0009) [2023-10-12 18:29:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132644864. Throughput: 0: 1672.3, 1: 1695.1. Samples: 33172896. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:13,436][61643] Avg episode reward: [(0, '25.300'), (1, '9.890')] [2023-10-12 18:29:13,787][62635] Updated weights for policy 1, policy_version 64780 (0.0009) [2023-10-12 18:29:14,161][62635] Updated weights for policy 1, policy_version 64790 (0.0007) [2023-10-12 18:29:14,211][62634] Updated weights for policy 0, policy_version 64770 (0.0010) [2023-10-12 18:29:14,525][62635] Updated weights for policy 1, policy_version 64800 (0.0007) [2023-10-12 18:29:14,614][62634] Updated weights for policy 0, policy_version 64780 (0.0007) [2023-10-12 18:29:14,990][62634] Updated weights for policy 0, policy_version 64790 (0.0007) [2023-10-12 18:29:15,361][62634] Updated weights for policy 0, policy_version 64800 (0.0007) [2023-10-12 18:29:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132710400. Throughput: 0: 1666.8, 1: 1692.8. Samples: 33193460. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:18,435][61643] Avg episode reward: [(0, '25.180'), (1, '9.840')] [2023-10-12 18:29:18,520][62635] Updated weights for policy 1, policy_version 64810 (0.0008) [2023-10-12 18:29:18,881][62635] Updated weights for policy 1, policy_version 64820 (0.0009) [2023-10-12 18:29:19,257][62635] Updated weights for policy 1, policy_version 64830 (0.0009) [2023-10-12 18:29:19,438][62634] Updated weights for policy 0, policy_version 64810 (0.0008) [2023-10-12 18:29:19,818][62634] Updated weights for policy 0, policy_version 64820 (0.0008) [2023-10-12 18:29:20,199][62634] Updated weights for policy 0, policy_version 64830 (0.0009) [2023-10-12 18:29:23,367][62635] Updated weights for policy 1, policy_version 64840 (0.0009) [2023-10-12 18:29:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132775936. Throughput: 0: 1665.0, 1: 1692.7. Samples: 33202512. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:23,435][61643] Avg episode reward: [(0, '25.210'), (1, '9.990')] [2023-10-12 18:29:23,732][62635] Updated weights for policy 1, policy_version 64850 (0.0010) [2023-10-12 18:29:24,106][62635] Updated weights for policy 1, policy_version 64860 (0.0008) [2023-10-12 18:29:24,303][62634] Updated weights for policy 0, policy_version 64840 (0.0009) [2023-10-12 18:29:24,679][62634] Updated weights for policy 0, policy_version 64850 (0.0008) [2023-10-12 18:29:25,068][62634] Updated weights for policy 0, policy_version 64860 (0.0009) [2023-10-12 18:29:28,065][62635] Updated weights for policy 1, policy_version 64870 (0.0007) [2023-10-12 18:29:28,431][62635] Updated weights for policy 1, policy_version 64880 (0.0007) [2023-10-12 18:29:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132841472. Throughput: 0: 1677.2, 1: 1692.5. Samples: 33223470. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:28,435][61643] Avg episode reward: [(0, '25.200'), (1, '9.820')] [2023-10-12 18:29:28,800][62635] Updated weights for policy 1, policy_version 64890 (0.0007) [2023-10-12 18:29:28,971][62634] Updated weights for policy 0, policy_version 64870 (0.0008) [2023-10-12 18:29:29,351][62634] Updated weights for policy 0, policy_version 64880 (0.0009) [2023-10-12 18:29:29,741][62634] Updated weights for policy 0, policy_version 64890 (0.0010) [2023-10-12 18:29:32,829][62635] Updated weights for policy 1, policy_version 64900 (0.0007) [2023-10-12 18:29:33,195][62635] Updated weights for policy 1, policy_version 64910 (0.0007) [2023-10-12 18:29:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132907008. Throughput: 0: 1684.1, 1: 1683.9. Samples: 33244232. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:29:33,435][61643] Avg episode reward: [(0, '25.090'), (1, '9.930')] [2023-10-12 18:29:33,565][62635] Updated weights for policy 1, policy_version 64920 (0.0009) [2023-10-12 18:29:33,590][62634] Updated weights for policy 0, policy_version 64900 (0.0010) [2023-10-12 18:29:33,969][62634] Updated weights for policy 0, policy_version 64910 (0.0009) [2023-10-12 18:29:34,337][62634] Updated weights for policy 0, policy_version 64920 (0.0008) [2023-10-12 18:29:37,714][62635] Updated weights for policy 1, policy_version 64930 (0.0009) [2023-10-12 18:29:38,089][62635] Updated weights for policy 1, policy_version 64940 (0.0010) [2023-10-12 18:29:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132972544. Throughput: 0: 1686.1, 1: 1694.2. Samples: 33253684. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:29:38,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.920')] [2023-10-12 18:29:38,455][62635] Updated weights for policy 1, policy_version 64950 (0.0009) [2023-10-12 18:29:38,548][62634] Updated weights for policy 0, policy_version 64930 (0.0008) [2023-10-12 18:29:38,819][62635] Updated weights for policy 1, policy_version 64960 (0.0009) [2023-10-12 18:29:38,920][62634] Updated weights for policy 0, policy_version 64940 (0.0008) [2023-10-12 18:29:39,299][62634] Updated weights for policy 0, policy_version 64950 (0.0009) [2023-10-12 18:29:39,667][62634] Updated weights for policy 0, policy_version 64960 (0.0008) [2023-10-12 18:29:42,797][62635] Updated weights for policy 1, policy_version 64970 (0.0009) [2023-10-12 18:29:43,171][62635] Updated weights for policy 1, policy_version 64980 (0.0009) [2023-10-12 18:29:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 133038080. Throughput: 0: 1686.5, 1: 1702.5. Samples: 33274698. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:29:43,435][61643] Avg episode reward: [(0, '24.650'), (1, '9.930')] [2023-10-12 18:29:43,535][62635] Updated weights for policy 1, policy_version 64990 (0.0008) [2023-10-12 18:29:43,579][62634] Updated weights for policy 0, policy_version 64970 (0.0007) [2023-10-12 18:29:43,959][62634] Updated weights for policy 0, policy_version 64980 (0.0010) [2023-10-12 18:29:44,333][62634] Updated weights for policy 0, policy_version 64990 (0.0009) [2023-10-12 18:29:47,478][62635] Updated weights for policy 1, policy_version 65000 (0.0007) [2023-10-12 18:29:47,846][62635] Updated weights for policy 1, policy_version 65010 (0.0009) [2023-10-12 18:29:48,216][62635] Updated weights for policy 1, policy_version 65020 (0.0009) [2023-10-12 18:29:48,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133136384. Throughput: 0: 1691.3, 1: 1680.5. Samples: 33294672. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:29:48,435][61643] Avg episode reward: [(0, '24.650'), (1, '9.930')] [2023-10-12 18:29:48,468][62634] Updated weights for policy 0, policy_version 65000 (0.0009) [2023-10-12 18:29:48,847][62634] Updated weights for policy 0, policy_version 65010 (0.0009) [2023-10-12 18:29:49,223][62634] Updated weights for policy 0, policy_version 65020 (0.0007) [2023-10-12 18:29:52,277][62635] Updated weights for policy 1, policy_version 65030 (0.0007) [2023-10-12 18:29:52,649][62635] Updated weights for policy 1, policy_version 65040 (0.0010) [2023-10-12 18:29:53,013][62635] Updated weights for policy 1, policy_version 65050 (0.0010) [2023-10-12 18:29:53,380][62634] Updated weights for policy 0, policy_version 65030 (0.0007) [2023-10-12 18:29:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133201920. Throughput: 0: 1688.6, 1: 1700.3. Samples: 33304614. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:29:53,435][61643] Avg episode reward: [(0, '24.660'), (1, '10.090')] [2023-10-12 18:29:53,749][62634] Updated weights for policy 0, policy_version 65040 (0.0008) [2023-10-12 18:29:54,129][62634] Updated weights for policy 0, policy_version 65050 (0.0007) [2023-10-12 18:29:57,216][62635] Updated weights for policy 1, policy_version 65060 (0.0010) [2023-10-12 18:29:57,585][62635] Updated weights for policy 1, policy_version 65070 (0.0008) [2023-10-12 18:29:57,946][62635] Updated weights for policy 1, policy_version 65080 (0.0008) [2023-10-12 18:29:58,037][62634] Updated weights for policy 0, policy_version 65060 (0.0008) [2023-10-12 18:29:58,404][62634] Updated weights for policy 0, policy_version 65070 (0.0009) [2023-10-12 18:29:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133267456. Throughput: 0: 1693.5, 1: 1697.4. Samples: 33325484. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:29:58,435][61643] Avg episode reward: [(0, '24.660'), (1, '10.060')] [2023-10-12 18:29:58,780][62634] Updated weights for policy 0, policy_version 65080 (0.0008) [2023-10-12 18:30:02,043][62635] Updated weights for policy 1, policy_version 65090 (0.0008) [2023-10-12 18:30:02,415][62635] Updated weights for policy 1, policy_version 65100 (0.0009) [2023-10-12 18:30:02,785][62635] Updated weights for policy 1, policy_version 65110 (0.0009) [2023-10-12 18:30:02,960][62634] Updated weights for policy 0, policy_version 65090 (0.0007) [2023-10-12 18:30:03,145][62635] Updated weights for policy 1, policy_version 65120 (0.0007) [2023-10-12 18:30:03,353][62634] Updated weights for policy 0, policy_version 65100 (0.0009) [2023-10-12 18:30:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133332992. Throughput: 0: 1697.3, 1: 1667.4. Samples: 33344870. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:30:03,436][61643] Avg episode reward: [(0, '24.640'), (1, '9.880')] [2023-10-12 18:30:03,732][62634] Updated weights for policy 0, policy_version 65110 (0.0010) [2023-10-12 18:30:04,103][62634] Updated weights for policy 0, policy_version 65120 (0.0008) [2023-10-12 18:30:07,315][62635] Updated weights for policy 1, policy_version 65130 (0.0009) [2023-10-12 18:30:07,674][62635] Updated weights for policy 1, policy_version 65140 (0.0009) [2023-10-12 18:30:08,053][62635] Updated weights for policy 1, policy_version 65150 (0.0007) [2023-10-12 18:30:08,211][62634] Updated weights for policy 0, policy_version 65130 (0.0009) [2023-10-12 18:30:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133398528. Throughput: 0: 1697.8, 1: 1691.8. Samples: 33355042. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:30:08,435][61643] Avg episode reward: [(0, '24.550'), (1, '10.010')] [2023-10-12 18:30:08,594][62634] Updated weights for policy 0, policy_version 65140 (0.0011) [2023-10-12 18:30:08,974][62634] Updated weights for policy 0, policy_version 65150 (0.0008) [2023-10-12 18:30:12,412][62635] Updated weights for policy 1, policy_version 65160 (0.0010) [2023-10-12 18:30:12,785][62635] Updated weights for policy 1, policy_version 65170 (0.0007) [2023-10-12 18:30:13,086][62634] Updated weights for policy 0, policy_version 65160 (0.0008) [2023-10-12 18:30:13,157][62635] Updated weights for policy 1, policy_version 65180 (0.0009) [2023-10-12 18:30:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133464064. Throughput: 0: 1687.7, 1: 1689.2. Samples: 33375430. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:30:13,435][61643] Avg episode reward: [(0, '24.470'), (1, '10.000')] [2023-10-12 18:30:13,464][62634] Updated weights for policy 0, policy_version 65170 (0.0008) [2023-10-12 18:30:13,842][62634] Updated weights for policy 0, policy_version 65180 (0.0008) [2023-10-12 18:30:17,108][62635] Updated weights for policy 1, policy_version 65190 (0.0009) [2023-10-12 18:30:17,466][62635] Updated weights for policy 1, policy_version 65200 (0.0009) [2023-10-12 18:30:17,731][62634] Updated weights for policy 0, policy_version 65190 (0.0009) [2023-10-12 18:30:17,830][62635] Updated weights for policy 1, policy_version 65210 (0.0009) [2023-10-12 18:30:18,107][62634] Updated weights for policy 0, policy_version 65200 (0.0008) [2023-10-12 18:30:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133529600. Throughput: 0: 1676.6, 1: 1663.2. Samples: 33394522. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-12 18:30:18,436][61643] Avg episode reward: [(0, '24.590'), (1, '9.810')] [2023-10-12 18:30:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000065216_66781184.pth... [2023-10-12 18:30:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000063648_65175552.pth [2023-10-12 18:30:18,483][62634] Updated weights for policy 0, policy_version 65210 (0.0007) [2023-10-12 18:30:18,700][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000065216_66781184.pth... 
[2023-10-12 18:30:18,730][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000063648_65175552.pth [2023-10-12 18:30:21,923][62635] Updated weights for policy 1, policy_version 65220 (0.0008) [2023-10-12 18:30:22,299][62635] Updated weights for policy 1, policy_version 65230 (0.0008) [2023-10-12 18:30:22,451][62634] Updated weights for policy 0, policy_version 65220 (0.0008) [2023-10-12 18:30:22,669][62635] Updated weights for policy 1, policy_version 65240 (0.0008) [2023-10-12 18:30:22,825][62634] Updated weights for policy 0, policy_version 65230 (0.0009) [2023-10-12 18:30:23,200][62634] Updated weights for policy 0, policy_version 65240 (0.0009) [2023-10-12 18:30:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133595136. Throughput: 0: 1688.5, 1: 1681.7. Samples: 33405344. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:23,436][61643] Avg episode reward: [(0, '24.600'), (1, '9.870')] [2023-10-12 18:30:26,720][62635] Updated weights for policy 1, policy_version 65250 (0.0007) [2023-10-12 18:30:27,085][62635] Updated weights for policy 1, policy_version 65260 (0.0008) [2023-10-12 18:30:27,192][62634] Updated weights for policy 0, policy_version 65250 (0.0010) [2023-10-12 18:30:27,457][62635] Updated weights for policy 1, policy_version 65270 (0.0008) [2023-10-12 18:30:27,573][62634] Updated weights for policy 0, policy_version 65260 (0.0007) [2023-10-12 18:30:27,827][62635] Updated weights for policy 1, policy_version 65280 (0.0008) [2023-10-12 18:30:27,948][62634] Updated weights for policy 0, policy_version 65270 (0.0008) [2023-10-12 18:30:28,323][62634] Updated weights for policy 0, policy_version 65280 (0.0008) [2023-10-12 18:30:28,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 133693440. Throughput: 0: 1687.5, 1: 1661.7. Samples: 33425414. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:28,435][61643] Avg episode reward: [(0, '24.750'), (1, '10.050')] [2023-10-12 18:30:32,029][62635] Updated weights for policy 1, policy_version 65290 (0.0009) [2023-10-12 18:30:32,292][62634] Updated weights for policy 0, policy_version 65290 (0.0010) [2023-10-12 18:30:32,390][62635] Updated weights for policy 1, policy_version 65300 (0.0008) [2023-10-12 18:30:32,675][62634] Updated weights for policy 0, policy_version 65300 (0.0009) [2023-10-12 18:30:32,752][62635] Updated weights for policy 1, policy_version 65310 (0.0009) [2023-10-12 18:30:33,045][62634] Updated weights for policy 0, policy_version 65310 (0.0007) [2023-10-12 18:30:33,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 133758976. Throughput: 0: 1662.9, 1: 1661.0. Samples: 33444248. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:33,436][61643] Avg episode reward: [(0, '24.800'), (1, '9.770')] [2023-10-12 18:30:36,471][62635] Updated weights for policy 1, policy_version 65320 (0.0010) [2023-10-12 18:30:36,839][62635] Updated weights for policy 1, policy_version 65330 (0.0008) [2023-10-12 18:30:37,021][62634] Updated weights for policy 0, policy_version 65320 (0.0010) [2023-10-12 18:30:37,198][62635] Updated weights for policy 1, policy_version 65340 (0.0009) [2023-10-12 18:30:37,390][62634] Updated weights for policy 0, policy_version 65330 (0.0008) [2023-10-12 18:30:37,776][62634] Updated weights for policy 0, policy_version 65340 (0.0009) [2023-10-12 18:30:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 133824512. Throughput: 0: 1688.2, 1: 1675.6. Samples: 33455986. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:38,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.910')] [2023-10-12 18:30:41,243][62635] Updated weights for policy 1, policy_version 65350 (0.0008) [2023-10-12 18:30:41,614][62635] Updated weights for policy 1, policy_version 65360 (0.0009) [2023-10-12 18:30:41,968][62634] Updated weights for policy 0, policy_version 65350 (0.0009) [2023-10-12 18:30:41,985][62635] Updated weights for policy 1, policy_version 65370 (0.0008) [2023-10-12 18:30:42,354][62634] Updated weights for policy 0, policy_version 65360 (0.0008) [2023-10-12 18:30:42,723][62634] Updated weights for policy 0, policy_version 65370 (0.0007) [2023-10-12 18:30:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 133890048. Throughput: 0: 1677.1, 1: 1652.8. Samples: 33475328. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:43,435][61643] Avg episode reward: [(0, '24.600'), (1, '10.270')] [2023-10-12 18:30:43,436][62495] Saving new best policy, reward=10.270! [2023-10-12 18:30:46,031][62635] Updated weights for policy 1, policy_version 65380 (0.0007) [2023-10-12 18:30:46,396][62635] Updated weights for policy 1, policy_version 65390 (0.0007) [2023-10-12 18:30:46,766][62635] Updated weights for policy 1, policy_version 65400 (0.0007) [2023-10-12 18:30:47,053][62634] Updated weights for policy 0, policy_version 65380 (0.0008) [2023-10-12 18:30:47,432][62634] Updated weights for policy 0, policy_version 65390 (0.0007) [2023-10-12 18:30:47,816][62634] Updated weights for policy 0, policy_version 65400 (0.0009) [2023-10-12 18:30:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133955584. Throughput: 0: 1654.8, 1: 1674.0. Samples: 33494668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:48,435][61643] Avg episode reward: [(0, '24.210'), (1, '9.830')] [2023-10-12 18:30:50,818][62635] Updated weights for policy 1, policy_version 65410 (0.0008) [2023-10-12 18:30:51,181][62635] Updated weights for policy 1, policy_version 65420 (0.0009) [2023-10-12 18:30:51,555][62635] Updated weights for policy 1, policy_version 65430 (0.0008) [2023-10-12 18:30:51,868][62634] Updated weights for policy 0, policy_version 65410 (0.0009) [2023-10-12 18:30:51,917][62635] Updated weights for policy 1, policy_version 65440 (0.0007) [2023-10-12 18:30:52,265][62634] Updated weights for policy 0, policy_version 65420 (0.0008) [2023-10-12 18:30:52,640][62634] Updated weights for policy 0, policy_version 65430 (0.0009) [2023-10-12 18:30:53,025][62634] Updated weights for policy 0, policy_version 65440 (0.0008) [2023-10-12 18:30:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134021120. Throughput: 0: 1678.8, 1: 1670.1. Samples: 33505744. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:53,435][61643] Avg episode reward: [(0, '24.100'), (1, '10.010')] [2023-10-12 18:30:56,159][62635] Updated weights for policy 1, policy_version 65450 (0.0007) [2023-10-12 18:30:56,519][62635] Updated weights for policy 1, policy_version 65460 (0.0008) [2023-10-12 18:30:56,883][62635] Updated weights for policy 1, policy_version 65470 (0.0009) [2023-10-12 18:30:57,064][62634] Updated weights for policy 0, policy_version 65450 (0.0009) [2023-10-12 18:30:57,441][62634] Updated weights for policy 0, policy_version 65460 (0.0009) [2023-10-12 18:30:57,824][62634] Updated weights for policy 0, policy_version 65470 (0.0010) [2023-10-12 18:30:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134086656. Throughput: 0: 1673.0, 1: 1653.9. Samples: 33525142. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:30:58,436][61643] Avg episode reward: [(0, '24.420'), (1, '10.130')] [2023-10-12 18:31:01,071][62635] Updated weights for policy 1, policy_version 65480 (0.0008) [2023-10-12 18:31:01,443][62635] Updated weights for policy 1, policy_version 65490 (0.0009) [2023-10-12 18:31:01,771][62634] Updated weights for policy 0, policy_version 65480 (0.0007) [2023-10-12 18:31:01,801][62635] Updated weights for policy 1, policy_version 65500 (0.0009) [2023-10-12 18:31:02,154][62634] Updated weights for policy 0, policy_version 65490 (0.0009) [2023-10-12 18:31:02,518][62634] Updated weights for policy 0, policy_version 65500 (0.0008) [2023-10-12 18:31:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134152192. Throughput: 0: 1657.3, 1: 1684.9. Samples: 33544924. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:03,436][61643] Avg episode reward: [(0, '24.380'), (1, '9.790')] [2023-10-12 18:31:05,916][62635] Updated weights for policy 1, policy_version 65510 (0.0009) [2023-10-12 18:31:06,283][62635] Updated weights for policy 1, policy_version 65520 (0.0009) [2023-10-12 18:31:06,652][62634] Updated weights for policy 0, policy_version 65510 (0.0008) [2023-10-12 18:31:06,657][62635] Updated weights for policy 1, policy_version 65530 (0.0007) [2023-10-12 18:31:07,026][62634] Updated weights for policy 0, policy_version 65520 (0.0007) [2023-10-12 18:31:07,410][62634] Updated weights for policy 0, policy_version 65530 (0.0008) [2023-10-12 18:31:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134217728. Throughput: 0: 1674.0, 1: 1678.2. Samples: 33556192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:08,436][61643] Avg episode reward: [(0, '24.360'), (1, '10.040')] [2023-10-12 18:31:10,696][62635] Updated weights for policy 1, policy_version 65540 (0.0008) [2023-10-12 18:31:11,055][62635] Updated weights for policy 1, policy_version 65550 (0.0010) [2023-10-12 18:31:11,354][62634] Updated weights for policy 0, policy_version 65540 (0.0007) [2023-10-12 18:31:11,429][62635] Updated weights for policy 1, policy_version 65560 (0.0008) [2023-10-12 18:31:11,731][62634] Updated weights for policy 0, policy_version 65550 (0.0010) [2023-10-12 18:31:12,119][62634] Updated weights for policy 0, policy_version 65560 (0.0009) [2023-10-12 18:31:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134283264. Throughput: 0: 1663.5, 1: 1668.6. Samples: 33575358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:13,435][61643] Avg episode reward: [(0, '24.360'), (1, '10.040')] [2023-10-12 18:31:15,268][62635] Updated weights for policy 1, policy_version 65570 (0.0008) [2023-10-12 18:31:15,641][62635] Updated weights for policy 1, policy_version 65580 (0.0008) [2023-10-12 18:31:16,016][62635] Updated weights for policy 1, policy_version 65590 (0.0009) [2023-10-12 18:31:16,220][62634] Updated weights for policy 0, policy_version 65570 (0.0010) [2023-10-12 18:31:16,383][62635] Updated weights for policy 1, policy_version 65600 (0.0009) [2023-10-12 18:31:16,601][62634] Updated weights for policy 0, policy_version 65580 (0.0009) [2023-10-12 18:31:16,985][62634] Updated weights for policy 0, policy_version 65590 (0.0010) [2023-10-12 18:31:17,357][62634] Updated weights for policy 0, policy_version 65600 (0.0008) [2023-10-12 18:31:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134348800. Throughput: 0: 1672.3, 1: 1693.3. Samples: 33595702. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:18,436][61643] Avg episode reward: [(0, '24.270'), (1, '9.860')] [2023-10-12 18:31:20,529][62635] Updated weights for policy 1, policy_version 65610 (0.0008) [2023-10-12 18:31:20,897][62635] Updated weights for policy 1, policy_version 65620 (0.0007) [2023-10-12 18:31:21,274][62635] Updated weights for policy 1, policy_version 65630 (0.0007) [2023-10-12 18:31:21,419][62634] Updated weights for policy 0, policy_version 65610 (0.0009) [2023-10-12 18:31:21,798][62634] Updated weights for policy 0, policy_version 65620 (0.0009) [2023-10-12 18:31:22,174][62634] Updated weights for policy 0, policy_version 65630 (0.0009) [2023-10-12 18:31:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 134414336. Throughput: 0: 1675.7, 1: 1663.9. Samples: 33606268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:23,435][61643] Avg episode reward: [(0, '24.090'), (1, '10.040')] [2023-10-12 18:31:25,284][62635] Updated weights for policy 1, policy_version 65640 (0.0008) [2023-10-12 18:31:25,645][62635] Updated weights for policy 1, policy_version 65650 (0.0009) [2023-10-12 18:31:26,009][62635] Updated weights for policy 1, policy_version 65660 (0.0008) [2023-10-12 18:31:26,090][62634] Updated weights for policy 0, policy_version 65640 (0.0008) [2023-10-12 18:31:26,470][62634] Updated weights for policy 0, policy_version 65650 (0.0008) [2023-10-12 18:31:26,857][62634] Updated weights for policy 0, policy_version 65660 (0.0011) [2023-10-12 18:31:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134479872. Throughput: 0: 1659.2, 1: 1680.9. Samples: 33625634. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:28,435][61643] Avg episode reward: [(0, '24.210'), (1, '10.220')] [2023-10-12 18:31:30,238][62635] Updated weights for policy 1, policy_version 65670 (0.0008) [2023-10-12 18:31:30,598][62635] Updated weights for policy 1, policy_version 65680 (0.0007) [2023-10-12 18:31:30,827][62634] Updated weights for policy 0, policy_version 65670 (0.0010) [2023-10-12 18:31:30,965][62635] Updated weights for policy 1, policy_version 65690 (0.0007) [2023-10-12 18:31:31,209][62634] Updated weights for policy 0, policy_version 65680 (0.0009) [2023-10-12 18:31:31,578][62634] Updated weights for policy 0, policy_version 65690 (0.0007) [2023-10-12 18:31:33,435][61643] Fps is (10 sec: 13106.6, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134545408. Throughput: 0: 1682.4, 1: 1688.1. Samples: 33646344. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:33,436][61643] Avg episode reward: [(0, '24.310'), (1, '9.860')] [2023-10-12 18:31:34,832][62635] Updated weights for policy 1, policy_version 65700 (0.0009) [2023-10-12 18:31:35,198][62635] Updated weights for policy 1, policy_version 65710 (0.0008) [2023-10-12 18:31:35,568][62635] Updated weights for policy 1, policy_version 65720 (0.0007) [2023-10-12 18:31:35,620][62634] Updated weights for policy 0, policy_version 65700 (0.0009) [2023-10-12 18:31:35,992][62634] Updated weights for policy 0, policy_version 65710 (0.0009) [2023-10-12 18:31:36,374][62634] Updated weights for policy 0, policy_version 65720 (0.0011) [2023-10-12 18:31:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134610944. Throughput: 0: 1677.1, 1: 1668.3. Samples: 33656286. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:38,436][61643] Avg episode reward: [(0, '24.310'), (1, '9.860')] [2023-10-12 18:31:39,633][62635] Updated weights for policy 1, policy_version 65730 (0.0008) [2023-10-12 18:31:39,993][62635] Updated weights for policy 1, policy_version 65740 (0.0008) [2023-10-12 18:31:40,360][62635] Updated weights for policy 1, policy_version 65750 (0.0007) [2023-10-12 18:31:40,514][62634] Updated weights for policy 0, policy_version 65730 (0.0008) [2023-10-12 18:31:40,732][62635] Updated weights for policy 1, policy_version 65760 (0.0007) [2023-10-12 18:31:40,922][62634] Updated weights for policy 0, policy_version 65740 (0.0007) [2023-10-12 18:31:41,298][62634] Updated weights for policy 0, policy_version 65750 (0.0009) [2023-10-12 18:31:41,672][62634] Updated weights for policy 0, policy_version 65760 (0.0008) [2023-10-12 18:31:43,435][61643] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134676480. Throughput: 0: 1669.9, 1: 1694.7. Samples: 33676548. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:31:43,435][61643] Avg episode reward: [(0, '24.520'), (1, '10.240')] [2023-10-12 18:31:44,684][62635] Updated weights for policy 1, policy_version 65770 (0.0007) [2023-10-12 18:31:45,050][62635] Updated weights for policy 1, policy_version 65780 (0.0007) [2023-10-12 18:31:45,419][62635] Updated weights for policy 1, policy_version 65790 (0.0009) [2023-10-12 18:31:45,805][62634] Updated weights for policy 0, policy_version 65770 (0.0007) [2023-10-12 18:31:46,181][62634] Updated weights for policy 0, policy_version 65780 (0.0009) [2023-10-12 18:31:46,568][62634] Updated weights for policy 0, policy_version 65790 (0.0008) [2023-10-12 18:31:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134742016. Throughput: 0: 1686.2, 1: 1696.4. Samples: 33697142. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:31:48,436][61643] Avg episode reward: [(0, '24.590'), (1, '9.870')] [2023-10-12 18:31:49,454][62635] Updated weights for policy 1, policy_version 65800 (0.0007) [2023-10-12 18:31:49,814][62635] Updated weights for policy 1, policy_version 65810 (0.0008) [2023-10-12 18:31:50,184][62635] Updated weights for policy 1, policy_version 65820 (0.0008) [2023-10-12 18:31:50,473][62634] Updated weights for policy 0, policy_version 65800 (0.0007) [2023-10-12 18:31:50,845][62634] Updated weights for policy 0, policy_version 65810 (0.0008) [2023-10-12 18:31:51,221][62634] Updated weights for policy 0, policy_version 65820 (0.0007) [2023-10-12 18:31:53,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 134807552. Throughput: 0: 1670.0, 1: 1674.8. Samples: 33706708. Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:31:53,436][61643] Avg episode reward: [(0, '24.500'), (1, '9.870')] [2023-10-12 18:31:54,365][62635] Updated weights for policy 1, policy_version 65830 (0.0007) [2023-10-12 18:31:54,738][62635] Updated weights for policy 1, policy_version 65840 (0.0008) [2023-10-12 18:31:55,102][62635] Updated weights for policy 1, policy_version 65850 (0.0009) [2023-10-12 18:31:55,291][62634] Updated weights for policy 0, policy_version 65830 (0.0007) [2023-10-12 18:31:55,669][62634] Updated weights for policy 0, policy_version 65840 (0.0008) [2023-10-12 18:31:56,044][62634] Updated weights for policy 0, policy_version 65850 (0.0008) [2023-10-12 18:31:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134873088. Throughput: 0: 1665.9, 1: 1695.3. Samples: 33726614. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:31:58,436][61643] Avg episode reward: [(0, '24.550'), (1, '10.050')] [2023-10-12 18:31:59,159][62635] Updated weights for policy 1, policy_version 65860 (0.0008) [2023-10-12 18:31:59,527][62635] Updated weights for policy 1, policy_version 65870 (0.0008) [2023-10-12 18:31:59,893][62635] Updated weights for policy 1, policy_version 65880 (0.0010) [2023-10-12 18:32:00,161][62634] Updated weights for policy 0, policy_version 65860 (0.0007) [2023-10-12 18:32:00,538][62634] Updated weights for policy 0, policy_version 65870 (0.0007) [2023-10-12 18:32:00,919][62634] Updated weights for policy 0, policy_version 65880 (0.0008) [2023-10-12 18:32:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134938624. Throughput: 0: 1678.4, 1: 1691.7. Samples: 33747360. Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:32:03,436][61643] Avg episode reward: [(0, '24.870'), (1, '9.870')] [2023-10-12 18:32:03,949][62635] Updated weights for policy 1, policy_version 65890 (0.0009) [2023-10-12 18:32:04,318][62635] Updated weights for policy 1, policy_version 65900 (0.0007) [2023-10-12 18:32:04,689][62635] Updated weights for policy 1, policy_version 65910 (0.0007) [2023-10-12 18:32:04,977][62634] Updated weights for policy 0, policy_version 65890 (0.0009) [2023-10-12 18:32:05,048][62635] Updated weights for policy 1, policy_version 65920 (0.0007) [2023-10-12 18:32:05,354][62634] Updated weights for policy 0, policy_version 65900 (0.0007) [2023-10-12 18:32:05,730][62634] Updated weights for policy 0, policy_version 65910 (0.0007) [2023-10-12 18:32:06,102][62634] Updated weights for policy 0, policy_version 65920 (0.0008) [2023-10-12 18:32:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 135004160. Throughput: 0: 1658.7, 1: 1685.6. Samples: 33756764. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:32:08,436][61643] Avg episode reward: [(0, '25.120'), (1, '9.780')] [2023-10-12 18:32:09,137][62635] Updated weights for policy 1, policy_version 65930 (0.0007) [2023-10-12 18:32:09,508][62635] Updated weights for policy 1, policy_version 65940 (0.0007) [2023-10-12 18:32:09,872][62635] Updated weights for policy 1, policy_version 65950 (0.0008) [2023-10-12 18:32:10,284][62634] Updated weights for policy 0, policy_version 65930 (0.0007) [2023-10-12 18:32:10,661][62634] Updated weights for policy 0, policy_version 65940 (0.0009) [2023-10-12 18:32:11,038][62634] Updated weights for policy 0, policy_version 65950 (0.0008) [2023-10-12 18:32:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 135069696. Throughput: 0: 1671.7, 1: 1695.4. Samples: 33777154. Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:32:13,436][61643] Avg episode reward: [(0, '25.340'), (1, '10.050')] [2023-10-12 18:32:14,022][62635] Updated weights for policy 1, policy_version 65960 (0.0008) [2023-10-12 18:32:14,392][62635] Updated weights for policy 1, policy_version 65970 (0.0008) [2023-10-12 18:32:14,768][62635] Updated weights for policy 1, policy_version 65980 (0.0008) [2023-10-12 18:32:15,045][62634] Updated weights for policy 0, policy_version 65960 (0.0008) [2023-10-12 18:32:15,429][62634] Updated weights for policy 0, policy_version 65970 (0.0007) [2023-10-12 18:32:15,799][62634] Updated weights for policy 0, policy_version 65980 (0.0009) [2023-10-12 18:32:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 135135232. Throughput: 0: 1676.7, 1: 1694.0. Samples: 33798024. Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:32:18,436][61643] Avg episode reward: [(0, '25.380'), (1, '9.820')] [2023-10-12 18:32:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000065984_67567616.pth... 
[2023-10-12 18:32:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000064416_65961984.pth [2023-10-12 18:32:18,624][62635] Updated weights for policy 1, policy_version 65990 (0.0009) [2023-10-12 18:32:18,991][62635] Updated weights for policy 1, policy_version 66000 (0.0008) [2023-10-12 18:32:19,370][62635] Updated weights for policy 1, policy_version 66010 (0.0008) [2023-10-12 18:32:19,586][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000066016_67600384.pth... [2023-10-12 18:32:19,614][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000064416_65961984.pth [2023-10-12 18:32:19,669][62634] Updated weights for policy 0, policy_version 65990 (0.0009) [2023-10-12 18:32:20,043][62634] Updated weights for policy 0, policy_version 66000 (0.0009) [2023-10-12 18:32:20,432][62634] Updated weights for policy 0, policy_version 66010 (0.0010) [2023-10-12 18:32:23,414][62635] Updated weights for policy 1, policy_version 66020 (0.0009) [2023-10-12 18:32:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135200768. Throughput: 0: 1658.5, 1: 1692.9. Samples: 33807102. Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:32:23,436][61643] Avg episode reward: [(0, '25.540'), (1, '9.730')] [2023-10-12 18:32:23,779][62635] Updated weights for policy 1, policy_version 66030 (0.0010) [2023-10-12 18:32:24,151][62635] Updated weights for policy 1, policy_version 66040 (0.0007) [2023-10-12 18:32:24,468][62634] Updated weights for policy 0, policy_version 66020 (0.0007) [2023-10-12 18:32:24,849][62634] Updated weights for policy 0, policy_version 66030 (0.0009) [2023-10-12 18:32:25,232][62634] Updated weights for policy 0, policy_version 66040 (0.0009) [2023-10-12 18:32:28,342][62635] Updated weights for policy 1, policy_version 66050 (0.0009) [2023-10-12 18:32:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). 
Total num frames: 135266304. Throughput: 0: 1673.7, 1: 1689.9. Samples: 33827914. Policy #0 lag: (min: 2.0, avg: 10.7, max: 34.0) [2023-10-12 18:32:28,436][61643] Avg episode reward: [(0, '25.250'), (1, '9.980')] [2023-10-12 18:32:28,703][62635] Updated weights for policy 1, policy_version 66060 (0.0007) [2023-10-12 18:32:29,088][62635] Updated weights for policy 1, policy_version 66070 (0.0008) [2023-10-12 18:32:29,322][62634] Updated weights for policy 0, policy_version 66050 (0.0008) [2023-10-12 18:32:29,448][62635] Updated weights for policy 1, policy_version 66080 (0.0009) [2023-10-12 18:32:29,706][62634] Updated weights for policy 0, policy_version 66060 (0.0009) [2023-10-12 18:32:30,081][62634] Updated weights for policy 0, policy_version 66070 (0.0007) [2023-10-12 18:32:30,465][62634] Updated weights for policy 0, policy_version 66080 (0.0007) [2023-10-12 18:32:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 135331840. Throughput: 0: 1679.3, 1: 1686.5. Samples: 33848604. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:32:33,436][61643] Avg episode reward: [(0, '25.150'), (1, '9.920')] [2023-10-12 18:32:33,570][62635] Updated weights for policy 1, policy_version 66090 (0.0008) [2023-10-12 18:32:33,934][62635] Updated weights for policy 1, policy_version 66100 (0.0008) [2023-10-12 18:32:34,302][62635] Updated weights for policy 1, policy_version 66110 (0.0007) [2023-10-12 18:32:34,616][62634] Updated weights for policy 0, policy_version 66090 (0.0008) [2023-10-12 18:32:35,003][62634] Updated weights for policy 0, policy_version 66100 (0.0008) [2023-10-12 18:32:35,377][62634] Updated weights for policy 0, policy_version 66110 (0.0007) [2023-10-12 18:32:38,395][62635] Updated weights for policy 1, policy_version 66120 (0.0009) [2023-10-12 18:32:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135397376. Throughput: 0: 1664.6, 1: 1689.8. Samples: 33857654. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:32:38,436][61643] Avg episode reward: [(0, '25.340'), (1, '9.710')] [2023-10-12 18:32:38,767][62635] Updated weights for policy 1, policy_version 66130 (0.0008) [2023-10-12 18:32:39,129][62635] Updated weights for policy 1, policy_version 66140 (0.0008) [2023-10-12 18:32:39,493][62634] Updated weights for policy 0, policy_version 66120 (0.0008) [2023-10-12 18:32:39,869][62634] Updated weights for policy 0, policy_version 66130 (0.0007) [2023-10-12 18:32:40,258][62634] Updated weights for policy 0, policy_version 66140 (0.0010) [2023-10-12 18:32:42,982][62635] Updated weights for policy 1, policy_version 66150 (0.0007) [2023-10-12 18:32:43,352][62635] Updated weights for policy 1, policy_version 66160 (0.0007) [2023-10-12 18:32:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135462912. Throughput: 0: 1680.0, 1: 1698.4. Samples: 33878640. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:32:43,436][61643] Avg episode reward: [(0, '25.250'), (1, '9.980')] [2023-10-12 18:32:43,726][62635] Updated weights for policy 1, policy_version 66170 (0.0010) [2023-10-12 18:32:44,322][62634] Updated weights for policy 0, policy_version 66150 (0.0009) [2023-10-12 18:32:44,690][62634] Updated weights for policy 0, policy_version 66160 (0.0008) [2023-10-12 18:32:45,069][62634] Updated weights for policy 0, policy_version 66170 (0.0010) [2023-10-12 18:32:47,628][62635] Updated weights for policy 1, policy_version 66180 (0.0010) [2023-10-12 18:32:47,988][62635] Updated weights for policy 1, policy_version 66190 (0.0008) [2023-10-12 18:32:48,348][62635] Updated weights for policy 1, policy_version 66200 (0.0010) [2023-10-12 18:32:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 135528448. Throughput: 0: 1681.1, 1: 1685.9. Samples: 33898874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:32:48,436][61643] Avg episode reward: [(0, '25.280'), (1, '9.750')] [2023-10-12 18:32:49,184][62634] Updated weights for policy 0, policy_version 66180 (0.0007) [2023-10-12 18:32:49,558][62634] Updated weights for policy 0, policy_version 66190 (0.0007) [2023-10-12 18:32:49,933][62634] Updated weights for policy 0, policy_version 66200 (0.0007) [2023-10-12 18:32:52,474][62635] Updated weights for policy 1, policy_version 66210 (0.0010) [2023-10-12 18:32:52,837][62635] Updated weights for policy 1, policy_version 66220 (0.0008) [2023-10-12 18:32:53,212][62635] Updated weights for policy 1, policy_version 66230 (0.0009) [2023-10-12 18:32:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 135593984. Throughput: 0: 1671.6, 1: 1701.0. Samples: 33908534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:32:53,436][61643] Avg episode reward: [(0, '25.230'), (1, '9.590')] [2023-10-12 18:32:53,576][62635] Updated weights for policy 1, policy_version 66240 (0.0010) [2023-10-12 18:32:54,038][62634] Updated weights for policy 0, policy_version 66210 (0.0008) [2023-10-12 18:32:54,403][62634] Updated weights for policy 0, policy_version 66220 (0.0008) [2023-10-12 18:32:54,777][62634] Updated weights for policy 0, policy_version 66230 (0.0007) [2023-10-12 18:32:55,158][62634] Updated weights for policy 0, policy_version 66240 (0.0008) [2023-10-12 18:32:57,707][62635] Updated weights for policy 1, policy_version 66250 (0.0007) [2023-10-12 18:32:58,075][62635] Updated weights for policy 1, policy_version 66260 (0.0009) [2023-10-12 18:32:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135659520. Throughput: 0: 1681.4, 1: 1699.4. Samples: 33929292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:32:58,435][61643] Avg episode reward: [(0, '25.260'), (1, '9.860')] [2023-10-12 18:32:58,444][62635] Updated weights for policy 1, policy_version 66270 (0.0007) [2023-10-12 18:32:59,188][62634] Updated weights for policy 0, policy_version 66250 (0.0008) [2023-10-12 18:32:59,574][62634] Updated weights for policy 0, policy_version 66260 (0.0009) [2023-10-12 18:32:59,943][62634] Updated weights for policy 0, policy_version 66270 (0.0009) [2023-10-12 18:33:02,282][62635] Updated weights for policy 1, policy_version 66280 (0.0008) [2023-10-12 18:33:02,639][62635] Updated weights for policy 1, policy_version 66290 (0.0007) [2023-10-12 18:33:03,013][62635] Updated weights for policy 1, policy_version 66300 (0.0007) [2023-10-12 18:33:03,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135757824. Throughput: 0: 1679.0, 1: 1680.1. Samples: 33949184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:03,435][61643] Avg episode reward: [(0, '25.250'), (1, '9.680')] [2023-10-12 18:33:03,910][62634] Updated weights for policy 0, policy_version 66280 (0.0009) [2023-10-12 18:33:04,285][62634] Updated weights for policy 0, policy_version 66290 (0.0009) [2023-10-12 18:33:04,659][62634] Updated weights for policy 0, policy_version 66300 (0.0007) [2023-10-12 18:33:06,893][62635] Updated weights for policy 1, policy_version 66310 (0.0007) [2023-10-12 18:33:07,255][62635] Updated weights for policy 1, policy_version 66320 (0.0009) [2023-10-12 18:33:07,622][62635] Updated weights for policy 1, policy_version 66330 (0.0007) [2023-10-12 18:33:08,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135823360. Throughput: 0: 1682.3, 1: 1707.1. Samples: 33959624. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:08,436][61643] Avg episode reward: [(0, '25.300'), (1, '9.670')] [2023-10-12 18:33:08,677][62634] Updated weights for policy 0, policy_version 66310 (0.0009) [2023-10-12 18:33:09,050][62634] Updated weights for policy 0, policy_version 66320 (0.0008) [2023-10-12 18:33:09,433][62634] Updated weights for policy 0, policy_version 66330 (0.0007) [2023-10-12 18:33:11,610][62635] Updated weights for policy 1, policy_version 66340 (0.0007) [2023-10-12 18:33:11,976][62635] Updated weights for policy 1, policy_version 66350 (0.0008) [2023-10-12 18:33:12,338][62635] Updated weights for policy 1, policy_version 66360 (0.0008) [2023-10-12 18:33:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135888896. Throughput: 0: 1681.9, 1: 1694.5. Samples: 33979850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:13,435][61643] Avg episode reward: [(0, '25.130'), (1, '9.850')] [2023-10-12 18:33:13,540][62634] Updated weights for policy 0, policy_version 66340 (0.0007) [2023-10-12 18:33:13,912][62634] Updated weights for policy 0, policy_version 66350 (0.0008) [2023-10-12 18:33:14,282][62634] Updated weights for policy 0, policy_version 66360 (0.0007) [2023-10-12 18:33:16,410][62635] Updated weights for policy 1, policy_version 66370 (0.0007) [2023-10-12 18:33:16,763][62635] Updated weights for policy 1, policy_version 66380 (0.0009) [2023-10-12 18:33:17,135][62635] Updated weights for policy 1, policy_version 66390 (0.0008) [2023-10-12 18:33:17,504][62635] Updated weights for policy 1, policy_version 66400 (0.0007) [2023-10-12 18:33:18,287][62634] Updated weights for policy 0, policy_version 66370 (0.0010) [2023-10-12 18:33:18,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135954432. Throughput: 0: 1683.7, 1: 1682.3. Samples: 34000074. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:18,435][61643] Avg episode reward: [(0, '25.070'), (1, '9.680')] [2023-10-12 18:33:18,698][62634] Updated weights for policy 0, policy_version 66380 (0.0008) [2023-10-12 18:33:19,064][62634] Updated weights for policy 0, policy_version 66390 (0.0007) [2023-10-12 18:33:19,444][62634] Updated weights for policy 0, policy_version 66400 (0.0007) [2023-10-12 18:33:21,734][62635] Updated weights for policy 1, policy_version 66410 (0.0009) [2023-10-12 18:33:22,101][62635] Updated weights for policy 1, policy_version 66420 (0.0009) [2023-10-12 18:33:22,479][62635] Updated weights for policy 1, policy_version 66430 (0.0009) [2023-10-12 18:33:23,422][62634] Updated weights for policy 0, policy_version 66410 (0.0010) [2023-10-12 18:33:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 136019968. Throughput: 0: 1683.0, 1: 1709.4. Samples: 34010310. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:23,435][61643] Avg episode reward: [(0, '24.880'), (1, '9.850')] [2023-10-12 18:33:23,795][62634] Updated weights for policy 0, policy_version 66420 (0.0009) [2023-10-12 18:33:24,166][62634] Updated weights for policy 0, policy_version 66430 (0.0010) [2023-10-12 18:33:26,613][62635] Updated weights for policy 1, policy_version 66440 (0.0009) [2023-10-12 18:33:26,983][62635] Updated weights for policy 1, policy_version 66450 (0.0009) [2023-10-12 18:33:27,356][62635] Updated weights for policy 1, policy_version 66460 (0.0007) [2023-10-12 18:33:28,336][62634] Updated weights for policy 0, policy_version 66440 (0.0009) [2023-10-12 18:33:28,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136085504. Throughput: 0: 1689.8, 1: 1684.6. Samples: 34030490. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:28,436][61643] Avg episode reward: [(0, '24.880'), (1, '10.030')] [2023-10-12 18:33:28,725][62634] Updated weights for policy 0, policy_version 66450 (0.0010) [2023-10-12 18:33:29,105][62634] Updated weights for policy 0, policy_version 66460 (0.0010) [2023-10-12 18:33:31,509][62635] Updated weights for policy 1, policy_version 66470 (0.0008) [2023-10-12 18:33:31,875][62635] Updated weights for policy 1, policy_version 66480 (0.0008) [2023-10-12 18:33:32,250][62635] Updated weights for policy 1, policy_version 66490 (0.0010) [2023-10-12 18:33:33,198][62634] Updated weights for policy 0, policy_version 66470 (0.0008) [2023-10-12 18:33:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 136151040. Throughput: 0: 1687.8, 1: 1681.4. Samples: 34050488. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:33,435][61643] Avg episode reward: [(0, '24.910'), (1, '9.920')] [2023-10-12 18:33:33,567][62634] Updated weights for policy 0, policy_version 66480 (0.0007) [2023-10-12 18:33:33,947][62634] Updated weights for policy 0, policy_version 66490 (0.0010) [2023-10-12 18:33:36,145][62635] Updated weights for policy 1, policy_version 66500 (0.0010) [2023-10-12 18:33:36,513][62635] Updated weights for policy 1, policy_version 66510 (0.0008) [2023-10-12 18:33:36,880][62635] Updated weights for policy 1, policy_version 66520 (0.0009) [2023-10-12 18:33:37,856][62634] Updated weights for policy 0, policy_version 66500 (0.0007) [2023-10-12 18:33:38,227][62634] Updated weights for policy 0, policy_version 66510 (0.0009) [2023-10-12 18:33:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136216576. Throughput: 0: 1691.9, 1: 1696.6. Samples: 34061018. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:38,436][61643] Avg episode reward: [(0, '25.130'), (1, '9.880')] [2023-10-12 18:33:38,604][62634] Updated weights for policy 0, policy_version 66520 (0.0009) [2023-10-12 18:33:40,957][62635] Updated weights for policy 1, policy_version 66530 (0.0009) [2023-10-12 18:33:41,334][62635] Updated weights for policy 1, policy_version 66540 (0.0007) [2023-10-12 18:33:41,706][62635] Updated weights for policy 1, policy_version 66550 (0.0007) [2023-10-12 18:33:42,074][62635] Updated weights for policy 1, policy_version 66560 (0.0009) [2023-10-12 18:33:42,728][62634] Updated weights for policy 0, policy_version 66530 (0.0010) [2023-10-12 18:33:43,100][62634] Updated weights for policy 0, policy_version 66540 (0.0009) [2023-10-12 18:33:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136282112. Throughput: 0: 1690.5, 1: 1670.0. Samples: 34080516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:43,436][61643] Avg episode reward: [(0, '25.470'), (1, '10.090')] [2023-10-12 18:33:43,483][62634] Updated weights for policy 0, policy_version 66550 (0.0010) [2023-10-12 18:33:43,853][62634] Updated weights for policy 0, policy_version 66560 (0.0008) [2023-10-12 18:33:46,086][62635] Updated weights for policy 1, policy_version 66570 (0.0009) [2023-10-12 18:33:46,457][62635] Updated weights for policy 1, policy_version 66580 (0.0009) [2023-10-12 18:33:46,822][62635] Updated weights for policy 1, policy_version 66590 (0.0009) [2023-10-12 18:33:48,022][62634] Updated weights for policy 0, policy_version 66570 (0.0009) [2023-10-12 18:33:48,401][62634] Updated weights for policy 0, policy_version 66580 (0.0009) [2023-10-12 18:33:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 136347648. Throughput: 0: 1681.7, 1: 1690.1. Samples: 34100916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:48,435][61643] Avg episode reward: [(0, '25.450'), (1, '9.980')] [2023-10-12 18:33:48,774][62634] Updated weights for policy 0, policy_version 66590 (0.0008) [2023-10-12 18:33:50,921][62635] Updated weights for policy 1, policy_version 66600 (0.0008) [2023-10-12 18:33:51,293][62635] Updated weights for policy 1, policy_version 66610 (0.0009) [2023-10-12 18:33:51,657][62635] Updated weights for policy 1, policy_version 66620 (0.0007) [2023-10-12 18:33:52,852][62634] Updated weights for policy 0, policy_version 66600 (0.0008) [2023-10-12 18:33:53,224][62634] Updated weights for policy 0, policy_version 66610 (0.0007) [2023-10-12 18:33:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 136413184. Throughput: 0: 1685.1, 1: 1680.0. Samples: 34111050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:53,435][61643] Avg episode reward: [(0, '25.500'), (1, '9.880')] [2023-10-12 18:33:53,598][62634] Updated weights for policy 0, policy_version 66620 (0.0009) [2023-10-12 18:33:55,687][62635] Updated weights for policy 1, policy_version 66630 (0.0008) [2023-10-12 18:33:56,049][62635] Updated weights for policy 1, policy_version 66640 (0.0008) [2023-10-12 18:33:56,412][62635] Updated weights for policy 1, policy_version 66650 (0.0009) [2023-10-12 18:33:57,625][62634] Updated weights for policy 0, policy_version 66630 (0.0007) [2023-10-12 18:33:57,998][62634] Updated weights for policy 0, policy_version 66640 (0.0007) [2023-10-12 18:33:58,376][62634] Updated weights for policy 0, policy_version 66650 (0.0010) [2023-10-12 18:33:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136478720. Throughput: 0: 1688.8, 1: 1668.0. Samples: 34130904. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:33:58,435][61643] Avg episode reward: [(0, '25.400'), (1, '10.060')] [2023-10-12 18:34:00,618][62635] Updated weights for policy 1, policy_version 66660 (0.0009) [2023-10-12 18:34:00,992][62635] Updated weights for policy 1, policy_version 66670 (0.0009) [2023-10-12 18:34:01,365][62635] Updated weights for policy 1, policy_version 66680 (0.0008) [2023-10-12 18:34:02,312][62634] Updated weights for policy 0, policy_version 66660 (0.0010) [2023-10-12 18:34:02,688][62634] Updated weights for policy 0, policy_version 66670 (0.0009) [2023-10-12 18:34:03,067][62634] Updated weights for policy 0, policy_version 66680 (0.0010) [2023-10-12 18:34:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136577024. Throughput: 0: 1670.1, 1: 1680.1. Samples: 34150832. Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:03,435][61643] Avg episode reward: [(0, '25.490'), (1, '9.820')] [2023-10-12 18:34:05,371][62635] Updated weights for policy 1, policy_version 66690 (0.0009) [2023-10-12 18:34:05,730][62635] Updated weights for policy 1, policy_version 66700 (0.0007) [2023-10-12 18:34:06,109][62635] Updated weights for policy 1, policy_version 66710 (0.0008) [2023-10-12 18:34:06,464][62635] Updated weights for policy 1, policy_version 66720 (0.0009) [2023-10-12 18:34:07,344][62634] Updated weights for policy 0, policy_version 66690 (0.0008) [2023-10-12 18:34:07,747][62634] Updated weights for policy 0, policy_version 66700 (0.0009) [2023-10-12 18:34:08,128][62634] Updated weights for policy 0, policy_version 66710 (0.0010) [2023-10-12 18:34:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136609792. Throughput: 0: 1688.9, 1: 1664.1. Samples: 34161198. 
Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:08,435][61643] Avg episode reward: [(0, '25.420'), (1, '9.670')] [2023-10-12 18:34:08,503][62634] Updated weights for policy 0, policy_version 66720 (0.0010) [2023-10-12 18:34:10,547][62635] Updated weights for policy 1, policy_version 66730 (0.0009) [2023-10-12 18:34:10,916][62635] Updated weights for policy 1, policy_version 66740 (0.0009) [2023-10-12 18:34:11,297][62635] Updated weights for policy 1, policy_version 66750 (0.0007) [2023-10-12 18:34:12,401][62634] Updated weights for policy 0, policy_version 66730 (0.0009) [2023-10-12 18:34:12,782][62634] Updated weights for policy 0, policy_version 66740 (0.0007) [2023-10-12 18:34:13,166][62634] Updated weights for policy 0, policy_version 66750 (0.0007) [2023-10-12 18:34:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136708096. Throughput: 0: 1679.2, 1: 1671.1. Samples: 34181256. Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:13,436][61643] Avg episode reward: [(0, '25.460'), (1, '9.650')] [2023-10-12 18:34:15,486][62635] Updated weights for policy 1, policy_version 66760 (0.0009) [2023-10-12 18:34:15,860][62635] Updated weights for policy 1, policy_version 66770 (0.0009) [2023-10-12 18:34:16,228][62635] Updated weights for policy 1, policy_version 66780 (0.0011) [2023-10-12 18:34:17,272][62634] Updated weights for policy 0, policy_version 66760 (0.0008) [2023-10-12 18:34:17,646][62634] Updated weights for policy 0, policy_version 66770 (0.0008) [2023-10-12 18:34:18,026][62634] Updated weights for policy 0, policy_version 66780 (0.0008) [2023-10-12 18:34:18,435][61643] Fps is (10 sec: 16383.4, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 136773632. Throughput: 0: 1660.3, 1: 1687.8. Samples: 34201154. 
Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:18,436][61643] Avg episode reward: [(0, '25.470'), (1, '9.700')] [2023-10-12 18:34:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000066784_68386816.pth... [2023-10-12 18:34:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000066784_68386816.pth... [2023-10-12 18:34:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000065216_66781184.pth [2023-10-12 18:34:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000065216_66781184.pth [2023-10-12 18:34:20,102][62635] Updated weights for policy 1, policy_version 66790 (0.0010) [2023-10-12 18:34:20,476][62635] Updated weights for policy 1, policy_version 66800 (0.0009) [2023-10-12 18:34:20,845][62635] Updated weights for policy 1, policy_version 66810 (0.0008) [2023-10-12 18:34:22,158][62634] Updated weights for policy 0, policy_version 66790 (0.0009) [2023-10-12 18:34:22,528][62634] Updated weights for policy 0, policy_version 66800 (0.0008) [2023-10-12 18:34:22,911][62634] Updated weights for policy 0, policy_version 66810 (0.0009) [2023-10-12 18:34:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136839168. Throughput: 0: 1680.3, 1: 1659.4. Samples: 34211304. 
Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:23,435][61643] Avg episode reward: [(0, '25.430'), (1, '9.600')] [2023-10-12 18:34:25,043][62635] Updated weights for policy 1, policy_version 66820 (0.0008) [2023-10-12 18:34:25,409][62635] Updated weights for policy 1, policy_version 66830 (0.0007) [2023-10-12 18:34:25,776][62635] Updated weights for policy 1, policy_version 66840 (0.0008) [2023-10-12 18:34:26,855][62634] Updated weights for policy 0, policy_version 66820 (0.0008) [2023-10-12 18:34:27,234][62634] Updated weights for policy 0, policy_version 66830 (0.0008) [2023-10-12 18:34:27,605][62634] Updated weights for policy 0, policy_version 66840 (0.0009) [2023-10-12 18:34:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 136904704. Throughput: 0: 1675.2, 1: 1683.6. Samples: 34231658. Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:28,436][61643] Avg episode reward: [(0, '25.350'), (1, '9.700')] [2023-10-12 18:34:29,707][62635] Updated weights for policy 1, policy_version 66850 (0.0009) [2023-10-12 18:34:30,074][62635] Updated weights for policy 1, policy_version 66860 (0.0011) [2023-10-12 18:34:30,446][62635] Updated weights for policy 1, policy_version 66870 (0.0009) [2023-10-12 18:34:30,809][62635] Updated weights for policy 1, policy_version 66880 (0.0008) [2023-10-12 18:34:31,713][62634] Updated weights for policy 0, policy_version 66850 (0.0009) [2023-10-12 18:34:32,085][62634] Updated weights for policy 0, policy_version 66860 (0.0009) [2023-10-12 18:34:32,460][62634] Updated weights for policy 0, policy_version 66870 (0.0009) [2023-10-12 18:34:32,847][62634] Updated weights for policy 0, policy_version 66880 (0.0007) [2023-10-12 18:34:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 136970240. Throughput: 0: 1655.3, 1: 1681.0. Samples: 34251052. 
Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:33,435][61643] Avg episode reward: [(0, '25.400'), (1, '9.360')] [2023-10-12 18:34:34,989][62635] Updated weights for policy 1, policy_version 66890 (0.0009) [2023-10-12 18:34:35,353][62635] Updated weights for policy 1, policy_version 66900 (0.0008) [2023-10-12 18:34:35,726][62635] Updated weights for policy 1, policy_version 66910 (0.0008) [2023-10-12 18:34:36,725][62634] Updated weights for policy 0, policy_version 66890 (0.0008) [2023-10-12 18:34:37,110][62634] Updated weights for policy 0, policy_version 66900 (0.0009) [2023-10-12 18:34:37,478][62634] Updated weights for policy 0, policy_version 66910 (0.0010) [2023-10-12 18:34:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 137035776. Throughput: 0: 1678.4, 1: 1663.0. Samples: 34261410. Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) [2023-10-12 18:34:38,435][61643] Avg episode reward: [(0, '25.480'), (1, '9.410')] [2023-10-12 18:34:39,682][62635] Updated weights for policy 1, policy_version 66920 (0.0008) [2023-10-12 18:34:40,041][62635] Updated weights for policy 1, policy_version 66930 (0.0008) [2023-10-12 18:34:40,416][62635] Updated weights for policy 1, policy_version 66940 (0.0008) [2023-10-12 18:34:41,548][62634] Updated weights for policy 0, policy_version 66920 (0.0009) [2023-10-12 18:34:41,933][62634] Updated weights for policy 0, policy_version 66930 (0.0007) [2023-10-12 18:34:42,304][62634] Updated weights for policy 0, policy_version 66940 (0.0007) [2023-10-12 18:34:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137101312. Throughput: 0: 1661.6, 1: 1688.6. Samples: 34281666. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:34:43,436][61643] Avg episode reward: [(0, '24.950'), (1, '9.880')] [2023-10-12 18:34:44,557][62635] Updated weights for policy 1, policy_version 66950 (0.0007) [2023-10-12 18:34:44,918][62635] Updated weights for policy 1, policy_version 66960 (0.0007) [2023-10-12 18:34:45,281][62635] Updated weights for policy 1, policy_version 66970 (0.0007) [2023-10-12 18:34:46,436][62634] Updated weights for policy 0, policy_version 66950 (0.0010) [2023-10-12 18:34:46,826][62634] Updated weights for policy 0, policy_version 66960 (0.0010) [2023-10-12 18:34:47,199][62634] Updated weights for policy 0, policy_version 66970 (0.0007) [2023-10-12 18:34:48,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 137166848. Throughput: 0: 1663.2, 1: 1685.4. Samples: 34301522. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:34:48,437][61643] Avg episode reward: [(0, '25.000'), (1, '9.690')] [2023-10-12 18:34:49,350][62635] Updated weights for policy 1, policy_version 66980 (0.0008) [2023-10-12 18:34:49,706][62635] Updated weights for policy 1, policy_version 66990 (0.0009) [2023-10-12 18:34:50,079][62635] Updated weights for policy 1, policy_version 67000 (0.0008) [2023-10-12 18:34:51,379][62634] Updated weights for policy 0, policy_version 66980 (0.0007) [2023-10-12 18:34:51,753][62634] Updated weights for policy 0, policy_version 66990 (0.0008) [2023-10-12 18:34:52,126][62634] Updated weights for policy 0, policy_version 67000 (0.0009) [2023-10-12 18:34:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137232384. Throughput: 0: 1673.2, 1: 1670.2. Samples: 34311652. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:34:53,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.690')] [2023-10-12 18:34:54,184][62635] Updated weights for policy 1, policy_version 67010 (0.0009) [2023-10-12 18:34:54,554][62635] Updated weights for policy 1, policy_version 67020 (0.0011) [2023-10-12 18:34:54,923][62635] Updated weights for policy 1, policy_version 67030 (0.0007) [2023-10-12 18:34:55,296][62635] Updated weights for policy 1, policy_version 67040 (0.0008) [2023-10-12 18:34:56,221][62634] Updated weights for policy 0, policy_version 67010 (0.0011) [2023-10-12 18:34:56,621][62634] Updated weights for policy 0, policy_version 67020 (0.0010) [2023-10-12 18:34:57,003][62634] Updated weights for policy 0, policy_version 67030 (0.0007) [2023-10-12 18:34:57,374][62634] Updated weights for policy 0, policy_version 67040 (0.0010) [2023-10-12 18:34:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137297920. Throughput: 0: 1655.0, 1: 1681.7. Samples: 34331410. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:34:58,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.880')] [2023-10-12 18:34:59,415][62635] Updated weights for policy 1, policy_version 67050 (0.0008) [2023-10-12 18:34:59,780][62635] Updated weights for policy 1, policy_version 67060 (0.0008) [2023-10-12 18:35:00,145][62635] Updated weights for policy 1, policy_version 67070 (0.0008) [2023-10-12 18:35:01,432][62634] Updated weights for policy 0, policy_version 67050 (0.0009) [2023-10-12 18:35:01,818][62634] Updated weights for policy 0, policy_version 67060 (0.0008) [2023-10-12 18:35:02,198][62634] Updated weights for policy 0, policy_version 67070 (0.0009) [2023-10-12 18:35:03,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 137363456. Throughput: 0: 1664.9, 1: 1679.6. Samples: 34351656. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:03,436][61643] Avg episode reward: [(0, '24.660'), (1, '9.670')] [2023-10-12 18:35:04,211][62635] Updated weights for policy 1, policy_version 67080 (0.0008) [2023-10-12 18:35:04,593][62635] Updated weights for policy 1, policy_version 67090 (0.0008) [2023-10-12 18:35:04,953][62635] Updated weights for policy 1, policy_version 67100 (0.0008) [2023-10-12 18:35:06,225][62634] Updated weights for policy 0, policy_version 67080 (0.0008) [2023-10-12 18:35:06,595][62634] Updated weights for policy 0, policy_version 67090 (0.0008) [2023-10-12 18:35:06,966][62634] Updated weights for policy 0, policy_version 67100 (0.0009) [2023-10-12 18:35:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137428992. Throughput: 0: 1671.7, 1: 1676.4. Samples: 34361970. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:08,436][61643] Avg episode reward: [(0, '24.410'), (1, '9.950')] [2023-10-12 18:35:08,918][62635] Updated weights for policy 1, policy_version 67110 (0.0010) [2023-10-12 18:35:09,297][62635] Updated weights for policy 1, policy_version 67120 (0.0008) [2023-10-12 18:35:09,670][62635] Updated weights for policy 1, policy_version 67130 (0.0010) [2023-10-12 18:35:10,956][62634] Updated weights for policy 0, policy_version 67110 (0.0008) [2023-10-12 18:35:11,329][62634] Updated weights for policy 0, policy_version 67120 (0.0011) [2023-10-12 18:35:11,710][62634] Updated weights for policy 0, policy_version 67130 (0.0010) [2023-10-12 18:35:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137494528. Throughput: 0: 1649.7, 1: 1679.3. Samples: 34381462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:13,435][61643] Avg episode reward: [(0, '24.190'), (1, '10.000')] [2023-10-12 18:35:13,702][62635] Updated weights for policy 1, policy_version 67140 (0.0008) [2023-10-12 18:35:14,065][62635] Updated weights for policy 1, policy_version 67150 (0.0008) [2023-10-12 18:35:14,443][62635] Updated weights for policy 1, policy_version 67160 (0.0009) [2023-10-12 18:35:15,990][62634] Updated weights for policy 0, policy_version 67140 (0.0010) [2023-10-12 18:35:16,371][62634] Updated weights for policy 0, policy_version 67150 (0.0008) [2023-10-12 18:35:16,745][62634] Updated weights for policy 0, policy_version 67160 (0.0008) [2023-10-12 18:35:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 137560064. Throughput: 0: 1671.4, 1: 1683.7. Samples: 34402030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:18,436][61643] Avg episode reward: [(0, '24.030'), (1, '9.880')] [2023-10-12 18:35:18,484][62635] Updated weights for policy 1, policy_version 67170 (0.0008) [2023-10-12 18:35:18,851][62635] Updated weights for policy 1, policy_version 67180 (0.0010) [2023-10-12 18:35:19,223][62635] Updated weights for policy 1, policy_version 67190 (0.0009) [2023-10-12 18:35:19,597][62635] Updated weights for policy 1, policy_version 67200 (0.0008) [2023-10-12 18:35:20,819][62634] Updated weights for policy 0, policy_version 67170 (0.0008) [2023-10-12 18:35:21,202][62634] Updated weights for policy 0, policy_version 67180 (0.0010) [2023-10-12 18:35:21,587][62634] Updated weights for policy 0, policy_version 67190 (0.0008) [2023-10-12 18:35:21,955][62634] Updated weights for policy 0, policy_version 67200 (0.0009) [2023-10-12 18:35:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137625600. Throughput: 0: 1663.9, 1: 1684.4. Samples: 34412084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:23,435][61643] Avg episode reward: [(0, '23.700'), (1, '9.880')] [2023-10-12 18:35:23,610][62635] Updated weights for policy 1, policy_version 67210 (0.0008) [2023-10-12 18:35:23,978][62635] Updated weights for policy 1, policy_version 67220 (0.0010) [2023-10-12 18:35:24,341][62635] Updated weights for policy 1, policy_version 67230 (0.0010) [2023-10-12 18:35:26,000][62634] Updated weights for policy 0, policy_version 67210 (0.0010) [2023-10-12 18:35:26,377][62634] Updated weights for policy 0, policy_version 67220 (0.0010) [2023-10-12 18:35:26,752][62634] Updated weights for policy 0, policy_version 67230 (0.0010) [2023-10-12 18:35:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137691136. Throughput: 0: 1653.9, 1: 1684.3. Samples: 34431884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:28,435][61643] Avg episode reward: [(0, '23.410'), (1, '9.790')] [2023-10-12 18:35:28,485][62635] Updated weights for policy 1, policy_version 67240 (0.0008) [2023-10-12 18:35:28,859][62635] Updated weights for policy 1, policy_version 67250 (0.0009) [2023-10-12 18:35:29,224][62635] Updated weights for policy 1, policy_version 67260 (0.0009) [2023-10-12 18:35:30,974][62634] Updated weights for policy 0, policy_version 67240 (0.0007) [2023-10-12 18:35:31,348][62634] Updated weights for policy 0, policy_version 67250 (0.0007) [2023-10-12 18:35:31,715][62634] Updated weights for policy 0, policy_version 67260 (0.0007) [2023-10-12 18:35:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 137756672. Throughput: 0: 1671.3, 1: 1688.2. Samples: 34452700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:33,436][61643] Avg episode reward: [(0, '23.100'), (1, '9.840')] [2023-10-12 18:35:33,484][62635] Updated weights for policy 1, policy_version 67270 (0.0009) [2023-10-12 18:35:33,855][62635] Updated weights for policy 1, policy_version 67280 (0.0011) [2023-10-12 18:35:34,214][62635] Updated weights for policy 1, policy_version 67290 (0.0010) [2023-10-12 18:35:35,549][62634] Updated weights for policy 0, policy_version 67270 (0.0007) [2023-10-12 18:35:35,926][62634] Updated weights for policy 0, policy_version 67280 (0.0007) [2023-10-12 18:35:36,302][62634] Updated weights for policy 0, policy_version 67290 (0.0007) [2023-10-12 18:35:38,197][62635] Updated weights for policy 1, policy_version 67300 (0.0007) [2023-10-12 18:35:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137822208. Throughput: 0: 1662.1, 1: 1689.4. Samples: 34462468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:38,435][61643] Avg episode reward: [(0, '23.080'), (1, '9.850')] [2023-10-12 18:35:38,563][62635] Updated weights for policy 1, policy_version 67310 (0.0010) [2023-10-12 18:35:38,926][62635] Updated weights for policy 1, policy_version 67320 (0.0009) [2023-10-12 18:35:40,343][62634] Updated weights for policy 0, policy_version 67300 (0.0008) [2023-10-12 18:35:40,727][62634] Updated weights for policy 0, policy_version 67310 (0.0010) [2023-10-12 18:35:41,096][62634] Updated weights for policy 0, policy_version 67320 (0.0009) [2023-10-12 18:35:42,937][62635] Updated weights for policy 1, policy_version 67330 (0.0009) [2023-10-12 18:35:43,311][62635] Updated weights for policy 1, policy_version 67340 (0.0007) [2023-10-12 18:35:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137887744. Throughput: 0: 1667.7, 1: 1689.5. Samples: 34482482. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:43,436][61643] Avg episode reward: [(0, '22.820'), (1, '9.800')] [2023-10-12 18:35:43,673][62635] Updated weights for policy 1, policy_version 67350 (0.0009) [2023-10-12 18:35:44,034][62635] Updated weights for policy 1, policy_version 67360 (0.0009) [2023-10-12 18:35:45,281][62634] Updated weights for policy 0, policy_version 67330 (0.0010) [2023-10-12 18:35:45,689][62634] Updated weights for policy 0, policy_version 67340 (0.0008) [2023-10-12 18:35:46,070][62634] Updated weights for policy 0, policy_version 67350 (0.0010) [2023-10-12 18:35:46,449][62634] Updated weights for policy 0, policy_version 67360 (0.0011) [2023-10-12 18:35:48,107][62635] Updated weights for policy 1, policy_version 67370 (0.0009) [2023-10-12 18:35:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 137953280. Throughput: 0: 1678.9, 1: 1682.8. Samples: 34502930. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:48,435][61643] Avg episode reward: [(0, '22.870'), (1, '9.770')] [2023-10-12 18:35:48,465][62635] Updated weights for policy 1, policy_version 67380 (0.0009) [2023-10-12 18:35:48,840][62635] Updated weights for policy 1, policy_version 67390 (0.0008) [2023-10-12 18:35:50,490][62634] Updated weights for policy 0, policy_version 67370 (0.0008) [2023-10-12 18:35:50,871][62634] Updated weights for policy 0, policy_version 67380 (0.0008) [2023-10-12 18:35:51,246][62634] Updated weights for policy 0, policy_version 67390 (0.0007) [2023-10-12 18:35:52,998][62635] Updated weights for policy 1, policy_version 67400 (0.0007) [2023-10-12 18:35:53,368][62635] Updated weights for policy 1, policy_version 67410 (0.0009) [2023-10-12 18:35:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138018816. Throughput: 0: 1657.6, 1: 1690.3. Samples: 34512622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:53,435][61643] Avg episode reward: [(0, '22.880'), (1, '9.970')] [2023-10-12 18:35:53,743][62635] Updated weights for policy 1, policy_version 67420 (0.0011) [2023-10-12 18:35:55,379][62634] Updated weights for policy 0, policy_version 67400 (0.0007) [2023-10-12 18:35:55,770][62634] Updated weights for policy 0, policy_version 67410 (0.0009) [2023-10-12 18:35:56,143][62634] Updated weights for policy 0, policy_version 67420 (0.0008) [2023-10-12 18:35:57,779][62635] Updated weights for policy 1, policy_version 67430 (0.0008) [2023-10-12 18:35:58,144][62635] Updated weights for policy 1, policy_version 67440 (0.0008) [2023-10-12 18:35:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138084352. Throughput: 0: 1674.3, 1: 1682.7. Samples: 34532528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:35:58,436][61643] Avg episode reward: [(0, '23.170'), (1, '9.880')] [2023-10-12 18:35:58,514][62635] Updated weights for policy 1, policy_version 67450 (0.0009) [2023-10-12 18:35:59,990][62634] Updated weights for policy 0, policy_version 67430 (0.0007) [2023-10-12 18:36:00,366][62634] Updated weights for policy 0, policy_version 67440 (0.0008) [2023-10-12 18:36:00,737][62634] Updated weights for policy 0, policy_version 67450 (0.0008) [2023-10-12 18:36:02,587][62635] Updated weights for policy 1, policy_version 67460 (0.0010) [2023-10-12 18:36:02,956][62635] Updated weights for policy 1, policy_version 67470 (0.0009) [2023-10-12 18:36:03,321][62635] Updated weights for policy 1, policy_version 67480 (0.0009) [2023-10-12 18:36:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 138149888. Throughput: 0: 1686.9, 1: 1665.0. Samples: 34552864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:36:03,435][61643] Avg episode reward: [(0, '23.100'), (1, '9.970')] [2023-10-12 18:36:04,832][62634] Updated weights for policy 0, policy_version 67460 (0.0008) [2023-10-12 18:36:05,212][62634] Updated weights for policy 0, policy_version 67470 (0.0007) [2023-10-12 18:36:05,593][62634] Updated weights for policy 0, policy_version 67480 (0.0007) [2023-10-12 18:36:07,466][62635] Updated weights for policy 1, policy_version 67490 (0.0008) [2023-10-12 18:36:07,845][62635] Updated weights for policy 1, policy_version 67500 (0.0008) [2023-10-12 18:36:08,209][62635] Updated weights for policy 1, policy_version 67510 (0.0007) [2023-10-12 18:36:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 138215424. Throughput: 0: 1666.4, 1: 1678.8. Samples: 34562622. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:36:08,436][61643] Avg episode reward: [(0, '23.270'), (1, '9.830')] [2023-10-12 18:36:08,584][62635] Updated weights for policy 1, policy_version 67520 (0.0008) [2023-10-12 18:36:09,602][62634] Updated weights for policy 0, policy_version 67490 (0.0007) [2023-10-12 18:36:09,977][62634] Updated weights for policy 0, policy_version 67500 (0.0007) [2023-10-12 18:36:10,350][62634] Updated weights for policy 0, policy_version 67510 (0.0007) [2023-10-12 18:36:10,727][62634] Updated weights for policy 0, policy_version 67520 (0.0007) [2023-10-12 18:36:12,707][62635] Updated weights for policy 1, policy_version 67530 (0.0008) [2023-10-12 18:36:13,070][62635] Updated weights for policy 1, policy_version 67540 (0.0007) [2023-10-12 18:36:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138280960. Throughput: 0: 1690.8, 1: 1674.4. Samples: 34583318. 
Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:13,435][61643] Avg episode reward: [(0, '23.240'), (1, '9.930')] [2023-10-12 18:36:13,441][62635] Updated weights for policy 1, policy_version 67550 (0.0008) [2023-10-12 18:36:14,756][62634] Updated weights for policy 0, policy_version 67530 (0.0009) [2023-10-12 18:36:15,134][62634] Updated weights for policy 0, policy_version 67540 (0.0007) [2023-10-12 18:36:15,518][62634] Updated weights for policy 0, policy_version 67550 (0.0007) [2023-10-12 18:36:17,426][62635] Updated weights for policy 1, policy_version 67560 (0.0009) [2023-10-12 18:36:17,797][62635] Updated weights for policy 1, policy_version 67570 (0.0010) [2023-10-12 18:36:18,167][62635] Updated weights for policy 1, policy_version 67580 (0.0008) [2023-10-12 18:36:18,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 138379264. Throughput: 0: 1691.3, 1: 1655.0. Samples: 34603284. Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:18,435][61643] Avg episode reward: [(0, '23.090'), (1, '10.020')] [2023-10-12 18:36:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000067584_69206016.pth... [2023-10-12 18:36:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000067552_69173248.pth... 
[2023-10-12 18:36:18,481][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000065984_67567616.pth [2023-10-12 18:36:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000066016_67600384.pth [2023-10-12 18:36:19,532][62634] Updated weights for policy 0, policy_version 67560 (0.0009) [2023-10-12 18:36:19,924][62634] Updated weights for policy 0, policy_version 67570 (0.0009) [2023-10-12 18:36:20,310][62634] Updated weights for policy 0, policy_version 67580 (0.0009) [2023-10-12 18:36:22,386][62635] Updated weights for policy 1, policy_version 67590 (0.0009) [2023-10-12 18:36:22,754][62635] Updated weights for policy 1, policy_version 67600 (0.0007) [2023-10-12 18:36:23,120][62635] Updated weights for policy 1, policy_version 67610 (0.0007) [2023-10-12 18:36:23,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138444800. Throughput: 0: 1674.8, 1: 1674.5. Samples: 34613184. Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:23,435][61643] Avg episode reward: [(0, '23.510'), (1, '9.840')] [2023-10-12 18:36:24,296][62634] Updated weights for policy 0, policy_version 67590 (0.0008) [2023-10-12 18:36:24,677][62634] Updated weights for policy 0, policy_version 67600 (0.0009) [2023-10-12 18:36:25,062][62634] Updated weights for policy 0, policy_version 67610 (0.0009) [2023-10-12 18:36:27,197][62635] Updated weights for policy 1, policy_version 67620 (0.0010) [2023-10-12 18:36:27,569][62635] Updated weights for policy 1, policy_version 67630 (0.0008) [2023-10-12 18:36:27,931][62635] Updated weights for policy 1, policy_version 67640 (0.0007) [2023-10-12 18:36:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138510336. Throughput: 0: 1692.7, 1: 1673.1. Samples: 34633942. 
Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:28,436][61643] Avg episode reward: [(0, '23.510'), (1, '9.840')] [2023-10-12 18:36:29,099][62634] Updated weights for policy 0, policy_version 67620 (0.0009) [2023-10-12 18:36:29,480][62634] Updated weights for policy 0, policy_version 67630 (0.0007) [2023-10-12 18:36:29,854][62634] Updated weights for policy 0, policy_version 67640 (0.0008) [2023-10-12 18:36:32,043][62635] Updated weights for policy 1, policy_version 67650 (0.0007) [2023-10-12 18:36:32,406][62635] Updated weights for policy 1, policy_version 67660 (0.0010) [2023-10-12 18:36:32,775][62635] Updated weights for policy 1, policy_version 67670 (0.0010) [2023-10-12 18:36:33,153][62635] Updated weights for policy 1, policy_version 67680 (0.0011) [2023-10-12 18:36:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138575872. Throughput: 0: 1699.0, 1: 1655.3. Samples: 34653876. Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:33,436][61643] Avg episode reward: [(0, '23.670'), (1, '9.980')] [2023-10-12 18:36:33,828][62634] Updated weights for policy 0, policy_version 67650 (0.0007) [2023-10-12 18:36:34,232][62634] Updated weights for policy 0, policy_version 67660 (0.0007) [2023-10-12 18:36:34,599][62634] Updated weights for policy 0, policy_version 67670 (0.0007) [2023-10-12 18:36:34,973][62634] Updated weights for policy 0, policy_version 67680 (0.0007) [2023-10-12 18:36:37,128][62635] Updated weights for policy 1, policy_version 67690 (0.0008) [2023-10-12 18:36:37,493][62635] Updated weights for policy 1, policy_version 67700 (0.0007) [2023-10-12 18:36:37,862][62635] Updated weights for policy 1, policy_version 67710 (0.0009) [2023-10-12 18:36:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138641408. Throughput: 0: 1693.7, 1: 1674.7. Samples: 34664200. 
Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:38,435][61643] Avg episode reward: [(0, '24.120'), (1, '9.800')] [2023-10-12 18:36:38,996][62634] Updated weights for policy 0, policy_version 67690 (0.0008) [2023-10-12 18:36:39,377][62634] Updated weights for policy 0, policy_version 67700 (0.0007) [2023-10-12 18:36:39,744][62634] Updated weights for policy 0, policy_version 67710 (0.0007) [2023-10-12 18:36:41,942][62635] Updated weights for policy 1, policy_version 67720 (0.0007) [2023-10-12 18:36:42,320][62635] Updated weights for policy 1, policy_version 67730 (0.0008) [2023-10-12 18:36:42,692][62635] Updated weights for policy 1, policy_version 67740 (0.0007) [2023-10-12 18:36:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138706944. Throughput: 0: 1704.4, 1: 1674.7. Samples: 34684588. Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:43,436][61643] Avg episode reward: [(0, '24.210'), (1, '9.800')] [2023-10-12 18:36:43,810][62634] Updated weights for policy 0, policy_version 67720 (0.0009) [2023-10-12 18:36:44,184][62634] Updated weights for policy 0, policy_version 67730 (0.0009) [2023-10-12 18:36:44,560][62634] Updated weights for policy 0, policy_version 67740 (0.0009) [2023-10-12 18:36:46,630][62635] Updated weights for policy 1, policy_version 67750 (0.0009) [2023-10-12 18:36:46,991][62635] Updated weights for policy 1, policy_version 67760 (0.0010) [2023-10-12 18:36:47,368][62635] Updated weights for policy 1, policy_version 67770 (0.0008) [2023-10-12 18:36:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138772480. Throughput: 0: 1700.3, 1: 1672.4. Samples: 34704634. 
Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:48,435][61643] Avg episode reward: [(0, '24.560'), (1, '10.200')] [2023-10-12 18:36:48,566][62634] Updated weights for policy 0, policy_version 67750 (0.0008) [2023-10-12 18:36:48,944][62634] Updated weights for policy 0, policy_version 67760 (0.0008) [2023-10-12 18:36:49,328][62634] Updated weights for policy 0, policy_version 67770 (0.0007) [2023-10-12 18:36:51,348][62635] Updated weights for policy 1, policy_version 67780 (0.0007) [2023-10-12 18:36:51,723][62635] Updated weights for policy 1, policy_version 67790 (0.0009) [2023-10-12 18:36:52,089][62635] Updated weights for policy 1, policy_version 67800 (0.0009) [2023-10-12 18:36:52,983][62634] Updated weights for policy 0, policy_version 67780 (0.0007) [2023-10-12 18:36:53,360][62634] Updated weights for policy 0, policy_version 67790 (0.0008) [2023-10-12 18:36:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138838016. Throughput: 0: 1700.7, 1: 1689.2. Samples: 34715166. Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) [2023-10-12 18:36:53,435][61643] Avg episode reward: [(0, '24.450'), (1, '9.890')] [2023-10-12 18:36:53,739][62634] Updated weights for policy 0, policy_version 67800 (0.0008) [2023-10-12 18:36:56,208][62635] Updated weights for policy 1, policy_version 67810 (0.0009) [2023-10-12 18:36:56,578][62635] Updated weights for policy 1, policy_version 67820 (0.0007) [2023-10-12 18:36:56,941][62635] Updated weights for policy 1, policy_version 67830 (0.0007) [2023-10-12 18:36:57,305][62635] Updated weights for policy 1, policy_version 67840 (0.0008) [2023-10-12 18:36:57,826][62634] Updated weights for policy 0, policy_version 67810 (0.0009) [2023-10-12 18:36:58,208][62634] Updated weights for policy 0, policy_version 67820 (0.0008) [2023-10-12 18:36:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 138903552. Throughput: 0: 1706.4, 1: 1668.0. 
Samples: 34735164. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:36:58,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.730')] [2023-10-12 18:36:58,590][62634] Updated weights for policy 0, policy_version 67830 (0.0010) [2023-10-12 18:36:58,968][62634] Updated weights for policy 0, policy_version 67840 (0.0009) [2023-10-12 18:37:01,492][62635] Updated weights for policy 1, policy_version 67850 (0.0008) [2023-10-12 18:37:01,854][62635] Updated weights for policy 1, policy_version 67860 (0.0009) [2023-10-12 18:37:02,227][62635] Updated weights for policy 1, policy_version 67870 (0.0008) [2023-10-12 18:37:02,988][62634] Updated weights for policy 0, policy_version 67850 (0.0009) [2023-10-12 18:37:03,367][62634] Updated weights for policy 0, policy_version 67860 (0.0010) [2023-10-12 18:37:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138969088. Throughput: 0: 1698.9, 1: 1678.8. Samples: 34755282. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:03,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.900')] [2023-10-12 18:37:03,749][62634] Updated weights for policy 0, policy_version 67870 (0.0008) [2023-10-12 18:37:06,343][62635] Updated weights for policy 1, policy_version 67880 (0.0010) [2023-10-12 18:37:06,719][62635] Updated weights for policy 1, policy_version 67890 (0.0011) [2023-10-12 18:37:07,088][62635] Updated weights for policy 1, policy_version 67900 (0.0010) [2023-10-12 18:37:07,859][62634] Updated weights for policy 0, policy_version 67880 (0.0008) [2023-10-12 18:37:08,245][62634] Updated weights for policy 0, policy_version 67890 (0.0007) [2023-10-12 18:37:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 139034624. Throughput: 0: 1703.0, 1: 1690.4. Samples: 34765888. 
Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:08,435][61643] Avg episode reward: [(0, '24.740'), (1, '9.900')] [2023-10-12 18:37:08,616][62634] Updated weights for policy 0, policy_version 67900 (0.0008) [2023-10-12 18:37:11,159][62635] Updated weights for policy 1, policy_version 67910 (0.0008) [2023-10-12 18:37:11,541][62635] Updated weights for policy 1, policy_version 67920 (0.0009) [2023-10-12 18:37:11,914][62635] Updated weights for policy 1, policy_version 67930 (0.0009) [2023-10-12 18:37:12,701][62634] Updated weights for policy 0, policy_version 67910 (0.0010) [2023-10-12 18:37:13,088][62634] Updated weights for policy 0, policy_version 67920 (0.0010) [2023-10-12 18:37:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139100160. Throughput: 0: 1705.5, 1: 1662.8. Samples: 34785514. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:13,435][61643] Avg episode reward: [(0, '24.820'), (1, '9.810')] [2023-10-12 18:37:13,454][62634] Updated weights for policy 0, policy_version 67930 (0.0009) [2023-10-12 18:37:16,009][62635] Updated weights for policy 1, policy_version 67940 (0.0009) [2023-10-12 18:37:16,379][62635] Updated weights for policy 1, policy_version 67950 (0.0007) [2023-10-12 18:37:16,746][62635] Updated weights for policy 1, policy_version 67960 (0.0007) [2023-10-12 18:37:17,456][62634] Updated weights for policy 0, policy_version 67940 (0.0008) [2023-10-12 18:37:17,834][62634] Updated weights for policy 0, policy_version 67950 (0.0010) [2023-10-12 18:37:18,215][62634] Updated weights for policy 0, policy_version 67960 (0.0010) [2023-10-12 18:37:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 139165696. Throughput: 0: 1681.8, 1: 1686.5. Samples: 34805448. 
Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:18,435][61643] Avg episode reward: [(0, '24.640'), (1, '9.830')] [2023-10-12 18:37:20,810][62635] Updated weights for policy 1, policy_version 67970 (0.0009) [2023-10-12 18:37:21,173][62635] Updated weights for policy 1, policy_version 67980 (0.0007) [2023-10-12 18:37:21,539][62635] Updated weights for policy 1, policy_version 67990 (0.0011) [2023-10-12 18:37:21,909][62635] Updated weights for policy 1, policy_version 68000 (0.0010) [2023-10-12 18:37:22,245][62634] Updated weights for policy 0, policy_version 67970 (0.0010) [2023-10-12 18:37:22,650][62634] Updated weights for policy 0, policy_version 67980 (0.0009) [2023-10-12 18:37:23,025][62634] Updated weights for policy 0, policy_version 67990 (0.0008) [2023-10-12 18:37:23,404][62634] Updated weights for policy 0, policy_version 68000 (0.0010) [2023-10-12 18:37:23,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 139264000. Throughput: 0: 1694.6, 1: 1682.8. Samples: 34816184. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:23,435][61643] Avg episode reward: [(0, '24.570'), (1, '9.860')] [2023-10-12 18:37:25,841][62635] Updated weights for policy 1, policy_version 68010 (0.0009) [2023-10-12 18:37:26,211][62635] Updated weights for policy 1, policy_version 68020 (0.0007) [2023-10-12 18:37:26,574][62635] Updated weights for policy 1, policy_version 68030 (0.0008) [2023-10-12 18:37:27,461][62634] Updated weights for policy 0, policy_version 68010 (0.0010) [2023-10-12 18:37:27,836][62634] Updated weights for policy 0, policy_version 68020 (0.0010) [2023-10-12 18:37:28,203][62634] Updated weights for policy 0, policy_version 68030 (0.0009) [2023-10-12 18:37:28,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 139329536. Throughput: 0: 1696.9, 1: 1666.2. Samples: 34835928. 
Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:28,435][61643] Avg episode reward: [(0, '24.480'), (1, '9.940')] [2023-10-12 18:37:30,671][62635] Updated weights for policy 1, policy_version 68040 (0.0007) [2023-10-12 18:37:31,044][62635] Updated weights for policy 1, policy_version 68050 (0.0008) [2023-10-12 18:37:31,406][62635] Updated weights for policy 1, policy_version 68060 (0.0007) [2023-10-12 18:37:32,174][62634] Updated weights for policy 0, policy_version 68040 (0.0010) [2023-10-12 18:37:32,545][62634] Updated weights for policy 0, policy_version 68050 (0.0010) [2023-10-12 18:37:32,915][62634] Updated weights for policy 0, policy_version 68060 (0.0010) [2023-10-12 18:37:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 139395072. Throughput: 0: 1671.6, 1: 1689.8. Samples: 34855896. Policy #0 lag: (min: 14.0, avg: 38.0, max: 40.0) [2023-10-12 18:37:33,435][61643] Avg episode reward: [(0, '24.680'), (1, '10.140')] [2023-10-12 18:37:35,304][62635] Updated weights for policy 1, policy_version 68070 (0.0007) [2023-10-12 18:37:35,674][62635] Updated weights for policy 1, policy_version 68080 (0.0007) [2023-10-12 18:37:36,048][62635] Updated weights for policy 1, policy_version 68090 (0.0009) [2023-10-12 18:37:36,933][62634] Updated weights for policy 0, policy_version 68070 (0.0009) [2023-10-12 18:37:37,310][62634] Updated weights for policy 0, policy_version 68080 (0.0009) [2023-10-12 18:37:37,682][62634] Updated weights for policy 0, policy_version 68090 (0.0008) [2023-10-12 18:37:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 139460608. Throughput: 0: 1697.1, 1: 1668.4. Samples: 34866612. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:37:38,435][61643] Avg episode reward: [(0, '24.870'), (1, '9.970')] [2023-10-12 18:37:40,153][62635] Updated weights for policy 1, policy_version 68100 (0.0008) [2023-10-12 18:37:40,522][62635] Updated weights for policy 1, policy_version 68110 (0.0007) [2023-10-12 18:37:40,900][62635] Updated weights for policy 1, policy_version 68120 (0.0008) [2023-10-12 18:37:41,959][62634] Updated weights for policy 0, policy_version 68100 (0.0008) [2023-10-12 18:37:42,335][62634] Updated weights for policy 0, policy_version 68110 (0.0008) [2023-10-12 18:37:42,713][62634] Updated weights for policy 0, policy_version 68120 (0.0009) [2023-10-12 18:37:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 139526144. Throughput: 0: 1685.4, 1: 1684.9. Samples: 34886830. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:37:43,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.970')] [2023-10-12 18:37:44,924][62635] Updated weights for policy 1, policy_version 68130 (0.0009) [2023-10-12 18:37:45,285][62635] Updated weights for policy 1, policy_version 68140 (0.0009) [2023-10-12 18:37:45,654][62635] Updated weights for policy 1, policy_version 68150 (0.0007) [2023-10-12 18:37:46,016][62635] Updated weights for policy 1, policy_version 68160 (0.0009) [2023-10-12 18:37:46,674][62634] Updated weights for policy 0, policy_version 68130 (0.0008) [2023-10-12 18:37:47,056][62634] Updated weights for policy 0, policy_version 68140 (0.0010) [2023-10-12 18:37:47,435][62634] Updated weights for policy 0, policy_version 68150 (0.0007) [2023-10-12 18:37:47,809][62634] Updated weights for policy 0, policy_version 68160 (0.0008) [2023-10-12 18:37:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 139591680. Throughput: 0: 1663.7, 1: 1697.0. Samples: 34906512. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:37:48,435][61643] Avg episode reward: [(0, '24.750'), (1, '10.110')] [2023-10-12 18:37:49,999][62635] Updated weights for policy 1, policy_version 68170 (0.0008) [2023-10-12 18:37:50,371][62635] Updated weights for policy 1, policy_version 68180 (0.0009) [2023-10-12 18:37:50,747][62635] Updated weights for policy 1, policy_version 68190 (0.0008) [2023-10-12 18:37:51,753][62634] Updated weights for policy 0, policy_version 68170 (0.0007) [2023-10-12 18:37:52,128][62634] Updated weights for policy 0, policy_version 68180 (0.0008) [2023-10-12 18:37:52,507][62634] Updated weights for policy 0, policy_version 68190 (0.0010) [2023-10-12 18:37:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 139657216. Throughput: 0: 1690.9, 1: 1666.6. Samples: 34916978. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:37:53,435][61643] Avg episode reward: [(0, '24.920'), (1, '10.020')] [2023-10-12 18:37:54,807][62635] Updated weights for policy 1, policy_version 68200 (0.0008) [2023-10-12 18:37:55,166][62635] Updated weights for policy 1, policy_version 68210 (0.0008) [2023-10-12 18:37:55,534][62635] Updated weights for policy 1, policy_version 68220 (0.0008) [2023-10-12 18:37:56,518][62634] Updated weights for policy 0, policy_version 68200 (0.0007) [2023-10-12 18:37:56,895][62634] Updated weights for policy 0, policy_version 68210 (0.0009) [2023-10-12 18:37:57,268][62634] Updated weights for policy 0, policy_version 68220 (0.0010) [2023-10-12 18:37:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139722752. Throughput: 0: 1670.5, 1: 1699.2. Samples: 34937152. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:37:58,436][61643] Avg episode reward: [(0, '25.010'), (1, '10.040')] [2023-10-12 18:37:59,519][62635] Updated weights for policy 1, policy_version 68230 (0.0008) [2023-10-12 18:37:59,888][62635] Updated weights for policy 1, policy_version 68240 (0.0008) [2023-10-12 18:38:00,255][62635] Updated weights for policy 1, policy_version 68250 (0.0009) [2023-10-12 18:38:01,234][62634] Updated weights for policy 0, policy_version 68230 (0.0009) [2023-10-12 18:38:01,613][62634] Updated weights for policy 0, policy_version 68240 (0.0008) [2023-10-12 18:38:01,980][62634] Updated weights for policy 0, policy_version 68250 (0.0008) [2023-10-12 18:38:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139788288. Throughput: 0: 1675.4, 1: 1702.4. Samples: 34957448. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:38:03,436][61643] Avg episode reward: [(0, '24.850'), (1, '10.130')] [2023-10-12 18:38:04,258][62635] Updated weights for policy 1, policy_version 68260 (0.0009) [2023-10-12 18:38:04,631][62635] Updated weights for policy 1, policy_version 68270 (0.0008) [2023-10-12 18:38:05,011][62635] Updated weights for policy 1, policy_version 68280 (0.0008) [2023-10-12 18:38:05,935][62634] Updated weights for policy 0, policy_version 68260 (0.0008) [2023-10-12 18:38:06,307][62634] Updated weights for policy 0, policy_version 68270 (0.0009) [2023-10-12 18:38:06,693][62634] Updated weights for policy 0, policy_version 68280 (0.0009) [2023-10-12 18:38:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 139853824. Throughput: 0: 1690.3, 1: 1676.7. Samples: 34967698. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:38:08,435][61643] Avg episode reward: [(0, '24.910'), (1, '10.040')] [2023-10-12 18:38:09,123][62635] Updated weights for policy 1, policy_version 68290 (0.0007) [2023-10-12 18:38:09,488][62635] Updated weights for policy 1, policy_version 68300 (0.0007) [2023-10-12 18:38:09,863][62635] Updated weights for policy 1, policy_version 68310 (0.0008) [2023-10-12 18:38:10,231][62635] Updated weights for policy 1, policy_version 68320 (0.0010) [2023-10-12 18:38:10,823][62634] Updated weights for policy 0, policy_version 68290 (0.0011) [2023-10-12 18:38:11,222][62634] Updated weights for policy 0, policy_version 68300 (0.0007) [2023-10-12 18:38:11,601][62634] Updated weights for policy 0, policy_version 68310 (0.0009) [2023-10-12 18:38:11,971][62634] Updated weights for policy 0, policy_version 68320 (0.0007) [2023-10-12 18:38:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 139919360. Throughput: 0: 1663.3, 1: 1701.0. Samples: 34987320. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:38:13,435][61643] Avg episode reward: [(0, '24.900'), (1, '9.950')] [2023-10-12 18:38:14,369][62635] Updated weights for policy 1, policy_version 68330 (0.0008) [2023-10-12 18:38:14,741][62635] Updated weights for policy 1, policy_version 68340 (0.0008) [2023-10-12 18:38:15,115][62635] Updated weights for policy 1, policy_version 68350 (0.0010) [2023-10-12 18:38:16,079][62634] Updated weights for policy 0, policy_version 68330 (0.0007) [2023-10-12 18:38:16,452][62634] Updated weights for policy 0, policy_version 68340 (0.0009) [2023-10-12 18:38:16,824][62634] Updated weights for policy 0, policy_version 68350 (0.0008) [2023-10-12 18:38:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139984896. Throughput: 0: 1681.8, 1: 1694.0. Samples: 35007804. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-12 18:38:18,436][61643] Avg episode reward: [(0, '25.030'), (1, '10.140')] [2023-10-12 18:38:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000068352_69992448.pth... [2023-10-12 18:38:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000068352_69992448.pth... [2023-10-12 18:38:18,475][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000066784_68386816.pth [2023-10-12 18:38:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000066784_68386816.pth [2023-10-12 18:38:19,380][62635] Updated weights for policy 1, policy_version 68360 (0.0009) [2023-10-12 18:38:19,755][62635] Updated weights for policy 1, policy_version 68370 (0.0010) [2023-10-12 18:38:20,112][62635] Updated weights for policy 1, policy_version 68380 (0.0010) [2023-10-12 18:38:20,982][62634] Updated weights for policy 0, policy_version 68360 (0.0008) [2023-10-12 18:38:21,353][62634] Updated weights for policy 0, policy_version 68370 (0.0007) [2023-10-12 18:38:21,723][62634] Updated weights for policy 0, policy_version 68380 (0.0008) [2023-10-12 18:38:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140050432. Throughput: 0: 1679.2, 1: 1678.3. Samples: 35017696. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:23,435][61643] Avg episode reward: [(0, '25.130'), (1, '10.150')] [2023-10-12 18:38:24,207][62635] Updated weights for policy 1, policy_version 68390 (0.0009) [2023-10-12 18:38:24,581][62635] Updated weights for policy 1, policy_version 68400 (0.0010) [2023-10-12 18:38:24,937][62635] Updated weights for policy 1, policy_version 68410 (0.0007) [2023-10-12 18:38:25,846][62634] Updated weights for policy 0, policy_version 68390 (0.0008) [2023-10-12 18:38:26,222][62634] Updated weights for policy 0, policy_version 68400 (0.0008) [2023-10-12 18:38:26,607][62634] Updated weights for policy 0, policy_version 68410 (0.0007) [2023-10-12 18:38:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140115968. Throughput: 0: 1660.2, 1: 1682.5. Samples: 35037252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:28,435][61643] Avg episode reward: [(0, '25.160'), (1, '9.970')] [2023-10-12 18:38:29,038][62635] Updated weights for policy 1, policy_version 68420 (0.0009) [2023-10-12 18:38:29,407][62635] Updated weights for policy 1, policy_version 68430 (0.0009) [2023-10-12 18:38:29,778][62635] Updated weights for policy 1, policy_version 68440 (0.0010) [2023-10-12 18:38:30,443][62634] Updated weights for policy 0, policy_version 68420 (0.0008) [2023-10-12 18:38:30,813][62634] Updated weights for policy 0, policy_version 68430 (0.0008) [2023-10-12 18:38:31,195][62634] Updated weights for policy 0, policy_version 68440 (0.0007) [2023-10-12 18:38:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140181504. Throughput: 0: 1689.1, 1: 1679.2. Samples: 35058082. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:33,436][61643] Avg episode reward: [(0, '25.170'), (1, '9.970')] [2023-10-12 18:38:33,908][62635] Updated weights for policy 1, policy_version 68450 (0.0010) [2023-10-12 18:38:34,273][62635] Updated weights for policy 1, policy_version 68460 (0.0009) [2023-10-12 18:38:34,648][62635] Updated weights for policy 1, policy_version 68470 (0.0010) [2023-10-12 18:38:35,022][62635] Updated weights for policy 1, policy_version 68480 (0.0008) [2023-10-12 18:38:35,255][62634] Updated weights for policy 0, policy_version 68450 (0.0009) [2023-10-12 18:38:35,632][62634] Updated weights for policy 0, policy_version 68460 (0.0010) [2023-10-12 18:38:36,011][62634] Updated weights for policy 0, policy_version 68470 (0.0010) [2023-10-12 18:38:36,388][62634] Updated weights for policy 0, policy_version 68480 (0.0010) [2023-10-12 18:38:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140247040. Throughput: 0: 1670.0, 1: 1679.3. Samples: 35067696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:38,435][61643] Avg episode reward: [(0, '25.210'), (1, '10.150')] [2023-10-12 18:38:38,970][62635] Updated weights for policy 1, policy_version 68490 (0.0007) [2023-10-12 18:38:39,333][62635] Updated weights for policy 1, policy_version 68500 (0.0010) [2023-10-12 18:38:39,697][62635] Updated weights for policy 1, policy_version 68510 (0.0008) [2023-10-12 18:38:40,593][62634] Updated weights for policy 0, policy_version 68490 (0.0007) [2023-10-12 18:38:40,978][62634] Updated weights for policy 0, policy_version 68500 (0.0008) [2023-10-12 18:38:41,344][62634] Updated weights for policy 0, policy_version 68510 (0.0009) [2023-10-12 18:38:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140312576. Throughput: 0: 1672.8, 1: 1676.6. Samples: 35087872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:43,436][61643] Avg episode reward: [(0, '25.040'), (1, '9.840')] [2023-10-12 18:38:43,615][62635] Updated weights for policy 1, policy_version 68520 (0.0010) [2023-10-12 18:38:43,988][62635] Updated weights for policy 1, policy_version 68530 (0.0010) [2023-10-12 18:38:44,348][62635] Updated weights for policy 1, policy_version 68540 (0.0009) [2023-10-12 18:38:45,312][62634] Updated weights for policy 0, policy_version 68520 (0.0008) [2023-10-12 18:38:45,687][62634] Updated weights for policy 0, policy_version 68530 (0.0008) [2023-10-12 18:38:46,061][62634] Updated weights for policy 0, policy_version 68540 (0.0008) [2023-10-12 18:38:48,340][62635] Updated weights for policy 1, policy_version 68550 (0.0009) [2023-10-12 18:38:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140378112. Throughput: 0: 1685.3, 1: 1675.6. Samples: 35108688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:48,435][61643] Avg episode reward: [(0, '25.120'), (1, '9.840')] [2023-10-12 18:38:48,709][62635] Updated weights for policy 1, policy_version 68560 (0.0010) [2023-10-12 18:38:49,076][62635] Updated weights for policy 1, policy_version 68570 (0.0009) [2023-10-12 18:38:50,080][62634] Updated weights for policy 0, policy_version 68550 (0.0008) [2023-10-12 18:38:50,454][62634] Updated weights for policy 0, policy_version 68560 (0.0007) [2023-10-12 18:38:50,832][62634] Updated weights for policy 0, policy_version 68570 (0.0007) [2023-10-12 18:38:53,148][62635] Updated weights for policy 1, policy_version 68580 (0.0009) [2023-10-12 18:38:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140443648. Throughput: 0: 1658.3, 1: 1677.2. Samples: 35117798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:53,435][61643] Avg episode reward: [(0, '25.130'), (1, '9.930')] [2023-10-12 18:38:53,513][62635] Updated weights for policy 1, policy_version 68590 (0.0008) [2023-10-12 18:38:53,882][62635] Updated weights for policy 1, policy_version 68600 (0.0009) [2023-10-12 18:38:54,843][62634] Updated weights for policy 0, policy_version 68580 (0.0008) [2023-10-12 18:38:55,223][62634] Updated weights for policy 0, policy_version 68590 (0.0009) [2023-10-12 18:38:55,602][62634] Updated weights for policy 0, policy_version 68600 (0.0009) [2023-10-12 18:38:58,047][62635] Updated weights for policy 1, policy_version 68610 (0.0009) [2023-10-12 18:38:58,410][62635] Updated weights for policy 1, policy_version 68620 (0.0008) [2023-10-12 18:38:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140509184. Throughput: 0: 1684.2, 1: 1675.1. Samples: 35138488. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:38:58,436][61643] Avg episode reward: [(0, '25.240'), (1, '9.920')] [2023-10-12 18:38:58,794][62635] Updated weights for policy 1, policy_version 68630 (0.0009) [2023-10-12 18:38:59,152][62635] Updated weights for policy 1, policy_version 68640 (0.0009) [2023-10-12 18:38:59,660][62634] Updated weights for policy 0, policy_version 68610 (0.0009) [2023-10-12 18:39:00,068][62634] Updated weights for policy 0, policy_version 68620 (0.0010) [2023-10-12 18:39:00,439][62634] Updated weights for policy 0, policy_version 68630 (0.0008) [2023-10-12 18:39:00,823][62634] Updated weights for policy 0, policy_version 68640 (0.0009) [2023-10-12 18:39:03,143][62635] Updated weights for policy 1, policy_version 68650 (0.0007) [2023-10-12 18:39:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 140574720. Throughput: 0: 1692.7, 1: 1669.8. Samples: 35159116. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:39:03,435][61643] Avg episode reward: [(0, '25.320'), (1, '9.850')] [2023-10-12 18:39:03,500][62635] Updated weights for policy 1, policy_version 68660 (0.0009) [2023-10-12 18:39:03,864][62635] Updated weights for policy 1, policy_version 68670 (0.0007) [2023-10-12 18:39:04,819][62634] Updated weights for policy 0, policy_version 68650 (0.0007) [2023-10-12 18:39:05,204][62634] Updated weights for policy 0, policy_version 68660 (0.0007) [2023-10-12 18:39:05,574][62634] Updated weights for policy 0, policy_version 68670 (0.0009) [2023-10-12 18:39:08,293][62635] Updated weights for policy 1, policy_version 68680 (0.0008) [2023-10-12 18:39:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140640256. Throughput: 0: 1669.1, 1: 1683.6. Samples: 35168568. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:08,436][61643] Avg episode reward: [(0, '25.210'), (1, '9.850')] [2023-10-12 18:39:08,661][62635] Updated weights for policy 1, policy_version 68690 (0.0009) [2023-10-12 18:39:09,030][62635] Updated weights for policy 1, policy_version 68700 (0.0008) [2023-10-12 18:39:09,591][62634] Updated weights for policy 0, policy_version 68680 (0.0009) [2023-10-12 18:39:09,955][62634] Updated weights for policy 0, policy_version 68690 (0.0008) [2023-10-12 18:39:10,336][62634] Updated weights for policy 0, policy_version 68700 (0.0008) [2023-10-12 18:39:13,095][62635] Updated weights for policy 1, policy_version 68710 (0.0010) [2023-10-12 18:39:13,434][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140705792. Throughput: 0: 1698.4, 1: 1682.3. Samples: 35189386. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:13,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.800')] [2023-10-12 18:39:13,466][62635] Updated weights for policy 1, policy_version 68720 (0.0010) [2023-10-12 18:39:13,834][62635] Updated weights for policy 1, policy_version 68730 (0.0009) [2023-10-12 18:39:14,293][62634] Updated weights for policy 0, policy_version 68710 (0.0008) [2023-10-12 18:39:14,666][62634] Updated weights for policy 0, policy_version 68720 (0.0008) [2023-10-12 18:39:15,039][62634] Updated weights for policy 0, policy_version 68730 (0.0007) [2023-10-12 18:39:17,748][62635] Updated weights for policy 1, policy_version 68740 (0.0008) [2023-10-12 18:39:18,132][62635] Updated weights for policy 1, policy_version 68750 (0.0007) [2023-10-12 18:39:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140771328. Throughput: 0: 1691.9, 1: 1674.4. Samples: 35209566. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:18,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.910')] [2023-10-12 18:39:18,493][62635] Updated weights for policy 1, policy_version 68760 (0.0008) [2023-10-12 18:39:18,920][62634] Updated weights for policy 0, policy_version 68740 (0.0008) [2023-10-12 18:39:19,300][62634] Updated weights for policy 0, policy_version 68750 (0.0009) [2023-10-12 18:39:19,670][62634] Updated weights for policy 0, policy_version 68760 (0.0008) [2023-10-12 18:39:22,543][62635] Updated weights for policy 1, policy_version 68770 (0.0008) [2023-10-12 18:39:22,912][62635] Updated weights for policy 1, policy_version 68780 (0.0010) [2023-10-12 18:39:23,296][62635] Updated weights for policy 1, policy_version 68790 (0.0009) [2023-10-12 18:39:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140836864. Throughput: 0: 1682.5, 1: 1684.3. Samples: 35219200. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:23,435][61643] Avg episode reward: [(0, '25.090'), (1, '9.510')] [2023-10-12 18:39:23,652][62635] Updated weights for policy 1, policy_version 68800 (0.0010) [2023-10-12 18:39:23,920][62634] Updated weights for policy 0, policy_version 68770 (0.0008) [2023-10-12 18:39:24,293][62634] Updated weights for policy 0, policy_version 68780 (0.0007) [2023-10-12 18:39:24,659][62634] Updated weights for policy 0, policy_version 68790 (0.0011) [2023-10-12 18:39:25,036][62634] Updated weights for policy 0, policy_version 68800 (0.0010) [2023-10-12 18:39:27,712][62635] Updated weights for policy 1, policy_version 68810 (0.0009) [2023-10-12 18:39:28,082][62635] Updated weights for policy 1, policy_version 68820 (0.0008) [2023-10-12 18:39:28,434][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140902400. Throughput: 0: 1698.7, 1: 1682.8. Samples: 35240036. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:28,435][61643] Avg episode reward: [(0, '25.180'), (1, '9.480')] [2023-10-12 18:39:28,443][62635] Updated weights for policy 1, policy_version 68830 (0.0009) [2023-10-12 18:39:28,949][62634] Updated weights for policy 0, policy_version 68810 (0.0011) [2023-10-12 18:39:29,337][62634] Updated weights for policy 0, policy_version 68820 (0.0009) [2023-10-12 18:39:29,723][62634] Updated weights for policy 0, policy_version 68830 (0.0007) [2023-10-12 18:39:32,666][62635] Updated weights for policy 1, policy_version 68840 (0.0008) [2023-10-12 18:39:33,028][62635] Updated weights for policy 1, policy_version 68850 (0.0007) [2023-10-12 18:39:33,409][62635] Updated weights for policy 1, policy_version 68860 (0.0007) [2023-10-12 18:39:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 140967936. Throughput: 0: 1701.9, 1: 1663.6. Samples: 35260134. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:33,436][61643] Avg episode reward: [(0, '25.170'), (1, '9.700')] [2023-10-12 18:39:33,706][62634] Updated weights for policy 0, policy_version 68840 (0.0008) [2023-10-12 18:39:34,082][62634] Updated weights for policy 0, policy_version 68850 (0.0011) [2023-10-12 18:39:34,466][62634] Updated weights for policy 0, policy_version 68860 (0.0009) [2023-10-12 18:39:37,486][62635] Updated weights for policy 1, policy_version 68870 (0.0009) [2023-10-12 18:39:37,859][62635] Updated weights for policy 1, policy_version 68880 (0.0009) [2023-10-12 18:39:38,230][62635] Updated weights for policy 1, policy_version 68890 (0.0008) [2023-10-12 18:39:38,420][62634] Updated weights for policy 0, policy_version 68870 (0.0008) [2023-10-12 18:39:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141033472. Throughput: 0: 1702.8, 1: 1681.6. Samples: 35270094. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:38,435][61643] Avg episode reward: [(0, '25.180'), (1, '9.540')] [2023-10-12 18:39:38,798][62634] Updated weights for policy 0, policy_version 68880 (0.0009) [2023-10-12 18:39:39,180][62634] Updated weights for policy 0, policy_version 68890 (0.0008) [2023-10-12 18:39:42,337][62635] Updated weights for policy 1, policy_version 68900 (0.0007) [2023-10-12 18:39:42,706][62635] Updated weights for policy 1, policy_version 68910 (0.0008) [2023-10-12 18:39:43,081][62635] Updated weights for policy 1, policy_version 68920 (0.0008) [2023-10-12 18:39:43,194][62634] Updated weights for policy 0, policy_version 68900 (0.0007) [2023-10-12 18:39:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 141131776. Throughput: 0: 1704.0, 1: 1680.4. Samples: 35290786. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:43,435][61643] Avg episode reward: [(0, '25.540'), (1, '9.630')] [2023-10-12 18:39:43,574][62634] Updated weights for policy 0, policy_version 68910 (0.0008) [2023-10-12 18:39:43,952][62634] Updated weights for policy 0, policy_version 68920 (0.0008) [2023-10-12 18:39:47,062][62635] Updated weights for policy 1, policy_version 68930 (0.0009) [2023-10-12 18:39:47,426][62635] Updated weights for policy 1, policy_version 68940 (0.0008) [2023-10-12 18:39:47,803][62635] Updated weights for policy 1, policy_version 68950 (0.0007) [2023-10-12 18:39:48,067][62634] Updated weights for policy 0, policy_version 68930 (0.0007) [2023-10-12 18:39:48,173][62635] Updated weights for policy 1, policy_version 68960 (0.0009) [2023-10-12 18:39:48,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141197312. Throughput: 0: 1696.3, 1: 1662.8. Samples: 35310274. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:39:48,435][61643] Avg episode reward: [(0, '25.310'), (1, '9.680')] [2023-10-12 18:39:48,466][62634] Updated weights for policy 0, policy_version 68940 (0.0010) [2023-10-12 18:39:48,851][62634] Updated weights for policy 0, policy_version 68950 (0.0009) [2023-10-12 18:39:49,225][62634] Updated weights for policy 0, policy_version 68960 (0.0008) [2023-10-12 18:39:52,149][62635] Updated weights for policy 1, policy_version 68970 (0.0008) [2023-10-12 18:39:52,519][62635] Updated weights for policy 1, policy_version 68980 (0.0007) [2023-10-12 18:39:52,895][62635] Updated weights for policy 1, policy_version 68990 (0.0007) [2023-10-12 18:39:53,323][62634] Updated weights for policy 0, policy_version 68970 (0.0008) [2023-10-12 18:39:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141262848. Throughput: 0: 1693.1, 1: 1680.5. Samples: 35320378. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:39:53,435][61643] Avg episode reward: [(0, '25.260'), (1, '9.730')] [2023-10-12 18:39:53,694][62634] Updated weights for policy 0, policy_version 68980 (0.0009) [2023-10-12 18:39:54,078][62634] Updated weights for policy 0, policy_version 68990 (0.0011) [2023-10-12 18:39:57,153][62635] Updated weights for policy 1, policy_version 69000 (0.0009) [2023-10-12 18:39:57,533][62635] Updated weights for policy 1, policy_version 69010 (0.0008) [2023-10-12 18:39:57,900][62635] Updated weights for policy 1, policy_version 69020 (0.0008) [2023-10-12 18:39:58,191][62634] Updated weights for policy 0, policy_version 69000 (0.0008) [2023-10-12 18:39:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141328384. Throughput: 0: 1688.0, 1: 1674.5. Samples: 35340700. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:39:58,436][61643] Avg episode reward: [(0, '25.460'), (1, '10.140')] [2023-10-12 18:39:58,580][62634] Updated weights for policy 0, policy_version 69010 (0.0010) [2023-10-12 18:39:58,958][62634] Updated weights for policy 0, policy_version 69020 (0.0009) [2023-10-12 18:40:01,919][62635] Updated weights for policy 1, policy_version 69030 (0.0009) [2023-10-12 18:40:02,293][62635] Updated weights for policy 1, policy_version 69040 (0.0010) [2023-10-12 18:40:02,671][62635] Updated weights for policy 1, policy_version 69050 (0.0010) [2023-10-12 18:40:02,886][62634] Updated weights for policy 0, policy_version 69030 (0.0009) [2023-10-12 18:40:03,263][62634] Updated weights for policy 0, policy_version 69040 (0.0010) [2023-10-12 18:40:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141393920. Throughput: 0: 1686.3, 1: 1656.6. Samples: 35359996. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:03,435][61643] Avg episode reward: [(0, '25.170'), (1, '10.060')] [2023-10-12 18:40:03,640][62634] Updated weights for policy 0, policy_version 69050 (0.0008) [2023-10-12 18:40:06,748][62635] Updated weights for policy 1, policy_version 69060 (0.0009) [2023-10-12 18:40:07,110][62635] Updated weights for policy 1, policy_version 69070 (0.0008) [2023-10-12 18:40:07,484][62635] Updated weights for policy 1, policy_version 69080 (0.0007) [2023-10-12 18:40:07,689][62634] Updated weights for policy 0, policy_version 69060 (0.0007) [2023-10-12 18:40:08,064][62634] Updated weights for policy 0, policy_version 69070 (0.0008) [2023-10-12 18:40:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141459456. Throughput: 0: 1691.8, 1: 1677.0. Samples: 35370794. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:08,436][61643] Avg episode reward: [(0, '25.060'), (1, '9.880')] [2023-10-12 18:40:08,446][62634] Updated weights for policy 0, policy_version 69080 (0.0008) [2023-10-12 18:40:11,555][62635] Updated weights for policy 1, policy_version 69090 (0.0007) [2023-10-12 18:40:11,919][62635] Updated weights for policy 1, policy_version 69100 (0.0007) [2023-10-12 18:40:12,294][62635] Updated weights for policy 1, policy_version 69110 (0.0008) [2023-10-12 18:40:12,562][62634] Updated weights for policy 0, policy_version 69090 (0.0007) [2023-10-12 18:40:12,660][62635] Updated weights for policy 1, policy_version 69120 (0.0007) [2023-10-12 18:40:12,939][62634] Updated weights for policy 0, policy_version 69100 (0.0008) [2023-10-12 18:40:13,313][62634] Updated weights for policy 0, policy_version 69110 (0.0007) [2023-10-12 18:40:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141524992. Throughput: 0: 1687.9, 1: 1662.6. Samples: 35390806. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:13,435][61643] Avg episode reward: [(0, '25.060'), (1, '10.060')] [2023-10-12 18:40:13,688][62634] Updated weights for policy 0, policy_version 69120 (0.0010) [2023-10-12 18:40:16,945][62635] Updated weights for policy 1, policy_version 69130 (0.0008) [2023-10-12 18:40:17,314][62635] Updated weights for policy 1, policy_version 69140 (0.0008) [2023-10-12 18:40:17,673][62634] Updated weights for policy 0, policy_version 69130 (0.0010) [2023-10-12 18:40:17,686][62635] Updated weights for policy 1, policy_version 69150 (0.0007) [2023-10-12 18:40:18,052][62634] Updated weights for policy 0, policy_version 69140 (0.0008) [2023-10-12 18:40:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141590528. Throughput: 0: 1667.1, 1: 1659.9. Samples: 35409846. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:18,435][61643] Avg episode reward: [(0, '25.260'), (1, '9.890')] [2023-10-12 18:40:18,437][62634] Updated weights for policy 0, policy_version 69150 (0.0007) [2023-10-12 18:40:18,442][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000069152_70811648.pth... [2023-10-12 18:40:18,474][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000067584_69206016.pth [2023-10-12 18:40:18,504][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000069152_70811648.pth... 
[2023-10-12 18:40:18,533][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000067552_69173248.pth [2023-10-12 18:40:21,759][62635] Updated weights for policy 1, policy_version 69160 (0.0009) [2023-10-12 18:40:22,129][62635] Updated weights for policy 1, policy_version 69170 (0.0008) [2023-10-12 18:40:22,508][62635] Updated weights for policy 1, policy_version 69180 (0.0007) [2023-10-12 18:40:22,561][62634] Updated weights for policy 0, policy_version 69160 (0.0008) [2023-10-12 18:40:22,936][62634] Updated weights for policy 0, policy_version 69170 (0.0007) [2023-10-12 18:40:23,314][62634] Updated weights for policy 0, policy_version 69180 (0.0007) [2023-10-12 18:40:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141656064. Throughput: 0: 1676.1, 1: 1671.3. Samples: 35420728. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:23,435][61643] Avg episode reward: [(0, '25.170'), (1, '9.730')] [2023-10-12 18:40:26,611][62635] Updated weights for policy 1, policy_version 69190 (0.0007) [2023-10-12 18:40:26,986][62635] Updated weights for policy 1, policy_version 69200 (0.0007) [2023-10-12 18:40:27,363][62635] Updated weights for policy 1, policy_version 69210 (0.0008) [2023-10-12 18:40:27,463][62634] Updated weights for policy 0, policy_version 69190 (0.0008) [2023-10-12 18:40:27,845][62634] Updated weights for policy 0, policy_version 69200 (0.0007) [2023-10-12 18:40:28,229][62634] Updated weights for policy 0, policy_version 69210 (0.0009) [2023-10-12 18:40:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141721600. Throughput: 0: 1677.5, 1: 1657.7. Samples: 35440872. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:28,435][61643] Avg episode reward: [(0, '25.240'), (1, '9.660')] [2023-10-12 18:40:31,170][62635] Updated weights for policy 1, policy_version 69220 (0.0009) [2023-10-12 18:40:31,535][62635] Updated weights for policy 1, policy_version 69230 (0.0008) [2023-10-12 18:40:31,909][62635] Updated weights for policy 1, policy_version 69240 (0.0010) [2023-10-12 18:40:32,372][62634] Updated weights for policy 0, policy_version 69220 (0.0008) [2023-10-12 18:40:32,764][62634] Updated weights for policy 0, policy_version 69230 (0.0007) [2023-10-12 18:40:33,152][62634] Updated weights for policy 0, policy_version 69240 (0.0007) [2023-10-12 18:40:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 141787136. Throughput: 0: 1657.5, 1: 1668.4. Samples: 35459936. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-12 18:40:33,435][61643] Avg episode reward: [(0, '25.270'), (1, '9.410')] [2023-10-12 18:40:36,020][62635] Updated weights for policy 1, policy_version 69250 (0.0008) [2023-10-12 18:40:36,381][62635] Updated weights for policy 1, policy_version 69260 (0.0009) [2023-10-12 18:40:36,753][62635] Updated weights for policy 1, policy_version 69270 (0.0009) [2023-10-12 18:40:37,120][62635] Updated weights for policy 1, policy_version 69280 (0.0009) [2023-10-12 18:40:37,274][62634] Updated weights for policy 0, policy_version 69250 (0.0010) [2023-10-12 18:40:37,647][62634] Updated weights for policy 0, policy_version 69260 (0.0008) [2023-10-12 18:40:38,023][62634] Updated weights for policy 0, policy_version 69270 (0.0007) [2023-10-12 18:40:38,402][62634] Updated weights for policy 0, policy_version 69280 (0.0008) [2023-10-12 18:40:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 141885440. Throughput: 0: 1674.3, 1: 1670.6. Samples: 35470898. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:40:38,435][61643] Avg episode reward: [(0, '25.300'), (1, '9.400')] [2023-10-12 18:40:41,296][62635] Updated weights for policy 1, policy_version 69290 (0.0010) [2023-10-12 18:40:41,660][62635] Updated weights for policy 1, policy_version 69300 (0.0009) [2023-10-12 18:40:42,033][62635] Updated weights for policy 1, policy_version 69310 (0.0008) [2023-10-12 18:40:42,550][62634] Updated weights for policy 0, policy_version 69290 (0.0007) [2023-10-12 18:40:42,926][62634] Updated weights for policy 0, policy_version 69300 (0.0007) [2023-10-12 18:40:43,313][62634] Updated weights for policy 0, policy_version 69310 (0.0010) [2023-10-12 18:40:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 141950976. Throughput: 0: 1679.5, 1: 1654.3. Samples: 35490720. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:40:43,435][61643] Avg episode reward: [(0, '25.140'), (1, '9.470')] [2023-10-12 18:40:46,407][62635] Updated weights for policy 1, policy_version 69320 (0.0007) [2023-10-12 18:40:46,780][62635] Updated weights for policy 1, policy_version 69330 (0.0007) [2023-10-12 18:40:47,149][62635] Updated weights for policy 1, policy_version 69340 (0.0010) [2023-10-12 18:40:47,378][62634] Updated weights for policy 0, policy_version 69320 (0.0008) [2023-10-12 18:40:47,755][62634] Updated weights for policy 0, policy_version 69330 (0.0008) [2023-10-12 18:40:48,130][62634] Updated weights for policy 0, policy_version 69340 (0.0009) [2023-10-12 18:40:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142016512. Throughput: 0: 1665.9, 1: 1677.2. Samples: 35510440. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:40:48,436][61643] Avg episode reward: [(0, '25.310'), (1, '9.420')] [2023-10-12 18:40:51,292][62635] Updated weights for policy 1, policy_version 69350 (0.0009) [2023-10-12 18:40:51,647][62635] Updated weights for policy 1, policy_version 69360 (0.0011) [2023-10-12 18:40:52,018][62635] Updated weights for policy 1, policy_version 69370 (0.0009) [2023-10-12 18:40:52,179][62634] Updated weights for policy 0, policy_version 69350 (0.0009) [2023-10-12 18:40:52,561][62634] Updated weights for policy 0, policy_version 69360 (0.0009) [2023-10-12 18:40:52,937][62634] Updated weights for policy 0, policy_version 69370 (0.0007) [2023-10-12 18:40:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142082048. Throughput: 0: 1678.5, 1: 1671.0. Samples: 35521522. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:40:53,435][61643] Avg episode reward: [(0, '25.180'), (1, '9.330')] [2023-10-12 18:40:55,853][62635] Updated weights for policy 1, policy_version 69380 (0.0008) [2023-10-12 18:40:56,214][62635] Updated weights for policy 1, policy_version 69390 (0.0010) [2023-10-12 18:40:56,590][62635] Updated weights for policy 1, policy_version 69400 (0.0010) [2023-10-12 18:40:57,069][62634] Updated weights for policy 0, policy_version 69380 (0.0009) [2023-10-12 18:40:57,447][62634] Updated weights for policy 0, policy_version 69390 (0.0007) [2023-10-12 18:40:57,814][62634] Updated weights for policy 0, policy_version 69400 (0.0007) [2023-10-12 18:40:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142147584. Throughput: 0: 1677.7, 1: 1660.9. Samples: 35541046. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:40:58,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.630')] [2023-10-12 18:41:00,618][62635] Updated weights for policy 1, policy_version 69410 (0.0007) [2023-10-12 18:41:00,985][62635] Updated weights for policy 1, policy_version 69420 (0.0011) [2023-10-12 18:41:01,362][62635] Updated weights for policy 1, policy_version 69430 (0.0011) [2023-10-12 18:41:01,643][62634] Updated weights for policy 0, policy_version 69410 (0.0007) [2023-10-12 18:41:01,731][62635] Updated weights for policy 1, policy_version 69440 (0.0008) [2023-10-12 18:41:02,016][62634] Updated weights for policy 0, policy_version 69420 (0.0010) [2023-10-12 18:41:02,388][62634] Updated weights for policy 0, policy_version 69430 (0.0010) [2023-10-12 18:41:02,759][62634] Updated weights for policy 0, policy_version 69440 (0.0009) [2023-10-12 18:41:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142213120. Throughput: 0: 1667.2, 1: 1682.3. Samples: 35560576. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:41:03,436][61643] Avg episode reward: [(0, '24.920'), (1, '9.980')] [2023-10-12 18:41:05,726][62635] Updated weights for policy 1, policy_version 69450 (0.0009) [2023-10-12 18:41:06,093][62635] Updated weights for policy 1, policy_version 69460 (0.0010) [2023-10-12 18:41:06,464][62635] Updated weights for policy 1, policy_version 69470 (0.0007) [2023-10-12 18:41:06,699][62634] Updated weights for policy 0, policy_version 69450 (0.0007) [2023-10-12 18:41:07,074][62634] Updated weights for policy 0, policy_version 69460 (0.0008) [2023-10-12 18:41:07,460][62634] Updated weights for policy 0, policy_version 69470 (0.0010) [2023-10-12 18:41:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 142278656. Throughput: 0: 1686.5, 1: 1668.0. Samples: 35571682. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:41:08,435][61643] Avg episode reward: [(0, '25.010'), (1, '9.830')] [2023-10-12 18:41:10,282][62635] Updated weights for policy 1, policy_version 69480 (0.0008) [2023-10-12 18:41:10,652][62635] Updated weights for policy 1, policy_version 69490 (0.0009) [2023-10-12 18:41:11,025][62635] Updated weights for policy 1, policy_version 69500 (0.0009) [2023-10-12 18:41:11,385][62634] Updated weights for policy 0, policy_version 69480 (0.0008) [2023-10-12 18:41:11,773][62634] Updated weights for policy 0, policy_version 69490 (0.0009) [2023-10-12 18:41:12,139][62634] Updated weights for policy 0, policy_version 69500 (0.0007) [2023-10-12 18:41:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142344192. Throughput: 0: 1662.3, 1: 1673.4. Samples: 35590978. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 18:41:13,435][61643] Avg episode reward: [(0, '25.200'), (1, '9.830')] [2023-10-12 18:41:15,123][62635] Updated weights for policy 1, policy_version 69510 (0.0009) [2023-10-12 18:41:15,489][62635] Updated weights for policy 1, policy_version 69520 (0.0010) [2023-10-12 18:41:15,856][62635] Updated weights for policy 1, policy_version 69530 (0.0008) [2023-10-12 18:41:16,195][62634] Updated weights for policy 0, policy_version 69510 (0.0009) [2023-10-12 18:41:16,571][62634] Updated weights for policy 0, policy_version 69520 (0.0010) [2023-10-12 18:41:16,954][62634] Updated weights for policy 0, policy_version 69530 (0.0009) [2023-10-12 18:41:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142409728. Throughput: 0: 1677.4, 1: 1689.9. Samples: 35611466. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:18,435][61643] Avg episode reward: [(0, '25.140'), (1, '10.000')] [2023-10-12 18:41:19,819][62635] Updated weights for policy 1, policy_version 69540 (0.0007) [2023-10-12 18:41:20,190][62635] Updated weights for policy 1, policy_version 69550 (0.0008) [2023-10-12 18:41:20,548][62635] Updated weights for policy 1, policy_version 69560 (0.0011) [2023-10-12 18:41:21,102][62634] Updated weights for policy 0, policy_version 69540 (0.0010) [2023-10-12 18:41:21,489][62634] Updated weights for policy 0, policy_version 69550 (0.0009) [2023-10-12 18:41:21,859][62634] Updated weights for policy 0, policy_version 69560 (0.0008) [2023-10-12 18:41:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 142475264. Throughput: 0: 1690.5, 1: 1663.7. Samples: 35621836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:23,435][61643] Avg episode reward: [(0, '25.270'), (1, '10.000')] [2023-10-12 18:41:24,624][62635] Updated weights for policy 1, policy_version 69570 (0.0009) [2023-10-12 18:41:24,996][62635] Updated weights for policy 1, policy_version 69580 (0.0007) [2023-10-12 18:41:25,361][62635] Updated weights for policy 1, policy_version 69590 (0.0007) [2023-10-12 18:41:25,730][62635] Updated weights for policy 1, policy_version 69600 (0.0008) [2023-10-12 18:41:25,903][62634] Updated weights for policy 0, policy_version 69570 (0.0008) [2023-10-12 18:41:26,274][62634] Updated weights for policy 0, policy_version 69580 (0.0009) [2023-10-12 18:41:26,641][62634] Updated weights for policy 0, policy_version 69590 (0.0008) [2023-10-12 18:41:27,016][62634] Updated weights for policy 0, policy_version 69600 (0.0011) [2023-10-12 18:41:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142540800. Throughput: 0: 1662.3, 1: 1690.9. Samples: 35641616. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:28,436][61643] Avg episode reward: [(0, '25.240'), (1, '9.920')] [2023-10-12 18:41:29,882][62635] Updated weights for policy 1, policy_version 69610 (0.0009) [2023-10-12 18:41:30,249][62635] Updated weights for policy 1, policy_version 69620 (0.0008) [2023-10-12 18:41:30,618][62635] Updated weights for policy 1, policy_version 69630 (0.0010) [2023-10-12 18:41:31,103][62634] Updated weights for policy 0, policy_version 69610 (0.0008) [2023-10-12 18:41:31,484][62634] Updated weights for policy 0, policy_version 69620 (0.0010) [2023-10-12 18:41:31,871][62634] Updated weights for policy 0, policy_version 69630 (0.0008) [2023-10-12 18:41:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142606336. Throughput: 0: 1677.1, 1: 1692.4. Samples: 35662064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:33,435][61643] Avg episode reward: [(0, '25.250'), (1, '10.030')] [2023-10-12 18:41:34,983][62635] Updated weights for policy 1, policy_version 69640 (0.0007) [2023-10-12 18:41:35,356][62635] Updated weights for policy 1, policy_version 69650 (0.0007) [2023-10-12 18:41:35,717][62635] Updated weights for policy 1, policy_version 69660 (0.0007) [2023-10-12 18:41:35,895][62634] Updated weights for policy 0, policy_version 69640 (0.0007) [2023-10-12 18:41:36,273][62634] Updated weights for policy 0, policy_version 69650 (0.0007) [2023-10-12 18:41:36,639][62634] Updated weights for policy 0, policy_version 69660 (0.0009) [2023-10-12 18:41:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142671872. Throughput: 0: 1675.3, 1: 1664.2. Samples: 35671800. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:38,435][61643] Avg episode reward: [(0, '25.250'), (1, '9.750')] [2023-10-12 18:41:39,824][62635] Updated weights for policy 1, policy_version 69670 (0.0008) [2023-10-12 18:41:40,189][62635] Updated weights for policy 1, policy_version 69680 (0.0008) [2023-10-12 18:41:40,563][62635] Updated weights for policy 1, policy_version 69690 (0.0007) [2023-10-12 18:41:40,710][62634] Updated weights for policy 0, policy_version 69670 (0.0008) [2023-10-12 18:41:41,083][62634] Updated weights for policy 0, policy_version 69680 (0.0007) [2023-10-12 18:41:41,459][62634] Updated weights for policy 0, policy_version 69690 (0.0007) [2023-10-12 18:41:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142737408. Throughput: 0: 1654.7, 1: 1686.8. Samples: 35691416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:43,435][61643] Avg episode reward: [(0, '25.230'), (1, '9.830')] [2023-10-12 18:41:44,594][62635] Updated weights for policy 1, policy_version 69700 (0.0007) [2023-10-12 18:41:44,960][62635] Updated weights for policy 1, policy_version 69710 (0.0010) [2023-10-12 18:41:45,326][62635] Updated weights for policy 1, policy_version 69720 (0.0012) [2023-10-12 18:41:45,600][62634] Updated weights for policy 0, policy_version 69700 (0.0010) [2023-10-12 18:41:45,973][62634] Updated weights for policy 0, policy_version 69710 (0.0007) [2023-10-12 18:41:46,357][62634] Updated weights for policy 0, policy_version 69720 (0.0009) [2023-10-12 18:41:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142802944. Throughput: 0: 1683.2, 1: 1684.6. Samples: 35712128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:48,436][61643] Avg episode reward: [(0, '25.370'), (1, '9.830')] [2023-10-12 18:41:49,412][62635] Updated weights for policy 1, policy_version 69730 (0.0009) [2023-10-12 18:41:49,766][62635] Updated weights for policy 1, policy_version 69740 (0.0008) [2023-10-12 18:41:50,133][62635] Updated weights for policy 1, policy_version 69750 (0.0010) [2023-10-12 18:41:50,405][62634] Updated weights for policy 0, policy_version 69730 (0.0009) [2023-10-12 18:41:50,502][62635] Updated weights for policy 1, policy_version 69760 (0.0009) [2023-10-12 18:41:50,782][62634] Updated weights for policy 0, policy_version 69740 (0.0008) [2023-10-12 18:41:51,167][62634] Updated weights for policy 0, policy_version 69750 (0.0008) [2023-10-12 18:41:51,538][62634] Updated weights for policy 0, policy_version 69760 (0.0010) [2023-10-12 18:41:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142868480. Throughput: 0: 1665.4, 1: 1670.7. Samples: 35721804. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:53,435][61643] Avg episode reward: [(0, '25.350'), (1, '9.790')] [2023-10-12 18:41:54,392][62635] Updated weights for policy 1, policy_version 69770 (0.0007) [2023-10-12 18:41:54,755][62635] Updated weights for policy 1, policy_version 69780 (0.0007) [2023-10-12 18:41:55,123][62635] Updated weights for policy 1, policy_version 69790 (0.0010) [2023-10-12 18:41:55,657][62634] Updated weights for policy 0, policy_version 69770 (0.0007) [2023-10-12 18:41:56,028][62634] Updated weights for policy 0, policy_version 69780 (0.0010) [2023-10-12 18:41:56,395][62634] Updated weights for policy 0, policy_version 69790 (0.0008) [2023-10-12 18:41:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142934016. Throughput: 0: 1671.1, 1: 1681.1. Samples: 35741828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:41:58,436][61643] Avg episode reward: [(0, '25.240'), (1, '9.870')] [2023-10-12 18:41:59,335][62635] Updated weights for policy 1, policy_version 69800 (0.0007) [2023-10-12 18:41:59,702][62635] Updated weights for policy 1, policy_version 69810 (0.0007) [2023-10-12 18:42:00,072][62635] Updated weights for policy 1, policy_version 69820 (0.0008) [2023-10-12 18:42:00,371][62634] Updated weights for policy 0, policy_version 69800 (0.0008) [2023-10-12 18:42:00,757][62634] Updated weights for policy 0, policy_version 69810 (0.0010) [2023-10-12 18:42:01,135][62634] Updated weights for policy 0, policy_version 69820 (0.0009) [2023-10-12 18:42:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 142999552. Throughput: 0: 1683.3, 1: 1677.1. Samples: 35762684. Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:03,436][61643] Avg episode reward: [(0, '25.190'), (1, '9.860')] [2023-10-12 18:42:04,421][62635] Updated weights for policy 1, policy_version 69830 (0.0008) [2023-10-12 18:42:04,788][62635] Updated weights for policy 1, policy_version 69840 (0.0009) [2023-10-12 18:42:05,157][62635] Updated weights for policy 1, policy_version 69850 (0.0008) [2023-10-12 18:42:05,182][62634] Updated weights for policy 0, policy_version 69830 (0.0007) [2023-10-12 18:42:05,565][62634] Updated weights for policy 0, policy_version 69840 (0.0007) [2023-10-12 18:42:05,946][62634] Updated weights for policy 0, policy_version 69850 (0.0009) [2023-10-12 18:42:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143065088. Throughput: 0: 1658.5, 1: 1675.9. Samples: 35771884. 
Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:08,436][61643] Avg episode reward: [(0, '25.120'), (1, '9.610')] [2023-10-12 18:42:09,331][62635] Updated weights for policy 1, policy_version 69860 (0.0008) [2023-10-12 18:42:09,705][62635] Updated weights for policy 1, policy_version 69870 (0.0008) [2023-10-12 18:42:10,026][62634] Updated weights for policy 0, policy_version 69860 (0.0009) [2023-10-12 18:42:10,080][62635] Updated weights for policy 1, policy_version 69880 (0.0007) [2023-10-12 18:42:10,397][62634] Updated weights for policy 0, policy_version 69870 (0.0007) [2023-10-12 18:42:10,780][62634] Updated weights for policy 0, policy_version 69880 (0.0008) [2023-10-12 18:42:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143130624. Throughput: 0: 1672.3, 1: 1668.8. Samples: 35791966. Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:13,435][61643] Avg episode reward: [(0, '24.980'), (1, '9.780')] [2023-10-12 18:42:14,072][62635] Updated weights for policy 1, policy_version 69890 (0.0007) [2023-10-12 18:42:14,446][62635] Updated weights for policy 1, policy_version 69900 (0.0008) [2023-10-12 18:42:14,824][62635] Updated weights for policy 1, policy_version 69910 (0.0008) [2023-10-12 18:42:14,857][62634] Updated weights for policy 0, policy_version 69890 (0.0008) [2023-10-12 18:42:15,181][62635] Updated weights for policy 1, policy_version 69920 (0.0009) [2023-10-12 18:42:15,227][62634] Updated weights for policy 0, policy_version 69900 (0.0009) [2023-10-12 18:42:15,602][62634] Updated weights for policy 0, policy_version 69910 (0.0011) [2023-10-12 18:42:15,982][62634] Updated weights for policy 0, policy_version 69920 (0.0009) [2023-10-12 18:42:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143196160. Throughput: 0: 1679.5, 1: 1669.7. Samples: 35812778. 
Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:18,435][61643] Avg episode reward: [(0, '25.300'), (1, '9.870')] [2023-10-12 18:42:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000069920_71598080.pth... [2023-10-12 18:42:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000069920_71598080.pth... [2023-10-12 18:42:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000068352_69992448.pth [2023-10-12 18:42:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000068352_69992448.pth [2023-10-12 18:42:19,379][62635] Updated weights for policy 1, policy_version 69930 (0.0008) [2023-10-12 18:42:19,756][62635] Updated weights for policy 1, policy_version 69940 (0.0007) [2023-10-12 18:42:20,103][62634] Updated weights for policy 0, policy_version 69930 (0.0007) [2023-10-12 18:42:20,118][62635] Updated weights for policy 1, policy_version 69950 (0.0007) [2023-10-12 18:42:20,484][62634] Updated weights for policy 0, policy_version 69940 (0.0010) [2023-10-12 18:42:20,854][62634] Updated weights for policy 0, policy_version 69950 (0.0011) [2023-10-12 18:42:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143261696. Throughput: 0: 1660.3, 1: 1672.2. Samples: 35821762. 
Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:23,435][61643] Avg episode reward: [(0, '25.360'), (1, '9.690')] [2023-10-12 18:42:24,062][62635] Updated weights for policy 1, policy_version 69960 (0.0009) [2023-10-12 18:42:24,432][62635] Updated weights for policy 1, policy_version 69970 (0.0009) [2023-10-12 18:42:24,802][62635] Updated weights for policy 1, policy_version 69980 (0.0009) [2023-10-12 18:42:24,805][62634] Updated weights for policy 0, policy_version 69960 (0.0008) [2023-10-12 18:42:25,183][62634] Updated weights for policy 0, policy_version 69970 (0.0008) [2023-10-12 18:42:25,557][62634] Updated weights for policy 0, policy_version 69980 (0.0010) [2023-10-12 18:42:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 143327232. Throughput: 0: 1681.2, 1: 1675.1. Samples: 35842452. Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:28,436][61643] Avg episode reward: [(0, '25.400'), (1, '9.760')] [2023-10-12 18:42:28,883][62635] Updated weights for policy 1, policy_version 69990 (0.0009) [2023-10-12 18:42:29,256][62635] Updated weights for policy 1, policy_version 70000 (0.0008) [2023-10-12 18:42:29,626][62635] Updated weights for policy 1, policy_version 70010 (0.0007) [2023-10-12 18:42:29,658][62634] Updated weights for policy 0, policy_version 69990 (0.0010) [2023-10-12 18:42:30,036][62634] Updated weights for policy 0, policy_version 70000 (0.0010) [2023-10-12 18:42:30,411][62634] Updated weights for policy 0, policy_version 70010 (0.0010) [2023-10-12 18:42:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143392768. Throughput: 0: 1678.4, 1: 1679.5. Samples: 35863234. 
Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:33,436][61643] Avg episode reward: [(0, '25.440'), (1, '9.940')] [2023-10-12 18:42:33,545][62635] Updated weights for policy 1, policy_version 70020 (0.0009) [2023-10-12 18:42:33,914][62635] Updated weights for policy 1, policy_version 70030 (0.0009) [2023-10-12 18:42:34,283][62635] Updated weights for policy 1, policy_version 70040 (0.0010) [2023-10-12 18:42:34,470][62634] Updated weights for policy 0, policy_version 70020 (0.0009) [2023-10-12 18:42:34,850][62634] Updated weights for policy 0, policy_version 70030 (0.0008) [2023-10-12 18:42:35,242][62634] Updated weights for policy 0, policy_version 70040 (0.0007) [2023-10-12 18:42:38,171][62635] Updated weights for policy 1, policy_version 70050 (0.0008) [2023-10-12 18:42:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143458304. Throughput: 0: 1666.0, 1: 1679.7. Samples: 35872362. Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:38,435][61643] Avg episode reward: [(0, '25.440'), (1, '10.010')] [2023-10-12 18:42:38,540][62635] Updated weights for policy 1, policy_version 70060 (0.0009) [2023-10-12 18:42:38,898][62635] Updated weights for policy 1, policy_version 70070 (0.0010) [2023-10-12 18:42:39,268][62635] Updated weights for policy 1, policy_version 70080 (0.0009) [2023-10-12 18:42:39,324][62634] Updated weights for policy 0, policy_version 70050 (0.0007) [2023-10-12 18:42:39,714][62634] Updated weights for policy 0, policy_version 70060 (0.0007) [2023-10-12 18:42:40,088][62634] Updated weights for policy 0, policy_version 70070 (0.0010) [2023-10-12 18:42:40,463][62634] Updated weights for policy 0, policy_version 70080 (0.0007) [2023-10-12 18:42:43,243][62635] Updated weights for policy 1, policy_version 70090 (0.0011) [2023-10-12 18:42:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143523840. Throughput: 0: 1682.9, 1: 1686.4. 
Samples: 35893446. Policy #0 lag: (min: 24.0, avg: 45.3, max: 56.0) [2023-10-12 18:42:43,435][61643] Avg episode reward: [(0, '25.480'), (1, '9.820')] [2023-10-12 18:42:43,612][62635] Updated weights for policy 1, policy_version 70100 (0.0011) [2023-10-12 18:42:43,980][62635] Updated weights for policy 1, policy_version 70110 (0.0010) [2023-10-12 18:42:44,446][62634] Updated weights for policy 0, policy_version 70090 (0.0007) [2023-10-12 18:42:44,833][62634] Updated weights for policy 0, policy_version 70100 (0.0007) [2023-10-12 18:42:45,198][62634] Updated weights for policy 0, policy_version 70110 (0.0009) [2023-10-12 18:42:48,059][62635] Updated weights for policy 1, policy_version 70120 (0.0008) [2023-10-12 18:42:48,419][62635] Updated weights for policy 1, policy_version 70130 (0.0009) [2023-10-12 18:42:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 143589376. Throughput: 0: 1679.3, 1: 1678.5. Samples: 35913786. Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:42:48,435][61643] Avg episode reward: [(0, '25.610'), (1, '9.840')] [2023-10-12 18:42:48,785][62635] Updated weights for policy 1, policy_version 70140 (0.0011) [2023-10-12 18:42:49,181][62634] Updated weights for policy 0, policy_version 70120 (0.0009) [2023-10-12 18:42:49,558][62634] Updated weights for policy 0, policy_version 70130 (0.0007) [2023-10-12 18:42:49,941][62634] Updated weights for policy 0, policy_version 70140 (0.0007) [2023-10-12 18:42:52,768][62635] Updated weights for policy 1, policy_version 70150 (0.0010) [2023-10-12 18:42:53,125][62635] Updated weights for policy 1, policy_version 70160 (0.0008) [2023-10-12 18:42:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143654912. Throughput: 0: 1675.9, 1: 1686.7. Samples: 35923198. 
Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:42:53,435][61643] Avg episode reward: [(0, '25.540'), (1, '9.910')] [2023-10-12 18:42:53,500][62635] Updated weights for policy 1, policy_version 70170 (0.0010) [2023-10-12 18:42:54,061][62634] Updated weights for policy 0, policy_version 70150 (0.0008) [2023-10-12 18:42:54,433][62634] Updated weights for policy 0, policy_version 70160 (0.0008) [2023-10-12 18:42:54,813][62634] Updated weights for policy 0, policy_version 70170 (0.0009) [2023-10-12 18:42:57,605][62635] Updated weights for policy 1, policy_version 70180 (0.0011) [2023-10-12 18:42:57,969][62635] Updated weights for policy 1, policy_version 70190 (0.0010) [2023-10-12 18:42:58,330][62635] Updated weights for policy 1, policy_version 70200 (0.0010) [2023-10-12 18:42:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143720448. Throughput: 0: 1688.4, 1: 1694.0. Samples: 35944172. Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:42:58,436][61643] Avg episode reward: [(0, '25.220'), (1, '9.670')] [2023-10-12 18:42:58,817][62634] Updated weights for policy 0, policy_version 70180 (0.0008) [2023-10-12 18:42:59,199][62634] Updated weights for policy 0, policy_version 70190 (0.0009) [2023-10-12 18:42:59,575][62634] Updated weights for policy 0, policy_version 70200 (0.0009) [2023-10-12 18:43:02,360][62635] Updated weights for policy 1, policy_version 70210 (0.0009) [2023-10-12 18:43:02,733][62635] Updated weights for policy 1, policy_version 70220 (0.0007) [2023-10-12 18:43:03,092][62635] Updated weights for policy 1, policy_version 70230 (0.0008) [2023-10-12 18:43:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 143785984. Throughput: 0: 1683.9, 1: 1681.7. Samples: 35964228. 
Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:43:03,435][61643] Avg episode reward: [(0, '25.110'), (1, '9.850')] [2023-10-12 18:43:03,464][62635] Updated weights for policy 1, policy_version 70240 (0.0009) [2023-10-12 18:43:03,697][62634] Updated weights for policy 0, policy_version 70210 (0.0008) [2023-10-12 18:43:04,071][62634] Updated weights for policy 0, policy_version 70220 (0.0009) [2023-10-12 18:43:04,444][62634] Updated weights for policy 0, policy_version 70230 (0.0008) [2023-10-12 18:43:04,821][62634] Updated weights for policy 0, policy_version 70240 (0.0007) [2023-10-12 18:43:07,641][62635] Updated weights for policy 1, policy_version 70250 (0.0007) [2023-10-12 18:43:08,011][62635] Updated weights for policy 1, policy_version 70260 (0.0007) [2023-10-12 18:43:08,374][62635] Updated weights for policy 1, policy_version 70270 (0.0007) [2023-10-12 18:43:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143851520. Throughput: 0: 1685.0, 1: 1702.8. Samples: 35974212. Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:43:08,435][61643] Avg episode reward: [(0, '25.230'), (1, '9.940')] [2023-10-12 18:43:08,855][62634] Updated weights for policy 0, policy_version 70250 (0.0008) [2023-10-12 18:43:09,233][62634] Updated weights for policy 0, policy_version 70260 (0.0011) [2023-10-12 18:43:09,617][62634] Updated weights for policy 0, policy_version 70270 (0.0007) [2023-10-12 18:43:12,596][62635] Updated weights for policy 1, policy_version 70280 (0.0009) [2023-10-12 18:43:12,971][62635] Updated weights for policy 1, policy_version 70290 (0.0008) [2023-10-12 18:43:13,330][62635] Updated weights for policy 1, policy_version 70300 (0.0009) [2023-10-12 18:43:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 143917056. Throughput: 0: 1688.7, 1: 1695.3. Samples: 35994732. 
Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:43:13,435][61643] Avg episode reward: [(0, '24.850'), (1, '9.670')] [2023-10-12 18:43:13,533][62634] Updated weights for policy 0, policy_version 70280 (0.0008) [2023-10-12 18:43:13,915][62634] Updated weights for policy 0, policy_version 70290 (0.0010) [2023-10-12 18:43:14,298][62634] Updated weights for policy 0, policy_version 70300 (0.0008) [2023-10-12 18:43:17,388][62635] Updated weights for policy 1, policy_version 70310 (0.0009) [2023-10-12 18:43:17,757][62635] Updated weights for policy 1, policy_version 70320 (0.0010) [2023-10-12 18:43:18,130][62635] Updated weights for policy 1, policy_version 70330 (0.0007) [2023-10-12 18:43:18,319][62634] Updated weights for policy 0, policy_version 70310 (0.0007) [2023-10-12 18:43:18,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144015360. Throughput: 0: 1694.0, 1: 1668.8. Samples: 36014560. Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:43:18,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.720')] [2023-10-12 18:43:18,695][62634] Updated weights for policy 0, policy_version 70320 (0.0008) [2023-10-12 18:43:19,076][62634] Updated weights for policy 0, policy_version 70330 (0.0009) [2023-10-12 18:43:22,172][62635] Updated weights for policy 1, policy_version 70340 (0.0008) [2023-10-12 18:43:22,540][62635] Updated weights for policy 1, policy_version 70350 (0.0008) [2023-10-12 18:43:22,913][62635] Updated weights for policy 1, policy_version 70360 (0.0007) [2023-10-12 18:43:23,043][62634] Updated weights for policy 0, policy_version 70340 (0.0010) [2023-10-12 18:43:23,416][62634] Updated weights for policy 0, policy_version 70350 (0.0008) [2023-10-12 18:43:23,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144080896. Throughput: 0: 1690.1, 1: 1690.4. Samples: 36024488. 
Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:43:23,435][61643] Avg episode reward: [(0, '24.530'), (1, '9.890')] [2023-10-12 18:43:23,797][62634] Updated weights for policy 0, policy_version 70360 (0.0011) [2023-10-12 18:43:26,894][62635] Updated weights for policy 1, policy_version 70370 (0.0008) [2023-10-12 18:43:27,257][62635] Updated weights for policy 1, policy_version 70380 (0.0009) [2023-10-12 18:43:27,620][62635] Updated weights for policy 1, policy_version 70390 (0.0009) [2023-10-12 18:43:27,693][62634] Updated weights for policy 0, policy_version 70370 (0.0008) [2023-10-12 18:43:27,985][62635] Updated weights for policy 1, policy_version 70400 (0.0008) [2023-10-12 18:43:28,073][62634] Updated weights for policy 0, policy_version 70380 (0.0009) [2023-10-12 18:43:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144146432. Throughput: 0: 1692.9, 1: 1680.0. Samples: 36045226. Policy #0 lag: (min: 9.0, avg: 22.6, max: 41.0) [2023-10-12 18:43:28,435][61643] Avg episode reward: [(0, '24.260'), (1, '9.760')] [2023-10-12 18:43:28,440][62634] Updated weights for policy 0, policy_version 70390 (0.0009) [2023-10-12 18:43:28,816][62634] Updated weights for policy 0, policy_version 70400 (0.0012) [2023-10-12 18:43:32,302][62635] Updated weights for policy 1, policy_version 70410 (0.0010) [2023-10-12 18:43:32,665][62635] Updated weights for policy 1, policy_version 70420 (0.0009) [2023-10-12 18:43:32,991][62634] Updated weights for policy 0, policy_version 70410 (0.0007) [2023-10-12 18:43:33,041][62635] Updated weights for policy 1, policy_version 70430 (0.0008) [2023-10-12 18:43:33,369][62634] Updated weights for policy 0, policy_version 70420 (0.0008) [2023-10-12 18:43:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144211968. Throughput: 0: 1685.6, 1: 1664.6. Samples: 36064544. 
Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:43:33,436][61643] Avg episode reward: [(0, '24.030'), (1, '9.930')] [2023-10-12 18:43:33,750][62634] Updated weights for policy 0, policy_version 70430 (0.0009) [2023-10-12 18:43:37,055][62635] Updated weights for policy 1, policy_version 70440 (0.0008) [2023-10-12 18:43:37,425][62635] Updated weights for policy 1, policy_version 70450 (0.0008) [2023-10-12 18:43:37,790][62635] Updated weights for policy 1, policy_version 70460 (0.0009) [2023-10-12 18:43:37,844][62634] Updated weights for policy 0, policy_version 70440 (0.0008) [2023-10-12 18:43:38,209][62634] Updated weights for policy 0, policy_version 70450 (0.0008) [2023-10-12 18:43:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144277504. Throughput: 0: 1693.0, 1: 1685.6. Samples: 36075236. Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:43:38,435][61643] Avg episode reward: [(0, '23.770'), (1, '9.930')] [2023-10-12 18:43:38,586][62634] Updated weights for policy 0, policy_version 70460 (0.0011) [2023-10-12 18:43:41,783][62635] Updated weights for policy 1, policy_version 70470 (0.0008) [2023-10-12 18:43:42,149][62635] Updated weights for policy 1, policy_version 70480 (0.0010) [2023-10-12 18:43:42,524][62635] Updated weights for policy 1, policy_version 70490 (0.0007) [2023-10-12 18:43:42,628][62634] Updated weights for policy 0, policy_version 70470 (0.0010) [2023-10-12 18:43:43,014][62634] Updated weights for policy 0, policy_version 70480 (0.0009) [2023-10-12 18:43:43,382][62634] Updated weights for policy 0, policy_version 70490 (0.0010) [2023-10-12 18:43:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144343040. Throughput: 0: 1690.8, 1: 1672.0. Samples: 36095500. 
Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:43:43,436][61643] Avg episode reward: [(0, '23.860'), (1, '9.920')] [2023-10-12 18:43:46,569][62635] Updated weights for policy 1, policy_version 70500 (0.0008) [2023-10-12 18:43:46,935][62635] Updated weights for policy 1, policy_version 70510 (0.0010) [2023-10-12 18:43:47,306][62635] Updated weights for policy 1, policy_version 70520 (0.0007) [2023-10-12 18:43:47,336][62634] Updated weights for policy 0, policy_version 70500 (0.0007) [2023-10-12 18:43:47,703][62634] Updated weights for policy 0, policy_version 70510 (0.0007) [2023-10-12 18:43:48,092][62634] Updated weights for policy 0, policy_version 70520 (0.0008) [2023-10-12 18:43:48,435][61643] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 144441344. Throughput: 0: 1676.4, 1: 1666.4. Samples: 36114656. Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:43:48,436][61643] Avg episode reward: [(0, '23.480'), (1, '10.010')] [2023-10-12 18:43:51,437][62635] Updated weights for policy 1, policy_version 70530 (0.0009) [2023-10-12 18:43:51,804][62635] Updated weights for policy 1, policy_version 70540 (0.0011) [2023-10-12 18:43:52,179][62635] Updated weights for policy 1, policy_version 70550 (0.0008) [2023-10-12 18:43:52,474][62634] Updated weights for policy 0, policy_version 70530 (0.0009) [2023-10-12 18:43:52,545][62635] Updated weights for policy 1, policy_version 70560 (0.0007) [2023-10-12 18:43:52,856][62634] Updated weights for policy 0, policy_version 70540 (0.0008) [2023-10-12 18:43:53,234][62634] Updated weights for policy 0, policy_version 70550 (0.0009) [2023-10-12 18:43:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144474112. Throughput: 0: 1691.9, 1: 1675.5. Samples: 36125742. 
Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:43:53,435][61643] Avg episode reward: [(0, '23.260'), (1, '9.970')] [2023-10-12 18:43:53,600][62634] Updated weights for policy 0, policy_version 70560 (0.0009) [2023-10-12 18:43:56,687][62635] Updated weights for policy 1, policy_version 70570 (0.0007) [2023-10-12 18:43:57,063][62635] Updated weights for policy 1, policy_version 70580 (0.0008) [2023-10-12 18:43:57,429][62635] Updated weights for policy 1, policy_version 70590 (0.0007) [2023-10-12 18:43:57,637][62634] Updated weights for policy 0, policy_version 70570 (0.0008) [2023-10-12 18:43:58,023][62634] Updated weights for policy 0, policy_version 70580 (0.0010) [2023-10-12 18:43:58,405][62634] Updated weights for policy 0, policy_version 70590 (0.0010) [2023-10-12 18:43:58,435][61643] Fps is (10 sec: 9830.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144539648. Throughput: 0: 1692.8, 1: 1663.8. Samples: 36145778. Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:43:58,436][61643] Avg episode reward: [(0, '23.300'), (1, '9.800')] [2023-10-12 18:44:01,382][62635] Updated weights for policy 1, policy_version 70600 (0.0008) [2023-10-12 18:44:01,748][62635] Updated weights for policy 1, policy_version 70610 (0.0008) [2023-10-12 18:44:02,116][62635] Updated weights for policy 1, policy_version 70620 (0.0011) [2023-10-12 18:44:02,487][62634] Updated weights for policy 0, policy_version 70600 (0.0010) [2023-10-12 18:44:02,864][62634] Updated weights for policy 0, policy_version 70610 (0.0011) [2023-10-12 18:44:03,234][62634] Updated weights for policy 0, policy_version 70620 (0.0009) [2023-10-12 18:44:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 144637952. Throughput: 0: 1667.9, 1: 1678.2. Samples: 36165134. 
Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:44:03,435][61643] Avg episode reward: [(0, '23.220'), (1, '9.890')] [2023-10-12 18:44:06,198][62635] Updated weights for policy 1, policy_version 70630 (0.0009) [2023-10-12 18:44:06,564][62635] Updated weights for policy 1, policy_version 70640 (0.0009) [2023-10-12 18:44:06,933][62635] Updated weights for policy 1, policy_version 70650 (0.0010) [2023-10-12 18:44:07,240][62634] Updated weights for policy 0, policy_version 70630 (0.0009) [2023-10-12 18:44:07,619][62634] Updated weights for policy 0, policy_version 70640 (0.0008) [2023-10-12 18:44:07,991][62634] Updated weights for policy 0, policy_version 70650 (0.0007) [2023-10-12 18:44:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 144703488. Throughput: 0: 1687.3, 1: 1681.3. Samples: 36176076. Policy #0 lag: (min: 5.0, avg: 10.9, max: 37.0) [2023-10-12 18:44:08,436][61643] Avg episode reward: [(0, '23.400'), (1, '9.830')] [2023-10-12 18:44:11,052][62635] Updated weights for policy 1, policy_version 70660 (0.0009) [2023-10-12 18:44:11,418][62635] Updated weights for policy 1, policy_version 70670 (0.0007) [2023-10-12 18:44:11,784][62635] Updated weights for policy 1, policy_version 70680 (0.0007) [2023-10-12 18:44:11,914][62634] Updated weights for policy 0, policy_version 70660 (0.0007) [2023-10-12 18:44:12,287][62634] Updated weights for policy 0, policy_version 70670 (0.0011) [2023-10-12 18:44:12,667][62634] Updated weights for policy 0, policy_version 70680 (0.0011) [2023-10-12 18:44:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 144769024. Throughput: 0: 1682.3, 1: 1660.3. Samples: 36195640. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:13,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.880')] [2023-10-12 18:44:15,706][62635] Updated weights for policy 1, policy_version 70690 (0.0008) [2023-10-12 18:44:16,077][62635] Updated weights for policy 1, policy_version 70700 (0.0008) [2023-10-12 18:44:16,440][62635] Updated weights for policy 1, policy_version 70710 (0.0009) [2023-10-12 18:44:16,809][62634] Updated weights for policy 0, policy_version 70690 (0.0010) [2023-10-12 18:44:16,810][62635] Updated weights for policy 1, policy_version 70720 (0.0009) [2023-10-12 18:44:17,186][62634] Updated weights for policy 0, policy_version 70700 (0.0009) [2023-10-12 18:44:17,566][62634] Updated weights for policy 0, policy_version 70710 (0.0008) [2023-10-12 18:44:17,941][62634] Updated weights for policy 0, policy_version 70720 (0.0007) [2023-10-12 18:44:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144834560. Throughput: 0: 1664.0, 1: 1687.3. Samples: 36215352. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:18,436][61643] Avg episode reward: [(0, '23.580'), (1, '9.870')] [2023-10-12 18:44:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000070720_72417280.pth... [2023-10-12 18:44:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000070720_72417280.pth... 
[2023-10-12 18:44:18,475][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000069152_70811648.pth [2023-10-12 18:44:18,478][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000069152_70811648.pth [2023-10-12 18:44:18,479][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000070720_72417280.pth [2023-10-12 18:44:18,482][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000070720_72417280.pth [2023-10-12 18:44:20,993][62635] Updated weights for policy 1, policy_version 70730 (0.0007) [2023-10-12 18:44:21,367][62635] Updated weights for policy 1, policy_version 70740 (0.0007) [2023-10-12 18:44:21,733][62635] Updated weights for policy 1, policy_version 70750 (0.0008) [2023-10-12 18:44:21,947][62634] Updated weights for policy 0, policy_version 70730 (0.0008) [2023-10-12 18:44:22,322][62634] Updated weights for policy 0, policy_version 70740 (0.0010) [2023-10-12 18:44:22,705][62634] Updated weights for policy 0, policy_version 70750 (0.0010) [2023-10-12 18:44:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144900096. Throughput: 0: 1686.2, 1: 1672.7. Samples: 36226388. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:23,436][61643] Avg episode reward: [(0, '23.810'), (1, '9.880')] [2023-10-12 18:44:25,678][62635] Updated weights for policy 1, policy_version 70760 (0.0007) [2023-10-12 18:44:26,047][62635] Updated weights for policy 1, policy_version 70770 (0.0008) [2023-10-12 18:44:26,412][62635] Updated weights for policy 1, policy_version 70780 (0.0009) [2023-10-12 18:44:26,565][62634] Updated weights for policy 0, policy_version 70760 (0.0009) [2023-10-12 18:44:26,941][62634] Updated weights for policy 0, policy_version 70770 (0.0009) [2023-10-12 18:44:27,327][62634] Updated weights for policy 0, policy_version 70780 (0.0007) [2023-10-12 18:44:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 144965632. Throughput: 0: 1671.3, 1: 1662.4. Samples: 36245518. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:28,435][61643] Avg episode reward: [(0, '24.020'), (1, '9.790')] [2023-10-12 18:44:30,701][62635] Updated weights for policy 1, policy_version 70790 (0.0009) [2023-10-12 18:44:31,064][62635] Updated weights for policy 1, policy_version 70800 (0.0008) [2023-10-12 18:44:31,436][62635] Updated weights for policy 1, policy_version 70810 (0.0007) [2023-10-12 18:44:31,457][62634] Updated weights for policy 0, policy_version 70790 (0.0007) [2023-10-12 18:44:31,841][62634] Updated weights for policy 0, policy_version 70800 (0.0007) [2023-10-12 18:44:32,226][62634] Updated weights for policy 0, policy_version 70810 (0.0008) [2023-10-12 18:44:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 145031168. Throughput: 0: 1672.3, 1: 1678.5. Samples: 36265442. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:33,435][61643] Avg episode reward: [(0, '24.170'), (1, '9.680')] [2023-10-12 18:44:35,502][62635] Updated weights for policy 1, policy_version 70820 (0.0010) [2023-10-12 18:44:35,873][62635] Updated weights for policy 1, policy_version 70830 (0.0010) [2023-10-12 18:44:36,235][62635] Updated weights for policy 1, policy_version 70840 (0.0010) [2023-10-12 18:44:36,455][62634] Updated weights for policy 0, policy_version 70820 (0.0009) [2023-10-12 18:44:36,829][62634] Updated weights for policy 0, policy_version 70830 (0.0009) [2023-10-12 18:44:37,206][62634] Updated weights for policy 0, policy_version 70840 (0.0009) [2023-10-12 18:44:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145096704. Throughput: 0: 1682.5, 1: 1662.9. Samples: 36276288. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:38,436][61643] Avg episode reward: [(0, '24.270'), (1, '9.640')] [2023-10-12 18:44:40,485][62635] Updated weights for policy 1, policy_version 70850 (0.0008) [2023-10-12 18:44:40,852][62635] Updated weights for policy 1, policy_version 70860 (0.0008) [2023-10-12 18:44:41,221][62634] Updated weights for policy 0, policy_version 70850 (0.0010) [2023-10-12 18:44:41,224][62635] Updated weights for policy 1, policy_version 70870 (0.0007) [2023-10-12 18:44:41,596][62635] Updated weights for policy 1, policy_version 70880 (0.0007) [2023-10-12 18:44:41,602][62634] Updated weights for policy 0, policy_version 70860 (0.0007) [2023-10-12 18:44:41,978][62634] Updated weights for policy 0, policy_version 70870 (0.0009) [2023-10-12 18:44:42,360][62634] Updated weights for policy 0, policy_version 70880 (0.0008) [2023-10-12 18:44:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 145162240. Throughput: 0: 1662.1, 1: 1666.0. Samples: 36295540. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:43,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.760')] [2023-10-12 18:44:45,627][62635] Updated weights for policy 1, policy_version 70890 (0.0010) [2023-10-12 18:44:46,002][62635] Updated weights for policy 1, policy_version 70900 (0.0010) [2023-10-12 18:44:46,359][62635] Updated weights for policy 1, policy_version 70910 (0.0011) [2023-10-12 18:44:46,557][62634] Updated weights for policy 0, policy_version 70890 (0.0008) [2023-10-12 18:44:46,936][62634] Updated weights for policy 0, policy_version 70900 (0.0009) [2023-10-12 18:44:47,317][62634] Updated weights for policy 0, policy_version 70910 (0.0007) [2023-10-12 18:44:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145227776. Throughput: 0: 1669.8, 1: 1674.8. Samples: 36315640. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:48,436][61643] Avg episode reward: [(0, '24.520'), (1, '9.590')] [2023-10-12 18:44:50,470][62635] Updated weights for policy 1, policy_version 70920 (0.0009) [2023-10-12 18:44:50,836][62635] Updated weights for policy 1, policy_version 70930 (0.0008) [2023-10-12 18:44:51,207][62635] Updated weights for policy 1, policy_version 70940 (0.0008) [2023-10-12 18:44:51,220][62634] Updated weights for policy 0, policy_version 70920 (0.0007) [2023-10-12 18:44:51,587][62634] Updated weights for policy 0, policy_version 70930 (0.0008) [2023-10-12 18:44:51,970][62634] Updated weights for policy 0, policy_version 70940 (0.0010) [2023-10-12 18:44:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145293312. Throughput: 0: 1684.1, 1: 1658.3. Samples: 36326482. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:53,435][61643] Avg episode reward: [(0, '24.850'), (1, '9.770')] [2023-10-12 18:44:55,270][62635] Updated weights for policy 1, policy_version 70950 (0.0008) [2023-10-12 18:44:55,638][62635] Updated weights for policy 1, policy_version 70960 (0.0009) [2023-10-12 18:44:55,781][62634] Updated weights for policy 0, policy_version 70950 (0.0009) [2023-10-12 18:44:55,999][62635] Updated weights for policy 1, policy_version 70970 (0.0007) [2023-10-12 18:44:56,161][62634] Updated weights for policy 0, policy_version 70960 (0.0009) [2023-10-12 18:44:56,535][62634] Updated weights for policy 0, policy_version 70970 (0.0010) [2023-10-12 18:44:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 145358848. Throughput: 0: 1660.4, 1: 1677.3. Samples: 36345840. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-12 18:44:58,435][61643] Avg episode reward: [(0, '24.650'), (1, '10.080')] [2023-10-12 18:44:59,929][62635] Updated weights for policy 1, policy_version 70980 (0.0009) [2023-10-12 18:45:00,291][62635] Updated weights for policy 1, policy_version 70990 (0.0009) [2023-10-12 18:45:00,662][62635] Updated weights for policy 1, policy_version 71000 (0.0009) [2023-10-12 18:45:00,713][62634] Updated weights for policy 0, policy_version 70980 (0.0008) [2023-10-12 18:45:01,080][62634] Updated weights for policy 0, policy_version 70990 (0.0007) [2023-10-12 18:45:01,458][62634] Updated weights for policy 0, policy_version 71000 (0.0009) [2023-10-12 18:45:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145424384. Throughput: 0: 1680.9, 1: 1678.1. Samples: 36366510. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:03,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.660')] [2023-10-12 18:45:04,676][62635] Updated weights for policy 1, policy_version 71010 (0.0007) [2023-10-12 18:45:05,053][62635] Updated weights for policy 1, policy_version 71020 (0.0008) [2023-10-12 18:45:05,409][62635] Updated weights for policy 1, policy_version 71030 (0.0008) [2023-10-12 18:45:05,548][62634] Updated weights for policy 0, policy_version 71010 (0.0009) [2023-10-12 18:45:05,783][62635] Updated weights for policy 1, policy_version 71040 (0.0008) [2023-10-12 18:45:05,931][62634] Updated weights for policy 0, policy_version 71020 (0.0009) [2023-10-12 18:45:06,303][62634] Updated weights for policy 0, policy_version 71030 (0.0011) [2023-10-12 18:45:06,677][62634] Updated weights for policy 0, policy_version 71040 (0.0009) [2023-10-12 18:45:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145489920. Throughput: 0: 1669.1, 1: 1664.1. Samples: 36376380. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:08,435][61643] Avg episode reward: [(0, '24.980'), (1, '9.870')] [2023-10-12 18:45:09,723][62635] Updated weights for policy 1, policy_version 71050 (0.0007) [2023-10-12 18:45:10,082][62635] Updated weights for policy 1, policy_version 71060 (0.0007) [2023-10-12 18:45:10,451][62635] Updated weights for policy 1, policy_version 71070 (0.0007) [2023-10-12 18:45:10,769][62634] Updated weights for policy 0, policy_version 71050 (0.0010) [2023-10-12 18:45:11,147][62634] Updated weights for policy 0, policy_version 71060 (0.0008) [2023-10-12 18:45:11,521][62634] Updated weights for policy 0, policy_version 71070 (0.0008) [2023-10-12 18:45:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145555456. Throughput: 0: 1664.5, 1: 1689.5. Samples: 36396446. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:13,435][61643] Avg episode reward: [(0, '24.780'), (1, '9.940')] [2023-10-12 18:45:14,531][62635] Updated weights for policy 1, policy_version 71080 (0.0007) [2023-10-12 18:45:14,899][62635] Updated weights for policy 1, policy_version 71090 (0.0008) [2023-10-12 18:45:15,274][62635] Updated weights for policy 1, policy_version 71100 (0.0009) [2023-10-12 18:45:15,437][62634] Updated weights for policy 0, policy_version 71080 (0.0007) [2023-10-12 18:45:15,813][62634] Updated weights for policy 0, policy_version 71090 (0.0009) [2023-10-12 18:45:16,198][62634] Updated weights for policy 0, policy_version 71100 (0.0008) [2023-10-12 18:45:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145620992. Throughput: 0: 1684.4, 1: 1686.5. Samples: 36417134. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:18,435][61643] Avg episode reward: [(0, '24.870'), (1, '9.650')] [2023-10-12 18:45:19,435][62635] Updated weights for policy 1, policy_version 71110 (0.0008) [2023-10-12 18:45:19,803][62635] Updated weights for policy 1, policy_version 71120 (0.0009) [2023-10-12 18:45:20,171][62635] Updated weights for policy 1, policy_version 71130 (0.0007) [2023-10-12 18:45:20,395][62634] Updated weights for policy 0, policy_version 71110 (0.0009) [2023-10-12 18:45:20,768][62634] Updated weights for policy 0, policy_version 71120 (0.0011) [2023-10-12 18:45:21,147][62634] Updated weights for policy 0, policy_version 71130 (0.0009) [2023-10-12 18:45:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145686528. Throughput: 0: 1664.9, 1: 1676.9. Samples: 36426670. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:23,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.580')] [2023-10-12 18:45:24,360][62635] Updated weights for policy 1, policy_version 71140 (0.0007) [2023-10-12 18:45:24,728][62635] Updated weights for policy 1, policy_version 71150 (0.0010) [2023-10-12 18:45:25,114][62635] Updated weights for policy 1, policy_version 71160 (0.0010) [2023-10-12 18:45:25,238][62634] Updated weights for policy 0, policy_version 71140 (0.0009) [2023-10-12 18:45:25,612][62634] Updated weights for policy 0, policy_version 71150 (0.0009) [2023-10-12 18:45:25,992][62634] Updated weights for policy 0, policy_version 71160 (0.0008) [2023-10-12 18:45:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145752064. Throughput: 0: 1672.7, 1: 1693.9. Samples: 36447036. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:28,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.420')] [2023-10-12 18:45:29,089][62635] Updated weights for policy 1, policy_version 71170 (0.0007) [2023-10-12 18:45:29,455][62635] Updated weights for policy 1, policy_version 71180 (0.0008) [2023-10-12 18:45:29,816][62635] Updated weights for policy 1, policy_version 71190 (0.0009) [2023-10-12 18:45:29,937][62634] Updated weights for policy 0, policy_version 71170 (0.0008) [2023-10-12 18:45:30,180][62635] Updated weights for policy 1, policy_version 71200 (0.0008) [2023-10-12 18:45:30,313][62634] Updated weights for policy 0, policy_version 71180 (0.0008) [2023-10-12 18:45:30,696][62634] Updated weights for policy 0, policy_version 71190 (0.0007) [2023-10-12 18:45:31,069][62634] Updated weights for policy 0, policy_version 71200 (0.0010) [2023-10-12 18:45:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 145817600. Throughput: 0: 1683.0, 1: 1695.7. Samples: 36467682. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:33,436][61643] Avg episode reward: [(0, '24.940'), (1, '9.350')] [2023-10-12 18:45:34,307][62635] Updated weights for policy 1, policy_version 71210 (0.0009) [2023-10-12 18:45:34,679][62635] Updated weights for policy 1, policy_version 71220 (0.0009) [2023-10-12 18:45:35,044][62635] Updated weights for policy 1, policy_version 71230 (0.0008) [2023-10-12 18:45:35,172][62634] Updated weights for policy 0, policy_version 71210 (0.0009) [2023-10-12 18:45:35,550][62634] Updated weights for policy 0, policy_version 71220 (0.0009) [2023-10-12 18:45:35,919][62634] Updated weights for policy 0, policy_version 71230 (0.0009) [2023-10-12 18:45:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 145883136. Throughput: 0: 1654.2, 1: 1687.8. Samples: 36476870. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:38,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.420')] [2023-10-12 18:45:39,083][62635] Updated weights for policy 1, policy_version 71240 (0.0010) [2023-10-12 18:45:39,459][62635] Updated weights for policy 1, policy_version 71250 (0.0008) [2023-10-12 18:45:39,821][62635] Updated weights for policy 1, policy_version 71260 (0.0008) [2023-10-12 18:45:39,978][62634] Updated weights for policy 0, policy_version 71240 (0.0007) [2023-10-12 18:45:40,352][62634] Updated weights for policy 0, policy_version 71250 (0.0009) [2023-10-12 18:45:40,730][62634] Updated weights for policy 0, policy_version 71260 (0.0007) [2023-10-12 18:45:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 145948672. Throughput: 0: 1678.1, 1: 1693.1. Samples: 36497546. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:43,436][61643] Avg episode reward: [(0, '24.510'), (1, '9.480')] [2023-10-12 18:45:43,688][62635] Updated weights for policy 1, policy_version 71270 (0.0008) [2023-10-12 18:45:44,043][62635] Updated weights for policy 1, policy_version 71280 (0.0010) [2023-10-12 18:45:44,409][62635] Updated weights for policy 1, policy_version 71290 (0.0011) [2023-10-12 18:45:44,638][62634] Updated weights for policy 0, policy_version 71270 (0.0007) [2023-10-12 18:45:45,009][62634] Updated weights for policy 0, policy_version 71280 (0.0010) [2023-10-12 18:45:45,388][62634] Updated weights for policy 0, policy_version 71290 (0.0010) [2023-10-12 18:45:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146014208. Throughput: 0: 1687.1, 1: 1692.9. Samples: 36518610. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 18:45:48,435][61643] Avg episode reward: [(0, '24.570'), (1, '9.500')] [2023-10-12 18:45:48,508][62635] Updated weights for policy 1, policy_version 71300 (0.0008) [2023-10-12 18:45:48,872][62635] Updated weights for policy 1, policy_version 71310 (0.0011) [2023-10-12 18:45:49,242][62635] Updated weights for policy 1, policy_version 71320 (0.0009) [2023-10-12 18:45:49,383][62634] Updated weights for policy 0, policy_version 71300 (0.0009) [2023-10-12 18:45:49,748][62634] Updated weights for policy 0, policy_version 71310 (0.0008) [2023-10-12 18:45:50,129][62634] Updated weights for policy 0, policy_version 71320 (0.0007) [2023-10-12 18:45:53,223][62635] Updated weights for policy 1, policy_version 71330 (0.0007) [2023-10-12 18:45:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146079744. Throughput: 0: 1669.4, 1: 1693.1. Samples: 36527692. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:45:53,435][61643] Avg episode reward: [(0, '24.590'), (1, '9.690')] [2023-10-12 18:45:53,584][62635] Updated weights for policy 1, policy_version 71340 (0.0009) [2023-10-12 18:45:53,951][62635] Updated weights for policy 1, policy_version 71350 (0.0008) [2023-10-12 18:45:54,299][62634] Updated weights for policy 0, policy_version 71330 (0.0008) [2023-10-12 18:45:54,319][62635] Updated weights for policy 1, policy_version 71360 (0.0008) [2023-10-12 18:45:54,670][62634] Updated weights for policy 0, policy_version 71340 (0.0008) [2023-10-12 18:45:55,045][62634] Updated weights for policy 0, policy_version 71350 (0.0008) [2023-10-12 18:45:55,425][62634] Updated weights for policy 0, policy_version 71360 (0.0010) [2023-10-12 18:45:58,344][62635] Updated weights for policy 1, policy_version 71370 (0.0012) [2023-10-12 18:45:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146145280. Throughput: 0: 1690.8, 1: 1688.6. Samples: 36548520. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:45:58,435][61643] Avg episode reward: [(0, '24.590'), (1, '9.580')] [2023-10-12 18:45:58,715][62635] Updated weights for policy 1, policy_version 71380 (0.0008) [2023-10-12 18:45:59,077][62635] Updated weights for policy 1, policy_version 71390 (0.0008) [2023-10-12 18:45:59,422][62634] Updated weights for policy 0, policy_version 71370 (0.0008) [2023-10-12 18:45:59,807][62634] Updated weights for policy 0, policy_version 71380 (0.0007) [2023-10-12 18:46:00,177][62634] Updated weights for policy 0, policy_version 71390 (0.0007) [2023-10-12 18:46:03,165][62635] Updated weights for policy 1, policy_version 71400 (0.0007) [2023-10-12 18:46:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146210816. Throughput: 0: 1681.4, 1: 1693.0. Samples: 36568980. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:03,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.700')] [2023-10-12 18:46:03,527][62635] Updated weights for policy 1, policy_version 71410 (0.0010) [2023-10-12 18:46:03,894][62635] Updated weights for policy 1, policy_version 71420 (0.0009) [2023-10-12 18:46:04,264][62634] Updated weights for policy 0, policy_version 71400 (0.0007) [2023-10-12 18:46:04,651][62634] Updated weights for policy 0, policy_version 71410 (0.0008) [2023-10-12 18:46:05,034][62634] Updated weights for policy 0, policy_version 71420 (0.0009) [2023-10-12 18:46:07,975][62635] Updated weights for policy 1, policy_version 71430 (0.0010) [2023-10-12 18:46:08,340][62635] Updated weights for policy 1, policy_version 71440 (0.0009) [2023-10-12 18:46:08,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 146276352. Throughput: 0: 1677.5, 1: 1693.1. Samples: 36578346. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:08,436][61643] Avg episode reward: [(0, '24.530'), (1, '9.700')] [2023-10-12 18:46:08,704][62635] Updated weights for policy 1, policy_version 71450 (0.0010) [2023-10-12 18:46:09,187][62634] Updated weights for policy 0, policy_version 71430 (0.0009) [2023-10-12 18:46:09,570][62634] Updated weights for policy 0, policy_version 71440 (0.0008) [2023-10-12 18:46:09,956][62634] Updated weights for policy 0, policy_version 71450 (0.0010) [2023-10-12 18:46:12,906][62635] Updated weights for policy 1, policy_version 71460 (0.0009) [2023-10-12 18:46:13,281][62635] Updated weights for policy 1, policy_version 71470 (0.0009) [2023-10-12 18:46:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146341888. Throughput: 0: 1683.3, 1: 1690.8. Samples: 36598870. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:13,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.780')] [2023-10-12 18:46:13,654][62635] Updated weights for policy 1, policy_version 71480 (0.0010) [2023-10-12 18:46:13,936][62634] Updated weights for policy 0, policy_version 71460 (0.0010) [2023-10-12 18:46:14,311][62634] Updated weights for policy 0, policy_version 71470 (0.0007) [2023-10-12 18:46:14,694][62634] Updated weights for policy 0, policy_version 71480 (0.0008) [2023-10-12 18:46:17,607][62635] Updated weights for policy 1, policy_version 71490 (0.0008) [2023-10-12 18:46:17,973][62635] Updated weights for policy 1, policy_version 71500 (0.0008) [2023-10-12 18:46:18,350][62635] Updated weights for policy 1, policy_version 71510 (0.0007) [2023-10-12 18:46:18,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146407424. Throughput: 0: 1688.4, 1: 1677.8. Samples: 36619160. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:18,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.590')] [2023-10-12 18:46:18,442][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000071488_73203712.pth... [2023-10-12 18:46:18,476][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000069920_71598080.pth [2023-10-12 18:46:18,706][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000071520_73236480.pth... 
[2023-10-12 18:46:18,707][62635] Updated weights for policy 1, policy_version 71520 (0.0007) [2023-10-12 18:46:18,736][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000069920_71598080.pth [2023-10-12 18:46:18,757][62634] Updated weights for policy 0, policy_version 71490 (0.0008) [2023-10-12 18:46:19,127][62634] Updated weights for policy 0, policy_version 71500 (0.0010) [2023-10-12 18:46:19,509][62634] Updated weights for policy 0, policy_version 71510 (0.0010) [2023-10-12 18:46:19,885][62634] Updated weights for policy 0, policy_version 71520 (0.0008) [2023-10-12 18:46:22,889][62635] Updated weights for policy 1, policy_version 71530 (0.0007) [2023-10-12 18:46:23,253][62635] Updated weights for policy 1, policy_version 71540 (0.0007) [2023-10-12 18:46:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146472960. Throughput: 0: 1684.7, 1: 1689.0. Samples: 36628684. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:23,435][61643] Avg episode reward: [(0, '24.590'), (1, '9.680')] [2023-10-12 18:46:23,628][62635] Updated weights for policy 1, policy_version 71550 (0.0008) [2023-10-12 18:46:23,877][62634] Updated weights for policy 0, policy_version 71530 (0.0007) [2023-10-12 18:46:24,255][62634] Updated weights for policy 0, policy_version 71540 (0.0007) [2023-10-12 18:46:24,635][62634] Updated weights for policy 0, policy_version 71550 (0.0007) [2023-10-12 18:46:27,589][62635] Updated weights for policy 1, policy_version 71560 (0.0007) [2023-10-12 18:46:27,956][62635] Updated weights for policy 1, policy_version 71570 (0.0007) [2023-10-12 18:46:28,332][62635] Updated weights for policy 1, policy_version 71580 (0.0008) [2023-10-12 18:46:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 146538496. Throughput: 0: 1688.7, 1: 1690.3. Samples: 36649600. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:28,436][61643] Avg episode reward: [(0, '24.780'), (1, '9.610')] [2023-10-12 18:46:28,534][62634] Updated weights for policy 0, policy_version 71560 (0.0007) [2023-10-12 18:46:28,907][62634] Updated weights for policy 0, policy_version 71570 (0.0009) [2023-10-12 18:46:29,290][62634] Updated weights for policy 0, policy_version 71580 (0.0010) [2023-10-12 18:46:32,434][62635] Updated weights for policy 1, policy_version 71590 (0.0008) [2023-10-12 18:46:32,807][62635] Updated weights for policy 1, policy_version 71600 (0.0007) [2023-10-12 18:46:33,180][62635] Updated weights for policy 1, policy_version 71610 (0.0009) [2023-10-12 18:46:33,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146636800. Throughput: 0: 1684.4, 1: 1671.9. Samples: 36669644. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:33,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.660')] [2023-10-12 18:46:33,455][62634] Updated weights for policy 0, policy_version 71590 (0.0008) [2023-10-12 18:46:33,832][62634] Updated weights for policy 0, policy_version 71600 (0.0008) [2023-10-12 18:46:34,211][62634] Updated weights for policy 0, policy_version 71610 (0.0007) [2023-10-12 18:46:37,263][62635] Updated weights for policy 1, policy_version 71620 (0.0008) [2023-10-12 18:46:37,635][62635] Updated weights for policy 1, policy_version 71630 (0.0008) [2023-10-12 18:46:37,993][62635] Updated weights for policy 1, policy_version 71640 (0.0008) [2023-10-12 18:46:38,209][62634] Updated weights for policy 0, policy_version 71620 (0.0007) [2023-10-12 18:46:38,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146702336. Throughput: 0: 1686.6, 1: 1692.0. Samples: 36679730. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 18:46:38,435][61643] Avg episode reward: [(0, '24.600'), (1, '9.870')] [2023-10-12 18:46:38,577][62634] Updated weights for policy 0, policy_version 71630 (0.0008) [2023-10-12 18:46:38,953][62634] Updated weights for policy 0, policy_version 71640 (0.0010) [2023-10-12 18:46:42,026][62635] Updated weights for policy 1, policy_version 71650 (0.0007) [2023-10-12 18:46:42,390][62635] Updated weights for policy 1, policy_version 71660 (0.0008) [2023-10-12 18:46:42,762][62635] Updated weights for policy 1, policy_version 71670 (0.0009) [2023-10-12 18:46:43,022][62634] Updated weights for policy 0, policy_version 71650 (0.0010) [2023-10-12 18:46:43,135][62635] Updated weights for policy 1, policy_version 71680 (0.0008) [2023-10-12 18:46:43,399][62634] Updated weights for policy 0, policy_version 71660 (0.0008) [2023-10-12 18:46:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146767872. Throughput: 0: 1683.9, 1: 1687.9. Samples: 36700248. Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:46:43,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.780')] [2023-10-12 18:46:43,779][62634] Updated weights for policy 0, policy_version 71670 (0.0008) [2023-10-12 18:46:44,154][62634] Updated weights for policy 0, policy_version 71680 (0.0010) [2023-10-12 18:46:47,319][62635] Updated weights for policy 1, policy_version 71690 (0.0008) [2023-10-12 18:46:47,675][62635] Updated weights for policy 1, policy_version 71700 (0.0009) [2023-10-12 18:46:48,049][62635] Updated weights for policy 1, policy_version 71710 (0.0007) [2023-10-12 18:46:48,133][62634] Updated weights for policy 0, policy_version 71690 (0.0009) [2023-10-12 18:46:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146833408. Throughput: 0: 1687.6, 1: 1663.5. Samples: 36719782. 
Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:46:48,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.610')] [2023-10-12 18:46:48,518][62634] Updated weights for policy 0, policy_version 71700 (0.0008) [2023-10-12 18:46:48,897][62634] Updated weights for policy 0, policy_version 71710 (0.0008) [2023-10-12 18:46:52,067][62635] Updated weights for policy 1, policy_version 71720 (0.0008) [2023-10-12 18:46:52,440][62635] Updated weights for policy 1, policy_version 71730 (0.0010) [2023-10-12 18:46:52,800][62635] Updated weights for policy 1, policy_version 71740 (0.0008) [2023-10-12 18:46:52,913][62634] Updated weights for policy 0, policy_version 71720 (0.0008) [2023-10-12 18:46:53,293][62634] Updated weights for policy 0, policy_version 71730 (0.0007) [2023-10-12 18:46:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146898944. Throughput: 0: 1688.1, 1: 1686.7. Samples: 36730212. Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:46:53,435][61643] Avg episode reward: [(0, '24.600'), (1, '9.880')] [2023-10-12 18:46:53,677][62634] Updated weights for policy 0, policy_version 71740 (0.0008) [2023-10-12 18:46:56,904][62635] Updated weights for policy 1, policy_version 71750 (0.0008) [2023-10-12 18:46:57,269][62635] Updated weights for policy 1, policy_version 71760 (0.0009) [2023-10-12 18:46:57,577][62634] Updated weights for policy 0, policy_version 71750 (0.0008) [2023-10-12 18:46:57,638][62635] Updated weights for policy 1, policy_version 71770 (0.0008) [2023-10-12 18:46:57,945][62634] Updated weights for policy 0, policy_version 71760 (0.0009) [2023-10-12 18:46:58,317][62634] Updated weights for policy 0, policy_version 71770 (0.0010) [2023-10-12 18:46:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146964480. Throughput: 0: 1696.5, 1: 1678.3. Samples: 36750734. 
Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:46:58,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.950')] [2023-10-12 18:47:01,596][62635] Updated weights for policy 1, policy_version 71780 (0.0007) [2023-10-12 18:47:01,959][62635] Updated weights for policy 1, policy_version 71790 (0.0009) [2023-10-12 18:47:02,327][62635] Updated weights for policy 1, policy_version 71800 (0.0007) [2023-10-12 18:47:02,427][62634] Updated weights for policy 0, policy_version 71780 (0.0007) [2023-10-12 18:47:02,810][62634] Updated weights for policy 0, policy_version 71790 (0.0007) [2023-10-12 18:47:03,185][62634] Updated weights for policy 0, policy_version 71800 (0.0008) [2023-10-12 18:47:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147030016. Throughput: 0: 1681.4, 1: 1669.7. Samples: 36769962. Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:47:03,435][61643] Avg episode reward: [(0, '24.450'), (1, '9.780')] [2023-10-12 18:47:06,426][62635] Updated weights for policy 1, policy_version 71810 (0.0007) [2023-10-12 18:47:06,791][62635] Updated weights for policy 1, policy_version 71820 (0.0007) [2023-10-12 18:47:07,097][62634] Updated weights for policy 0, policy_version 71810 (0.0009) [2023-10-12 18:47:07,158][62635] Updated weights for policy 1, policy_version 71830 (0.0010) [2023-10-12 18:47:07,468][62634] Updated weights for policy 0, policy_version 71820 (0.0008) [2023-10-12 18:47:07,524][62635] Updated weights for policy 1, policy_version 71840 (0.0009) [2023-10-12 18:47:07,843][62634] Updated weights for policy 0, policy_version 71830 (0.0007) [2023-10-12 18:47:08,211][62634] Updated weights for policy 0, policy_version 71840 (0.0007) [2023-10-12 18:47:08,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 147128320. Throughput: 0: 1700.0, 1: 1687.5. Samples: 36781122. 
Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:47:08,435][61643] Avg episode reward: [(0, '24.650'), (1, '10.020')] [2023-10-12 18:47:11,552][62635] Updated weights for policy 1, policy_version 71850 (0.0008) [2023-10-12 18:47:11,925][62635] Updated weights for policy 1, policy_version 71860 (0.0010) [2023-10-12 18:47:12,302][62635] Updated weights for policy 1, policy_version 71870 (0.0008) [2023-10-12 18:47:12,563][62634] Updated weights for policy 0, policy_version 71850 (0.0008) [2023-10-12 18:47:12,937][62634] Updated weights for policy 0, policy_version 71860 (0.0011) [2023-10-12 18:47:13,322][62634] Updated weights for policy 0, policy_version 71870 (0.0009) [2023-10-12 18:47:13,435][61643] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 147193856. Throughput: 0: 1698.1, 1: 1665.7. Samples: 36800968. Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:47:13,435][61643] Avg episode reward: [(0, '24.700'), (1, '10.110')] [2023-10-12 18:47:16,233][62635] Updated weights for policy 1, policy_version 71880 (0.0008) [2023-10-12 18:47:16,596][62635] Updated weights for policy 1, policy_version 71890 (0.0009) [2023-10-12 18:47:16,968][62635] Updated weights for policy 1, policy_version 71900 (0.0009) [2023-10-12 18:47:17,100][62634] Updated weights for policy 0, policy_version 71880 (0.0009) [2023-10-12 18:47:17,474][62634] Updated weights for policy 0, policy_version 71890 (0.0009) [2023-10-12 18:47:17,854][62634] Updated weights for policy 0, policy_version 71900 (0.0008) [2023-10-12 18:47:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 147259392. Throughput: 0: 1671.9, 1: 1674.1. Samples: 36820214. 
Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:47:18,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.900')] [2023-10-12 18:47:20,846][62635] Updated weights for policy 1, policy_version 71910 (0.0009) [2023-10-12 18:47:21,220][62635] Updated weights for policy 1, policy_version 71920 (0.0008) [2023-10-12 18:47:21,582][62635] Updated weights for policy 1, policy_version 71930 (0.0010) [2023-10-12 18:47:22,133][62634] Updated weights for policy 0, policy_version 71910 (0.0008) [2023-10-12 18:47:22,502][62634] Updated weights for policy 0, policy_version 71920 (0.0007) [2023-10-12 18:47:22,883][62634] Updated weights for policy 0, policy_version 71930 (0.0007) [2023-10-12 18:47:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 147324928. Throughput: 0: 1695.6, 1: 1674.6. Samples: 36831388. Policy #0 lag: (min: 31.0, avg: 41.2, max: 63.0) [2023-10-12 18:47:23,436][61643] Avg episode reward: [(0, '24.900'), (1, '9.880')] [2023-10-12 18:47:25,738][62635] Updated weights for policy 1, policy_version 71940 (0.0009) [2023-10-12 18:47:26,094][62635] Updated weights for policy 1, policy_version 71950 (0.0009) [2023-10-12 18:47:26,464][62635] Updated weights for policy 1, policy_version 71960 (0.0010) [2023-10-12 18:47:26,958][62634] Updated weights for policy 0, policy_version 71940 (0.0008) [2023-10-12 18:47:27,343][62634] Updated weights for policy 0, policy_version 71950 (0.0008) [2023-10-12 18:47:27,725][62634] Updated weights for policy 0, policy_version 71960 (0.0010) [2023-10-12 18:47:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 147390464. Throughput: 0: 1695.1, 1: 1659.4. Samples: 36851202. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:28,435][61643] Avg episode reward: [(0, '24.900'), (1, '10.060')] [2023-10-12 18:47:30,484][62635] Updated weights for policy 1, policy_version 71970 (0.0010) [2023-10-12 18:47:30,859][62635] Updated weights for policy 1, policy_version 71980 (0.0009) [2023-10-12 18:47:31,217][62635] Updated weights for policy 1, policy_version 71990 (0.0007) [2023-10-12 18:47:31,581][62635] Updated weights for policy 1, policy_version 72000 (0.0010) [2023-10-12 18:47:31,739][62634] Updated weights for policy 0, policy_version 71970 (0.0010) [2023-10-12 18:47:32,109][62634] Updated weights for policy 0, policy_version 71980 (0.0010) [2023-10-12 18:47:32,494][62634] Updated weights for policy 0, policy_version 71990 (0.0008) [2023-10-12 18:47:32,860][62634] Updated weights for policy 0, policy_version 72000 (0.0009) [2023-10-12 18:47:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147456000. Throughput: 0: 1668.4, 1: 1686.7. Samples: 36870762. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:33,436][61643] Avg episode reward: [(0, '24.890'), (1, '9.970')] [2023-10-12 18:47:35,790][62635] Updated weights for policy 1, policy_version 72010 (0.0009) [2023-10-12 18:47:36,154][62635] Updated weights for policy 1, policy_version 72020 (0.0007) [2023-10-12 18:47:36,516][62635] Updated weights for policy 1, policy_version 72030 (0.0008) [2023-10-12 18:47:36,778][62634] Updated weights for policy 0, policy_version 72010 (0.0008) [2023-10-12 18:47:37,152][62634] Updated weights for policy 0, policy_version 72020 (0.0009) [2023-10-12 18:47:37,538][62634] Updated weights for policy 0, policy_version 72030 (0.0008) [2023-10-12 18:47:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147521536. Throughput: 0: 1693.9, 1: 1675.4. Samples: 36881830. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:38,436][61643] Avg episode reward: [(0, '24.830'), (1, '9.880')] [2023-10-12 18:47:40,574][62635] Updated weights for policy 1, policy_version 72040 (0.0009) [2023-10-12 18:47:40,939][62635] Updated weights for policy 1, policy_version 72050 (0.0008) [2023-10-12 18:47:41,306][62635] Updated weights for policy 1, policy_version 72060 (0.0008) [2023-10-12 18:47:41,411][62634] Updated weights for policy 0, policy_version 72040 (0.0009) [2023-10-12 18:47:41,798][62634] Updated weights for policy 0, policy_version 72050 (0.0011) [2023-10-12 18:47:42,184][62634] Updated weights for policy 0, policy_version 72060 (0.0011) [2023-10-12 18:47:43,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 147587072. Throughput: 0: 1672.5, 1: 1672.2. Samples: 36901246. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:43,437][61643] Avg episode reward: [(0, '24.790'), (1, '9.910')] [2023-10-12 18:47:45,289][62635] Updated weights for policy 1, policy_version 72070 (0.0008) [2023-10-12 18:47:45,652][62635] Updated weights for policy 1, policy_version 72080 (0.0008) [2023-10-12 18:47:46,015][62635] Updated weights for policy 1, policy_version 72090 (0.0007) [2023-10-12 18:47:46,301][62634] Updated weights for policy 0, policy_version 72070 (0.0007) [2023-10-12 18:47:46,688][62634] Updated weights for policy 0, policy_version 72080 (0.0009) [2023-10-12 18:47:47,059][62634] Updated weights for policy 0, policy_version 72090 (0.0009) [2023-10-12 18:47:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147652608. Throughput: 0: 1680.5, 1: 1692.8. Samples: 36921760. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:48,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.660')] [2023-10-12 18:47:50,175][62635] Updated weights for policy 1, policy_version 72100 (0.0008) [2023-10-12 18:47:50,554][62635] Updated weights for policy 1, policy_version 72110 (0.0007) [2023-10-12 18:47:50,926][62635] Updated weights for policy 1, policy_version 72120 (0.0007) [2023-10-12 18:47:51,009][62634] Updated weights for policy 0, policy_version 72100 (0.0008) [2023-10-12 18:47:51,385][62634] Updated weights for policy 0, policy_version 72110 (0.0007) [2023-10-12 18:47:51,758][62634] Updated weights for policy 0, policy_version 72120 (0.0010) [2023-10-12 18:47:53,435][61643] Fps is (10 sec: 13107.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147718144. Throughput: 0: 1691.9, 1: 1669.6. Samples: 36932386. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:53,435][61643] Avg episode reward: [(0, '24.580'), (1, '9.900')] [2023-10-12 18:47:54,977][62635] Updated weights for policy 1, policy_version 72130 (0.0008) [2023-10-12 18:47:55,343][62635] Updated weights for policy 1, policy_version 72140 (0.0011) [2023-10-12 18:47:55,708][62635] Updated weights for policy 1, policy_version 72150 (0.0010) [2023-10-12 18:47:56,035][62634] Updated weights for policy 0, policy_version 72130 (0.0009) [2023-10-12 18:47:56,069][62635] Updated weights for policy 1, policy_version 72160 (0.0010) [2023-10-12 18:47:56,418][62634] Updated weights for policy 0, policy_version 72140 (0.0010) [2023-10-12 18:47:56,802][62634] Updated weights for policy 0, policy_version 72150 (0.0007) [2023-10-12 18:47:57,179][62634] Updated weights for policy 0, policy_version 72160 (0.0008) [2023-10-12 18:47:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147783680. Throughput: 0: 1668.7, 1: 1683.9. Samples: 36951836. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:47:58,436][61643] Avg episode reward: [(0, '24.740'), (1, '9.930')] [2023-10-12 18:48:00,262][62635] Updated weights for policy 1, policy_version 72170 (0.0009) [2023-10-12 18:48:00,640][62635] Updated weights for policy 1, policy_version 72180 (0.0008) [2023-10-12 18:48:01,015][62635] Updated weights for policy 1, policy_version 72190 (0.0009) [2023-10-12 18:48:01,283][62634] Updated weights for policy 0, policy_version 72170 (0.0009) [2023-10-12 18:48:01,673][62634] Updated weights for policy 0, policy_version 72180 (0.0008) [2023-10-12 18:48:02,048][62634] Updated weights for policy 0, policy_version 72190 (0.0007) [2023-10-12 18:48:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 147849216. Throughput: 0: 1691.1, 1: 1690.4. Samples: 36972378. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:48:03,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.640')] [2023-10-12 18:48:05,133][62635] Updated weights for policy 1, policy_version 72200 (0.0007) [2023-10-12 18:48:05,503][62635] Updated weights for policy 1, policy_version 72210 (0.0007) [2023-10-12 18:48:05,863][62635] Updated weights for policy 1, policy_version 72220 (0.0007) [2023-10-12 18:48:05,949][62634] Updated weights for policy 0, policy_version 72200 (0.0008) [2023-10-12 18:48:06,330][62634] Updated weights for policy 0, policy_version 72210 (0.0008) [2023-10-12 18:48:06,699][62634] Updated weights for policy 0, policy_version 72220 (0.0008) [2023-10-12 18:48:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 147914752. Throughput: 0: 1689.3, 1: 1671.3. Samples: 36982614. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:48:08,435][61643] Avg episode reward: [(0, '23.980'), (1, '9.900')] [2023-10-12 18:48:09,791][62635] Updated weights for policy 1, policy_version 72230 (0.0009) [2023-10-12 18:48:10,159][62635] Updated weights for policy 1, policy_version 72240 (0.0009) [2023-10-12 18:48:10,526][62635] Updated weights for policy 1, policy_version 72250 (0.0008) [2023-10-12 18:48:10,621][62634] Updated weights for policy 0, policy_version 72230 (0.0010) [2023-10-12 18:48:11,002][62634] Updated weights for policy 0, policy_version 72240 (0.0009) [2023-10-12 18:48:11,374][62634] Updated weights for policy 0, policy_version 72250 (0.0009) [2023-10-12 18:48:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147980288. Throughput: 0: 1667.3, 1: 1690.1. Samples: 37002284. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-12 18:48:13,436][61643] Avg episode reward: [(0, '24.200'), (1, '9.940')] [2023-10-12 18:48:14,653][62635] Updated weights for policy 1, policy_version 72260 (0.0007) [2023-10-12 18:48:15,020][62635] Updated weights for policy 1, policy_version 72270 (0.0009) [2023-10-12 18:48:15,386][62635] Updated weights for policy 1, policy_version 72280 (0.0007) [2023-10-12 18:48:15,492][62634] Updated weights for policy 0, policy_version 72260 (0.0010) [2023-10-12 18:48:15,874][62634] Updated weights for policy 0, policy_version 72270 (0.0009) [2023-10-12 18:48:16,251][62634] Updated weights for policy 0, policy_version 72280 (0.0008) [2023-10-12 18:48:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148045824. Throughput: 0: 1691.5, 1: 1687.2. Samples: 37022802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:18,436][61643] Avg episode reward: [(0, '24.020'), (1, '9.550')] [2023-10-12 18:48:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000072288_74022912.pth... 
[2023-10-12 18:48:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000072288_74022912.pth... [2023-10-12 18:48:18,480][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000070720_72417280.pth [2023-10-12 18:48:18,483][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000070720_72417280.pth [2023-10-12 18:48:19,459][62635] Updated weights for policy 1, policy_version 72290 (0.0007) [2023-10-12 18:48:19,834][62635] Updated weights for policy 1, policy_version 72300 (0.0008) [2023-10-12 18:48:20,194][62635] Updated weights for policy 1, policy_version 72310 (0.0009) [2023-10-12 18:48:20,344][62634] Updated weights for policy 0, policy_version 72290 (0.0008) [2023-10-12 18:48:20,564][62635] Updated weights for policy 1, policy_version 72320 (0.0009) [2023-10-12 18:48:20,707][62634] Updated weights for policy 0, policy_version 72300 (0.0008) [2023-10-12 18:48:21,085][62634] Updated weights for policy 0, policy_version 72310 (0.0008) [2023-10-12 18:48:21,453][62634] Updated weights for policy 0, policy_version 72320 (0.0010) [2023-10-12 18:48:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148111360. Throughput: 0: 1674.1, 1: 1673.6. Samples: 37032478. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:23,436][61643] Avg episode reward: [(0, '24.060'), (1, '9.900')] [2023-10-12 18:48:24,486][62635] Updated weights for policy 1, policy_version 72330 (0.0008) [2023-10-12 18:48:24,854][62635] Updated weights for policy 1, policy_version 72340 (0.0008) [2023-10-12 18:48:25,224][62635] Updated weights for policy 1, policy_version 72350 (0.0008) [2023-10-12 18:48:25,591][62634] Updated weights for policy 0, policy_version 72330 (0.0007) [2023-10-12 18:48:25,968][62634] Updated weights for policy 0, policy_version 72340 (0.0007) [2023-10-12 18:48:26,345][62634] Updated weights for policy 0, policy_version 72350 (0.0009) [2023-10-12 18:48:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148176896. Throughput: 0: 1677.0, 1: 1688.8. Samples: 37052708. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:28,435][61643] Avg episode reward: [(0, '24.010'), (1, '9.900')] [2023-10-12 18:48:29,222][62635] Updated weights for policy 1, policy_version 72360 (0.0007) [2023-10-12 18:48:29,588][62635] Updated weights for policy 1, policy_version 72370 (0.0009) [2023-10-12 18:48:29,954][62635] Updated weights for policy 1, policy_version 72380 (0.0007) [2023-10-12 18:48:30,496][62634] Updated weights for policy 0, policy_version 72360 (0.0009) [2023-10-12 18:48:30,880][62634] Updated weights for policy 0, policy_version 72370 (0.0009) [2023-10-12 18:48:31,261][62634] Updated weights for policy 0, policy_version 72380 (0.0008) [2023-10-12 18:48:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148242432. Throughput: 0: 1683.4, 1: 1695.0. Samples: 37073790. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:33,436][61643] Avg episode reward: [(0, '23.460'), (1, '9.800')] [2023-10-12 18:48:33,952][62635] Updated weights for policy 1, policy_version 72390 (0.0009) [2023-10-12 18:48:34,330][62635] Updated weights for policy 1, policy_version 72400 (0.0008) [2023-10-12 18:48:34,702][62635] Updated weights for policy 1, policy_version 72410 (0.0009) [2023-10-12 18:48:35,292][62634] Updated weights for policy 0, policy_version 72390 (0.0008) [2023-10-12 18:48:35,680][62634] Updated weights for policy 0, policy_version 72400 (0.0010) [2023-10-12 18:48:36,061][62634] Updated weights for policy 0, policy_version 72410 (0.0010) [2023-10-12 18:48:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148307968. Throughput: 0: 1666.1, 1: 1691.9. Samples: 37083494. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:38,435][61643] Avg episode reward: [(0, '23.560'), (1, '9.980')] [2023-10-12 18:48:38,719][62635] Updated weights for policy 1, policy_version 72420 (0.0008) [2023-10-12 18:48:39,088][62635] Updated weights for policy 1, policy_version 72430 (0.0008) [2023-10-12 18:48:39,450][62635] Updated weights for policy 1, policy_version 72440 (0.0008) [2023-10-12 18:48:39,915][62634] Updated weights for policy 0, policy_version 72420 (0.0010) [2023-10-12 18:48:40,295][62634] Updated weights for policy 0, policy_version 72430 (0.0011) [2023-10-12 18:48:40,676][62634] Updated weights for policy 0, policy_version 72440 (0.0009) [2023-10-12 18:48:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 148373504. Throughput: 0: 1679.4, 1: 1701.5. Samples: 37103976. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:43,436][61643] Avg episode reward: [(0, '23.400'), (1, '10.210')] [2023-10-12 18:48:43,581][62635] Updated weights for policy 1, policy_version 72450 (0.0009) [2023-10-12 18:48:43,937][62635] Updated weights for policy 1, policy_version 72460 (0.0007) [2023-10-12 18:48:44,298][62635] Updated weights for policy 1, policy_version 72470 (0.0007) [2023-10-12 18:48:44,664][62635] Updated weights for policy 1, policy_version 72480 (0.0007) [2023-10-12 18:48:44,708][62634] Updated weights for policy 0, policy_version 72450 (0.0009) [2023-10-12 18:48:45,085][62634] Updated weights for policy 0, policy_version 72460 (0.0007) [2023-10-12 18:48:45,451][62634] Updated weights for policy 0, policy_version 72470 (0.0008) [2023-10-12 18:48:45,830][62634] Updated weights for policy 0, policy_version 72480 (0.0011) [2023-10-12 18:48:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148439040. Throughput: 0: 1678.5, 1: 1703.6. Samples: 37124574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:48,435][61643] Avg episode reward: [(0, '23.240'), (1, '10.040')] [2023-10-12 18:48:48,583][62635] Updated weights for policy 1, policy_version 72490 (0.0008) [2023-10-12 18:48:48,952][62635] Updated weights for policy 1, policy_version 72500 (0.0008) [2023-10-12 18:48:49,323][62635] Updated weights for policy 1, policy_version 72510 (0.0009) [2023-10-12 18:48:49,991][62634] Updated weights for policy 0, policy_version 72490 (0.0008) [2023-10-12 18:48:50,369][62634] Updated weights for policy 0, policy_version 72500 (0.0007) [2023-10-12 18:48:50,750][62634] Updated weights for policy 0, policy_version 72510 (0.0007) [2023-10-12 18:48:53,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148504576. Throughput: 0: 1654.5, 1: 1698.7. Samples: 37133508. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:53,435][61643] Avg episode reward: [(0, '23.150'), (1, '9.970')] [2023-10-12 18:48:53,491][62635] Updated weights for policy 1, policy_version 72520 (0.0009) [2023-10-12 18:48:53,850][62635] Updated weights for policy 1, policy_version 72530 (0.0010) [2023-10-12 18:48:54,224][62635] Updated weights for policy 1, policy_version 72540 (0.0008) [2023-10-12 18:48:54,974][62634] Updated weights for policy 0, policy_version 72520 (0.0007) [2023-10-12 18:48:55,355][62634] Updated weights for policy 0, policy_version 72530 (0.0010) [2023-10-12 18:48:55,727][62634] Updated weights for policy 0, policy_version 72540 (0.0009) [2023-10-12 18:48:58,280][62635] Updated weights for policy 1, policy_version 72550 (0.0008) [2023-10-12 18:48:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148570112. Throughput: 0: 1675.4, 1: 1699.6. Samples: 37154160. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:48:58,435][61643] Avg episode reward: [(0, '23.490'), (1, '10.240')] [2023-10-12 18:48:58,645][62635] Updated weights for policy 1, policy_version 72560 (0.0009) [2023-10-12 18:48:59,019][62635] Updated weights for policy 1, policy_version 72570 (0.0010) [2023-10-12 18:48:59,765][62634] Updated weights for policy 0, policy_version 72550 (0.0009) [2023-10-12 18:49:00,138][62634] Updated weights for policy 0, policy_version 72560 (0.0007) [2023-10-12 18:49:00,521][62634] Updated weights for policy 0, policy_version 72570 (0.0008) [2023-10-12 18:49:03,045][62635] Updated weights for policy 1, policy_version 72580 (0.0010) [2023-10-12 18:49:03,415][62635] Updated weights for policy 1, policy_version 72590 (0.0007) [2023-10-12 18:49:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148635648. Throughput: 0: 1678.2, 1: 1697.6. Samples: 37174710. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:49:03,435][61643] Avg episode reward: [(0, '23.460'), (1, '9.970')] [2023-10-12 18:49:03,773][62635] Updated weights for policy 1, policy_version 72600 (0.0011) [2023-10-12 18:49:04,657][62634] Updated weights for policy 0, policy_version 72580 (0.0009) [2023-10-12 18:49:05,032][62634] Updated weights for policy 0, policy_version 72590 (0.0008) [2023-10-12 18:49:05,412][62634] Updated weights for policy 0, policy_version 72600 (0.0007) [2023-10-12 18:49:07,925][62635] Updated weights for policy 1, policy_version 72610 (0.0009) [2023-10-12 18:49:08,297][62635] Updated weights for policy 1, policy_version 72620 (0.0007) [2023-10-12 18:49:08,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 148701184. Throughput: 0: 1667.3, 1: 1699.4. Samples: 37183978. Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:08,436][61643] Avg episode reward: [(0, '23.340'), (1, '10.000')] [2023-10-12 18:49:08,670][62635] Updated weights for policy 1, policy_version 72630 (0.0007) [2023-10-12 18:49:09,039][62635] Updated weights for policy 1, policy_version 72640 (0.0008) [2023-10-12 18:49:09,378][62634] Updated weights for policy 0, policy_version 72610 (0.0009) [2023-10-12 18:49:09,753][62634] Updated weights for policy 0, policy_version 72620 (0.0009) [2023-10-12 18:49:10,127][62634] Updated weights for policy 0, policy_version 72630 (0.0008) [2023-10-12 18:49:10,508][62634] Updated weights for policy 0, policy_version 72640 (0.0009) [2023-10-12 18:49:12,983][62635] Updated weights for policy 1, policy_version 72650 (0.0007) [2023-10-12 18:49:13,354][62635] Updated weights for policy 1, policy_version 72660 (0.0009) [2023-10-12 18:49:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148766720. Throughput: 0: 1682.8, 1: 1697.7. Samples: 37204832. 
Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:13,435][61643] Avg episode reward: [(0, '23.190'), (1, '9.900')] [2023-10-12 18:49:13,726][62635] Updated weights for policy 1, policy_version 72670 (0.0008) [2023-10-12 18:49:14,537][62634] Updated weights for policy 0, policy_version 72650 (0.0007) [2023-10-12 18:49:14,914][62634] Updated weights for policy 0, policy_version 72660 (0.0009) [2023-10-12 18:49:15,284][62634] Updated weights for policy 0, policy_version 72670 (0.0009) [2023-10-12 18:49:18,000][62635] Updated weights for policy 1, policy_version 72680 (0.0007) [2023-10-12 18:49:18,371][62635] Updated weights for policy 1, policy_version 72690 (0.0008) [2023-10-12 18:49:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 148832256. Throughput: 0: 1687.8, 1: 1679.5. Samples: 37225318. Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:18,436][61643] Avg episode reward: [(0, '23.590'), (1, '9.850')] [2023-10-12 18:49:18,743][62635] Updated weights for policy 1, policy_version 72700 (0.0009) [2023-10-12 18:49:19,226][62634] Updated weights for policy 0, policy_version 72680 (0.0008) [2023-10-12 18:49:19,598][62634] Updated weights for policy 0, policy_version 72690 (0.0010) [2023-10-12 18:49:19,976][62634] Updated weights for policy 0, policy_version 72700 (0.0008) [2023-10-12 18:49:22,758][62635] Updated weights for policy 1, policy_version 72710 (0.0009) [2023-10-12 18:49:23,137][62635] Updated weights for policy 1, policy_version 72720 (0.0008) [2023-10-12 18:49:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 148897792. Throughput: 0: 1674.8, 1: 1685.2. Samples: 37234696. 
Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:23,435][61643] Avg episode reward: [(0, '23.430'), (1, '9.690')] [2023-10-12 18:49:23,502][62635] Updated weights for policy 1, policy_version 72730 (0.0009) [2023-10-12 18:49:24,081][62634] Updated weights for policy 0, policy_version 72710 (0.0008) [2023-10-12 18:49:24,453][62634] Updated weights for policy 0, policy_version 72720 (0.0008) [2023-10-12 18:49:24,839][62634] Updated weights for policy 0, policy_version 72730 (0.0009) [2023-10-12 18:49:27,575][62635] Updated weights for policy 1, policy_version 72740 (0.0008) [2023-10-12 18:49:27,937][62635] Updated weights for policy 1, policy_version 72750 (0.0007) [2023-10-12 18:49:28,310][62635] Updated weights for policy 1, policy_version 72760 (0.0008) [2023-10-12 18:49:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 148963328. Throughput: 0: 1684.2, 1: 1682.2. Samples: 37255462. Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:28,435][61643] Avg episode reward: [(0, '23.650'), (1, '9.610')] [2023-10-12 18:49:28,765][62634] Updated weights for policy 0, policy_version 72740 (0.0008) [2023-10-12 18:49:29,142][62634] Updated weights for policy 0, policy_version 72750 (0.0008) [2023-10-12 18:49:29,518][62634] Updated weights for policy 0, policy_version 72760 (0.0007) [2023-10-12 18:49:32,408][62635] Updated weights for policy 1, policy_version 72770 (0.0008) [2023-10-12 18:49:32,776][62635] Updated weights for policy 1, policy_version 72780 (0.0008) [2023-10-12 18:49:33,140][62635] Updated weights for policy 1, policy_version 72790 (0.0007) [2023-10-12 18:49:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149028864. Throughput: 0: 1695.2, 1: 1663.6. Samples: 37275724. 
Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:33,435][61643] Avg episode reward: [(0, '23.810'), (1, '9.580')] [2023-10-12 18:49:33,505][62634] Updated weights for policy 0, policy_version 72770 (0.0008) [2023-10-12 18:49:33,505][62635] Updated weights for policy 1, policy_version 72800 (0.0008) [2023-10-12 18:49:33,879][62634] Updated weights for policy 0, policy_version 72780 (0.0008) [2023-10-12 18:49:34,248][62634] Updated weights for policy 0, policy_version 72790 (0.0008) [2023-10-12 18:49:34,629][62634] Updated weights for policy 0, policy_version 72800 (0.0008) [2023-10-12 18:49:37,624][62635] Updated weights for policy 1, policy_version 72810 (0.0007) [2023-10-12 18:49:37,992][62635] Updated weights for policy 1, policy_version 72820 (0.0008) [2023-10-12 18:49:38,363][62635] Updated weights for policy 1, policy_version 72830 (0.0007) [2023-10-12 18:49:38,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149127168. Throughput: 0: 1694.9, 1: 1682.4. Samples: 37285488. Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:38,435][61643] Avg episode reward: [(0, '23.770'), (1, '9.620')] [2023-10-12 18:49:38,636][62634] Updated weights for policy 0, policy_version 72810 (0.0007) [2023-10-12 18:49:39,011][62634] Updated weights for policy 0, policy_version 72820 (0.0009) [2023-10-12 18:49:39,388][62634] Updated weights for policy 0, policy_version 72830 (0.0007) [2023-10-12 18:49:42,343][62635] Updated weights for policy 1, policy_version 72840 (0.0007) [2023-10-12 18:49:42,707][62635] Updated weights for policy 1, policy_version 72850 (0.0008) [2023-10-12 18:49:43,082][62635] Updated weights for policy 1, policy_version 72860 (0.0007) [2023-10-12 18:49:43,329][62634] Updated weights for policy 0, policy_version 72840 (0.0008) [2023-10-12 18:49:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149192704. Throughput: 0: 1701.1, 1: 1680.7. 
Samples: 37306342. Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:43,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.620')] [2023-10-12 18:49:43,701][62634] Updated weights for policy 0, policy_version 72850 (0.0008) [2023-10-12 18:49:44,075][62634] Updated weights for policy 0, policy_version 72860 (0.0011) [2023-10-12 18:49:47,061][62635] Updated weights for policy 1, policy_version 72870 (0.0008) [2023-10-12 18:49:47,425][62635] Updated weights for policy 1, policy_version 72880 (0.0008) [2023-10-12 18:49:47,796][62635] Updated weights for policy 1, policy_version 72890 (0.0009) [2023-10-12 18:49:48,170][62634] Updated weights for policy 0, policy_version 72870 (0.0010) [2023-10-12 18:49:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149258240. Throughput: 0: 1698.7, 1: 1656.8. Samples: 37325708. Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:48,435][61643] Avg episode reward: [(0, '24.160'), (1, '9.730')] [2023-10-12 18:49:48,553][62634] Updated weights for policy 0, policy_version 72880 (0.0010) [2023-10-12 18:49:48,932][62634] Updated weights for policy 0, policy_version 72890 (0.0007) [2023-10-12 18:49:51,853][62635] Updated weights for policy 1, policy_version 72900 (0.0009) [2023-10-12 18:49:52,219][62635] Updated weights for policy 1, policy_version 72910 (0.0010) [2023-10-12 18:49:52,599][62635] Updated weights for policy 1, policy_version 72920 (0.0008) [2023-10-12 18:49:52,971][62634] Updated weights for policy 0, policy_version 72900 (0.0007) [2023-10-12 18:49:53,347][62634] Updated weights for policy 0, policy_version 72910 (0.0008) [2023-10-12 18:49:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149323776. Throughput: 0: 1695.8, 1: 1681.3. Samples: 37335948. 
Policy #0 lag: (min: 0.0, avg: 20.6, max: 32.0) [2023-10-12 18:49:53,436][61643] Avg episode reward: [(0, '24.190'), (1, '9.830')] [2023-10-12 18:49:53,715][62634] Updated weights for policy 0, policy_version 72920 (0.0008) [2023-10-12 18:49:56,755][62635] Updated weights for policy 1, policy_version 72930 (0.0008) [2023-10-12 18:49:57,128][62635] Updated weights for policy 1, policy_version 72940 (0.0009) [2023-10-12 18:49:57,491][62635] Updated weights for policy 1, policy_version 72950 (0.0008) [2023-10-12 18:49:57,646][62634] Updated weights for policy 0, policy_version 72930 (0.0007) [2023-10-12 18:49:57,858][62635] Updated weights for policy 1, policy_version 72960 (0.0007) [2023-10-12 18:49:58,015][62634] Updated weights for policy 0, policy_version 72940 (0.0008) [2023-10-12 18:49:58,388][62634] Updated weights for policy 0, policy_version 72950 (0.0009) [2023-10-12 18:49:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149389312. Throughput: 0: 1698.9, 1: 1673.1. Samples: 37356572. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:49:58,436][61643] Avg episode reward: [(0, '24.500'), (1, '9.970')] [2023-10-12 18:49:58,763][62634] Updated weights for policy 0, policy_version 72960 (0.0009) [2023-10-12 18:50:01,964][62635] Updated weights for policy 1, policy_version 72970 (0.0009) [2023-10-12 18:50:02,328][62635] Updated weights for policy 1, policy_version 72980 (0.0008) [2023-10-12 18:50:02,697][62635] Updated weights for policy 1, policy_version 72990 (0.0008) [2023-10-12 18:50:02,778][62634] Updated weights for policy 0, policy_version 72970 (0.0007) [2023-10-12 18:50:03,154][62634] Updated weights for policy 0, policy_version 72980 (0.0007) [2023-10-12 18:50:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149454848. Throughput: 0: 1683.7, 1: 1659.5. Samples: 37375760. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:03,435][61643] Avg episode reward: [(0, '24.710'), (1, '9.860')] [2023-10-12 18:50:03,532][62634] Updated weights for policy 0, policy_version 72990 (0.0007) [2023-10-12 18:50:06,898][62635] Updated weights for policy 1, policy_version 73000 (0.0010) [2023-10-12 18:50:07,269][62635] Updated weights for policy 1, policy_version 73010 (0.0007) [2023-10-12 18:50:07,535][62634] Updated weights for policy 0, policy_version 73000 (0.0008) [2023-10-12 18:50:07,639][62635] Updated weights for policy 1, policy_version 73020 (0.0007) [2023-10-12 18:50:07,907][62634] Updated weights for policy 0, policy_version 73010 (0.0007) [2023-10-12 18:50:08,283][62634] Updated weights for policy 0, policy_version 73020 (0.0010) [2023-10-12 18:50:08,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 149553152. Throughput: 0: 1700.4, 1: 1680.5. Samples: 37386836. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:08,435][61643] Avg episode reward: [(0, '25.120'), (1, '10.000')] [2023-10-12 18:50:11,802][62635] Updated weights for policy 1, policy_version 73030 (0.0007) [2023-10-12 18:50:12,162][62635] Updated weights for policy 1, policy_version 73040 (0.0010) [2023-10-12 18:50:12,501][62634] Updated weights for policy 0, policy_version 73030 (0.0009) [2023-10-12 18:50:12,532][62635] Updated weights for policy 1, policy_version 73050 (0.0008) [2023-10-12 18:50:12,885][62634] Updated weights for policy 0, policy_version 73040 (0.0009) [2023-10-12 18:50:13,262][62634] Updated weights for policy 0, policy_version 73050 (0.0009) [2023-10-12 18:50:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 149585920. Throughput: 0: 1695.8, 1: 1670.0. Samples: 37406924. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:13,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.790')] [2023-10-12 18:50:16,538][62635] Updated weights for policy 1, policy_version 73060 (0.0008) [2023-10-12 18:50:16,904][62635] Updated weights for policy 1, policy_version 73070 (0.0008) [2023-10-12 18:50:17,273][62635] Updated weights for policy 1, policy_version 73080 (0.0009) [2023-10-12 18:50:17,400][62634] Updated weights for policy 0, policy_version 73060 (0.0008) [2023-10-12 18:50:17,773][62634] Updated weights for policy 0, policy_version 73070 (0.0008) [2023-10-12 18:50:18,152][62634] Updated weights for policy 0, policy_version 73080 (0.0010) [2023-10-12 18:50:18,435][61643] Fps is (10 sec: 9830.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 149651456. Throughput: 0: 1666.8, 1: 1670.6. Samples: 37425906. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:18,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.560')] [2023-10-12 18:50:18,441][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000073088_74842112.pth... [2023-10-12 18:50:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000073088_74842112.pth... 
[2023-10-12 18:50:18,474][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000071520_73236480.pth [2023-10-12 18:50:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000071488_73203712.pth [2023-10-12 18:50:21,307][62635] Updated weights for policy 1, policy_version 73090 (0.0007) [2023-10-12 18:50:21,672][62635] Updated weights for policy 1, policy_version 73100 (0.0010) [2023-10-12 18:50:22,039][62635] Updated weights for policy 1, policy_version 73110 (0.0011) [2023-10-12 18:50:22,244][62634] Updated weights for policy 0, policy_version 73090 (0.0008) [2023-10-12 18:50:22,411][62635] Updated weights for policy 1, policy_version 73120 (0.0008) [2023-10-12 18:50:22,630][62634] Updated weights for policy 0, policy_version 73100 (0.0009) [2023-10-12 18:50:23,002][62634] Updated weights for policy 0, policy_version 73110 (0.0009) [2023-10-12 18:50:23,385][62634] Updated weights for policy 0, policy_version 73120 (0.0009) [2023-10-12 18:50:23,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 149749760. Throughput: 0: 1680.8, 1: 1687.2. Samples: 37437050. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:23,436][61643] Avg episode reward: [(0, '24.900'), (1, '9.800')] [2023-10-12 18:50:26,548][62635] Updated weights for policy 1, policy_version 73130 (0.0010) [2023-10-12 18:50:26,911][62635] Updated weights for policy 1, policy_version 73140 (0.0007) [2023-10-12 18:50:27,283][62635] Updated weights for policy 1, policy_version 73150 (0.0009) [2023-10-12 18:50:27,450][62634] Updated weights for policy 0, policy_version 73130 (0.0007) [2023-10-12 18:50:27,821][62634] Updated weights for policy 0, policy_version 73140 (0.0009) [2023-10-12 18:50:28,206][62634] Updated weights for policy 0, policy_version 73150 (0.0007) [2023-10-12 18:50:28,435][61643] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 149815296. 
Throughput: 0: 1678.2, 1: 1663.6. Samples: 37456722. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:28,436][61643] Avg episode reward: [(0, '25.000'), (1, '9.800')] [2023-10-12 18:50:31,383][62635] Updated weights for policy 1, policy_version 73160 (0.0007) [2023-10-12 18:50:31,758][62635] Updated weights for policy 1, policy_version 73170 (0.0007) [2023-10-12 18:50:32,117][62635] Updated weights for policy 1, policy_version 73180 (0.0009) [2023-10-12 18:50:32,225][62634] Updated weights for policy 0, policy_version 73160 (0.0009) [2023-10-12 18:50:32,606][62634] Updated weights for policy 0, policy_version 73170 (0.0009) [2023-10-12 18:50:32,983][62634] Updated weights for policy 0, policy_version 73180 (0.0008) [2023-10-12 18:50:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 149880832. Throughput: 0: 1656.9, 1: 1684.5. Samples: 37476074. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:33,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.780')] [2023-10-12 18:50:36,135][62635] Updated weights for policy 1, policy_version 73190 (0.0009) [2023-10-12 18:50:36,508][62635] Updated weights for policy 1, policy_version 73200 (0.0010) [2023-10-12 18:50:36,885][62635] Updated weights for policy 1, policy_version 73210 (0.0007) [2023-10-12 18:50:36,983][62634] Updated weights for policy 0, policy_version 73190 (0.0009) [2023-10-12 18:50:37,350][62634] Updated weights for policy 0, policy_version 73200 (0.0008) [2023-10-12 18:50:37,727][62634] Updated weights for policy 0, policy_version 73210 (0.0011) [2023-10-12 18:50:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 149946368. Throughput: 0: 1683.0, 1: 1681.3. Samples: 37487344. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-12 18:50:38,435][61643] Avg episode reward: [(0, '25.080'), (1, '10.080')] [2023-10-12 18:50:40,820][62635] Updated weights for policy 1, policy_version 73220 (0.0007) [2023-10-12 18:50:41,188][62635] Updated weights for policy 1, policy_version 73230 (0.0009) [2023-10-12 18:50:41,561][62635] Updated weights for policy 1, policy_version 73240 (0.0008) [2023-10-12 18:50:41,786][62634] Updated weights for policy 0, policy_version 73220 (0.0009) [2023-10-12 18:50:42,154][62634] Updated weights for policy 0, policy_version 73230 (0.0008) [2023-10-12 18:50:42,544][62634] Updated weights for policy 0, policy_version 73240 (0.0010) [2023-10-12 18:50:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150011904. Throughput: 0: 1670.7, 1: 1666.9. Samples: 37506764. Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:50:43,435][61643] Avg episode reward: [(0, '25.360'), (1, '9.970')] [2023-10-12 18:50:45,569][62635] Updated weights for policy 1, policy_version 73250 (0.0009) [2023-10-12 18:50:45,933][62635] Updated weights for policy 1, policy_version 73260 (0.0007) [2023-10-12 18:50:46,301][62635] Updated weights for policy 1, policy_version 73270 (0.0010) [2023-10-12 18:50:46,635][62634] Updated weights for policy 0, policy_version 73250 (0.0008) [2023-10-12 18:50:46,657][62635] Updated weights for policy 1, policy_version 73280 (0.0008) [2023-10-12 18:50:47,023][62634] Updated weights for policy 0, policy_version 73260 (0.0010) [2023-10-12 18:50:47,395][62634] Updated weights for policy 0, policy_version 73270 (0.0007) [2023-10-12 18:50:47,766][62634] Updated weights for policy 0, policy_version 73280 (0.0008) [2023-10-12 18:50:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150077440. Throughput: 0: 1656.5, 1: 1693.9. Samples: 37526526. 
Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:50:48,435][61643] Avg episode reward: [(0, '25.390'), (1, '9.930')] [2023-10-12 18:50:50,697][62635] Updated weights for policy 1, policy_version 73290 (0.0011) [2023-10-12 18:50:51,066][62635] Updated weights for policy 1, policy_version 73300 (0.0009) [2023-10-12 18:50:51,438][62635] Updated weights for policy 1, policy_version 73310 (0.0010) [2023-10-12 18:50:51,834][62634] Updated weights for policy 0, policy_version 73290 (0.0008) [2023-10-12 18:50:52,221][62634] Updated weights for policy 0, policy_version 73300 (0.0007) [2023-10-12 18:50:52,592][62634] Updated weights for policy 0, policy_version 73310 (0.0010) [2023-10-12 18:50:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 150142976. Throughput: 0: 1672.0, 1: 1674.2. Samples: 37537414. Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:50:53,435][61643] Avg episode reward: [(0, '25.160'), (1, '10.050')] [2023-10-12 18:50:55,502][62635] Updated weights for policy 1, policy_version 73320 (0.0007) [2023-10-12 18:50:55,873][62635] Updated weights for policy 1, policy_version 73330 (0.0009) [2023-10-12 18:50:56,238][62635] Updated weights for policy 1, policy_version 73340 (0.0008) [2023-10-12 18:50:56,756][62634] Updated weights for policy 0, policy_version 73320 (0.0009) [2023-10-12 18:50:57,139][62634] Updated weights for policy 0, policy_version 73330 (0.0009) [2023-10-12 18:50:57,517][62634] Updated weights for policy 0, policy_version 73340 (0.0010) [2023-10-12 18:50:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150208512. Throughput: 0: 1659.6, 1: 1674.2. Samples: 37556946. 
Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:50:58,436][61643] Avg episode reward: [(0, '24.930'), (1, '10.050')] [2023-10-12 18:51:00,263][62635] Updated weights for policy 1, policy_version 73350 (0.0008) [2023-10-12 18:51:00,632][62635] Updated weights for policy 1, policy_version 73360 (0.0008) [2023-10-12 18:51:01,003][62635] Updated weights for policy 1, policy_version 73370 (0.0008) [2023-10-12 18:51:01,495][62634] Updated weights for policy 0, policy_version 73350 (0.0010) [2023-10-12 18:51:01,864][62634] Updated weights for policy 0, policy_version 73360 (0.0009) [2023-10-12 18:51:02,239][62634] Updated weights for policy 0, policy_version 73370 (0.0009) [2023-10-12 18:51:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150274048. Throughput: 0: 1661.5, 1: 1695.1. Samples: 37576950. Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:51:03,436][61643] Avg episode reward: [(0, '24.840'), (1, '9.860')] [2023-10-12 18:51:04,949][62635] Updated weights for policy 1, policy_version 73380 (0.0010) [2023-10-12 18:51:05,323][62635] Updated weights for policy 1, policy_version 73390 (0.0008) [2023-10-12 18:51:05,693][62635] Updated weights for policy 1, policy_version 73400 (0.0007) [2023-10-12 18:51:06,336][62634] Updated weights for policy 0, policy_version 73380 (0.0008) [2023-10-12 18:51:06,713][62634] Updated weights for policy 0, policy_version 73390 (0.0008) [2023-10-12 18:51:07,079][62634] Updated weights for policy 0, policy_version 73400 (0.0009) [2023-10-12 18:51:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 150339584. Throughput: 0: 1676.2, 1: 1664.5. Samples: 37587380. 
Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:51:08,436][61643] Avg episode reward: [(0, '24.730'), (1, '9.840')] [2023-10-12 18:51:09,668][62635] Updated weights for policy 1, policy_version 73410 (0.0008) [2023-10-12 18:51:10,028][62635] Updated weights for policy 1, policy_version 73420 (0.0009) [2023-10-12 18:51:10,400][62635] Updated weights for policy 1, policy_version 73430 (0.0008) [2023-10-12 18:51:10,766][62635] Updated weights for policy 1, policy_version 73440 (0.0008) [2023-10-12 18:51:11,195][62634] Updated weights for policy 0, policy_version 73410 (0.0009) [2023-10-12 18:51:11,562][62634] Updated weights for policy 0, policy_version 73420 (0.0009) [2023-10-12 18:51:11,937][62634] Updated weights for policy 0, policy_version 73430 (0.0008) [2023-10-12 18:51:12,312][62634] Updated weights for policy 0, policy_version 73440 (0.0010) [2023-10-12 18:51:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 150405120. Throughput: 0: 1655.7, 1: 1689.1. Samples: 37607238. Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:51:13,435][61643] Avg episode reward: [(0, '24.510'), (1, '9.990')] [2023-10-12 18:51:14,891][62635] Updated weights for policy 1, policy_version 73450 (0.0008) [2023-10-12 18:51:15,265][62635] Updated weights for policy 1, policy_version 73460 (0.0007) [2023-10-12 18:51:15,631][62635] Updated weights for policy 1, policy_version 73470 (0.0007) [2023-10-12 18:51:16,286][62634] Updated weights for policy 0, policy_version 73450 (0.0010) [2023-10-12 18:51:16,663][62634] Updated weights for policy 0, policy_version 73460 (0.0010) [2023-10-12 18:51:17,041][62634] Updated weights for policy 0, policy_version 73470 (0.0010) [2023-10-12 18:51:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150470656. Throughput: 0: 1671.6, 1: 1691.1. Samples: 37627396. 
Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:51:18,436][61643] Avg episode reward: [(0, '24.200'), (1, '9.880')] [2023-10-12 18:51:19,832][62635] Updated weights for policy 1, policy_version 73480 (0.0010) [2023-10-12 18:51:20,199][62635] Updated weights for policy 1, policy_version 73490 (0.0010) [2023-10-12 18:51:20,570][62635] Updated weights for policy 1, policy_version 73500 (0.0010) [2023-10-12 18:51:21,203][62634] Updated weights for policy 0, policy_version 73480 (0.0009) [2023-10-12 18:51:21,578][62634] Updated weights for policy 0, policy_version 73490 (0.0009) [2023-10-12 18:51:21,964][62634] Updated weights for policy 0, policy_version 73500 (0.0007) [2023-10-12 18:51:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 150536192. Throughput: 0: 1673.3, 1: 1662.8. Samples: 37637468. Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:51:23,435][61643] Avg episode reward: [(0, '24.180'), (1, '9.780')] [2023-10-12 18:51:24,688][62635] Updated weights for policy 1, policy_version 73510 (0.0008) [2023-10-12 18:51:25,062][62635] Updated weights for policy 1, policy_version 73520 (0.0007) [2023-10-12 18:51:25,433][62635] Updated weights for policy 1, policy_version 73530 (0.0010) [2023-10-12 18:51:26,116][62634] Updated weights for policy 0, policy_version 73510 (0.0007) [2023-10-12 18:51:26,496][62634] Updated weights for policy 0, policy_version 73520 (0.0010) [2023-10-12 18:51:26,870][62634] Updated weights for policy 0, policy_version 73530 (0.0009) [2023-10-12 18:51:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150601728. Throughput: 0: 1654.8, 1: 1688.0. Samples: 37657186. 
Policy #0 lag: (min: 10.0, avg: 12.9, max: 42.0) [2023-10-12 18:51:28,435][61643] Avg episode reward: [(0, '24.350'), (1, '9.860')] [2023-10-12 18:51:29,543][62635] Updated weights for policy 1, policy_version 73540 (0.0009) [2023-10-12 18:51:29,917][62635] Updated weights for policy 1, policy_version 73550 (0.0007) [2023-10-12 18:51:30,282][62635] Updated weights for policy 1, policy_version 73560 (0.0007) [2023-10-12 18:51:31,052][62634] Updated weights for policy 0, policy_version 73540 (0.0009) [2023-10-12 18:51:31,436][62634] Updated weights for policy 0, policy_version 73550 (0.0008) [2023-10-12 18:51:31,825][62634] Updated weights for policy 0, policy_version 73560 (0.0009) [2023-10-12 18:51:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150667264. Throughput: 0: 1673.2, 1: 1687.1. Samples: 37677736. Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:51:33,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.870')] [2023-10-12 18:51:34,379][62635] Updated weights for policy 1, policy_version 73570 (0.0007) [2023-10-12 18:51:34,756][62635] Updated weights for policy 1, policy_version 73580 (0.0009) [2023-10-12 18:51:35,118][62635] Updated weights for policy 1, policy_version 73590 (0.0009) [2023-10-12 18:51:35,481][62635] Updated weights for policy 1, policy_version 73600 (0.0008) [2023-10-12 18:51:35,655][62634] Updated weights for policy 0, policy_version 73570 (0.0008) [2023-10-12 18:51:36,039][62634] Updated weights for policy 0, policy_version 73580 (0.0007) [2023-10-12 18:51:36,419][62634] Updated weights for policy 0, policy_version 73590 (0.0009) [2023-10-12 18:51:36,799][62634] Updated weights for policy 0, policy_version 73600 (0.0010) [2023-10-12 18:51:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150732800. Throughput: 0: 1665.3, 1: 1675.3. Samples: 37687744. 
Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:51:38,435][61643] Avg episode reward: [(0, '24.110'), (1, '9.890')] [2023-10-12 18:51:39,345][62635] Updated weights for policy 1, policy_version 73610 (0.0009) [2023-10-12 18:51:39,721][62635] Updated weights for policy 1, policy_version 73620 (0.0008) [2023-10-12 18:51:40,088][62635] Updated weights for policy 1, policy_version 73630 (0.0008) [2023-10-12 18:51:40,940][62634] Updated weights for policy 0, policy_version 73610 (0.0007) [2023-10-12 18:51:41,315][62634] Updated weights for policy 0, policy_version 73620 (0.0007) [2023-10-12 18:51:41,683][62634] Updated weights for policy 0, policy_version 73630 (0.0008) [2023-10-12 18:51:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150798336. Throughput: 0: 1657.5, 1: 1691.9. Samples: 37707668. Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:51:43,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.930')] [2023-10-12 18:51:44,057][62635] Updated weights for policy 1, policy_version 73640 (0.0008) [2023-10-12 18:51:44,426][62635] Updated weights for policy 1, policy_version 73650 (0.0010) [2023-10-12 18:51:44,790][62635] Updated weights for policy 1, policy_version 73660 (0.0007) [2023-10-12 18:51:45,789][62634] Updated weights for policy 0, policy_version 73640 (0.0008) [2023-10-12 18:51:46,160][62634] Updated weights for policy 0, policy_version 73650 (0.0007) [2023-10-12 18:51:46,545][62634] Updated weights for policy 0, policy_version 73660 (0.0008) [2023-10-12 18:51:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150863872. Throughput: 0: 1678.0, 1: 1687.0. Samples: 37728376. 
Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:51:48,435][61643] Avg episode reward: [(0, '24.030'), (1, '9.790')] [2023-10-12 18:51:48,951][62635] Updated weights for policy 1, policy_version 73670 (0.0008) [2023-10-12 18:51:49,330][62635] Updated weights for policy 1, policy_version 73680 (0.0011) [2023-10-12 18:51:49,695][62635] Updated weights for policy 1, policy_version 73690 (0.0012) [2023-10-12 18:51:50,579][62634] Updated weights for policy 0, policy_version 73670 (0.0008) [2023-10-12 18:51:50,960][62634] Updated weights for policy 0, policy_version 73680 (0.0007) [2023-10-12 18:51:51,348][62634] Updated weights for policy 0, policy_version 73690 (0.0009) [2023-10-12 18:51:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150929408. Throughput: 0: 1664.5, 1: 1686.2. Samples: 37738162. Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:51:53,435][61643] Avg episode reward: [(0, '24.090'), (1, '9.820')] [2023-10-12 18:51:53,630][62635] Updated weights for policy 1, policy_version 73700 (0.0009) [2023-10-12 18:51:53,995][62635] Updated weights for policy 1, policy_version 73710 (0.0007) [2023-10-12 18:51:54,367][62635] Updated weights for policy 1, policy_version 73720 (0.0007) [2023-10-12 18:51:55,374][62634] Updated weights for policy 0, policy_version 73700 (0.0008) [2023-10-12 18:51:55,749][62634] Updated weights for policy 0, policy_version 73710 (0.0008) [2023-10-12 18:51:56,125][62634] Updated weights for policy 0, policy_version 73720 (0.0007) [2023-10-12 18:51:58,365][62635] Updated weights for policy 1, policy_version 73730 (0.0011) [2023-10-12 18:51:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150994944. Throughput: 0: 1666.6, 1: 1688.4. Samples: 37758210. 
Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:51:58,436][61643] Avg episode reward: [(0, '23.560'), (1, '9.830')] [2023-10-12 18:51:58,731][62635] Updated weights for policy 1, policy_version 73740 (0.0009) [2023-10-12 18:51:59,095][62635] Updated weights for policy 1, policy_version 73750 (0.0010) [2023-10-12 18:51:59,463][62635] Updated weights for policy 1, policy_version 73760 (0.0008) [2023-10-12 18:52:00,176][62634] Updated weights for policy 0, policy_version 73730 (0.0008) [2023-10-12 18:52:00,559][62634] Updated weights for policy 0, policy_version 73740 (0.0009) [2023-10-12 18:52:00,935][62634] Updated weights for policy 0, policy_version 73750 (0.0008) [2023-10-12 18:52:01,314][62634] Updated weights for policy 0, policy_version 73760 (0.0008) [2023-10-12 18:52:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151060480. Throughput: 0: 1673.3, 1: 1695.4. Samples: 37778986. Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:52:03,435][61643] Avg episode reward: [(0, '23.710'), (1, '9.770')] [2023-10-12 18:52:03,814][62635] Updated weights for policy 1, policy_version 73770 (0.0010) [2023-10-12 18:52:04,184][62635] Updated weights for policy 1, policy_version 73780 (0.0008) [2023-10-12 18:52:04,540][62635] Updated weights for policy 1, policy_version 73790 (0.0011) [2023-10-12 18:52:05,537][62634] Updated weights for policy 0, policy_version 73770 (0.0007) [2023-10-12 18:52:05,910][62634] Updated weights for policy 0, policy_version 73780 (0.0008) [2023-10-12 18:52:06,282][62634] Updated weights for policy 0, policy_version 73790 (0.0009) [2023-10-12 18:52:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 151126016. Throughput: 0: 1660.6, 1: 1697.0. Samples: 37788562. 
Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:52:08,436][61643] Avg episode reward: [(0, '23.340'), (1, '9.770')] [2023-10-12 18:52:08,695][62635] Updated weights for policy 1, policy_version 73800 (0.0010) [2023-10-12 18:52:09,078][62635] Updated weights for policy 1, policy_version 73810 (0.0007) [2023-10-12 18:52:09,452][62635] Updated weights for policy 1, policy_version 73820 (0.0007) [2023-10-12 18:52:10,382][62634] Updated weights for policy 0, policy_version 73800 (0.0008) [2023-10-12 18:52:10,760][62634] Updated weights for policy 0, policy_version 73810 (0.0007) [2023-10-12 18:52:11,136][62634] Updated weights for policy 0, policy_version 73820 (0.0008) [2023-10-12 18:52:13,369][62635] Updated weights for policy 1, policy_version 73830 (0.0008) [2023-10-12 18:52:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151191552. Throughput: 0: 1672.4, 1: 1693.0. Samples: 37808626. Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:52:13,435][61643] Avg episode reward: [(0, '23.420'), (1, '9.790')] [2023-10-12 18:52:13,739][62635] Updated weights for policy 1, policy_version 73840 (0.0010) [2023-10-12 18:52:14,100][62635] Updated weights for policy 1, policy_version 73850 (0.0011) [2023-10-12 18:52:15,091][62634] Updated weights for policy 0, policy_version 73830 (0.0009) [2023-10-12 18:52:15,464][62634] Updated weights for policy 0, policy_version 73840 (0.0007) [2023-10-12 18:52:15,847][62634] Updated weights for policy 0, policy_version 73850 (0.0007) [2023-10-12 18:52:18,141][62635] Updated weights for policy 1, policy_version 73860 (0.0009) [2023-10-12 18:52:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151257088. Throughput: 0: 1678.7, 1: 1687.8. Samples: 37829230. 
Policy #0 lag: (min: 1.0, avg: 13.5, max: 33.0) [2023-10-12 18:52:18,436][61643] Avg episode reward: [(0, '23.340'), (1, '9.830')] [2023-10-12 18:52:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000073856_75628544.pth... [2023-10-12 18:52:18,483][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000072288_74022912.pth [2023-10-12 18:52:18,504][62635] Updated weights for policy 1, policy_version 73870 (0.0010) [2023-10-12 18:52:18,876][62635] Updated weights for policy 1, policy_version 73880 (0.0009) [2023-10-12 18:52:19,170][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000073888_75661312.pth... [2023-10-12 18:52:19,200][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000072288_74022912.pth [2023-10-12 18:52:20,056][62634] Updated weights for policy 0, policy_version 73860 (0.0008) [2023-10-12 18:52:20,428][62634] Updated weights for policy 0, policy_version 73870 (0.0007) [2023-10-12 18:52:20,804][62634] Updated weights for policy 0, policy_version 73880 (0.0008) [2023-10-12 18:52:22,963][62635] Updated weights for policy 1, policy_version 73890 (0.0008) [2023-10-12 18:52:23,326][62635] Updated weights for policy 1, policy_version 73900 (0.0008) [2023-10-12 18:52:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151322624. Throughput: 0: 1657.9, 1: 1690.6. Samples: 37838426. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:23,435][61643] Avg episode reward: [(0, '23.390'), (1, '9.830')] [2023-10-12 18:52:23,697][62635] Updated weights for policy 1, policy_version 73910 (0.0010) [2023-10-12 18:52:24,061][62635] Updated weights for policy 1, policy_version 73920 (0.0008) [2023-10-12 18:52:24,970][62634] Updated weights for policy 0, policy_version 73890 (0.0009) [2023-10-12 18:52:25,341][62634] Updated weights for policy 0, policy_version 73900 (0.0008) [2023-10-12 18:52:25,729][62634] Updated weights for policy 0, policy_version 73910 (0.0009) [2023-10-12 18:52:26,094][62634] Updated weights for policy 0, policy_version 73920 (0.0009) [2023-10-12 18:52:28,163][62635] Updated weights for policy 1, policy_version 73930 (0.0007) [2023-10-12 18:52:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151388160. Throughput: 0: 1677.1, 1: 1683.0. Samples: 37858870. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:28,435][61643] Avg episode reward: [(0, '23.380'), (1, '9.980')] [2023-10-12 18:52:28,535][62635] Updated weights for policy 1, policy_version 73940 (0.0009) [2023-10-12 18:52:28,896][62635] Updated weights for policy 1, policy_version 73950 (0.0009) [2023-10-12 18:52:30,235][62634] Updated weights for policy 0, policy_version 73930 (0.0008) [2023-10-12 18:52:30,614][62634] Updated weights for policy 0, policy_version 73940 (0.0007) [2023-10-12 18:52:31,000][62634] Updated weights for policy 0, policy_version 73950 (0.0008) [2023-10-12 18:52:32,880][62635] Updated weights for policy 1, policy_version 73960 (0.0008) [2023-10-12 18:52:33,258][62635] Updated weights for policy 1, policy_version 73970 (0.0009) [2023-10-12 18:52:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151453696. Throughput: 0: 1677.2, 1: 1675.4. Samples: 37879242. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:33,435][61643] Avg episode reward: [(0, '23.590'), (1, '10.030')] [2023-10-12 18:52:33,621][62635] Updated weights for policy 1, policy_version 73980 (0.0010) [2023-10-12 18:52:35,039][62634] Updated weights for policy 0, policy_version 73960 (0.0007) [2023-10-12 18:52:35,409][62634] Updated weights for policy 0, policy_version 73970 (0.0007) [2023-10-12 18:52:35,786][62634] Updated weights for policy 0, policy_version 73980 (0.0010) [2023-10-12 18:52:37,545][62635] Updated weights for policy 1, policy_version 73990 (0.0008) [2023-10-12 18:52:37,913][62635] Updated weights for policy 1, policy_version 74000 (0.0010) [2023-10-12 18:52:38,280][62635] Updated weights for policy 1, policy_version 74010 (0.0010) [2023-10-12 18:52:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151519232. Throughput: 0: 1660.1, 1: 1689.9. Samples: 37888914. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:38,435][61643] Avg episode reward: [(0, '23.330'), (1, '9.850')] [2023-10-12 18:52:39,796][62634] Updated weights for policy 0, policy_version 73990 (0.0010) [2023-10-12 18:52:40,174][62634] Updated weights for policy 0, policy_version 74000 (0.0010) [2023-10-12 18:52:40,557][62634] Updated weights for policy 0, policy_version 74010 (0.0009) [2023-10-12 18:52:42,427][62635] Updated weights for policy 1, policy_version 74020 (0.0007) [2023-10-12 18:52:42,801][62635] Updated weights for policy 1, policy_version 74030 (0.0008) [2023-10-12 18:52:43,162][62635] Updated weights for policy 1, policy_version 74040 (0.0010) [2023-10-12 18:52:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 151584768. Throughput: 0: 1672.9, 1: 1690.8. Samples: 37909578. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:43,435][61643] Avg episode reward: [(0, '23.370'), (1, '9.940')] [2023-10-12 18:52:44,612][62634] Updated weights for policy 0, policy_version 74020 (0.0008) [2023-10-12 18:52:44,994][62634] Updated weights for policy 0, policy_version 74030 (0.0008) [2023-10-12 18:52:45,370][62634] Updated weights for policy 0, policy_version 74040 (0.0011) [2023-10-12 18:52:47,140][62635] Updated weights for policy 1, policy_version 74050 (0.0009) [2023-10-12 18:52:47,500][62635] Updated weights for policy 1, policy_version 74060 (0.0008) [2023-10-12 18:52:47,872][62635] Updated weights for policy 1, policy_version 74070 (0.0007) [2023-10-12 18:52:48,235][62635] Updated weights for policy 1, policy_version 74080 (0.0009) [2023-10-12 18:52:48,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 151683072. Throughput: 0: 1672.1, 1: 1669.6. Samples: 37929366. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:48,436][61643] Avg episode reward: [(0, '23.340'), (1, '9.950')] [2023-10-12 18:52:49,429][62634] Updated weights for policy 0, policy_version 74050 (0.0010) [2023-10-12 18:52:49,800][62634] Updated weights for policy 0, policy_version 74060 (0.0007) [2023-10-12 18:52:50,185][62634] Updated weights for policy 0, policy_version 74070 (0.0011) [2023-10-12 18:52:50,562][62634] Updated weights for policy 0, policy_version 74080 (0.0009) [2023-10-12 18:52:52,052][62635] Updated weights for policy 1, policy_version 74090 (0.0009) [2023-10-12 18:52:52,420][62635] Updated weights for policy 1, policy_version 74100 (0.0007) [2023-10-12 18:52:52,782][62635] Updated weights for policy 1, policy_version 74110 (0.0008) [2023-10-12 18:52:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 151748608. Throughput: 0: 1655.4, 1: 1696.0. Samples: 37939376. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:53,436][61643] Avg episode reward: [(0, '23.300'), (1, '9.870')] [2023-10-12 18:52:54,521][62634] Updated weights for policy 0, policy_version 74090 (0.0008) [2023-10-12 18:52:54,896][62634] Updated weights for policy 0, policy_version 74100 (0.0007) [2023-10-12 18:52:55,274][62634] Updated weights for policy 0, policy_version 74110 (0.0008) [2023-10-12 18:52:56,721][62635] Updated weights for policy 1, policy_version 74120 (0.0007) [2023-10-12 18:52:57,085][62635] Updated weights for policy 1, policy_version 74130 (0.0008) [2023-10-12 18:52:57,455][62635] Updated weights for policy 1, policy_version 74140 (0.0007) [2023-10-12 18:52:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 151814144. Throughput: 0: 1671.5, 1: 1686.6. Samples: 37959742. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:52:58,436][61643] Avg episode reward: [(0, '23.150'), (1, '10.050')] [2023-10-12 18:52:59,308][62634] Updated weights for policy 0, policy_version 74120 (0.0008) [2023-10-12 18:52:59,685][62634] Updated weights for policy 0, policy_version 74130 (0.0008) [2023-10-12 18:53:00,059][62634] Updated weights for policy 0, policy_version 74140 (0.0010) [2023-10-12 18:53:01,408][62635] Updated weights for policy 1, policy_version 74150 (0.0009) [2023-10-12 18:53:01,783][62635] Updated weights for policy 1, policy_version 74160 (0.0009) [2023-10-12 18:53:02,153][62635] Updated weights for policy 1, policy_version 74170 (0.0008) [2023-10-12 18:53:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 151879680. Throughput: 0: 1669.7, 1: 1678.6. Samples: 37979900. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:53:03,435][61643] Avg episode reward: [(0, '23.270'), (1, '10.050')] [2023-10-12 18:53:04,145][62634] Updated weights for policy 0, policy_version 74150 (0.0011) [2023-10-12 18:53:04,519][62634] Updated weights for policy 0, policy_version 74160 (0.0010) [2023-10-12 18:53:04,900][62634] Updated weights for policy 0, policy_version 74170 (0.0010) [2023-10-12 18:53:06,279][62635] Updated weights for policy 1, policy_version 74180 (0.0007) [2023-10-12 18:53:06,643][62635] Updated weights for policy 1, policy_version 74190 (0.0007) [2023-10-12 18:53:07,008][62635] Updated weights for policy 1, policy_version 74200 (0.0009) [2023-10-12 18:53:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 151945216. Throughput: 0: 1666.0, 1: 1706.4. Samples: 37990184. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-12 18:53:08,435][61643] Avg episode reward: [(0, '23.560'), (1, '9.870')] [2023-10-12 18:53:08,866][62634] Updated weights for policy 0, policy_version 74180 (0.0010) [2023-10-12 18:53:09,237][62634] Updated weights for policy 0, policy_version 74190 (0.0010) [2023-10-12 18:53:09,614][62634] Updated weights for policy 0, policy_version 74200 (0.0007) [2023-10-12 18:53:11,078][62635] Updated weights for policy 1, policy_version 74210 (0.0008) [2023-10-12 18:53:11,450][62635] Updated weights for policy 1, policy_version 74220 (0.0008) [2023-10-12 18:53:11,814][62635] Updated weights for policy 1, policy_version 74230 (0.0008) [2023-10-12 18:53:12,181][62635] Updated weights for policy 1, policy_version 74240 (0.0008) [2023-10-12 18:53:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152010752. Throughput: 0: 1684.3, 1: 1687.2. Samples: 38010590. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:13,435][61643] Avg episode reward: [(0, '23.750'), (1, '10.140')] [2023-10-12 18:53:13,463][62634] Updated weights for policy 0, policy_version 74210 (0.0007) [2023-10-12 18:53:13,837][62634] Updated weights for policy 0, policy_version 74220 (0.0007) [2023-10-12 18:53:14,225][62634] Updated weights for policy 0, policy_version 74230 (0.0008) [2023-10-12 18:53:14,603][62634] Updated weights for policy 0, policy_version 74240 (0.0009) [2023-10-12 18:53:16,239][62635] Updated weights for policy 1, policy_version 74250 (0.0008) [2023-10-12 18:53:16,614][62635] Updated weights for policy 1, policy_version 74260 (0.0011) [2023-10-12 18:53:16,978][62635] Updated weights for policy 1, policy_version 74270 (0.0009) [2023-10-12 18:53:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152076288. Throughput: 0: 1690.8, 1: 1691.5. Samples: 38031442. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:18,435][61643] Avg episode reward: [(0, '23.780'), (1, '9.960')] [2023-10-12 18:53:18,866][62634] Updated weights for policy 0, policy_version 74250 (0.0008) [2023-10-12 18:53:19,232][62634] Updated weights for policy 0, policy_version 74260 (0.0009) [2023-10-12 18:53:19,605][62634] Updated weights for policy 0, policy_version 74270 (0.0008) [2023-10-12 18:53:21,167][62635] Updated weights for policy 1, policy_version 74280 (0.0009) [2023-10-12 18:53:21,537][62635] Updated weights for policy 1, policy_version 74290 (0.0010) [2023-10-12 18:53:21,901][62635] Updated weights for policy 1, policy_version 74300 (0.0009) [2023-10-12 18:53:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152141824. Throughput: 0: 1695.8, 1: 1699.3. Samples: 38041696. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:23,436][61643] Avg episode reward: [(0, '24.080'), (1, '9.840')] [2023-10-12 18:53:23,504][62634] Updated weights for policy 0, policy_version 74280 (0.0010) [2023-10-12 18:53:23,873][62634] Updated weights for policy 0, policy_version 74290 (0.0010) [2023-10-12 18:53:24,266][62634] Updated weights for policy 0, policy_version 74300 (0.0008) [2023-10-12 18:53:26,008][62635] Updated weights for policy 1, policy_version 74310 (0.0010) [2023-10-12 18:53:26,379][62635] Updated weights for policy 1, policy_version 74320 (0.0010) [2023-10-12 18:53:26,748][62635] Updated weights for policy 1, policy_version 74330 (0.0011) [2023-10-12 18:53:28,094][62634] Updated weights for policy 0, policy_version 74310 (0.0008) [2023-10-12 18:53:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152207360. Throughput: 0: 1706.6, 1: 1670.8. Samples: 38061562. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:28,436][61643] Avg episode reward: [(0, '24.400'), (1, '10.020')] [2023-10-12 18:53:28,472][62634] Updated weights for policy 0, policy_version 74320 (0.0010) [2023-10-12 18:53:28,857][62634] Updated weights for policy 0, policy_version 74330 (0.0010) [2023-10-12 18:53:30,811][62635] Updated weights for policy 1, policy_version 74340 (0.0009) [2023-10-12 18:53:31,170][62635] Updated weights for policy 1, policy_version 74350 (0.0007) [2023-10-12 18:53:31,548][62635] Updated weights for policy 1, policy_version 74360 (0.0009) [2023-10-12 18:53:32,939][62634] Updated weights for policy 0, policy_version 74340 (0.0009) [2023-10-12 18:53:33,322][62634] Updated weights for policy 0, policy_version 74350 (0.0007) [2023-10-12 18:53:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152272896. Throughput: 0: 1700.8, 1: 1693.8. Samples: 38082122. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:33,435][61643] Avg episode reward: [(0, '24.250'), (1, '9.930')] [2023-10-12 18:53:33,698][62634] Updated weights for policy 0, policy_version 74360 (0.0007) [2023-10-12 18:53:35,569][62635] Updated weights for policy 1, policy_version 74370 (0.0009) [2023-10-12 18:53:35,937][62635] Updated weights for policy 1, policy_version 74380 (0.0008) [2023-10-12 18:53:36,295][62635] Updated weights for policy 1, policy_version 74390 (0.0008) [2023-10-12 18:53:36,660][62635] Updated weights for policy 1, policy_version 74400 (0.0008) [2023-10-12 18:53:37,834][62634] Updated weights for policy 0, policy_version 74370 (0.0009) [2023-10-12 18:53:38,219][62634] Updated weights for policy 0, policy_version 74380 (0.0008) [2023-10-12 18:53:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152338432. Throughput: 0: 1708.9, 1: 1686.8. Samples: 38092180. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:38,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.830')] [2023-10-12 18:53:38,596][62634] Updated weights for policy 0, policy_version 74390 (0.0009) [2023-10-12 18:53:38,972][62634] Updated weights for policy 0, policy_version 74400 (0.0007) [2023-10-12 18:53:40,701][62635] Updated weights for policy 1, policy_version 74410 (0.0008) [2023-10-12 18:53:41,063][62635] Updated weights for policy 1, policy_version 74420 (0.0009) [2023-10-12 18:53:41,437][62635] Updated weights for policy 1, policy_version 74430 (0.0008) [2023-10-12 18:53:42,978][62634] Updated weights for policy 0, policy_version 74410 (0.0009) [2023-10-12 18:53:43,362][62634] Updated weights for policy 0, policy_version 74420 (0.0009) [2023-10-12 18:53:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152403968. Throughput: 0: 1708.3, 1: 1681.7. Samples: 38112290. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:43,435][61643] Avg episode reward: [(0, '24.780'), (1, '10.020')] [2023-10-12 18:53:43,733][62634] Updated weights for policy 0, policy_version 74430 (0.0009) [2023-10-12 18:53:45,537][62635] Updated weights for policy 1, policy_version 74440 (0.0008) [2023-10-12 18:53:45,910][62635] Updated weights for policy 1, policy_version 74450 (0.0007) [2023-10-12 18:53:46,278][62635] Updated weights for policy 1, policy_version 74460 (0.0009) [2023-10-12 18:53:47,822][62634] Updated weights for policy 0, policy_version 74440 (0.0009) [2023-10-12 18:53:48,200][62634] Updated weights for policy 0, policy_version 74450 (0.0009) [2023-10-12 18:53:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 152469504. Throughput: 0: 1697.6, 1: 1695.9. Samples: 38132608. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:48,436][61643] Avg episode reward: [(0, '24.660'), (1, '9.830')] [2023-10-12 18:53:48,580][62634] Updated weights for policy 0, policy_version 74460 (0.0010) [2023-10-12 18:53:50,185][62635] Updated weights for policy 1, policy_version 74470 (0.0008) [2023-10-12 18:53:50,546][62635] Updated weights for policy 1, policy_version 74480 (0.0008) [2023-10-12 18:53:50,918][62635] Updated weights for policy 1, policy_version 74490 (0.0008) [2023-10-12 18:53:52,601][62634] Updated weights for policy 0, policy_version 74470 (0.0009) [2023-10-12 18:53:52,981][62634] Updated weights for policy 0, policy_version 74480 (0.0009) [2023-10-12 18:53:53,347][62634] Updated weights for policy 0, policy_version 74490 (0.0009) [2023-10-12 18:53:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 152535040. Throughput: 0: 1706.8, 1: 1670.0. Samples: 38142144. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-12 18:53:53,435][61643] Avg episode reward: [(0, '24.510'), (1, '9.860')] [2023-10-12 18:53:55,224][62635] Updated weights for policy 1, policy_version 74500 (0.0007) [2023-10-12 18:53:55,591][62635] Updated weights for policy 1, policy_version 74510 (0.0007) [2023-10-12 18:53:55,958][62635] Updated weights for policy 1, policy_version 74520 (0.0007) [2023-10-12 18:53:57,333][62634] Updated weights for policy 0, policy_version 74500 (0.0008) [2023-10-12 18:53:57,716][62634] Updated weights for policy 0, policy_version 74510 (0.0007) [2023-10-12 18:53:58,093][62634] Updated weights for policy 0, policy_version 74520 (0.0007) [2023-10-12 18:53:58,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152633344. Throughput: 0: 1695.4, 1: 1683.2. Samples: 38162630. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:53:58,436][61643] Avg episode reward: [(0, '24.510'), (1, '10.040')] [2023-10-12 18:54:00,038][62635] Updated weights for policy 1, policy_version 74530 (0.0007) [2023-10-12 18:54:00,414][62635] Updated weights for policy 1, policy_version 74540 (0.0008) [2023-10-12 18:54:00,782][62635] Updated weights for policy 1, policy_version 74550 (0.0007) [2023-10-12 18:54:01,153][62635] Updated weights for policy 1, policy_version 74560 (0.0008) [2023-10-12 18:54:01,952][62634] Updated weights for policy 0, policy_version 74530 (0.0007) [2023-10-12 18:54:02,336][62634] Updated weights for policy 0, policy_version 74540 (0.0011) [2023-10-12 18:54:02,702][62634] Updated weights for policy 0, policy_version 74550 (0.0008) [2023-10-12 18:54:03,079][62634] Updated weights for policy 0, policy_version 74560 (0.0007) [2023-10-12 18:54:03,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152698880. Throughput: 0: 1667.3, 1: 1685.6. Samples: 38182324. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:03,435][61643] Avg episode reward: [(0, '24.580'), (1, '9.850')] [2023-10-12 18:54:05,003][62635] Updated weights for policy 1, policy_version 74570 (0.0007) [2023-10-12 18:54:05,376][62635] Updated weights for policy 1, policy_version 74580 (0.0007) [2023-10-12 18:54:05,747][62635] Updated weights for policy 1, policy_version 74590 (0.0007) [2023-10-12 18:54:07,386][62634] Updated weights for policy 0, policy_version 74570 (0.0008) [2023-10-12 18:54:07,765][62634] Updated weights for policy 0, policy_version 74580 (0.0007) [2023-10-12 18:54:08,136][62634] Updated weights for policy 0, policy_version 74590 (0.0009) [2023-10-12 18:54:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152764416. Throughput: 0: 1687.2, 1: 1664.0. Samples: 38192502. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:08,436][61643] Avg episode reward: [(0, '24.630'), (1, '9.940')] [2023-10-12 18:54:09,753][62635] Updated weights for policy 1, policy_version 74600 (0.0009) [2023-10-12 18:54:10,125][62635] Updated weights for policy 1, policy_version 74610 (0.0010) [2023-10-12 18:54:10,488][62635] Updated weights for policy 1, policy_version 74620 (0.0007) [2023-10-12 18:54:12,131][62634] Updated weights for policy 0, policy_version 74600 (0.0008) [2023-10-12 18:54:12,507][62634] Updated weights for policy 0, policy_version 74610 (0.0008) [2023-10-12 18:54:12,882][62634] Updated weights for policy 0, policy_version 74620 (0.0007) [2023-10-12 18:54:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152829952. Throughput: 0: 1679.6, 1: 1691.2. Samples: 38213246. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:13,435][61643] Avg episode reward: [(0, '24.710'), (1, '10.040')] [2023-10-12 18:54:14,599][62635] Updated weights for policy 1, policy_version 74630 (0.0008) [2023-10-12 18:54:14,971][62635] Updated weights for policy 1, policy_version 74640 (0.0008) [2023-10-12 18:54:15,336][62635] Updated weights for policy 1, policy_version 74650 (0.0010) [2023-10-12 18:54:17,078][62634] Updated weights for policy 0, policy_version 74630 (0.0009) [2023-10-12 18:54:17,454][62634] Updated weights for policy 0, policy_version 74640 (0.0009) [2023-10-12 18:54:17,832][62634] Updated weights for policy 0, policy_version 74650 (0.0008) [2023-10-12 18:54:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152895488. Throughput: 0: 1662.0, 1: 1689.8. Samples: 38232950. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:18,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.800')] [2023-10-12 18:54:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000074656_76447744.pth... [2023-10-12 18:54:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000074656_76447744.pth... 
[2023-10-12 18:54:18,475][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000073088_74842112.pth [2023-10-12 18:54:18,478][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000073088_74842112.pth [2023-10-12 18:54:19,467][62635] Updated weights for policy 1, policy_version 74660 (0.0008) [2023-10-12 18:54:19,834][62635] Updated weights for policy 1, policy_version 74670 (0.0010) [2023-10-12 18:54:20,201][62635] Updated weights for policy 1, policy_version 74680 (0.0010) [2023-10-12 18:54:21,819][62634] Updated weights for policy 0, policy_version 74660 (0.0010) [2023-10-12 18:54:22,188][62634] Updated weights for policy 0, policy_version 74670 (0.0008) [2023-10-12 18:54:22,567][62634] Updated weights for policy 0, policy_version 74680 (0.0008) [2023-10-12 18:54:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 152961024. Throughput: 0: 1681.1, 1: 1669.8. Samples: 38242968. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:23,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.870')] [2023-10-12 18:54:24,333][62635] Updated weights for policy 1, policy_version 74690 (0.0008) [2023-10-12 18:54:24,698][62635] Updated weights for policy 1, policy_version 74700 (0.0008) [2023-10-12 18:54:25,061][62635] Updated weights for policy 1, policy_version 74710 (0.0010) [2023-10-12 18:54:25,424][62635] Updated weights for policy 1, policy_version 74720 (0.0010) [2023-10-12 18:54:26,475][62634] Updated weights for policy 0, policy_version 74690 (0.0008) [2023-10-12 18:54:26,857][62634] Updated weights for policy 0, policy_version 74700 (0.0009) [2023-10-12 18:54:27,223][62634] Updated weights for policy 0, policy_version 74710 (0.0008) [2023-10-12 18:54:27,602][62634] Updated weights for policy 0, policy_version 74720 (0.0007) [2023-10-12 18:54:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153026560. 
Throughput: 0: 1666.3, 1: 1683.2. Samples: 38263022. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:28,436][61643] Avg episode reward: [(0, '24.730'), (1, '9.960')] [2023-10-12 18:54:29,694][62635] Updated weights for policy 1, policy_version 74730 (0.0009) [2023-10-12 18:54:30,052][62635] Updated weights for policy 1, policy_version 74740 (0.0007) [2023-10-12 18:54:30,424][62635] Updated weights for policy 1, policy_version 74750 (0.0008) [2023-10-12 18:54:31,778][62634] Updated weights for policy 0, policy_version 74730 (0.0009) [2023-10-12 18:54:32,143][62634] Updated weights for policy 0, policy_version 74740 (0.0007) [2023-10-12 18:54:32,529][62634] Updated weights for policy 0, policy_version 74750 (0.0007) [2023-10-12 18:54:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153092096. Throughput: 0: 1662.8, 1: 1675.9. Samples: 38282846. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:33,435][61643] Avg episode reward: [(0, '24.790'), (1, '10.050')] [2023-10-12 18:54:34,553][62635] Updated weights for policy 1, policy_version 74760 (0.0008) [2023-10-12 18:54:34,915][62635] Updated weights for policy 1, policy_version 74770 (0.0009) [2023-10-12 18:54:35,286][62635] Updated weights for policy 1, policy_version 74780 (0.0007) [2023-10-12 18:54:36,520][62634] Updated weights for policy 0, policy_version 74760 (0.0010) [2023-10-12 18:54:36,888][62634] Updated weights for policy 0, policy_version 74770 (0.0010) [2023-10-12 18:54:37,269][62634] Updated weights for policy 0, policy_version 74780 (0.0010) [2023-10-12 18:54:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 153157632. Throughput: 0: 1686.9, 1: 1672.0. Samples: 38293296. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:38,435][61643] Avg episode reward: [(0, '24.780'), (1, '9.870')] [2023-10-12 18:54:39,244][62635] Updated weights for policy 1, policy_version 74790 (0.0008) [2023-10-12 18:54:39,613][62635] Updated weights for policy 1, policy_version 74800 (0.0007) [2023-10-12 18:54:39,973][62635] Updated weights for policy 1, policy_version 74810 (0.0008) [2023-10-12 18:54:41,368][62634] Updated weights for policy 0, policy_version 74790 (0.0007) [2023-10-12 18:54:41,744][62634] Updated weights for policy 0, policy_version 74800 (0.0007) [2023-10-12 18:54:42,120][62634] Updated weights for policy 0, policy_version 74810 (0.0011) [2023-10-12 18:54:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153223168. Throughput: 0: 1667.5, 1: 1681.9. Samples: 38313352. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-12 18:54:43,435][61643] Avg episode reward: [(0, '24.890'), (1, '9.960')] [2023-10-12 18:54:43,880][62635] Updated weights for policy 1, policy_version 74820 (0.0007) [2023-10-12 18:54:44,254][62635] Updated weights for policy 1, policy_version 74830 (0.0008) [2023-10-12 18:54:44,623][62635] Updated weights for policy 1, policy_version 74840 (0.0008) [2023-10-12 18:54:46,196][62634] Updated weights for policy 0, policy_version 74820 (0.0009) [2023-10-12 18:54:46,567][62634] Updated weights for policy 0, policy_version 74830 (0.0011) [2023-10-12 18:54:46,953][62634] Updated weights for policy 0, policy_version 74840 (0.0009) [2023-10-12 18:54:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 153288704. Throughput: 0: 1677.3, 1: 1686.6. Samples: 38333702. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:54:48,436][61643] Avg episode reward: [(0, '25.120'), (1, '10.120')] [2023-10-12 18:54:48,543][62635] Updated weights for policy 1, policy_version 74850 (0.0009) [2023-10-12 18:54:48,907][62635] Updated weights for policy 1, policy_version 74860 (0.0009) [2023-10-12 18:54:49,274][62635] Updated weights for policy 1, policy_version 74870 (0.0009) [2023-10-12 18:54:49,642][62635] Updated weights for policy 1, policy_version 74880 (0.0009) [2023-10-12 18:54:51,105][62634] Updated weights for policy 0, policy_version 74850 (0.0010) [2023-10-12 18:54:51,488][62634] Updated weights for policy 0, policy_version 74860 (0.0010) [2023-10-12 18:54:51,861][62634] Updated weights for policy 0, policy_version 74870 (0.0007) [2023-10-12 18:54:52,236][62634] Updated weights for policy 0, policy_version 74880 (0.0007) [2023-10-12 18:54:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153354240. Throughput: 0: 1682.4, 1: 1682.0. Samples: 38343900. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:54:53,435][61643] Avg episode reward: [(0, '25.120'), (1, '10.030')] [2023-10-12 18:54:53,711][62635] Updated weights for policy 1, policy_version 74890 (0.0008) [2023-10-12 18:54:54,072][62635] Updated weights for policy 1, policy_version 74900 (0.0008) [2023-10-12 18:54:54,439][62635] Updated weights for policy 1, policy_version 74910 (0.0007) [2023-10-12 18:54:56,178][62634] Updated weights for policy 0, policy_version 74890 (0.0008) [2023-10-12 18:54:56,559][62634] Updated weights for policy 0, policy_version 74900 (0.0008) [2023-10-12 18:54:56,936][62634] Updated weights for policy 0, policy_version 74910 (0.0008) [2023-10-12 18:54:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153419776. Throughput: 0: 1660.4, 1: 1680.4. Samples: 38363578. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:54:58,435][61643] Avg episode reward: [(0, '25.070'), (1, '9.820')] [2023-10-12 18:54:58,523][62635] Updated weights for policy 1, policy_version 74920 (0.0009) [2023-10-12 18:54:58,889][62635] Updated weights for policy 1, policy_version 74930 (0.0008) [2023-10-12 18:54:59,265][62635] Updated weights for policy 1, policy_version 74940 (0.0009) [2023-10-12 18:55:01,053][62634] Updated weights for policy 0, policy_version 74920 (0.0011) [2023-10-12 18:55:01,437][62634] Updated weights for policy 0, policy_version 74930 (0.0007) [2023-10-12 18:55:01,809][62634] Updated weights for policy 0, policy_version 74940 (0.0007) [2023-10-12 18:55:03,320][62635] Updated weights for policy 1, policy_version 74950 (0.0010) [2023-10-12 18:55:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 153485312. Throughput: 0: 1677.6, 1: 1679.4. Samples: 38384014. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:03,436][61643] Avg episode reward: [(0, '24.800'), (1, '9.910')] [2023-10-12 18:55:03,694][62635] Updated weights for policy 1, policy_version 74960 (0.0010) [2023-10-12 18:55:04,053][62635] Updated weights for policy 1, policy_version 74970 (0.0010) [2023-10-12 18:55:05,915][62634] Updated weights for policy 0, policy_version 74950 (0.0009) [2023-10-12 18:55:06,290][62634] Updated weights for policy 0, policy_version 74960 (0.0009) [2023-10-12 18:55:06,668][62634] Updated weights for policy 0, policy_version 74970 (0.0011) [2023-10-12 18:55:08,112][62635] Updated weights for policy 1, policy_version 74980 (0.0010) [2023-10-12 18:55:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153550848. Throughput: 0: 1675.6, 1: 1684.0. Samples: 38394154. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:08,436][61643] Avg episode reward: [(0, '24.940'), (1, '9.820')] [2023-10-12 18:55:08,476][62635] Updated weights for policy 1, policy_version 74990 (0.0007) [2023-10-12 18:55:08,840][62635] Updated weights for policy 1, policy_version 75000 (0.0007) [2023-10-12 18:55:10,671][62634] Updated weights for policy 0, policy_version 74980 (0.0008) [2023-10-12 18:55:11,051][62634] Updated weights for policy 0, policy_version 74990 (0.0009) [2023-10-12 18:55:11,427][62634] Updated weights for policy 0, policy_version 75000 (0.0007) [2023-10-12 18:55:13,011][62635] Updated weights for policy 1, policy_version 75010 (0.0008) [2023-10-12 18:55:13,381][62635] Updated weights for policy 1, policy_version 75020 (0.0007) [2023-10-12 18:55:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153616384. Throughput: 0: 1666.1, 1: 1687.6. Samples: 38413942. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:13,435][61643] Avg episode reward: [(0, '24.900'), (1, '9.810')] [2023-10-12 18:55:13,751][62635] Updated weights for policy 1, policy_version 75030 (0.0008) [2023-10-12 18:55:14,107][62635] Updated weights for policy 1, policy_version 75040 (0.0007) [2023-10-12 18:55:15,511][62634] Updated weights for policy 0, policy_version 75010 (0.0007) [2023-10-12 18:55:15,888][62634] Updated weights for policy 0, policy_version 75020 (0.0008) [2023-10-12 18:55:16,258][62634] Updated weights for policy 0, policy_version 75030 (0.0008) [2023-10-12 18:55:16,635][62634] Updated weights for policy 0, policy_version 75040 (0.0011) [2023-10-12 18:55:18,232][62635] Updated weights for policy 1, policy_version 75050 (0.0010) [2023-10-12 18:55:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 153681920. Throughput: 0: 1682.4, 1: 1685.8. Samples: 38434416. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:18,435][61643] Avg episode reward: [(0, '24.810'), (1, '9.900')] [2023-10-12 18:55:18,602][62635] Updated weights for policy 1, policy_version 75060 (0.0009) [2023-10-12 18:55:18,970][62635] Updated weights for policy 1, policy_version 75070 (0.0009) [2023-10-12 18:55:20,757][62634] Updated weights for policy 0, policy_version 75050 (0.0010) [2023-10-12 18:55:21,123][62634] Updated weights for policy 0, policy_version 75060 (0.0007) [2023-10-12 18:55:21,499][62634] Updated weights for policy 0, policy_version 75070 (0.0007) [2023-10-12 18:55:23,028][62635] Updated weights for policy 1, policy_version 75080 (0.0008) [2023-10-12 18:55:23,398][62635] Updated weights for policy 1, policy_version 75090 (0.0008) [2023-10-12 18:55:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 153747456. Throughput: 0: 1665.6, 1: 1690.8. Samples: 38444334. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:23,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.830')] [2023-10-12 18:55:23,767][62635] Updated weights for policy 1, policy_version 75100 (0.0009) [2023-10-12 18:55:25,505][62634] Updated weights for policy 0, policy_version 75080 (0.0009) [2023-10-12 18:55:25,893][62634] Updated weights for policy 0, policy_version 75090 (0.0010) [2023-10-12 18:55:26,262][62634] Updated weights for policy 0, policy_version 75100 (0.0010) [2023-10-12 18:55:27,803][62635] Updated weights for policy 1, policy_version 75110 (0.0008) [2023-10-12 18:55:28,177][62635] Updated weights for policy 1, policy_version 75120 (0.0010) [2023-10-12 18:55:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 153812992. Throughput: 0: 1665.2, 1: 1691.8. Samples: 38464418. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:28,435][61643] Avg episode reward: [(0, '24.420'), (1, '9.690')] [2023-10-12 18:55:28,542][62635] Updated weights for policy 1, policy_version 75130 (0.0008) [2023-10-12 18:55:30,326][62634] Updated weights for policy 0, policy_version 75110 (0.0009) [2023-10-12 18:55:30,703][62634] Updated weights for policy 0, policy_version 75120 (0.0007) [2023-10-12 18:55:31,080][62634] Updated weights for policy 0, policy_version 75130 (0.0007) [2023-10-12 18:55:32,710][62635] Updated weights for policy 1, policy_version 75140 (0.0007) [2023-10-12 18:55:33,076][62635] Updated weights for policy 1, policy_version 75150 (0.0009) [2023-10-12 18:55:33,433][62635] Updated weights for policy 1, policy_version 75160 (0.0010) [2023-10-12 18:55:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 153878528. Throughput: 0: 1677.9, 1: 1674.4. Samples: 38484556. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 18:55:33,436][61643] Avg episode reward: [(0, '24.280'), (1, '9.970')] [2023-10-12 18:55:35,275][62634] Updated weights for policy 0, policy_version 75140 (0.0008) [2023-10-12 18:55:35,650][62634] Updated weights for policy 0, policy_version 75150 (0.0009) [2023-10-12 18:55:36,024][62634] Updated weights for policy 0, policy_version 75160 (0.0009) [2023-10-12 18:55:37,460][62635] Updated weights for policy 1, policy_version 75170 (0.0009) [2023-10-12 18:55:37,816][62635] Updated weights for policy 1, policy_version 75180 (0.0008) [2023-10-12 18:55:38,184][62635] Updated weights for policy 1, policy_version 75190 (0.0010) [2023-10-12 18:55:38,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 153944064. Throughput: 0: 1661.0, 1: 1688.7. Samples: 38494640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:55:38,436][61643] Avg episode reward: [(0, '24.330'), (1, '9.880')] [2023-10-12 18:55:38,551][62635] Updated weights for policy 1, policy_version 75200 (0.0010) [2023-10-12 18:55:40,030][62634] Updated weights for policy 0, policy_version 75170 (0.0010) [2023-10-12 18:55:40,404][62634] Updated weights for policy 0, policy_version 75180 (0.0009) [2023-10-12 18:55:40,791][62634] Updated weights for policy 0, policy_version 75190 (0.0007) [2023-10-12 18:55:41,166][62634] Updated weights for policy 0, policy_version 75200 (0.0008) [2023-10-12 18:55:42,627][62635] Updated weights for policy 1, policy_version 75210 (0.0008) [2023-10-12 18:55:43,004][62635] Updated weights for policy 1, policy_version 75220 (0.0009) [2023-10-12 18:55:43,374][62635] Updated weights for policy 1, policy_version 75230 (0.0008) [2023-10-12 18:55:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154009600. Throughput: 0: 1675.0, 1: 1687.6. Samples: 38514894. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:55:43,435][61643] Avg episode reward: [(0, '24.090'), (1, '9.800')] [2023-10-12 18:55:45,125][62634] Updated weights for policy 0, policy_version 75210 (0.0008) [2023-10-12 18:55:45,498][62634] Updated weights for policy 0, policy_version 75220 (0.0009) [2023-10-12 18:55:45,867][62634] Updated weights for policy 0, policy_version 75230 (0.0007) [2023-10-12 18:55:47,413][62635] Updated weights for policy 1, policy_version 75240 (0.0008) [2023-10-12 18:55:47,787][62635] Updated weights for policy 1, policy_version 75250 (0.0009) [2023-10-12 18:55:48,147][62635] Updated weights for policy 1, policy_version 75260 (0.0009) [2023-10-12 18:55:48,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154107904. Throughput: 0: 1682.3, 1: 1668.7. Samples: 38534808. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:55:48,435][61643] Avg episode reward: [(0, '24.120'), (1, '9.990')] [2023-10-12 18:55:49,966][62634] Updated weights for policy 0, policy_version 75240 (0.0010) [2023-10-12 18:55:50,338][62634] Updated weights for policy 0, policy_version 75250 (0.0011) [2023-10-12 18:55:50,722][62634] Updated weights for policy 0, policy_version 75260 (0.0011) [2023-10-12 18:55:52,285][62635] Updated weights for policy 1, policy_version 75270 (0.0009) [2023-10-12 18:55:52,662][62635] Updated weights for policy 1, policy_version 75280 (0.0008) [2023-10-12 18:55:53,022][62635] Updated weights for policy 1, policy_version 75290 (0.0008) [2023-10-12 18:55:53,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154173440. Throughput: 0: 1660.1, 1: 1689.5. Samples: 38544888. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:55:53,435][61643] Avg episode reward: [(0, '24.120'), (1, '9.900')] [2023-10-12 18:55:54,820][62634] Updated weights for policy 0, policy_version 75270 (0.0010) [2023-10-12 18:55:55,202][62634] Updated weights for policy 0, policy_version 75280 (0.0008) [2023-10-12 18:55:55,578][62634] Updated weights for policy 0, policy_version 75290 (0.0007) [2023-10-12 18:55:57,238][62635] Updated weights for policy 1, policy_version 75300 (0.0009) [2023-10-12 18:55:57,595][62635] Updated weights for policy 1, policy_version 75310 (0.0009) [2023-10-12 18:55:57,963][62635] Updated weights for policy 1, policy_version 75320 (0.0010) [2023-10-12 18:55:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154238976. Throughput: 0: 1682.3, 1: 1686.3. Samples: 38565526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:55:58,436][61643] Avg episode reward: [(0, '23.950'), (1, '9.990')] [2023-10-12 18:55:59,629][62634] Updated weights for policy 0, policy_version 75300 (0.0007) [2023-10-12 18:56:00,007][62634] Updated weights for policy 0, policy_version 75310 (0.0009) [2023-10-12 18:56:00,381][62634] Updated weights for policy 0, policy_version 75320 (0.0009) [2023-10-12 18:56:02,137][62635] Updated weights for policy 1, policy_version 75330 (0.0010) [2023-10-12 18:56:02,497][62635] Updated weights for policy 1, policy_version 75340 (0.0008) [2023-10-12 18:56:02,862][62635] Updated weights for policy 1, policy_version 75350 (0.0009) [2023-10-12 18:56:03,230][62635] Updated weights for policy 1, policy_version 75360 (0.0007) [2023-10-12 18:56:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154304512. Throughput: 0: 1682.7, 1: 1667.2. Samples: 38585158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:56:03,435][61643] Avg episode reward: [(0, '23.960'), (1, '9.930')] [2023-10-12 18:56:04,490][62634] Updated weights for policy 0, policy_version 75330 (0.0009) [2023-10-12 18:56:04,858][62634] Updated weights for policy 0, policy_version 75340 (0.0008) [2023-10-12 18:56:05,242][62634] Updated weights for policy 0, policy_version 75350 (0.0010) [2023-10-12 18:56:05,613][62634] Updated weights for policy 0, policy_version 75360 (0.0008) [2023-10-12 18:56:07,444][62635] Updated weights for policy 1, policy_version 75370 (0.0007) [2023-10-12 18:56:07,810][62635] Updated weights for policy 1, policy_version 75380 (0.0007) [2023-10-12 18:56:08,170][62635] Updated weights for policy 1, policy_version 75390 (0.0008) [2023-10-12 18:56:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154370048. Throughput: 0: 1666.0, 1: 1682.6. Samples: 38595020. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:56:08,435][61643] Avg episode reward: [(0, '23.840'), (1, '9.890')] [2023-10-12 18:56:09,619][62634] Updated weights for policy 0, policy_version 75370 (0.0009) [2023-10-12 18:56:09,990][62634] Updated weights for policy 0, policy_version 75380 (0.0009) [2023-10-12 18:56:10,367][62634] Updated weights for policy 0, policy_version 75390 (0.0008) [2023-10-12 18:56:11,930][62635] Updated weights for policy 1, policy_version 75400 (0.0010) [2023-10-12 18:56:12,291][62635] Updated weights for policy 1, policy_version 75410 (0.0010) [2023-10-12 18:56:12,661][62635] Updated weights for policy 1, policy_version 75420 (0.0009) [2023-10-12 18:56:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154435584. Throughput: 0: 1684.6, 1: 1665.9. Samples: 38615190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:56:13,436][61643] Avg episode reward: [(0, '23.650'), (1, '9.800')] [2023-10-12 18:56:14,347][62634] Updated weights for policy 0, policy_version 75400 (0.0009) [2023-10-12 18:56:14,722][62634] Updated weights for policy 0, policy_version 75410 (0.0009) [2023-10-12 18:56:15,104][62634] Updated weights for policy 0, policy_version 75420 (0.0007) [2023-10-12 18:56:16,661][62635] Updated weights for policy 1, policy_version 75430 (0.0009) [2023-10-12 18:56:17,028][62635] Updated weights for policy 1, policy_version 75440 (0.0008) [2023-10-12 18:56:17,397][62635] Updated weights for policy 1, policy_version 75450 (0.0008) [2023-10-12 18:56:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154501120. Throughput: 0: 1686.8, 1: 1664.2. Samples: 38635348. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:56:18,435][61643] Avg episode reward: [(0, '23.220'), (1, '9.980')] [2023-10-12 18:56:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000075456_77266944.pth... 
[2023-10-12 18:56:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000075424_77234176.pth... [2023-10-12 18:56:18,480][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000073856_75628544.pth [2023-10-12 18:56:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000073888_75661312.pth [2023-10-12 18:56:19,258][62634] Updated weights for policy 0, policy_version 75430 (0.0008) [2023-10-12 18:56:19,630][62634] Updated weights for policy 0, policy_version 75440 (0.0008) [2023-10-12 18:56:20,010][62634] Updated weights for policy 0, policy_version 75450 (0.0009) [2023-10-12 18:56:21,487][62635] Updated weights for policy 1, policy_version 75460 (0.0007) [2023-10-12 18:56:21,847][62635] Updated weights for policy 1, policy_version 75470 (0.0008) [2023-10-12 18:56:22,219][62635] Updated weights for policy 1, policy_version 75480 (0.0008) [2023-10-12 18:56:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154566656. Throughput: 0: 1674.5, 1: 1682.1. Samples: 38645684. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:56:23,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.890')] [2023-10-12 18:56:23,975][62634] Updated weights for policy 0, policy_version 75460 (0.0008) [2023-10-12 18:56:24,349][62634] Updated weights for policy 0, policy_version 75470 (0.0007) [2023-10-12 18:56:24,723][62634] Updated weights for policy 0, policy_version 75480 (0.0008) [2023-10-12 18:56:26,375][62635] Updated weights for policy 1, policy_version 75490 (0.0008) [2023-10-12 18:56:26,734][62635] Updated weights for policy 1, policy_version 75500 (0.0009) [2023-10-12 18:56:27,099][62635] Updated weights for policy 1, policy_version 75510 (0.0009) [2023-10-12 18:56:27,466][62635] Updated weights for policy 1, policy_version 75520 (0.0009) [2023-10-12 18:56:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 154632192. Throughput: 0: 1685.4, 1: 1668.8. Samples: 38665836. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:28,435][61643] Avg episode reward: [(0, '23.370'), (1, '9.850')] [2023-10-12 18:56:28,736][62634] Updated weights for policy 0, policy_version 75490 (0.0008) [2023-10-12 18:56:29,110][62634] Updated weights for policy 0, policy_version 75500 (0.0008) [2023-10-12 18:56:29,486][62634] Updated weights for policy 0, policy_version 75510 (0.0007) [2023-10-12 18:56:29,862][62634] Updated weights for policy 0, policy_version 75520 (0.0010) [2023-10-12 18:56:31,445][62635] Updated weights for policy 1, policy_version 75530 (0.0007) [2023-10-12 18:56:31,821][62635] Updated weights for policy 1, policy_version 75540 (0.0008) [2023-10-12 18:56:32,188][62635] Updated weights for policy 1, policy_version 75550 (0.0007) [2023-10-12 18:56:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154697728. Throughput: 0: 1686.6, 1: 1675.7. Samples: 38686112. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:33,436][61643] Avg episode reward: [(0, '23.650'), (1, '9.940')] [2023-10-12 18:56:34,022][62634] Updated weights for policy 0, policy_version 75530 (0.0008) [2023-10-12 18:56:34,395][62634] Updated weights for policy 0, policy_version 75540 (0.0008) [2023-10-12 18:56:34,773][62634] Updated weights for policy 0, policy_version 75550 (0.0010) [2023-10-12 18:56:36,122][62635] Updated weights for policy 1, policy_version 75560 (0.0007) [2023-10-12 18:56:36,499][62635] Updated weights for policy 1, policy_version 75570 (0.0009) [2023-10-12 18:56:36,861][62635] Updated weights for policy 1, policy_version 75580 (0.0010) [2023-10-12 18:56:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154763264. Throughput: 0: 1684.0, 1: 1679.4. Samples: 38696242. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:38,435][61643] Avg episode reward: [(0, '23.950'), (1, '10.030')] [2023-10-12 18:56:38,770][62634] Updated weights for policy 0, policy_version 75560 (0.0009) [2023-10-12 18:56:39,149][62634] Updated weights for policy 0, policy_version 75570 (0.0008) [2023-10-12 18:56:39,532][62634] Updated weights for policy 0, policy_version 75580 (0.0007) [2023-10-12 18:56:40,948][62635] Updated weights for policy 1, policy_version 75590 (0.0008) [2023-10-12 18:56:41,320][62635] Updated weights for policy 1, policy_version 75600 (0.0009) [2023-10-12 18:56:41,689][62635] Updated weights for policy 1, policy_version 75610 (0.0008) [2023-10-12 18:56:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154828800. Throughput: 0: 1685.6, 1: 1657.6. Samples: 38715972. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:43,436][61643] Avg episode reward: [(0, '23.600'), (1, '9.890')] [2023-10-12 18:56:43,709][62634] Updated weights for policy 0, policy_version 75590 (0.0010) [2023-10-12 18:56:44,088][62634] Updated weights for policy 0, policy_version 75600 (0.0007) [2023-10-12 18:56:44,465][62634] Updated weights for policy 0, policy_version 75610 (0.0008) [2023-10-12 18:56:45,658][62635] Updated weights for policy 1, policy_version 75620 (0.0007) [2023-10-12 18:56:46,030][62635] Updated weights for policy 1, policy_version 75630 (0.0007) [2023-10-12 18:56:46,403][62635] Updated weights for policy 1, policy_version 75640 (0.0008) [2023-10-12 18:56:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154894336. Throughput: 0: 1682.8, 1: 1682.9. Samples: 38736612. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:48,435][61643] Avg episode reward: [(0, '23.680'), (1, '10.000')] [2023-10-12 18:56:48,586][62634] Updated weights for policy 0, policy_version 75620 (0.0007) [2023-10-12 18:56:48,961][62634] Updated weights for policy 0, policy_version 75630 (0.0009) [2023-10-12 18:56:49,331][62634] Updated weights for policy 0, policy_version 75640 (0.0009) [2023-10-12 18:56:50,399][62635] Updated weights for policy 1, policy_version 75650 (0.0008) [2023-10-12 18:56:50,769][62635] Updated weights for policy 1, policy_version 75660 (0.0008) [2023-10-12 18:56:51,135][62635] Updated weights for policy 1, policy_version 75670 (0.0007) [2023-10-12 18:56:51,505][62635] Updated weights for policy 1, policy_version 75680 (0.0010) [2023-10-12 18:56:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154959872. Throughput: 0: 1682.6, 1: 1675.2. Samples: 38746122. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:53,435][61643] Avg episode reward: [(0, '23.350'), (1, '9.720')] [2023-10-12 18:56:53,545][62634] Updated weights for policy 0, policy_version 75650 (0.0007) [2023-10-12 18:56:53,921][62634] Updated weights for policy 0, policy_version 75660 (0.0008) [2023-10-12 18:56:54,302][62634] Updated weights for policy 0, policy_version 75670 (0.0009) [2023-10-12 18:56:54,681][62634] Updated weights for policy 0, policy_version 75680 (0.0010) [2023-10-12 18:56:55,720][62635] Updated weights for policy 1, policy_version 75690 (0.0010) [2023-10-12 18:56:56,091][62635] Updated weights for policy 1, policy_version 75700 (0.0009) [2023-10-12 18:56:56,463][62635] Updated weights for policy 1, policy_version 75710 (0.0009) [2023-10-12 18:56:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155025408. Throughput: 0: 1676.8, 1: 1674.6. Samples: 38766000. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:56:58,436][61643] Avg episode reward: [(0, '23.320'), (1, '9.840')] [2023-10-12 18:56:58,801][62634] Updated weights for policy 0, policy_version 75690 (0.0008) [2023-10-12 18:56:59,172][62634] Updated weights for policy 0, policy_version 75700 (0.0008) [2023-10-12 18:56:59,547][62634] Updated weights for policy 0, policy_version 75710 (0.0009) [2023-10-12 18:57:00,498][62635] Updated weights for policy 1, policy_version 75720 (0.0008) [2023-10-12 18:57:00,876][62635] Updated weights for policy 1, policy_version 75730 (0.0007) [2023-10-12 18:57:01,243][62635] Updated weights for policy 1, policy_version 75740 (0.0007) [2023-10-12 18:57:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155090944. Throughput: 0: 1667.0, 1: 1687.8. Samples: 38786314. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:57:03,435][61643] Avg episode reward: [(0, '23.290'), (1, '9.740')] [2023-10-12 18:57:03,675][62634] Updated weights for policy 0, policy_version 75720 (0.0010) [2023-10-12 18:57:04,050][62634] Updated weights for policy 0, policy_version 75730 (0.0009) [2023-10-12 18:57:04,438][62634] Updated weights for policy 0, policy_version 75740 (0.0008) [2023-10-12 18:57:05,326][62635] Updated weights for policy 1, policy_version 75750 (0.0007) [2023-10-12 18:57:05,691][62635] Updated weights for policy 1, policy_version 75760 (0.0008) [2023-10-12 18:57:06,049][62635] Updated weights for policy 1, policy_version 75770 (0.0008) [2023-10-12 18:57:08,248][62634] Updated weights for policy 0, policy_version 75750 (0.0007) [2023-10-12 18:57:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155156480. Throughput: 0: 1672.6, 1: 1664.4. Samples: 38795850. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:57:08,436][61643] Avg episode reward: [(0, '23.410'), (1, '9.630')] [2023-10-12 18:57:08,625][62634] Updated weights for policy 0, policy_version 75760 (0.0008) [2023-10-12 18:57:08,999][62634] Updated weights for policy 0, policy_version 75770 (0.0009) [2023-10-12 18:57:10,167][62635] Updated weights for policy 1, policy_version 75780 (0.0008) [2023-10-12 18:57:10,528][62635] Updated weights for policy 1, policy_version 75790 (0.0010) [2023-10-12 18:57:10,897][62635] Updated weights for policy 1, policy_version 75800 (0.0010) [2023-10-12 18:57:13,022][62634] Updated weights for policy 0, policy_version 75780 (0.0009) [2023-10-12 18:57:13,385][62634] Updated weights for policy 0, policy_version 75790 (0.0008) [2023-10-12 18:57:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155222016. Throughput: 0: 1672.2, 1: 1672.8. Samples: 38816364. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 18:57:13,435][61643] Avg episode reward: [(0, '23.550'), (1, '9.850')] [2023-10-12 18:57:13,763][62634] Updated weights for policy 0, policy_version 75800 (0.0007) [2023-10-12 18:57:14,994][62635] Updated weights for policy 1, policy_version 75810 (0.0007) [2023-10-12 18:57:15,371][62635] Updated weights for policy 1, policy_version 75820 (0.0008) [2023-10-12 18:57:15,737][62635] Updated weights for policy 1, policy_version 75830 (0.0008) [2023-10-12 18:57:16,102][62635] Updated weights for policy 1, policy_version 75840 (0.0008) [2023-10-12 18:57:17,809][62634] Updated weights for policy 0, policy_version 75810 (0.0007) [2023-10-12 18:57:18,185][62634] Updated weights for policy 0, policy_version 75820 (0.0009) [2023-10-12 18:57:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155287552. Throughput: 0: 1669.9, 1: 1683.3. Samples: 38837006. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:18,435][61643] Avg episode reward: [(0, '23.580'), (1, '9.730')] [2023-10-12 18:57:18,561][62634] Updated weights for policy 0, policy_version 75830 (0.0009) [2023-10-12 18:57:18,933][62634] Updated weights for policy 0, policy_version 75840 (0.0009) [2023-10-12 18:57:20,163][62635] Updated weights for policy 1, policy_version 75850 (0.0007) [2023-10-12 18:57:20,533][62635] Updated weights for policy 1, policy_version 75860 (0.0007) [2023-10-12 18:57:20,902][62635] Updated weights for policy 1, policy_version 75870 (0.0009) [2023-10-12 18:57:22,977][62634] Updated weights for policy 0, policy_version 75850 (0.0007) [2023-10-12 18:57:23,356][62634] Updated weights for policy 0, policy_version 75860 (0.0008) [2023-10-12 18:57:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155353088. Throughput: 0: 1680.2, 1: 1659.0. Samples: 38846506. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:23,436][61643] Avg episode reward: [(0, '23.630'), (1, '9.750')] [2023-10-12 18:57:23,725][62634] Updated weights for policy 0, policy_version 75870 (0.0010) [2023-10-12 18:57:25,166][62635] Updated weights for policy 1, policy_version 75880 (0.0009) [2023-10-12 18:57:25,529][62635] Updated weights for policy 1, policy_version 75890 (0.0010) [2023-10-12 18:57:25,907][62635] Updated weights for policy 1, policy_version 75900 (0.0008) [2023-10-12 18:57:27,950][62634] Updated weights for policy 0, policy_version 75880 (0.0008) [2023-10-12 18:57:28,333][62634] Updated weights for policy 0, policy_version 75890 (0.0009) [2023-10-12 18:57:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 155418624. Throughput: 0: 1680.5, 1: 1682.4. Samples: 38867302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:28,435][61643] Avg episode reward: [(0, '23.500'), (1, '9.900')] [2023-10-12 18:57:28,710][62634] Updated weights for policy 0, policy_version 75900 (0.0010) [2023-10-12 18:57:29,955][62635] Updated weights for policy 1, policy_version 75910 (0.0010) [2023-10-12 18:57:30,319][62635] Updated weights for policy 1, policy_version 75920 (0.0008) [2023-10-12 18:57:30,688][62635] Updated weights for policy 1, policy_version 75930 (0.0010) [2023-10-12 18:57:32,533][62634] Updated weights for policy 0, policy_version 75910 (0.0009) [2023-10-12 18:57:32,908][62634] Updated weights for policy 0, policy_version 75920 (0.0009) [2023-10-12 18:57:33,284][62634] Updated weights for policy 0, policy_version 75930 (0.0007) [2023-10-12 18:57:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 155484160. Throughput: 0: 1674.0, 1: 1682.0. Samples: 38887630. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:33,435][61643] Avg episode reward: [(0, '23.910'), (1, '9.790')] [2023-10-12 18:57:34,801][62635] Updated weights for policy 1, policy_version 75940 (0.0009) [2023-10-12 18:57:35,176][62635] Updated weights for policy 1, policy_version 75950 (0.0009) [2023-10-12 18:57:35,537][62635] Updated weights for policy 1, policy_version 75960 (0.0009) [2023-10-12 18:57:37,136][62634] Updated weights for policy 0, policy_version 75940 (0.0007) [2023-10-12 18:57:37,512][62634] Updated weights for policy 0, policy_version 75950 (0.0007) [2023-10-12 18:57:37,894][62634] Updated weights for policy 0, policy_version 75960 (0.0008) [2023-10-12 18:57:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155582464. Throughput: 0: 1693.5, 1: 1668.7. Samples: 38897418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:38,435][61643] Avg episode reward: [(0, '23.580'), (1, '9.710')] [2023-10-12 18:57:39,332][62635] Updated weights for policy 1, policy_version 75970 (0.0009) [2023-10-12 18:57:39,696][62635] Updated weights for policy 1, policy_version 75980 (0.0012) [2023-10-12 18:57:40,061][62635] Updated weights for policy 1, policy_version 75990 (0.0007) [2023-10-12 18:57:40,432][62635] Updated weights for policy 1, policy_version 76000 (0.0007) [2023-10-12 18:57:41,922][62634] Updated weights for policy 0, policy_version 75970 (0.0009) [2023-10-12 18:57:42,292][62634] Updated weights for policy 0, policy_version 75980 (0.0009) [2023-10-12 18:57:42,677][62634] Updated weights for policy 0, policy_version 75990 (0.0009) [2023-10-12 18:57:43,047][62634] Updated weights for policy 0, policy_version 76000 (0.0010) [2023-10-12 18:57:43,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155648000. Throughput: 0: 1700.8, 1: 1690.0. Samples: 38918590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:43,436][61643] Avg episode reward: [(0, '23.150'), (1, '9.910')] [2023-10-12 18:57:44,446][62635] Updated weights for policy 1, policy_version 76010 (0.0009) [2023-10-12 18:57:44,805][62635] Updated weights for policy 1, policy_version 76020 (0.0009) [2023-10-12 18:57:45,169][62635] Updated weights for policy 1, policy_version 76030 (0.0009) [2023-10-12 18:57:47,009][62634] Updated weights for policy 0, policy_version 76010 (0.0007) [2023-10-12 18:57:47,382][62634] Updated weights for policy 0, policy_version 76020 (0.0008) [2023-10-12 18:57:47,762][62634] Updated weights for policy 0, policy_version 76030 (0.0008) [2023-10-12 18:57:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155713536. Throughput: 0: 1683.1, 1: 1696.2. Samples: 38938380. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:48,435][61643] Avg episode reward: [(0, '23.600'), (1, '9.820')] [2023-10-12 18:57:49,380][62635] Updated weights for policy 1, policy_version 76040 (0.0009) [2023-10-12 18:57:49,752][62635] Updated weights for policy 1, policy_version 76050 (0.0009) [2023-10-12 18:57:50,122][62635] Updated weights for policy 1, policy_version 76060 (0.0008) [2023-10-12 18:57:51,787][62634] Updated weights for policy 0, policy_version 76040 (0.0007) [2023-10-12 18:57:52,160][62634] Updated weights for policy 0, policy_version 76050 (0.0008) [2023-10-12 18:57:52,546][62634] Updated weights for policy 0, policy_version 76060 (0.0007) [2023-10-12 18:57:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155779072. Throughput: 0: 1707.0, 1: 1686.8. Samples: 38948574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:53,435][61643] Avg episode reward: [(0, '23.580'), (1, '9.700')] [2023-10-12 18:57:53,966][62635] Updated weights for policy 1, policy_version 76070 (0.0007) [2023-10-12 18:57:54,335][62635] Updated weights for policy 1, policy_version 76080 (0.0008) [2023-10-12 18:57:54,698][62635] Updated weights for policy 1, policy_version 76090 (0.0007) [2023-10-12 18:57:56,459][62634] Updated weights for policy 0, policy_version 76070 (0.0010) [2023-10-12 18:57:56,837][62634] Updated weights for policy 0, policy_version 76080 (0.0008) [2023-10-12 18:57:57,208][62634] Updated weights for policy 0, policy_version 76090 (0.0011) [2023-10-12 18:57:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155844608. Throughput: 0: 1691.5, 1: 1694.8. Samples: 38968752. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 18:57:58,436][61643] Avg episode reward: [(0, '23.730'), (1, '9.850')] [2023-10-12 18:57:58,780][62635] Updated weights for policy 1, policy_version 76100 (0.0008) [2023-10-12 18:57:59,138][62635] Updated weights for policy 1, policy_version 76110 (0.0010) [2023-10-12 18:57:59,513][62635] Updated weights for policy 1, policy_version 76120 (0.0007) [2023-10-12 18:58:01,108][62634] Updated weights for policy 0, policy_version 76100 (0.0010) [2023-10-12 18:58:01,491][62634] Updated weights for policy 0, policy_version 76110 (0.0007) [2023-10-12 18:58:01,878][62634] Updated weights for policy 0, policy_version 76120 (0.0010) [2023-10-12 18:58:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155910144. Throughput: 0: 1682.0, 1: 1694.4. Samples: 38988944. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:03,436][61643] Avg episode reward: [(0, '23.770'), (1, '9.780')] [2023-10-12 18:58:03,632][62635] Updated weights for policy 1, policy_version 76130 (0.0008) [2023-10-12 18:58:04,000][62635] Updated weights for policy 1, policy_version 76140 (0.0007) [2023-10-12 18:58:04,358][62635] Updated weights for policy 1, policy_version 76150 (0.0009) [2023-10-12 18:58:04,726][62635] Updated weights for policy 1, policy_version 76160 (0.0009) [2023-10-12 18:58:06,136][62634] Updated weights for policy 0, policy_version 76130 (0.0007) [2023-10-12 18:58:06,508][62634] Updated weights for policy 0, policy_version 76140 (0.0011) [2023-10-12 18:58:06,892][62634] Updated weights for policy 0, policy_version 76150 (0.0008) [2023-10-12 18:58:07,257][62634] Updated weights for policy 0, policy_version 76160 (0.0008) [2023-10-12 18:58:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155975680. Throughput: 0: 1702.4, 1: 1693.1. Samples: 38999302. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:08,436][61643] Avg episode reward: [(0, '24.230'), (1, '9.870')] [2023-10-12 18:58:08,799][62635] Updated weights for policy 1, policy_version 76170 (0.0010) [2023-10-12 18:58:09,168][62635] Updated weights for policy 1, policy_version 76180 (0.0010) [2023-10-12 18:58:09,542][62635] Updated weights for policy 1, policy_version 76190 (0.0010) [2023-10-12 18:58:11,296][62634] Updated weights for policy 0, policy_version 76170 (0.0010) [2023-10-12 18:58:11,672][62634] Updated weights for policy 0, policy_version 76180 (0.0008) [2023-10-12 18:58:12,043][62634] Updated weights for policy 0, policy_version 76190 (0.0007) [2023-10-12 18:58:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156041216. Throughput: 0: 1678.9, 1: 1690.4. Samples: 39018924. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:13,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.780')] [2023-10-12 18:58:13,640][62635] Updated weights for policy 1, policy_version 76200 (0.0009) [2023-10-12 18:58:14,006][62635] Updated weights for policy 1, policy_version 76210 (0.0008) [2023-10-12 18:58:14,380][62635] Updated weights for policy 1, policy_version 76220 (0.0007) [2023-10-12 18:58:15,908][62634] Updated weights for policy 0, policy_version 76200 (0.0009) [2023-10-12 18:58:16,279][62634] Updated weights for policy 0, policy_version 76210 (0.0008) [2023-10-12 18:58:16,662][62634] Updated weights for policy 0, policy_version 76220 (0.0010) [2023-10-12 18:58:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156106752. Throughput: 0: 1686.9, 1: 1692.4. Samples: 39039700. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:18,436][61643] Avg episode reward: [(0, '24.410'), (1, '9.690')] [2023-10-12 18:58:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000076224_78053376.pth... 
[2023-10-12 18:58:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000074656_76447744.pth [2023-10-12 18:58:18,587][62635] Updated weights for policy 1, policy_version 76230 (0.0008) [2023-10-12 18:58:18,951][62635] Updated weights for policy 1, policy_version 76240 (0.0009) [2023-10-12 18:58:19,317][62635] Updated weights for policy 1, policy_version 76250 (0.0007) [2023-10-12 18:58:19,536][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000076256_78086144.pth... [2023-10-12 18:58:19,575][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000074656_76447744.pth [2023-10-12 18:58:20,738][62634] Updated weights for policy 0, policy_version 76230 (0.0007) [2023-10-12 18:58:21,113][62634] Updated weights for policy 0, policy_version 76240 (0.0007) [2023-10-12 18:58:21,497][62634] Updated weights for policy 0, policy_version 76250 (0.0009) [2023-10-12 18:58:23,321][62635] Updated weights for policy 1, policy_version 76260 (0.0008) [2023-10-12 18:58:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156172288. Throughput: 0: 1688.0, 1: 1695.0. Samples: 39049656. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:23,435][61643] Avg episode reward: [(0, '24.480'), (1, '9.750')] [2023-10-12 18:58:23,689][62635] Updated weights for policy 1, policy_version 76270 (0.0008) [2023-10-12 18:58:24,061][62635] Updated weights for policy 1, policy_version 76280 (0.0009) [2023-10-12 18:58:25,496][62634] Updated weights for policy 0, policy_version 76260 (0.0008) [2023-10-12 18:58:25,870][62634] Updated weights for policy 0, policy_version 76270 (0.0009) [2023-10-12 18:58:26,255][62634] Updated weights for policy 0, policy_version 76280 (0.0009) [2023-10-12 18:58:28,194][62635] Updated weights for policy 1, policy_version 76290 (0.0008) [2023-10-12 18:58:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 156237824. Throughput: 0: 1667.3, 1: 1684.5. Samples: 39069422. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:28,436][61643] Avg episode reward: [(0, '24.260'), (1, '9.930')] [2023-10-12 18:58:28,569][62635] Updated weights for policy 1, policy_version 76300 (0.0007) [2023-10-12 18:58:28,937][62635] Updated weights for policy 1, policy_version 76310 (0.0009) [2023-10-12 18:58:29,302][62635] Updated weights for policy 1, policy_version 76320 (0.0008) [2023-10-12 18:58:30,256][62634] Updated weights for policy 0, policy_version 76290 (0.0008) [2023-10-12 18:58:30,648][62634] Updated weights for policy 0, policy_version 76300 (0.0009) [2023-10-12 18:58:31,026][62634] Updated weights for policy 0, policy_version 76310 (0.0008) [2023-10-12 18:58:31,392][62634] Updated weights for policy 0, policy_version 76320 (0.0007) [2023-10-12 18:58:33,330][62635] Updated weights for policy 1, policy_version 76330 (0.0010) [2023-10-12 18:58:33,435][61643] Fps is (10 sec: 13106.6, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 156303360. Throughput: 0: 1689.2, 1: 1679.2. Samples: 39089960. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:33,437][61643] Avg episode reward: [(0, '24.340'), (1, '9.970')] [2023-10-12 18:58:33,687][62635] Updated weights for policy 1, policy_version 76340 (0.0008) [2023-10-12 18:58:34,049][62635] Updated weights for policy 1, policy_version 76350 (0.0008) [2023-10-12 18:58:35,586][62634] Updated weights for policy 0, policy_version 76330 (0.0010) [2023-10-12 18:58:35,963][62634] Updated weights for policy 0, policy_version 76340 (0.0008) [2023-10-12 18:58:36,335][62634] Updated weights for policy 0, policy_version 76350 (0.0007) [2023-10-12 18:58:38,139][62635] Updated weights for policy 1, policy_version 76360 (0.0007) [2023-10-12 18:58:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156368896. Throughput: 0: 1675.4, 1: 1685.6. Samples: 39099822. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:38,436][61643] Avg episode reward: [(0, '24.270'), (1, '10.060')] [2023-10-12 18:58:38,514][62635] Updated weights for policy 1, policy_version 76370 (0.0007) [2023-10-12 18:58:38,881][62635] Updated weights for policy 1, policy_version 76380 (0.0007) [2023-10-12 18:58:40,384][62634] Updated weights for policy 0, policy_version 76360 (0.0007) [2023-10-12 18:58:40,769][62634] Updated weights for policy 0, policy_version 76370 (0.0008) [2023-10-12 18:58:41,144][62634] Updated weights for policy 0, policy_version 76380 (0.0008) [2023-10-12 18:58:42,868][62635] Updated weights for policy 1, policy_version 76390 (0.0007) [2023-10-12 18:58:43,226][62635] Updated weights for policy 1, policy_version 76400 (0.0007) [2023-10-12 18:58:43,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156434432. Throughput: 0: 1676.8, 1: 1684.1. Samples: 39119994. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:43,435][61643] Avg episode reward: [(0, '24.390'), (1, '9.980')] [2023-10-12 18:58:43,597][62635] Updated weights for policy 1, policy_version 76410 (0.0010) [2023-10-12 18:58:45,091][62634] Updated weights for policy 0, policy_version 76390 (0.0007) [2023-10-12 18:58:45,477][62634] Updated weights for policy 0, policy_version 76400 (0.0008) [2023-10-12 18:58:45,852][62634] Updated weights for policy 0, policy_version 76410 (0.0009) [2023-10-12 18:58:47,642][62635] Updated weights for policy 1, policy_version 76420 (0.0011) [2023-10-12 18:58:48,020][62635] Updated weights for policy 1, policy_version 76430 (0.0009) [2023-10-12 18:58:48,391][62635] Updated weights for policy 1, policy_version 76440 (0.0009) [2023-10-12 18:58:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156499968. Throughput: 0: 1688.5, 1: 1676.0. Samples: 39140348. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-12 18:58:48,435][61643] Avg episode reward: [(0, '24.430'), (1, '9.890')] [2023-10-12 18:58:50,072][62634] Updated weights for policy 0, policy_version 76420 (0.0010) [2023-10-12 18:58:50,452][62634] Updated weights for policy 0, policy_version 76430 (0.0009) [2023-10-12 18:58:50,836][62634] Updated weights for policy 0, policy_version 76440 (0.0009) [2023-10-12 18:58:52,327][62635] Updated weights for policy 1, policy_version 76450 (0.0009) [2023-10-12 18:58:52,698][62635] Updated weights for policy 1, policy_version 76460 (0.0007) [2023-10-12 18:58:53,064][62635] Updated weights for policy 1, policy_version 76470 (0.0007) [2023-10-12 18:58:53,432][62635] Updated weights for policy 1, policy_version 76480 (0.0008) [2023-10-12 18:58:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156598272. Throughput: 0: 1663.8, 1: 1690.1. Samples: 39150228. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:58:53,435][61643] Avg episode reward: [(0, '24.940'), (1, '10.190')] [2023-10-12 18:58:54,900][62634] Updated weights for policy 0, policy_version 76450 (0.0009) [2023-10-12 18:58:55,276][62634] Updated weights for policy 0, policy_version 76460 (0.0007) [2023-10-12 18:58:55,661][62634] Updated weights for policy 0, policy_version 76470 (0.0011) [2023-10-12 18:58:56,032][62634] Updated weights for policy 0, policy_version 76480 (0.0007) [2023-10-12 18:58:57,434][62635] Updated weights for policy 1, policy_version 76490 (0.0009) [2023-10-12 18:58:57,802][62635] Updated weights for policy 1, policy_version 76500 (0.0008) [2023-10-12 18:58:58,179][62635] Updated weights for policy 1, policy_version 76510 (0.0009) [2023-10-12 18:58:58,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156663808. Throughput: 0: 1682.6, 1: 1697.5. Samples: 39171026. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:58:58,435][61643] Avg episode reward: [(0, '25.350'), (1, '9.760')] [2023-10-12 18:59:00,044][62634] Updated weights for policy 0, policy_version 76490 (0.0008) [2023-10-12 18:59:00,423][62634] Updated weights for policy 0, policy_version 76500 (0.0007) [2023-10-12 18:59:00,799][62634] Updated weights for policy 0, policy_version 76510 (0.0007) [2023-10-12 18:59:02,304][62635] Updated weights for policy 1, policy_version 76520 (0.0007) [2023-10-12 18:59:02,678][62635] Updated weights for policy 1, policy_version 76530 (0.0007) [2023-10-12 18:59:03,045][62635] Updated weights for policy 1, policy_version 76540 (0.0007) [2023-10-12 18:59:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156729344. Throughput: 0: 1689.7, 1: 1678.6. Samples: 39191274. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:03,436][61643] Avg episode reward: [(0, '25.110'), (1, '9.850')] [2023-10-12 18:59:04,738][62634] Updated weights for policy 0, policy_version 76520 (0.0008) [2023-10-12 18:59:05,117][62634] Updated weights for policy 0, policy_version 76530 (0.0007) [2023-10-12 18:59:05,502][62634] Updated weights for policy 0, policy_version 76540 (0.0007) [2023-10-12 18:59:06,939][62635] Updated weights for policy 1, policy_version 76550 (0.0007) [2023-10-12 18:59:07,304][62635] Updated weights for policy 1, policy_version 76560 (0.0008) [2023-10-12 18:59:07,671][62635] Updated weights for policy 1, policy_version 76570 (0.0008) [2023-10-12 18:59:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156794880. Throughput: 0: 1670.0, 1: 1705.2. Samples: 39201538. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:08,435][61643] Avg episode reward: [(0, '25.300'), (1, '9.840')] [2023-10-12 18:59:09,604][62634] Updated weights for policy 0, policy_version 76550 (0.0008) [2023-10-12 18:59:09,982][62634] Updated weights for policy 0, policy_version 76560 (0.0009) [2023-10-12 18:59:10,358][62634] Updated weights for policy 0, policy_version 76570 (0.0010) [2023-10-12 18:59:11,556][62635] Updated weights for policy 1, policy_version 76580 (0.0009) [2023-10-12 18:59:11,915][62635] Updated weights for policy 1, policy_version 76590 (0.0008) [2023-10-12 18:59:12,282][62635] Updated weights for policy 1, policy_version 76600 (0.0008) [2023-10-12 18:59:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156860416. Throughput: 0: 1687.9, 1: 1696.9. Samples: 39221738. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:13,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.750')] [2023-10-12 18:59:14,454][62634] Updated weights for policy 0, policy_version 76580 (0.0008) [2023-10-12 18:59:14,830][62634] Updated weights for policy 0, policy_version 76590 (0.0007) [2023-10-12 18:59:15,207][62634] Updated weights for policy 0, policy_version 76600 (0.0007) [2023-10-12 18:59:16,374][62635] Updated weights for policy 1, policy_version 76610 (0.0009) [2023-10-12 18:59:16,745][62635] Updated weights for policy 1, policy_version 76620 (0.0008) [2023-10-12 18:59:17,112][62635] Updated weights for policy 1, policy_version 76630 (0.0008) [2023-10-12 18:59:17,483][62635] Updated weights for policy 1, policy_version 76640 (0.0008) [2023-10-12 18:59:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156925952. Throughput: 0: 1692.2, 1: 1679.9. Samples: 39241706. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:18,436][61643] Avg episode reward: [(0, '25.020'), (1, '9.830')] [2023-10-12 18:59:19,273][62634] Updated weights for policy 0, policy_version 76610 (0.0011) [2023-10-12 18:59:19,669][62634] Updated weights for policy 0, policy_version 76620 (0.0011) [2023-10-12 18:59:20,042][62634] Updated weights for policy 0, policy_version 76630 (0.0010) [2023-10-12 18:59:20,419][62634] Updated weights for policy 0, policy_version 76640 (0.0008) [2023-10-12 18:59:21,609][62635] Updated weights for policy 1, policy_version 76650 (0.0010) [2023-10-12 18:59:21,977][62635] Updated weights for policy 1, policy_version 76660 (0.0008) [2023-10-12 18:59:22,347][62635] Updated weights for policy 1, policy_version 76670 (0.0009) [2023-10-12 18:59:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156991488. Throughput: 0: 1672.8, 1: 1703.6. Samples: 39251758. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:23,435][61643] Avg episode reward: [(0, '25.010'), (1, '10.010')] [2023-10-12 18:59:24,490][62634] Updated weights for policy 0, policy_version 76650 (0.0007) [2023-10-12 18:59:24,878][62634] Updated weights for policy 0, policy_version 76660 (0.0007) [2023-10-12 18:59:25,258][62634] Updated weights for policy 0, policy_version 76670 (0.0009) [2023-10-12 18:59:26,418][62635] Updated weights for policy 1, policy_version 76680 (0.0009) [2023-10-12 18:59:26,795][62635] Updated weights for policy 1, policy_version 76690 (0.0010) [2023-10-12 18:59:27,161][62635] Updated weights for policy 1, policy_version 76700 (0.0009) [2023-10-12 18:59:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157057024. Throughput: 0: 1684.8, 1: 1678.8. Samples: 39271356. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:28,436][61643] Avg episode reward: [(0, '24.810'), (1, '9.880')] [2023-10-12 18:59:29,305][62634] Updated weights for policy 0, policy_version 76680 (0.0007) [2023-10-12 18:59:29,683][62634] Updated weights for policy 0, policy_version 76690 (0.0007) [2023-10-12 18:59:30,055][62634] Updated weights for policy 0, policy_version 76700 (0.0010) [2023-10-12 18:59:31,284][62635] Updated weights for policy 1, policy_version 76710 (0.0009) [2023-10-12 18:59:31,653][62635] Updated weights for policy 1, policy_version 76720 (0.0010) [2023-10-12 18:59:32,030][62635] Updated weights for policy 1, policy_version 76730 (0.0009) [2023-10-12 18:59:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 157122560. Throughput: 0: 1685.4, 1: 1678.4. Samples: 39291722. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:33,435][61643] Avg episode reward: [(0, '24.530'), (1, '9.770')] [2023-10-12 18:59:34,051][62634] Updated weights for policy 0, policy_version 76710 (0.0008) [2023-10-12 18:59:34,426][62634] Updated weights for policy 0, policy_version 76720 (0.0008) [2023-10-12 18:59:34,803][62634] Updated weights for policy 0, policy_version 76730 (0.0007) [2023-10-12 18:59:36,198][62635] Updated weights for policy 1, policy_version 76740 (0.0011) [2023-10-12 18:59:36,563][62635] Updated weights for policy 1, policy_version 76750 (0.0009) [2023-10-12 18:59:36,935][62635] Updated weights for policy 1, policy_version 76760 (0.0009) [2023-10-12 18:59:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157188096. Throughput: 0: 1680.0, 1: 1693.2. Samples: 39302022. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-12 18:59:38,436][61643] Avg episode reward: [(0, '24.470'), (1, '9.540')] [2023-10-12 18:59:38,908][62634] Updated weights for policy 0, policy_version 76740 (0.0008) [2023-10-12 18:59:39,272][62634] Updated weights for policy 0, policy_version 76750 (0.0008) [2023-10-12 18:59:39,649][62634] Updated weights for policy 0, policy_version 76760 (0.0009) [2023-10-12 18:59:41,122][62635] Updated weights for policy 1, policy_version 76770 (0.0009) [2023-10-12 18:59:41,488][62635] Updated weights for policy 1, policy_version 76780 (0.0010) [2023-10-12 18:59:41,841][62635] Updated weights for policy 1, policy_version 76790 (0.0008) [2023-10-12 18:59:42,204][62635] Updated weights for policy 1, policy_version 76800 (0.0007) [2023-10-12 18:59:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157253632. Throughput: 0: 1687.8, 1: 1663.6. Samples: 39321838. Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 18:59:43,435][61643] Avg episode reward: [(0, '24.730'), (1, '9.510')] [2023-10-12 18:59:43,495][62634] Updated weights for policy 0, policy_version 76770 (0.0008) [2023-10-12 18:59:43,873][62634] Updated weights for policy 0, policy_version 76780 (0.0008) [2023-10-12 18:59:44,242][62634] Updated weights for policy 0, policy_version 76790 (0.0008) [2023-10-12 18:59:44,617][62634] Updated weights for policy 0, policy_version 76800 (0.0008) [2023-10-12 18:59:46,180][62635] Updated weights for policy 1, policy_version 76810 (0.0010) [2023-10-12 18:59:46,538][62635] Updated weights for policy 1, policy_version 76820 (0.0008) [2023-10-12 18:59:46,913][62635] Updated weights for policy 1, policy_version 76830 (0.0008) [2023-10-12 18:59:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157319168. Throughput: 0: 1680.4, 1: 1683.1. Samples: 39342628. 
Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 18:59:48,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.430')] [2023-10-12 18:59:48,658][62634] Updated weights for policy 0, policy_version 76810 (0.0011) [2023-10-12 18:59:49,045][62634] Updated weights for policy 0, policy_version 76820 (0.0009) [2023-10-12 18:59:49,414][62634] Updated weights for policy 0, policy_version 76830 (0.0009) [2023-10-12 18:59:50,783][62635] Updated weights for policy 1, policy_version 76840 (0.0010) [2023-10-12 18:59:51,156][62635] Updated weights for policy 1, policy_version 76850 (0.0011) [2023-10-12 18:59:51,528][62635] Updated weights for policy 1, policy_version 76860 (0.0010) [2023-10-12 18:59:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157384704. Throughput: 0: 1679.3, 1: 1671.9. Samples: 39352344. Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 18:59:53,436][61643] Avg episode reward: [(0, '24.950'), (1, '9.200')] [2023-10-12 18:59:53,687][62634] Updated weights for policy 0, policy_version 76840 (0.0007) [2023-10-12 18:59:54,053][62634] Updated weights for policy 0, policy_version 76850 (0.0007) [2023-10-12 18:59:54,423][62634] Updated weights for policy 0, policy_version 76860 (0.0007) [2023-10-12 18:59:55,709][62635] Updated weights for policy 1, policy_version 76870 (0.0010) [2023-10-12 18:59:56,078][62635] Updated weights for policy 1, policy_version 76880 (0.0009) [2023-10-12 18:59:56,451][62635] Updated weights for policy 1, policy_version 76890 (0.0009) [2023-10-12 18:59:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157450240. Throughput: 0: 1680.4, 1: 1662.3. Samples: 39372158. 
Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 18:59:58,435][61643] Avg episode reward: [(0, '24.940'), (1, '9.410')] [2023-10-12 18:59:58,461][62634] Updated weights for policy 0, policy_version 76870 (0.0008) [2023-10-12 18:59:58,836][62634] Updated weights for policy 0, policy_version 76880 (0.0008) [2023-10-12 18:59:59,212][62634] Updated weights for policy 0, policy_version 76890 (0.0007) [2023-10-12 19:00:00,421][62635] Updated weights for policy 1, policy_version 76900 (0.0010) [2023-10-12 19:00:00,803][62635] Updated weights for policy 1, policy_version 76910 (0.0009) [2023-10-12 19:00:01,168][62635] Updated weights for policy 1, policy_version 76920 (0.0007) [2023-10-12 19:00:03,090][62634] Updated weights for policy 0, policy_version 76900 (0.0009) [2023-10-12 19:00:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 157515776. Throughput: 0: 1679.3, 1: 1682.9. Samples: 39393002. Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 19:00:03,435][61643] Avg episode reward: [(0, '24.670'), (1, '9.410')] [2023-10-12 19:00:03,453][62634] Updated weights for policy 0, policy_version 76910 (0.0010) [2023-10-12 19:00:03,831][62634] Updated weights for policy 0, policy_version 76920 (0.0009) [2023-10-12 19:00:05,135][62635] Updated weights for policy 1, policy_version 76930 (0.0007) [2023-10-12 19:00:05,489][62635] Updated weights for policy 1, policy_version 76940 (0.0008) [2023-10-12 19:00:05,851][62635] Updated weights for policy 1, policy_version 76950 (0.0008) [2023-10-12 19:00:06,217][62635] Updated weights for policy 1, policy_version 76960 (0.0010) [2023-10-12 19:00:07,891][62634] Updated weights for policy 0, policy_version 76930 (0.0008) [2023-10-12 19:00:08,260][62634] Updated weights for policy 0, policy_version 76940 (0.0010) [2023-10-12 19:00:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157581312. Throughput: 0: 1684.0, 1: 1662.4. 
Samples: 39402350. Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 19:00:08,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.570')] [2023-10-12 19:00:08,634][62634] Updated weights for policy 0, policy_version 76950 (0.0009) [2023-10-12 19:00:09,009][62634] Updated weights for policy 0, policy_version 76960 (0.0008) [2023-10-12 19:00:10,271][62635] Updated weights for policy 1, policy_version 76970 (0.0008) [2023-10-12 19:00:10,650][62635] Updated weights for policy 1, policy_version 76980 (0.0008) [2023-10-12 19:00:11,023][62635] Updated weights for policy 1, policy_version 76990 (0.0008) [2023-10-12 19:00:13,045][62634] Updated weights for policy 0, policy_version 76970 (0.0009) [2023-10-12 19:00:13,427][62634] Updated weights for policy 0, policy_version 76980 (0.0011) [2023-10-12 19:00:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157646848. Throughput: 0: 1685.5, 1: 1677.7. Samples: 39422698. Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 19:00:13,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.600')] [2023-10-12 19:00:13,797][62634] Updated weights for policy 0, policy_version 76990 (0.0009) [2023-10-12 19:00:15,264][62635] Updated weights for policy 1, policy_version 77000 (0.0007) [2023-10-12 19:00:15,646][62635] Updated weights for policy 1, policy_version 77010 (0.0007) [2023-10-12 19:00:16,021][62635] Updated weights for policy 1, policy_version 77020 (0.0009) [2023-10-12 19:00:17,792][62634] Updated weights for policy 0, policy_version 77000 (0.0008) [2023-10-12 19:00:18,165][62634] Updated weights for policy 0, policy_version 77010 (0.0008) [2023-10-12 19:00:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157712384. Throughput: 0: 1676.3, 1: 1686.1. Samples: 39443030. 
Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 19:00:18,436][61643] Avg episode reward: [(0, '24.660'), (1, '9.690')] [2023-10-12 19:00:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000077024_78872576.pth... [2023-10-12 19:00:18,479][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000075456_77266944.pth [2023-10-12 19:00:18,551][62634] Updated weights for policy 0, policy_version 77020 (0.0009) [2023-10-12 19:00:18,695][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000077024_78872576.pth... [2023-10-12 19:00:18,730][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000075424_77234176.pth [2023-10-12 19:00:20,120][62635] Updated weights for policy 1, policy_version 77030 (0.0008) [2023-10-12 19:00:20,486][62635] Updated weights for policy 1, policy_version 77040 (0.0008) [2023-10-12 19:00:20,863][62635] Updated weights for policy 1, policy_version 77050 (0.0009) [2023-10-12 19:00:22,789][62634] Updated weights for policy 0, policy_version 77030 (0.0009) [2023-10-12 19:00:23,170][62634] Updated weights for policy 0, policy_version 77040 (0.0007) [2023-10-12 19:00:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157777920. Throughput: 0: 1686.9, 1: 1659.6. Samples: 39452614. 
Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 19:00:23,435][61643] Avg episode reward: [(0, '24.530'), (1, '9.870')] [2023-10-12 19:00:23,547][62634] Updated weights for policy 0, policy_version 77050 (0.0010) [2023-10-12 19:00:24,915][62635] Updated weights for policy 1, policy_version 77060 (0.0008) [2023-10-12 19:00:25,278][62635] Updated weights for policy 1, policy_version 77070 (0.0007) [2023-10-12 19:00:25,646][62635] Updated weights for policy 1, policy_version 77080 (0.0008) [2023-10-12 19:00:27,729][62634] Updated weights for policy 0, policy_version 77060 (0.0008) [2023-10-12 19:00:28,107][62634] Updated weights for policy 0, policy_version 77070 (0.0011) [2023-10-12 19:00:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157843456. Throughput: 0: 1683.3, 1: 1681.7. Samples: 39473264. Policy #0 lag: (min: 9.0, avg: 11.6, max: 41.0) [2023-10-12 19:00:28,435][61643] Avg episode reward: [(0, '24.870'), (1, '9.650')] [2023-10-12 19:00:28,484][62634] Updated weights for policy 0, policy_version 77080 (0.0009) [2023-10-12 19:00:29,763][62635] Updated weights for policy 1, policy_version 77090 (0.0008) [2023-10-12 19:00:30,134][62635] Updated weights for policy 1, policy_version 77100 (0.0008) [2023-10-12 19:00:30,504][62635] Updated weights for policy 1, policy_version 77110 (0.0008) [2023-10-12 19:00:30,868][62635] Updated weights for policy 1, policy_version 77120 (0.0009) [2023-10-12 19:00:32,682][62634] Updated weights for policy 0, policy_version 77090 (0.0007) [2023-10-12 19:00:33,070][62634] Updated weights for policy 0, policy_version 77100 (0.0009) [2023-10-12 19:00:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157908992. Throughput: 0: 1671.5, 1: 1683.6. Samples: 39493606. 
Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:00:33,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.760')] [2023-10-12 19:00:33,442][62634] Updated weights for policy 0, policy_version 77110 (0.0008) [2023-10-12 19:00:33,810][62634] Updated weights for policy 0, policy_version 77120 (0.0009) [2023-10-12 19:00:34,651][62635] Updated weights for policy 1, policy_version 77130 (0.0008) [2023-10-12 19:00:35,020][62635] Updated weights for policy 1, policy_version 77140 (0.0007) [2023-10-12 19:00:35,383][62635] Updated weights for policy 1, policy_version 77150 (0.0007) [2023-10-12 19:00:37,650][62634] Updated weights for policy 0, policy_version 77130 (0.0010) [2023-10-12 19:00:38,023][62634] Updated weights for policy 0, policy_version 77140 (0.0011) [2023-10-12 19:00:38,403][62634] Updated weights for policy 0, policy_version 77150 (0.0011) [2023-10-12 19:00:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157974528. Throughput: 0: 1681.4, 1: 1668.4. Samples: 39503088. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:00:38,435][61643] Avg episode reward: [(0, '24.640'), (1, '10.110')] [2023-10-12 19:00:39,442][62635] Updated weights for policy 1, policy_version 77160 (0.0008) [2023-10-12 19:00:39,803][62635] Updated weights for policy 1, policy_version 77170 (0.0010) [2023-10-12 19:00:40,166][62635] Updated weights for policy 1, policy_version 77180 (0.0009) [2023-10-12 19:00:42,372][62634] Updated weights for policy 0, policy_version 77160 (0.0009) [2023-10-12 19:00:42,737][62634] Updated weights for policy 0, policy_version 77170 (0.0008) [2023-10-12 19:00:43,120][62634] Updated weights for policy 0, policy_version 77180 (0.0007) [2023-10-12 19:00:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158072832. Throughput: 0: 1680.9, 1: 1689.9. Samples: 39523846. 
Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:00:43,436][61643] Avg episode reward: [(0, '24.680'), (1, '9.840')] [2023-10-12 19:00:44,373][62635] Updated weights for policy 1, policy_version 77190 (0.0008) [2023-10-12 19:00:44,743][62635] Updated weights for policy 1, policy_version 77200 (0.0009) [2023-10-12 19:00:45,118][62635] Updated weights for policy 1, policy_version 77210 (0.0010) [2023-10-12 19:00:47,204][62634] Updated weights for policy 0, policy_version 77190 (0.0008) [2023-10-12 19:00:47,581][62634] Updated weights for policy 0, policy_version 77200 (0.0010) [2023-10-12 19:00:47,954][62634] Updated weights for policy 0, policy_version 77210 (0.0008) [2023-10-12 19:00:48,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158138368. Throughput: 0: 1658.3, 1: 1689.2. Samples: 39543638. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:00:48,435][61643] Avg episode reward: [(0, '24.610'), (1, '9.830')] [2023-10-12 19:00:49,137][62635] Updated weights for policy 1, policy_version 77220 (0.0009) [2023-10-12 19:00:49,508][62635] Updated weights for policy 1, policy_version 77230 (0.0007) [2023-10-12 19:00:49,868][62635] Updated weights for policy 1, policy_version 77240 (0.0007) [2023-10-12 19:00:52,056][62634] Updated weights for policy 0, policy_version 77220 (0.0010) [2023-10-12 19:00:52,432][62634] Updated weights for policy 0, policy_version 77230 (0.0011) [2023-10-12 19:00:52,807][62634] Updated weights for policy 0, policy_version 77240 (0.0008) [2023-10-12 19:00:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158203904. Throughput: 0: 1679.9, 1: 1680.7. Samples: 39553574. 
Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:00:53,436][61643] Avg episode reward: [(0, '24.800'), (1, '10.010')] [2023-10-12 19:00:53,996][62635] Updated weights for policy 1, policy_version 77250 (0.0008) [2023-10-12 19:00:54,357][62635] Updated weights for policy 1, policy_version 77260 (0.0010) [2023-10-12 19:00:54,730][62635] Updated weights for policy 1, policy_version 77270 (0.0009) [2023-10-12 19:00:55,106][62635] Updated weights for policy 1, policy_version 77280 (0.0009) [2023-10-12 19:00:57,069][62634] Updated weights for policy 0, policy_version 77250 (0.0009) [2023-10-12 19:00:57,455][62634] Updated weights for policy 0, policy_version 77260 (0.0010) [2023-10-12 19:00:57,830][62634] Updated weights for policy 0, policy_version 77270 (0.0010) [2023-10-12 19:00:58,209][62634] Updated weights for policy 0, policy_version 77280 (0.0009) [2023-10-12 19:00:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158269440. Throughput: 0: 1673.4, 1: 1689.1. Samples: 39574012. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:00:58,435][61643] Avg episode reward: [(0, '24.450'), (1, '9.920')] [2023-10-12 19:00:59,387][62635] Updated weights for policy 1, policy_version 77290 (0.0009) [2023-10-12 19:00:59,755][62635] Updated weights for policy 1, policy_version 77300 (0.0010) [2023-10-12 19:01:00,120][62635] Updated weights for policy 1, policy_version 77310 (0.0007) [2023-10-12 19:01:02,130][62634] Updated weights for policy 0, policy_version 77290 (0.0008) [2023-10-12 19:01:02,509][62634] Updated weights for policy 0, policy_version 77300 (0.0008) [2023-10-12 19:01:02,896][62634] Updated weights for policy 0, policy_version 77310 (0.0009) [2023-10-12 19:01:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158334976. Throughput: 0: 1653.5, 1: 1688.4. Samples: 39593414. 
Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:01:03,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.890')] [2023-10-12 19:01:04,193][62635] Updated weights for policy 1, policy_version 77320 (0.0009) [2023-10-12 19:01:04,566][62635] Updated weights for policy 1, policy_version 77330 (0.0009) [2023-10-12 19:01:04,924][62635] Updated weights for policy 1, policy_version 77340 (0.0009) [2023-10-12 19:01:06,976][62634] Updated weights for policy 0, policy_version 77320 (0.0009) [2023-10-12 19:01:07,355][62634] Updated weights for policy 0, policy_version 77330 (0.0010) [2023-10-12 19:01:07,728][62634] Updated weights for policy 0, policy_version 77340 (0.0009) [2023-10-12 19:01:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158400512. Throughput: 0: 1674.1, 1: 1682.7. Samples: 39603672. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:01:08,435][61643] Avg episode reward: [(0, '24.310'), (1, '10.060')] [2023-10-12 19:01:08,901][62635] Updated weights for policy 1, policy_version 77350 (0.0009) [2023-10-12 19:01:09,273][62635] Updated weights for policy 1, policy_version 77360 (0.0008) [2023-10-12 19:01:09,642][62635] Updated weights for policy 1, policy_version 77370 (0.0010) [2023-10-12 19:01:11,716][62634] Updated weights for policy 0, policy_version 77350 (0.0008) [2023-10-12 19:01:12,089][62634] Updated weights for policy 0, policy_version 77360 (0.0009) [2023-10-12 19:01:12,470][62634] Updated weights for policy 0, policy_version 77370 (0.0007) [2023-10-12 19:01:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158466048. Throughput: 0: 1664.2, 1: 1685.6. Samples: 39624006. 
Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-12 19:01:13,435][61643] Avg episode reward: [(0, '24.200'), (1, '9.950')] [2023-10-12 19:01:13,589][62635] Updated weights for policy 1, policy_version 77380 (0.0009) [2023-10-12 19:01:13,948][62635] Updated weights for policy 1, policy_version 77390 (0.0008) [2023-10-12 19:01:14,313][62635] Updated weights for policy 1, policy_version 77400 (0.0007) [2023-10-12 19:01:16,590][62634] Updated weights for policy 0, policy_version 77380 (0.0009) [2023-10-12 19:01:16,974][62634] Updated weights for policy 0, policy_version 77390 (0.0008) [2023-10-12 19:01:17,346][62634] Updated weights for policy 0, policy_version 77400 (0.0008) [2023-10-12 19:01:18,385][62635] Updated weights for policy 1, policy_version 77410 (0.0009) [2023-10-12 19:01:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158531584. Throughput: 0: 1656.7, 1: 1687.6. Samples: 39644096. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:18,435][61643] Avg episode reward: [(0, '24.360'), (1, '9.730')] [2023-10-12 19:01:18,752][62635] Updated weights for policy 1, policy_version 77420 (0.0008) [2023-10-12 19:01:19,122][62635] Updated weights for policy 1, policy_version 77430 (0.0008) [2023-10-12 19:01:19,500][62635] Updated weights for policy 1, policy_version 77440 (0.0007) [2023-10-12 19:01:21,433][62634] Updated weights for policy 0, policy_version 77410 (0.0008) [2023-10-12 19:01:21,803][62634] Updated weights for policy 0, policy_version 77420 (0.0012) [2023-10-12 19:01:22,175][62634] Updated weights for policy 0, policy_version 77430 (0.0010) [2023-10-12 19:01:22,551][62634] Updated weights for policy 0, policy_version 77440 (0.0009) [2023-10-12 19:01:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158597120. Throughput: 0: 1678.0, 1: 1684.3. Samples: 39654394. 
Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:23,436][61643] Avg episode reward: [(0, '24.370'), (1, '9.910')] [2023-10-12 19:01:23,747][62635] Updated weights for policy 1, policy_version 77450 (0.0009) [2023-10-12 19:01:24,119][62635] Updated weights for policy 1, policy_version 77460 (0.0009) [2023-10-12 19:01:24,486][62635] Updated weights for policy 1, policy_version 77470 (0.0007) [2023-10-12 19:01:26,725][62634] Updated weights for policy 0, policy_version 77450 (0.0007) [2023-10-12 19:01:27,097][62634] Updated weights for policy 0, policy_version 77460 (0.0009) [2023-10-12 19:01:27,469][62634] Updated weights for policy 0, policy_version 77470 (0.0009) [2023-10-12 19:01:28,420][62635] Updated weights for policy 1, policy_version 77480 (0.0010) [2023-10-12 19:01:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158662656. Throughput: 0: 1663.0, 1: 1685.0. Samples: 39674506. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:28,436][61643] Avg episode reward: [(0, '24.170'), (1, '9.810')] [2023-10-12 19:01:28,792][62635] Updated weights for policy 1, policy_version 77490 (0.0009) [2023-10-12 19:01:29,166][62635] Updated weights for policy 1, policy_version 77500 (0.0008) [2023-10-12 19:01:31,327][62634] Updated weights for policy 0, policy_version 77480 (0.0007) [2023-10-12 19:01:31,700][62634] Updated weights for policy 0, policy_version 77490 (0.0010) [2023-10-12 19:01:32,072][62634] Updated weights for policy 0, policy_version 77500 (0.0011) [2023-10-12 19:01:33,226][62635] Updated weights for policy 1, policy_version 77510 (0.0009) [2023-10-12 19:01:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158728192. Throughput: 0: 1670.7, 1: 1687.7. Samples: 39694768. 
Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:33,436][61643] Avg episode reward: [(0, '24.370'), (1, '9.790')] [2023-10-12 19:01:33,579][62635] Updated weights for policy 1, policy_version 77520 (0.0009) [2023-10-12 19:01:33,950][62635] Updated weights for policy 1, policy_version 77530 (0.0008) [2023-10-12 19:01:36,119][62634] Updated weights for policy 0, policy_version 77510 (0.0009) [2023-10-12 19:01:36,492][62634] Updated weights for policy 0, policy_version 77520 (0.0008) [2023-10-12 19:01:36,871][62634] Updated weights for policy 0, policy_version 77530 (0.0007) [2023-10-12 19:01:37,934][62635] Updated weights for policy 1, policy_version 77540 (0.0009) [2023-10-12 19:01:38,300][62635] Updated weights for policy 1, policy_version 77550 (0.0008) [2023-10-12 19:01:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158793728. Throughput: 0: 1675.2, 1: 1690.8. Samples: 39705040. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:38,435][61643] Avg episode reward: [(0, '24.210'), (1, '9.730')] [2023-10-12 19:01:38,671][62635] Updated weights for policy 1, policy_version 77560 (0.0007) [2023-10-12 19:01:40,851][62634] Updated weights for policy 0, policy_version 77540 (0.0009) [2023-10-12 19:01:41,226][62634] Updated weights for policy 0, policy_version 77550 (0.0007) [2023-10-12 19:01:41,607][62634] Updated weights for policy 0, policy_version 77560 (0.0007) [2023-10-12 19:01:42,784][62635] Updated weights for policy 1, policy_version 77570 (0.0010) [2023-10-12 19:01:43,151][62635] Updated weights for policy 1, policy_version 77580 (0.0009) [2023-10-12 19:01:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158859264. Throughput: 0: 1658.7, 1: 1692.4. Samples: 39724810. 
Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:43,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.590')] [2023-10-12 19:01:43,516][62635] Updated weights for policy 1, policy_version 77590 (0.0009) [2023-10-12 19:01:43,887][62635] Updated weights for policy 1, policy_version 77600 (0.0009) [2023-10-12 19:01:45,755][62634] Updated weights for policy 0, policy_version 77570 (0.0008) [2023-10-12 19:01:46,135][62634] Updated weights for policy 0, policy_version 77580 (0.0007) [2023-10-12 19:01:46,510][62634] Updated weights for policy 0, policy_version 77590 (0.0009) [2023-10-12 19:01:46,895][62634] Updated weights for policy 0, policy_version 77600 (0.0008) [2023-10-12 19:01:47,696][62635] Updated weights for policy 1, policy_version 77610 (0.0008) [2023-10-12 19:01:48,078][62635] Updated weights for policy 1, policy_version 77620 (0.0009) [2023-10-12 19:01:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158924800. Throughput: 0: 1684.3, 1: 1682.7. Samples: 39744928. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:48,435][61643] Avg episode reward: [(0, '24.330'), (1, '9.760')] [2023-10-12 19:01:48,442][62635] Updated weights for policy 1, policy_version 77630 (0.0010) [2023-10-12 19:01:51,075][62634] Updated weights for policy 0, policy_version 77610 (0.0009) [2023-10-12 19:01:51,449][62634] Updated weights for policy 0, policy_version 77620 (0.0008) [2023-10-12 19:01:51,827][62634] Updated weights for policy 0, policy_version 77630 (0.0008) [2023-10-12 19:01:52,629][62635] Updated weights for policy 1, policy_version 77640 (0.0009) [2023-10-12 19:01:53,013][62635] Updated weights for policy 1, policy_version 77650 (0.0008) [2023-10-12 19:01:53,378][62635] Updated weights for policy 1, policy_version 77660 (0.0009) [2023-10-12 19:01:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158990336. Throughput: 0: 1674.0, 1: 1697.0. 
Samples: 39755368. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:53,435][61643] Avg episode reward: [(0, '24.250'), (1, '9.970')] [2023-10-12 19:01:55,911][62634] Updated weights for policy 0, policy_version 77640 (0.0011) [2023-10-12 19:01:56,288][62634] Updated weights for policy 0, policy_version 77650 (0.0009) [2023-10-12 19:01:56,664][62634] Updated weights for policy 0, policy_version 77660 (0.0010) [2023-10-12 19:01:57,500][62635] Updated weights for policy 1, policy_version 77670 (0.0008) [2023-10-12 19:01:57,873][62635] Updated weights for policy 1, policy_version 77680 (0.0008) [2023-10-12 19:01:58,228][62635] Updated weights for policy 1, policy_version 77690 (0.0010) [2023-10-12 19:01:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 159055872. Throughput: 0: 1658.7, 1: 1695.0. Samples: 39774922. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:01:58,436][61643] Avg episode reward: [(0, '24.140'), (1, '9.710')] [2023-10-12 19:02:00,889][62634] Updated weights for policy 0, policy_version 77670 (0.0010) [2023-10-12 19:02:01,276][62634] Updated weights for policy 0, policy_version 77680 (0.0010) [2023-10-12 19:02:01,648][62634] Updated weights for policy 0, policy_version 77690 (0.0008) [2023-10-12 19:02:02,231][62635] Updated weights for policy 1, policy_version 77700 (0.0011) [2023-10-12 19:02:02,588][62635] Updated weights for policy 1, policy_version 77710 (0.0009) [2023-10-12 19:02:02,950][62635] Updated weights for policy 1, policy_version 77720 (0.0008) [2023-10-12 19:02:03,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159154176. Throughput: 0: 1678.6, 1: 1669.4. Samples: 39794756. 
Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-12 19:02:03,436][61643] Avg episode reward: [(0, '24.450'), (1, '9.800')] [2023-10-12 19:02:05,635][62634] Updated weights for policy 0, policy_version 77700 (0.0011) [2023-10-12 19:02:05,999][62634] Updated weights for policy 0, policy_version 77710 (0.0008) [2023-10-12 19:02:06,387][62634] Updated weights for policy 0, policy_version 77720 (0.0008) [2023-10-12 19:02:07,032][62635] Updated weights for policy 1, policy_version 77730 (0.0008) [2023-10-12 19:02:07,400][62635] Updated weights for policy 1, policy_version 77740 (0.0007) [2023-10-12 19:02:07,774][62635] Updated weights for policy 1, policy_version 77750 (0.0009) [2023-10-12 19:02:08,138][62635] Updated weights for policy 1, policy_version 77760 (0.0008) [2023-10-12 19:02:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159219712. Throughput: 0: 1667.9, 1: 1693.3. Samples: 39805646. Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:08,436][61643] Avg episode reward: [(0, '24.370'), (1, '10.020')] [2023-10-12 19:02:10,437][62634] Updated weights for policy 0, policy_version 77730 (0.0009) [2023-10-12 19:02:10,817][62634] Updated weights for policy 0, policy_version 77740 (0.0008) [2023-10-12 19:02:11,192][62634] Updated weights for policy 0, policy_version 77750 (0.0009) [2023-10-12 19:02:11,561][62634] Updated weights for policy 0, policy_version 77760 (0.0008) [2023-10-12 19:02:12,192][62635] Updated weights for policy 1, policy_version 77770 (0.0008) [2023-10-12 19:02:12,560][62635] Updated weights for policy 1, policy_version 77780 (0.0007) [2023-10-12 19:02:12,940][62635] Updated weights for policy 1, policy_version 77790 (0.0007) [2023-10-12 19:02:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159285248. Throughput: 0: 1664.1, 1: 1684.8. Samples: 39825204. 
Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:13,435][61643] Avg episode reward: [(0, '24.580'), (1, '10.270')] [2023-10-12 19:02:15,593][62634] Updated weights for policy 0, policy_version 77770 (0.0008) [2023-10-12 19:02:15,975][62634] Updated weights for policy 0, policy_version 77780 (0.0008) [2023-10-12 19:02:16,348][62634] Updated weights for policy 0, policy_version 77790 (0.0008) [2023-10-12 19:02:17,025][62635] Updated weights for policy 1, policy_version 77800 (0.0008) [2023-10-12 19:02:17,381][62635] Updated weights for policy 1, policy_version 77810 (0.0007) [2023-10-12 19:02:17,749][62635] Updated weights for policy 1, policy_version 77820 (0.0009) [2023-10-12 19:02:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159350784. Throughput: 0: 1681.0, 1: 1657.4. Samples: 39844994. Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:18,436][61643] Avg episode reward: [(0, '24.190'), (1, '10.040')] [2023-10-12 19:02:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000077792_79659008.pth... [2023-10-12 19:02:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000077824_79691776.pth... 
[2023-10-12 19:02:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000076256_78086144.pth [2023-10-12 19:02:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000076224_78053376.pth [2023-10-12 19:02:20,473][62634] Updated weights for policy 0, policy_version 77800 (0.0011) [2023-10-12 19:02:20,847][62634] Updated weights for policy 0, policy_version 77810 (0.0008) [2023-10-12 19:02:21,226][62634] Updated weights for policy 0, policy_version 77820 (0.0007) [2023-10-12 19:02:21,794][62635] Updated weights for policy 1, policy_version 77830 (0.0008) [2023-10-12 19:02:22,159][62635] Updated weights for policy 1, policy_version 77840 (0.0007) [2023-10-12 19:02:22,522][62635] Updated weights for policy 1, policy_version 77850 (0.0007) [2023-10-12 19:02:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159416320. Throughput: 0: 1660.6, 1: 1685.2. Samples: 39855602. Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:23,436][61643] Avg episode reward: [(0, '24.350'), (1, '10.240')] [2023-10-12 19:02:25,312][62634] Updated weights for policy 0, policy_version 77830 (0.0008) [2023-10-12 19:02:25,692][62634] Updated weights for policy 0, policy_version 77840 (0.0007) [2023-10-12 19:02:26,073][62634] Updated weights for policy 0, policy_version 77850 (0.0007) [2023-10-12 19:02:26,654][62635] Updated weights for policy 1, policy_version 77860 (0.0007) [2023-10-12 19:02:27,021][62635] Updated weights for policy 1, policy_version 77870 (0.0009) [2023-10-12 19:02:27,389][62635] Updated weights for policy 1, policy_version 77880 (0.0008) [2023-10-12 19:02:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159481856. Throughput: 0: 1671.2, 1: 1670.4. Samples: 39875182. 
Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:28,435][61643] Avg episode reward: [(0, '24.620'), (1, '10.060')] [2023-10-12 19:02:30,133][62634] Updated weights for policy 0, policy_version 77860 (0.0007) [2023-10-12 19:02:30,505][62634] Updated weights for policy 0, policy_version 77870 (0.0007) [2023-10-12 19:02:30,881][62634] Updated weights for policy 0, policy_version 77880 (0.0011) [2023-10-12 19:02:31,373][62635] Updated weights for policy 1, policy_version 77890 (0.0007) [2023-10-12 19:02:31,730][62635] Updated weights for policy 1, policy_version 77900 (0.0008) [2023-10-12 19:02:32,094][62635] Updated weights for policy 1, policy_version 77910 (0.0008) [2023-10-12 19:02:32,464][62635] Updated weights for policy 1, policy_version 77920 (0.0008) [2023-10-12 19:02:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 159547392. Throughput: 0: 1676.8, 1: 1669.6. Samples: 39895516. Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:33,436][61643] Avg episode reward: [(0, '24.210'), (1, '9.880')] [2023-10-12 19:02:34,901][62634] Updated weights for policy 0, policy_version 77890 (0.0010) [2023-10-12 19:02:35,294][62634] Updated weights for policy 0, policy_version 77900 (0.0010) [2023-10-12 19:02:35,673][62634] Updated weights for policy 0, policy_version 77910 (0.0009) [2023-10-12 19:02:36,054][62634] Updated weights for policy 0, policy_version 77920 (0.0007) [2023-10-12 19:02:36,680][62635] Updated weights for policy 1, policy_version 77930 (0.0009) [2023-10-12 19:02:37,054][62635] Updated weights for policy 1, policy_version 77940 (0.0009) [2023-10-12 19:02:37,413][62635] Updated weights for policy 1, policy_version 77950 (0.0010) [2023-10-12 19:02:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159612928. Throughput: 0: 1661.2, 1: 1686.5. Samples: 39906014. 
Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:38,436][61643] Avg episode reward: [(0, '24.440'), (1, '10.150')] [2023-10-12 19:02:40,078][62634] Updated weights for policy 0, policy_version 77930 (0.0008) [2023-10-12 19:02:40,452][62634] Updated weights for policy 0, policy_version 77940 (0.0009) [2023-10-12 19:02:40,821][62634] Updated weights for policy 0, policy_version 77950 (0.0009) [2023-10-12 19:02:41,396][62635] Updated weights for policy 1, policy_version 77960 (0.0010) [2023-10-12 19:02:41,772][62635] Updated weights for policy 1, policy_version 77970 (0.0009) [2023-10-12 19:02:42,142][62635] Updated weights for policy 1, policy_version 77980 (0.0010) [2023-10-12 19:02:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159678464. Throughput: 0: 1683.2, 1: 1666.8. Samples: 39925672. Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:43,436][61643] Avg episode reward: [(0, '24.350'), (1, '10.060')] [2023-10-12 19:02:44,935][62634] Updated weights for policy 0, policy_version 77960 (0.0008) [2023-10-12 19:02:45,319][62634] Updated weights for policy 0, policy_version 77970 (0.0008) [2023-10-12 19:02:45,704][62634] Updated weights for policy 0, policy_version 77980 (0.0010) [2023-10-12 19:02:46,057][62635] Updated weights for policy 1, policy_version 77990 (0.0008) [2023-10-12 19:02:46,431][62635] Updated weights for policy 1, policy_version 78000 (0.0008) [2023-10-12 19:02:46,793][62635] Updated weights for policy 1, policy_version 78010 (0.0007) [2023-10-12 19:02:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159744000. Throughput: 0: 1687.4, 1: 1682.9. Samples: 39946418. 
Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:48,435][61643] Avg episode reward: [(0, '24.180'), (1, '9.820')] [2023-10-12 19:02:49,664][62634] Updated weights for policy 0, policy_version 77990 (0.0009) [2023-10-12 19:02:50,039][62634] Updated weights for policy 0, policy_version 78000 (0.0009) [2023-10-12 19:02:50,411][62634] Updated weights for policy 0, policy_version 78010 (0.0010) [2023-10-12 19:02:51,051][62635] Updated weights for policy 1, policy_version 78020 (0.0007) [2023-10-12 19:02:51,419][62635] Updated weights for policy 1, policy_version 78030 (0.0007) [2023-10-12 19:02:51,783][62635] Updated weights for policy 1, policy_version 78040 (0.0007) [2023-10-12 19:02:53,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159809536. Throughput: 0: 1666.6, 1: 1685.9. Samples: 39956508. Policy #0 lag: (min: 21.0, avg: 23.6, max: 53.0) [2023-10-12 19:02:53,435][61643] Avg episode reward: [(0, '24.360'), (1, '10.090')] [2023-10-12 19:02:54,483][62634] Updated weights for policy 0, policy_version 78020 (0.0008) [2023-10-12 19:02:54,861][62634] Updated weights for policy 0, policy_version 78030 (0.0007) [2023-10-12 19:02:55,240][62634] Updated weights for policy 0, policy_version 78040 (0.0009) [2023-10-12 19:02:55,732][62635] Updated weights for policy 1, policy_version 78050 (0.0010) [2023-10-12 19:02:56,101][62635] Updated weights for policy 1, policy_version 78060 (0.0008) [2023-10-12 19:02:56,477][62635] Updated weights for policy 1, policy_version 78070 (0.0010) [2023-10-12 19:02:56,840][62635] Updated weights for policy 1, policy_version 78080 (0.0008) [2023-10-12 19:02:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159875072. Throughput: 0: 1682.1, 1: 1673.4. Samples: 39976202. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:02:58,436][61643] Avg episode reward: [(0, '23.930'), (1, '10.060')] [2023-10-12 19:02:59,185][62634] Updated weights for policy 0, policy_version 78050 (0.0010) [2023-10-12 19:02:59,553][62634] Updated weights for policy 0, policy_version 78060 (0.0008) [2023-10-12 19:02:59,933][62634] Updated weights for policy 0, policy_version 78070 (0.0007) [2023-10-12 19:03:00,307][62634] Updated weights for policy 0, policy_version 78080 (0.0007) [2023-10-12 19:03:00,801][62635] Updated weights for policy 1, policy_version 78090 (0.0008) [2023-10-12 19:03:01,165][62635] Updated weights for policy 1, policy_version 78100 (0.0011) [2023-10-12 19:03:01,539][62635] Updated weights for policy 1, policy_version 78110 (0.0008) [2023-10-12 19:03:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159940608. Throughput: 0: 1680.1, 1: 1698.5. Samples: 39997030. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:03,435][61643] Avg episode reward: [(0, '24.080'), (1, '9.740')] [2023-10-12 19:03:04,526][62634] Updated weights for policy 0, policy_version 78090 (0.0009) [2023-10-12 19:03:04,900][62634] Updated weights for policy 0, policy_version 78100 (0.0009) [2023-10-12 19:03:05,284][62634] Updated weights for policy 0, policy_version 78110 (0.0009) [2023-10-12 19:03:05,576][62635] Updated weights for policy 1, policy_version 78120 (0.0007) [2023-10-12 19:03:05,950][62635] Updated weights for policy 1, policy_version 78130 (0.0008) [2023-10-12 19:03:06,314][62635] Updated weights for policy 1, policy_version 78140 (0.0008) [2023-10-12 19:03:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160006144. Throughput: 0: 1672.0, 1: 1680.8. Samples: 40006478. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:08,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.910')] [2023-10-12 19:03:09,290][62634] Updated weights for policy 0, policy_version 78120 (0.0008) [2023-10-12 19:03:09,658][62634] Updated weights for policy 0, policy_version 78130 (0.0010) [2023-10-12 19:03:10,042][62634] Updated weights for policy 0, policy_version 78140 (0.0010) [2023-10-12 19:03:10,277][62635] Updated weights for policy 1, policy_version 78150 (0.0008) [2023-10-12 19:03:10,650][62635] Updated weights for policy 1, policy_version 78160 (0.0008) [2023-10-12 19:03:11,020][62635] Updated weights for policy 1, policy_version 78170 (0.0007) [2023-10-12 19:03:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160071680. Throughput: 0: 1680.5, 1: 1686.3. Samples: 40026688. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:13,435][61643] Avg episode reward: [(0, '23.740'), (1, '9.890')] [2023-10-12 19:03:14,161][62634] Updated weights for policy 0, policy_version 78150 (0.0008) [2023-10-12 19:03:14,540][62634] Updated weights for policy 0, policy_version 78160 (0.0007) [2023-10-12 19:03:14,912][62634] Updated weights for policy 0, policy_version 78170 (0.0008) [2023-10-12 19:03:15,024][62635] Updated weights for policy 1, policy_version 78180 (0.0008) [2023-10-12 19:03:15,394][62635] Updated weights for policy 1, policy_version 78190 (0.0009) [2023-10-12 19:03:15,755][62635] Updated weights for policy 1, policy_version 78200 (0.0009) [2023-10-12 19:03:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160137216. Throughput: 0: 1675.1, 1: 1700.4. Samples: 40047412. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:18,436][61643] Avg episode reward: [(0, '23.860'), (1, '9.700')] [2023-10-12 19:03:19,069][62634] Updated weights for policy 0, policy_version 78180 (0.0007) [2023-10-12 19:03:19,445][62634] Updated weights for policy 0, policy_version 78190 (0.0008) [2023-10-12 19:03:19,815][62634] Updated weights for policy 0, policy_version 78200 (0.0009) [2023-10-12 19:03:19,824][62635] Updated weights for policy 1, policy_version 78210 (0.0009) [2023-10-12 19:03:20,188][62635] Updated weights for policy 1, policy_version 78220 (0.0008) [2023-10-12 19:03:20,563][62635] Updated weights for policy 1, policy_version 78230 (0.0008) [2023-10-12 19:03:20,927][62635] Updated weights for policy 1, policy_version 78240 (0.0008) [2023-10-12 19:03:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 160202752. Throughput: 0: 1668.9, 1: 1669.7. Samples: 40056252. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:23,435][61643] Avg episode reward: [(0, '23.740'), (1, '9.790')] [2023-10-12 19:03:23,869][62634] Updated weights for policy 0, policy_version 78210 (0.0009) [2023-10-12 19:03:24,262][62634] Updated weights for policy 0, policy_version 78220 (0.0010) [2023-10-12 19:03:24,632][62634] Updated weights for policy 0, policy_version 78230 (0.0008) [2023-10-12 19:03:25,005][62634] Updated weights for policy 0, policy_version 78240 (0.0009) [2023-10-12 19:03:25,064][62635] Updated weights for policy 1, policy_version 78250 (0.0009) [2023-10-12 19:03:25,440][62635] Updated weights for policy 1, policy_version 78260 (0.0010) [2023-10-12 19:03:25,794][62635] Updated weights for policy 1, policy_version 78270 (0.0010) [2023-10-12 19:03:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 160268288. Throughput: 0: 1670.0, 1: 1683.1. Samples: 40076560. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:28,435][61643] Avg episode reward: [(0, '23.710'), (1, '9.740')] [2023-10-12 19:03:28,966][62634] Updated weights for policy 0, policy_version 78250 (0.0009) [2023-10-12 19:03:29,347][62634] Updated weights for policy 0, policy_version 78260 (0.0009) [2023-10-12 19:03:29,723][62634] Updated weights for policy 0, policy_version 78270 (0.0008) [2023-10-12 19:03:30,104][62635] Updated weights for policy 1, policy_version 78280 (0.0009) [2023-10-12 19:03:30,474][62635] Updated weights for policy 1, policy_version 78290 (0.0011) [2023-10-12 19:03:30,838][62635] Updated weights for policy 1, policy_version 78300 (0.0009) [2023-10-12 19:03:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160333824. Throughput: 0: 1665.5, 1: 1683.0. Samples: 40097098. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:33,436][61643] Avg episode reward: [(0, '23.950'), (1, '9.810')] [2023-10-12 19:03:33,796][62634] Updated weights for policy 0, policy_version 78280 (0.0008) [2023-10-12 19:03:34,174][62634] Updated weights for policy 0, policy_version 78290 (0.0009) [2023-10-12 19:03:34,549][62634] Updated weights for policy 0, policy_version 78300 (0.0011) [2023-10-12 19:03:34,981][62635] Updated weights for policy 1, policy_version 78310 (0.0008) [2023-10-12 19:03:35,347][62635] Updated weights for policy 1, policy_version 78320 (0.0009) [2023-10-12 19:03:35,711][62635] Updated weights for policy 1, policy_version 78330 (0.0008) [2023-10-12 19:03:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160399360. Throughput: 0: 1669.5, 1: 1659.9. Samples: 40106334. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:38,436][61643] Avg episode reward: [(0, '24.030'), (1, '9.680')] [2023-10-12 19:03:38,608][62634] Updated weights for policy 0, policy_version 78310 (0.0009) [2023-10-12 19:03:38,985][62634] Updated weights for policy 0, policy_version 78320 (0.0007) [2023-10-12 19:03:39,357][62634] Updated weights for policy 0, policy_version 78330 (0.0008) [2023-10-12 19:03:39,765][62635] Updated weights for policy 1, policy_version 78340 (0.0009) [2023-10-12 19:03:40,139][62635] Updated weights for policy 1, policy_version 78350 (0.0008) [2023-10-12 19:03:40,506][62635] Updated weights for policy 1, policy_version 78360 (0.0008) [2023-10-12 19:03:43,312][62634] Updated weights for policy 0, policy_version 78340 (0.0010) [2023-10-12 19:03:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160464896. Throughput: 0: 1676.0, 1: 1678.8. Samples: 40127166. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-12 19:03:43,436][61643] Avg episode reward: [(0, '24.010'), (1, '9.800')] [2023-10-12 19:03:43,684][62634] Updated weights for policy 0, policy_version 78350 (0.0009) [2023-10-12 19:03:44,065][62634] Updated weights for policy 0, policy_version 78360 (0.0007) [2023-10-12 19:03:44,587][62635] Updated weights for policy 1, policy_version 78370 (0.0008) [2023-10-12 19:03:44,958][62635] Updated weights for policy 1, policy_version 78380 (0.0009) [2023-10-12 19:03:45,324][62635] Updated weights for policy 1, policy_version 78390 (0.0009) [2023-10-12 19:03:45,697][62635] Updated weights for policy 1, policy_version 78400 (0.0007) [2023-10-12 19:03:48,065][62634] Updated weights for policy 0, policy_version 78370 (0.0008) [2023-10-12 19:03:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160530432. Throughput: 0: 1680.8, 1: 1680.7. Samples: 40148296. 
Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:03:48,435][61643] Avg episode reward: [(0, '24.360'), (1, '9.810')] [2023-10-12 19:03:48,439][62634] Updated weights for policy 0, policy_version 78380 (0.0010) [2023-10-12 19:03:48,818][62634] Updated weights for policy 0, policy_version 78390 (0.0010) [2023-10-12 19:03:49,191][62634] Updated weights for policy 0, policy_version 78400 (0.0008) [2023-10-12 19:03:49,733][62635] Updated weights for policy 1, policy_version 78410 (0.0007) [2023-10-12 19:03:50,100][62635] Updated weights for policy 1, policy_version 78420 (0.0007) [2023-10-12 19:03:50,473][62635] Updated weights for policy 1, policy_version 78430 (0.0007) [2023-10-12 19:03:53,140][62634] Updated weights for policy 0, policy_version 78410 (0.0007) [2023-10-12 19:03:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160595968. Throughput: 0: 1681.7, 1: 1670.4. Samples: 40157322. Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:03:53,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.780')] [2023-10-12 19:03:53,528][62634] Updated weights for policy 0, policy_version 78420 (0.0007) [2023-10-12 19:03:53,893][62634] Updated weights for policy 0, policy_version 78430 (0.0007) [2023-10-12 19:03:54,487][62635] Updated weights for policy 1, policy_version 78440 (0.0009) [2023-10-12 19:03:54,857][62635] Updated weights for policy 1, policy_version 78450 (0.0008) [2023-10-12 19:03:55,228][62635] Updated weights for policy 1, policy_version 78460 (0.0008) [2023-10-12 19:03:58,116][62634] Updated weights for policy 0, policy_version 78440 (0.0007) [2023-10-12 19:03:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160661504. Throughput: 0: 1688.1, 1: 1674.5. Samples: 40178006. 
Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:03:58,436][61643] Avg episode reward: [(0, '24.280'), (1, '9.850')] [2023-10-12 19:03:58,488][62634] Updated weights for policy 0, policy_version 78450 (0.0009) [2023-10-12 19:03:58,865][62634] Updated weights for policy 0, policy_version 78460 (0.0008) [2023-10-12 19:03:59,317][62635] Updated weights for policy 1, policy_version 78470 (0.0007) [2023-10-12 19:03:59,684][62635] Updated weights for policy 1, policy_version 78480 (0.0007) [2023-10-12 19:04:00,046][62635] Updated weights for policy 1, policy_version 78490 (0.0007) [2023-10-12 19:04:02,861][62634] Updated weights for policy 0, policy_version 78470 (0.0007) [2023-10-12 19:04:03,241][62634] Updated weights for policy 0, policy_version 78480 (0.0011) [2023-10-12 19:04:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160727040. Throughput: 0: 1683.9, 1: 1670.5. Samples: 40198360. Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:04:03,435][61643] Avg episode reward: [(0, '24.300'), (1, '10.070')] [2023-10-12 19:04:03,621][62634] Updated weights for policy 0, policy_version 78490 (0.0009) [2023-10-12 19:04:04,118][62635] Updated weights for policy 1, policy_version 78500 (0.0008) [2023-10-12 19:04:04,482][62635] Updated weights for policy 1, policy_version 78510 (0.0008) [2023-10-12 19:04:04,858][62635] Updated weights for policy 1, policy_version 78520 (0.0008) [2023-10-12 19:04:07,625][62634] Updated weights for policy 0, policy_version 78500 (0.0007) [2023-10-12 19:04:08,008][62634] Updated weights for policy 0, policy_version 78510 (0.0009) [2023-10-12 19:04:08,380][62634] Updated weights for policy 0, policy_version 78520 (0.0009) [2023-10-12 19:04:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160792576. Throughput: 0: 1695.7, 1: 1671.9. Samples: 40207792. 
Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:04:08,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.740')] [2023-10-12 19:04:08,987][62635] Updated weights for policy 1, policy_version 78530 (0.0009) [2023-10-12 19:04:09,345][62635] Updated weights for policy 1, policy_version 78540 (0.0007) [2023-10-12 19:04:09,711][62635] Updated weights for policy 1, policy_version 78550 (0.0010) [2023-10-12 19:04:10,086][62635] Updated weights for policy 1, policy_version 78560 (0.0008) [2023-10-12 19:04:12,329][62634] Updated weights for policy 0, policy_version 78530 (0.0008) [2023-10-12 19:04:12,735][62634] Updated weights for policy 0, policy_version 78540 (0.0008) [2023-10-12 19:04:13,113][62634] Updated weights for policy 0, policy_version 78550 (0.0011) [2023-10-12 19:04:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160858112. Throughput: 0: 1701.0, 1: 1682.2. Samples: 40228806. Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:04:13,435][61643] Avg episode reward: [(0, '24.350'), (1, '9.870')] [2023-10-12 19:04:13,495][62634] Updated weights for policy 0, policy_version 78560 (0.0008) [2023-10-12 19:04:14,197][62635] Updated weights for policy 1, policy_version 78570 (0.0008) [2023-10-12 19:04:14,563][62635] Updated weights for policy 1, policy_version 78580 (0.0007) [2023-10-12 19:04:14,942][62635] Updated weights for policy 1, policy_version 78590 (0.0010) [2023-10-12 19:04:17,403][62634] Updated weights for policy 0, policy_version 78570 (0.0008) [2023-10-12 19:04:17,774][62634] Updated weights for policy 0, policy_version 78580 (0.0009) [2023-10-12 19:04:18,142][62634] Updated weights for policy 0, policy_version 78590 (0.0008) [2023-10-12 19:04:18,435][61643] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160956416. Throughput: 0: 1680.1, 1: 1685.2. Samples: 40248540. 
Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:04:18,436][61643] Avg episode reward: [(0, '24.370'), (1, '9.950')] [2023-10-12 19:04:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000078592_80478208.pth... [2023-10-12 19:04:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000078592_80478208.pth... [2023-10-12 19:04:18,481][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000077024_78872576.pth [2023-10-12 19:04:18,485][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000078592_80478208.pth [2023-10-12 19:04:18,487][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000077024_78872576.pth [2023-10-12 19:04:18,492][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000078592_80478208.pth [2023-10-12 19:04:18,978][62635] Updated weights for policy 1, policy_version 78600 (0.0012) [2023-10-12 19:04:19,340][62635] Updated weights for policy 1, policy_version 78610 (0.0009) [2023-10-12 19:04:19,702][62635] Updated weights for policy 1, policy_version 78620 (0.0008) [2023-10-12 19:04:22,378][62634] Updated weights for policy 0, policy_version 78600 (0.0010) [2023-10-12 19:04:22,745][62634] Updated weights for policy 0, policy_version 78610 (0.0009) [2023-10-12 19:04:23,135][62634] Updated weights for policy 0, policy_version 78620 (0.0009) [2023-10-12 19:04:23,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161021952. Throughput: 0: 1698.7, 1: 1682.5. Samples: 40258486. 
Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:04:23,436][61643] Avg episode reward: [(0, '24.530'), (1, '9.840')] [2023-10-12 19:04:23,847][62635] Updated weights for policy 1, policy_version 78630 (0.0008) [2023-10-12 19:04:24,216][62635] Updated weights for policy 1, policy_version 78640 (0.0009) [2023-10-12 19:04:24,584][62635] Updated weights for policy 1, policy_version 78650 (0.0008) [2023-10-12 19:04:27,337][62634] Updated weights for policy 0, policy_version 78630 (0.0010) [2023-10-12 19:04:27,720][62634] Updated weights for policy 0, policy_version 78640 (0.0008) [2023-10-12 19:04:28,092][62634] Updated weights for policy 0, policy_version 78650 (0.0010) [2023-10-12 19:04:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161087488. Throughput: 0: 1695.8, 1: 1683.3. Samples: 40279224. Policy #0 lag: (min: 6.0, avg: 9.3, max: 38.0) [2023-10-12 19:04:28,435][61643] Avg episode reward: [(0, '24.950'), (1, '9.760')] [2023-10-12 19:04:28,673][62635] Updated weights for policy 1, policy_version 78660 (0.0009) [2023-10-12 19:04:29,040][62635] Updated weights for policy 1, policy_version 78670 (0.0010) [2023-10-12 19:04:29,417][62635] Updated weights for policy 1, policy_version 78680 (0.0009) [2023-10-12 19:04:32,243][62634] Updated weights for policy 0, policy_version 78660 (0.0009) [2023-10-12 19:04:32,625][62634] Updated weights for policy 0, policy_version 78670 (0.0008) [2023-10-12 19:04:33,009][62634] Updated weights for policy 0, policy_version 78680 (0.0010) [2023-10-12 19:04:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161153024. Throughput: 0: 1671.3, 1: 1674.6. Samples: 40298862. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:04:33,436][61643] Avg episode reward: [(0, '24.800'), (1, '10.060')] [2023-10-12 19:04:33,585][62635] Updated weights for policy 1, policy_version 78690 (0.0008) [2023-10-12 19:04:33,938][62635] Updated weights for policy 1, policy_version 78700 (0.0008) [2023-10-12 19:04:34,315][62635] Updated weights for policy 1, policy_version 78710 (0.0007) [2023-10-12 19:04:34,677][62635] Updated weights for policy 1, policy_version 78720 (0.0009) [2023-10-12 19:04:37,120][62634] Updated weights for policy 0, policy_version 78690 (0.0009) [2023-10-12 19:04:37,490][62634] Updated weights for policy 0, policy_version 78700 (0.0008) [2023-10-12 19:04:37,870][62634] Updated weights for policy 0, policy_version 78710 (0.0009) [2023-10-12 19:04:38,251][62634] Updated weights for policy 0, policy_version 78720 (0.0010) [2023-10-12 19:04:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161218560. Throughput: 0: 1693.5, 1: 1672.0. Samples: 40308772. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:04:38,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.860')] [2023-10-12 19:04:38,760][62635] Updated weights for policy 1, policy_version 78730 (0.0008) [2023-10-12 19:04:39,127][62635] Updated weights for policy 1, policy_version 78740 (0.0008) [2023-10-12 19:04:39,504][62635] Updated weights for policy 1, policy_version 78750 (0.0011) [2023-10-12 19:04:42,429][62634] Updated weights for policy 0, policy_version 78730 (0.0009) [2023-10-12 19:04:42,792][62634] Updated weights for policy 0, policy_version 78740 (0.0008) [2023-10-12 19:04:43,168][62634] Updated weights for policy 0, policy_version 78750 (0.0009) [2023-10-12 19:04:43,329][62635] Updated weights for policy 1, policy_version 78760 (0.0007) [2023-10-12 19:04:43,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161284096. Throughput: 0: 1682.7, 1: 1680.3. 
Samples: 40329340. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:04:43,435][61643] Avg episode reward: [(0, '24.880'), (1, '9.950')] [2023-10-12 19:04:43,700][62635] Updated weights for policy 1, policy_version 78770 (0.0008) [2023-10-12 19:04:44,072][62635] Updated weights for policy 1, policy_version 78780 (0.0007) [2023-10-12 19:04:47,420][62634] Updated weights for policy 0, policy_version 78760 (0.0010) [2023-10-12 19:04:47,803][62634] Updated weights for policy 0, policy_version 78770 (0.0009) [2023-10-12 19:04:48,130][62635] Updated weights for policy 1, policy_version 78790 (0.0008) [2023-10-12 19:04:48,178][62634] Updated weights for policy 0, policy_version 78780 (0.0008) [2023-10-12 19:04:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161349632. Throughput: 0: 1666.3, 1: 1681.8. Samples: 40349024. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:04:48,435][61643] Avg episode reward: [(0, '24.690'), (1, '10.040')] [2023-10-12 19:04:48,502][62635] Updated weights for policy 1, policy_version 78800 (0.0008) [2023-10-12 19:04:48,868][62635] Updated weights for policy 1, policy_version 78810 (0.0007) [2023-10-12 19:04:51,976][62634] Updated weights for policy 0, policy_version 78790 (0.0007) [2023-10-12 19:04:52,349][62634] Updated weights for policy 0, policy_version 78800 (0.0010) [2023-10-12 19:04:52,725][62634] Updated weights for policy 0, policy_version 78810 (0.0009) [2023-10-12 19:04:52,958][62635] Updated weights for policy 1, policy_version 78820 (0.0007) [2023-10-12 19:04:53,327][62635] Updated weights for policy 1, policy_version 78830 (0.0011) [2023-10-12 19:04:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161415168. Throughput: 0: 1681.7, 1: 1685.7. Samples: 40359326. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:04:53,435][61643] Avg episode reward: [(0, '24.740'), (1, '9.950')] [2023-10-12 19:04:53,691][62635] Updated weights for policy 1, policy_version 78840 (0.0009) [2023-10-12 19:04:56,651][62634] Updated weights for policy 0, policy_version 78820 (0.0008) [2023-10-12 19:04:57,036][62634] Updated weights for policy 0, policy_version 78830 (0.0007) [2023-10-12 19:04:57,408][62634] Updated weights for policy 0, policy_version 78840 (0.0007) [2023-10-12 19:04:57,710][62635] Updated weights for policy 1, policy_version 78850 (0.0009) [2023-10-12 19:04:58,069][62635] Updated weights for policy 1, policy_version 78860 (0.0007) [2023-10-12 19:04:58,431][62635] Updated weights for policy 1, policy_version 78870 (0.0009) [2023-10-12 19:04:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161480704. Throughput: 0: 1667.5, 1: 1676.2. Samples: 40379270. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:04:58,435][61643] Avg episode reward: [(0, '24.760'), (1, '9.960')] [2023-10-12 19:04:58,803][62635] Updated weights for policy 1, policy_version 78880 (0.0011) [2023-10-12 19:05:01,518][62634] Updated weights for policy 0, policy_version 78850 (0.0009) [2023-10-12 19:05:01,913][62634] Updated weights for policy 0, policy_version 78860 (0.0011) [2023-10-12 19:05:02,289][62634] Updated weights for policy 0, policy_version 78870 (0.0011) [2023-10-12 19:05:02,664][62634] Updated weights for policy 0, policy_version 78880 (0.0009) [2023-10-12 19:05:02,970][62635] Updated weights for policy 1, policy_version 78890 (0.0009) [2023-10-12 19:05:03,334][62635] Updated weights for policy 1, policy_version 78900 (0.0010) [2023-10-12 19:05:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161546240. Throughput: 0: 1668.3, 1: 1664.9. Samples: 40398530. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:05:03,435][61643] Avg episode reward: [(0, '24.610'), (1, '10.040')] [2023-10-12 19:05:03,711][62635] Updated weights for policy 1, policy_version 78910 (0.0008) [2023-10-12 19:05:06,649][62634] Updated weights for policy 0, policy_version 78890 (0.0011) [2023-10-12 19:05:07,023][62634] Updated weights for policy 0, policy_version 78900 (0.0009) [2023-10-12 19:05:07,394][62634] Updated weights for policy 0, policy_version 78910 (0.0008) [2023-10-12 19:05:07,863][62635] Updated weights for policy 1, policy_version 78920 (0.0008) [2023-10-12 19:05:08,231][62635] Updated weights for policy 1, policy_version 78930 (0.0009) [2023-10-12 19:05:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161611776. Throughput: 0: 1680.7, 1: 1672.3. Samples: 40409368. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:05:08,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.700')] [2023-10-12 19:05:08,606][62635] Updated weights for policy 1, policy_version 78940 (0.0009) [2023-10-12 19:05:11,471][62634] Updated weights for policy 0, policy_version 78920 (0.0007) [2023-10-12 19:05:11,851][62634] Updated weights for policy 0, policy_version 78930 (0.0007) [2023-10-12 19:05:12,235][62634] Updated weights for policy 0, policy_version 78940 (0.0007) [2023-10-12 19:05:12,629][62635] Updated weights for policy 1, policy_version 78950 (0.0008) [2023-10-12 19:05:12,988][62635] Updated weights for policy 1, policy_version 78960 (0.0008) [2023-10-12 19:05:13,363][62635] Updated weights for policy 1, policy_version 78970 (0.0008) [2023-10-12 19:05:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161677312. Throughput: 0: 1659.8, 1: 1677.6. Samples: 40429404. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:05:13,435][61643] Avg episode reward: [(0, '24.380'), (1, '9.730')] [2023-10-12 19:05:16,312][62634] Updated weights for policy 0, policy_version 78950 (0.0010) [2023-10-12 19:05:16,674][62634] Updated weights for policy 0, policy_version 78960 (0.0010) [2023-10-12 19:05:17,051][62634] Updated weights for policy 0, policy_version 78970 (0.0008) [2023-10-12 19:05:17,550][62635] Updated weights for policy 1, policy_version 78980 (0.0009) [2023-10-12 19:05:17,907][62635] Updated weights for policy 1, policy_version 78990 (0.0010) [2023-10-12 19:05:18,272][62635] Updated weights for policy 1, policy_version 79000 (0.0009) [2023-10-12 19:05:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 161742848. Throughput: 0: 1663.4, 1: 1667.0. Samples: 40448730. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:05:18,435][61643] Avg episode reward: [(0, '24.200'), (1, '9.980')] [2023-10-12 19:05:21,079][62634] Updated weights for policy 0, policy_version 78980 (0.0009) [2023-10-12 19:05:21,462][62634] Updated weights for policy 0, policy_version 78990 (0.0010) [2023-10-12 19:05:21,851][62634] Updated weights for policy 0, policy_version 79000 (0.0007) [2023-10-12 19:05:22,310][62635] Updated weights for policy 1, policy_version 79010 (0.0010) [2023-10-12 19:05:22,674][62635] Updated weights for policy 1, policy_version 79020 (0.0009) [2023-10-12 19:05:23,045][62635] Updated weights for policy 1, policy_version 79030 (0.0007) [2023-10-12 19:05:23,406][62635] Updated weights for policy 1, policy_version 79040 (0.0007) [2023-10-12 19:05:23,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161841152. Throughput: 0: 1674.7, 1: 1682.6. Samples: 40459852. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:05:23,436][61643] Avg episode reward: [(0, '24.020'), (1, '9.740')] [2023-10-12 19:05:25,961][62634] Updated weights for policy 0, policy_version 79010 (0.0007) [2023-10-12 19:05:26,327][62634] Updated weights for policy 0, policy_version 79020 (0.0008) [2023-10-12 19:05:26,700][62634] Updated weights for policy 0, policy_version 79030 (0.0010) [2023-10-12 19:05:27,078][62634] Updated weights for policy 0, policy_version 79040 (0.0011) [2023-10-12 19:05:27,553][62635] Updated weights for policy 1, policy_version 79050 (0.0007) [2023-10-12 19:05:27,913][62635] Updated weights for policy 1, policy_version 79060 (0.0007) [2023-10-12 19:05:28,277][62635] Updated weights for policy 1, policy_version 79070 (0.0009) [2023-10-12 19:05:28,435][61643] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161906688. Throughput: 0: 1654.4, 1: 1680.6. Samples: 40479414. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:28,436][61643] Avg episode reward: [(0, '24.050'), (1, '9.720')] [2023-10-12 19:05:31,191][62634] Updated weights for policy 0, policy_version 79050 (0.0009) [2023-10-12 19:05:31,571][62634] Updated weights for policy 0, policy_version 79060 (0.0010) [2023-10-12 19:05:31,948][62634] Updated weights for policy 0, policy_version 79070 (0.0010) [2023-10-12 19:05:32,349][62635] Updated weights for policy 1, policy_version 79080 (0.0010) [2023-10-12 19:05:32,719][62635] Updated weights for policy 1, policy_version 79090 (0.0011) [2023-10-12 19:05:33,091][62635] Updated weights for policy 1, policy_version 79100 (0.0008) [2023-10-12 19:05:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 161972224. Throughput: 0: 1675.3, 1: 1658.0. Samples: 40499024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:33,435][61643] Avg episode reward: [(0, '24.010'), (1, '9.750')] [2023-10-12 19:05:35,921][62634] Updated weights for policy 0, policy_version 79080 (0.0008) [2023-10-12 19:05:36,299][62634] Updated weights for policy 0, policy_version 79090 (0.0009) [2023-10-12 19:05:36,674][62634] Updated weights for policy 0, policy_version 79100 (0.0010) [2023-10-12 19:05:37,122][62635] Updated weights for policy 1, policy_version 79110 (0.0009) [2023-10-12 19:05:37,493][62635] Updated weights for policy 1, policy_version 79120 (0.0010) [2023-10-12 19:05:37,858][62635] Updated weights for policy 1, policy_version 79130 (0.0007) [2023-10-12 19:05:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162037760. Throughput: 0: 1673.7, 1: 1678.1. Samples: 40510156. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:38,435][61643] Avg episode reward: [(0, '24.040'), (1, '9.850')] [2023-10-12 19:05:40,745][62634] Updated weights for policy 0, policy_version 79110 (0.0007) [2023-10-12 19:05:41,116][62634] Updated weights for policy 0, policy_version 79120 (0.0007) [2023-10-12 19:05:41,495][62634] Updated weights for policy 0, policy_version 79130 (0.0007) [2023-10-12 19:05:41,970][62635] Updated weights for policy 1, policy_version 79140 (0.0009) [2023-10-12 19:05:42,337][62635] Updated weights for policy 1, policy_version 79150 (0.0010) [2023-10-12 19:05:42,704][62635] Updated weights for policy 1, policy_version 79160 (0.0010) [2023-10-12 19:05:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162103296. Throughput: 0: 1661.5, 1: 1684.4. Samples: 40529840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:43,436][61643] Avg episode reward: [(0, '24.150'), (1, '9.900')] [2023-10-12 19:05:45,418][62634] Updated weights for policy 0, policy_version 79140 (0.0009) [2023-10-12 19:05:45,790][62634] Updated weights for policy 0, policy_version 79150 (0.0007) [2023-10-12 19:05:46,164][62634] Updated weights for policy 0, policy_version 79160 (0.0008) [2023-10-12 19:05:46,681][62635] Updated weights for policy 1, policy_version 79170 (0.0007) [2023-10-12 19:05:47,059][62635] Updated weights for policy 1, policy_version 79180 (0.0009) [2023-10-12 19:05:47,425][62635] Updated weights for policy 1, policy_version 79190 (0.0009) [2023-10-12 19:05:47,797][62635] Updated weights for policy 1, policy_version 79200 (0.0010) [2023-10-12 19:05:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162168832. Throughput: 0: 1685.4, 1: 1673.6. Samples: 40549686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:48,436][61643] Avg episode reward: [(0, '24.030'), (1, '10.000')] [2023-10-12 19:05:50,316][62634] Updated weights for policy 0, policy_version 79170 (0.0010) [2023-10-12 19:05:50,694][62634] Updated weights for policy 0, policy_version 79180 (0.0010) [2023-10-12 19:05:51,069][62634] Updated weights for policy 0, policy_version 79190 (0.0009) [2023-10-12 19:05:51,450][62634] Updated weights for policy 0, policy_version 79200 (0.0009) [2023-10-12 19:05:51,569][62635] Updated weights for policy 1, policy_version 79210 (0.0008) [2023-10-12 19:05:51,934][62635] Updated weights for policy 1, policy_version 79220 (0.0010) [2023-10-12 19:05:52,306][62635] Updated weights for policy 1, policy_version 79230 (0.0008) [2023-10-12 19:05:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162234368. Throughput: 0: 1665.0, 1: 1695.7. Samples: 40560600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:53,435][61643] Avg episode reward: [(0, '24.070'), (1, '9.740')] [2023-10-12 19:05:55,329][62634] Updated weights for policy 0, policy_version 79210 (0.0009) [2023-10-12 19:05:55,713][62634] Updated weights for policy 0, policy_version 79220 (0.0007) [2023-10-12 19:05:56,091][62634] Updated weights for policy 0, policy_version 79230 (0.0009) [2023-10-12 19:05:56,345][62635] Updated weights for policy 1, policy_version 79240 (0.0009) [2023-10-12 19:05:56,721][62635] Updated weights for policy 1, policy_version 79250 (0.0009) [2023-10-12 19:05:57,085][62635] Updated weights for policy 1, policy_version 79260 (0.0010) [2023-10-12 19:05:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162299904. Throughput: 0: 1672.9, 1: 1667.9. Samples: 40579740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:05:58,435][61643] Avg episode reward: [(0, '24.340'), (1, '9.900')] [2023-10-12 19:06:00,267][62634] Updated weights for policy 0, policy_version 79240 (0.0008) [2023-10-12 19:06:00,651][62634] Updated weights for policy 0, policy_version 79250 (0.0007) [2023-10-12 19:06:01,028][62634] Updated weights for policy 0, policy_version 79260 (0.0008) [2023-10-12 19:06:01,155][62635] Updated weights for policy 1, policy_version 79270 (0.0008) [2023-10-12 19:06:01,526][62635] Updated weights for policy 1, policy_version 79280 (0.0007) [2023-10-12 19:06:01,888][62635] Updated weights for policy 1, policy_version 79290 (0.0009) [2023-10-12 19:06:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162365440. Throughput: 0: 1687.5, 1: 1677.9. Samples: 40600172. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:03,436][61643] Avg episode reward: [(0, '24.090'), (1, '9.940')] [2023-10-12 19:06:05,007][62634] Updated weights for policy 0, policy_version 79270 (0.0009) [2023-10-12 19:06:05,381][62634] Updated weights for policy 0, policy_version 79280 (0.0008) [2023-10-12 19:06:05,755][62634] Updated weights for policy 0, policy_version 79290 (0.0011) [2023-10-12 19:06:05,988][62635] Updated weights for policy 1, policy_version 79300 (0.0007) [2023-10-12 19:06:06,351][62635] Updated weights for policy 1, policy_version 79310 (0.0010) [2023-10-12 19:06:06,721][62635] Updated weights for policy 1, policy_version 79320 (0.0011) [2023-10-12 19:06:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162430976. Throughput: 0: 1657.6, 1: 1688.9. Samples: 40610442. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:08,436][61643] Avg episode reward: [(0, '24.200'), (1, '9.980')] [2023-10-12 19:06:09,969][62634] Updated weights for policy 0, policy_version 79300 (0.0007) [2023-10-12 19:06:10,334][62634] Updated weights for policy 0, policy_version 79310 (0.0007) [2023-10-12 19:06:10,713][62634] Updated weights for policy 0, policy_version 79320 (0.0009) [2023-10-12 19:06:10,917][62635] Updated weights for policy 1, policy_version 79330 (0.0010) [2023-10-12 19:06:11,279][62635] Updated weights for policy 1, policy_version 79340 (0.0007) [2023-10-12 19:06:11,647][62635] Updated weights for policy 1, policy_version 79350 (0.0008) [2023-10-12 19:06:12,018][62635] Updated weights for policy 1, policy_version 79360 (0.0009) [2023-10-12 19:06:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162496512. Throughput: 0: 1682.8, 1: 1661.1. Samples: 40629888. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:13,435][61643] Avg episode reward: [(0, '24.360'), (1, '9.970')] [2023-10-12 19:06:14,716][62634] Updated weights for policy 0, policy_version 79330 (0.0009) [2023-10-12 19:06:15,093][62634] Updated weights for policy 0, policy_version 79340 (0.0010) [2023-10-12 19:06:15,473][62634] Updated weights for policy 0, policy_version 79350 (0.0007) [2023-10-12 19:06:15,842][62634] Updated weights for policy 0, policy_version 79360 (0.0011) [2023-10-12 19:06:16,177][62635] Updated weights for policy 1, policy_version 79370 (0.0009) [2023-10-12 19:06:16,552][62635] Updated weights for policy 1, policy_version 79380 (0.0007) [2023-10-12 19:06:16,916][62635] Updated weights for policy 1, policy_version 79390 (0.0008) [2023-10-12 19:06:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162562048. Throughput: 0: 1687.5, 1: 1682.4. Samples: 40650666. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:18,435][61643] Avg episode reward: [(0, '24.310'), (1, '9.880')] [2023-10-12 19:06:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000079392_81297408.pth... [2023-10-12 19:06:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000079360_81264640.pth... 
[2023-10-12 19:06:18,473][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000077824_79691776.pth [2023-10-12 19:06:18,473][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000077792_79659008.pth [2023-10-12 19:06:19,818][62634] Updated weights for policy 0, policy_version 79370 (0.0008) [2023-10-12 19:06:20,196][62634] Updated weights for policy 0, policy_version 79380 (0.0009) [2023-10-12 19:06:20,568][62634] Updated weights for policy 0, policy_version 79390 (0.0008) [2023-10-12 19:06:21,025][62635] Updated weights for policy 1, policy_version 79400 (0.0008) [2023-10-12 19:06:21,381][62635] Updated weights for policy 1, policy_version 79410 (0.0012) [2023-10-12 19:06:21,749][62635] Updated weights for policy 1, policy_version 79420 (0.0008) [2023-10-12 19:06:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162627584. Throughput: 0: 1665.8, 1: 1681.9. Samples: 40660802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:23,436][61643] Avg episode reward: [(0, '24.380'), (1, '9.850')] [2023-10-12 19:06:24,647][62634] Updated weights for policy 0, policy_version 79400 (0.0007) [2023-10-12 19:06:25,029][62634] Updated weights for policy 0, policy_version 79410 (0.0007) [2023-10-12 19:06:25,398][62634] Updated weights for policy 0, policy_version 79420 (0.0007) [2023-10-12 19:06:25,852][62635] Updated weights for policy 1, policy_version 79430 (0.0011) [2023-10-12 19:06:26,220][62635] Updated weights for policy 1, policy_version 79440 (0.0008) [2023-10-12 19:06:26,587][62635] Updated weights for policy 1, policy_version 79450 (0.0007) [2023-10-12 19:06:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162693120. Throughput: 0: 1687.8, 1: 1659.9. Samples: 40680484. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:28,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.850')] [2023-10-12 19:06:29,426][62634] Updated weights for policy 0, policy_version 79430 (0.0009) [2023-10-12 19:06:29,803][62634] Updated weights for policy 0, policy_version 79440 (0.0008) [2023-10-12 19:06:30,179][62634] Updated weights for policy 0, policy_version 79450 (0.0007) [2023-10-12 19:06:30,595][62635] Updated weights for policy 1, policy_version 79460 (0.0009) [2023-10-12 19:06:30,960][62635] Updated weights for policy 1, policy_version 79470 (0.0007) [2023-10-12 19:06:31,328][62635] Updated weights for policy 1, policy_version 79480 (0.0007) [2023-10-12 19:06:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162758656. Throughput: 0: 1684.3, 1: 1685.5. Samples: 40701328. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:33,435][61643] Avg episode reward: [(0, '24.650'), (1, '9.850')] [2023-10-12 19:06:34,261][62634] Updated weights for policy 0, policy_version 79460 (0.0008) [2023-10-12 19:06:34,647][62634] Updated weights for policy 0, policy_version 79470 (0.0007) [2023-10-12 19:06:35,021][62634] Updated weights for policy 0, policy_version 79480 (0.0009) [2023-10-12 19:06:35,367][62635] Updated weights for policy 1, policy_version 79490 (0.0007) [2023-10-12 19:06:35,770][62635] Updated weights for policy 1, policy_version 79500 (0.0007) [2023-10-12 19:06:36,134][62635] Updated weights for policy 1, policy_version 79510 (0.0008) [2023-10-12 19:06:36,501][62635] Updated weights for policy 1, policy_version 79520 (0.0010) [2023-10-12 19:06:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162824192. Throughput: 0: 1672.4, 1: 1668.7. Samples: 40710952. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:38,436][61643] Avg episode reward: [(0, '24.810'), (1, '9.940')] [2023-10-12 19:06:39,127][62634] Updated weights for policy 0, policy_version 79490 (0.0007) [2023-10-12 19:06:39,535][62634] Updated weights for policy 0, policy_version 79500 (0.0009) [2023-10-12 19:06:39,901][62634] Updated weights for policy 0, policy_version 79510 (0.0007) [2023-10-12 19:06:40,275][62634] Updated weights for policy 0, policy_version 79520 (0.0008) [2023-10-12 19:06:40,645][62635] Updated weights for policy 1, policy_version 79530 (0.0008) [2023-10-12 19:06:41,009][62635] Updated weights for policy 1, policy_version 79540 (0.0007) [2023-10-12 19:06:41,378][62635] Updated weights for policy 1, policy_version 79550 (0.0007) [2023-10-12 19:06:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162889728. Throughput: 0: 1681.4, 1: 1678.9. Samples: 40730952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:43,435][61643] Avg episode reward: [(0, '24.990'), (1, '10.030')] [2023-10-12 19:06:44,387][62634] Updated weights for policy 0, policy_version 79530 (0.0007) [2023-10-12 19:06:44,770][62634] Updated weights for policy 0, policy_version 79540 (0.0008) [2023-10-12 19:06:45,136][62634] Updated weights for policy 0, policy_version 79550 (0.0007) [2023-10-12 19:06:45,340][62635] Updated weights for policy 1, policy_version 79560 (0.0007) [2023-10-12 19:06:45,703][62635] Updated weights for policy 1, policy_version 79570 (0.0009) [2023-10-12 19:06:46,069][62635] Updated weights for policy 1, policy_version 79580 (0.0011) [2023-10-12 19:06:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162955264. Throughput: 0: 1680.3, 1: 1690.2. Samples: 40751844. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:48,436][61643] Avg episode reward: [(0, '25.080'), (1, '9.770')] [2023-10-12 19:06:49,296][62634] Updated weights for policy 0, policy_version 79560 (0.0010) [2023-10-12 19:06:49,676][62634] Updated weights for policy 0, policy_version 79570 (0.0009) [2023-10-12 19:06:50,022][62635] Updated weights for policy 1, policy_version 79590 (0.0009) [2023-10-12 19:06:50,057][62634] Updated weights for policy 0, policy_version 79580 (0.0009) [2023-10-12 19:06:50,386][62635] Updated weights for policy 1, policy_version 79600 (0.0010) [2023-10-12 19:06:50,754][62635] Updated weights for policy 1, policy_version 79610 (0.0010) [2023-10-12 19:06:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163020800. Throughput: 0: 1676.4, 1: 1666.9. Samples: 40760892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:53,436][61643] Avg episode reward: [(0, '24.960'), (1, '9.940')] [2023-10-12 19:06:54,202][62634] Updated weights for policy 0, policy_version 79590 (0.0010) [2023-10-12 19:06:54,585][62634] Updated weights for policy 0, policy_version 79600 (0.0008) [2023-10-12 19:06:54,786][62635] Updated weights for policy 1, policy_version 79620 (0.0009) [2023-10-12 19:06:54,952][62634] Updated weights for policy 0, policy_version 79610 (0.0008) [2023-10-12 19:06:55,155][62635] Updated weights for policy 1, policy_version 79630 (0.0010) [2023-10-12 19:06:55,522][62635] Updated weights for policy 1, policy_version 79640 (0.0009) [2023-10-12 19:06:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163086336. Throughput: 0: 1673.1, 1: 1693.6. Samples: 40781390. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:06:58,436][61643] Avg episode reward: [(0, '24.880'), (1, '10.150')] [2023-10-12 19:06:59,065][62634] Updated weights for policy 0, policy_version 79620 (0.0008) [2023-10-12 19:06:59,439][62634] Updated weights for policy 0, policy_version 79630 (0.0008) [2023-10-12 19:06:59,627][62635] Updated weights for policy 1, policy_version 79650 (0.0009) [2023-10-12 19:06:59,818][62634] Updated weights for policy 0, policy_version 79640 (0.0008) [2023-10-12 19:06:59,994][62635] Updated weights for policy 1, policy_version 79660 (0.0009) [2023-10-12 19:07:00,367][62635] Updated weights for policy 1, policy_version 79670 (0.0009) [2023-10-12 19:07:00,725][62635] Updated weights for policy 1, policy_version 79680 (0.0010) [2023-10-12 19:07:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163151872. Throughput: 0: 1673.3, 1: 1695.9. Samples: 40802280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:07:03,435][61643] Avg episode reward: [(0, '24.910'), (1, '10.040')] [2023-10-12 19:07:03,756][62634] Updated weights for policy 0, policy_version 79650 (0.0009) [2023-10-12 19:07:04,123][62634] Updated weights for policy 0, policy_version 79660 (0.0009) [2023-10-12 19:07:04,503][62634] Updated weights for policy 0, policy_version 79670 (0.0007) [2023-10-12 19:07:04,878][62635] Updated weights for policy 1, policy_version 79690 (0.0007) [2023-10-12 19:07:04,883][62634] Updated weights for policy 0, policy_version 79680 (0.0007) [2023-10-12 19:07:05,239][62635] Updated weights for policy 1, policy_version 79700 (0.0008) [2023-10-12 19:07:05,602][62635] Updated weights for policy 1, policy_version 79710 (0.0008) [2023-10-12 19:07:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 163217408. Throughput: 0: 1669.6, 1: 1674.2. Samples: 40811274. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:07:08,436][61643] Avg episode reward: [(0, '24.950'), (1, '9.870')] [2023-10-12 19:07:08,967][62634] Updated weights for policy 0, policy_version 79690 (0.0009) [2023-10-12 19:07:09,350][62634] Updated weights for policy 0, policy_version 79700 (0.0008) [2023-10-12 19:07:09,724][62634] Updated weights for policy 0, policy_version 79710 (0.0008) [2023-10-12 19:07:09,749][62635] Updated weights for policy 1, policy_version 79720 (0.0009) [2023-10-12 19:07:10,111][62635] Updated weights for policy 1, policy_version 79730 (0.0011) [2023-10-12 19:07:10,482][62635] Updated weights for policy 1, policy_version 79740 (0.0008) [2023-10-12 19:07:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163282944. Throughput: 0: 1668.8, 1: 1696.3. Samples: 40831916. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:07:13,435][61643] Avg episode reward: [(0, '24.910'), (1, '10.050')] [2023-10-12 19:07:13,580][62634] Updated weights for policy 0, policy_version 79720 (0.0010) [2023-10-12 19:07:13,964][62634] Updated weights for policy 0, policy_version 79730 (0.0008) [2023-10-12 19:07:14,342][62634] Updated weights for policy 0, policy_version 79740 (0.0007) [2023-10-12 19:07:14,605][62635] Updated weights for policy 1, policy_version 79750 (0.0010) [2023-10-12 19:07:14,959][62635] Updated weights for policy 1, policy_version 79760 (0.0010) [2023-10-12 19:07:15,329][62635] Updated weights for policy 1, policy_version 79770 (0.0007) [2023-10-12 19:07:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163348480. Throughput: 0: 1667.3, 1: 1697.1. Samples: 40852728. 
Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:18,435][61643] Avg episode reward: [(0, '24.930'), (1, '10.000')] [2023-10-12 19:07:18,509][62634] Updated weights for policy 0, policy_version 79750 (0.0009) [2023-10-12 19:07:18,890][62634] Updated weights for policy 0, policy_version 79760 (0.0011) [2023-10-12 19:07:19,253][62635] Updated weights for policy 1, policy_version 79780 (0.0008) [2023-10-12 19:07:19,274][62634] Updated weights for policy 0, policy_version 79770 (0.0008) [2023-10-12 19:07:19,616][62635] Updated weights for policy 1, policy_version 79790 (0.0010) [2023-10-12 19:07:19,983][62635] Updated weights for policy 1, policy_version 79800 (0.0009) [2023-10-12 19:07:23,382][62634] Updated weights for policy 0, policy_version 79780 (0.0009) [2023-10-12 19:07:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 163414016. Throughput: 0: 1668.1, 1: 1683.5. Samples: 40861774. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:23,436][61643] Avg episode reward: [(0, '24.670'), (1, '9.910')] [2023-10-12 19:07:23,756][62634] Updated weights for policy 0, policy_version 79790 (0.0008) [2023-10-12 19:07:24,040][62635] Updated weights for policy 1, policy_version 79810 (0.0009) [2023-10-12 19:07:24,140][62634] Updated weights for policy 0, policy_version 79800 (0.0008) [2023-10-12 19:07:24,428][62635] Updated weights for policy 1, policy_version 79820 (0.0009) [2023-10-12 19:07:24,803][62635] Updated weights for policy 1, policy_version 79830 (0.0009) [2023-10-12 19:07:25,166][62635] Updated weights for policy 1, policy_version 79840 (0.0008) [2023-10-12 19:07:28,039][62634] Updated weights for policy 0, policy_version 79810 (0.0008) [2023-10-12 19:07:28,411][62634] Updated weights for policy 0, policy_version 79820 (0.0009) [2023-10-12 19:07:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163479552. Throughput: 0: 1677.5, 1: 1693.5. 
Samples: 40882646. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:28,435][61643] Avg episode reward: [(0, '24.410'), (1, '10.090')] [2023-10-12 19:07:28,793][62634] Updated weights for policy 0, policy_version 79830 (0.0007) [2023-10-12 19:07:29,166][62634] Updated weights for policy 0, policy_version 79840 (0.0007) [2023-10-12 19:07:29,290][62635] Updated weights for policy 1, policy_version 79850 (0.0009) [2023-10-12 19:07:29,662][62635] Updated weights for policy 1, policy_version 79860 (0.0009) [2023-10-12 19:07:30,036][62635] Updated weights for policy 1, policy_version 79870 (0.0007) [2023-10-12 19:07:33,225][62634] Updated weights for policy 0, policy_version 79850 (0.0007) [2023-10-12 19:07:33,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163545088. Throughput: 0: 1681.8, 1: 1687.0. Samples: 40903440. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:33,435][61643] Avg episode reward: [(0, '24.510'), (1, '9.820')] [2023-10-12 19:07:33,595][62634] Updated weights for policy 0, policy_version 79860 (0.0010) [2023-10-12 19:07:33,965][62635] Updated weights for policy 1, policy_version 79880 (0.0008) [2023-10-12 19:07:33,979][62634] Updated weights for policy 0, policy_version 79870 (0.0007) [2023-10-12 19:07:34,336][62635] Updated weights for policy 1, policy_version 79890 (0.0007) [2023-10-12 19:07:34,700][62635] Updated weights for policy 1, policy_version 79900 (0.0009) [2023-10-12 19:07:38,008][62634] Updated weights for policy 0, policy_version 79880 (0.0011) [2023-10-12 19:07:38,397][62634] Updated weights for policy 0, policy_version 79890 (0.0007) [2023-10-12 19:07:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 163610624. Throughput: 0: 1688.3, 1: 1685.9. Samples: 40912728. 
Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:38,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.920')] [2023-10-12 19:07:38,777][62634] Updated weights for policy 0, policy_version 79900 (0.0008) [2023-10-12 19:07:38,815][62635] Updated weights for policy 1, policy_version 79910 (0.0010) [2023-10-12 19:07:39,182][62635] Updated weights for policy 1, policy_version 79920 (0.0009) [2023-10-12 19:07:39,556][62635] Updated weights for policy 1, policy_version 79930 (0.0009) [2023-10-12 19:07:42,771][62634] Updated weights for policy 0, policy_version 79910 (0.0009) [2023-10-12 19:07:43,145][62634] Updated weights for policy 0, policy_version 79920 (0.0008) [2023-10-12 19:07:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163676160. Throughput: 0: 1692.1, 1: 1687.3. Samples: 40933462. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:43,435][61643] Avg episode reward: [(0, '24.680'), (1, '10.000')] [2023-10-12 19:07:43,526][62634] Updated weights for policy 0, policy_version 79930 (0.0009) [2023-10-12 19:07:43,556][62635] Updated weights for policy 1, policy_version 79940 (0.0007) [2023-10-12 19:07:43,918][62635] Updated weights for policy 1, policy_version 79950 (0.0007) [2023-10-12 19:07:44,291][62635] Updated weights for policy 1, policy_version 79960 (0.0010) [2023-10-12 19:07:47,677][62634] Updated weights for policy 0, policy_version 79940 (0.0011) [2023-10-12 19:07:48,058][62634] Updated weights for policy 0, policy_version 79950 (0.0010) [2023-10-12 19:07:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163741696. Throughput: 0: 1678.7, 1: 1682.0. Samples: 40953512. 
Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:48,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.860')] [2023-10-12 19:07:48,441][62634] Updated weights for policy 0, policy_version 79960 (0.0009) [2023-10-12 19:07:48,657][62635] Updated weights for policy 1, policy_version 79970 (0.0007) [2023-10-12 19:07:49,021][62635] Updated weights for policy 1, policy_version 79980 (0.0009) [2023-10-12 19:07:49,388][62635] Updated weights for policy 1, policy_version 79990 (0.0009) [2023-10-12 19:07:49,756][62635] Updated weights for policy 1, policy_version 80000 (0.0008) [2023-10-12 19:07:52,456][62634] Updated weights for policy 0, policy_version 79970 (0.0008) [2023-10-12 19:07:52,841][62634] Updated weights for policy 0, policy_version 79980 (0.0007) [2023-10-12 19:07:53,218][62634] Updated weights for policy 0, policy_version 79990 (0.0007) [2023-10-12 19:07:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163807232. Throughput: 0: 1691.6, 1: 1678.2. Samples: 40962912. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:53,435][61643] Avg episode reward: [(0, '24.900'), (1, '9.880')] [2023-10-12 19:07:53,598][62634] Updated weights for policy 0, policy_version 80000 (0.0008) [2023-10-12 19:07:53,901][62635] Updated weights for policy 1, policy_version 80010 (0.0009) [2023-10-12 19:07:54,263][62635] Updated weights for policy 1, policy_version 80020 (0.0008) [2023-10-12 19:07:54,636][62635] Updated weights for policy 1, policy_version 80030 (0.0009) [2023-10-12 19:07:57,769][62634] Updated weights for policy 0, policy_version 80010 (0.0007) [2023-10-12 19:07:58,153][62634] Updated weights for policy 0, policy_version 80020 (0.0008) [2023-10-12 19:07:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 163872768. Throughput: 0: 1692.0, 1: 1674.5. Samples: 40983408. 
Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:07:58,436][61643] Avg episode reward: [(0, '24.790'), (1, '9.970')] [2023-10-12 19:07:58,531][62634] Updated weights for policy 0, policy_version 80030 (0.0009) [2023-10-12 19:07:58,809][62635] Updated weights for policy 1, policy_version 80040 (0.0008) [2023-10-12 19:07:59,180][62635] Updated weights for policy 1, policy_version 80050 (0.0007) [2023-10-12 19:07:59,547][62635] Updated weights for policy 1, policy_version 80060 (0.0007) [2023-10-12 19:08:02,600][62634] Updated weights for policy 0, policy_version 80040 (0.0008) [2023-10-12 19:08:02,981][62634] Updated weights for policy 0, policy_version 80050 (0.0009) [2023-10-12 19:08:03,362][62634] Updated weights for policy 0, policy_version 80060 (0.0009) [2023-10-12 19:08:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 163938304. Throughput: 0: 1676.5, 1: 1669.2. Samples: 41003282. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) [2023-10-12 19:08:03,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.860')] [2023-10-12 19:08:03,605][62635] Updated weights for policy 1, policy_version 80070 (0.0009) [2023-10-12 19:08:03,968][62635] Updated weights for policy 1, policy_version 80080 (0.0010) [2023-10-12 19:08:04,332][62635] Updated weights for policy 1, policy_version 80090 (0.0011) [2023-10-12 19:08:07,362][62634] Updated weights for policy 0, policy_version 80070 (0.0010) [2023-10-12 19:08:07,740][62634] Updated weights for policy 0, policy_version 80080 (0.0009) [2023-10-12 19:08:08,110][62634] Updated weights for policy 0, policy_version 80090 (0.0007) [2023-10-12 19:08:08,415][62635] Updated weights for policy 1, policy_version 80100 (0.0009) [2023-10-12 19:08:08,435][61643] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164036608. Throughput: 0: 1692.0, 1: 1668.1. Samples: 41012976. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:08,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.850')] [2023-10-12 19:08:08,788][62635] Updated weights for policy 1, policy_version 80110 (0.0007) [2023-10-12 19:08:09,156][62635] Updated weights for policy 1, policy_version 80120 (0.0008) [2023-10-12 19:08:12,298][62634] Updated weights for policy 0, policy_version 80100 (0.0008) [2023-10-12 19:08:12,685][62634] Updated weights for policy 0, policy_version 80110 (0.0007) [2023-10-12 19:08:13,052][62634] Updated weights for policy 0, policy_version 80120 (0.0007) [2023-10-12 19:08:13,344][62635] Updated weights for policy 1, policy_version 80130 (0.0009) [2023-10-12 19:08:13,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164102144. Throughput: 0: 1685.0, 1: 1672.0. Samples: 41033714. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:13,435][61643] Avg episode reward: [(0, '24.420'), (1, '10.030')] [2023-10-12 19:08:13,741][62635] Updated weights for policy 1, policy_version 80140 (0.0007) [2023-10-12 19:08:14,117][62635] Updated weights for policy 1, policy_version 80150 (0.0007) [2023-10-12 19:08:14,487][62635] Updated weights for policy 1, policy_version 80160 (0.0010) [2023-10-12 19:08:17,010][62634] Updated weights for policy 0, policy_version 80130 (0.0009) [2023-10-12 19:08:17,389][62634] Updated weights for policy 0, policy_version 80140 (0.0010) [2023-10-12 19:08:17,774][62634] Updated weights for policy 0, policy_version 80150 (0.0010) [2023-10-12 19:08:18,142][62634] Updated weights for policy 0, policy_version 80160 (0.0010) [2023-10-12 19:08:18,285][62635] Updated weights for policy 1, policy_version 80170 (0.0009) [2023-10-12 19:08:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164167680. Throughput: 0: 1657.8, 1: 1673.2. Samples: 41053334. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:18,435][61643] Avg episode reward: [(0, '23.860'), (1, '9.850')] [2023-10-12 19:08:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000080160_82083840.pth... [2023-10-12 19:08:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000078592_80478208.pth [2023-10-12 19:08:18,653][62635] Updated weights for policy 1, policy_version 80180 (0.0010) [2023-10-12 19:08:19,022][62635] Updated weights for policy 1, policy_version 80190 (0.0010) [2023-10-12 19:08:19,087][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000080192_82116608.pth... [2023-10-12 19:08:19,116][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000078592_80478208.pth [2023-10-12 19:08:22,024][62634] Updated weights for policy 0, policy_version 80170 (0.0007) [2023-10-12 19:08:22,407][62634] Updated weights for policy 0, policy_version 80180 (0.0009) [2023-10-12 19:08:22,791][62634] Updated weights for policy 0, policy_version 80190 (0.0007) [2023-10-12 19:08:23,078][62635] Updated weights for policy 1, policy_version 80200 (0.0007) [2023-10-12 19:08:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164233216. Throughput: 0: 1682.0, 1: 1674.3. Samples: 41063758. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:23,435][61643] Avg episode reward: [(0, '23.700'), (1, '9.850')] [2023-10-12 19:08:23,445][62635] Updated weights for policy 1, policy_version 80210 (0.0007) [2023-10-12 19:08:23,808][62635] Updated weights for policy 1, policy_version 80220 (0.0008) [2023-10-12 19:08:26,791][62634] Updated weights for policy 0, policy_version 80200 (0.0008) [2023-10-12 19:08:27,172][62634] Updated weights for policy 0, policy_version 80210 (0.0008) [2023-10-12 19:08:27,543][62634] Updated weights for policy 0, policy_version 80220 (0.0008) [2023-10-12 19:08:27,740][62635] Updated weights for policy 1, policy_version 80230 (0.0009) [2023-10-12 19:08:28,108][62635] Updated weights for policy 1, policy_version 80240 (0.0009) [2023-10-12 19:08:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 164298752. Throughput: 0: 1677.2, 1: 1673.1. Samples: 41084222. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:28,435][61643] Avg episode reward: [(0, '23.780'), (1, '10.030')] [2023-10-12 19:08:28,473][62635] Updated weights for policy 1, policy_version 80250 (0.0008) [2023-10-12 19:08:31,605][62634] Updated weights for policy 0, policy_version 80230 (0.0011) [2023-10-12 19:08:31,983][62634] Updated weights for policy 0, policy_version 80240 (0.0011) [2023-10-12 19:08:32,369][62634] Updated weights for policy 0, policy_version 80250 (0.0009) [2023-10-12 19:08:32,629][62635] Updated weights for policy 1, policy_version 80260 (0.0007) [2023-10-12 19:08:32,985][62635] Updated weights for policy 1, policy_version 80270 (0.0009) [2023-10-12 19:08:33,346][62635] Updated weights for policy 1, policy_version 80280 (0.0010) [2023-10-12 19:08:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164364288. Throughput: 0: 1665.6, 1: 1667.3. Samples: 41103490. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:33,436][61643] Avg episode reward: [(0, '23.760'), (1, '9.870')] [2023-10-12 19:08:36,521][62634] Updated weights for policy 0, policy_version 80260 (0.0008) [2023-10-12 19:08:36,912][62634] Updated weights for policy 0, policy_version 80270 (0.0007) [2023-10-12 19:08:37,279][62634] Updated weights for policy 0, policy_version 80280 (0.0008) [2023-10-12 19:08:37,291][62635] Updated weights for policy 1, policy_version 80290 (0.0009) [2023-10-12 19:08:37,651][62635] Updated weights for policy 1, policy_version 80300 (0.0007) [2023-10-12 19:08:38,026][62635] Updated weights for policy 1, policy_version 80310 (0.0007) [2023-10-12 19:08:38,383][62635] Updated weights for policy 1, policy_version 80320 (0.0008) [2023-10-12 19:08:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 164462592. Throughput: 0: 1685.1, 1: 1683.0. Samples: 41114474. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:38,435][61643] Avg episode reward: [(0, '23.780'), (1, '9.800')] [2023-10-12 19:08:41,171][62634] Updated weights for policy 0, policy_version 80290 (0.0007) [2023-10-12 19:08:41,543][62634] Updated weights for policy 0, policy_version 80300 (0.0007) [2023-10-12 19:08:41,922][62634] Updated weights for policy 0, policy_version 80310 (0.0009) [2023-10-12 19:08:42,291][62634] Updated weights for policy 0, policy_version 80320 (0.0008) [2023-10-12 19:08:42,430][62635] Updated weights for policy 1, policy_version 80330 (0.0007) [2023-10-12 19:08:42,803][62635] Updated weights for policy 1, policy_version 80340 (0.0008) [2023-10-12 19:08:43,178][62635] Updated weights for policy 1, policy_version 80350 (0.0010) [2023-10-12 19:08:43,435][61643] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 164528128. Throughput: 0: 1666.6, 1: 1686.0. Samples: 41134274. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:43,435][61643] Avg episode reward: [(0, '23.620'), (1, '9.990')] [2023-10-12 19:08:46,547][62634] Updated weights for policy 0, policy_version 80330 (0.0008) [2023-10-12 19:08:46,913][62634] Updated weights for policy 0, policy_version 80340 (0.0009) [2023-10-12 19:08:47,273][62635] Updated weights for policy 1, policy_version 80360 (0.0008) [2023-10-12 19:08:47,289][62634] Updated weights for policy 0, policy_version 80350 (0.0008) [2023-10-12 19:08:47,644][62635] Updated weights for policy 1, policy_version 80370 (0.0011) [2023-10-12 19:08:48,013][62635] Updated weights for policy 1, policy_version 80380 (0.0010) [2023-10-12 19:08:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 164593664. Throughput: 0: 1671.4, 1: 1665.1. Samples: 41153426. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:48,436][61643] Avg episode reward: [(0, '23.360'), (1, '10.080')] [2023-10-12 19:08:51,445][62634] Updated weights for policy 0, policy_version 80360 (0.0008) [2023-10-12 19:08:51,832][62634] Updated weights for policy 0, policy_version 80370 (0.0008) [2023-10-12 19:08:52,120][62635] Updated weights for policy 1, policy_version 80390 (0.0009) [2023-10-12 19:08:52,204][62634] Updated weights for policy 0, policy_version 80380 (0.0008) [2023-10-12 19:08:52,489][62635] Updated weights for policy 1, policy_version 80400 (0.0007) [2023-10-12 19:08:52,848][62635] Updated weights for policy 1, policy_version 80410 (0.0009) [2023-10-12 19:08:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 164659200. Throughput: 0: 1683.6, 1: 1690.6. Samples: 41164816. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:53,435][61643] Avg episode reward: [(0, '23.280'), (1, '9.730')] [2023-10-12 19:08:56,381][62634] Updated weights for policy 0, policy_version 80390 (0.0007) [2023-10-12 19:08:56,761][62634] Updated weights for policy 0, policy_version 80400 (0.0009) [2023-10-12 19:08:57,118][62635] Updated weights for policy 1, policy_version 80420 (0.0008) [2023-10-12 19:08:57,142][62634] Updated weights for policy 0, policy_version 80410 (0.0010) [2023-10-12 19:08:57,486][62635] Updated weights for policy 1, policy_version 80430 (0.0007) [2023-10-12 19:08:57,856][62635] Updated weights for policy 1, policy_version 80440 (0.0007) [2023-10-12 19:08:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 164724736. Throughput: 0: 1664.5, 1: 1685.7. Samples: 41184472. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-12 19:08:58,435][61643] Avg episode reward: [(0, '23.250'), (1, '9.900')] [2023-10-12 19:09:01,076][62634] Updated weights for policy 0, policy_version 80420 (0.0010) [2023-10-12 19:09:01,463][62634] Updated weights for policy 0, policy_version 80430 (0.0009) [2023-10-12 19:09:01,840][62634] Updated weights for policy 0, policy_version 80440 (0.0008) [2023-10-12 19:09:02,118][62635] Updated weights for policy 1, policy_version 80450 (0.0007) [2023-10-12 19:09:02,537][62635] Updated weights for policy 1, policy_version 80460 (0.0007) [2023-10-12 19:09:02,908][62635] Updated weights for policy 1, policy_version 80470 (0.0008) [2023-10-12 19:09:03,272][62635] Updated weights for policy 1, policy_version 80480 (0.0009) [2023-10-12 19:09:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 164790272. Throughput: 0: 1677.9, 1: 1661.8. Samples: 41203622. 
Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:03,435][61643] Avg episode reward: [(0, '23.120'), (1, '9.840')] [2023-10-12 19:09:05,829][62634] Updated weights for policy 0, policy_version 80450 (0.0007) [2023-10-12 19:09:06,199][62634] Updated weights for policy 0, policy_version 80460 (0.0008) [2023-10-12 19:09:06,582][62634] Updated weights for policy 0, policy_version 80470 (0.0009) [2023-10-12 19:09:06,948][62634] Updated weights for policy 0, policy_version 80480 (0.0008) [2023-10-12 19:09:07,009][62635] Updated weights for policy 1, policy_version 80490 (0.0008) [2023-10-12 19:09:07,381][62635] Updated weights for policy 1, policy_version 80500 (0.0009) [2023-10-12 19:09:07,753][62635] Updated weights for policy 1, policy_version 80510 (0.0011) [2023-10-12 19:09:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 164855808. Throughput: 0: 1673.7, 1: 1686.9. Samples: 41214986. Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:08,436][61643] Avg episode reward: [(0, '23.210'), (1, '9.890')] [2023-10-12 19:09:10,985][62634] Updated weights for policy 0, policy_version 80490 (0.0011) [2023-10-12 19:09:11,365][62634] Updated weights for policy 0, policy_version 80500 (0.0009) [2023-10-12 19:09:11,732][62634] Updated weights for policy 0, policy_version 80510 (0.0007) [2023-10-12 19:09:11,808][62635] Updated weights for policy 1, policy_version 80520 (0.0008) [2023-10-12 19:09:12,175][62635] Updated weights for policy 1, policy_version 80530 (0.0009) [2023-10-12 19:09:12,550][62635] Updated weights for policy 1, policy_version 80540 (0.0008) [2023-10-12 19:09:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164921344. Throughput: 0: 1652.1, 1: 1673.0. Samples: 41233854. 
Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:13,435][61643] Avg episode reward: [(0, '23.410'), (1, '9.790')] [2023-10-12 19:09:15,663][62634] Updated weights for policy 0, policy_version 80520 (0.0008) [2023-10-12 19:09:16,040][62634] Updated weights for policy 0, policy_version 80530 (0.0009) [2023-10-12 19:09:16,415][62634] Updated weights for policy 0, policy_version 80540 (0.0009) [2023-10-12 19:09:16,614][62635] Updated weights for policy 1, policy_version 80550 (0.0009) [2023-10-12 19:09:16,976][62635] Updated weights for policy 1, policy_version 80560 (0.0008) [2023-10-12 19:09:17,341][62635] Updated weights for policy 1, policy_version 80570 (0.0007) [2023-10-12 19:09:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 164986880. Throughput: 0: 1677.0, 1: 1665.9. Samples: 41253922. Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:18,435][61643] Avg episode reward: [(0, '23.620'), (1, '9.700')] [2023-10-12 19:09:20,485][62634] Updated weights for policy 0, policy_version 80550 (0.0009) [2023-10-12 19:09:20,866][62634] Updated weights for policy 0, policy_version 80560 (0.0010) [2023-10-12 19:09:21,250][62634] Updated weights for policy 0, policy_version 80570 (0.0009) [2023-10-12 19:09:21,384][62635] Updated weights for policy 1, policy_version 80580 (0.0007) [2023-10-12 19:09:21,754][62635] Updated weights for policy 1, policy_version 80590 (0.0007) [2023-10-12 19:09:22,125][62635] Updated weights for policy 1, policy_version 80600 (0.0008) [2023-10-12 19:09:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165052416. Throughput: 0: 1662.7, 1: 1680.7. Samples: 41264926. 
Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:23,435][61643] Avg episode reward: [(0, '23.360'), (1, '9.730')] [2023-10-12 19:09:25,492][62634] Updated weights for policy 0, policy_version 80580 (0.0010) [2023-10-12 19:09:25,867][62634] Updated weights for policy 0, policy_version 80590 (0.0009) [2023-10-12 19:09:26,247][62634] Updated weights for policy 0, policy_version 80600 (0.0010) [2023-10-12 19:09:26,304][62635] Updated weights for policy 1, policy_version 80610 (0.0009) [2023-10-12 19:09:26,674][62635] Updated weights for policy 1, policy_version 80620 (0.0010) [2023-10-12 19:09:27,044][62635] Updated weights for policy 1, policy_version 80630 (0.0010) [2023-10-12 19:09:27,418][62635] Updated weights for policy 1, policy_version 80640 (0.0008) [2023-10-12 19:09:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165117952. Throughput: 0: 1664.7, 1: 1665.3. Samples: 41284122. Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:28,436][61643] Avg episode reward: [(0, '23.440'), (1, '9.820')] [2023-10-12 19:09:30,419][62634] Updated weights for policy 0, policy_version 80610 (0.0007) [2023-10-12 19:09:30,805][62634] Updated weights for policy 0, policy_version 80620 (0.0009) [2023-10-12 19:09:31,185][62634] Updated weights for policy 0, policy_version 80630 (0.0009) [2023-10-12 19:09:31,513][62635] Updated weights for policy 1, policy_version 80650 (0.0007) [2023-10-12 19:09:31,571][62634] Updated weights for policy 0, policy_version 80640 (0.0008) [2023-10-12 19:09:31,884][62635] Updated weights for policy 1, policy_version 80660 (0.0008) [2023-10-12 19:09:32,252][62635] Updated weights for policy 1, policy_version 80670 (0.0010) [2023-10-12 19:09:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 165183488. Throughput: 0: 1675.0, 1: 1677.8. Samples: 41304300. 
Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:33,436][61643] Avg episode reward: [(0, '23.850'), (1, '9.830')] [2023-10-12 19:09:35,408][62634] Updated weights for policy 0, policy_version 80650 (0.0010) [2023-10-12 19:09:35,788][62634] Updated weights for policy 0, policy_version 80660 (0.0008) [2023-10-12 19:09:36,170][62634] Updated weights for policy 0, policy_version 80670 (0.0009) [2023-10-12 19:09:36,311][62635] Updated weights for policy 1, policy_version 80680 (0.0009) [2023-10-12 19:09:36,674][62635] Updated weights for policy 1, policy_version 80690 (0.0008) [2023-10-12 19:09:37,046][62635] Updated weights for policy 1, policy_version 80700 (0.0008) [2023-10-12 19:09:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165249024. Throughput: 0: 1657.3, 1: 1683.9. Samples: 41315170. Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:38,436][61643] Avg episode reward: [(0, '23.840'), (1, '9.910')] [2023-10-12 19:09:40,173][62634] Updated weights for policy 0, policy_version 80680 (0.0007) [2023-10-12 19:09:40,548][62634] Updated weights for policy 0, policy_version 80690 (0.0007) [2023-10-12 19:09:40,925][62634] Updated weights for policy 0, policy_version 80700 (0.0007) [2023-10-12 19:09:41,108][62635] Updated weights for policy 1, policy_version 80710 (0.0008) [2023-10-12 19:09:41,475][62635] Updated weights for policy 1, policy_version 80720 (0.0010) [2023-10-12 19:09:41,839][62635] Updated weights for policy 1, policy_version 80730 (0.0010) [2023-10-12 19:09:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165314560. Throughput: 0: 1671.8, 1: 1657.7. Samples: 41334298. 
Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:43,436][61643] Avg episode reward: [(0, '24.020'), (1, '9.880')] [2023-10-12 19:09:44,842][62634] Updated weights for policy 0, policy_version 80710 (0.0007) [2023-10-12 19:09:45,226][62634] Updated weights for policy 0, policy_version 80720 (0.0007) [2023-10-12 19:09:45,617][62634] Updated weights for policy 0, policy_version 80730 (0.0008) [2023-10-12 19:09:45,910][62635] Updated weights for policy 1, policy_version 80740 (0.0010) [2023-10-12 19:09:46,274][62635] Updated weights for policy 1, policy_version 80750 (0.0008) [2023-10-12 19:09:46,642][62635] Updated weights for policy 1, policy_version 80760 (0.0008) [2023-10-12 19:09:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165380096. Throughput: 0: 1683.7, 1: 1679.6. Samples: 41354970. Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:48,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.960')] [2023-10-12 19:09:49,617][62634] Updated weights for policy 0, policy_version 80740 (0.0008) [2023-10-12 19:09:49,997][62634] Updated weights for policy 0, policy_version 80750 (0.0010) [2023-10-12 19:09:50,369][62634] Updated weights for policy 0, policy_version 80760 (0.0008) [2023-10-12 19:09:50,858][62635] Updated weights for policy 1, policy_version 80770 (0.0007) [2023-10-12 19:09:51,270][62635] Updated weights for policy 1, policy_version 80780 (0.0007) [2023-10-12 19:09:51,632][62635] Updated weights for policy 1, policy_version 80790 (0.0007) [2023-10-12 19:09:51,999][62635] Updated weights for policy 1, policy_version 80800 (0.0009) [2023-10-12 19:09:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165445632. Throughput: 0: 1657.7, 1: 1672.6. Samples: 41364850. 
Policy #0 lag: (min: 10.0, avg: 10.6, max: 26.0) [2023-10-12 19:09:53,436][61643] Avg episode reward: [(0, '24.320'), (1, '9.800')] [2023-10-12 19:09:54,603][62634] Updated weights for policy 0, policy_version 80770 (0.0007) [2023-10-12 19:09:54,988][62634] Updated weights for policy 0, policy_version 80780 (0.0008) [2023-10-12 19:09:55,367][62634] Updated weights for policy 0, policy_version 80790 (0.0009) [2023-10-12 19:09:55,752][62634] Updated weights for policy 0, policy_version 80800 (0.0008) [2023-10-12 19:09:55,982][62635] Updated weights for policy 1, policy_version 80810 (0.0010) [2023-10-12 19:09:56,359][62635] Updated weights for policy 1, policy_version 80820 (0.0009) [2023-10-12 19:09:56,724][62635] Updated weights for policy 1, policy_version 80830 (0.0009) [2023-10-12 19:09:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165511168. Throughput: 0: 1690.8, 1: 1658.0. Samples: 41384546. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:09:58,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.870')] [2023-10-12 19:09:59,805][62634] Updated weights for policy 0, policy_version 80810 (0.0007) [2023-10-12 19:10:00,172][62634] Updated weights for policy 0, policy_version 80820 (0.0008) [2023-10-12 19:10:00,546][62634] Updated weights for policy 0, policy_version 80830 (0.0009) [2023-10-12 19:10:00,756][62635] Updated weights for policy 1, policy_version 80840 (0.0008) [2023-10-12 19:10:01,129][62635] Updated weights for policy 1, policy_version 80850 (0.0007) [2023-10-12 19:10:01,494][62635] Updated weights for policy 1, policy_version 80860 (0.0009) [2023-10-12 19:10:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165576704. Throughput: 0: 1688.6, 1: 1672.4. Samples: 41405168. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:03,435][61643] Avg episode reward: [(0, '24.890'), (1, '9.870')] [2023-10-12 19:10:04,649][62634] Updated weights for policy 0, policy_version 80840 (0.0007) [2023-10-12 19:10:05,028][62634] Updated weights for policy 0, policy_version 80850 (0.0007) [2023-10-12 19:10:05,406][62634] Updated weights for policy 0, policy_version 80860 (0.0008) [2023-10-12 19:10:05,647][62635] Updated weights for policy 1, policy_version 80870 (0.0009) [2023-10-12 19:10:06,015][62635] Updated weights for policy 1, policy_version 80880 (0.0010) [2023-10-12 19:10:06,376][62635] Updated weights for policy 1, policy_version 80890 (0.0009) [2023-10-12 19:10:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165642240. Throughput: 0: 1673.0, 1: 1660.4. Samples: 41414928. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:08,436][61643] Avg episode reward: [(0, '24.750'), (1, '9.790')] [2023-10-12 19:10:09,429][62634] Updated weights for policy 0, policy_version 80870 (0.0008) [2023-10-12 19:10:09,806][62634] Updated weights for policy 0, policy_version 80880 (0.0008) [2023-10-12 19:10:10,180][62634] Updated weights for policy 0, policy_version 80890 (0.0011) [2023-10-12 19:10:10,523][62635] Updated weights for policy 1, policy_version 80900 (0.0009) [2023-10-12 19:10:10,902][62635] Updated weights for policy 1, policy_version 80910 (0.0010) [2023-10-12 19:10:11,263][62635] Updated weights for policy 1, policy_version 80920 (0.0008) [2023-10-12 19:10:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165707776. Throughput: 0: 1686.7, 1: 1660.0. Samples: 41434724. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:13,436][61643] Avg episode reward: [(0, '24.780'), (1, '9.690')] [2023-10-12 19:10:14,328][62634] Updated weights for policy 0, policy_version 80900 (0.0010) [2023-10-12 19:10:14,705][62634] Updated weights for policy 0, policy_version 80910 (0.0009) [2023-10-12 19:10:15,073][62634] Updated weights for policy 0, policy_version 80920 (0.0008) [2023-10-12 19:10:15,273][62635] Updated weights for policy 1, policy_version 80930 (0.0008) [2023-10-12 19:10:15,639][62635] Updated weights for policy 1, policy_version 80940 (0.0009) [2023-10-12 19:10:16,021][62635] Updated weights for policy 1, policy_version 80950 (0.0007) [2023-10-12 19:10:16,397][62635] Updated weights for policy 1, policy_version 80960 (0.0009) [2023-10-12 19:10:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165773312. Throughput: 0: 1680.8, 1: 1671.3. Samples: 41455142. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:18,436][61643] Avg episode reward: [(0, '24.790'), (1, '9.680')] [2023-10-12 19:10:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000080928_82870272.pth... [2023-10-12 19:10:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000080960_82903040.pth... 
[2023-10-12 19:10:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000079392_81297408.pth [2023-10-12 19:10:18,489][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000079360_81264640.pth [2023-10-12 19:10:19,213][62634] Updated weights for policy 0, policy_version 80930 (0.0007) [2023-10-12 19:10:19,588][62634] Updated weights for policy 0, policy_version 80940 (0.0010) [2023-10-12 19:10:19,970][62634] Updated weights for policy 0, policy_version 80950 (0.0010) [2023-10-12 19:10:20,339][62634] Updated weights for policy 0, policy_version 80960 (0.0010) [2023-10-12 19:10:20,472][62635] Updated weights for policy 1, policy_version 80970 (0.0009) [2023-10-12 19:10:20,842][62635] Updated weights for policy 1, policy_version 80980 (0.0009) [2023-10-12 19:10:21,212][62635] Updated weights for policy 1, policy_version 80990 (0.0009) [2023-10-12 19:10:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165838848. Throughput: 0: 1670.0, 1: 1650.1. Samples: 41464578. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:23,436][61643] Avg episode reward: [(0, '24.740'), (1, '9.550')] [2023-10-12 19:10:24,245][62634] Updated weights for policy 0, policy_version 80970 (0.0009) [2023-10-12 19:10:24,629][62634] Updated weights for policy 0, policy_version 80980 (0.0010) [2023-10-12 19:10:25,003][62634] Updated weights for policy 0, policy_version 80990 (0.0010) [2023-10-12 19:10:25,336][62635] Updated weights for policy 1, policy_version 81000 (0.0008) [2023-10-12 19:10:25,700][62635] Updated weights for policy 1, policy_version 81010 (0.0010) [2023-10-12 19:10:26,061][62635] Updated weights for policy 1, policy_version 81020 (0.0008) [2023-10-12 19:10:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165904384. Throughput: 0: 1676.8, 1: 1672.8. Samples: 41485032. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:28,435][61643] Avg episode reward: [(0, '24.640'), (1, '9.740')] [2023-10-12 19:10:29,078][62634] Updated weights for policy 0, policy_version 81000 (0.0009) [2023-10-12 19:10:29,456][62634] Updated weights for policy 0, policy_version 81010 (0.0009) [2023-10-12 19:10:29,839][62634] Updated weights for policy 0, policy_version 81020 (0.0007) [2023-10-12 19:10:30,242][62635] Updated weights for policy 1, policy_version 81030 (0.0009) [2023-10-12 19:10:30,613][62635] Updated weights for policy 1, policy_version 81040 (0.0008) [2023-10-12 19:10:30,979][62635] Updated weights for policy 1, policy_version 81050 (0.0008) [2023-10-12 19:10:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165969920. Throughput: 0: 1681.2, 1: 1672.0. Samples: 41505864. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:33,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.740')] [2023-10-12 19:10:33,871][62634] Updated weights for policy 0, policy_version 81030 (0.0008) [2023-10-12 19:10:34,243][62634] Updated weights for policy 0, policy_version 81040 (0.0009) [2023-10-12 19:10:34,624][62634] Updated weights for policy 0, policy_version 81050 (0.0010) [2023-10-12 19:10:35,017][62635] Updated weights for policy 1, policy_version 81060 (0.0009) [2023-10-12 19:10:35,410][62635] Updated weights for policy 1, policy_version 81070 (0.0009) [2023-10-12 19:10:35,774][62635] Updated weights for policy 1, policy_version 81080 (0.0007) [2023-10-12 19:10:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166035456. Throughput: 0: 1685.0, 1: 1652.3. Samples: 41515028. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:38,436][61643] Avg episode reward: [(0, '24.790'), (1, '9.620')] [2023-10-12 19:10:38,728][62634] Updated weights for policy 0, policy_version 81060 (0.0008) [2023-10-12 19:10:39,100][62634] Updated weights for policy 0, policy_version 81070 (0.0010) [2023-10-12 19:10:39,479][62634] Updated weights for policy 0, policy_version 81080 (0.0008) [2023-10-12 19:10:39,842][62635] Updated weights for policy 1, policy_version 81090 (0.0008) [2023-10-12 19:10:40,214][62635] Updated weights for policy 1, policy_version 81100 (0.0008) [2023-10-12 19:10:40,571][62635] Updated weights for policy 1, policy_version 81110 (0.0010) [2023-10-12 19:10:40,949][62635] Updated weights for policy 1, policy_version 81120 (0.0009) [2023-10-12 19:10:43,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166100992. Throughput: 0: 1675.4, 1: 1676.7. Samples: 41535392. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:43,436][61643] Avg episode reward: [(0, '24.780'), (1, '9.760')] [2023-10-12 19:10:43,675][62634] Updated weights for policy 0, policy_version 81090 (0.0010) [2023-10-12 19:10:44,064][62634] Updated weights for policy 0, policy_version 81100 (0.0008) [2023-10-12 19:10:44,437][62634] Updated weights for policy 0, policy_version 81110 (0.0008) [2023-10-12 19:10:44,820][62634] Updated weights for policy 0, policy_version 81120 (0.0007) [2023-10-12 19:10:45,080][62635] Updated weights for policy 1, policy_version 81130 (0.0009) [2023-10-12 19:10:45,441][62635] Updated weights for policy 1, policy_version 81140 (0.0008) [2023-10-12 19:10:45,810][62635] Updated weights for policy 1, policy_version 81150 (0.0010) [2023-10-12 19:10:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 166166528. Throughput: 0: 1672.6, 1: 1681.9. Samples: 41556122. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-12 19:10:48,436][61643] Avg episode reward: [(0, '24.840'), (1, '10.220')] [2023-10-12 19:10:49,024][62634] Updated weights for policy 0, policy_version 81130 (0.0008) [2023-10-12 19:10:49,393][62634] Updated weights for policy 0, policy_version 81140 (0.0009) [2023-10-12 19:10:49,774][62634] Updated weights for policy 0, policy_version 81150 (0.0010) [2023-10-12 19:10:49,924][62635] Updated weights for policy 1, policy_version 81160 (0.0007) [2023-10-12 19:10:50,294][62635] Updated weights for policy 1, policy_version 81170 (0.0008) [2023-10-12 19:10:50,668][62635] Updated weights for policy 1, policy_version 81180 (0.0008) [2023-10-12 19:10:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166232064. Throughput: 0: 1673.7, 1: 1665.6. Samples: 41565194. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:10:53,436][61643] Avg episode reward: [(0, '24.840'), (1, '9.830')] [2023-10-12 19:10:53,732][62634] Updated weights for policy 0, policy_version 81160 (0.0008) [2023-10-12 19:10:54,116][62634] Updated weights for policy 0, policy_version 81170 (0.0009) [2023-10-12 19:10:54,488][62634] Updated weights for policy 0, policy_version 81180 (0.0007) [2023-10-12 19:10:54,585][62635] Updated weights for policy 1, policy_version 81190 (0.0007) [2023-10-12 19:10:54,962][62635] Updated weights for policy 1, policy_version 81200 (0.0008) [2023-10-12 19:10:55,335][62635] Updated weights for policy 1, policy_version 81210 (0.0008) [2023-10-12 19:10:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166297600. Throughput: 0: 1680.3, 1: 1680.7. Samples: 41585970. 
Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:10:58,436][61643] Avg episode reward: [(0, '24.860'), (1, '9.940')] [2023-10-12 19:10:58,466][62634] Updated weights for policy 0, policy_version 81190 (0.0008) [2023-10-12 19:10:58,838][62634] Updated weights for policy 0, policy_version 81200 (0.0009) [2023-10-12 19:10:59,210][62634] Updated weights for policy 0, policy_version 81210 (0.0009) [2023-10-12 19:10:59,468][62635] Updated weights for policy 1, policy_version 81220 (0.0010) [2023-10-12 19:10:59,834][62635] Updated weights for policy 1, policy_version 81230 (0.0008) [2023-10-12 19:11:00,197][62635] Updated weights for policy 1, policy_version 81240 (0.0008) [2023-10-12 19:11:03,280][62634] Updated weights for policy 0, policy_version 81220 (0.0009) [2023-10-12 19:11:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166363136. Throughput: 0: 1689.3, 1: 1677.3. Samples: 41606636. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:03,435][61643] Avg episode reward: [(0, '24.730'), (1, '10.060')] [2023-10-12 19:11:03,652][62634] Updated weights for policy 0, policy_version 81230 (0.0009) [2023-10-12 19:11:04,038][62634] Updated weights for policy 0, policy_version 81240 (0.0010) [2023-10-12 19:11:04,117][62635] Updated weights for policy 1, policy_version 81250 (0.0008) [2023-10-12 19:11:04,486][62635] Updated weights for policy 1, policy_version 81260 (0.0008) [2023-10-12 19:11:04,858][62635] Updated weights for policy 1, policy_version 81270 (0.0007) [2023-10-12 19:11:05,232][62635] Updated weights for policy 1, policy_version 81280 (0.0009) [2023-10-12 19:11:08,117][62634] Updated weights for policy 0, policy_version 81250 (0.0010) [2023-10-12 19:11:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166428672. Throughput: 0: 1686.1, 1: 1671.8. Samples: 41615684. 
Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:08,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.780')] [2023-10-12 19:11:08,483][62634] Updated weights for policy 0, policy_version 81260 (0.0008) [2023-10-12 19:11:08,869][62634] Updated weights for policy 0, policy_version 81270 (0.0008) [2023-10-12 19:11:09,238][62634] Updated weights for policy 0, policy_version 81280 (0.0009) [2023-10-12 19:11:09,318][62635] Updated weights for policy 1, policy_version 81290 (0.0010) [2023-10-12 19:11:09,689][62635] Updated weights for policy 1, policy_version 81300 (0.0008) [2023-10-12 19:11:10,058][62635] Updated weights for policy 1, policy_version 81310 (0.0011) [2023-10-12 19:11:13,320][62634] Updated weights for policy 0, policy_version 81290 (0.0010) [2023-10-12 19:11:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 166494208. Throughput: 0: 1682.5, 1: 1676.8. Samples: 41636202. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:13,435][61643] Avg episode reward: [(0, '24.600'), (1, '9.860')] [2023-10-12 19:11:13,706][62634] Updated weights for policy 0, policy_version 81300 (0.0012) [2023-10-12 19:11:14,077][62634] Updated weights for policy 0, policy_version 81310 (0.0007) [2023-10-12 19:11:14,165][62635] Updated weights for policy 1, policy_version 81320 (0.0008) [2023-10-12 19:11:14,527][62635] Updated weights for policy 1, policy_version 81330 (0.0007) [2023-10-12 19:11:14,891][62635] Updated weights for policy 1, policy_version 81340 (0.0007) [2023-10-12 19:11:18,081][62634] Updated weights for policy 0, policy_version 81320 (0.0007) [2023-10-12 19:11:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 166559744. Throughput: 0: 1676.4, 1: 1676.3. Samples: 41656736. 
Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:18,436][61643] Avg episode reward: [(0, '24.860'), (1, '9.950')] [2023-10-12 19:11:18,466][62634] Updated weights for policy 0, policy_version 81330 (0.0008) [2023-10-12 19:11:18,837][62634] Updated weights for policy 0, policy_version 81340 (0.0009) [2023-10-12 19:11:19,201][62635] Updated weights for policy 1, policy_version 81350 (0.0009) [2023-10-12 19:11:19,566][62635] Updated weights for policy 1, policy_version 81360 (0.0010) [2023-10-12 19:11:19,939][62635] Updated weights for policy 1, policy_version 81370 (0.0009) [2023-10-12 19:11:23,019][62634] Updated weights for policy 0, policy_version 81350 (0.0007) [2023-10-12 19:11:23,398][62634] Updated weights for policy 0, policy_version 81360 (0.0007) [2023-10-12 19:11:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166625280. Throughput: 0: 1678.5, 1: 1676.7. Samples: 41666010. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:23,435][61643] Avg episode reward: [(0, '24.930'), (1, '9.860')] [2023-10-12 19:11:23,771][62634] Updated weights for policy 0, policy_version 81370 (0.0008) [2023-10-12 19:11:24,157][62635] Updated weights for policy 1, policy_version 81380 (0.0009) [2023-10-12 19:11:24,542][62635] Updated weights for policy 1, policy_version 81390 (0.0007) [2023-10-12 19:11:24,911][62635] Updated weights for policy 1, policy_version 81400 (0.0009) [2023-10-12 19:11:27,810][62634] Updated weights for policy 0, policy_version 81380 (0.0010) [2023-10-12 19:11:28,175][62634] Updated weights for policy 0, policy_version 81390 (0.0008) [2023-10-12 19:11:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166690816. Throughput: 0: 1673.9, 1: 1680.4. Samples: 41686336. 
Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:28,435][61643] Avg episode reward: [(0, '24.930'), (1, '9.930')] [2023-10-12 19:11:28,561][62634] Updated weights for policy 0, policy_version 81400 (0.0010) [2023-10-12 19:11:28,784][62635] Updated weights for policy 1, policy_version 81410 (0.0008) [2023-10-12 19:11:29,166][62635] Updated weights for policy 1, policy_version 81420 (0.0010) [2023-10-12 19:11:29,526][62635] Updated weights for policy 1, policy_version 81430 (0.0009) [2023-10-12 19:11:29,897][62635] Updated weights for policy 1, policy_version 81440 (0.0008) [2023-10-12 19:11:32,718][62634] Updated weights for policy 0, policy_version 81410 (0.0009) [2023-10-12 19:11:33,089][62634] Updated weights for policy 0, policy_version 81420 (0.0007) [2023-10-12 19:11:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166756352. Throughput: 0: 1663.4, 1: 1679.0. Samples: 41706532. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:33,435][61643] Avg episode reward: [(0, '24.860'), (1, '10.110')] [2023-10-12 19:11:33,468][62634] Updated weights for policy 0, policy_version 81430 (0.0008) [2023-10-12 19:11:33,836][62634] Updated weights for policy 0, policy_version 81440 (0.0009) [2023-10-12 19:11:33,960][62635] Updated weights for policy 1, policy_version 81450 (0.0011) [2023-10-12 19:11:34,329][62635] Updated weights for policy 1, policy_version 81460 (0.0007) [2023-10-12 19:11:34,709][62635] Updated weights for policy 1, policy_version 81470 (0.0007) [2023-10-12 19:11:37,872][62634] Updated weights for policy 0, policy_version 81450 (0.0007) [2023-10-12 19:11:38,258][62634] Updated weights for policy 0, policy_version 81460 (0.0007) [2023-10-12 19:11:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 166821888. Throughput: 0: 1669.3, 1: 1681.0. Samples: 41715960. 
Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:38,436][61643] Avg episode reward: [(0, '25.020'), (1, '9.930')] [2023-10-12 19:11:38,623][62634] Updated weights for policy 0, policy_version 81470 (0.0008) [2023-10-12 19:11:38,763][62635] Updated weights for policy 1, policy_version 81480 (0.0008) [2023-10-12 19:11:39,136][62635] Updated weights for policy 1, policy_version 81490 (0.0007) [2023-10-12 19:11:39,500][62635] Updated weights for policy 1, policy_version 81500 (0.0008) [2023-10-12 19:11:42,596][62634] Updated weights for policy 0, policy_version 81480 (0.0009) [2023-10-12 19:11:42,974][62634] Updated weights for policy 0, policy_version 81490 (0.0008) [2023-10-12 19:11:43,355][62634] Updated weights for policy 0, policy_version 81500 (0.0007) [2023-10-12 19:11:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 166887424. Throughput: 0: 1670.3, 1: 1679.0. Samples: 41736688. Policy #0 lag: (min: 8.0, avg: 35.3, max: 40.0) [2023-10-12 19:11:43,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.840')] [2023-10-12 19:11:43,505][62635] Updated weights for policy 1, policy_version 81510 (0.0007) [2023-10-12 19:11:43,873][62635] Updated weights for policy 1, policy_version 81520 (0.0008) [2023-10-12 19:11:44,248][62635] Updated weights for policy 1, policy_version 81530 (0.0009) [2023-10-12 19:11:47,473][62634] Updated weights for policy 0, policy_version 81510 (0.0009) [2023-10-12 19:11:47,840][62634] Updated weights for policy 0, policy_version 81520 (0.0011) [2023-10-12 19:11:48,230][62634] Updated weights for policy 0, policy_version 81530 (0.0009) [2023-10-12 19:11:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 166952960. Throughput: 0: 1649.2, 1: 1680.3. Samples: 41756462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:11:48,435][61643] Avg episode reward: [(0, '24.680'), (1, '10.040')] [2023-10-12 19:11:48,564][62635] Updated weights for policy 1, policy_version 81540 (0.0009) [2023-10-12 19:11:48,930][62635] Updated weights for policy 1, policy_version 81550 (0.0009) [2023-10-12 19:11:49,296][62635] Updated weights for policy 1, policy_version 81560 (0.0009) [2023-10-12 19:11:52,284][62634] Updated weights for policy 0, policy_version 81540 (0.0008) [2023-10-12 19:11:52,660][62634] Updated weights for policy 0, policy_version 81550 (0.0007) [2023-10-12 19:11:53,035][62634] Updated weights for policy 0, policy_version 81560 (0.0007) [2023-10-12 19:11:53,362][62635] Updated weights for policy 1, policy_version 81570 (0.0010) [2023-10-12 19:11:53,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167051264. Throughput: 0: 1672.4, 1: 1674.0. Samples: 41766270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:11:53,435][61643] Avg episode reward: [(0, '24.320'), (1, '9.950')] [2023-10-12 19:11:53,739][62635] Updated weights for policy 1, policy_version 81580 (0.0010) [2023-10-12 19:11:54,101][62635] Updated weights for policy 1, policy_version 81590 (0.0007) [2023-10-12 19:11:54,475][62635] Updated weights for policy 1, policy_version 81600 (0.0009) [2023-10-12 19:11:57,034][62634] Updated weights for policy 0, policy_version 81570 (0.0008) [2023-10-12 19:11:57,410][62634] Updated weights for policy 0, policy_version 81580 (0.0009) [2023-10-12 19:11:57,793][62634] Updated weights for policy 0, policy_version 81590 (0.0009) [2023-10-12 19:11:58,175][62634] Updated weights for policy 0, policy_version 81600 (0.0007) [2023-10-12 19:11:58,391][62635] Updated weights for policy 1, policy_version 81610 (0.0009) [2023-10-12 19:11:58,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167116800. Throughput: 0: 1674.9, 1: 1680.4. 
Samples: 41787190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:11:58,435][61643] Avg episode reward: [(0, '24.280'), (1, '10.040')] [2023-10-12 19:11:58,750][62635] Updated weights for policy 1, policy_version 81620 (0.0010) [2023-10-12 19:11:59,112][62635] Updated weights for policy 1, policy_version 81630 (0.0011) [2023-10-12 19:12:02,298][62634] Updated weights for policy 0, policy_version 81610 (0.0008) [2023-10-12 19:12:02,673][62634] Updated weights for policy 0, policy_version 81620 (0.0008) [2023-10-12 19:12:03,061][62634] Updated weights for policy 0, policy_version 81630 (0.0009) [2023-10-12 19:12:03,320][62635] Updated weights for policy 1, policy_version 81640 (0.0010) [2023-10-12 19:12:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167182336. Throughput: 0: 1654.5, 1: 1681.0. Samples: 41806834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:03,435][61643] Avg episode reward: [(0, '24.220'), (1, '9.950')] [2023-10-12 19:12:03,677][62635] Updated weights for policy 1, policy_version 81650 (0.0010) [2023-10-12 19:12:04,042][62635] Updated weights for policy 1, policy_version 81660 (0.0010) [2023-10-12 19:12:06,904][62634] Updated weights for policy 0, policy_version 81640 (0.0008) [2023-10-12 19:12:07,266][62634] Updated weights for policy 0, policy_version 81650 (0.0010) [2023-10-12 19:12:07,643][62634] Updated weights for policy 0, policy_version 81660 (0.0007) [2023-10-12 19:12:07,936][62635] Updated weights for policy 1, policy_version 81670 (0.0010) [2023-10-12 19:12:08,305][62635] Updated weights for policy 1, policy_version 81680 (0.0009) [2023-10-12 19:12:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167247872. Throughput: 0: 1674.4, 1: 1678.8. Samples: 41816900. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:08,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.950')] [2023-10-12 19:12:08,668][62635] Updated weights for policy 1, policy_version 81690 (0.0009) [2023-10-12 19:12:11,932][62634] Updated weights for policy 0, policy_version 81670 (0.0009) [2023-10-12 19:12:12,320][62634] Updated weights for policy 0, policy_version 81680 (0.0010) [2023-10-12 19:12:12,696][62634] Updated weights for policy 0, policy_version 81690 (0.0008) [2023-10-12 19:12:12,805][62635] Updated weights for policy 1, policy_version 81700 (0.0007) [2023-10-12 19:12:13,194][62635] Updated weights for policy 1, policy_version 81710 (0.0008) [2023-10-12 19:12:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167313408. Throughput: 0: 1671.3, 1: 1686.1. Samples: 41837420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:13,435][61643] Avg episode reward: [(0, '24.250'), (1, '9.990')] [2023-10-12 19:12:13,562][62635] Updated weights for policy 1, policy_version 81720 (0.0009) [2023-10-12 19:12:16,630][62634] Updated weights for policy 0, policy_version 81700 (0.0008) [2023-10-12 19:12:17,010][62634] Updated weights for policy 0, policy_version 81710 (0.0010) [2023-10-12 19:12:17,395][62634] Updated weights for policy 0, policy_version 81720 (0.0010) [2023-10-12 19:12:17,577][62635] Updated weights for policy 1, policy_version 81730 (0.0008) [2023-10-12 19:12:17,944][62635] Updated weights for policy 1, policy_version 81740 (0.0009) [2023-10-12 19:12:18,303][62635] Updated weights for policy 1, policy_version 81750 (0.0011) [2023-10-12 19:12:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167378944. Throughput: 0: 1662.7, 1: 1670.9. Samples: 41856548. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:18,436][61643] Avg episode reward: [(0, '24.250'), (1, '10.010')] [2023-10-12 19:12:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000081728_83689472.pth... [2023-10-12 19:12:18,482][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000080160_82083840.pth [2023-10-12 19:12:18,668][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000081760_83722240.pth... [2023-10-12 19:12:18,668][62635] Updated weights for policy 1, policy_version 81760 (0.0010) [2023-10-12 19:12:18,705][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000080192_82116608.pth [2023-10-12 19:12:21,403][62634] Updated weights for policy 0, policy_version 81730 (0.0009) [2023-10-12 19:12:21,785][62634] Updated weights for policy 0, policy_version 81740 (0.0007) [2023-10-12 19:12:22,164][62634] Updated weights for policy 0, policy_version 81750 (0.0007) [2023-10-12 19:12:22,549][62634] Updated weights for policy 0, policy_version 81760 (0.0008) [2023-10-12 19:12:22,778][62635] Updated weights for policy 1, policy_version 81770 (0.0009) [2023-10-12 19:12:23,153][62635] Updated weights for policy 1, policy_version 81780 (0.0009) [2023-10-12 19:12:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167444480. Throughput: 0: 1685.3, 1: 1678.5. Samples: 41867332. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:23,435][61643] Avg episode reward: [(0, '24.320'), (1, '9.760')] [2023-10-12 19:12:23,521][62635] Updated weights for policy 1, policy_version 81790 (0.0008) [2023-10-12 19:12:26,681][62634] Updated weights for policy 0, policy_version 81770 (0.0008) [2023-10-12 19:12:27,055][62634] Updated weights for policy 0, policy_version 81780 (0.0009) [2023-10-12 19:12:27,433][62634] Updated weights for policy 0, policy_version 81790 (0.0008) [2023-10-12 19:12:27,632][62635] Updated weights for policy 1, policy_version 81800 (0.0008) [2023-10-12 19:12:27,992][62635] Updated weights for policy 1, policy_version 81810 (0.0010) [2023-10-12 19:12:28,364][62635] Updated weights for policy 1, policy_version 81820 (0.0010) [2023-10-12 19:12:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167510016. Throughput: 0: 1665.1, 1: 1685.5. Samples: 41887468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:28,436][61643] Avg episode reward: [(0, '24.430'), (1, '9.830')] [2023-10-12 19:12:31,483][62634] Updated weights for policy 0, policy_version 81800 (0.0007) [2023-10-12 19:12:31,859][62634] Updated weights for policy 0, policy_version 81810 (0.0009) [2023-10-12 19:12:32,227][62634] Updated weights for policy 0, policy_version 81820 (0.0008) [2023-10-12 19:12:32,265][62635] Updated weights for policy 1, policy_version 81830 (0.0007) [2023-10-12 19:12:32,621][62635] Updated weights for policy 1, policy_version 81840 (0.0009) [2023-10-12 19:12:32,989][62635] Updated weights for policy 1, policy_version 81850 (0.0007) [2023-10-12 19:12:33,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 167608320. Throughput: 0: 1670.8, 1: 1667.0. Samples: 41906662. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:12:33,435][61643] Avg episode reward: [(0, '24.390'), (1, '10.010')] [2023-10-12 19:12:36,260][62634] Updated weights for policy 0, policy_version 81830 (0.0009) [2023-10-12 19:12:36,645][62634] Updated weights for policy 0, policy_version 81840 (0.0007) [2023-10-12 19:12:37,021][62634] Updated weights for policy 0, policy_version 81850 (0.0008) [2023-10-12 19:12:37,076][62635] Updated weights for policy 1, policy_version 81860 (0.0008) [2023-10-12 19:12:37,443][62635] Updated weights for policy 1, policy_version 81870 (0.0007) [2023-10-12 19:12:37,819][62635] Updated weights for policy 1, policy_version 81880 (0.0007) [2023-10-12 19:12:38,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 167673856. Throughput: 0: 1683.3, 1: 1693.2. Samples: 41918214. Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:12:38,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.820')] [2023-10-12 19:12:41,029][62634] Updated weights for policy 0, policy_version 81860 (0.0007) [2023-10-12 19:12:41,410][62634] Updated weights for policy 0, policy_version 81870 (0.0010) [2023-10-12 19:12:41,796][62634] Updated weights for policy 0, policy_version 81880 (0.0008) [2023-10-12 19:12:41,852][62635] Updated weights for policy 1, policy_version 81890 (0.0010) [2023-10-12 19:12:42,211][62635] Updated weights for policy 1, policy_version 81900 (0.0009) [2023-10-12 19:12:42,586][62635] Updated weights for policy 1, policy_version 81910 (0.0010) [2023-10-12 19:12:42,956][62635] Updated weights for policy 1, policy_version 81920 (0.0008) [2023-10-12 19:12:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 167739392. Throughput: 0: 1659.7, 1: 1686.3. Samples: 41937760. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:12:43,436][61643] Avg episode reward: [(0, '24.150'), (1, '9.810')] [2023-10-12 19:12:45,827][62634] Updated weights for policy 0, policy_version 81890 (0.0010) [2023-10-12 19:12:46,204][62634] Updated weights for policy 0, policy_version 81900 (0.0009) [2023-10-12 19:12:46,585][62634] Updated weights for policy 0, policy_version 81910 (0.0010) [2023-10-12 19:12:46,960][62634] Updated weights for policy 0, policy_version 81920 (0.0008) [2023-10-12 19:12:47,099][62635] Updated weights for policy 1, policy_version 81930 (0.0007) [2023-10-12 19:12:47,464][62635] Updated weights for policy 1, policy_version 81940 (0.0007) [2023-10-12 19:12:47,828][62635] Updated weights for policy 1, policy_version 81950 (0.0007) [2023-10-12 19:12:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 167804928. Throughput: 0: 1681.2, 1: 1663.9. Samples: 41957362. Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:12:48,436][61643] Avg episode reward: [(0, '24.310'), (1, '10.080')] [2023-10-12 19:12:51,242][62634] Updated weights for policy 0, policy_version 81930 (0.0008) [2023-10-12 19:12:51,611][62634] Updated weights for policy 0, policy_version 81940 (0.0008) [2023-10-12 19:12:51,841][62635] Updated weights for policy 1, policy_version 81960 (0.0008) [2023-10-12 19:12:51,995][62634] Updated weights for policy 0, policy_version 81950 (0.0009) [2023-10-12 19:12:52,209][62635] Updated weights for policy 1, policy_version 81970 (0.0008) [2023-10-12 19:12:52,578][62635] Updated weights for policy 1, policy_version 81980 (0.0007) [2023-10-12 19:12:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 167870464. Throughput: 0: 1681.4, 1: 1695.2. Samples: 41968846. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:12:53,436][61643] Avg episode reward: [(0, '24.220'), (1, '9.860')] [2023-10-12 19:12:56,064][62634] Updated weights for policy 0, policy_version 81960 (0.0008) [2023-10-12 19:12:56,440][62634] Updated weights for policy 0, policy_version 81970 (0.0009) [2023-10-12 19:12:56,773][62635] Updated weights for policy 1, policy_version 81990 (0.0008) [2023-10-12 19:12:56,812][62634] Updated weights for policy 0, policy_version 81980 (0.0008) [2023-10-12 19:12:57,154][62635] Updated weights for policy 1, policy_version 82000 (0.0007) [2023-10-12 19:12:57,522][62635] Updated weights for policy 1, policy_version 82010 (0.0009) [2023-10-12 19:12:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 167936000. Throughput: 0: 1670.8, 1: 1676.3. Samples: 41988040. Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:12:58,436][61643] Avg episode reward: [(0, '24.540'), (1, '9.930')] [2023-10-12 19:13:00,900][62634] Updated weights for policy 0, policy_version 81990 (0.0008) [2023-10-12 19:13:01,289][62634] Updated weights for policy 0, policy_version 82000 (0.0007) [2023-10-12 19:13:01,521][62635] Updated weights for policy 1, policy_version 82020 (0.0009) [2023-10-12 19:13:01,668][62634] Updated weights for policy 0, policy_version 82010 (0.0007) [2023-10-12 19:13:01,886][62635] Updated weights for policy 1, policy_version 82030 (0.0007) [2023-10-12 19:13:02,245][62635] Updated weights for policy 1, policy_version 82040 (0.0009) [2023-10-12 19:13:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168001536. Throughput: 0: 1688.5, 1: 1673.0. Samples: 42007814. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:13:03,436][61643] Avg episode reward: [(0, '24.770'), (1, '9.950')] [2023-10-12 19:13:05,656][62634] Updated weights for policy 0, policy_version 82020 (0.0010) [2023-10-12 19:13:06,041][62634] Updated weights for policy 0, policy_version 82030 (0.0010) [2023-10-12 19:13:06,347][62635] Updated weights for policy 1, policy_version 82050 (0.0007) [2023-10-12 19:13:06,410][62634] Updated weights for policy 0, policy_version 82040 (0.0008) [2023-10-12 19:13:06,715][62635] Updated weights for policy 1, policy_version 82060 (0.0008) [2023-10-12 19:13:07,076][62635] Updated weights for policy 1, policy_version 82070 (0.0010) [2023-10-12 19:13:07,444][62635] Updated weights for policy 1, policy_version 82080 (0.0010) [2023-10-12 19:13:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168067072. Throughput: 0: 1678.0, 1: 1692.4. Samples: 42018998. Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:13:08,436][61643] Avg episode reward: [(0, '25.340'), (1, '9.840')] [2023-10-12 19:13:10,486][62634] Updated weights for policy 0, policy_version 82050 (0.0008) [2023-10-12 19:13:10,866][62634] Updated weights for policy 0, policy_version 82060 (0.0009) [2023-10-12 19:13:11,248][62634] Updated weights for policy 0, policy_version 82070 (0.0011) [2023-10-12 19:13:11,552][62635] Updated weights for policy 1, policy_version 82090 (0.0008) [2023-10-12 19:13:11,628][62634] Updated weights for policy 0, policy_version 82080 (0.0009) [2023-10-12 19:13:11,922][62635] Updated weights for policy 1, policy_version 82100 (0.0010) [2023-10-12 19:13:12,295][62635] Updated weights for policy 1, policy_version 82110 (0.0010) [2023-10-12 19:13:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168132608. Throughput: 0: 1671.8, 1: 1668.6. Samples: 42037786. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:13:13,435][61643] Avg episode reward: [(0, '25.530'), (1, '9.850')] [2023-10-12 19:13:15,713][62634] Updated weights for policy 0, policy_version 82090 (0.0007) [2023-10-12 19:13:16,081][62634] Updated weights for policy 0, policy_version 82100 (0.0009) [2023-10-12 19:13:16,308][62635] Updated weights for policy 1, policy_version 82120 (0.0009) [2023-10-12 19:13:16,456][62634] Updated weights for policy 0, policy_version 82110 (0.0009) [2023-10-12 19:13:16,676][62635] Updated weights for policy 1, policy_version 82130 (0.0009) [2023-10-12 19:13:17,041][62635] Updated weights for policy 1, policy_version 82140 (0.0009) [2023-10-12 19:13:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 168198144. Throughput: 0: 1689.3, 1: 1681.3. Samples: 42058338. Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:13:18,435][61643] Avg episode reward: [(0, '25.540'), (1, '10.150')] [2023-10-12 19:13:20,314][62634] Updated weights for policy 0, policy_version 82120 (0.0007) [2023-10-12 19:13:20,692][62634] Updated weights for policy 0, policy_version 82130 (0.0008) [2023-10-12 19:13:21,078][62634] Updated weights for policy 0, policy_version 82140 (0.0007) [2023-10-12 19:13:21,113][62635] Updated weights for policy 1, policy_version 82150 (0.0009) [2023-10-12 19:13:21,477][62635] Updated weights for policy 1, policy_version 82160 (0.0007) [2023-10-12 19:13:21,855][62635] Updated weights for policy 1, policy_version 82170 (0.0008) [2023-10-12 19:13:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168263680. Throughput: 0: 1665.3, 1: 1681.4. Samples: 42068816. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:13:23,436][61643] Avg episode reward: [(0, '25.670'), (1, '9.900')] [2023-10-12 19:13:25,094][62634] Updated weights for policy 0, policy_version 82150 (0.0008) [2023-10-12 19:13:25,473][62634] Updated weights for policy 0, policy_version 82160 (0.0007) [2023-10-12 19:13:25,855][62635] Updated weights for policy 1, policy_version 82180 (0.0008) [2023-10-12 19:13:25,860][62634] Updated weights for policy 0, policy_version 82170 (0.0007) [2023-10-12 19:13:26,218][62635] Updated weights for policy 1, policy_version 82190 (0.0009) [2023-10-12 19:13:26,586][62635] Updated weights for policy 1, policy_version 82200 (0.0007) [2023-10-12 19:13:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 168329216. Throughput: 0: 1679.6, 1: 1665.3. Samples: 42088280. Policy #0 lag: (min: 0.0, avg: 26.4, max: 32.0) [2023-10-12 19:13:28,435][61643] Avg episode reward: [(0, '25.620'), (1, '9.900')] [2023-10-12 19:13:29,820][62634] Updated weights for policy 0, policy_version 82180 (0.0010) [2023-10-12 19:13:30,196][62634] Updated weights for policy 0, policy_version 82190 (0.0008) [2023-10-12 19:13:30,568][62634] Updated weights for policy 0, policy_version 82200 (0.0007) [2023-10-12 19:13:30,728][62635] Updated weights for policy 1, policy_version 82210 (0.0011) [2023-10-12 19:13:31,093][62635] Updated weights for policy 1, policy_version 82220 (0.0008) [2023-10-12 19:13:31,454][62635] Updated weights for policy 1, policy_version 82230 (0.0008) [2023-10-12 19:13:31,811][62635] Updated weights for policy 1, policy_version 82240 (0.0009) [2023-10-12 19:13:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168394752. Throughput: 0: 1685.2, 1: 1689.0. Samples: 42109202. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:13:33,435][61643] Avg episode reward: [(0, '25.220'), (1, '10.210')] [2023-10-12 19:13:34,584][62634] Updated weights for policy 0, policy_version 82210 (0.0007) [2023-10-12 19:13:34,957][62634] Updated weights for policy 0, policy_version 82220 (0.0007) [2023-10-12 19:13:35,336][62634] Updated weights for policy 0, policy_version 82230 (0.0008) [2023-10-12 19:13:35,712][62634] Updated weights for policy 0, policy_version 82240 (0.0008) [2023-10-12 19:13:35,802][62635] Updated weights for policy 1, policy_version 82250 (0.0008) [2023-10-12 19:13:36,165][62635] Updated weights for policy 1, policy_version 82260 (0.0008) [2023-10-12 19:13:36,536][62635] Updated weights for policy 1, policy_version 82270 (0.0010) [2023-10-12 19:13:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 168460288. Throughput: 0: 1660.8, 1: 1677.7. Samples: 42119080. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:13:38,436][61643] Avg episode reward: [(0, '25.280'), (1, '9.960')] [2023-10-12 19:13:39,905][62634] Updated weights for policy 0, policy_version 82250 (0.0009) [2023-10-12 19:13:40,279][62634] Updated weights for policy 0, policy_version 82260 (0.0009) [2023-10-12 19:13:40,464][62635] Updated weights for policy 1, policy_version 82280 (0.0007) [2023-10-12 19:13:40,654][62634] Updated weights for policy 0, policy_version 82270 (0.0007) [2023-10-12 19:13:40,825][62635] Updated weights for policy 1, policy_version 82290 (0.0008) [2023-10-12 19:13:41,196][62635] Updated weights for policy 1, policy_version 82300 (0.0007) [2023-10-12 19:13:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168525824. Throughput: 0: 1684.4, 1: 1677.0. Samples: 42139302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:13:43,435][61643] Avg episode reward: [(0, '25.450'), (1, '10.050')] [2023-10-12 19:13:44,752][62634] Updated weights for policy 0, policy_version 82280 (0.0007) [2023-10-12 19:13:45,123][62634] Updated weights for policy 0, policy_version 82290 (0.0007) [2023-10-12 19:13:45,264][62635] Updated weights for policy 1, policy_version 82310 (0.0007) [2023-10-12 19:13:45,509][62634] Updated weights for policy 0, policy_version 82300 (0.0007) [2023-10-12 19:13:45,636][62635] Updated weights for policy 1, policy_version 82320 (0.0008) [2023-10-12 19:13:46,007][62635] Updated weights for policy 1, policy_version 82330 (0.0007) [2023-10-12 19:13:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 168591360. Throughput: 0: 1694.4, 1: 1690.3. Samples: 42160126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:13:48,435][61643] Avg episode reward: [(0, '25.410'), (1, '10.050')] [2023-10-12 19:13:49,618][62634] Updated weights for policy 0, policy_version 82310 (0.0008) [2023-10-12 19:13:49,998][62634] Updated weights for policy 0, policy_version 82320 (0.0007) [2023-10-12 19:13:50,078][62635] Updated weights for policy 1, policy_version 82340 (0.0009) [2023-10-12 19:13:50,371][62634] Updated weights for policy 0, policy_version 82330 (0.0009) [2023-10-12 19:13:50,441][62635] Updated weights for policy 1, policy_version 82350 (0.0008) [2023-10-12 19:13:50,809][62635] Updated weights for policy 1, policy_version 82360 (0.0009) [2023-10-12 19:13:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168656896. Throughput: 0: 1672.8, 1: 1664.2. Samples: 42169162. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:13:53,436][61643] Avg episode reward: [(0, '25.360'), (1, '10.050')] [2023-10-12 19:13:54,428][62634] Updated weights for policy 0, policy_version 82340 (0.0009) [2023-10-12 19:13:54,807][62634] Updated weights for policy 0, policy_version 82350 (0.0008) [2023-10-12 19:13:54,903][62635] Updated weights for policy 1, policy_version 82370 (0.0009) [2023-10-12 19:13:55,181][62634] Updated weights for policy 0, policy_version 82360 (0.0008) [2023-10-12 19:13:55,269][62635] Updated weights for policy 1, policy_version 82380 (0.0007) [2023-10-12 19:13:55,631][62635] Updated weights for policy 1, policy_version 82390 (0.0009) [2023-10-12 19:13:55,997][62635] Updated weights for policy 1, policy_version 82400 (0.0008) [2023-10-12 19:13:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168722432. Throughput: 0: 1693.7, 1: 1686.4. Samples: 42189890. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:13:58,436][61643] Avg episode reward: [(0, '25.610'), (1, '9.970')] [2023-10-12 19:13:59,282][62634] Updated weights for policy 0, policy_version 82370 (0.0009) [2023-10-12 19:13:59,664][62634] Updated weights for policy 0, policy_version 82380 (0.0008) [2023-10-12 19:14:00,047][62634] Updated weights for policy 0, policy_version 82390 (0.0009) [2023-10-12 19:14:00,067][62635] Updated weights for policy 1, policy_version 82410 (0.0009) [2023-10-12 19:14:00,409][62634] Updated weights for policy 0, policy_version 82400 (0.0009) [2023-10-12 19:14:00,431][62635] Updated weights for policy 1, policy_version 82420 (0.0007) [2023-10-12 19:14:00,797][62635] Updated weights for policy 1, policy_version 82430 (0.0007) [2023-10-12 19:14:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168787968. Throughput: 0: 1683.9, 1: 1695.2. Samples: 42210398. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:03,435][61643] Avg episode reward: [(0, '25.560'), (1, '9.970')] [2023-10-12 19:14:04,549][62634] Updated weights for policy 0, policy_version 82410 (0.0008) [2023-10-12 19:14:04,875][62635] Updated weights for policy 1, policy_version 82440 (0.0008) [2023-10-12 19:14:04,922][62634] Updated weights for policy 0, policy_version 82420 (0.0009) [2023-10-12 19:14:05,243][62635] Updated weights for policy 1, policy_version 82450 (0.0008) [2023-10-12 19:14:05,300][62634] Updated weights for policy 0, policy_version 82430 (0.0009) [2023-10-12 19:14:05,608][62635] Updated weights for policy 1, policy_version 82460 (0.0008) [2023-10-12 19:14:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168853504. Throughput: 0: 1672.9, 1: 1674.1. Samples: 42219432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:08,436][61643] Avg episode reward: [(0, '25.480'), (1, '10.060')] [2023-10-12 19:14:09,507][62634] Updated weights for policy 0, policy_version 82440 (0.0008) [2023-10-12 19:14:09,631][62635] Updated weights for policy 1, policy_version 82470 (0.0008) [2023-10-12 19:14:09,889][62634] Updated weights for policy 0, policy_version 82450 (0.0009) [2023-10-12 19:14:09,984][62635] Updated weights for policy 1, policy_version 82480 (0.0008) [2023-10-12 19:14:10,254][62634] Updated weights for policy 0, policy_version 82460 (0.0007) [2023-10-12 19:14:10,343][62635] Updated weights for policy 1, policy_version 82490 (0.0010) [2023-10-12 19:14:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168919040. Throughput: 0: 1678.2, 1: 1693.1. Samples: 42239990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:13,436][61643] Avg episode reward: [(0, '25.450'), (1, '10.060')] [2023-10-12 19:14:14,379][62634] Updated weights for policy 0, policy_version 82470 (0.0007) [2023-10-12 19:14:14,383][62635] Updated weights for policy 1, policy_version 82500 (0.0008) [2023-10-12 19:14:14,753][62635] Updated weights for policy 1, policy_version 82510 (0.0007) [2023-10-12 19:14:14,754][62634] Updated weights for policy 0, policy_version 82480 (0.0007) [2023-10-12 19:14:15,119][62635] Updated weights for policy 1, policy_version 82520 (0.0007) [2023-10-12 19:14:15,121][62634] Updated weights for policy 0, policy_version 82490 (0.0007) [2023-10-12 19:14:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 168984576. Throughput: 0: 1668.5, 1: 1695.7. Samples: 42260590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:18,436][61643] Avg episode reward: [(0, '25.320'), (1, '9.880')] [2023-10-12 19:14:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000082496_84475904.pth... [2023-10-12 19:14:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000082528_84508672.pth... 
[2023-10-12 19:14:18,479][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000080960_82903040.pth [2023-10-12 19:14:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000080928_82870272.pth [2023-10-12 19:14:19,246][62635] Updated weights for policy 1, policy_version 82530 (0.0007) [2023-10-12 19:14:19,262][62634] Updated weights for policy 0, policy_version 82500 (0.0007) [2023-10-12 19:14:19,616][62635] Updated weights for policy 1, policy_version 82540 (0.0009) [2023-10-12 19:14:19,644][62634] Updated weights for policy 0, policy_version 82510 (0.0008) [2023-10-12 19:14:19,986][62635] Updated weights for policy 1, policy_version 82550 (0.0009) [2023-10-12 19:14:20,010][62634] Updated weights for policy 0, policy_version 82520 (0.0008) [2023-10-12 19:14:20,358][62635] Updated weights for policy 1, policy_version 82560 (0.0008) [2023-10-12 19:14:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169050112. Throughput: 0: 1667.9, 1: 1673.0. Samples: 42269420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:23,436][61643] Avg episode reward: [(0, '25.200'), (1, '10.050')] [2023-10-12 19:14:23,992][62634] Updated weights for policy 0, policy_version 82530 (0.0008) [2023-10-12 19:14:24,373][62634] Updated weights for policy 0, policy_version 82540 (0.0010) [2023-10-12 19:14:24,594][62635] Updated weights for policy 1, policy_version 82570 (0.0009) [2023-10-12 19:14:24,738][62634] Updated weights for policy 0, policy_version 82550 (0.0007) [2023-10-12 19:14:24,960][62635] Updated weights for policy 1, policy_version 82580 (0.0007) [2023-10-12 19:14:25,117][62634] Updated weights for policy 0, policy_version 82560 (0.0008) [2023-10-12 19:14:25,332][62635] Updated weights for policy 1, policy_version 82590 (0.0008) [2023-10-12 19:14:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169115648. 
Throughput: 0: 1661.0, 1: 1687.2. Samples: 42289968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:28,436][61643] Avg episode reward: [(0, '25.040'), (1, '9.960')] [2023-10-12 19:14:29,164][62634] Updated weights for policy 0, policy_version 82570 (0.0008) [2023-10-12 19:14:29,278][62635] Updated weights for policy 1, policy_version 82600 (0.0009) [2023-10-12 19:14:29,547][62634] Updated weights for policy 0, policy_version 82580 (0.0009) [2023-10-12 19:14:29,649][62635] Updated weights for policy 1, policy_version 82610 (0.0008) [2023-10-12 19:14:29,918][62634] Updated weights for policy 0, policy_version 82590 (0.0008) [2023-10-12 19:14:30,021][62635] Updated weights for policy 1, policy_version 82620 (0.0007) [2023-10-12 19:14:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169181184. Throughput: 0: 1655.0, 1: 1689.0. Samples: 42310604. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:33,436][61643] Avg episode reward: [(0, '25.080'), (1, '9.870')] [2023-10-12 19:14:33,996][62634] Updated weights for policy 0, policy_version 82600 (0.0007) [2023-10-12 19:14:34,103][62635] Updated weights for policy 1, policy_version 82630 (0.0008) [2023-10-12 19:14:34,373][62634] Updated weights for policy 0, policy_version 82610 (0.0007) [2023-10-12 19:14:34,490][62635] Updated weights for policy 1, policy_version 82640 (0.0007) [2023-10-12 19:14:34,746][62634] Updated weights for policy 0, policy_version 82620 (0.0007) [2023-10-12 19:14:34,865][62635] Updated weights for policy 1, policy_version 82650 (0.0009) [2023-10-12 19:14:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169246720. Throughput: 0: 1657.7, 1: 1684.8. Samples: 42319578. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:38,435][61643] Avg episode reward: [(0, '25.040'), (1, '9.870')] [2023-10-12 19:14:38,770][62634] Updated weights for policy 0, policy_version 82630 (0.0008) [2023-10-12 19:14:38,940][62635] Updated weights for policy 1, policy_version 82660 (0.0008) [2023-10-12 19:14:39,153][62634] Updated weights for policy 0, policy_version 82640 (0.0007) [2023-10-12 19:14:39,297][62635] Updated weights for policy 1, policy_version 82670 (0.0008) [2023-10-12 19:14:39,528][62634] Updated weights for policy 0, policy_version 82650 (0.0009) [2023-10-12 19:14:39,661][62635] Updated weights for policy 1, policy_version 82680 (0.0008) [2023-10-12 19:14:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169312256. Throughput: 0: 1659.5, 1: 1684.7. Samples: 42340376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:43,436][61643] Avg episode reward: [(0, '24.990'), (1, '9.960')] [2023-10-12 19:14:43,639][62634] Updated weights for policy 0, policy_version 82660 (0.0009) [2023-10-12 19:14:43,687][62635] Updated weights for policy 1, policy_version 82690 (0.0008) [2023-10-12 19:14:44,007][62634] Updated weights for policy 0, policy_version 82670 (0.0007) [2023-10-12 19:14:44,061][62635] Updated weights for policy 1, policy_version 82700 (0.0007) [2023-10-12 19:14:44,377][62634] Updated weights for policy 0, policy_version 82680 (0.0007) [2023-10-12 19:14:44,419][62635] Updated weights for policy 1, policy_version 82710 (0.0007) [2023-10-12 19:14:44,794][62635] Updated weights for policy 1, policy_version 82720 (0.0007) [2023-10-12 19:14:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 169377792. Throughput: 0: 1662.9, 1: 1685.4. Samples: 42361074. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:48,436][61643] Avg episode reward: [(0, '24.670'), (1, '9.870')] [2023-10-12 19:14:48,674][62634] Updated weights for policy 0, policy_version 82690 (0.0008) [2023-10-12 19:14:48,899][62635] Updated weights for policy 1, policy_version 82730 (0.0009) [2023-10-12 19:14:49,051][62634] Updated weights for policy 0, policy_version 82700 (0.0008) [2023-10-12 19:14:49,260][62635] Updated weights for policy 1, policy_version 82740 (0.0008) [2023-10-12 19:14:49,421][62634] Updated weights for policy 0, policy_version 82710 (0.0007) [2023-10-12 19:14:49,632][62635] Updated weights for policy 1, policy_version 82750 (0.0008) [2023-10-12 19:14:49,797][62634] Updated weights for policy 0, policy_version 82720 (0.0010) [2023-10-12 19:14:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169443328. Throughput: 0: 1663.3, 1: 1683.1. Samples: 42370020. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:53,436][61643] Avg episode reward: [(0, '24.970'), (1, '10.060')] [2023-10-12 19:14:53,722][62635] Updated weights for policy 1, policy_version 82760 (0.0007) [2023-10-12 19:14:53,905][62634] Updated weights for policy 0, policy_version 82730 (0.0009) [2023-10-12 19:14:54,083][62635] Updated weights for policy 1, policy_version 82770 (0.0007) [2023-10-12 19:14:54,290][62634] Updated weights for policy 0, policy_version 82740 (0.0010) [2023-10-12 19:14:54,454][62635] Updated weights for policy 1, policy_version 82780 (0.0008) [2023-10-12 19:14:54,667][62634] Updated weights for policy 0, policy_version 82750 (0.0009) [2023-10-12 19:14:58,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169508864. Throughput: 0: 1665.7, 1: 1682.3. Samples: 42390646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:14:58,435][61643] Avg episode reward: [(0, '24.960'), (1, '9.900')] [2023-10-12 19:14:58,542][62635] Updated weights for policy 1, policy_version 82790 (0.0007) [2023-10-12 19:14:58,692][62634] Updated weights for policy 0, policy_version 82760 (0.0008) [2023-10-12 19:14:58,903][62635] Updated weights for policy 1, policy_version 82800 (0.0008) [2023-10-12 19:14:59,067][62634] Updated weights for policy 0, policy_version 82770 (0.0008) [2023-10-12 19:14:59,283][62635] Updated weights for policy 1, policy_version 82810 (0.0008) [2023-10-12 19:14:59,441][62634] Updated weights for policy 0, policy_version 82780 (0.0009) [2023-10-12 19:15:03,321][62634] Updated weights for policy 0, policy_version 82790 (0.0009) [2023-10-12 19:15:03,392][62635] Updated weights for policy 1, policy_version 82820 (0.0008) [2023-10-12 19:15:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169574400. Throughput: 0: 1671.4, 1: 1680.5. Samples: 42411426. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:15:03,435][61643] Avg episode reward: [(0, '24.880'), (1, '9.810')] [2023-10-12 19:15:03,697][62634] Updated weights for policy 0, policy_version 82800 (0.0009) [2023-10-12 19:15:03,765][62635] Updated weights for policy 1, policy_version 82830 (0.0007) [2023-10-12 19:15:04,063][62634] Updated weights for policy 0, policy_version 82810 (0.0007) [2023-10-12 19:15:04,135][62635] Updated weights for policy 1, policy_version 82840 (0.0008) [2023-10-12 19:15:08,126][62634] Updated weights for policy 0, policy_version 82820 (0.0007) [2023-10-12 19:15:08,164][62635] Updated weights for policy 1, policy_version 82850 (0.0010) [2023-10-12 19:15:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169639936. Throughput: 0: 1670.6, 1: 1687.9. Samples: 42420552. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:15:08,435][61643] Avg episode reward: [(0, '24.890'), (1, '9.900')] [2023-10-12 19:15:08,505][62634] Updated weights for policy 0, policy_version 82830 (0.0009) [2023-10-12 19:15:08,528][62635] Updated weights for policy 1, policy_version 82860 (0.0007) [2023-10-12 19:15:08,871][62634] Updated weights for policy 0, policy_version 82840 (0.0009) [2023-10-12 19:15:08,904][62635] Updated weights for policy 1, policy_version 82870 (0.0010) [2023-10-12 19:15:09,269][62635] Updated weights for policy 1, policy_version 82880 (0.0008) [2023-10-12 19:15:12,982][62634] Updated weights for policy 0, policy_version 82850 (0.0007) [2023-10-12 19:15:13,349][62635] Updated weights for policy 1, policy_version 82890 (0.0008) [2023-10-12 19:15:13,355][62634] Updated weights for policy 0, policy_version 82860 (0.0007) [2023-10-12 19:15:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169705472. Throughput: 0: 1676.9, 1: 1685.9. Samples: 42441294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:15:13,435][61643] Avg episode reward: [(0, '24.720'), (1, '9.810')] [2023-10-12 19:15:13,711][62635] Updated weights for policy 1, policy_version 82900 (0.0007) [2023-10-12 19:15:13,723][62634] Updated weights for policy 0, policy_version 82870 (0.0007) [2023-10-12 19:15:14,072][62635] Updated weights for policy 1, policy_version 82910 (0.0007) [2023-10-12 19:15:14,099][62634] Updated weights for policy 0, policy_version 82880 (0.0007) [2023-10-12 19:15:18,007][62635] Updated weights for policy 1, policy_version 82920 (0.0007) [2023-10-12 19:15:18,161][62634] Updated weights for policy 0, policy_version 82890 (0.0009) [2023-10-12 19:15:18,382][62635] Updated weights for policy 1, policy_version 82930 (0.0008) [2023-10-12 19:15:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 169771008. Throughput: 0: 1671.2, 1: 1684.6. 
Samples: 42461618. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:15:18,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.810')] [2023-10-12 19:15:18,532][62634] Updated weights for policy 0, policy_version 82900 (0.0009) [2023-10-12 19:15:18,745][62635] Updated weights for policy 1, policy_version 82940 (0.0008) [2023-10-12 19:15:18,905][62634] Updated weights for policy 0, policy_version 82910 (0.0009) [2023-10-12 19:15:22,858][62635] Updated weights for policy 1, policy_version 82950 (0.0009) [2023-10-12 19:15:22,943][62634] Updated weights for policy 0, policy_version 82920 (0.0008) [2023-10-12 19:15:23,241][62635] Updated weights for policy 1, policy_version 82960 (0.0010) [2023-10-12 19:15:23,323][62634] Updated weights for policy 0, policy_version 82930 (0.0008) [2023-10-12 19:15:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 169836544. Throughput: 0: 1673.7, 1: 1694.7. Samples: 42471154. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:23,436][61643] Avg episode reward: [(0, '24.700'), (1, '10.080')] [2023-10-12 19:15:23,605][62635] Updated weights for policy 1, policy_version 82970 (0.0007) [2023-10-12 19:15:23,692][62634] Updated weights for policy 0, policy_version 82940 (0.0008) [2023-10-12 19:15:27,633][62635] Updated weights for policy 1, policy_version 82980 (0.0008) [2023-10-12 19:15:27,886][62634] Updated weights for policy 0, policy_version 82950 (0.0009) [2023-10-12 19:15:27,991][62635] Updated weights for policy 1, policy_version 82990 (0.0007) [2023-10-12 19:15:28,258][62634] Updated weights for policy 0, policy_version 82960 (0.0008) [2023-10-12 19:15:28,361][62635] Updated weights for policy 1, policy_version 83000 (0.0009) [2023-10-12 19:15:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169902080. Throughput: 0: 1673.2, 1: 1689.7. Samples: 42491706. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:28,436][61643] Avg episode reward: [(0, '24.790'), (1, '9.930')] [2023-10-12 19:15:28,632][62634] Updated weights for policy 0, policy_version 82970 (0.0009) [2023-10-12 19:15:32,405][62635] Updated weights for policy 1, policy_version 83010 (0.0009) [2023-10-12 19:15:32,554][62634] Updated weights for policy 0, policy_version 82980 (0.0008) [2023-10-12 19:15:32,770][62635] Updated weights for policy 1, policy_version 83020 (0.0007) [2023-10-12 19:15:32,934][62634] Updated weights for policy 0, policy_version 82990 (0.0007) [2023-10-12 19:15:33,148][62635] Updated weights for policy 1, policy_version 83030 (0.0007) [2023-10-12 19:15:33,306][62634] Updated weights for policy 0, policy_version 83000 (0.0008) [2023-10-12 19:15:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169967616. Throughput: 0: 1667.2, 1: 1670.8. Samples: 42511286. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:33,436][61643] Avg episode reward: [(0, '24.880'), (1, '9.880')] [2023-10-12 19:15:33,513][62635] Updated weights for policy 1, policy_version 83040 (0.0009) [2023-10-12 19:15:37,331][62634] Updated weights for policy 0, policy_version 83010 (0.0008) [2023-10-12 19:15:37,363][62635] Updated weights for policy 1, policy_version 83050 (0.0008) [2023-10-12 19:15:37,704][62634] Updated weights for policy 0, policy_version 83020 (0.0009) [2023-10-12 19:15:37,730][62635] Updated weights for policy 1, policy_version 83060 (0.0009) [2023-10-12 19:15:38,073][62634] Updated weights for policy 0, policy_version 83030 (0.0009) [2023-10-12 19:15:38,107][62635] Updated weights for policy 1, policy_version 83070 (0.0008) [2023-10-12 19:15:38,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 170065920. Throughput: 0: 1683.9, 1: 1689.8. Samples: 42521836. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:38,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.970')] [2023-10-12 19:15:38,448][62634] Updated weights for policy 0, policy_version 83040 (0.0010) [2023-10-12 19:15:42,231][62635] Updated weights for policy 1, policy_version 83080 (0.0008) [2023-10-12 19:15:42,599][62635] Updated weights for policy 1, policy_version 83090 (0.0007) [2023-10-12 19:15:42,659][62634] Updated weights for policy 0, policy_version 83050 (0.0008) [2023-10-12 19:15:42,959][62635] Updated weights for policy 1, policy_version 83100 (0.0007) [2023-10-12 19:15:43,033][62634] Updated weights for policy 0, policy_version 83060 (0.0008) [2023-10-12 19:15:43,419][62634] Updated weights for policy 0, policy_version 83070 (0.0009) [2023-10-12 19:15:43,435][61643] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 170131456. Throughput: 0: 1682.0, 1: 1683.5. Samples: 42542096. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:43,435][61643] Avg episode reward: [(0, '24.670'), (1, '9.830')] [2023-10-12 19:15:47,169][62635] Updated weights for policy 1, policy_version 83110 (0.0009) [2023-10-12 19:15:47,531][62635] Updated weights for policy 1, policy_version 83120 (0.0010) [2023-10-12 19:15:47,551][62634] Updated weights for policy 0, policy_version 83080 (0.0009) [2023-10-12 19:15:47,898][62635] Updated weights for policy 1, policy_version 83130 (0.0008) [2023-10-12 19:15:47,938][62634] Updated weights for policy 0, policy_version 83090 (0.0009) [2023-10-12 19:15:48,304][62634] Updated weights for policy 0, policy_version 83100 (0.0010) [2023-10-12 19:15:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 170196992. Throughput: 0: 1665.4, 1: 1660.9. Samples: 42561108. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:48,436][61643] Avg episode reward: [(0, '24.820'), (1, '9.920')] [2023-10-12 19:15:52,063][62635] Updated weights for policy 1, policy_version 83140 (0.0008) [2023-10-12 19:15:52,369][62634] Updated weights for policy 0, policy_version 83110 (0.0009) [2023-10-12 19:15:52,418][62635] Updated weights for policy 1, policy_version 83150 (0.0009) [2023-10-12 19:15:52,737][62634] Updated weights for policy 0, policy_version 83120 (0.0009) [2023-10-12 19:15:52,787][62635] Updated weights for policy 1, policy_version 83160 (0.0008) [2023-10-12 19:15:53,104][62634] Updated weights for policy 0, policy_version 83130 (0.0008) [2023-10-12 19:15:53,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 170295296. Throughput: 0: 1679.9, 1: 1683.3. Samples: 42571896. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:53,435][61643] Avg episode reward: [(0, '25.030'), (1, '9.920')] [2023-10-12 19:15:56,917][62635] Updated weights for policy 1, policy_version 83170 (0.0009) [2023-10-12 19:15:57,247][62634] Updated weights for policy 0, policy_version 83140 (0.0010) [2023-10-12 19:15:57,295][62635] Updated weights for policy 1, policy_version 83180 (0.0007) [2023-10-12 19:15:57,619][62634] Updated weights for policy 0, policy_version 83150 (0.0008) [2023-10-12 19:15:57,669][62635] Updated weights for policy 1, policy_version 83190 (0.0010) [2023-10-12 19:15:57,993][62634] Updated weights for policy 0, policy_version 83160 (0.0008) [2023-10-12 19:15:58,035][62635] Updated weights for policy 1, policy_version 83200 (0.0008) [2023-10-12 19:15:58,435][61643] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 170360832. Throughput: 0: 1675.5, 1: 1678.8. Samples: 42592238. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:15:58,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.910')] [2023-10-12 19:16:01,914][62634] Updated weights for policy 0, policy_version 83170 (0.0008) [2023-10-12 19:16:01,945][62635] Updated weights for policy 1, policy_version 83210 (0.0009) [2023-10-12 19:16:02,283][62634] Updated weights for policy 0, policy_version 83180 (0.0008) [2023-10-12 19:16:02,304][62635] Updated weights for policy 1, policy_version 83220 (0.0007) [2023-10-12 19:16:02,662][62634] Updated weights for policy 0, policy_version 83190 (0.0007) [2023-10-12 19:16:02,675][62635] Updated weights for policy 1, policy_version 83230 (0.0008) [2023-10-12 19:16:03,045][62634] Updated weights for policy 0, policy_version 83200 (0.0007) [2023-10-12 19:16:03,435][61643] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 170426368. Throughput: 0: 1659.7, 1: 1660.3. Samples: 42611016. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:16:03,436][61643] Avg episode reward: [(0, '24.540'), (1, '10.000')] [2023-10-12 19:16:06,831][62635] Updated weights for policy 1, policy_version 83240 (0.0008) [2023-10-12 19:16:06,936][62634] Updated weights for policy 0, policy_version 83210 (0.0008) [2023-10-12 19:16:07,198][62635] Updated weights for policy 1, policy_version 83250 (0.0007) [2023-10-12 19:16:07,316][62634] Updated weights for policy 0, policy_version 83220 (0.0010) [2023-10-12 19:16:07,572][62635] Updated weights for policy 1, policy_version 83260 (0.0007) [2023-10-12 19:16:07,685][62634] Updated weights for policy 0, policy_version 83230 (0.0009) [2023-10-12 19:16:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 170491904. Throughput: 0: 1686.5, 1: 1676.2. Samples: 42622474. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-12 19:16:08,436][61643] Avg episode reward: [(0, '24.340'), (1, '9.910')] [2023-10-12 19:16:11,732][62635] Updated weights for policy 1, policy_version 83270 (0.0007) [2023-10-12 19:16:11,949][62634] Updated weights for policy 0, policy_version 83240 (0.0007) [2023-10-12 19:16:12,114][62635] Updated weights for policy 1, policy_version 83280 (0.0010) [2023-10-12 19:16:12,333][62634] Updated weights for policy 0, policy_version 83250 (0.0008) [2023-10-12 19:16:12,483][62635] Updated weights for policy 1, policy_version 83290 (0.0008) [2023-10-12 19:16:12,698][62634] Updated weights for policy 0, policy_version 83260 (0.0008) [2023-10-12 19:16:13,435][61643] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 170557440. Throughput: 0: 1675.0, 1: 1667.2. Samples: 42642106. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:13,435][61643] Avg episode reward: [(0, '24.310'), (1, '9.780')] [2023-10-12 19:16:16,462][62635] Updated weights for policy 1, policy_version 83300 (0.0008) [2023-10-12 19:16:16,755][62634] Updated weights for policy 0, policy_version 83270 (0.0008) [2023-10-12 19:16:16,830][62635] Updated weights for policy 1, policy_version 83310 (0.0008) [2023-10-12 19:16:17,124][62634] Updated weights for policy 0, policy_version 83280 (0.0009) [2023-10-12 19:16:17,193][62635] Updated weights for policy 1, policy_version 83320 (0.0007) [2023-10-12 19:16:17,502][62634] Updated weights for policy 0, policy_version 83290 (0.0009) [2023-10-12 19:16:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 170622976. Throughput: 0: 1662.0, 1: 1667.0. Samples: 42661092. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:18,436][61643] Avg episode reward: [(0, '24.230'), (1, '9.770')] [2023-10-12 19:16:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000083296_85295104.pth... 
[2023-10-12 19:16:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000083328_85327872.pth... [2023-10-12 19:16:18,483][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000081728_83689472.pth [2023-10-12 19:16:18,485][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000081760_83722240.pth [2023-10-12 19:16:21,231][62635] Updated weights for policy 1, policy_version 83330 (0.0009) [2023-10-12 19:16:21,575][62634] Updated weights for policy 0, policy_version 83300 (0.0009) [2023-10-12 19:16:21,602][62635] Updated weights for policy 1, policy_version 83340 (0.0008) [2023-10-12 19:16:21,951][62634] Updated weights for policy 0, policy_version 83310 (0.0008) [2023-10-12 19:16:21,960][62635] Updated weights for policy 1, policy_version 83350 (0.0009) [2023-10-12 19:16:22,324][62635] Updated weights for policy 1, policy_version 83360 (0.0008) [2023-10-12 19:16:22,326][62634] Updated weights for policy 0, policy_version 83320 (0.0008) [2023-10-12 19:16:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 170688512. Throughput: 0: 1672.8, 1: 1677.7. Samples: 42672608. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:23,435][61643] Avg episode reward: [(0, '24.140'), (1, '9.840')] [2023-10-12 19:16:26,430][62634] Updated weights for policy 0, policy_version 83330 (0.0009) [2023-10-12 19:16:26,454][62635] Updated weights for policy 1, policy_version 83370 (0.0010) [2023-10-12 19:16:26,806][62634] Updated weights for policy 0, policy_version 83340 (0.0007) [2023-10-12 19:16:26,814][62635] Updated weights for policy 1, policy_version 83380 (0.0008) [2023-10-12 19:16:27,187][62634] Updated weights for policy 0, policy_version 83350 (0.0009) [2023-10-12 19:16:27,187][62635] Updated weights for policy 1, policy_version 83390 (0.0008) [2023-10-12 19:16:27,570][62634] Updated weights for policy 0, policy_version 83360 (0.0009) [2023-10-12 19:16:28,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 170754048. Throughput: 0: 1663.0, 1: 1664.8. Samples: 42691850. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:28,436][61643] Avg episode reward: [(0, '24.350'), (1, '9.670')] [2023-10-12 19:16:31,455][62635] Updated weights for policy 1, policy_version 83400 (0.0008) [2023-10-12 19:16:31,751][62634] Updated weights for policy 0, policy_version 83370 (0.0008) [2023-10-12 19:16:31,818][62635] Updated weights for policy 1, policy_version 83410 (0.0007) [2023-10-12 19:16:32,132][62634] Updated weights for policy 0, policy_version 83380 (0.0007) [2023-10-12 19:16:32,185][62635] Updated weights for policy 1, policy_version 83420 (0.0008) [2023-10-12 19:16:32,507][62634] Updated weights for policy 0, policy_version 83390 (0.0007) [2023-10-12 19:16:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 170819584. Throughput: 0: 1661.2, 1: 1677.6. Samples: 42711354. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:33,436][61643] Avg episode reward: [(0, '24.390'), (1, '9.540')] [2023-10-12 19:16:36,321][62635] Updated weights for policy 1, policy_version 83430 (0.0009) [2023-10-12 19:16:36,649][62634] Updated weights for policy 0, policy_version 83400 (0.0007) [2023-10-12 19:16:36,689][62635] Updated weights for policy 1, policy_version 83440 (0.0009) [2023-10-12 19:16:37,022][62634] Updated weights for policy 0, policy_version 83410 (0.0008) [2023-10-12 19:16:37,051][62635] Updated weights for policy 1, policy_version 83450 (0.0008) [2023-10-12 19:16:37,395][62634] Updated weights for policy 0, policy_version 83420 (0.0008) [2023-10-12 19:16:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 170885120. Throughput: 0: 1672.4, 1: 1680.8. Samples: 42722788. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:38,435][61643] Avg episode reward: [(0, '24.300'), (1, '9.720')] [2023-10-12 19:16:41,126][62635] Updated weights for policy 1, policy_version 83460 (0.0010) [2023-10-12 19:16:41,448][62634] Updated weights for policy 0, policy_version 83430 (0.0007) [2023-10-12 19:16:41,487][62635] Updated weights for policy 1, policy_version 83470 (0.0008) [2023-10-12 19:16:41,819][62634] Updated weights for policy 0, policy_version 83440 (0.0009) [2023-10-12 19:16:41,855][62635] Updated weights for policy 1, policy_version 83480 (0.0007) [2023-10-12 19:16:42,196][62634] Updated weights for policy 0, policy_version 83450 (0.0008) [2023-10-12 19:16:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 170950656. Throughput: 0: 1662.3, 1: 1659.5. Samples: 42741718. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:43,436][61643] Avg episode reward: [(0, '23.950'), (1, '9.700')] [2023-10-12 19:16:45,645][62635] Updated weights for policy 1, policy_version 83490 (0.0008) [2023-10-12 19:16:46,019][62635] Updated weights for policy 1, policy_version 83500 (0.0009) [2023-10-12 19:16:46,272][62634] Updated weights for policy 0, policy_version 83460 (0.0009) [2023-10-12 19:16:46,383][62635] Updated weights for policy 1, policy_version 83510 (0.0010) [2023-10-12 19:16:46,649][62634] Updated weights for policy 0, policy_version 83470 (0.0007) [2023-10-12 19:16:46,753][62635] Updated weights for policy 1, policy_version 83520 (0.0008) [2023-10-12 19:16:47,038][62634] Updated weights for policy 0, policy_version 83480 (0.0008) [2023-10-12 19:16:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171016192. Throughput: 0: 1670.3, 1: 1682.3. Samples: 42761882. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:48,436][61643] Avg episode reward: [(0, '23.890'), (1, '9.910')] [2023-10-12 19:16:50,877][62635] Updated weights for policy 1, policy_version 83530 (0.0008) [2023-10-12 19:16:51,176][62634] Updated weights for policy 0, policy_version 83490 (0.0010) [2023-10-12 19:16:51,249][62635] Updated weights for policy 1, policy_version 83540 (0.0007) [2023-10-12 19:16:51,549][62634] Updated weights for policy 0, policy_version 83500 (0.0007) [2023-10-12 19:16:51,619][62635] Updated weights for policy 1, policy_version 83550 (0.0008) [2023-10-12 19:16:51,931][62634] Updated weights for policy 0, policy_version 83510 (0.0008) [2023-10-12 19:16:52,306][62634] Updated weights for policy 0, policy_version 83520 (0.0007) [2023-10-12 19:16:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171081728. Throughput: 0: 1670.3, 1: 1671.4. Samples: 42772850. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:53,435][61643] Avg episode reward: [(0, '23.520'), (1, '9.860')] [2023-10-12 19:16:55,795][62635] Updated weights for policy 1, policy_version 83560 (0.0009) [2023-10-12 19:16:56,163][62635] Updated weights for policy 1, policy_version 83570 (0.0009) [2023-10-12 19:16:56,327][62634] Updated weights for policy 0, policy_version 83530 (0.0009) [2023-10-12 19:16:56,529][62635] Updated weights for policy 1, policy_version 83580 (0.0008) [2023-10-12 19:16:56,700][62634] Updated weights for policy 0, policy_version 83540 (0.0008) [2023-10-12 19:16:57,073][62634] Updated weights for policy 0, policy_version 83550 (0.0010) [2023-10-12 19:16:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171147264. Throughput: 0: 1656.9, 1: 1663.2. Samples: 42791510. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:16:58,436][61643] Avg episode reward: [(0, '23.350'), (1, '9.830')] [2023-10-12 19:17:00,614][62635] Updated weights for policy 1, policy_version 83590 (0.0009) [2023-10-12 19:17:01,007][62635] Updated weights for policy 1, policy_version 83600 (0.0009) [2023-10-12 19:17:01,094][62634] Updated weights for policy 0, policy_version 83560 (0.0008) [2023-10-12 19:17:01,379][62635] Updated weights for policy 1, policy_version 83610 (0.0008) [2023-10-12 19:17:01,467][62634] Updated weights for policy 0, policy_version 83570 (0.0008) [2023-10-12 19:17:01,838][62634] Updated weights for policy 0, policy_version 83580 (0.0010) [2023-10-12 19:17:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 171212800. Throughput: 0: 1671.7, 1: 1679.9. Samples: 42811912. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) [2023-10-12 19:17:03,435][61643] Avg episode reward: [(0, '23.290'), (1, '9.720')] [2023-10-12 19:17:05,342][62635] Updated weights for policy 1, policy_version 83620 (0.0009) [2023-10-12 19:17:05,708][62635] Updated weights for policy 1, policy_version 83630 (0.0008) [2023-10-12 19:17:05,910][62634] Updated weights for policy 0, policy_version 83590 (0.0008) [2023-10-12 19:17:06,080][62635] Updated weights for policy 1, policy_version 83640 (0.0008) [2023-10-12 19:17:06,294][62634] Updated weights for policy 0, policy_version 83600 (0.0008) [2023-10-12 19:17:06,669][62634] Updated weights for policy 0, policy_version 83610 (0.0009) [2023-10-12 19:17:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171278336. Throughput: 0: 1666.6, 1: 1660.6. Samples: 42822334. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:08,435][61643] Avg episode reward: [(0, '23.350'), (1, '9.920')] [2023-10-12 19:17:10,191][62635] Updated weights for policy 1, policy_version 83650 (0.0009) [2023-10-12 19:17:10,562][62635] Updated weights for policy 1, policy_version 83660 (0.0010) [2023-10-12 19:17:10,714][62634] Updated weights for policy 0, policy_version 83620 (0.0009) [2023-10-12 19:17:10,926][62635] Updated weights for policy 1, policy_version 83670 (0.0008) [2023-10-12 19:17:11,082][62634] Updated weights for policy 0, policy_version 83630 (0.0009) [2023-10-12 19:17:11,287][62635] Updated weights for policy 1, policy_version 83680 (0.0007) [2023-10-12 19:17:11,451][62634] Updated weights for policy 0, policy_version 83640 (0.0008) [2023-10-12 19:17:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171343872. Throughput: 0: 1654.8, 1: 1674.4. Samples: 42841660. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:13,435][61643] Avg episode reward: [(0, '23.610'), (1, '9.670')] [2023-10-12 19:17:15,393][62635] Updated weights for policy 1, policy_version 83690 (0.0008) [2023-10-12 19:17:15,515][62634] Updated weights for policy 0, policy_version 83650 (0.0009) [2023-10-12 19:17:15,765][62635] Updated weights for policy 1, policy_version 83700 (0.0007) [2023-10-12 19:17:15,893][62634] Updated weights for policy 0, policy_version 83660 (0.0007) [2023-10-12 19:17:16,129][62635] Updated weights for policy 1, policy_version 83710 (0.0007) [2023-10-12 19:17:16,268][62634] Updated weights for policy 0, policy_version 83670 (0.0010) [2023-10-12 19:17:16,641][62634] Updated weights for policy 0, policy_version 83680 (0.0010) [2023-10-12 19:17:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171409408. Throughput: 0: 1675.9, 1: 1685.0. Samples: 42862594. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:18,436][61643] Avg episode reward: [(0, '23.870'), (1, '9.760')] [2023-10-12 19:17:20,274][62635] Updated weights for policy 1, policy_version 83720 (0.0007) [2023-10-12 19:17:20,543][62634] Updated weights for policy 0, policy_version 83690 (0.0007) [2023-10-12 19:17:20,645][62635] Updated weights for policy 1, policy_version 83730 (0.0007) [2023-10-12 19:17:20,927][62634] Updated weights for policy 0, policy_version 83700 (0.0008) [2023-10-12 19:17:21,001][62635] Updated weights for policy 1, policy_version 83740 (0.0007) [2023-10-12 19:17:21,302][62634] Updated weights for policy 0, policy_version 83710 (0.0007) [2023-10-12 19:17:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 171474944. Throughput: 0: 1666.9, 1: 1661.2. Samples: 42872556. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:23,435][61643] Avg episode reward: [(0, '23.600'), (1, '9.940')] [2023-10-12 19:17:25,135][62635] Updated weights for policy 1, policy_version 83750 (0.0009) [2023-10-12 19:17:25,392][62634] Updated weights for policy 0, policy_version 83720 (0.0009) [2023-10-12 19:17:25,495][62635] Updated weights for policy 1, policy_version 83760 (0.0008) [2023-10-12 19:17:25,777][62634] Updated weights for policy 0, policy_version 83730 (0.0007) [2023-10-12 19:17:25,867][62635] Updated weights for policy 1, policy_version 83770 (0.0009) [2023-10-12 19:17:26,150][62634] Updated weights for policy 0, policy_version 83740 (0.0008) [2023-10-12 19:17:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171540480. Throughput: 0: 1670.4, 1: 1683.3. Samples: 42892636. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:28,436][61643] Avg episode reward: [(0, '23.560'), (1, '9.780')] [2023-10-12 19:17:29,902][62635] Updated weights for policy 1, policy_version 83780 (0.0008) [2023-10-12 19:17:30,206][62634] Updated weights for policy 0, policy_version 83750 (0.0009) [2023-10-12 19:17:30,270][62635] Updated weights for policy 1, policy_version 83790 (0.0008) [2023-10-12 19:17:30,583][62634] Updated weights for policy 0, policy_version 83760 (0.0008) [2023-10-12 19:17:30,634][62635] Updated weights for policy 1, policy_version 83800 (0.0007) [2023-10-12 19:17:30,946][62634] Updated weights for policy 0, policy_version 83770 (0.0009) [2023-10-12 19:17:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171606016. Throughput: 0: 1682.1, 1: 1682.8. Samples: 42913302. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:33,435][61643] Avg episode reward: [(0, '23.770'), (1, '9.890')] [2023-10-12 19:17:34,634][62635] Updated weights for policy 1, policy_version 83810 (0.0009) [2023-10-12 19:17:35,007][62635] Updated weights for policy 1, policy_version 83820 (0.0007) [2023-10-12 19:17:35,020][62634] Updated weights for policy 0, policy_version 83780 (0.0009) [2023-10-12 19:17:35,381][62635] Updated weights for policy 1, policy_version 83830 (0.0007) [2023-10-12 19:17:35,391][62634] Updated weights for policy 0, policy_version 83790 (0.0011) [2023-10-12 19:17:35,740][62635] Updated weights for policy 1, policy_version 83840 (0.0007) [2023-10-12 19:17:35,771][62634] Updated weights for policy 0, policy_version 83800 (0.0007) [2023-10-12 19:17:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171671552. Throughput: 0: 1657.5, 1: 1667.6. Samples: 42922476. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:38,435][61643] Avg episode reward: [(0, '23.780'), (1, '9.970')] [2023-10-12 19:17:39,732][62634] Updated weights for policy 0, policy_version 83810 (0.0008) [2023-10-12 19:17:39,754][62635] Updated weights for policy 1, policy_version 83850 (0.0008) [2023-10-12 19:17:40,112][62634] Updated weights for policy 0, policy_version 83820 (0.0009) [2023-10-12 19:17:40,124][62635] Updated weights for policy 1, policy_version 83860 (0.0008) [2023-10-12 19:17:40,481][62634] Updated weights for policy 0, policy_version 83830 (0.0007) [2023-10-12 19:17:40,482][62635] Updated weights for policy 1, policy_version 83870 (0.0010) [2023-10-12 19:17:40,858][62634] Updated weights for policy 0, policy_version 83840 (0.0008) [2023-10-12 19:17:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171737088. Throughput: 0: 1678.7, 1: 1685.1. Samples: 42942880. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:43,435][61643] Avg episode reward: [(0, '23.850'), (1, '9.890')] [2023-10-12 19:17:44,589][62635] Updated weights for policy 1, policy_version 83880 (0.0009) [2023-10-12 19:17:44,851][62634] Updated weights for policy 0, policy_version 83850 (0.0008) [2023-10-12 19:17:44,956][62635] Updated weights for policy 1, policy_version 83890 (0.0010) [2023-10-12 19:17:45,223][62634] Updated weights for policy 0, policy_version 83860 (0.0008) [2023-10-12 19:17:45,326][62635] Updated weights for policy 1, policy_version 83900 (0.0007) [2023-10-12 19:17:45,603][62634] Updated weights for policy 0, policy_version 83870 (0.0009) [2023-10-12 19:17:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171802624. Throughput: 0: 1685.9, 1: 1682.5. Samples: 42963490. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:48,436][61643] Avg episode reward: [(0, '23.900'), (1, '9.890')] [2023-10-12 19:17:49,719][62635] Updated weights for policy 1, policy_version 83910 (0.0009) [2023-10-12 19:17:49,815][62634] Updated weights for policy 0, policy_version 83880 (0.0008) [2023-10-12 19:17:50,099][62635] Updated weights for policy 1, policy_version 83920 (0.0007) [2023-10-12 19:17:50,185][62634] Updated weights for policy 0, policy_version 83890 (0.0009) [2023-10-12 19:17:50,464][62635] Updated weights for policy 1, policy_version 83930 (0.0008) [2023-10-12 19:17:50,566][62634] Updated weights for policy 0, policy_version 83900 (0.0009) [2023-10-12 19:17:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171868160. Throughput: 0: 1662.6, 1: 1668.9. Samples: 42972252. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:53,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.980')] [2023-10-12 19:17:54,612][62635] Updated weights for policy 1, policy_version 83940 (0.0009) [2023-10-12 19:17:54,724][62634] Updated weights for policy 0, policy_version 83910 (0.0008) [2023-10-12 19:17:54,981][62635] Updated weights for policy 1, policy_version 83950 (0.0007) [2023-10-12 19:17:55,106][62634] Updated weights for policy 0, policy_version 83920 (0.0007) [2023-10-12 19:17:55,344][62635] Updated weights for policy 1, policy_version 83960 (0.0007) [2023-10-12 19:17:55,481][62634] Updated weights for policy 0, policy_version 83930 (0.0008) [2023-10-12 19:17:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 171933696. Throughput: 0: 1685.1, 1: 1676.2. Samples: 42992916. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-12 19:17:58,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.800')] [2023-10-12 19:17:59,390][62635] Updated weights for policy 1, policy_version 83970 (0.0007) [2023-10-12 19:17:59,753][62635] Updated weights for policy 1, policy_version 83980 (0.0008) [2023-10-12 19:17:59,783][62634] Updated weights for policy 0, policy_version 83940 (0.0009) [2023-10-12 19:18:00,125][62635] Updated weights for policy 1, policy_version 83990 (0.0007) [2023-10-12 19:18:00,174][62634] Updated weights for policy 0, policy_version 83950 (0.0010) [2023-10-12 19:18:00,482][62635] Updated weights for policy 1, policy_version 84000 (0.0008) [2023-10-12 19:18:00,564][62634] Updated weights for policy 0, policy_version 83960 (0.0007) [2023-10-12 19:18:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 171999232. Throughput: 0: 1676.5, 1: 1669.6. Samples: 43013168. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:03,436][61643] Avg episode reward: [(0, '24.490'), (1, '10.040')] [2023-10-12 19:18:04,552][62634] Updated weights for policy 0, policy_version 83970 (0.0007) [2023-10-12 19:18:04,635][62635] Updated weights for policy 1, policy_version 84010 (0.0008) [2023-10-12 19:18:04,919][62634] Updated weights for policy 0, policy_version 83980 (0.0008) [2023-10-12 19:18:05,004][62635] Updated weights for policy 1, policy_version 84020 (0.0009) [2023-10-12 19:18:05,299][62634] Updated weights for policy 0, policy_version 83990 (0.0007) [2023-10-12 19:18:05,365][62635] Updated weights for policy 1, policy_version 84030 (0.0008) [2023-10-12 19:18:05,669][62634] Updated weights for policy 0, policy_version 84000 (0.0008) [2023-10-12 19:18:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172064768. Throughput: 0: 1663.5, 1: 1660.9. Samples: 43022156. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:08,435][61643] Avg episode reward: [(0, '24.470'), (1, '10.060')] [2023-10-12 19:18:09,509][62635] Updated weights for policy 1, policy_version 84040 (0.0008) [2023-10-12 19:18:09,821][62634] Updated weights for policy 0, policy_version 84010 (0.0008) [2023-10-12 19:18:09,871][62635] Updated weights for policy 1, policy_version 84050 (0.0008) [2023-10-12 19:18:10,200][62634] Updated weights for policy 0, policy_version 84020 (0.0008) [2023-10-12 19:18:10,244][62635] Updated weights for policy 1, policy_version 84060 (0.0008) [2023-10-12 19:18:10,579][62634] Updated weights for policy 0, policy_version 84030 (0.0008) [2023-10-12 19:18:13,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 172130304. Throughput: 0: 1673.6, 1: 1661.9. Samples: 43042734. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:13,436][61643] Avg episode reward: [(0, '24.900'), (1, '9.880')] [2023-10-12 19:18:14,217][62635] Updated weights for policy 1, policy_version 84070 (0.0008) [2023-10-12 19:18:14,573][62635] Updated weights for policy 1, policy_version 84080 (0.0009) [2023-10-12 19:18:14,577][62634] Updated weights for policy 0, policy_version 84040 (0.0010) [2023-10-12 19:18:14,950][62635] Updated weights for policy 1, policy_version 84090 (0.0007) [2023-10-12 19:18:14,953][62634] Updated weights for policy 0, policy_version 84050 (0.0009) [2023-10-12 19:18:15,324][62634] Updated weights for policy 0, policy_version 84060 (0.0008) [2023-10-12 19:18:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 172195840. Throughput: 0: 1669.1, 1: 1664.0. Samples: 43063296. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:18,436][61643] Avg episode reward: [(0, '24.940'), (1, '10.060')] [2023-10-12 19:18:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000084096_86114304.pth... [2023-10-12 19:18:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000084064_86081536.pth... 
[2023-10-12 19:18:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000082496_84475904.pth [2023-10-12 19:18:18,486][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000082528_84508672.pth [2023-10-12 19:18:19,018][62635] Updated weights for policy 1, policy_version 84100 (0.0008) [2023-10-12 19:18:19,388][62635] Updated weights for policy 1, policy_version 84110 (0.0007) [2023-10-12 19:18:19,465][62634] Updated weights for policy 0, policy_version 84070 (0.0008) [2023-10-12 19:18:19,756][62635] Updated weights for policy 1, policy_version 84120 (0.0008) [2023-10-12 19:18:19,833][62634] Updated weights for policy 0, policy_version 84080 (0.0008) [2023-10-12 19:18:20,213][62634] Updated weights for policy 0, policy_version 84090 (0.0007) [2023-10-12 19:18:23,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172261376. Throughput: 0: 1666.4, 1: 1665.2. Samples: 43072398. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:23,435][61643] Avg episode reward: [(0, '25.040'), (1, '9.960')] [2023-10-12 19:18:23,890][62635] Updated weights for policy 1, policy_version 84130 (0.0007) [2023-10-12 19:18:24,122][62634] Updated weights for policy 0, policy_version 84100 (0.0007) [2023-10-12 19:18:24,250][62635] Updated weights for policy 1, policy_version 84140 (0.0007) [2023-10-12 19:18:24,498][62634] Updated weights for policy 0, policy_version 84110 (0.0008) [2023-10-12 19:18:24,621][62635] Updated weights for policy 1, policy_version 84150 (0.0008) [2023-10-12 19:18:24,874][62634] Updated weights for policy 0, policy_version 84120 (0.0007) [2023-10-12 19:18:24,987][62635] Updated weights for policy 1, policy_version 84160 (0.0008) [2023-10-12 19:18:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172326912. Throughput: 0: 1668.2, 1: 1665.6. Samples: 43092902. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:28,435][61643] Avg episode reward: [(0, '25.000'), (1, '9.860')] [2023-10-12 19:18:29,031][62635] Updated weights for policy 1, policy_version 84170 (0.0009) [2023-10-12 19:18:29,142][62634] Updated weights for policy 0, policy_version 84130 (0.0008) [2023-10-12 19:18:29,402][62635] Updated weights for policy 1, policy_version 84180 (0.0008) [2023-10-12 19:18:29,524][62634] Updated weights for policy 0, policy_version 84140 (0.0008) [2023-10-12 19:18:29,770][62635] Updated weights for policy 1, policy_version 84190 (0.0009) [2023-10-12 19:18:29,906][62634] Updated weights for policy 0, policy_version 84150 (0.0008) [2023-10-12 19:18:30,275][62634] Updated weights for policy 0, policy_version 84160 (0.0007) [2023-10-12 19:18:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 172392448. Throughput: 0: 1670.4, 1: 1665.1. Samples: 43113586. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:33,436][61643] Avg episode reward: [(0, '25.090'), (1, '10.120')] [2023-10-12 19:18:33,938][62635] Updated weights for policy 1, policy_version 84200 (0.0008) [2023-10-12 19:18:34,216][62634] Updated weights for policy 0, policy_version 84170 (0.0009) [2023-10-12 19:18:34,300][62635] Updated weights for policy 1, policy_version 84210 (0.0007) [2023-10-12 19:18:34,602][62634] Updated weights for policy 0, policy_version 84180 (0.0009) [2023-10-12 19:18:34,678][62635] Updated weights for policy 1, policy_version 84220 (0.0007) [2023-10-12 19:18:34,964][62634] Updated weights for policy 0, policy_version 84190 (0.0007) [2023-10-12 19:18:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172457984. Throughput: 0: 1670.6, 1: 1668.3. Samples: 43122502. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:38,435][61643] Avg episode reward: [(0, '25.150'), (1, '10.030')] [2023-10-12 19:18:38,991][62635] Updated weights for policy 1, policy_version 84230 (0.0008) [2023-10-12 19:18:39,241][62634] Updated weights for policy 0, policy_version 84200 (0.0007) [2023-10-12 19:18:39,361][62635] Updated weights for policy 1, policy_version 84240 (0.0009) [2023-10-12 19:18:39,620][62634] Updated weights for policy 0, policy_version 84210 (0.0008) [2023-10-12 19:18:39,720][62635] Updated weights for policy 1, policy_version 84250 (0.0007) [2023-10-12 19:18:39,994][62634] Updated weights for policy 0, policy_version 84220 (0.0009) [2023-10-12 19:18:43,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172523520. Throughput: 0: 1670.2, 1: 1666.2. Samples: 43143054. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:43,436][61643] Avg episode reward: [(0, '25.220'), (1, '9.850')] [2023-10-12 19:18:43,679][62635] Updated weights for policy 1, policy_version 84260 (0.0009) [2023-10-12 19:18:44,043][62635] Updated weights for policy 1, policy_version 84270 (0.0008) [2023-10-12 19:18:44,058][62634] Updated weights for policy 0, policy_version 84230 (0.0008) [2023-10-12 19:18:44,411][62635] Updated weights for policy 1, policy_version 84280 (0.0008) [2023-10-12 19:18:44,433][62634] Updated weights for policy 0, policy_version 84240 (0.0009) [2023-10-12 19:18:44,806][62634] Updated weights for policy 0, policy_version 84250 (0.0009) [2023-10-12 19:18:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172589056. Throughput: 0: 1674.4, 1: 1672.0. Samples: 43163756. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:48,436][61643] Avg episode reward: [(0, '25.230'), (1, '10.120')] [2023-10-12 19:18:48,444][62635] Updated weights for policy 1, policy_version 84290 (0.0007) [2023-10-12 19:18:48,810][62635] Updated weights for policy 1, policy_version 84300 (0.0009) [2023-10-12 19:18:48,970][62634] Updated weights for policy 0, policy_version 84260 (0.0009) [2023-10-12 19:18:49,181][62635] Updated weights for policy 1, policy_version 84310 (0.0008) [2023-10-12 19:18:49,357][62634] Updated weights for policy 0, policy_version 84270 (0.0008) [2023-10-12 19:18:49,546][62635] Updated weights for policy 1, policy_version 84320 (0.0007) [2023-10-12 19:18:49,737][62634] Updated weights for policy 0, policy_version 84280 (0.0009) [2023-10-12 19:18:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172654592. Throughput: 0: 1669.3, 1: 1673.5. Samples: 43172582. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-12 19:18:53,435][61643] Avg episode reward: [(0, '25.300'), (1, '10.030')] [2023-10-12 19:18:53,621][62634] Updated weights for policy 0, policy_version 84290 (0.0011) [2023-10-12 19:18:53,717][62635] Updated weights for policy 1, policy_version 84330 (0.0009) [2023-10-12 19:18:53,996][62634] Updated weights for policy 0, policy_version 84300 (0.0009) [2023-10-12 19:18:54,093][62635] Updated weights for policy 1, policy_version 84340 (0.0008) [2023-10-12 19:18:54,373][62634] Updated weights for policy 0, policy_version 84310 (0.0008) [2023-10-12 19:18:54,466][62635] Updated weights for policy 1, policy_version 84350 (0.0008) [2023-10-12 19:18:54,745][62634] Updated weights for policy 0, policy_version 84320 (0.0010) [2023-10-12 19:18:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 172720128. Throughput: 0: 1669.7, 1: 1677.8. Samples: 43193374. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:18:58,435][61643] Avg episode reward: [(0, '25.120'), (1, '9.960')] [2023-10-12 19:18:58,542][62635] Updated weights for policy 1, policy_version 84360 (0.0009) [2023-10-12 19:18:58,751][62634] Updated weights for policy 0, policy_version 84330 (0.0008) [2023-10-12 19:18:58,907][62635] Updated weights for policy 1, policy_version 84370 (0.0009) [2023-10-12 19:18:59,137][62634] Updated weights for policy 0, policy_version 84340 (0.0008) [2023-10-12 19:18:59,275][62635] Updated weights for policy 1, policy_version 84380 (0.0008) [2023-10-12 19:18:59,512][62634] Updated weights for policy 0, policy_version 84350 (0.0007) [2023-10-12 19:19:03,150][62635] Updated weights for policy 1, policy_version 84390 (0.0007) [2023-10-12 19:19:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172785664. Throughput: 0: 1679.3, 1: 1673.6. Samples: 43214176. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:03,435][61643] Avg episode reward: [(0, '25.130'), (1, '10.010')] [2023-10-12 19:19:03,514][62635] Updated weights for policy 1, policy_version 84400 (0.0008) [2023-10-12 19:19:03,529][62634] Updated weights for policy 0, policy_version 84360 (0.0009) [2023-10-12 19:19:03,877][62635] Updated weights for policy 1, policy_version 84410 (0.0008) [2023-10-12 19:19:03,895][62634] Updated weights for policy 0, policy_version 84370 (0.0009) [2023-10-12 19:19:04,275][62634] Updated weights for policy 0, policy_version 84380 (0.0008) [2023-10-12 19:19:08,127][62635] Updated weights for policy 1, policy_version 84420 (0.0007) [2023-10-12 19:19:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172851200. Throughput: 0: 1675.3, 1: 1678.1. Samples: 43223302. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:08,436][61643] Avg episode reward: [(0, '25.130'), (1, '9.970')] [2023-10-12 19:19:08,447][62634] Updated weights for policy 0, policy_version 84390 (0.0008) [2023-10-12 19:19:08,492][62635] Updated weights for policy 1, policy_version 84430 (0.0007) [2023-10-12 19:19:08,818][62634] Updated weights for policy 0, policy_version 84400 (0.0008) [2023-10-12 19:19:08,858][62635] Updated weights for policy 1, policy_version 84440 (0.0009) [2023-10-12 19:19:09,204][62634] Updated weights for policy 0, policy_version 84410 (0.0008) [2023-10-12 19:19:12,764][62635] Updated weights for policy 1, policy_version 84450 (0.0008) [2023-10-12 19:19:13,125][62635] Updated weights for policy 1, policy_version 84460 (0.0010) [2023-10-12 19:19:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 172916736. Throughput: 0: 1674.3, 1: 1683.8. Samples: 43244018. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:13,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.740')] [2023-10-12 19:19:13,436][62634] Updated weights for policy 0, policy_version 84420 (0.0008) [2023-10-12 19:19:13,487][62635] Updated weights for policy 1, policy_version 84470 (0.0008) [2023-10-12 19:19:13,804][62634] Updated weights for policy 0, policy_version 84430 (0.0008) [2023-10-12 19:19:13,856][62635] Updated weights for policy 1, policy_version 84480 (0.0008) [2023-10-12 19:19:14,179][62634] Updated weights for policy 0, policy_version 84440 (0.0009) [2023-10-12 19:19:17,784][62635] Updated weights for policy 1, policy_version 84490 (0.0010) [2023-10-12 19:19:18,147][62634] Updated weights for policy 0, policy_version 84450 (0.0009) [2023-10-12 19:19:18,154][62635] Updated weights for policy 1, policy_version 84500 (0.0010) [2023-10-12 19:19:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 172982272. Throughput: 0: 1670.1, 1: 1675.6. 
Samples: 43264140. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:18,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.890')] [2023-10-12 19:19:18,520][62634] Updated weights for policy 0, policy_version 84460 (0.0007) [2023-10-12 19:19:18,521][62635] Updated weights for policy 1, policy_version 84510 (0.0008) [2023-10-12 19:19:18,902][62634] Updated weights for policy 0, policy_version 84470 (0.0010) [2023-10-12 19:19:19,274][62634] Updated weights for policy 0, policy_version 84480 (0.0010) [2023-10-12 19:19:22,622][62635] Updated weights for policy 1, policy_version 84520 (0.0008) [2023-10-12 19:19:22,989][62635] Updated weights for policy 1, policy_version 84530 (0.0007) [2023-10-12 19:19:23,335][62634] Updated weights for policy 0, policy_version 84490 (0.0010) [2023-10-12 19:19:23,360][62635] Updated weights for policy 1, policy_version 84540 (0.0008) [2023-10-12 19:19:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173047808. Throughput: 0: 1670.4, 1: 1688.8. Samples: 43273664. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:23,435][61643] Avg episode reward: [(0, '25.000'), (1, '9.730')] [2023-10-12 19:19:23,711][62634] Updated weights for policy 0, policy_version 84500 (0.0009) [2023-10-12 19:19:24,098][62634] Updated weights for policy 0, policy_version 84510 (0.0009) [2023-10-12 19:19:27,495][62635] Updated weights for policy 1, policy_version 84550 (0.0007) [2023-10-12 19:19:27,865][62635] Updated weights for policy 1, policy_version 84560 (0.0009) [2023-10-12 19:19:28,216][62634] Updated weights for policy 0, policy_version 84520 (0.0009) [2023-10-12 19:19:28,236][62635] Updated weights for policy 1, policy_version 84570 (0.0009) [2023-10-12 19:19:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173113344. Throughput: 0: 1672.9, 1: 1688.0. Samples: 43294294. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:28,435][61643] Avg episode reward: [(0, '24.960'), (1, '9.580')] [2023-10-12 19:19:28,595][62634] Updated weights for policy 0, policy_version 84530 (0.0010) [2023-10-12 19:19:28,972][62634] Updated weights for policy 0, policy_version 84540 (0.0008) [2023-10-12 19:19:32,088][62635] Updated weights for policy 1, policy_version 84580 (0.0008) [2023-10-12 19:19:32,460][62635] Updated weights for policy 1, policy_version 84590 (0.0007) [2023-10-12 19:19:32,828][62635] Updated weights for policy 1, policy_version 84600 (0.0008) [2023-10-12 19:19:33,079][62634] Updated weights for policy 0, policy_version 84550 (0.0007) [2023-10-12 19:19:33,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173211648. Throughput: 0: 1673.6, 1: 1665.2. Samples: 43314006. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:33,435][61643] Avg episode reward: [(0, '25.080'), (1, '9.640')] [2023-10-12 19:19:33,452][62634] Updated weights for policy 0, policy_version 84560 (0.0008) [2023-10-12 19:19:33,825][62634] Updated weights for policy 0, policy_version 84570 (0.0008) [2023-10-12 19:19:37,165][62635] Updated weights for policy 1, policy_version 84610 (0.0009) [2023-10-12 19:19:37,534][62635] Updated weights for policy 1, policy_version 84620 (0.0008) [2023-10-12 19:19:37,896][62634] Updated weights for policy 0, policy_version 84580 (0.0007) [2023-10-12 19:19:37,905][62635] Updated weights for policy 1, policy_version 84630 (0.0009) [2023-10-12 19:19:38,277][62635] Updated weights for policy 1, policy_version 84640 (0.0008) [2023-10-12 19:19:38,282][62634] Updated weights for policy 0, policy_version 84590 (0.0009) [2023-10-12 19:19:38,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173277184. Throughput: 0: 1678.9, 1: 1691.9. Samples: 43324268. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:38,436][61643] Avg episode reward: [(0, '24.980'), (1, '9.730')] [2023-10-12 19:19:38,663][62634] Updated weights for policy 0, policy_version 84600 (0.0011) [2023-10-12 19:19:42,509][62635] Updated weights for policy 1, policy_version 84650 (0.0007) [2023-10-12 19:19:42,692][62634] Updated weights for policy 0, policy_version 84610 (0.0009) [2023-10-12 19:19:42,872][62635] Updated weights for policy 1, policy_version 84660 (0.0008) [2023-10-12 19:19:43,066][62634] Updated weights for policy 0, policy_version 84620 (0.0009) [2023-10-12 19:19:43,242][62635] Updated weights for policy 1, policy_version 84670 (0.0008) [2023-10-12 19:19:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173342720. Throughput: 0: 1671.2, 1: 1687.5. Samples: 43344512. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:43,435][61643] Avg episode reward: [(0, '25.200'), (1, '9.540')] [2023-10-12 19:19:43,447][62634] Updated weights for policy 0, policy_version 84630 (0.0009) [2023-10-12 19:19:43,816][62634] Updated weights for policy 0, policy_version 84640 (0.0008) [2023-10-12 19:19:47,323][62635] Updated weights for policy 1, policy_version 84680 (0.0010) [2023-10-12 19:19:47,686][62635] Updated weights for policy 1, policy_version 84690 (0.0011) [2023-10-12 19:19:47,946][62634] Updated weights for policy 0, policy_version 84650 (0.0008) [2023-10-12 19:19:48,061][62635] Updated weights for policy 1, policy_version 84700 (0.0009) [2023-10-12 19:19:48,315][62634] Updated weights for policy 0, policy_version 84660 (0.0010) [2023-10-12 19:19:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173408256. Throughput: 0: 1658.9, 1: 1662.5. Samples: 43363640. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) [2023-10-12 19:19:48,437][61643] Avg episode reward: [(0, '25.160'), (1, '9.740')] [2023-10-12 19:19:48,687][62634] Updated weights for policy 0, policy_version 84670 (0.0008) [2023-10-12 19:19:52,288][62635] Updated weights for policy 1, policy_version 84710 (0.0009) [2023-10-12 19:19:52,663][62635] Updated weights for policy 1, policy_version 84720 (0.0007) [2023-10-12 19:19:52,721][62634] Updated weights for policy 0, policy_version 84680 (0.0009) [2023-10-12 19:19:53,034][62635] Updated weights for policy 1, policy_version 84730 (0.0008) [2023-10-12 19:19:53,095][62634] Updated weights for policy 0, policy_version 84690 (0.0009) [2023-10-12 19:19:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173473792. Throughput: 0: 1671.3, 1: 1680.0. Samples: 43374112. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:19:53,435][61643] Avg episode reward: [(0, '25.150'), (1, '10.030')] [2023-10-12 19:19:53,468][62634] Updated weights for policy 0, policy_version 84700 (0.0008) [2023-10-12 19:19:56,966][62635] Updated weights for policy 1, policy_version 84740 (0.0009) [2023-10-12 19:19:57,332][62635] Updated weights for policy 1, policy_version 84750 (0.0009) [2023-10-12 19:19:57,563][62634] Updated weights for policy 0, policy_version 84710 (0.0008) [2023-10-12 19:19:57,703][62635] Updated weights for policy 1, policy_version 84760 (0.0008) [2023-10-12 19:19:57,943][62634] Updated weights for policy 0, policy_version 84720 (0.0007) [2023-10-12 19:19:58,311][62634] Updated weights for policy 0, policy_version 84730 (0.0008) [2023-10-12 19:19:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173539328. Throughput: 0: 1674.4, 1: 1669.9. Samples: 43394512. 
Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:19:58,436][61643] Avg episode reward: [(0, '25.130'), (1, '9.910')] [2023-10-12 19:20:01,476][62635] Updated weights for policy 1, policy_version 84770 (0.0009) [2023-10-12 19:20:01,848][62635] Updated weights for policy 1, policy_version 84780 (0.0008) [2023-10-12 19:20:02,212][62635] Updated weights for policy 1, policy_version 84790 (0.0010) [2023-10-12 19:20:02,383][62634] Updated weights for policy 0, policy_version 84740 (0.0008) [2023-10-12 19:20:02,585][62635] Updated weights for policy 1, policy_version 84800 (0.0008) [2023-10-12 19:20:02,753][62634] Updated weights for policy 0, policy_version 84750 (0.0009) [2023-10-12 19:20:03,140][62634] Updated weights for policy 0, policy_version 84760 (0.0008) [2023-10-12 19:20:03,435][61643] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 173637632. Throughput: 0: 1656.2, 1: 1661.7. Samples: 43413448. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:03,436][61643] Avg episode reward: [(0, '25.280'), (1, '9.990')] [2023-10-12 19:20:06,761][62635] Updated weights for policy 1, policy_version 84810 (0.0008) [2023-10-12 19:20:07,133][62635] Updated weights for policy 1, policy_version 84820 (0.0008) [2023-10-12 19:20:07,208][62634] Updated weights for policy 0, policy_version 84770 (0.0007) [2023-10-12 19:20:07,494][62635] Updated weights for policy 1, policy_version 84830 (0.0009) [2023-10-12 19:20:07,591][62634] Updated weights for policy 0, policy_version 84780 (0.0010) [2023-10-12 19:20:07,972][62634] Updated weights for policy 0, policy_version 84790 (0.0008) [2023-10-12 19:20:08,344][62634] Updated weights for policy 0, policy_version 84800 (0.0009) [2023-10-12 19:20:08,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 173703168. Throughput: 0: 1673.4, 1: 1678.8. Samples: 43424512. 
Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:08,435][61643] Avg episode reward: [(0, '25.040'), (1, '10.070')] [2023-10-12 19:20:11,638][62635] Updated weights for policy 1, policy_version 84840 (0.0010) [2023-10-12 19:20:12,004][62635] Updated weights for policy 1, policy_version 84850 (0.0010) [2023-10-12 19:20:12,378][62635] Updated weights for policy 1, policy_version 84860 (0.0008) [2023-10-12 19:20:12,460][62634] Updated weights for policy 0, policy_version 84810 (0.0007) [2023-10-12 19:20:12,835][62634] Updated weights for policy 0, policy_version 84820 (0.0007) [2023-10-12 19:20:13,216][62634] Updated weights for policy 0, policy_version 84830 (0.0007) [2023-10-12 19:20:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 173768704. Throughput: 0: 1672.3, 1: 1664.2. Samples: 43444436. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:13,436][61643] Avg episode reward: [(0, '25.080'), (1, '9.960')] [2023-10-12 19:20:16,515][62635] Updated weights for policy 1, policy_version 84870 (0.0009) [2023-10-12 19:20:16,901][62635] Updated weights for policy 1, policy_version 84880 (0.0009) [2023-10-12 19:20:17,260][62635] Updated weights for policy 1, policy_version 84890 (0.0009) [2023-10-12 19:20:17,279][62634] Updated weights for policy 0, policy_version 84840 (0.0009) [2023-10-12 19:20:17,653][62634] Updated weights for policy 0, policy_version 84850 (0.0009) [2023-10-12 19:20:18,035][62634] Updated weights for policy 0, policy_version 84860 (0.0009) [2023-10-12 19:20:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 173834240. Throughput: 0: 1648.4, 1: 1677.2. Samples: 43463660. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:18,435][61643] Avg episode reward: [(0, '24.500'), (1, '9.900')] [2023-10-12 19:20:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000084896_86933504.pth... 
[2023-10-12 19:20:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000084864_86900736.pth... [2023-10-12 19:20:18,484][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000083328_85327872.pth [2023-10-12 19:20:18,485][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000083296_85295104.pth [2023-10-12 19:20:21,298][62635] Updated weights for policy 1, policy_version 84900 (0.0008) [2023-10-12 19:20:21,665][62635] Updated weights for policy 1, policy_version 84910 (0.0007) [2023-10-12 19:20:22,040][62635] Updated weights for policy 1, policy_version 84920 (0.0007) [2023-10-12 19:20:22,164][62634] Updated weights for policy 0, policy_version 84870 (0.0009) [2023-10-12 19:20:22,547][62634] Updated weights for policy 0, policy_version 84880 (0.0008) [2023-10-12 19:20:22,929][62634] Updated weights for policy 0, policy_version 84890 (0.0008) [2023-10-12 19:20:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 173899776. Throughput: 0: 1666.8, 1: 1682.9. Samples: 43475008. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:23,435][61643] Avg episode reward: [(0, '24.500'), (1, '10.050')] [2023-10-12 19:20:26,244][62635] Updated weights for policy 1, policy_version 84930 (0.0009) [2023-10-12 19:20:26,618][62635] Updated weights for policy 1, policy_version 84940 (0.0011) [2023-10-12 19:20:26,985][62635] Updated weights for policy 1, policy_version 84950 (0.0009) [2023-10-12 19:20:27,083][62634] Updated weights for policy 0, policy_version 84900 (0.0008) [2023-10-12 19:20:27,348][62635] Updated weights for policy 1, policy_version 84960 (0.0009) [2023-10-12 19:20:27,473][62634] Updated weights for policy 0, policy_version 84910 (0.0009) [2023-10-12 19:20:27,845][62634] Updated weights for policy 0, policy_version 84920 (0.0008) [2023-10-12 19:20:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). 
Total num frames: 173965312. Throughput: 0: 1672.6, 1: 1663.9. Samples: 43494652. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:28,436][61643] Avg episode reward: [(0, '24.420'), (1, '9.810')] [2023-10-12 19:20:31,486][62635] Updated weights for policy 1, policy_version 84970 (0.0008) [2023-10-12 19:20:31,848][62635] Updated weights for policy 1, policy_version 84980 (0.0008) [2023-10-12 19:20:31,894][62634] Updated weights for policy 0, policy_version 84930 (0.0008) [2023-10-12 19:20:32,214][62635] Updated weights for policy 1, policy_version 84990 (0.0008) [2023-10-12 19:20:32,267][62634] Updated weights for policy 0, policy_version 84940 (0.0008) [2023-10-12 19:20:32,644][62634] Updated weights for policy 0, policy_version 84950 (0.0007) [2023-10-12 19:20:33,021][62634] Updated weights for policy 0, policy_version 84960 (0.0007) [2023-10-12 19:20:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 174030848. Throughput: 0: 1658.7, 1: 1679.3. Samples: 43513850. Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:33,436][61643] Avg episode reward: [(0, '24.500'), (1, '9.900')] [2023-10-12 19:20:36,408][62635] Updated weights for policy 1, policy_version 85000 (0.0007) [2023-10-12 19:20:36,766][62635] Updated weights for policy 1, policy_version 85010 (0.0008) [2023-10-12 19:20:36,989][62634] Updated weights for policy 0, policy_version 84970 (0.0007) [2023-10-12 19:20:37,139][62635] Updated weights for policy 1, policy_version 85020 (0.0008) [2023-10-12 19:20:37,368][62634] Updated weights for policy 0, policy_version 84980 (0.0008) [2023-10-12 19:20:37,742][62634] Updated weights for policy 0, policy_version 84990 (0.0007) [2023-10-12 19:20:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 174096384. Throughput: 0: 1675.8, 1: 1684.7. Samples: 43525334. 
Policy #0 lag: (min: 12.0, avg: 15.2, max: 44.0) [2023-10-12 19:20:38,435][61643] Avg episode reward: [(0, '24.420'), (1, '10.080')] [2023-10-12 19:20:41,232][62635] Updated weights for policy 1, policy_version 85030 (0.0008) [2023-10-12 19:20:41,604][62635] Updated weights for policy 1, policy_version 85040 (0.0010) [2023-10-12 19:20:41,865][62634] Updated weights for policy 0, policy_version 85000 (0.0009) [2023-10-12 19:20:41,970][62635] Updated weights for policy 1, policy_version 85050 (0.0007) [2023-10-12 19:20:42,245][62634] Updated weights for policy 0, policy_version 85010 (0.0010) [2023-10-12 19:20:42,629][62634] Updated weights for policy 0, policy_version 85020 (0.0011) [2023-10-12 19:20:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 174161920. Throughput: 0: 1664.6, 1: 1666.1. Samples: 43544392. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:20:43,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.750')] [2023-10-12 19:20:45,788][62635] Updated weights for policy 1, policy_version 85060 (0.0007) [2023-10-12 19:20:46,165][62635] Updated weights for policy 1, policy_version 85070 (0.0007) [2023-10-12 19:20:46,528][62635] Updated weights for policy 1, policy_version 85080 (0.0008) [2023-10-12 19:20:46,727][62634] Updated weights for policy 0, policy_version 85030 (0.0009) [2023-10-12 19:20:47,097][62634] Updated weights for policy 0, policy_version 85040 (0.0009) [2023-10-12 19:20:47,475][62634] Updated weights for policy 0, policy_version 85050 (0.0010) [2023-10-12 19:20:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13329.3). Total num frames: 174227456. Throughput: 0: 1663.3, 1: 1685.1. Samples: 43564126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:20:48,436][61643] Avg episode reward: [(0, '24.240'), (1, '9.660')] [2023-10-12 19:20:50,603][62635] Updated weights for policy 1, policy_version 85090 (0.0008) [2023-10-12 19:20:50,977][62635] Updated weights for policy 1, policy_version 85100 (0.0011) [2023-10-12 19:20:51,351][62635] Updated weights for policy 1, policy_version 85110 (0.0009) [2023-10-12 19:20:51,576][62634] Updated weights for policy 0, policy_version 85060 (0.0009) [2023-10-12 19:20:51,717][62635] Updated weights for policy 1, policy_version 85120 (0.0009) [2023-10-12 19:20:51,954][62634] Updated weights for policy 0, policy_version 85070 (0.0008) [2023-10-12 19:20:52,324][62634] Updated weights for policy 0, policy_version 85080 (0.0009) [2023-10-12 19:20:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 174292992. Throughput: 0: 1674.2, 1: 1672.0. Samples: 43575092. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:20:53,435][61643] Avg episode reward: [(0, '24.240'), (1, '9.640')] [2023-10-12 19:20:55,840][62635] Updated weights for policy 1, policy_version 85130 (0.0007) [2023-10-12 19:20:56,211][62635] Updated weights for policy 1, policy_version 85140 (0.0008) [2023-10-12 19:20:56,454][62634] Updated weights for policy 0, policy_version 85090 (0.0009) [2023-10-12 19:20:56,570][62635] Updated weights for policy 1, policy_version 85150 (0.0008) [2023-10-12 19:20:56,821][62634] Updated weights for policy 0, policy_version 85100 (0.0008) [2023-10-12 19:20:57,201][62634] Updated weights for policy 0, policy_version 85110 (0.0008) [2023-10-12 19:20:57,575][62634] Updated weights for policy 0, policy_version 85120 (0.0010) [2023-10-12 19:20:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 174358528. Throughput: 0: 1663.6, 1: 1668.8. Samples: 43594396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:20:58,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.580')] [2023-10-12 19:21:00,491][62635] Updated weights for policy 1, policy_version 85160 (0.0009) [2023-10-12 19:21:00,851][62635] Updated weights for policy 1, policy_version 85170 (0.0009) [2023-10-12 19:21:01,215][62635] Updated weights for policy 1, policy_version 85180 (0.0008) [2023-10-12 19:21:01,532][62634] Updated weights for policy 0, policy_version 85130 (0.0009) [2023-10-12 19:21:01,900][62634] Updated weights for policy 0, policy_version 85140 (0.0008) [2023-10-12 19:21:02,281][62634] Updated weights for policy 0, policy_version 85150 (0.0009) [2023-10-12 19:21:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174424064. Throughput: 0: 1670.2, 1: 1682.5. Samples: 43614534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:03,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.620')] [2023-10-12 19:21:05,430][62635] Updated weights for policy 1, policy_version 85190 (0.0009) [2023-10-12 19:21:05,801][62635] Updated weights for policy 1, policy_version 85200 (0.0008) [2023-10-12 19:21:06,172][62635] Updated weights for policy 1, policy_version 85210 (0.0008) [2023-10-12 19:21:06,542][62634] Updated weights for policy 0, policy_version 85160 (0.0009) [2023-10-12 19:21:06,918][62634] Updated weights for policy 0, policy_version 85170 (0.0011) [2023-10-12 19:21:07,301][62634] Updated weights for policy 0, policy_version 85180 (0.0011) [2023-10-12 19:21:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 174489600. Throughput: 0: 1673.9, 1: 1658.3. Samples: 43624960. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:08,436][61643] Avg episode reward: [(0, '24.020'), (1, '9.790')] [2023-10-12 19:21:10,111][62635] Updated weights for policy 1, policy_version 85220 (0.0008) [2023-10-12 19:21:10,485][62635] Updated weights for policy 1, policy_version 85230 (0.0007) [2023-10-12 19:21:10,854][62635] Updated weights for policy 1, policy_version 85240 (0.0007) [2023-10-12 19:21:11,471][62634] Updated weights for policy 0, policy_version 85190 (0.0008) [2023-10-12 19:21:11,866][62634] Updated weights for policy 0, policy_version 85200 (0.0007) [2023-10-12 19:21:12,235][62634] Updated weights for policy 0, policy_version 85210 (0.0009) [2023-10-12 19:21:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174555136. Throughput: 0: 1654.2, 1: 1673.3. Samples: 43644386. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:13,435][61643] Avg episode reward: [(0, '23.990'), (1, '9.590')] [2023-10-12 19:21:14,866][62635] Updated weights for policy 1, policy_version 85250 (0.0007) [2023-10-12 19:21:15,228][62635] Updated weights for policy 1, policy_version 85260 (0.0009) [2023-10-12 19:21:15,589][62635] Updated weights for policy 1, policy_version 85270 (0.0009) [2023-10-12 19:21:15,957][62635] Updated weights for policy 1, policy_version 85280 (0.0009) [2023-10-12 19:21:16,146][62634] Updated weights for policy 0, policy_version 85220 (0.0010) [2023-10-12 19:21:16,511][62634] Updated weights for policy 0, policy_version 85230 (0.0011) [2023-10-12 19:21:16,886][62634] Updated weights for policy 0, policy_version 85240 (0.0009) [2023-10-12 19:21:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174620672. Throughput: 0: 1671.5, 1: 1689.6. Samples: 43665098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:18,436][61643] Avg episode reward: [(0, '23.950'), (1, '9.670')] [2023-10-12 19:21:19,988][62635] Updated weights for policy 1, policy_version 85290 (0.0008) [2023-10-12 19:21:20,351][62635] Updated weights for policy 1, policy_version 85300 (0.0009) [2023-10-12 19:21:20,719][62635] Updated weights for policy 1, policy_version 85310 (0.0007) [2023-10-12 19:21:21,001][62634] Updated weights for policy 0, policy_version 85250 (0.0008) [2023-10-12 19:21:21,383][62634] Updated weights for policy 0, policy_version 85260 (0.0008) [2023-10-12 19:21:21,759][62634] Updated weights for policy 0, policy_version 85270 (0.0007) [2023-10-12 19:21:22,127][62634] Updated weights for policy 0, policy_version 85280 (0.0008) [2023-10-12 19:21:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 174686208. Throughput: 0: 1673.2, 1: 1662.4. Samples: 43675438. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:23,436][61643] Avg episode reward: [(0, '23.930'), (1, '9.980')] [2023-10-12 19:21:24,819][62635] Updated weights for policy 1, policy_version 85320 (0.0009) [2023-10-12 19:21:25,190][62635] Updated weights for policy 1, policy_version 85330 (0.0009) [2023-10-12 19:21:25,560][62635] Updated weights for policy 1, policy_version 85340 (0.0007) [2023-10-12 19:21:26,106][62634] Updated weights for policy 0, policy_version 85290 (0.0008) [2023-10-12 19:21:26,478][62634] Updated weights for policy 0, policy_version 85300 (0.0009) [2023-10-12 19:21:26,861][62634] Updated weights for policy 0, policy_version 85310 (0.0007) [2023-10-12 19:21:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174751744. Throughput: 0: 1656.3, 1: 1687.0. Samples: 43694838. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:28,436][61643] Avg episode reward: [(0, '24.000'), (1, '9.810')] [2023-10-12 19:21:29,621][62635] Updated weights for policy 1, policy_version 85350 (0.0008) [2023-10-12 19:21:29,983][62635] Updated weights for policy 1, policy_version 85360 (0.0008) [2023-10-12 19:21:30,355][62635] Updated weights for policy 1, policy_version 85370 (0.0007) [2023-10-12 19:21:30,825][62634] Updated weights for policy 0, policy_version 85320 (0.0007) [2023-10-12 19:21:31,194][62634] Updated weights for policy 0, policy_version 85330 (0.0007) [2023-10-12 19:21:31,567][62634] Updated weights for policy 0, policy_version 85340 (0.0007) [2023-10-12 19:21:33,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174817280. Throughput: 0: 1677.8, 1: 1691.1. Samples: 43715726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:21:33,435][61643] Avg episode reward: [(0, '24.270'), (1, '9.830')] [2023-10-12 19:21:34,470][62635] Updated weights for policy 1, policy_version 85380 (0.0007) [2023-10-12 19:21:34,836][62635] Updated weights for policy 1, policy_version 85390 (0.0009) [2023-10-12 19:21:35,207][62635] Updated weights for policy 1, policy_version 85400 (0.0007) [2023-10-12 19:21:35,584][62634] Updated weights for policy 0, policy_version 85350 (0.0007) [2023-10-12 19:21:35,962][62634] Updated weights for policy 0, policy_version 85360 (0.0009) [2023-10-12 19:21:36,348][62634] Updated weights for policy 0, policy_version 85370 (0.0010) [2023-10-12 19:21:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174882816. Throughput: 0: 1668.9, 1: 1674.9. Samples: 43725564. 
Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:21:38,435][61643] Avg episode reward: [(0, '24.160'), (1, '10.060')] [2023-10-12 19:21:39,195][62635] Updated weights for policy 1, policy_version 85410 (0.0007) [2023-10-12 19:21:39,555][62635] Updated weights for policy 1, policy_version 85420 (0.0007) [2023-10-12 19:21:39,926][62635] Updated weights for policy 1, policy_version 85430 (0.0007) [2023-10-12 19:21:40,282][62635] Updated weights for policy 1, policy_version 85440 (0.0007) [2023-10-12 19:21:40,433][62634] Updated weights for policy 0, policy_version 85380 (0.0009) [2023-10-12 19:21:40,811][62634] Updated weights for policy 0, policy_version 85390 (0.0009) [2023-10-12 19:21:41,187][62634] Updated weights for policy 0, policy_version 85400 (0.0009) [2023-10-12 19:21:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 174948352. Throughput: 0: 1662.3, 1: 1698.0. Samples: 43745608. Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:21:43,435][61643] Avg episode reward: [(0, '24.150'), (1, '9.880')] [2023-10-12 19:21:44,366][62635] Updated weights for policy 1, policy_version 85450 (0.0007) [2023-10-12 19:21:44,740][62635] Updated weights for policy 1, policy_version 85460 (0.0007) [2023-10-12 19:21:45,109][62635] Updated weights for policy 1, policy_version 85470 (0.0008) [2023-10-12 19:21:45,202][62634] Updated weights for policy 0, policy_version 85410 (0.0008) [2023-10-12 19:21:45,578][62634] Updated weights for policy 0, policy_version 85420 (0.0010) [2023-10-12 19:21:45,961][62634] Updated weights for policy 0, policy_version 85430 (0.0010) [2023-10-12 19:21:46,333][62634] Updated weights for policy 0, policy_version 85440 (0.0008) [2023-10-12 19:21:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175013888. Throughput: 0: 1677.2, 1: 1693.8. Samples: 43766230. 
Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:21:48,435][61643] Avg episode reward: [(0, '24.260'), (1, '9.780')] [2023-10-12 19:21:49,143][62635] Updated weights for policy 1, policy_version 85480 (0.0009) [2023-10-12 19:21:49,516][62635] Updated weights for policy 1, policy_version 85490 (0.0008) [2023-10-12 19:21:49,879][62635] Updated weights for policy 1, policy_version 85500 (0.0009) [2023-10-12 19:21:50,450][62634] Updated weights for policy 0, policy_version 85450 (0.0011) [2023-10-12 19:21:50,821][62634] Updated weights for policy 0, policy_version 85460 (0.0010) [2023-10-12 19:21:51,207][62634] Updated weights for policy 0, policy_version 85470 (0.0009) [2023-10-12 19:21:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175079424. Throughput: 0: 1659.3, 1: 1688.8. Samples: 43775626. Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:21:53,435][61643] Avg episode reward: [(0, '24.430'), (1, '9.970')] [2023-10-12 19:21:54,096][62635] Updated weights for policy 1, policy_version 85510 (0.0008) [2023-10-12 19:21:54,471][62635] Updated weights for policy 1, policy_version 85520 (0.0009) [2023-10-12 19:21:54,839][62635] Updated weights for policy 1, policy_version 85530 (0.0009) [2023-10-12 19:21:55,292][62634] Updated weights for policy 0, policy_version 85480 (0.0009) [2023-10-12 19:21:55,671][62634] Updated weights for policy 0, policy_version 85490 (0.0007) [2023-10-12 19:21:56,056][62634] Updated weights for policy 0, policy_version 85500 (0.0010) [2023-10-12 19:21:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 175144960. Throughput: 0: 1671.0, 1: 1693.4. Samples: 43795784. 
Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:21:58,436][61643] Avg episode reward: [(0, '24.480'), (1, '9.820')] [2023-10-12 19:21:58,701][62635] Updated weights for policy 1, policy_version 85540 (0.0008) [2023-10-12 19:21:59,075][62635] Updated weights for policy 1, policy_version 85550 (0.0011) [2023-10-12 19:21:59,444][62635] Updated weights for policy 1, policy_version 85560 (0.0007) [2023-10-12 19:22:00,171][62634] Updated weights for policy 0, policy_version 85510 (0.0009) [2023-10-12 19:22:00,554][62634] Updated weights for policy 0, policy_version 85520 (0.0010) [2023-10-12 19:22:00,929][62634] Updated weights for policy 0, policy_version 85530 (0.0010) [2023-10-12 19:22:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175210496. Throughput: 0: 1673.5, 1: 1691.5. Samples: 43816520. Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:22:03,435][61643] Avg episode reward: [(0, '23.720'), (1, '9.820')] [2023-10-12 19:22:03,454][62635] Updated weights for policy 1, policy_version 85570 (0.0008) [2023-10-12 19:22:03,818][62635] Updated weights for policy 1, policy_version 85580 (0.0009) [2023-10-12 19:22:04,180][62635] Updated weights for policy 1, policy_version 85590 (0.0007) [2023-10-12 19:22:04,553][62635] Updated weights for policy 1, policy_version 85600 (0.0007) [2023-10-12 19:22:05,116][62634] Updated weights for policy 0, policy_version 85540 (0.0011) [2023-10-12 19:22:05,491][62634] Updated weights for policy 0, policy_version 85550 (0.0007) [2023-10-12 19:22:05,868][62634] Updated weights for policy 0, policy_version 85560 (0.0009) [2023-10-12 19:22:08,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175276032. Throughput: 0: 1652.5, 1: 1691.3. Samples: 43825910. 
Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:22:08,435][61643] Avg episode reward: [(0, '23.730'), (1, '10.090')] [2023-10-12 19:22:08,628][62635] Updated weights for policy 1, policy_version 85610 (0.0011) [2023-10-12 19:22:08,985][62635] Updated weights for policy 1, policy_version 85620 (0.0009) [2023-10-12 19:22:09,348][62635] Updated weights for policy 1, policy_version 85630 (0.0008) [2023-10-12 19:22:09,849][62634] Updated weights for policy 0, policy_version 85570 (0.0010) [2023-10-12 19:22:10,235][62634] Updated weights for policy 0, policy_version 85580 (0.0011) [2023-10-12 19:22:10,599][62634] Updated weights for policy 0, policy_version 85590 (0.0010) [2023-10-12 19:22:10,967][62634] Updated weights for policy 0, policy_version 85600 (0.0007) [2023-10-12 19:22:13,433][62635] Updated weights for policy 1, policy_version 85640 (0.0009) [2023-10-12 19:22:13,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175341568. Throughput: 0: 1677.1, 1: 1689.0. Samples: 43846310. Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:22:13,436][61643] Avg episode reward: [(0, '23.900'), (1, '10.050')] [2023-10-12 19:22:13,796][62635] Updated weights for policy 1, policy_version 85650 (0.0008) [2023-10-12 19:22:14,167][62635] Updated weights for policy 1, policy_version 85660 (0.0008) [2023-10-12 19:22:15,035][62634] Updated weights for policy 0, policy_version 85610 (0.0008) [2023-10-12 19:22:15,417][62634] Updated weights for policy 0, policy_version 85620 (0.0009) [2023-10-12 19:22:15,788][62634] Updated weights for policy 0, policy_version 85630 (0.0007) [2023-10-12 19:22:18,181][62635] Updated weights for policy 1, policy_version 85670 (0.0009) [2023-10-12 19:22:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 175407104. Throughput: 0: 1675.9, 1: 1687.7. Samples: 43867088. 
Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:22:18,436][61643] Avg episode reward: [(0, '23.460'), (1, '9.950')] [2023-10-12 19:22:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000085632_87687168.pth... [2023-10-12 19:22:18,477][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000084064_86081536.pth [2023-10-12 19:22:18,544][62635] Updated weights for policy 1, policy_version 85680 (0.0010) [2023-10-12 19:22:18,905][62635] Updated weights for policy 1, policy_version 85690 (0.0007) [2023-10-12 19:22:19,122][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000085696_87752704.pth... [2023-10-12 19:22:19,152][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000084096_86114304.pth [2023-10-12 19:22:19,956][62634] Updated weights for policy 0, policy_version 85640 (0.0008) [2023-10-12 19:22:20,340][62634] Updated weights for policy 0, policy_version 85650 (0.0008) [2023-10-12 19:22:20,712][62634] Updated weights for policy 0, policy_version 85660 (0.0007) [2023-10-12 19:22:22,839][62635] Updated weights for policy 1, policy_version 85700 (0.0010) [2023-10-12 19:22:23,215][62635] Updated weights for policy 1, policy_version 85710 (0.0009) [2023-10-12 19:22:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 175472640. Throughput: 0: 1657.2, 1: 1689.6. Samples: 43876172. 
Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:22:23,435][61643] Avg episode reward: [(0, '23.500'), (1, '9.980')] [2023-10-12 19:22:23,572][62635] Updated weights for policy 1, policy_version 85720 (0.0009) [2023-10-12 19:22:24,727][62634] Updated weights for policy 0, policy_version 85670 (0.0009) [2023-10-12 19:22:25,104][62634] Updated weights for policy 0, policy_version 85680 (0.0007) [2023-10-12 19:22:25,482][62634] Updated weights for policy 0, policy_version 85690 (0.0007) [2023-10-12 19:22:27,874][62635] Updated weights for policy 1, policy_version 85730 (0.0008) [2023-10-12 19:22:28,245][62635] Updated weights for policy 1, policy_version 85740 (0.0010) [2023-10-12 19:22:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 175538176. Throughput: 0: 1677.8, 1: 1682.7. Samples: 43896828. Policy #0 lag: (min: 17.0, avg: 27.7, max: 49.0) [2023-10-12 19:22:28,435][61643] Avg episode reward: [(0, '23.550'), (1, '10.060')] [2023-10-12 19:22:28,609][62635] Updated weights for policy 1, policy_version 85750 (0.0007) [2023-10-12 19:22:28,978][62635] Updated weights for policy 1, policy_version 85760 (0.0008) [2023-10-12 19:22:29,404][62634] Updated weights for policy 0, policy_version 85700 (0.0008) [2023-10-12 19:22:29,784][62634] Updated weights for policy 0, policy_version 85710 (0.0009) [2023-10-12 19:22:30,155][62634] Updated weights for policy 0, policy_version 85720 (0.0009) [2023-10-12 19:22:33,257][62635] Updated weights for policy 1, policy_version 85770 (0.0007) [2023-10-12 19:22:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175603712. Throughput: 0: 1680.6, 1: 1676.8. Samples: 43917312. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:22:33,435][61643] Avg episode reward: [(0, '23.550'), (1, '9.960')] [2023-10-12 19:22:33,624][62635] Updated weights for policy 1, policy_version 85780 (0.0010) [2023-10-12 19:22:33,987][62635] Updated weights for policy 1, policy_version 85790 (0.0012) [2023-10-12 19:22:34,369][62634] Updated weights for policy 0, policy_version 85730 (0.0008) [2023-10-12 19:22:34,750][62634] Updated weights for policy 0, policy_version 85740 (0.0008) [2023-10-12 19:22:35,139][62634] Updated weights for policy 0, policy_version 85750 (0.0009) [2023-10-12 19:22:35,512][62634] Updated weights for policy 0, policy_version 85760 (0.0010) [2023-10-12 19:22:38,049][62635] Updated weights for policy 1, policy_version 85800 (0.0008) [2023-10-12 19:22:38,419][62635] Updated weights for policy 1, policy_version 85810 (0.0007) [2023-10-12 19:22:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175669248. Throughput: 0: 1674.6, 1: 1683.6. Samples: 43926744. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:22:38,435][61643] Avg episode reward: [(0, '23.560'), (1, '9.940')] [2023-10-12 19:22:38,780][62635] Updated weights for policy 1, policy_version 85820 (0.0008) [2023-10-12 19:22:39,516][62634] Updated weights for policy 0, policy_version 85770 (0.0007) [2023-10-12 19:22:39,890][62634] Updated weights for policy 0, policy_version 85780 (0.0008) [2023-10-12 19:22:40,270][62634] Updated weights for policy 0, policy_version 85790 (0.0008) [2023-10-12 19:22:42,814][62635] Updated weights for policy 1, policy_version 85830 (0.0009) [2023-10-12 19:22:43,200][62635] Updated weights for policy 1, policy_version 85840 (0.0008) [2023-10-12 19:22:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175734784. Throughput: 0: 1683.9, 1: 1686.2. Samples: 43947436. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:22:43,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.940')] [2023-10-12 19:22:43,567][62635] Updated weights for policy 1, policy_version 85850 (0.0009) [2023-10-12 19:22:44,372][62634] Updated weights for policy 0, policy_version 85800 (0.0008) [2023-10-12 19:22:44,758][62634] Updated weights for policy 0, policy_version 85810 (0.0007) [2023-10-12 19:22:45,131][62634] Updated weights for policy 0, policy_version 85820 (0.0008) [2023-10-12 19:22:47,672][62635] Updated weights for policy 1, policy_version 85860 (0.0010) [2023-10-12 19:22:48,039][62635] Updated weights for policy 1, policy_version 85870 (0.0011) [2023-10-12 19:22:48,408][62635] Updated weights for policy 1, policy_version 85880 (0.0007) [2023-10-12 19:22:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175800320. Throughput: 0: 1689.2, 1: 1666.0. Samples: 43967504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:22:48,435][61643] Avg episode reward: [(0, '23.490'), (1, '9.880')] [2023-10-12 19:22:49,272][62634] Updated weights for policy 0, policy_version 85830 (0.0008) [2023-10-12 19:22:49,646][62634] Updated weights for policy 0, policy_version 85840 (0.0009) [2023-10-12 19:22:50,025][62634] Updated weights for policy 0, policy_version 85850 (0.0008) [2023-10-12 19:22:52,361][62635] Updated weights for policy 1, policy_version 85890 (0.0007) [2023-10-12 19:22:52,733][62635] Updated weights for policy 1, policy_version 85900 (0.0008) [2023-10-12 19:22:53,105][62635] Updated weights for policy 1, policy_version 85910 (0.0008) [2023-10-12 19:22:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175865856. Throughput: 0: 1677.7, 1: 1680.9. Samples: 43977046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:22:53,435][61643] Avg episode reward: [(0, '23.680'), (1, '9.750')] [2023-10-12 19:22:53,468][62635] Updated weights for policy 1, policy_version 85920 (0.0008) [2023-10-12 19:22:54,031][62634] Updated weights for policy 0, policy_version 85860 (0.0009) [2023-10-12 19:22:54,405][62634] Updated weights for policy 0, policy_version 85870 (0.0008) [2023-10-12 19:22:54,786][62634] Updated weights for policy 0, policy_version 85880 (0.0010) [2023-10-12 19:22:57,552][62635] Updated weights for policy 1, policy_version 85930 (0.0008) [2023-10-12 19:22:57,916][62635] Updated weights for policy 1, policy_version 85940 (0.0008) [2023-10-12 19:22:58,282][62635] Updated weights for policy 1, policy_version 85950 (0.0009) [2023-10-12 19:22:58,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 175964160. Throughput: 0: 1681.3, 1: 1680.3. Samples: 43997584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:22:58,435][61643] Avg episode reward: [(0, '23.580'), (1, '9.810')] [2023-10-12 19:22:58,799][62634] Updated weights for policy 0, policy_version 85890 (0.0007) [2023-10-12 19:22:59,170][62634] Updated weights for policy 0, policy_version 85900 (0.0009) [2023-10-12 19:22:59,547][62634] Updated weights for policy 0, policy_version 85910 (0.0007) [2023-10-12 19:22:59,926][62634] Updated weights for policy 0, policy_version 85920 (0.0009) [2023-10-12 19:23:02,419][62635] Updated weights for policy 1, policy_version 85960 (0.0010) [2023-10-12 19:23:02,788][62635] Updated weights for policy 1, policy_version 85970 (0.0011) [2023-10-12 19:23:03,156][62635] Updated weights for policy 1, policy_version 85980 (0.0008) [2023-10-12 19:23:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176029696. Throughput: 0: 1682.4, 1: 1658.6. Samples: 44017432. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:23:03,435][61643] Avg episode reward: [(0, '23.620'), (1, '9.850')] [2023-10-12 19:23:03,869][62634] Updated weights for policy 0, policy_version 85930 (0.0007) [2023-10-12 19:23:04,250][62634] Updated weights for policy 0, policy_version 85940 (0.0008) [2023-10-12 19:23:04,631][62634] Updated weights for policy 0, policy_version 85950 (0.0010) [2023-10-12 19:23:07,275][62635] Updated weights for policy 1, policy_version 85990 (0.0011) [2023-10-12 19:23:07,632][62635] Updated weights for policy 1, policy_version 86000 (0.0011) [2023-10-12 19:23:08,009][62635] Updated weights for policy 1, policy_version 86010 (0.0008) [2023-10-12 19:23:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176095232. Throughput: 0: 1683.3, 1: 1678.3. Samples: 44027446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:23:08,435][61643] Avg episode reward: [(0, '23.610'), (1, '9.620')] [2023-10-12 19:23:08,549][62634] Updated weights for policy 0, policy_version 85960 (0.0010) [2023-10-12 19:23:08,922][62634] Updated weights for policy 0, policy_version 85970 (0.0008) [2023-10-12 19:23:09,295][62634] Updated weights for policy 0, policy_version 85980 (0.0009) [2023-10-12 19:23:11,963][62635] Updated weights for policy 1, policy_version 86020 (0.0007) [2023-10-12 19:23:12,341][62635] Updated weights for policy 1, policy_version 86030 (0.0008) [2023-10-12 19:23:12,702][62635] Updated weights for policy 1, policy_version 86040 (0.0009) [2023-10-12 19:23:13,385][62634] Updated weights for policy 0, policy_version 85990 (0.0009) [2023-10-12 19:23:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176160768. Throughput: 0: 1686.5, 1: 1677.5. Samples: 44048210. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:23:13,435][61643] Avg episode reward: [(0, '23.570'), (1, '9.790')] [2023-10-12 19:23:13,769][62634] Updated weights for policy 0, policy_version 86000 (0.0009) [2023-10-12 19:23:14,142][62634] Updated weights for policy 0, policy_version 86010 (0.0009) [2023-10-12 19:23:16,882][62635] Updated weights for policy 1, policy_version 86050 (0.0011) [2023-10-12 19:23:17,249][62635] Updated weights for policy 1, policy_version 86060 (0.0010) [2023-10-12 19:23:17,609][62635] Updated weights for policy 1, policy_version 86070 (0.0009) [2023-10-12 19:23:17,983][62635] Updated weights for policy 1, policy_version 86080 (0.0008) [2023-10-12 19:23:18,175][62634] Updated weights for policy 0, policy_version 86020 (0.0008) [2023-10-12 19:23:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 176226304. Throughput: 0: 1684.2, 1: 1659.2. Samples: 44067766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:23:18,436][61643] Avg episode reward: [(0, '24.180'), (1, '9.790')] [2023-10-12 19:23:18,547][62634] Updated weights for policy 0, policy_version 86030 (0.0008) [2023-10-12 19:23:18,929][62634] Updated weights for policy 0, policy_version 86040 (0.0008) [2023-10-12 19:23:22,051][62635] Updated weights for policy 1, policy_version 86090 (0.0008) [2023-10-12 19:23:22,421][62635] Updated weights for policy 1, policy_version 86100 (0.0009) [2023-10-12 19:23:22,793][62635] Updated weights for policy 1, policy_version 86110 (0.0008) [2023-10-12 19:23:22,926][62634] Updated weights for policy 0, policy_version 86050 (0.0009) [2023-10-12 19:23:23,291][62634] Updated weights for policy 0, policy_version 86060 (0.0008) [2023-10-12 19:23:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176291840. Throughput: 0: 1684.4, 1: 1682.7. Samples: 44078262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:23:23,435][61643] Avg episode reward: [(0, '24.230'), (1, '9.700')] [2023-10-12 19:23:23,668][62634] Updated weights for policy 0, policy_version 86070 (0.0010) [2023-10-12 19:23:24,047][62634] Updated weights for policy 0, policy_version 86080 (0.0008) [2023-10-12 19:23:26,917][62635] Updated weights for policy 1, policy_version 86120 (0.0008) [2023-10-12 19:23:27,292][62635] Updated weights for policy 1, policy_version 86130 (0.0008) [2023-10-12 19:23:27,657][62635] Updated weights for policy 1, policy_version 86140 (0.0011) [2023-10-12 19:23:28,175][62634] Updated weights for policy 0, policy_version 86090 (0.0010) [2023-10-12 19:23:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176357376. Throughput: 0: 1691.4, 1: 1670.0. Samples: 44098696. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:28,435][61643] Avg episode reward: [(0, '24.170'), (1, '9.880')] [2023-10-12 19:23:28,546][62634] Updated weights for policy 0, policy_version 86100 (0.0010) [2023-10-12 19:23:28,927][62634] Updated weights for policy 0, policy_version 86110 (0.0010) [2023-10-12 19:23:31,570][62635] Updated weights for policy 1, policy_version 86150 (0.0008) [2023-10-12 19:23:31,936][62635] Updated weights for policy 1, policy_version 86160 (0.0009) [2023-10-12 19:23:32,304][62635] Updated weights for policy 1, policy_version 86170 (0.0010) [2023-10-12 19:23:32,887][62634] Updated weights for policy 0, policy_version 86120 (0.0009) [2023-10-12 19:23:33,257][62634] Updated weights for policy 0, policy_version 86130 (0.0007) [2023-10-12 19:23:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176422912. Throughput: 0: 1681.4, 1: 1671.7. Samples: 44118394. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:33,435][61643] Avg episode reward: [(0, '24.800'), (1, '9.880')] [2023-10-12 19:23:33,631][62634] Updated weights for policy 0, policy_version 86140 (0.0009) [2023-10-12 19:23:36,494][62635] Updated weights for policy 1, policy_version 86180 (0.0009) [2023-10-12 19:23:36,867][62635] Updated weights for policy 1, policy_version 86190 (0.0010) [2023-10-12 19:23:37,239][62635] Updated weights for policy 1, policy_version 86200 (0.0012) [2023-10-12 19:23:37,718][62634] Updated weights for policy 0, policy_version 86150 (0.0008) [2023-10-12 19:23:38,103][62634] Updated weights for policy 0, policy_version 86160 (0.0010) [2023-10-12 19:23:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176488448. Throughput: 0: 1690.5, 1: 1682.6. Samples: 44128836. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:38,435][61643] Avg episode reward: [(0, '24.730'), (1, '9.850')] [2023-10-12 19:23:38,480][62634] Updated weights for policy 0, policy_version 86170 (0.0011) [2023-10-12 19:23:41,206][62635] Updated weights for policy 1, policy_version 86210 (0.0009) [2023-10-12 19:23:41,569][62635] Updated weights for policy 1, policy_version 86220 (0.0008) [2023-10-12 19:23:41,935][62635] Updated weights for policy 1, policy_version 86230 (0.0008) [2023-10-12 19:23:42,294][62635] Updated weights for policy 1, policy_version 86240 (0.0007) [2023-10-12 19:23:42,559][62634] Updated weights for policy 0, policy_version 86180 (0.0011) [2023-10-12 19:23:42,942][62634] Updated weights for policy 0, policy_version 86190 (0.0010) [2023-10-12 19:23:43,325][62634] Updated weights for policy 0, policy_version 86200 (0.0008) [2023-10-12 19:23:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176553984. Throughput: 0: 1690.7, 1: 1665.2. Samples: 44148598. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:43,435][61643] Avg episode reward: [(0, '24.720'), (1, '9.970')] [2023-10-12 19:23:46,433][62635] Updated weights for policy 1, policy_version 86250 (0.0010) [2023-10-12 19:23:46,801][62635] Updated weights for policy 1, policy_version 86260 (0.0007) [2023-10-12 19:23:47,166][62635] Updated weights for policy 1, policy_version 86270 (0.0008) [2023-10-12 19:23:47,376][62634] Updated weights for policy 0, policy_version 86210 (0.0008) [2023-10-12 19:23:47,754][62634] Updated weights for policy 0, policy_version 86220 (0.0008) [2023-10-12 19:23:48,143][62634] Updated weights for policy 0, policy_version 86230 (0.0008) [2023-10-12 19:23:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176619520. Throughput: 0: 1673.7, 1: 1674.4. Samples: 44168094. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:48,436][61643] Avg episode reward: [(0, '24.610'), (1, '9.850')] [2023-10-12 19:23:48,518][62634] Updated weights for policy 0, policy_version 86240 (0.0008) [2023-10-12 19:23:51,302][62635] Updated weights for policy 1, policy_version 86280 (0.0007) [2023-10-12 19:23:51,680][62635] Updated weights for policy 1, policy_version 86290 (0.0007) [2023-10-12 19:23:52,038][62635] Updated weights for policy 1, policy_version 86300 (0.0007) [2023-10-12 19:23:52,595][62634] Updated weights for policy 0, policy_version 86250 (0.0007) [2023-10-12 19:23:52,965][62634] Updated weights for policy 0, policy_version 86260 (0.0007) [2023-10-12 19:23:53,339][62634] Updated weights for policy 0, policy_version 86270 (0.0007) [2023-10-12 19:23:53,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176717824. Throughput: 0: 1688.5, 1: 1679.6. Samples: 44179012. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:53,435][61643] Avg episode reward: [(0, '24.490'), (1, '9.950')] [2023-10-12 19:23:56,039][62635] Updated weights for policy 1, policy_version 86310 (0.0007) [2023-10-12 19:23:56,419][62635] Updated weights for policy 1, policy_version 86320 (0.0009) [2023-10-12 19:23:56,783][62635] Updated weights for policy 1, policy_version 86330 (0.0009) [2023-10-12 19:23:57,465][62634] Updated weights for policy 0, policy_version 86280 (0.0009) [2023-10-12 19:23:57,848][62634] Updated weights for policy 0, policy_version 86290 (0.0009) [2023-10-12 19:23:58,232][62634] Updated weights for policy 0, policy_version 86300 (0.0008) [2023-10-12 19:23:58,435][61643] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 176783360. Throughput: 0: 1686.3, 1: 1659.1. Samples: 44198754. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:23:58,435][61643] Avg episode reward: [(0, '24.630'), (1, '10.030')] [2023-10-12 19:24:00,886][62635] Updated weights for policy 1, policy_version 86340 (0.0008) [2023-10-12 19:24:01,248][62635] Updated weights for policy 1, policy_version 86350 (0.0008) [2023-10-12 19:24:01,619][62635] Updated weights for policy 1, policy_version 86360 (0.0007) [2023-10-12 19:24:02,278][62634] Updated weights for policy 0, policy_version 86310 (0.0007) [2023-10-12 19:24:02,663][62634] Updated weights for policy 0, policy_version 86320 (0.0007) [2023-10-12 19:24:03,038][62634] Updated weights for policy 0, policy_version 86330 (0.0009) [2023-10-12 19:24:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 176848896. Throughput: 0: 1668.7, 1: 1686.5. Samples: 44218752. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:24:03,436][61643] Avg episode reward: [(0, '24.660'), (1, '9.840')] [2023-10-12 19:24:05,617][62635] Updated weights for policy 1, policy_version 86370 (0.0007) [2023-10-12 19:24:05,998][62635] Updated weights for policy 1, policy_version 86380 (0.0007) [2023-10-12 19:24:06,367][62635] Updated weights for policy 1, policy_version 86390 (0.0007) [2023-10-12 19:24:06,729][62635] Updated weights for policy 1, policy_version 86400 (0.0007) [2023-10-12 19:24:06,967][62634] Updated weights for policy 0, policy_version 86340 (0.0009) [2023-10-12 19:24:07,347][62634] Updated weights for policy 0, policy_version 86350 (0.0009) [2023-10-12 19:24:07,719][62634] Updated weights for policy 0, policy_version 86360 (0.0008) [2023-10-12 19:24:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 176914432. Throughput: 0: 1687.8, 1: 1673.6. Samples: 44229526. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:24:08,436][61643] Avg episode reward: [(0, '24.460'), (1, '10.020')] [2023-10-12 19:24:10,687][62635] Updated weights for policy 1, policy_version 86410 (0.0008) [2023-10-12 19:24:11,053][62635] Updated weights for policy 1, policy_version 86420 (0.0007) [2023-10-12 19:24:11,417][62635] Updated weights for policy 1, policy_version 86430 (0.0007) [2023-10-12 19:24:11,851][62634] Updated weights for policy 0, policy_version 86370 (0.0010) [2023-10-12 19:24:12,228][62634] Updated weights for policy 0, policy_version 86380 (0.0008) [2023-10-12 19:24:12,602][62634] Updated weights for policy 0, policy_version 86390 (0.0009) [2023-10-12 19:24:12,974][62634] Updated weights for policy 0, policy_version 86400 (0.0011) [2023-10-12 19:24:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 176979968. Throughput: 0: 1673.2, 1: 1670.0. Samples: 44249140. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:24:13,435][61643] Avg episode reward: [(0, '24.380'), (1, '9.990')] [2023-10-12 19:24:15,468][62635] Updated weights for policy 1, policy_version 86440 (0.0008) [2023-10-12 19:24:15,845][62635] Updated weights for policy 1, policy_version 86450 (0.0010) [2023-10-12 19:24:16,207][62635] Updated weights for policy 1, policy_version 86460 (0.0008) [2023-10-12 19:24:17,126][62634] Updated weights for policy 0, policy_version 86410 (0.0009) [2023-10-12 19:24:17,497][62634] Updated weights for policy 0, policy_version 86420 (0.0009) [2023-10-12 19:24:17,881][62634] Updated weights for policy 0, policy_version 86430 (0.0008) [2023-10-12 19:24:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 177045504. Throughput: 0: 1658.0, 1: 1689.1. Samples: 44269010. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:18,436][61643] Avg episode reward: [(0, '24.410'), (1, '9.900')] [2023-10-12 19:24:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000086464_88539136.pth... [2023-10-12 19:24:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000086432_88506368.pth... 
[2023-10-12 19:24:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000084896_86933504.pth [2023-10-12 19:24:18,483][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000084864_86900736.pth [2023-10-12 19:24:18,486][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000086464_88539136.pth [2023-10-12 19:24:18,487][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000086432_88506368.pth [2023-10-12 19:24:20,305][62635] Updated weights for policy 1, policy_version 86470 (0.0008) [2023-10-12 19:24:20,679][62635] Updated weights for policy 1, policy_version 86480 (0.0009) [2023-10-12 19:24:21,048][62635] Updated weights for policy 1, policy_version 86490 (0.0007) [2023-10-12 19:24:21,986][62634] Updated weights for policy 0, policy_version 86440 (0.0009) [2023-10-12 19:24:22,362][62634] Updated weights for policy 0, policy_version 86450 (0.0010) [2023-10-12 19:24:22,743][62634] Updated weights for policy 0, policy_version 86460 (0.0011) [2023-10-12 19:24:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 177111040. Throughput: 0: 1680.9, 1: 1669.9. Samples: 44279620. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:23,435][61643] Avg episode reward: [(0, '24.000'), (1, '10.050')] [2023-10-12 19:24:25,019][62635] Updated weights for policy 1, policy_version 86500 (0.0009) [2023-10-12 19:24:25,391][62635] Updated weights for policy 1, policy_version 86510 (0.0010) [2023-10-12 19:24:25,752][62635] Updated weights for policy 1, policy_version 86520 (0.0008) [2023-10-12 19:24:26,732][62634] Updated weights for policy 0, policy_version 86470 (0.0008) [2023-10-12 19:24:27,101][62634] Updated weights for policy 0, policy_version 86480 (0.0008) [2023-10-12 19:24:27,477][62634] Updated weights for policy 0, policy_version 86490 (0.0010) [2023-10-12 19:24:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177176576. Throughput: 0: 1671.0, 1: 1686.0. Samples: 44299664. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:28,435][61643] Avg episode reward: [(0, '23.900'), (1, '10.140')] [2023-10-12 19:24:29,723][62635] Updated weights for policy 1, policy_version 86530 (0.0008) [2023-10-12 19:24:30,089][62635] Updated weights for policy 1, policy_version 86540 (0.0010) [2023-10-12 19:24:30,460][62635] Updated weights for policy 1, policy_version 86550 (0.0009) [2023-10-12 19:24:30,828][62635] Updated weights for policy 1, policy_version 86560 (0.0009) [2023-10-12 19:24:31,566][62634] Updated weights for policy 0, policy_version 86500 (0.0009) [2023-10-12 19:24:31,933][62634] Updated weights for policy 0, policy_version 86510 (0.0008) [2023-10-12 19:24:32,320][62634] Updated weights for policy 0, policy_version 86520 (0.0009) [2023-10-12 19:24:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177242112. Throughput: 0: 1668.8, 1: 1696.9. Samples: 44319554. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:33,436][61643] Avg episode reward: [(0, '24.050'), (1, '9.970')] [2023-10-12 19:24:34,975][62635] Updated weights for policy 1, policy_version 86570 (0.0009) [2023-10-12 19:24:35,340][62635] Updated weights for policy 1, policy_version 86580 (0.0007) [2023-10-12 19:24:35,705][62635] Updated weights for policy 1, policy_version 86590 (0.0008) [2023-10-12 19:24:36,143][62634] Updated weights for policy 0, policy_version 86530 (0.0009) [2023-10-12 19:24:36,517][62634] Updated weights for policy 0, policy_version 86540 (0.0009) [2023-10-12 19:24:36,892][62634] Updated weights for policy 0, policy_version 86550 (0.0009) [2023-10-12 19:24:37,265][62634] Updated weights for policy 0, policy_version 86560 (0.0010) [2023-10-12 19:24:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177307648. Throughput: 0: 1689.3, 1: 1669.6. Samples: 44330166. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:38,436][61643] Avg episode reward: [(0, '23.910'), (1, '10.060')] [2023-10-12 19:24:39,849][62635] Updated weights for policy 1, policy_version 86600 (0.0009) [2023-10-12 19:24:40,219][62635] Updated weights for policy 1, policy_version 86610 (0.0010) [2023-10-12 19:24:40,587][62635] Updated weights for policy 1, policy_version 86620 (0.0010) [2023-10-12 19:24:41,274][62634] Updated weights for policy 0, policy_version 86570 (0.0008) [2023-10-12 19:24:41,640][62634] Updated weights for policy 0, policy_version 86580 (0.0008) [2023-10-12 19:24:42,021][62634] Updated weights for policy 0, policy_version 86590 (0.0009) [2023-10-12 19:24:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 177373184. Throughput: 0: 1664.4, 1: 1694.5. Samples: 44349908. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:43,435][61643] Avg episode reward: [(0, '23.630'), (1, '10.150')] [2023-10-12 19:24:44,590][62635] Updated weights for policy 1, policy_version 86630 (0.0007) [2023-10-12 19:24:44,961][62635] Updated weights for policy 1, policy_version 86640 (0.0007) [2023-10-12 19:24:45,339][62635] Updated weights for policy 1, policy_version 86650 (0.0009) [2023-10-12 19:24:46,172][62634] Updated weights for policy 0, policy_version 86600 (0.0009) [2023-10-12 19:24:46,548][62634] Updated weights for policy 0, policy_version 86610 (0.0007) [2023-10-12 19:24:46,921][62634] Updated weights for policy 0, policy_version 86620 (0.0009) [2023-10-12 19:24:48,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 177438720. Throughput: 0: 1673.9, 1: 1695.0. Samples: 44370352. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:48,435][61643] Avg episode reward: [(0, '23.580'), (1, '9.970')] [2023-10-12 19:24:49,440][62635] Updated weights for policy 1, policy_version 86660 (0.0009) [2023-10-12 19:24:49,810][62635] Updated weights for policy 1, policy_version 86670 (0.0008) [2023-10-12 19:24:50,187][62635] Updated weights for policy 1, policy_version 86680 (0.0008) [2023-10-12 19:24:51,085][62634] Updated weights for policy 0, policy_version 86630 (0.0008) [2023-10-12 19:24:51,460][62634] Updated weights for policy 0, policy_version 86640 (0.0010) [2023-10-12 19:24:51,832][62634] Updated weights for policy 0, policy_version 86650 (0.0011) [2023-10-12 19:24:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177504256. Throughput: 0: 1678.8, 1: 1675.5. Samples: 44380468. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:53,435][61643] Avg episode reward: [(0, '23.630'), (1, '9.970')] [2023-10-12 19:24:54,105][62635] Updated weights for policy 1, policy_version 86690 (0.0007) [2023-10-12 19:24:54,477][62635] Updated weights for policy 1, policy_version 86700 (0.0007) [2023-10-12 19:24:54,847][62635] Updated weights for policy 1, policy_version 86710 (0.0007) [2023-10-12 19:24:55,216][62635] Updated weights for policy 1, policy_version 86720 (0.0007) [2023-10-12 19:24:55,939][62634] Updated weights for policy 0, policy_version 86660 (0.0010) [2023-10-12 19:24:56,315][62634] Updated weights for policy 0, policy_version 86670 (0.0009) [2023-10-12 19:24:56,692][62634] Updated weights for policy 0, policy_version 86680 (0.0007) [2023-10-12 19:24:58,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177569792. Throughput: 0: 1660.3, 1: 1692.7. Samples: 44400026. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:24:58,436][61643] Avg episode reward: [(0, '23.630'), (1, '9.830')] [2023-10-12 19:24:59,335][62635] Updated weights for policy 1, policy_version 86730 (0.0007) [2023-10-12 19:24:59,703][62635] Updated weights for policy 1, policy_version 86740 (0.0008) [2023-10-12 19:25:00,073][62635] Updated weights for policy 1, policy_version 86750 (0.0008) [2023-10-12 19:25:00,690][62634] Updated weights for policy 0, policy_version 86690 (0.0008) [2023-10-12 19:25:01,065][62634] Updated weights for policy 0, policy_version 86700 (0.0008) [2023-10-12 19:25:01,441][62634] Updated weights for policy 0, policy_version 86710 (0.0010) [2023-10-12 19:25:01,816][62634] Updated weights for policy 0, policy_version 86720 (0.0011) [2023-10-12 19:25:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177635328. Throughput: 0: 1684.0, 1: 1689.6. Samples: 44420824. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:25:03,435][61643] Avg episode reward: [(0, '23.430'), (1, '9.910')] [2023-10-12 19:25:04,182][62635] Updated weights for policy 1, policy_version 86760 (0.0008) [2023-10-12 19:25:04,550][62635] Updated weights for policy 1, policy_version 86770 (0.0009) [2023-10-12 19:25:04,914][62635] Updated weights for policy 1, policy_version 86780 (0.0007) [2023-10-12 19:25:05,773][62634] Updated weights for policy 0, policy_version 86730 (0.0009) [2023-10-12 19:25:06,148][62634] Updated weights for policy 0, policy_version 86740 (0.0009) [2023-10-12 19:25:06,521][62634] Updated weights for policy 0, policy_version 86750 (0.0009) [2023-10-12 19:25:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177700864. Throughput: 0: 1675.3, 1: 1681.4. Samples: 44430674. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-12 19:25:08,436][61643] Avg episode reward: [(0, '23.520'), (1, '9.900')] [2023-10-12 19:25:08,797][62635] Updated weights for policy 1, policy_version 86790 (0.0010) [2023-10-12 19:25:09,175][62635] Updated weights for policy 1, policy_version 86800 (0.0010) [2023-10-12 19:25:09,549][62635] Updated weights for policy 1, policy_version 86810 (0.0008) [2023-10-12 19:25:10,689][62634] Updated weights for policy 0, policy_version 86760 (0.0009) [2023-10-12 19:25:11,062][62634] Updated weights for policy 0, policy_version 86770 (0.0007) [2023-10-12 19:25:11,442][62634] Updated weights for policy 0, policy_version 86780 (0.0007) [2023-10-12 19:25:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177766400. Throughput: 0: 1663.3, 1: 1690.0. Samples: 44450562. 
Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:13,435][61643] Avg episode reward: [(0, '23.380'), (1, '9.880')] [2023-10-12 19:25:13,492][62635] Updated weights for policy 1, policy_version 86820 (0.0010) [2023-10-12 19:25:13,850][62635] Updated weights for policy 1, policy_version 86830 (0.0010) [2023-10-12 19:25:14,215][62635] Updated weights for policy 1, policy_version 86840 (0.0007) [2023-10-12 19:25:15,584][62634] Updated weights for policy 0, policy_version 86790 (0.0007) [2023-10-12 19:25:15,958][62634] Updated weights for policy 0, policy_version 86800 (0.0009) [2023-10-12 19:25:16,340][62634] Updated weights for policy 0, policy_version 86810 (0.0008) [2023-10-12 19:25:18,271][62635] Updated weights for policy 1, policy_version 86850 (0.0007) [2023-10-12 19:25:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 177831936. Throughput: 0: 1681.1, 1: 1691.8. Samples: 44471336. Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:18,436][61643] Avg episode reward: [(0, '23.310'), (1, '9.720')] [2023-10-12 19:25:18,635][62635] Updated weights for policy 1, policy_version 86860 (0.0008) [2023-10-12 19:25:19,001][62635] Updated weights for policy 1, policy_version 86870 (0.0008) [2023-10-12 19:25:19,379][62635] Updated weights for policy 1, policy_version 86880 (0.0010) [2023-10-12 19:25:20,296][62634] Updated weights for policy 0, policy_version 86820 (0.0009) [2023-10-12 19:25:20,668][62634] Updated weights for policy 0, policy_version 86830 (0.0009) [2023-10-12 19:25:21,046][62634] Updated weights for policy 0, policy_version 86840 (0.0008) [2023-10-12 19:25:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177897472. Throughput: 0: 1657.7, 1: 1692.7. Samples: 44480936. 
Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:23,435][61643] Avg episode reward: [(0, '23.200'), (1, '9.800')] [2023-10-12 19:25:23,551][62635] Updated weights for policy 1, policy_version 86890 (0.0009) [2023-10-12 19:25:23,912][62635] Updated weights for policy 1, policy_version 86900 (0.0009) [2023-10-12 19:25:24,290][62635] Updated weights for policy 1, policy_version 86910 (0.0008) [2023-10-12 19:25:25,100][62634] Updated weights for policy 0, policy_version 86850 (0.0008) [2023-10-12 19:25:25,480][62634] Updated weights for policy 0, policy_version 86860 (0.0007) [2023-10-12 19:25:25,856][62634] Updated weights for policy 0, policy_version 86870 (0.0007) [2023-10-12 19:25:26,227][62634] Updated weights for policy 0, policy_version 86880 (0.0008) [2023-10-12 19:25:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 177963008. Throughput: 0: 1667.5, 1: 1692.3. Samples: 44501098. Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:28,435][61643] Avg episode reward: [(0, '23.400'), (1, '9.910')] [2023-10-12 19:25:28,436][62635] Updated weights for policy 1, policy_version 86920 (0.0009) [2023-10-12 19:25:28,801][62635] Updated weights for policy 1, policy_version 86930 (0.0008) [2023-10-12 19:25:29,178][62635] Updated weights for policy 1, policy_version 86940 (0.0008) [2023-10-12 19:25:30,306][62634] Updated weights for policy 0, policy_version 86890 (0.0009) [2023-10-12 19:25:30,678][62634] Updated weights for policy 0, policy_version 86900 (0.0008) [2023-10-12 19:25:31,056][62634] Updated weights for policy 0, policy_version 86910 (0.0009) [2023-10-12 19:25:33,320][62635] Updated weights for policy 1, policy_version 86950 (0.0009) [2023-10-12 19:25:33,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 178028544. Throughput: 0: 1678.8, 1: 1686.9. Samples: 44521808. 
Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:33,436][61643] Avg episode reward: [(0, '23.370'), (1, '9.530')] [2023-10-12 19:25:33,681][62635] Updated weights for policy 1, policy_version 86960 (0.0009) [2023-10-12 19:25:34,050][62635] Updated weights for policy 1, policy_version 86970 (0.0008) [2023-10-12 19:25:35,159][62634] Updated weights for policy 0, policy_version 86920 (0.0009) [2023-10-12 19:25:35,531][62634] Updated weights for policy 0, policy_version 86930 (0.0007) [2023-10-12 19:25:35,918][62634] Updated weights for policy 0, policy_version 86940 (0.0007) [2023-10-12 19:25:38,135][62635] Updated weights for policy 1, policy_version 86980 (0.0008) [2023-10-12 19:25:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178094080. Throughput: 0: 1658.9, 1: 1692.3. Samples: 44531274. Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:38,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.610')] [2023-10-12 19:25:38,507][62635] Updated weights for policy 1, policy_version 86990 (0.0009) [2023-10-12 19:25:38,870][62635] Updated weights for policy 1, policy_version 87000 (0.0011) [2023-10-12 19:25:40,029][62634] Updated weights for policy 0, policy_version 86950 (0.0009) [2023-10-12 19:25:40,395][62634] Updated weights for policy 0, policy_version 86960 (0.0007) [2023-10-12 19:25:40,777][62634] Updated weights for policy 0, policy_version 86970 (0.0008) [2023-10-12 19:25:42,963][62635] Updated weights for policy 1, policy_version 87010 (0.0009) [2023-10-12 19:25:43,324][62635] Updated weights for policy 1, policy_version 87020 (0.0008) [2023-10-12 19:25:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178159616. Throughput: 0: 1681.7, 1: 1689.8. Samples: 44551746. 
Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:43,435][61643] Avg episode reward: [(0, '23.840'), (1, '9.810')] [2023-10-12 19:25:43,690][62635] Updated weights for policy 1, policy_version 87030 (0.0009) [2023-10-12 19:25:44,060][62635] Updated weights for policy 1, policy_version 87040 (0.0008) [2023-10-12 19:25:44,750][62634] Updated weights for policy 0, policy_version 86980 (0.0007) [2023-10-12 19:25:45,138][62634] Updated weights for policy 0, policy_version 86990 (0.0007) [2023-10-12 19:25:45,511][62634] Updated weights for policy 0, policy_version 87000 (0.0007) [2023-10-12 19:25:48,052][62635] Updated weights for policy 1, policy_version 87050 (0.0010) [2023-10-12 19:25:48,419][62635] Updated weights for policy 1, policy_version 87060 (0.0009) [2023-10-12 19:25:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178225152. Throughput: 0: 1681.9, 1: 1678.0. Samples: 44572016. Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:48,435][61643] Avg episode reward: [(0, '23.830'), (1, '9.790')] [2023-10-12 19:25:48,784][62635] Updated weights for policy 1, policy_version 87070 (0.0008) [2023-10-12 19:25:49,654][62634] Updated weights for policy 0, policy_version 87010 (0.0008) [2023-10-12 19:25:50,026][62634] Updated weights for policy 0, policy_version 87020 (0.0007) [2023-10-12 19:25:50,397][62634] Updated weights for policy 0, policy_version 87030 (0.0008) [2023-10-12 19:25:50,770][62634] Updated weights for policy 0, policy_version 87040 (0.0007) [2023-10-12 19:25:52,909][62635] Updated weights for policy 1, policy_version 87080 (0.0009) [2023-10-12 19:25:53,280][62635] Updated weights for policy 1, policy_version 87090 (0.0009) [2023-10-12 19:25:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178290688. Throughput: 0: 1659.6, 1: 1688.7. Samples: 44581348. 
Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:53,436][61643] Avg episode reward: [(0, '24.170'), (1, '9.800')] [2023-10-12 19:25:53,652][62635] Updated weights for policy 1, policy_version 87100 (0.0009) [2023-10-12 19:25:54,928][62634] Updated weights for policy 0, policy_version 87050 (0.0007) [2023-10-12 19:25:55,305][62634] Updated weights for policy 0, policy_version 87060 (0.0009) [2023-10-12 19:25:55,675][62634] Updated weights for policy 0, policy_version 87070 (0.0008) [2023-10-12 19:25:57,641][62635] Updated weights for policy 1, policy_version 87110 (0.0010) [2023-10-12 19:25:58,003][62635] Updated weights for policy 1, policy_version 87120 (0.0008) [2023-10-12 19:25:58,368][62635] Updated weights for policy 1, policy_version 87130 (0.0009) [2023-10-12 19:25:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178356224. Throughput: 0: 1683.6, 1: 1680.0. Samples: 44601924. Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:25:58,436][61643] Avg episode reward: [(0, '23.790'), (1, '9.860')] [2023-10-12 19:25:59,583][62634] Updated weights for policy 0, policy_version 87080 (0.0009) [2023-10-12 19:25:59,960][62634] Updated weights for policy 0, policy_version 87090 (0.0009) [2023-10-12 19:26:00,340][62634] Updated weights for policy 0, policy_version 87100 (0.0008) [2023-10-12 19:26:02,544][62635] Updated weights for policy 1, policy_version 87140 (0.0010) [2023-10-12 19:26:02,910][62635] Updated weights for policy 1, policy_version 87150 (0.0008) [2023-10-12 19:26:03,283][62635] Updated weights for policy 1, policy_version 87160 (0.0007) [2023-10-12 19:26:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178421760. Throughput: 0: 1684.7, 1: 1663.9. Samples: 44622022. 
Policy #0 lag: (min: 7.0, avg: 8.7, max: 36.0) [2023-10-12 19:26:03,435][61643] Avg episode reward: [(0, '24.050'), (1, '9.900')] [2023-10-12 19:26:04,475][62634] Updated weights for policy 0, policy_version 87110 (0.0009) [2023-10-12 19:26:04,863][62634] Updated weights for policy 0, policy_version 87120 (0.0008) [2023-10-12 19:26:05,228][62634] Updated weights for policy 0, policy_version 87130 (0.0009) [2023-10-12 19:26:07,383][62635] Updated weights for policy 1, policy_version 87170 (0.0009) [2023-10-12 19:26:07,752][62635] Updated weights for policy 1, policy_version 87180 (0.0007) [2023-10-12 19:26:08,114][62635] Updated weights for policy 1, policy_version 87190 (0.0007) [2023-10-12 19:26:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178487296. Throughput: 0: 1672.4, 1: 1679.4. Samples: 44631770. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:08,435][61643] Avg episode reward: [(0, '24.150'), (1, '9.800')] [2023-10-12 19:26:08,484][62635] Updated weights for policy 1, policy_version 87200 (0.0009) [2023-10-12 19:26:09,226][62634] Updated weights for policy 0, policy_version 87140 (0.0007) [2023-10-12 19:26:09,612][62634] Updated weights for policy 0, policy_version 87150 (0.0010) [2023-10-12 19:26:09,986][62634] Updated weights for policy 0, policy_version 87160 (0.0009) [2023-10-12 19:26:12,519][62635] Updated weights for policy 1, policy_version 87210 (0.0008) [2023-10-12 19:26:12,892][62635] Updated weights for policy 1, policy_version 87220 (0.0008) [2023-10-12 19:26:13,258][62635] Updated weights for policy 1, policy_version 87230 (0.0009) [2023-10-12 19:26:13,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178585600. Throughput: 0: 1683.0, 1: 1678.6. Samples: 44652368. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:13,435][61643] Avg episode reward: [(0, '24.050'), (1, '9.910')] [2023-10-12 19:26:14,078][62634] Updated weights for policy 0, policy_version 87170 (0.0009) [2023-10-12 19:26:14,448][62634] Updated weights for policy 0, policy_version 87180 (0.0008) [2023-10-12 19:26:14,824][62634] Updated weights for policy 0, policy_version 87190 (0.0008) [2023-10-12 19:26:15,205][62634] Updated weights for policy 0, policy_version 87200 (0.0007) [2023-10-12 19:26:17,364][62635] Updated weights for policy 1, policy_version 87240 (0.0007) [2023-10-12 19:26:17,731][62635] Updated weights for policy 1, policy_version 87250 (0.0008) [2023-10-12 19:26:18,105][62635] Updated weights for policy 1, policy_version 87260 (0.0008) [2023-10-12 19:26:18,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 178651136. Throughput: 0: 1680.7, 1: 1663.5. Samples: 44672294. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:18,435][61643] Avg episode reward: [(0, '24.290'), (1, '9.700')] [2023-10-12 19:26:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000087264_89358336.pth... [2023-10-12 19:26:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000087200_89292800.pth... 
[2023-10-12 19:26:18,474][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000085632_87687168.pth [2023-10-12 19:26:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000085696_87752704.pth [2023-10-12 19:26:19,254][62634] Updated weights for policy 0, policy_version 87210 (0.0008) [2023-10-12 19:26:19,632][62634] Updated weights for policy 0, policy_version 87220 (0.0007) [2023-10-12 19:26:20,005][62634] Updated weights for policy 0, policy_version 87230 (0.0010) [2023-10-12 19:26:22,135][62635] Updated weights for policy 1, policy_version 87270 (0.0009) [2023-10-12 19:26:22,503][62635] Updated weights for policy 1, policy_version 87280 (0.0009) [2023-10-12 19:26:22,872][62635] Updated weights for policy 1, policy_version 87290 (0.0009) [2023-10-12 19:26:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178716672. Throughput: 0: 1675.3, 1: 1682.2. Samples: 44682362. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:23,436][61643] Avg episode reward: [(0, '24.340'), (1, '9.700')] [2023-10-12 19:26:24,100][62634] Updated weights for policy 0, policy_version 87240 (0.0009) [2023-10-12 19:26:24,483][62634] Updated weights for policy 0, policy_version 87250 (0.0007) [2023-10-12 19:26:24,858][62634] Updated weights for policy 0, policy_version 87260 (0.0008) [2023-10-12 19:26:26,823][62635] Updated weights for policy 1, policy_version 87300 (0.0009) [2023-10-12 19:26:27,200][62635] Updated weights for policy 1, policy_version 87310 (0.0008) [2023-10-12 19:26:27,561][62635] Updated weights for policy 1, policy_version 87320 (0.0009) [2023-10-12 19:26:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178782208. Throughput: 0: 1682.5, 1: 1681.1. Samples: 44703110. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:28,435][61643] Avg episode reward: [(0, '24.530'), (1, '9.890')] [2023-10-12 19:26:28,848][62634] Updated weights for policy 0, policy_version 87270 (0.0009) [2023-10-12 19:26:29,212][62634] Updated weights for policy 0, policy_version 87280 (0.0009) [2023-10-12 19:26:29,592][62634] Updated weights for policy 0, policy_version 87290 (0.0009) [2023-10-12 19:26:31,629][62635] Updated weights for policy 1, policy_version 87330 (0.0007) [2023-10-12 19:26:32,005][62635] Updated weights for policy 1, policy_version 87340 (0.0008) [2023-10-12 19:26:32,373][62635] Updated weights for policy 1, policy_version 87350 (0.0009) [2023-10-12 19:26:32,744][62635] Updated weights for policy 1, policy_version 87360 (0.0009) [2023-10-12 19:26:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 178847744. Throughput: 0: 1687.2, 1: 1670.8. Samples: 44723124. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:33,436][61643] Avg episode reward: [(0, '24.640'), (1, '9.790')] [2023-10-12 19:26:33,650][62634] Updated weights for policy 0, policy_version 87300 (0.0008) [2023-10-12 19:26:34,026][62634] Updated weights for policy 0, policy_version 87310 (0.0010) [2023-10-12 19:26:34,400][62634] Updated weights for policy 0, policy_version 87320 (0.0009) [2023-10-12 19:26:36,688][62635] Updated weights for policy 1, policy_version 87370 (0.0009) [2023-10-12 19:26:37,058][62635] Updated weights for policy 1, policy_version 87380 (0.0010) [2023-10-12 19:26:37,422][62635] Updated weights for policy 1, policy_version 87390 (0.0010) [2023-10-12 19:26:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178913280. Throughput: 0: 1686.1, 1: 1691.3. Samples: 44733334. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:38,435][61643] Avg episode reward: [(0, '24.850'), (1, '9.840')] [2023-10-12 19:26:38,584][62634] Updated weights for policy 0, policy_version 87330 (0.0009) [2023-10-12 19:26:38,961][62634] Updated weights for policy 0, policy_version 87340 (0.0007) [2023-10-12 19:26:39,335][62634] Updated weights for policy 0, policy_version 87350 (0.0007) [2023-10-12 19:26:39,709][62634] Updated weights for policy 0, policy_version 87360 (0.0009) [2023-10-12 19:26:41,758][62635] Updated weights for policy 1, policy_version 87400 (0.0009) [2023-10-12 19:26:42,137][62635] Updated weights for policy 1, policy_version 87410 (0.0011) [2023-10-12 19:26:42,502][62635] Updated weights for policy 1, policy_version 87420 (0.0010) [2023-10-12 19:26:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178978816. Throughput: 0: 1689.6, 1: 1676.0. Samples: 44753376. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:43,435][61643] Avg episode reward: [(0, '25.090'), (1, '10.010')] [2023-10-12 19:26:43,502][62634] Updated weights for policy 0, policy_version 87370 (0.0010) [2023-10-12 19:26:43,880][62634] Updated weights for policy 0, policy_version 87380 (0.0007) [2023-10-12 19:26:44,258][62634] Updated weights for policy 0, policy_version 87390 (0.0009) [2023-10-12 19:26:46,482][62635] Updated weights for policy 1, policy_version 87430 (0.0010) [2023-10-12 19:26:46,852][62635] Updated weights for policy 1, policy_version 87440 (0.0008) [2023-10-12 19:26:47,217][62635] Updated weights for policy 1, policy_version 87450 (0.0008) [2023-10-12 19:26:48,169][62634] Updated weights for policy 0, policy_version 87400 (0.0009) [2023-10-12 19:26:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179044352. Throughput: 0: 1690.0, 1: 1673.4. Samples: 44773374. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:48,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.950')] [2023-10-12 19:26:48,548][62634] Updated weights for policy 0, policy_version 87410 (0.0010) [2023-10-12 19:26:48,928][62634] Updated weights for policy 0, policy_version 87420 (0.0009) [2023-10-12 19:26:51,309][62635] Updated weights for policy 1, policy_version 87460 (0.0007) [2023-10-12 19:26:51,684][62635] Updated weights for policy 1, policy_version 87470 (0.0007) [2023-10-12 19:26:52,038][62635] Updated weights for policy 1, policy_version 87480 (0.0008) [2023-10-12 19:26:52,910][62634] Updated weights for policy 0, policy_version 87430 (0.0008) [2023-10-12 19:26:53,288][62634] Updated weights for policy 0, policy_version 87440 (0.0007) [2023-10-12 19:26:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 179109888. Throughput: 0: 1696.8, 1: 1684.5. Samples: 44783928. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:53,435][61643] Avg episode reward: [(0, '24.810'), (1, '9.770')] [2023-10-12 19:26:53,667][62634] Updated weights for policy 0, policy_version 87450 (0.0008) [2023-10-12 19:26:56,040][62635] Updated weights for policy 1, policy_version 87490 (0.0009) [2023-10-12 19:26:56,396][62635] Updated weights for policy 1, policy_version 87500 (0.0008) [2023-10-12 19:26:56,762][62635] Updated weights for policy 1, policy_version 87510 (0.0007) [2023-10-12 19:26:57,131][62635] Updated weights for policy 1, policy_version 87520 (0.0008) [2023-10-12 19:26:57,645][62634] Updated weights for policy 0, policy_version 87460 (0.0009) [2023-10-12 19:26:58,018][62634] Updated weights for policy 0, policy_version 87470 (0.0009) [2023-10-12 19:26:58,405][62634] Updated weights for policy 0, policy_version 87480 (0.0011) [2023-10-12 19:26:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179175424. Throughput: 0: 1701.4, 1: 1664.8. 
Samples: 44803846. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:26:58,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.900')] [2023-10-12 19:27:01,222][62635] Updated weights for policy 1, policy_version 87530 (0.0008) [2023-10-12 19:27:01,589][62635] Updated weights for policy 1, policy_version 87540 (0.0010) [2023-10-12 19:27:01,959][62635] Updated weights for policy 1, policy_version 87550 (0.0008) [2023-10-12 19:27:02,441][62634] Updated weights for policy 0, policy_version 87490 (0.0007) [2023-10-12 19:27:02,809][62634] Updated weights for policy 0, policy_version 87500 (0.0010) [2023-10-12 19:27:03,183][62634] Updated weights for policy 0, policy_version 87510 (0.0010) [2023-10-12 19:27:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 179240960. Throughput: 0: 1689.1, 1: 1681.4. Samples: 44823968. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-12 19:27:03,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.820')] [2023-10-12 19:27:03,565][62634] Updated weights for policy 0, policy_version 87520 (0.0009) [2023-10-12 19:27:06,131][62635] Updated weights for policy 1, policy_version 87560 (0.0007) [2023-10-12 19:27:06,488][62635] Updated weights for policy 1, policy_version 87570 (0.0008) [2023-10-12 19:27:06,850][62635] Updated weights for policy 1, policy_version 87580 (0.0008) [2023-10-12 19:27:07,588][62634] Updated weights for policy 0, policy_version 87530 (0.0011) [2023-10-12 19:27:07,966][62634] Updated weights for policy 0, policy_version 87540 (0.0010) [2023-10-12 19:27:08,353][62634] Updated weights for policy 0, policy_version 87550 (0.0011) [2023-10-12 19:27:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179339264. Throughput: 0: 1703.7, 1: 1685.6. Samples: 44834880. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:08,435][61643] Avg episode reward: [(0, '24.410'), (1, '9.730')] [2023-10-12 19:27:10,798][62635] Updated weights for policy 1, policy_version 87590 (0.0008) [2023-10-12 19:27:11,165][62635] Updated weights for policy 1, policy_version 87600 (0.0008) [2023-10-12 19:27:11,536][62635] Updated weights for policy 1, policy_version 87610 (0.0007) [2023-10-12 19:27:12,447][62634] Updated weights for policy 0, policy_version 87560 (0.0007) [2023-10-12 19:27:12,826][62634] Updated weights for policy 0, policy_version 87570 (0.0008) [2023-10-12 19:27:13,193][62634] Updated weights for policy 0, policy_version 87580 (0.0007) [2023-10-12 19:27:13,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 179404800. Throughput: 0: 1701.1, 1: 1670.0. Samples: 44854808. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:13,435][61643] Avg episode reward: [(0, '24.760'), (1, '9.920')] [2023-10-12 19:27:15,520][62635] Updated weights for policy 1, policy_version 87620 (0.0008) [2023-10-12 19:27:15,884][62635] Updated weights for policy 1, policy_version 87630 (0.0007) [2023-10-12 19:27:16,252][62635] Updated weights for policy 1, policy_version 87640 (0.0008) [2023-10-12 19:27:17,299][62634] Updated weights for policy 0, policy_version 87590 (0.0008) [2023-10-12 19:27:17,679][62634] Updated weights for policy 0, policy_version 87600 (0.0007) [2023-10-12 19:27:18,053][62634] Updated weights for policy 0, policy_version 87610 (0.0009) [2023-10-12 19:27:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179470336. Throughput: 0: 1676.2, 1: 1691.1. Samples: 44874652. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:18,436][61643] Avg episode reward: [(0, '24.980'), (1, '9.810')] [2023-10-12 19:27:20,238][62635] Updated weights for policy 1, policy_version 87650 (0.0008) [2023-10-12 19:27:20,612][62635] Updated weights for policy 1, policy_version 87660 (0.0008) [2023-10-12 19:27:20,985][62635] Updated weights for policy 1, policy_version 87670 (0.0007) [2023-10-12 19:27:21,360][62635] Updated weights for policy 1, policy_version 87680 (0.0007) [2023-10-12 19:27:22,194][62634] Updated weights for policy 0, policy_version 87620 (0.0008) [2023-10-12 19:27:22,572][62634] Updated weights for policy 0, policy_version 87630 (0.0008) [2023-10-12 19:27:22,945][62634] Updated weights for policy 0, policy_version 87640 (0.0009) [2023-10-12 19:27:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 179535872. Throughput: 0: 1698.4, 1: 1671.7. Samples: 44884992. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:23,435][61643] Avg episode reward: [(0, '24.510'), (1, '9.730')] [2023-10-12 19:27:25,520][62635] Updated weights for policy 1, policy_version 87690 (0.0007) [2023-10-12 19:27:25,895][62635] Updated weights for policy 1, policy_version 87700 (0.0007) [2023-10-12 19:27:26,261][62635] Updated weights for policy 1, policy_version 87710 (0.0009) [2023-10-12 19:27:26,847][62634] Updated weights for policy 0, policy_version 87650 (0.0009) [2023-10-12 19:27:27,229][62634] Updated weights for policy 0, policy_version 87660 (0.0008) [2023-10-12 19:27:27,605][62634] Updated weights for policy 0, policy_version 87670 (0.0009) [2023-10-12 19:27:27,975][62634] Updated weights for policy 0, policy_version 87680 (0.0009) [2023-10-12 19:27:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179601408. Throughput: 0: 1689.1, 1: 1684.9. Samples: 44905206. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:28,435][61643] Avg episode reward: [(0, '24.520'), (1, '9.850')] [2023-10-12 19:27:30,414][62635] Updated weights for policy 1, policy_version 87720 (0.0010) [2023-10-12 19:27:30,785][62635] Updated weights for policy 1, policy_version 87730 (0.0009) [2023-10-12 19:27:31,151][62635] Updated weights for policy 1, policy_version 87740 (0.0007) [2023-10-12 19:27:32,038][62634] Updated weights for policy 0, policy_version 87690 (0.0008) [2023-10-12 19:27:32,409][62634] Updated weights for policy 0, policy_version 87700 (0.0007) [2023-10-12 19:27:32,794][62634] Updated weights for policy 0, policy_version 87710 (0.0010) [2023-10-12 19:27:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179666944. Throughput: 0: 1662.7, 1: 1694.7. Samples: 44924456. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:33,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.830')] [2023-10-12 19:27:35,310][62635] Updated weights for policy 1, policy_version 87750 (0.0009) [2023-10-12 19:27:35,671][62635] Updated weights for policy 1, policy_version 87760 (0.0010) [2023-10-12 19:27:36,036][62635] Updated weights for policy 1, policy_version 87770 (0.0007) [2023-10-12 19:27:36,508][62634] Updated weights for policy 0, policy_version 87720 (0.0008) [2023-10-12 19:27:36,882][62634] Updated weights for policy 0, policy_version 87730 (0.0010) [2023-10-12 19:27:37,266][62634] Updated weights for policy 0, policy_version 87740 (0.0009) [2023-10-12 19:27:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179732480. Throughput: 0: 1686.2, 1: 1673.1. Samples: 44935096. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:38,436][61643] Avg episode reward: [(0, '24.540'), (1, '9.730')] [2023-10-12 19:27:40,003][62635] Updated weights for policy 1, policy_version 87780 (0.0009) [2023-10-12 19:27:40,368][62635] Updated weights for policy 1, policy_version 87790 (0.0010) [2023-10-12 19:27:40,743][62635] Updated weights for policy 1, policy_version 87800 (0.0009) [2023-10-12 19:27:41,436][62634] Updated weights for policy 0, policy_version 87750 (0.0008) [2023-10-12 19:27:41,821][62634] Updated weights for policy 0, policy_version 87760 (0.0010) [2023-10-12 19:27:42,200][62634] Updated weights for policy 0, policy_version 87770 (0.0009) [2023-10-12 19:27:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179798016. Throughput: 0: 1669.3, 1: 1685.1. Samples: 44954794. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:43,435][61643] Avg episode reward: [(0, '24.540'), (1, '9.910')] [2023-10-12 19:27:44,878][62635] Updated weights for policy 1, policy_version 87810 (0.0012) [2023-10-12 19:27:45,247][62635] Updated weights for policy 1, policy_version 87820 (0.0009) [2023-10-12 19:27:45,610][62635] Updated weights for policy 1, policy_version 87830 (0.0007) [2023-10-12 19:27:45,977][62635] Updated weights for policy 1, policy_version 87840 (0.0007) [2023-10-12 19:27:46,261][62634] Updated weights for policy 0, policy_version 87780 (0.0009) [2023-10-12 19:27:46,642][62634] Updated weights for policy 0, policy_version 87790 (0.0007) [2023-10-12 19:27:47,016][62634] Updated weights for policy 0, policy_version 87800 (0.0010) [2023-10-12 19:27:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 179863552. Throughput: 0: 1667.3, 1: 1682.3. Samples: 44974702. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:48,435][61643] Avg episode reward: [(0, '24.400'), (1, '9.910')] [2023-10-12 19:27:50,084][62635] Updated weights for policy 1, policy_version 87850 (0.0009) [2023-10-12 19:27:50,453][62635] Updated weights for policy 1, policy_version 87860 (0.0009) [2023-10-12 19:27:50,826][62635] Updated weights for policy 1, policy_version 87870 (0.0009) [2023-10-12 19:27:51,178][62634] Updated weights for policy 0, policy_version 87810 (0.0010) [2023-10-12 19:27:51,552][62634] Updated weights for policy 0, policy_version 87820 (0.0007) [2023-10-12 19:27:51,931][62634] Updated weights for policy 0, policy_version 87830 (0.0008) [2023-10-12 19:27:52,303][62634] Updated weights for policy 0, policy_version 87840 (0.0008) [2023-10-12 19:27:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179929088. Throughput: 0: 1683.1, 1: 1652.2. Samples: 44984968. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:53,436][61643] Avg episode reward: [(0, '24.340'), (1, '9.770')] [2023-10-12 19:27:54,914][62635] Updated weights for policy 1, policy_version 87880 (0.0008) [2023-10-12 19:27:55,280][62635] Updated weights for policy 1, policy_version 87890 (0.0008) [2023-10-12 19:27:55,653][62635] Updated weights for policy 1, policy_version 87900 (0.0009) [2023-10-12 19:27:56,433][62634] Updated weights for policy 0, policy_version 87850 (0.0009) [2023-10-12 19:27:56,818][62634] Updated weights for policy 0, policy_version 87860 (0.0010) [2023-10-12 19:27:57,200][62634] Updated weights for policy 0, policy_version 87870 (0.0007) [2023-10-12 19:27:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179994624. Throughput: 0: 1664.1, 1: 1670.3. Samples: 45004856. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-12 19:27:58,435][61643] Avg episode reward: [(0, '24.260'), (1, '10.100')] [2023-10-12 19:27:59,911][62635] Updated weights for policy 1, policy_version 87910 (0.0008) [2023-10-12 19:28:00,286][62635] Updated weights for policy 1, policy_version 87920 (0.0008) [2023-10-12 19:28:00,661][62635] Updated weights for policy 1, policy_version 87930 (0.0008) [2023-10-12 19:28:01,289][62634] Updated weights for policy 0, policy_version 87880 (0.0009) [2023-10-12 19:28:01,668][62634] Updated weights for policy 0, policy_version 87890 (0.0011) [2023-10-12 19:28:02,036][62634] Updated weights for policy 0, policy_version 87900 (0.0008) [2023-10-12 19:28:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 180060160. Throughput: 0: 1673.5, 1: 1669.2. Samples: 45025074. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:03,435][61643] Avg episode reward: [(0, '24.260'), (1, '10.070')] [2023-10-12 19:28:04,558][62635] Updated weights for policy 1, policy_version 87940 (0.0010) [2023-10-12 19:28:04,925][62635] Updated weights for policy 1, policy_version 87950 (0.0008) [2023-10-12 19:28:05,288][62635] Updated weights for policy 1, policy_version 87960 (0.0009) [2023-10-12 19:28:06,190][62634] Updated weights for policy 0, policy_version 87910 (0.0008) [2023-10-12 19:28:06,563][62634] Updated weights for policy 0, policy_version 87920 (0.0010) [2023-10-12 19:28:06,929][62634] Updated weights for policy 0, policy_version 87930 (0.0010) [2023-10-12 19:28:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180125696. Throughput: 0: 1682.9, 1: 1659.6. Samples: 45035406. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:08,436][61643] Avg episode reward: [(0, '24.400'), (1, '9.730')] [2023-10-12 19:28:09,278][62635] Updated weights for policy 1, policy_version 87970 (0.0010) [2023-10-12 19:28:09,642][62635] Updated weights for policy 1, policy_version 87980 (0.0011) [2023-10-12 19:28:10,006][62635] Updated weights for policy 1, policy_version 87990 (0.0009) [2023-10-12 19:28:10,386][62635] Updated weights for policy 1, policy_version 88000 (0.0008) [2023-10-12 19:28:11,063][62634] Updated weights for policy 0, policy_version 87940 (0.0009) [2023-10-12 19:28:11,427][62634] Updated weights for policy 0, policy_version 87950 (0.0007) [2023-10-12 19:28:11,807][62634] Updated weights for policy 0, policy_version 87960 (0.0007) [2023-10-12 19:28:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180191232. Throughput: 0: 1658.2, 1: 1666.4. Samples: 45054816. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:13,435][61643] Avg episode reward: [(0, '24.450'), (1, '9.820')] [2023-10-12 19:28:14,548][62635] Updated weights for policy 1, policy_version 88010 (0.0007) [2023-10-12 19:28:14,907][62635] Updated weights for policy 1, policy_version 88020 (0.0007) [2023-10-12 19:28:15,280][62635] Updated weights for policy 1, policy_version 88030 (0.0008) [2023-10-12 19:28:15,963][62634] Updated weights for policy 0, policy_version 87970 (0.0010) [2023-10-12 19:28:16,343][62634] Updated weights for policy 0, policy_version 87980 (0.0008) [2023-10-12 19:28:16,713][62634] Updated weights for policy 0, policy_version 87990 (0.0008) [2023-10-12 19:28:17,092][62634] Updated weights for policy 0, policy_version 88000 (0.0010) [2023-10-12 19:28:18,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180256768. Throughput: 0: 1677.5, 1: 1671.6. Samples: 45075168. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:18,436][61643] Avg episode reward: [(0, '24.680'), (1, '9.790')] [2023-10-12 19:28:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000088032_90144768.pth... [2023-10-12 19:28:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000088000_90112000.pth... [2023-10-12 19:28:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000086464_88539136.pth [2023-10-12 19:28:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000086432_88506368.pth [2023-10-12 19:28:19,444][62635] Updated weights for policy 1, policy_version 88040 (0.0010) [2023-10-12 19:28:19,809][62635] Updated weights for policy 1, policy_version 88050 (0.0010) [2023-10-12 19:28:20,181][62635] Updated weights for policy 1, policy_version 88060 (0.0010) [2023-10-12 19:28:21,121][62634] Updated weights for policy 0, policy_version 88010 (0.0010) [2023-10-12 19:28:21,506][62634] Updated weights for policy 0, policy_version 88020 (0.0010) [2023-10-12 19:28:21,889][62634] Updated weights for policy 0, policy_version 88030 (0.0010) [2023-10-12 19:28:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180322304. Throughput: 0: 1671.8, 1: 1663.6. Samples: 45085186. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:23,436][61643] Avg episode reward: [(0, '24.720'), (1, '9.640')] [2023-10-12 19:28:24,359][62635] Updated weights for policy 1, policy_version 88070 (0.0009) [2023-10-12 19:28:24,738][62635] Updated weights for policy 1, policy_version 88080 (0.0008) [2023-10-12 19:28:25,098][62635] Updated weights for policy 1, policy_version 88090 (0.0010) [2023-10-12 19:28:25,902][62634] Updated weights for policy 0, policy_version 88040 (0.0008) [2023-10-12 19:28:26,277][62634] Updated weights for policy 0, policy_version 88050 (0.0009) [2023-10-12 19:28:26,648][62634] Updated weights for policy 0, policy_version 88060 (0.0011) [2023-10-12 19:28:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180387840. Throughput: 0: 1664.9, 1: 1674.8. Samples: 45105078. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:28,436][61643] Avg episode reward: [(0, '24.750'), (1, '9.650')] [2023-10-12 19:28:29,109][62635] Updated weights for policy 1, policy_version 88100 (0.0008) [2023-10-12 19:28:29,471][62635] Updated weights for policy 1, policy_version 88110 (0.0010) [2023-10-12 19:28:29,842][62635] Updated weights for policy 1, policy_version 88120 (0.0008) [2023-10-12 19:28:30,691][62634] Updated weights for policy 0, policy_version 88070 (0.0011) [2023-10-12 19:28:31,069][62634] Updated weights for policy 0, policy_version 88080 (0.0009) [2023-10-12 19:28:31,440][62634] Updated weights for policy 0, policy_version 88090 (0.0011) [2023-10-12 19:28:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180453376. Throughput: 0: 1679.2, 1: 1678.9. Samples: 45125816. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:33,435][61643] Avg episode reward: [(0, '24.680'), (1, '9.560')] [2023-10-12 19:28:33,803][62635] Updated weights for policy 1, policy_version 88130 (0.0007) [2023-10-12 19:28:34,178][62635] Updated weights for policy 1, policy_version 88140 (0.0009) [2023-10-12 19:28:34,549][62635] Updated weights for policy 1, policy_version 88150 (0.0009) [2023-10-12 19:28:34,920][62635] Updated weights for policy 1, policy_version 88160 (0.0010) [2023-10-12 19:28:35,374][62634] Updated weights for policy 0, policy_version 88100 (0.0010) [2023-10-12 19:28:35,744][62634] Updated weights for policy 0, policy_version 88110 (0.0011) [2023-10-12 19:28:36,127][62634] Updated weights for policy 0, policy_version 88120 (0.0009) [2023-10-12 19:28:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180518912. Throughput: 0: 1666.4, 1: 1679.9. Samples: 45135552. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:38,435][61643] Avg episode reward: [(0, '24.670'), (1, '9.830')] [2023-10-12 19:28:39,135][62635] Updated weights for policy 1, policy_version 88170 (0.0008) [2023-10-12 19:28:39,503][62635] Updated weights for policy 1, policy_version 88180 (0.0007) [2023-10-12 19:28:39,869][62635] Updated weights for policy 1, policy_version 88190 (0.0007) [2023-10-12 19:28:40,173][62634] Updated weights for policy 0, policy_version 88130 (0.0009) [2023-10-12 19:28:40,540][62634] Updated weights for policy 0, policy_version 88140 (0.0009) [2023-10-12 19:28:40,915][62634] Updated weights for policy 0, policy_version 88150 (0.0010) [2023-10-12 19:28:41,292][62634] Updated weights for policy 0, policy_version 88160 (0.0007) [2023-10-12 19:28:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180584448. Throughput: 0: 1673.5, 1: 1677.0. Samples: 45155628. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:43,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.830')] [2023-10-12 19:28:44,141][62635] Updated weights for policy 1, policy_version 88200 (0.0008) [2023-10-12 19:28:44,510][62635] Updated weights for policy 1, policy_version 88210 (0.0008) [2023-10-12 19:28:44,874][62635] Updated weights for policy 1, policy_version 88220 (0.0007) [2023-10-12 19:28:45,419][62634] Updated weights for policy 0, policy_version 88170 (0.0008) [2023-10-12 19:28:45,792][62634] Updated weights for policy 0, policy_version 88180 (0.0008) [2023-10-12 19:28:46,163][62634] Updated weights for policy 0, policy_version 88190 (0.0009) [2023-10-12 19:28:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180649984. Throughput: 0: 1683.3, 1: 1678.1. Samples: 45176336. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:48,435][61643] Avg episode reward: [(0, '24.800'), (1, '9.770')] [2023-10-12 19:28:48,907][62635] Updated weights for policy 1, policy_version 88230 (0.0008) [2023-10-12 19:28:49,287][62635] Updated weights for policy 1, policy_version 88240 (0.0009) [2023-10-12 19:28:49,659][62635] Updated weights for policy 1, policy_version 88250 (0.0008) [2023-10-12 19:28:50,191][62634] Updated weights for policy 0, policy_version 88200 (0.0008) [2023-10-12 19:28:50,566][62634] Updated weights for policy 0, policy_version 88210 (0.0007) [2023-10-12 19:28:50,941][62634] Updated weights for policy 0, policy_version 88220 (0.0008) [2023-10-12 19:28:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 180715520. Throughput: 0: 1661.0, 1: 1677.5. Samples: 45185640. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:53,436][61643] Avg episode reward: [(0, '24.680'), (1, '10.010')] [2023-10-12 19:28:53,604][62635] Updated weights for policy 1, policy_version 88260 (0.0009) [2023-10-12 19:28:53,974][62635] Updated weights for policy 1, policy_version 88270 (0.0010) [2023-10-12 19:28:54,346][62635] Updated weights for policy 1, policy_version 88280 (0.0009) [2023-10-12 19:28:54,905][62634] Updated weights for policy 0, policy_version 88230 (0.0007) [2023-10-12 19:28:55,279][62634] Updated weights for policy 0, policy_version 88240 (0.0008) [2023-10-12 19:28:55,658][62634] Updated weights for policy 0, policy_version 88250 (0.0009) [2023-10-12 19:28:58,294][62635] Updated weights for policy 1, policy_version 88290 (0.0009) [2023-10-12 19:28:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180781056. Throughput: 0: 1685.8, 1: 1683.0. Samples: 45206410. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-12 19:28:58,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.930')] [2023-10-12 19:28:58,663][62635] Updated weights for policy 1, policy_version 88300 (0.0009) [2023-10-12 19:28:59,034][62635] Updated weights for policy 1, policy_version 88310 (0.0007) [2023-10-12 19:28:59,396][62635] Updated weights for policy 1, policy_version 88320 (0.0009) [2023-10-12 19:28:59,765][62634] Updated weights for policy 0, policy_version 88260 (0.0008) [2023-10-12 19:29:00,146][62634] Updated weights for policy 0, policy_version 88270 (0.0010) [2023-10-12 19:29:00,514][62634] Updated weights for policy 0, policy_version 88280 (0.0009) [2023-10-12 19:29:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180846592. Throughput: 0: 1695.2, 1: 1684.4. Samples: 45227250. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:03,436][61643] Avg episode reward: [(0, '24.550'), (1, '10.190')] [2023-10-12 19:29:03,451][62635] Updated weights for policy 1, policy_version 88330 (0.0011) [2023-10-12 19:29:03,813][62635] Updated weights for policy 1, policy_version 88340 (0.0011) [2023-10-12 19:29:04,176][62635] Updated weights for policy 1, policy_version 88350 (0.0009) [2023-10-12 19:29:04,377][62634] Updated weights for policy 0, policy_version 88290 (0.0007) [2023-10-12 19:29:04,756][62634] Updated weights for policy 0, policy_version 88300 (0.0007) [2023-10-12 19:29:05,134][62634] Updated weights for policy 0, policy_version 88310 (0.0009) [2023-10-12 19:29:05,521][62634] Updated weights for policy 0, policy_version 88320 (0.0008) [2023-10-12 19:29:08,105][62635] Updated weights for policy 1, policy_version 88360 (0.0008) [2023-10-12 19:29:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180912128. Throughput: 0: 1670.0, 1: 1693.2. Samples: 45236530. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:08,435][61643] Avg episode reward: [(0, '24.530'), (1, '10.000')] [2023-10-12 19:29:08,482][62635] Updated weights for policy 1, policy_version 88370 (0.0008) [2023-10-12 19:29:08,848][62635] Updated weights for policy 1, policy_version 88380 (0.0008) [2023-10-12 19:29:09,553][62634] Updated weights for policy 0, policy_version 88330 (0.0008) [2023-10-12 19:29:09,919][62634] Updated weights for policy 0, policy_version 88340 (0.0010) [2023-10-12 19:29:10,301][62634] Updated weights for policy 0, policy_version 88350 (0.0009) [2023-10-12 19:29:12,828][62635] Updated weights for policy 1, policy_version 88390 (0.0008) [2023-10-12 19:29:13,201][62635] Updated weights for policy 1, policy_version 88400 (0.0008) [2023-10-12 19:29:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180977664. Throughput: 0: 1696.8, 1: 1687.6. 
Samples: 45257372. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:13,435][61643] Avg episode reward: [(0, '24.620'), (1, '9.980')] [2023-10-12 19:29:13,566][62635] Updated weights for policy 1, policy_version 88410 (0.0008) [2023-10-12 19:29:14,366][62634] Updated weights for policy 0, policy_version 88360 (0.0008) [2023-10-12 19:29:14,752][62634] Updated weights for policy 0, policy_version 88370 (0.0010) [2023-10-12 19:29:15,127][62634] Updated weights for policy 0, policy_version 88380 (0.0010) [2023-10-12 19:29:17,651][62635] Updated weights for policy 1, policy_version 88420 (0.0007) [2023-10-12 19:29:18,021][62635] Updated weights for policy 1, policy_version 88430 (0.0008) [2023-10-12 19:29:18,376][62635] Updated weights for policy 1, policy_version 88440 (0.0008) [2023-10-12 19:29:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 181043200. Throughput: 0: 1694.7, 1: 1678.3. Samples: 45277602. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:18,436][61643] Avg episode reward: [(0, '24.540'), (1, '10.210')] [2023-10-12 19:29:19,213][62634] Updated weights for policy 0, policy_version 88390 (0.0009) [2023-10-12 19:29:19,592][62634] Updated weights for policy 0, policy_version 88400 (0.0008) [2023-10-12 19:29:19,977][62634] Updated weights for policy 0, policy_version 88410 (0.0008) [2023-10-12 19:29:22,643][62635] Updated weights for policy 1, policy_version 88450 (0.0007) [2023-10-12 19:29:23,010][62635] Updated weights for policy 1, policy_version 88460 (0.0007) [2023-10-12 19:29:23,377][62635] Updated weights for policy 1, policy_version 88470 (0.0008) [2023-10-12 19:29:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181108736. Throughput: 0: 1678.5, 1: 1691.3. Samples: 45287192. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:23,435][61643] Avg episode reward: [(0, '24.070'), (1, '10.120')] [2023-10-12 19:29:23,738][62635] Updated weights for policy 1, policy_version 88480 (0.0008) [2023-10-12 19:29:23,928][62634] Updated weights for policy 0, policy_version 88420 (0.0011) [2023-10-12 19:29:24,313][62634] Updated weights for policy 0, policy_version 88430 (0.0008) [2023-10-12 19:29:24,691][62634] Updated weights for policy 0, policy_version 88440 (0.0008) [2023-10-12 19:29:27,800][62635] Updated weights for policy 1, policy_version 88490 (0.0007) [2023-10-12 19:29:28,172][62635] Updated weights for policy 1, policy_version 88500 (0.0007) [2023-10-12 19:29:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181174272. Throughput: 0: 1698.7, 1: 1694.1. Samples: 45308304. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:28,435][61643] Avg episode reward: [(0, '24.110'), (1, '9.870')] [2023-10-12 19:29:28,540][62635] Updated weights for policy 1, policy_version 88510 (0.0009) [2023-10-12 19:29:28,704][62634] Updated weights for policy 0, policy_version 88450 (0.0009) [2023-10-12 19:29:29,087][62634] Updated weights for policy 0, policy_version 88460 (0.0010) [2023-10-12 19:29:29,462][62634] Updated weights for policy 0, policy_version 88470 (0.0008) [2023-10-12 19:29:29,844][62634] Updated weights for policy 0, policy_version 88480 (0.0011) [2023-10-12 19:29:32,445][62635] Updated weights for policy 1, policy_version 88520 (0.0011) [2023-10-12 19:29:32,814][62635] Updated weights for policy 1, policy_version 88530 (0.0011) [2023-10-12 19:29:33,178][62635] Updated weights for policy 1, policy_version 88540 (0.0010) [2023-10-12 19:29:33,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181272576. Throughput: 0: 1700.9, 1: 1677.5. Samples: 45328364. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:33,435][61643] Avg episode reward: [(0, '23.960'), (1, '10.020')] [2023-10-12 19:29:33,669][62634] Updated weights for policy 0, policy_version 88490 (0.0010) [2023-10-12 19:29:34,050][62634] Updated weights for policy 0, policy_version 88500 (0.0010) [2023-10-12 19:29:34,424][62634] Updated weights for policy 0, policy_version 88510 (0.0007) [2023-10-12 19:29:37,158][62635] Updated weights for policy 1, policy_version 88550 (0.0008) [2023-10-12 19:29:37,522][62635] Updated weights for policy 1, policy_version 88560 (0.0007) [2023-10-12 19:29:37,888][62635] Updated weights for policy 1, policy_version 88570 (0.0007) [2023-10-12 19:29:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181338112. Throughput: 0: 1692.0, 1: 1702.0. Samples: 45338372. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:38,435][61643] Avg episode reward: [(0, '23.370'), (1, '9.800')] [2023-10-12 19:29:38,666][62634] Updated weights for policy 0, policy_version 88520 (0.0007) [2023-10-12 19:29:39,043][62634] Updated weights for policy 0, policy_version 88530 (0.0008) [2023-10-12 19:29:39,417][62634] Updated weights for policy 0, policy_version 88540 (0.0007) [2023-10-12 19:29:41,986][62635] Updated weights for policy 1, policy_version 88580 (0.0008) [2023-10-12 19:29:42,361][62635] Updated weights for policy 1, policy_version 88590 (0.0009) [2023-10-12 19:29:42,732][62635] Updated weights for policy 1, policy_version 88600 (0.0008) [2023-10-12 19:29:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181403648. Throughput: 0: 1697.9, 1: 1691.6. Samples: 45358938. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:43,436][62634] Updated weights for policy 0, policy_version 88550 (0.0008) [2023-10-12 19:29:43,435][61643] Avg episode reward: [(0, '23.190'), (1, '9.800')] [2023-10-12 19:29:43,810][62634] Updated weights for policy 0, policy_version 88560 (0.0011) [2023-10-12 19:29:44,185][62634] Updated weights for policy 0, policy_version 88570 (0.0010) [2023-10-12 19:29:46,843][62635] Updated weights for policy 1, policy_version 88610 (0.0010) [2023-10-12 19:29:47,211][62635] Updated weights for policy 1, policy_version 88620 (0.0011) [2023-10-12 19:29:47,584][62635] Updated weights for policy 1, policy_version 88630 (0.0010) [2023-10-12 19:29:47,952][62635] Updated weights for policy 1, policy_version 88640 (0.0009) [2023-10-12 19:29:48,401][62634] Updated weights for policy 0, policy_version 88580 (0.0008) [2023-10-12 19:29:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181469184. Throughput: 0: 1689.0, 1: 1666.8. Samples: 45378262. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:48,436][61643] Avg episode reward: [(0, '23.260'), (1, '9.900')] [2023-10-12 19:29:48,775][62634] Updated weights for policy 0, policy_version 88590 (0.0007) [2023-10-12 19:29:49,144][62634] Updated weights for policy 0, policy_version 88600 (0.0007) [2023-10-12 19:29:51,937][62635] Updated weights for policy 1, policy_version 88650 (0.0009) [2023-10-12 19:29:52,304][62635] Updated weights for policy 1, policy_version 88660 (0.0007) [2023-10-12 19:29:52,672][62635] Updated weights for policy 1, policy_version 88670 (0.0007) [2023-10-12 19:29:53,243][62634] Updated weights for policy 0, policy_version 88610 (0.0009) [2023-10-12 19:29:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 181534720. Throughput: 0: 1689.6, 1: 1688.0. Samples: 45388518. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:53,435][61643] Avg episode reward: [(0, '22.820'), (1, '9.700')] [2023-10-12 19:29:53,616][62634] Updated weights for policy 0, policy_version 88620 (0.0009) [2023-10-12 19:29:53,995][62634] Updated weights for policy 0, policy_version 88630 (0.0008) [2023-10-12 19:29:54,368][62634] Updated weights for policy 0, policy_version 88640 (0.0008) [2023-10-12 19:29:56,876][62635] Updated weights for policy 1, policy_version 88680 (0.0007) [2023-10-12 19:29:57,242][62635] Updated weights for policy 1, policy_version 88690 (0.0007) [2023-10-12 19:29:57,610][62635] Updated weights for policy 1, policy_version 88700 (0.0007) [2023-10-12 19:29:58,292][62634] Updated weights for policy 0, policy_version 88650 (0.0008) [2023-10-12 19:29:58,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181600256. Throughput: 0: 1685.4, 1: 1678.1. Samples: 45408728. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-12 19:29:58,435][61643] Avg episode reward: [(0, '22.520'), (1, '9.720')] [2023-10-12 19:29:58,653][62634] Updated weights for policy 0, policy_version 88660 (0.0010) [2023-10-12 19:29:59,035][62634] Updated weights for policy 0, policy_version 88670 (0.0010) [2023-10-12 19:30:01,586][62635] Updated weights for policy 1, policy_version 88710 (0.0007) [2023-10-12 19:30:01,952][62635] Updated weights for policy 1, policy_version 88720 (0.0007) [2023-10-12 19:30:02,318][62635] Updated weights for policy 1, policy_version 88730 (0.0007) [2023-10-12 19:30:03,199][62634] Updated weights for policy 0, policy_version 88680 (0.0008) [2023-10-12 19:30:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181665792. Throughput: 0: 1690.3, 1: 1667.2. Samples: 45428692. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:03,436][61643] Avg episode reward: [(0, '22.740'), (1, '9.900')] [2023-10-12 19:30:03,575][62634] Updated weights for policy 0, policy_version 88690 (0.0007) [2023-10-12 19:30:03,944][62634] Updated weights for policy 0, policy_version 88700 (0.0008) [2023-10-12 19:30:06,374][62635] Updated weights for policy 1, policy_version 88740 (0.0008) [2023-10-12 19:30:06,749][62635] Updated weights for policy 1, policy_version 88750 (0.0008) [2023-10-12 19:30:07,118][62635] Updated weights for policy 1, policy_version 88760 (0.0008) [2023-10-12 19:30:07,988][62634] Updated weights for policy 0, policy_version 88710 (0.0008) [2023-10-12 19:30:08,361][62634] Updated weights for policy 0, policy_version 88720 (0.0010) [2023-10-12 19:30:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181731328. Throughput: 0: 1690.6, 1: 1687.9. Samples: 45439222. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:08,435][61643] Avg episode reward: [(0, '22.750'), (1, '9.880')] [2023-10-12 19:30:08,736][62634] Updated weights for policy 0, policy_version 88730 (0.0009) [2023-10-12 19:30:11,258][62635] Updated weights for policy 1, policy_version 88770 (0.0009) [2023-10-12 19:30:11,623][62635] Updated weights for policy 1, policy_version 88780 (0.0010) [2023-10-12 19:30:11,998][62635] Updated weights for policy 1, policy_version 88790 (0.0010) [2023-10-12 19:30:12,359][62635] Updated weights for policy 1, policy_version 88800 (0.0011) [2023-10-12 19:30:12,679][62634] Updated weights for policy 0, policy_version 88740 (0.0007) [2023-10-12 19:30:13,056][62634] Updated weights for policy 0, policy_version 88750 (0.0009) [2023-10-12 19:30:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181796864. Throughput: 0: 1680.1, 1: 1668.0. Samples: 45458968. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:13,435][61643] Avg episode reward: [(0, '22.590'), (1, '9.880')] [2023-10-12 19:30:13,436][62634] Updated weights for policy 0, policy_version 88760 (0.0007) [2023-10-12 19:30:16,463][62635] Updated weights for policy 1, policy_version 88810 (0.0009) [2023-10-12 19:30:16,829][62635] Updated weights for policy 1, policy_version 88820 (0.0010) [2023-10-12 19:30:17,204][62635] Updated weights for policy 1, policy_version 88830 (0.0010) [2023-10-12 19:30:17,530][62634] Updated weights for policy 0, policy_version 88770 (0.0009) [2023-10-12 19:30:17,911][62634] Updated weights for policy 0, policy_version 88780 (0.0009) [2023-10-12 19:30:18,274][62634] Updated weights for policy 0, policy_version 88790 (0.0012) [2023-10-12 19:30:18,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181862400. Throughput: 0: 1665.6, 1: 1672.4. Samples: 45478576. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:18,436][61643] Avg episode reward: [(0, '22.620'), (1, '9.970')] [2023-10-12 19:30:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000088832_90963968.pth... [2023-10-12 19:30:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000087264_89358336.pth [2023-10-12 19:30:18,648][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000088800_90931200.pth... 
[2023-10-12 19:30:18,649][62634] Updated weights for policy 0, policy_version 88800 (0.0008) [2023-10-12 19:30:18,678][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000087200_89292800.pth [2023-10-12 19:30:21,246][62635] Updated weights for policy 1, policy_version 88840 (0.0009) [2023-10-12 19:30:21,603][62635] Updated weights for policy 1, policy_version 88850 (0.0007) [2023-10-12 19:30:21,968][62635] Updated weights for policy 1, policy_version 88860 (0.0007) [2023-10-12 19:30:22,721][62634] Updated weights for policy 0, policy_version 88810 (0.0009) [2023-10-12 19:30:23,103][62634] Updated weights for policy 0, policy_version 88820 (0.0008) [2023-10-12 19:30:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181927936. Throughput: 0: 1680.0, 1: 1678.5. Samples: 45489504. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:23,435][61643] Avg episode reward: [(0, '22.590'), (1, '9.900')] [2023-10-12 19:30:23,478][62634] Updated weights for policy 0, policy_version 88830 (0.0010) [2023-10-12 19:30:26,028][62635] Updated weights for policy 1, policy_version 88870 (0.0008) [2023-10-12 19:30:26,395][62635] Updated weights for policy 1, policy_version 88880 (0.0007) [2023-10-12 19:30:26,766][62635] Updated weights for policy 1, policy_version 88890 (0.0008) [2023-10-12 19:30:27,446][62634] Updated weights for policy 0, policy_version 88840 (0.0009) [2023-10-12 19:30:27,824][62634] Updated weights for policy 0, policy_version 88850 (0.0008) [2023-10-12 19:30:28,206][62634] Updated weights for policy 0, policy_version 88860 (0.0007) [2023-10-12 19:30:28,435][61643] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 182026240. Throughput: 0: 1680.8, 1: 1659.0. Samples: 45509232. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:28,435][61643] Avg episode reward: [(0, '23.010'), (1, '10.020')] [2023-10-12 19:30:30,780][62635] Updated weights for policy 1, policy_version 88900 (0.0007) [2023-10-12 19:30:31,152][62635] Updated weights for policy 1, policy_version 88910 (0.0007) [2023-10-12 19:30:31,524][62635] Updated weights for policy 1, policy_version 88920 (0.0008) [2023-10-12 19:30:32,215][62634] Updated weights for policy 0, policy_version 88870 (0.0010) [2023-10-12 19:30:32,593][62634] Updated weights for policy 0, policy_version 88880 (0.0007) [2023-10-12 19:30:32,978][62634] Updated weights for policy 0, policy_version 88890 (0.0007) [2023-10-12 19:30:33,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182091776. Throughput: 0: 1667.8, 1: 1682.4. Samples: 45529022. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:33,436][61643] Avg episode reward: [(0, '23.550'), (1, '10.150')] [2023-10-12 19:30:35,723][62635] Updated weights for policy 1, policy_version 88930 (0.0008) [2023-10-12 19:30:36,100][62635] Updated weights for policy 1, policy_version 88940 (0.0009) [2023-10-12 19:30:36,463][62635] Updated weights for policy 1, policy_version 88950 (0.0011) [2023-10-12 19:30:36,825][62635] Updated weights for policy 1, policy_version 88960 (0.0008) [2023-10-12 19:30:37,018][62634] Updated weights for policy 0, policy_version 88900 (0.0008) [2023-10-12 19:30:37,390][62634] Updated weights for policy 0, policy_version 88910 (0.0009) [2023-10-12 19:30:37,766][62634] Updated weights for policy 0, policy_version 88920 (0.0011) [2023-10-12 19:30:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 182157312. Throughput: 0: 1690.3, 1: 1672.4. Samples: 45539842. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:38,435][61643] Avg episode reward: [(0, '23.450'), (1, '9.960')] [2023-10-12 19:30:40,858][62635] Updated weights for policy 1, policy_version 88970 (0.0010) [2023-10-12 19:30:41,233][62635] Updated weights for policy 1, policy_version 88980 (0.0011) [2023-10-12 19:30:41,588][62635] Updated weights for policy 1, policy_version 88990 (0.0011) [2023-10-12 19:30:41,903][62634] Updated weights for policy 0, policy_version 88930 (0.0009) [2023-10-12 19:30:42,280][62634] Updated weights for policy 0, policy_version 88940 (0.0009) [2023-10-12 19:30:42,652][62634] Updated weights for policy 0, policy_version 88950 (0.0008) [2023-10-12 19:30:43,027][62634] Updated weights for policy 0, policy_version 88960 (0.0007) [2023-10-12 19:30:43,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 182222848. Throughput: 0: 1683.8, 1: 1666.5. Samples: 45559492. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:43,435][61643] Avg episode reward: [(0, '23.120'), (1, '9.870')] [2023-10-12 19:30:45,722][62635] Updated weights for policy 1, policy_version 89000 (0.0009) [2023-10-12 19:30:46,091][62635] Updated weights for policy 1, policy_version 89010 (0.0011) [2023-10-12 19:30:46,457][62635] Updated weights for policy 1, policy_version 89020 (0.0008) [2023-10-12 19:30:47,058][62634] Updated weights for policy 0, policy_version 88970 (0.0011) [2023-10-12 19:30:47,432][62634] Updated weights for policy 0, policy_version 88980 (0.0009) [2023-10-12 19:30:47,808][62634] Updated weights for policy 0, policy_version 88990 (0.0010) [2023-10-12 19:30:48,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182288384. Throughput: 0: 1654.1, 1: 1682.6. Samples: 45578846. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:48,436][61643] Avg episode reward: [(0, '23.090'), (1, '10.050')] [2023-10-12 19:30:50,545][62635] Updated weights for policy 1, policy_version 89030 (0.0007) [2023-10-12 19:30:50,913][62635] Updated weights for policy 1, policy_version 89040 (0.0010) [2023-10-12 19:30:51,267][62635] Updated weights for policy 1, policy_version 89050 (0.0009) [2023-10-12 19:30:51,861][62634] Updated weights for policy 0, policy_version 89000 (0.0010) [2023-10-12 19:30:52,238][62634] Updated weights for policy 0, policy_version 89010 (0.0010) [2023-10-12 19:30:52,610][62634] Updated weights for policy 0, policy_version 89020 (0.0011) [2023-10-12 19:30:53,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182353920. Throughput: 0: 1682.0, 1: 1662.3. Samples: 45589716. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-12 19:30:53,436][61643] Avg episode reward: [(0, '23.160'), (1, '10.140')] [2023-10-12 19:30:55,398][62635] Updated weights for policy 1, policy_version 89060 (0.0008) [2023-10-12 19:30:55,768][62635] Updated weights for policy 1, policy_version 89070 (0.0008) [2023-10-12 19:30:56,134][62635] Updated weights for policy 1, policy_version 89080 (0.0008) [2023-10-12 19:30:56,660][62634] Updated weights for policy 0, policy_version 89030 (0.0011) [2023-10-12 19:30:57,051][62634] Updated weights for policy 0, policy_version 89040 (0.0010) [2023-10-12 19:30:57,426][62634] Updated weights for policy 0, policy_version 89050 (0.0007) [2023-10-12 19:30:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182419456. Throughput: 0: 1668.9, 1: 1671.6. Samples: 45609292. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:30:58,435][61643] Avg episode reward: [(0, '22.860'), (1, '9.810')] [2023-10-12 19:30:59,971][62635] Updated weights for policy 1, policy_version 89090 (0.0008) [2023-10-12 19:31:00,334][62635] Updated weights for policy 1, policy_version 89100 (0.0008) [2023-10-12 19:31:00,697][62635] Updated weights for policy 1, policy_version 89110 (0.0009) [2023-10-12 19:31:01,068][62635] Updated weights for policy 1, policy_version 89120 (0.0008) [2023-10-12 19:31:01,484][62634] Updated weights for policy 0, policy_version 89060 (0.0008) [2023-10-12 19:31:01,851][62634] Updated weights for policy 0, policy_version 89070 (0.0008) [2023-10-12 19:31:02,228][62634] Updated weights for policy 0, policy_version 89080 (0.0007) [2023-10-12 19:31:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 182484992. Throughput: 0: 1663.7, 1: 1686.3. Samples: 45629326. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:03,435][61643] Avg episode reward: [(0, '23.130'), (1, '9.990')] [2023-10-12 19:31:05,286][62635] Updated weights for policy 1, policy_version 89130 (0.0009) [2023-10-12 19:31:05,650][62635] Updated weights for policy 1, policy_version 89140 (0.0008) [2023-10-12 19:31:06,017][62635] Updated weights for policy 1, policy_version 89150 (0.0008) [2023-10-12 19:31:06,252][62634] Updated weights for policy 0, policy_version 89090 (0.0009) [2023-10-12 19:31:06,632][62634] Updated weights for policy 0, policy_version 89100 (0.0011) [2023-10-12 19:31:07,006][62634] Updated weights for policy 0, policy_version 89110 (0.0007) [2023-10-12 19:31:07,386][62634] Updated weights for policy 0, policy_version 89120 (0.0007) [2023-10-12 19:31:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182550528. Throughput: 0: 1683.2, 1: 1660.3. Samples: 45639962. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:08,435][61643] Avg episode reward: [(0, '23.580'), (1, '10.080')] [2023-10-12 19:31:10,001][62635] Updated weights for policy 1, policy_version 89160 (0.0008) [2023-10-12 19:31:10,365][62635] Updated weights for policy 1, policy_version 89170 (0.0007) [2023-10-12 19:31:10,728][62635] Updated weights for policy 1, policy_version 89180 (0.0008) [2023-10-12 19:31:11,451][62634] Updated weights for policy 0, policy_version 89130 (0.0009) [2023-10-12 19:31:11,832][62634] Updated weights for policy 0, policy_version 89140 (0.0010) [2023-10-12 19:31:12,203][62634] Updated weights for policy 0, policy_version 89150 (0.0011) [2023-10-12 19:31:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182616064. Throughput: 0: 1661.9, 1: 1684.0. Samples: 45659798. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:13,435][61643] Avg episode reward: [(0, '23.670'), (1, '9.910')] [2023-10-12 19:31:14,736][62635] Updated weights for policy 1, policy_version 89190 (0.0010) [2023-10-12 19:31:15,101][62635] Updated weights for policy 1, policy_version 89200 (0.0008) [2023-10-12 19:31:15,465][62635] Updated weights for policy 1, policy_version 89210 (0.0008) [2023-10-12 19:31:16,321][62634] Updated weights for policy 0, policy_version 89160 (0.0009) [2023-10-12 19:31:16,684][62634] Updated weights for policy 0, policy_version 89170 (0.0011) [2023-10-12 19:31:17,056][62634] Updated weights for policy 0, policy_version 89180 (0.0010) [2023-10-12 19:31:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 182681600. Throughput: 0: 1669.4, 1: 1687.3. Samples: 45680074. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:18,436][61643] Avg episode reward: [(0, '23.870'), (1, '10.000')] [2023-10-12 19:31:19,515][62635] Updated weights for policy 1, policy_version 89220 (0.0008) [2023-10-12 19:31:19,887][62635] Updated weights for policy 1, policy_version 89230 (0.0009) [2023-10-12 19:31:20,259][62635] Updated weights for policy 1, policy_version 89240 (0.0010) [2023-10-12 19:31:21,208][62634] Updated weights for policy 0, policy_version 89190 (0.0009) [2023-10-12 19:31:21,594][62634] Updated weights for policy 0, policy_version 89200 (0.0007) [2023-10-12 19:31:21,958][62634] Updated weights for policy 0, policy_version 89210 (0.0007) [2023-10-12 19:31:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182747136. Throughput: 0: 1674.6, 1: 1670.7. Samples: 45690378. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:23,435][61643] Avg episode reward: [(0, '23.830'), (1, '10.150')] [2023-10-12 19:31:24,297][62635] Updated weights for policy 1, policy_version 89250 (0.0010) [2023-10-12 19:31:24,670][62635] Updated weights for policy 1, policy_version 89260 (0.0009) [2023-10-12 19:31:25,042][62635] Updated weights for policy 1, policy_version 89270 (0.0008) [2023-10-12 19:31:25,403][62635] Updated weights for policy 1, policy_version 89280 (0.0007) [2023-10-12 19:31:26,051][62634] Updated weights for policy 0, policy_version 89220 (0.0007) [2023-10-12 19:31:26,422][62634] Updated weights for policy 0, policy_version 89230 (0.0008) [2023-10-12 19:31:26,799][62634] Updated weights for policy 0, policy_version 89240 (0.0008) [2023-10-12 19:31:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 182812672. Throughput: 0: 1659.1, 1: 1689.5. Samples: 45710176. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:28,435][61643] Avg episode reward: [(0, '23.920'), (1, '9.830')] [2023-10-12 19:31:29,563][62635] Updated weights for policy 1, policy_version 89290 (0.0010) [2023-10-12 19:31:29,931][62635] Updated weights for policy 1, policy_version 89300 (0.0009) [2023-10-12 19:31:30,294][62635] Updated weights for policy 1, policy_version 89310 (0.0008) [2023-10-12 19:31:30,944][62634] Updated weights for policy 0, policy_version 89250 (0.0008) [2023-10-12 19:31:31,317][62634] Updated weights for policy 0, policy_version 89260 (0.0007) [2023-10-12 19:31:31,709][62634] Updated weights for policy 0, policy_version 89270 (0.0009) [2023-10-12 19:31:32,076][62634] Updated weights for policy 0, policy_version 89280 (0.0009) [2023-10-12 19:31:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 182878208. Throughput: 0: 1680.3, 1: 1693.9. Samples: 45730686. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:33,435][61643] Avg episode reward: [(0, '23.900'), (1, '9.680')] [2023-10-12 19:31:34,298][62635] Updated weights for policy 1, policy_version 89320 (0.0008) [2023-10-12 19:31:34,670][62635] Updated weights for policy 1, policy_version 89330 (0.0009) [2023-10-12 19:31:35,044][62635] Updated weights for policy 1, policy_version 89340 (0.0008) [2023-10-12 19:31:35,994][62634] Updated weights for policy 0, policy_version 89290 (0.0008) [2023-10-12 19:31:36,370][62634] Updated weights for policy 0, policy_version 89300 (0.0009) [2023-10-12 19:31:36,745][62634] Updated weights for policy 0, policy_version 89310 (0.0008) [2023-10-12 19:31:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 182943744. Throughput: 0: 1673.5, 1: 1680.4. Samples: 45740638. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:38,435][61643] Avg episode reward: [(0, '23.740'), (1, '9.760')] [2023-10-12 19:31:39,188][62635] Updated weights for policy 1, policy_version 89350 (0.0007) [2023-10-12 19:31:39,565][62635] Updated weights for policy 1, policy_version 89360 (0.0008) [2023-10-12 19:31:39,932][62635] Updated weights for policy 1, policy_version 89370 (0.0008) [2023-10-12 19:31:40,862][62634] Updated weights for policy 0, policy_version 89320 (0.0008) [2023-10-12 19:31:41,240][62634] Updated weights for policy 0, policy_version 89330 (0.0008) [2023-10-12 19:31:41,615][62634] Updated weights for policy 0, policy_version 89340 (0.0009) [2023-10-12 19:31:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183009280. Throughput: 0: 1665.2, 1: 1688.9. Samples: 45760226. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:43,436][61643] Avg episode reward: [(0, '23.220'), (1, '9.620')] [2023-10-12 19:31:44,116][62635] Updated weights for policy 1, policy_version 89380 (0.0008) [2023-10-12 19:31:44,489][62635] Updated weights for policy 1, policy_version 89390 (0.0009) [2023-10-12 19:31:44,863][62635] Updated weights for policy 1, policy_version 89400 (0.0008) [2023-10-12 19:31:45,598][62634] Updated weights for policy 0, policy_version 89350 (0.0008) [2023-10-12 19:31:45,985][62634] Updated weights for policy 0, policy_version 89360 (0.0008) [2023-10-12 19:31:46,368][62634] Updated weights for policy 0, policy_version 89370 (0.0009) [2023-10-12 19:31:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183074816. Throughput: 0: 1686.8, 1: 1688.6. Samples: 45781220. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:48,436][61643] Avg episode reward: [(0, '23.300'), (1, '9.530')] [2023-10-12 19:31:48,844][62635] Updated weights for policy 1, policy_version 89410 (0.0010) [2023-10-12 19:31:49,199][62635] Updated weights for policy 1, policy_version 89420 (0.0009) [2023-10-12 19:31:49,562][62635] Updated weights for policy 1, policy_version 89430 (0.0009) [2023-10-12 19:31:49,928][62635] Updated weights for policy 1, policy_version 89440 (0.0007) [2023-10-12 19:31:50,425][62634] Updated weights for policy 0, policy_version 89380 (0.0008) [2023-10-12 19:31:50,797][62634] Updated weights for policy 0, policy_version 89390 (0.0009) [2023-10-12 19:31:51,187][62634] Updated weights for policy 0, policy_version 89400 (0.0010) [2023-10-12 19:31:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183140352. Throughput: 0: 1667.4, 1: 1683.1. Samples: 45790732. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-12 19:31:53,436][61643] Avg episode reward: [(0, '23.370'), (1, '9.980')] [2023-10-12 19:31:53,970][62635] Updated weights for policy 1, policy_version 89450 (0.0010) [2023-10-12 19:31:54,348][62635] Updated weights for policy 1, policy_version 89460 (0.0010) [2023-10-12 19:31:54,719][62635] Updated weights for policy 1, policy_version 89470 (0.0009) [2023-10-12 19:31:55,288][62634] Updated weights for policy 0, policy_version 89410 (0.0007) [2023-10-12 19:31:55,662][62634] Updated weights for policy 0, policy_version 89420 (0.0007) [2023-10-12 19:31:56,035][62634] Updated weights for policy 0, policy_version 89430 (0.0007) [2023-10-12 19:31:56,407][62634] Updated weights for policy 0, policy_version 89440 (0.0010) [2023-10-12 19:31:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183205888. Throughput: 0: 1670.2, 1: 1687.0. Samples: 45810870. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:31:58,436][61643] Avg episode reward: [(0, '23.930'), (1, '9.660')] [2023-10-12 19:31:58,728][62635] Updated weights for policy 1, policy_version 89480 (0.0010) [2023-10-12 19:31:59,089][62635] Updated weights for policy 1, policy_version 89490 (0.0011) [2023-10-12 19:31:59,460][62635] Updated weights for policy 1, policy_version 89500 (0.0008) [2023-10-12 19:32:00,575][62634] Updated weights for policy 0, policy_version 89450 (0.0009) [2023-10-12 19:32:00,955][62634] Updated weights for policy 0, policy_version 89460 (0.0010) [2023-10-12 19:32:01,339][62634] Updated weights for policy 0, policy_version 89470 (0.0009) [2023-10-12 19:32:03,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183271424. Throughput: 0: 1680.8, 1: 1686.9. Samples: 45831620. Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:03,435][61643] Avg episode reward: [(0, '23.940'), (1, '9.870')] [2023-10-12 19:32:03,535][62635] Updated weights for policy 1, policy_version 89510 (0.0008) [2023-10-12 19:32:03,902][62635] Updated weights for policy 1, policy_version 89520 (0.0010) [2023-10-12 19:32:04,272][62635] Updated weights for policy 1, policy_version 89530 (0.0011) [2023-10-12 19:32:05,239][62634] Updated weights for policy 0, policy_version 89480 (0.0011) [2023-10-12 19:32:05,612][62634] Updated weights for policy 0, policy_version 89490 (0.0010) [2023-10-12 19:32:05,987][62634] Updated weights for policy 0, policy_version 89500 (0.0007) [2023-10-12 19:32:08,382][62635] Updated weights for policy 1, policy_version 89540 (0.0010) [2023-10-12 19:32:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183336960. Throughput: 0: 1659.4, 1: 1687.8. Samples: 45841000. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:08,436][61643] Avg episode reward: [(0, '24.170'), (1, '10.010')] [2023-10-12 19:32:08,751][62635] Updated weights for policy 1, policy_version 89550 (0.0008) [2023-10-12 19:32:09,118][62635] Updated weights for policy 1, policy_version 89560 (0.0008) [2023-10-12 19:32:10,245][62634] Updated weights for policy 0, policy_version 89510 (0.0007) [2023-10-12 19:32:10,622][62634] Updated weights for policy 0, policy_version 89520 (0.0007) [2023-10-12 19:32:10,999][62634] Updated weights for policy 0, policy_version 89530 (0.0008) [2023-10-12 19:32:13,103][62635] Updated weights for policy 1, policy_version 89570 (0.0009) [2023-10-12 19:32:13,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183402496. Throughput: 0: 1671.1, 1: 1688.4. Samples: 45861352. Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:13,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.700')] [2023-10-12 19:32:13,476][62635] Updated weights for policy 1, policy_version 89580 (0.0009) [2023-10-12 19:32:13,844][62635] Updated weights for policy 1, policy_version 89590 (0.0009) [2023-10-12 19:32:14,212][62635] Updated weights for policy 1, policy_version 89600 (0.0009) [2023-10-12 19:32:15,013][62634] Updated weights for policy 0, policy_version 89540 (0.0007) [2023-10-12 19:32:15,394][62634] Updated weights for policy 0, policy_version 89550 (0.0008) [2023-10-12 19:32:15,772][62634] Updated weights for policy 0, policy_version 89560 (0.0009) [2023-10-12 19:32:18,112][62635] Updated weights for policy 1, policy_version 89610 (0.0009) [2023-10-12 19:32:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 183468032. Throughput: 0: 1679.4, 1: 1686.2. Samples: 45882138. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:18,436][61643] Avg episode reward: [(0, '24.370'), (1, '9.610')] [2023-10-12 19:32:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000089568_91717632.pth... [2023-10-12 19:32:18,483][62635] Updated weights for policy 1, policy_version 89620 (0.0011) [2023-10-12 19:32:18,484][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000088000_90112000.pth [2023-10-12 19:32:18,852][62635] Updated weights for policy 1, policy_version 89630 (0.0010) [2023-10-12 19:32:18,920][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000089632_91783168.pth... [2023-10-12 19:32:18,958][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000088032_90144768.pth [2023-10-12 19:32:19,897][62634] Updated weights for policy 0, policy_version 89570 (0.0007) [2023-10-12 19:32:20,267][62634] Updated weights for policy 0, policy_version 89580 (0.0007) [2023-10-12 19:32:20,654][62634] Updated weights for policy 0, policy_version 89590 (0.0010) [2023-10-12 19:32:21,019][62634] Updated weights for policy 0, policy_version 89600 (0.0010) [2023-10-12 19:32:23,050][62635] Updated weights for policy 1, policy_version 89640 (0.0008) [2023-10-12 19:32:23,415][62635] Updated weights for policy 1, policy_version 89650 (0.0008) [2023-10-12 19:32:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183533568. Throughput: 0: 1659.0, 1: 1695.7. Samples: 45891602. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:23,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.710')] [2023-10-12 19:32:23,784][62635] Updated weights for policy 1, policy_version 89660 (0.0008) [2023-10-12 19:32:25,260][62634] Updated weights for policy 0, policy_version 89610 (0.0008) [2023-10-12 19:32:25,640][62634] Updated weights for policy 0, policy_version 89620 (0.0009) [2023-10-12 19:32:26,011][62634] Updated weights for policy 0, policy_version 89630 (0.0007) [2023-10-12 19:32:27,982][62635] Updated weights for policy 1, policy_version 89670 (0.0008) [2023-10-12 19:32:28,349][62635] Updated weights for policy 1, policy_version 89680 (0.0007) [2023-10-12 19:32:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183599104. Throughput: 0: 1678.9, 1: 1696.9. Samples: 45912138. Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:28,435][61643] Avg episode reward: [(0, '24.460'), (1, '9.690')] [2023-10-12 19:32:28,720][62635] Updated weights for policy 1, policy_version 89690 (0.0009) [2023-10-12 19:32:29,945][62634] Updated weights for policy 0, policy_version 89640 (0.0008) [2023-10-12 19:32:30,325][62634] Updated weights for policy 0, policy_version 89650 (0.0010) [2023-10-12 19:32:30,704][62634] Updated weights for policy 0, policy_version 89660 (0.0009) [2023-10-12 19:32:32,676][62635] Updated weights for policy 1, policy_version 89700 (0.0008) [2023-10-12 19:32:33,049][62635] Updated weights for policy 1, policy_version 89710 (0.0009) [2023-10-12 19:32:33,407][62635] Updated weights for policy 1, policy_version 89720 (0.0009) [2023-10-12 19:32:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183664640. Throughput: 0: 1677.4, 1: 1682.2. Samples: 45932400. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:33,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.680')] [2023-10-12 19:32:34,644][62634] Updated weights for policy 0, policy_version 89670 (0.0008) [2023-10-12 19:32:35,011][62634] Updated weights for policy 0, policy_version 89680 (0.0010) [2023-10-12 19:32:35,390][62634] Updated weights for policy 0, policy_version 89690 (0.0009) [2023-10-12 19:32:37,379][62635] Updated weights for policy 1, policy_version 89730 (0.0009) [2023-10-12 19:32:37,744][62635] Updated weights for policy 1, policy_version 89740 (0.0009) [2023-10-12 19:32:38,105][62635] Updated weights for policy 1, policy_version 89750 (0.0008) [2023-10-12 19:32:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183730176. Throughput: 0: 1666.2, 1: 1699.9. Samples: 45942208. Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:38,435][61643] Avg episode reward: [(0, '24.570'), (1, '9.600')] [2023-10-12 19:32:38,468][62635] Updated weights for policy 1, policy_version 89760 (0.0007) [2023-10-12 19:32:39,432][62634] Updated weights for policy 0, policy_version 89700 (0.0009) [2023-10-12 19:32:39,806][62634] Updated weights for policy 0, policy_version 89710 (0.0008) [2023-10-12 19:32:40,188][62634] Updated weights for policy 0, policy_version 89720 (0.0009) [2023-10-12 19:32:42,530][62635] Updated weights for policy 1, policy_version 89770 (0.0007) [2023-10-12 19:32:42,905][62635] Updated weights for policy 1, policy_version 89780 (0.0009) [2023-10-12 19:32:43,268][62635] Updated weights for policy 1, policy_version 89790 (0.0008) [2023-10-12 19:32:43,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 183828480. Throughput: 0: 1681.9, 1: 1699.0. Samples: 45963008. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:43,435][61643] Avg episode reward: [(0, '24.850'), (1, '9.420')] [2023-10-12 19:32:44,181][62634] Updated weights for policy 0, policy_version 89730 (0.0010) [2023-10-12 19:32:44,554][62634] Updated weights for policy 0, policy_version 89740 (0.0009) [2023-10-12 19:32:44,932][62634] Updated weights for policy 0, policy_version 89750 (0.0009) [2023-10-12 19:32:45,312][62634] Updated weights for policy 0, policy_version 89760 (0.0008) [2023-10-12 19:32:47,244][62635] Updated weights for policy 1, policy_version 89800 (0.0009) [2023-10-12 19:32:47,600][62635] Updated weights for policy 1, policy_version 89810 (0.0008) [2023-10-12 19:32:47,962][62635] Updated weights for policy 1, policy_version 89820 (0.0008) [2023-10-12 19:32:48,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 183894016. Throughput: 0: 1683.5, 1: 1673.4. Samples: 45982678. Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:48,436][61643] Avg episode reward: [(0, '24.740'), (1, '9.710')] [2023-10-12 19:32:49,515][62634] Updated weights for policy 0, policy_version 89770 (0.0008) [2023-10-12 19:32:49,884][62634] Updated weights for policy 0, policy_version 89780 (0.0008) [2023-10-12 19:32:50,259][62634] Updated weights for policy 0, policy_version 89790 (0.0010) [2023-10-12 19:32:51,998][62635] Updated weights for policy 1, policy_version 89830 (0.0008) [2023-10-12 19:32:52,367][62635] Updated weights for policy 1, policy_version 89840 (0.0007) [2023-10-12 19:32:52,734][62635] Updated weights for policy 1, policy_version 89850 (0.0008) [2023-10-12 19:32:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 183959552. Throughput: 0: 1674.6, 1: 1696.8. Samples: 45992712. 
Policy #0 lag: (min: 9.0, avg: 21.8, max: 41.0) [2023-10-12 19:32:53,436][61643] Avg episode reward: [(0, '24.790'), (1, '9.530')] [2023-10-12 19:32:54,437][62634] Updated weights for policy 0, policy_version 89800 (0.0008) [2023-10-12 19:32:54,812][62634] Updated weights for policy 0, policy_version 89810 (0.0007) [2023-10-12 19:32:55,191][62634] Updated weights for policy 0, policy_version 89820 (0.0008) [2023-10-12 19:32:56,969][62635] Updated weights for policy 1, policy_version 89860 (0.0008) [2023-10-12 19:32:57,339][62635] Updated weights for policy 1, policy_version 89870 (0.0007) [2023-10-12 19:32:57,717][62635] Updated weights for policy 1, policy_version 89880 (0.0007) [2023-10-12 19:32:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184025088. Throughput: 0: 1680.8, 1: 1692.1. Samples: 46013134. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:32:58,436][61643] Avg episode reward: [(0, '25.180'), (1, '9.380')] [2023-10-12 19:32:59,259][62634] Updated weights for policy 0, policy_version 89830 (0.0009) [2023-10-12 19:32:59,640][62634] Updated weights for policy 0, policy_version 89840 (0.0010) [2023-10-12 19:33:00,011][62634] Updated weights for policy 0, policy_version 89850 (0.0010) [2023-10-12 19:33:01,673][62635] Updated weights for policy 1, policy_version 89890 (0.0008) [2023-10-12 19:33:02,039][62635] Updated weights for policy 1, policy_version 89900 (0.0009) [2023-10-12 19:33:02,409][62635] Updated weights for policy 1, policy_version 89910 (0.0010) [2023-10-12 19:33:02,775][62635] Updated weights for policy 1, policy_version 89920 (0.0009) [2023-10-12 19:33:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184090624. Throughput: 0: 1682.9, 1: 1669.3. Samples: 46032982. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:03,436][61643] Avg episode reward: [(0, '25.090'), (1, '9.500')] [2023-10-12 19:33:04,086][62634] Updated weights for policy 0, policy_version 89860 (0.0008) [2023-10-12 19:33:04,454][62634] Updated weights for policy 0, policy_version 89870 (0.0008) [2023-10-12 19:33:04,837][62634] Updated weights for policy 0, policy_version 89880 (0.0007) [2023-10-12 19:33:06,732][62635] Updated weights for policy 1, policy_version 89930 (0.0008) [2023-10-12 19:33:07,090][62635] Updated weights for policy 1, policy_version 89940 (0.0010) [2023-10-12 19:33:07,461][62635] Updated weights for policy 1, policy_version 89950 (0.0007) [2023-10-12 19:33:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184156160. Throughput: 0: 1679.2, 1: 1693.1. Samples: 46043356. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:08,436][61643] Avg episode reward: [(0, '24.940'), (1, '9.550')] [2023-10-12 19:33:08,776][62634] Updated weights for policy 0, policy_version 89890 (0.0007) [2023-10-12 19:33:09,158][62634] Updated weights for policy 0, policy_version 89900 (0.0010) [2023-10-12 19:33:09,528][62634] Updated weights for policy 0, policy_version 89910 (0.0010) [2023-10-12 19:33:09,903][62634] Updated weights for policy 0, policy_version 89920 (0.0009) [2023-10-12 19:33:11,580][62635] Updated weights for policy 1, policy_version 89960 (0.0009) [2023-10-12 19:33:11,946][62635] Updated weights for policy 1, policy_version 89970 (0.0008) [2023-10-12 19:33:12,321][62635] Updated weights for policy 1, policy_version 89980 (0.0008) [2023-10-12 19:33:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 184221696. Throughput: 0: 1686.8, 1: 1675.6. Samples: 46063446. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:13,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.840')] [2023-10-12 19:33:13,808][62634] Updated weights for policy 0, policy_version 89930 (0.0009) [2023-10-12 19:33:14,196][62634] Updated weights for policy 0, policy_version 89940 (0.0010) [2023-10-12 19:33:14,573][62634] Updated weights for policy 0, policy_version 89950 (0.0008) [2023-10-12 19:33:16,438][62635] Updated weights for policy 1, policy_version 89990 (0.0009) [2023-10-12 19:33:16,808][62635] Updated weights for policy 1, policy_version 90000 (0.0007) [2023-10-12 19:33:17,184][62635] Updated weights for policy 1, policy_version 90010 (0.0010) [2023-10-12 19:33:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 184287232. Throughput: 0: 1689.2, 1: 1676.8. Samples: 46083870. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:18,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.800')] [2023-10-12 19:33:18,497][62634] Updated weights for policy 0, policy_version 89960 (0.0007) [2023-10-12 19:33:18,868][62634] Updated weights for policy 0, policy_version 89970 (0.0009) [2023-10-12 19:33:19,248][62634] Updated weights for policy 0, policy_version 89980 (0.0007) [2023-10-12 19:33:21,141][62635] Updated weights for policy 1, policy_version 90020 (0.0008) [2023-10-12 19:33:21,506][62635] Updated weights for policy 1, policy_version 90030 (0.0009) [2023-10-12 19:33:21,880][62635] Updated weights for policy 1, policy_version 90040 (0.0009) [2023-10-12 19:33:23,284][62634] Updated weights for policy 0, policy_version 89990 (0.0007) [2023-10-12 19:33:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184352768. Throughput: 0: 1688.1, 1: 1689.5. Samples: 46094200. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:23,435][61643] Avg episode reward: [(0, '24.810'), (1, '9.780')] [2023-10-12 19:33:23,657][62634] Updated weights for policy 0, policy_version 90000 (0.0010) [2023-10-12 19:33:24,040][62634] Updated weights for policy 0, policy_version 90010 (0.0008) [2023-10-12 19:33:25,905][62635] Updated weights for policy 1, policy_version 90050 (0.0007) [2023-10-12 19:33:26,265][62635] Updated weights for policy 1, policy_version 90060 (0.0009) [2023-10-12 19:33:26,628][62635] Updated weights for policy 1, policy_version 90070 (0.0008) [2023-10-12 19:33:26,992][62635] Updated weights for policy 1, policy_version 90080 (0.0007) [2023-10-12 19:33:28,087][62634] Updated weights for policy 0, policy_version 90020 (0.0009) [2023-10-12 19:33:28,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184418304. Throughput: 0: 1687.8, 1: 1663.6. Samples: 46113818. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:28,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.720')] [2023-10-12 19:33:28,450][62634] Updated weights for policy 0, policy_version 90030 (0.0010) [2023-10-12 19:33:28,835][62634] Updated weights for policy 0, policy_version 90040 (0.0010) [2023-10-12 19:33:31,033][62635] Updated weights for policy 1, policy_version 90090 (0.0007) [2023-10-12 19:33:31,410][62635] Updated weights for policy 1, policy_version 90100 (0.0008) [2023-10-12 19:33:31,779][62635] Updated weights for policy 1, policy_version 90110 (0.0008) [2023-10-12 19:33:32,867][62634] Updated weights for policy 0, policy_version 90050 (0.0010) [2023-10-12 19:33:33,256][62634] Updated weights for policy 0, policy_version 90060 (0.0009) [2023-10-12 19:33:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184483840. Throughput: 0: 1684.8, 1: 1688.6. Samples: 46134484. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:33,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.900')] [2023-10-12 19:33:33,625][62634] Updated weights for policy 0, policy_version 90070 (0.0009) [2023-10-12 19:33:34,005][62634] Updated weights for policy 0, policy_version 90080 (0.0008) [2023-10-12 19:33:35,956][62635] Updated weights for policy 1, policy_version 90120 (0.0008) [2023-10-12 19:33:36,313][62635] Updated weights for policy 1, policy_version 90130 (0.0008) [2023-10-12 19:33:36,689][62635] Updated weights for policy 1, policy_version 90140 (0.0008) [2023-10-12 19:33:37,856][62634] Updated weights for policy 0, policy_version 90090 (0.0010) [2023-10-12 19:33:38,237][62634] Updated weights for policy 0, policy_version 90100 (0.0010) [2023-10-12 19:33:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184549376. Throughput: 0: 1695.6, 1: 1682.0. Samples: 46144708. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:38,435][61643] Avg episode reward: [(0, '25.080'), (1, '9.900')] [2023-10-12 19:33:38,614][62634] Updated weights for policy 0, policy_version 90110 (0.0010) [2023-10-12 19:33:40,692][62635] Updated weights for policy 1, policy_version 90150 (0.0009) [2023-10-12 19:33:41,051][62635] Updated weights for policy 1, policy_version 90160 (0.0008) [2023-10-12 19:33:41,421][62635] Updated weights for policy 1, policy_version 90170 (0.0007) [2023-10-12 19:33:42,834][62634] Updated weights for policy 0, policy_version 90120 (0.0008) [2023-10-12 19:33:43,203][62634] Updated weights for policy 0, policy_version 90130 (0.0008) [2023-10-12 19:33:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 184614912. Throughput: 0: 1699.7, 1: 1667.4. Samples: 46164654. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:43,435][61643] Avg episode reward: [(0, '25.040'), (1, '9.840')] [2023-10-12 19:33:43,575][62634] Updated weights for policy 0, policy_version 90140 (0.0010) [2023-10-12 19:33:45,400][62635] Updated weights for policy 1, policy_version 90180 (0.0008) [2023-10-12 19:33:45,769][62635] Updated weights for policy 1, policy_version 90190 (0.0007) [2023-10-12 19:33:46,140][62635] Updated weights for policy 1, policy_version 90200 (0.0009) [2023-10-12 19:33:47,563][62634] Updated weights for policy 0, policy_version 90150 (0.0009) [2023-10-12 19:33:47,941][62634] Updated weights for policy 0, policy_version 90160 (0.0009) [2023-10-12 19:33:48,320][62634] Updated weights for policy 0, policy_version 90170 (0.0009) [2023-10-12 19:33:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 184680448. Throughput: 0: 1682.2, 1: 1694.9. Samples: 46184952. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:48,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.780')] [2023-10-12 19:33:50,276][62635] Updated weights for policy 1, policy_version 90210 (0.0009) [2023-10-12 19:33:50,644][62635] Updated weights for policy 1, policy_version 90220 (0.0007) [2023-10-12 19:33:51,019][62635] Updated weights for policy 1, policy_version 90230 (0.0008) [2023-10-12 19:33:51,377][62635] Updated weights for policy 1, policy_version 90240 (0.0009) [2023-10-12 19:33:52,529][62634] Updated weights for policy 0, policy_version 90180 (0.0007) [2023-10-12 19:33:52,900][62634] Updated weights for policy 0, policy_version 90190 (0.0009) [2023-10-12 19:33:53,280][62634] Updated weights for policy 0, policy_version 90200 (0.0008) [2023-10-12 19:33:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 184745984. Throughput: 0: 1696.4, 1: 1675.1. Samples: 46195072. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-12 19:33:53,435][61643] Avg episode reward: [(0, '24.820'), (1, '9.500')] [2023-10-12 19:33:55,504][62635] Updated weights for policy 1, policy_version 90250 (0.0007) [2023-10-12 19:33:55,863][62635] Updated weights for policy 1, policy_version 90260 (0.0007) [2023-10-12 19:33:56,228][62635] Updated weights for policy 1, policy_version 90270 (0.0009) [2023-10-12 19:33:57,222][62634] Updated weights for policy 0, policy_version 90210 (0.0009) [2023-10-12 19:33:57,598][62634] Updated weights for policy 0, policy_version 90220 (0.0009) [2023-10-12 19:33:57,979][62634] Updated weights for policy 0, policy_version 90230 (0.0008) [2023-10-12 19:33:58,357][62634] Updated weights for policy 0, policy_version 90240 (0.0008) [2023-10-12 19:33:58,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 184844288. Throughput: 0: 1694.4, 1: 1683.2. Samples: 46215438. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:33:58,436][61643] Avg episode reward: [(0, '24.850'), (1, '9.790')] [2023-10-12 19:34:00,347][62635] Updated weights for policy 1, policy_version 90280 (0.0009) [2023-10-12 19:34:00,717][62635] Updated weights for policy 1, policy_version 90290 (0.0010) [2023-10-12 19:34:01,090][62635] Updated weights for policy 1, policy_version 90300 (0.0007) [2023-10-12 19:34:02,293][62634] Updated weights for policy 0, policy_version 90250 (0.0008) [2023-10-12 19:34:02,666][62634] Updated weights for policy 0, policy_version 90260 (0.0008) [2023-10-12 19:34:03,040][62634] Updated weights for policy 0, policy_version 90270 (0.0010) [2023-10-12 19:34:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 184909824. Throughput: 0: 1668.4, 1: 1694.9. Samples: 46235220. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:03,435][61643] Avg episode reward: [(0, '24.900'), (1, '9.970')] [2023-10-12 19:34:04,958][62635] Updated weights for policy 1, policy_version 90310 (0.0007) [2023-10-12 19:34:05,321][62635] Updated weights for policy 1, policy_version 90320 (0.0010) [2023-10-12 19:34:05,696][62635] Updated weights for policy 1, policy_version 90330 (0.0011) [2023-10-12 19:34:07,076][62634] Updated weights for policy 0, policy_version 90280 (0.0009) [2023-10-12 19:34:07,457][62634] Updated weights for policy 0, policy_version 90290 (0.0008) [2023-10-12 19:34:07,839][62634] Updated weights for policy 0, policy_version 90300 (0.0007) [2023-10-12 19:34:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 184975360. Throughput: 0: 1689.5, 1: 1666.1. Samples: 46245200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:08,435][61643] Avg episode reward: [(0, '24.940'), (1, '9.830')] [2023-10-12 19:34:09,743][62635] Updated weights for policy 1, policy_version 90340 (0.0008) [2023-10-12 19:34:10,112][62635] Updated weights for policy 1, policy_version 90350 (0.0007) [2023-10-12 19:34:10,481][62635] Updated weights for policy 1, policy_version 90360 (0.0008) [2023-10-12 19:34:11,977][62634] Updated weights for policy 0, policy_version 90310 (0.0010) [2023-10-12 19:34:12,348][62634] Updated weights for policy 0, policy_version 90320 (0.0010) [2023-10-12 19:34:12,724][62634] Updated weights for policy 0, policy_version 90330 (0.0007) [2023-10-12 19:34:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185040896. Throughput: 0: 1682.6, 1: 1696.8. Samples: 46265890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:13,435][61643] Avg episode reward: [(0, '24.940'), (1, '9.880')] [2023-10-12 19:34:14,534][62635] Updated weights for policy 1, policy_version 90370 (0.0007) [2023-10-12 19:34:14,901][62635] Updated weights for policy 1, policy_version 90380 (0.0007) [2023-10-12 19:34:15,266][62635] Updated weights for policy 1, policy_version 90390 (0.0007) [2023-10-12 19:34:15,643][62635] Updated weights for policy 1, policy_version 90400 (0.0007) [2023-10-12 19:34:16,806][62634] Updated weights for policy 0, policy_version 90340 (0.0007) [2023-10-12 19:34:17,173][62634] Updated weights for policy 0, policy_version 90350 (0.0009) [2023-10-12 19:34:17,553][62634] Updated weights for policy 0, policy_version 90360 (0.0009) [2023-10-12 19:34:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185106432. Throughput: 0: 1661.9, 1: 1694.9. Samples: 46285542. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:18,435][61643] Avg episode reward: [(0, '25.070'), (1, '10.010')] [2023-10-12 19:34:18,444][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000090400_92569600.pth... [2023-10-12 19:34:18,445][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000090368_92536832.pth... 
[2023-10-12 19:34:18,482][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000088800_90931200.pth [2023-10-12 19:34:18,486][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000088832_90963968.pth [2023-10-12 19:34:19,707][62635] Updated weights for policy 1, policy_version 90410 (0.0008) [2023-10-12 19:34:20,073][62635] Updated weights for policy 1, policy_version 90420 (0.0008) [2023-10-12 19:34:20,440][62635] Updated weights for policy 1, policy_version 90430 (0.0009) [2023-10-12 19:34:21,570][62634] Updated weights for policy 0, policy_version 90370 (0.0010) [2023-10-12 19:34:21,950][62634] Updated weights for policy 0, policy_version 90380 (0.0010) [2023-10-12 19:34:22,321][62634] Updated weights for policy 0, policy_version 90390 (0.0011) [2023-10-12 19:34:22,695][62634] Updated weights for policy 0, policy_version 90400 (0.0011) [2023-10-12 19:34:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185171968. Throughput: 0: 1683.7, 1: 1676.5. Samples: 46295916. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:23,436][61643] Avg episode reward: [(0, '24.940'), (1, '9.920')] [2023-10-12 19:34:24,570][62635] Updated weights for policy 1, policy_version 90440 (0.0007) [2023-10-12 19:34:24,937][62635] Updated weights for policy 1, policy_version 90450 (0.0008) [2023-10-12 19:34:25,312][62635] Updated weights for policy 1, policy_version 90460 (0.0008) [2023-10-12 19:34:26,896][62634] Updated weights for policy 0, policy_version 90410 (0.0007) [2023-10-12 19:34:27,271][62634] Updated weights for policy 0, policy_version 90420 (0.0008) [2023-10-12 19:34:27,650][62634] Updated weights for policy 0, policy_version 90430 (0.0010) [2023-10-12 19:34:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185237504. Throughput: 0: 1668.9, 1: 1695.2. Samples: 46316038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:28,435][61643] Avg episode reward: [(0, '25.080'), (1, '9.910')] [2023-10-12 19:34:29,380][62635] Updated weights for policy 1, policy_version 90470 (0.0009) [2023-10-12 19:34:29,748][62635] Updated weights for policy 1, policy_version 90480 (0.0009) [2023-10-12 19:34:30,128][62635] Updated weights for policy 1, policy_version 90490 (0.0007) [2023-10-12 19:34:31,663][62634] Updated weights for policy 0, policy_version 90440 (0.0010) [2023-10-12 19:34:32,049][62634] Updated weights for policy 0, policy_version 90450 (0.0010) [2023-10-12 19:34:32,428][62634] Updated weights for policy 0, policy_version 90460 (0.0008) [2023-10-12 19:34:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185303040. Throughput: 0: 1663.6, 1: 1695.2. Samples: 46336100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:33,435][61643] Avg episode reward: [(0, '25.390'), (1, '9.750')] [2023-10-12 19:34:34,000][62635] Updated weights for policy 1, policy_version 90500 (0.0007) [2023-10-12 19:34:34,373][62635] Updated weights for policy 1, policy_version 90510 (0.0009) [2023-10-12 19:34:34,741][62635] Updated weights for policy 1, policy_version 90520 (0.0007) [2023-10-12 19:34:36,508][62634] Updated weights for policy 0, policy_version 90470 (0.0008) [2023-10-12 19:34:36,886][62634] Updated weights for policy 0, policy_version 90480 (0.0010) [2023-10-12 19:34:37,266][62634] Updated weights for policy 0, policy_version 90490 (0.0007) [2023-10-12 19:34:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185368576. Throughput: 0: 1681.3, 1: 1684.9. Samples: 46346552. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:38,435][61643] Avg episode reward: [(0, '25.360'), (1, '9.660')] [2023-10-12 19:34:38,768][62635] Updated weights for policy 1, policy_version 90530 (0.0008) [2023-10-12 19:34:39,136][62635] Updated weights for policy 1, policy_version 90540 (0.0008) [2023-10-12 19:34:39,509][62635] Updated weights for policy 1, policy_version 90550 (0.0008) [2023-10-12 19:34:39,873][62635] Updated weights for policy 1, policy_version 90560 (0.0008) [2023-10-12 19:34:41,322][62634] Updated weights for policy 0, policy_version 90500 (0.0009) [2023-10-12 19:34:41,699][62634] Updated weights for policy 0, policy_version 90510 (0.0009) [2023-10-12 19:34:42,074][62634] Updated weights for policy 0, policy_version 90520 (0.0010) [2023-10-12 19:34:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185434112. Throughput: 0: 1660.5, 1: 1698.0. Samples: 46366568. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:43,436][61643] Avg episode reward: [(0, '25.080'), (1, '9.700')] [2023-10-12 19:34:43,917][62635] Updated weights for policy 1, policy_version 90570 (0.0009) [2023-10-12 19:34:44,294][62635] Updated weights for policy 1, policy_version 90580 (0.0007) [2023-10-12 19:34:44,657][62635] Updated weights for policy 1, policy_version 90590 (0.0007) [2023-10-12 19:34:46,024][62634] Updated weights for policy 0, policy_version 90530 (0.0010) [2023-10-12 19:34:46,398][62634] Updated weights for policy 0, policy_version 90540 (0.0010) [2023-10-12 19:34:46,772][62634] Updated weights for policy 0, policy_version 90550 (0.0008) [2023-10-12 19:34:47,148][62634] Updated weights for policy 0, policy_version 90560 (0.0007) [2023-10-12 19:34:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185499648. Throughput: 0: 1674.2, 1: 1697.1. Samples: 46386926. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:48,435][61643] Avg episode reward: [(0, '25.060'), (1, '9.880')] [2023-10-12 19:34:48,835][62635] Updated weights for policy 1, policy_version 90600 (0.0007) [2023-10-12 19:34:49,203][62635] Updated weights for policy 1, policy_version 90610 (0.0007) [2023-10-12 19:34:49,569][62635] Updated weights for policy 1, policy_version 90620 (0.0009) [2023-10-12 19:34:51,235][62634] Updated weights for policy 0, policy_version 90570 (0.0008) [2023-10-12 19:34:51,625][62634] Updated weights for policy 0, policy_version 90580 (0.0008) [2023-10-12 19:34:52,000][62634] Updated weights for policy 0, policy_version 90590 (0.0009) [2023-10-12 19:34:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185565184. Throughput: 0: 1681.1, 1: 1693.8. Samples: 46397068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:53,435][61643] Avg episode reward: [(0, '25.000'), (1, '9.790')] [2023-10-12 19:34:53,542][62635] Updated weights for policy 1, policy_version 90630 (0.0010) [2023-10-12 19:34:53,911][62635] Updated weights for policy 1, policy_version 90640 (0.0010) [2023-10-12 19:34:54,284][62635] Updated weights for policy 1, policy_version 90650 (0.0011) [2023-10-12 19:34:56,028][62634] Updated weights for policy 0, policy_version 90600 (0.0011) [2023-10-12 19:34:56,402][62634] Updated weights for policy 0, policy_version 90610 (0.0007) [2023-10-12 19:34:56,787][62634] Updated weights for policy 0, policy_version 90620 (0.0007) [2023-10-12 19:34:58,406][62635] Updated weights for policy 1, policy_version 90660 (0.0009) [2023-10-12 19:34:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185630720. Throughput: 0: 1662.0, 1: 1689.1. Samples: 46416692. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:34:58,435][61643] Avg episode reward: [(0, '25.040'), (1, '9.790')] [2023-10-12 19:34:58,776][62635] Updated weights for policy 1, policy_version 90670 (0.0009) [2023-10-12 19:34:59,134][62635] Updated weights for policy 1, policy_version 90680 (0.0008) [2023-10-12 19:35:00,798][62634] Updated weights for policy 0, policy_version 90630 (0.0007) [2023-10-12 19:35:01,164][62634] Updated weights for policy 0, policy_version 90640 (0.0008) [2023-10-12 19:35:01,548][62634] Updated weights for policy 0, policy_version 90650 (0.0007) [2023-10-12 19:35:03,155][62635] Updated weights for policy 1, policy_version 90690 (0.0010) [2023-10-12 19:35:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185696256. Throughput: 0: 1685.7, 1: 1691.7. Samples: 46437526. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:03,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.720')] [2023-10-12 19:35:03,516][62635] Updated weights for policy 1, policy_version 90700 (0.0008) [2023-10-12 19:35:03,878][62635] Updated weights for policy 1, policy_version 90710 (0.0009) [2023-10-12 19:35:04,246][62635] Updated weights for policy 1, policy_version 90720 (0.0009) [2023-10-12 19:35:05,437][62634] Updated weights for policy 0, policy_version 90660 (0.0007) [2023-10-12 19:35:05,811][62634] Updated weights for policy 0, policy_version 90670 (0.0009) [2023-10-12 19:35:06,184][62634] Updated weights for policy 0, policy_version 90680 (0.0009) [2023-10-12 19:35:08,355][62635] Updated weights for policy 1, policy_version 90730 (0.0007) [2023-10-12 19:35:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185761792. Throughput: 0: 1672.2, 1: 1694.0. Samples: 46447396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:08,436][61643] Avg episode reward: [(0, '24.870'), (1, '9.940')] [2023-10-12 19:35:08,722][62635] Updated weights for policy 1, policy_version 90740 (0.0008) [2023-10-12 19:35:09,086][62635] Updated weights for policy 1, policy_version 90750 (0.0007) [2023-10-12 19:35:10,251][62634] Updated weights for policy 0, policy_version 90690 (0.0009) [2023-10-12 19:35:10,623][62634] Updated weights for policy 0, policy_version 90700 (0.0008) [2023-10-12 19:35:11,010][62634] Updated weights for policy 0, policy_version 90710 (0.0007) [2023-10-12 19:35:11,383][62634] Updated weights for policy 0, policy_version 90720 (0.0007) [2023-10-12 19:35:12,994][62635] Updated weights for policy 1, policy_version 90760 (0.0008) [2023-10-12 19:35:13,362][62635] Updated weights for policy 1, policy_version 90770 (0.0008) [2023-10-12 19:35:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185827328. Throughput: 0: 1672.8, 1: 1696.1. Samples: 46467638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:13,435][61643] Avg episode reward: [(0, '24.970'), (1, '9.820')] [2023-10-12 19:35:13,736][62635] Updated weights for policy 1, policy_version 90780 (0.0009) [2023-10-12 19:35:15,448][62634] Updated weights for policy 0, policy_version 90730 (0.0010) [2023-10-12 19:35:15,827][62634] Updated weights for policy 0, policy_version 90740 (0.0011) [2023-10-12 19:35:16,207][62634] Updated weights for policy 0, policy_version 90750 (0.0010) [2023-10-12 19:35:17,874][62635] Updated weights for policy 1, policy_version 90790 (0.0008) [2023-10-12 19:35:18,246][62635] Updated weights for policy 1, policy_version 90800 (0.0007) [2023-10-12 19:35:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185892864. Throughput: 0: 1691.3, 1: 1682.8. Samples: 46487938. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:18,435][61643] Avg episode reward: [(0, '24.960'), (1, '9.780')] [2023-10-12 19:35:18,613][62635] Updated weights for policy 1, policy_version 90810 (0.0007) [2023-10-12 19:35:20,422][62634] Updated weights for policy 0, policy_version 90760 (0.0009) [2023-10-12 19:35:20,797][62634] Updated weights for policy 0, policy_version 90770 (0.0008) [2023-10-12 19:35:21,168][62634] Updated weights for policy 0, policy_version 90780 (0.0007) [2023-10-12 19:35:22,586][62635] Updated weights for policy 1, policy_version 90820 (0.0008) [2023-10-12 19:35:22,950][62635] Updated weights for policy 1, policy_version 90830 (0.0009) [2023-10-12 19:35:23,312][62635] Updated weights for policy 1, policy_version 90840 (0.0008) [2023-10-12 19:35:23,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 185958400. Throughput: 0: 1672.3, 1: 1693.3. Samples: 46498008. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:23,436][61643] Avg episode reward: [(0, '24.770'), (1, '9.680')] [2023-10-12 19:35:25,347][62634] Updated weights for policy 0, policy_version 90790 (0.0008) [2023-10-12 19:35:25,729][62634] Updated weights for policy 0, policy_version 90800 (0.0008) [2023-10-12 19:35:26,103][62634] Updated weights for policy 0, policy_version 90810 (0.0009) [2023-10-12 19:35:27,455][62635] Updated weights for policy 1, policy_version 90850 (0.0009) [2023-10-12 19:35:27,824][62635] Updated weights for policy 1, policy_version 90860 (0.0009) [2023-10-12 19:35:28,180][62635] Updated weights for policy 1, policy_version 90870 (0.0007) [2023-10-12 19:35:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 186023936. Throughput: 0: 1681.6, 1: 1695.3. Samples: 46518530. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:28,436][61643] Avg episode reward: [(0, '24.770'), (1, '9.490')] [2023-10-12 19:35:28,544][62635] Updated weights for policy 1, policy_version 90880 (0.0008) [2023-10-12 19:35:30,094][62634] Updated weights for policy 0, policy_version 90820 (0.0010) [2023-10-12 19:35:30,459][62634] Updated weights for policy 0, policy_version 90830 (0.0009) [2023-10-12 19:35:30,831][62634] Updated weights for policy 0, policy_version 90840 (0.0008) [2023-10-12 19:35:32,318][62635] Updated weights for policy 1, policy_version 90890 (0.0007) [2023-10-12 19:35:32,688][62635] Updated weights for policy 1, policy_version 90900 (0.0011) [2023-10-12 19:35:33,056][62635] Updated weights for policy 1, policy_version 90910 (0.0011) [2023-10-12 19:35:33,435][61643] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186122240. Throughput: 0: 1694.4, 1: 1675.6. Samples: 46538576. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:33,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.710')] [2023-10-12 19:35:34,784][62634] Updated weights for policy 0, policy_version 90850 (0.0008) [2023-10-12 19:35:35,151][62634] Updated weights for policy 0, policy_version 90860 (0.0010) [2023-10-12 19:35:35,526][62634] Updated weights for policy 0, policy_version 90870 (0.0009) [2023-10-12 19:35:35,900][62634] Updated weights for policy 0, policy_version 90880 (0.0010) [2023-10-12 19:35:37,218][62635] Updated weights for policy 1, policy_version 90920 (0.0008) [2023-10-12 19:35:37,603][62635] Updated weights for policy 1, policy_version 90930 (0.0007) [2023-10-12 19:35:37,975][62635] Updated weights for policy 1, policy_version 90940 (0.0008) [2023-10-12 19:35:38,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186187776. Throughput: 0: 1664.7, 1: 1703.9. Samples: 46548654. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:38,436][61643] Avg episode reward: [(0, '24.820'), (1, '9.370')] [2023-10-12 19:35:40,109][62634] Updated weights for policy 0, policy_version 90890 (0.0007) [2023-10-12 19:35:40,498][62634] Updated weights for policy 0, policy_version 90900 (0.0010) [2023-10-12 19:35:40,869][62634] Updated weights for policy 0, policy_version 90910 (0.0009) [2023-10-12 19:35:41,952][62635] Updated weights for policy 1, policy_version 90950 (0.0008) [2023-10-12 19:35:42,321][62635] Updated weights for policy 1, policy_version 90960 (0.0008) [2023-10-12 19:35:42,691][62635] Updated weights for policy 1, policy_version 90970 (0.0007) [2023-10-12 19:35:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186253312. Throughput: 0: 1689.4, 1: 1691.0. Samples: 46568812. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:43,435][61643] Avg episode reward: [(0, '24.580'), (1, '9.400')] [2023-10-12 19:35:44,774][62634] Updated weights for policy 0, policy_version 90920 (0.0008) [2023-10-12 19:35:45,166][62634] Updated weights for policy 0, policy_version 90930 (0.0010) [2023-10-12 19:35:45,536][62634] Updated weights for policy 0, policy_version 90940 (0.0009) [2023-10-12 19:35:46,657][62635] Updated weights for policy 1, policy_version 90980 (0.0008) [2023-10-12 19:35:47,031][62635] Updated weights for policy 1, policy_version 90990 (0.0010) [2023-10-12 19:35:47,412][62635] Updated weights for policy 1, policy_version 91000 (0.0009) [2023-10-12 19:35:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186318848. Throughput: 0: 1691.6, 1: 1669.3. Samples: 46588768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:35:48,436][61643] Avg episode reward: [(0, '24.710'), (1, '9.590')] [2023-10-12 19:35:49,694][62634] Updated weights for policy 0, policy_version 90950 (0.0009) [2023-10-12 19:35:50,082][62634] Updated weights for policy 0, policy_version 90960 (0.0009) [2023-10-12 19:35:50,468][62634] Updated weights for policy 0, policy_version 90970 (0.0008) [2023-10-12 19:35:51,570][62635] Updated weights for policy 1, policy_version 91010 (0.0008) [2023-10-12 19:35:51,931][62635] Updated weights for policy 1, policy_version 91020 (0.0010) [2023-10-12 19:35:52,303][62635] Updated weights for policy 1, policy_version 91030 (0.0008) [2023-10-12 19:35:52,670][62635] Updated weights for policy 1, policy_version 91040 (0.0008) [2023-10-12 19:35:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186384384. Throughput: 0: 1670.9, 1: 1696.5. Samples: 46598928. Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:35:53,435][61643] Avg episode reward: [(0, '24.380'), (1, '9.840')] [2023-10-12 19:35:54,459][62634] Updated weights for policy 0, policy_version 90980 (0.0010) [2023-10-12 19:35:54,823][62634] Updated weights for policy 0, policy_version 90990 (0.0008) [2023-10-12 19:35:55,208][62634] Updated weights for policy 0, policy_version 91000 (0.0007) [2023-10-12 19:35:56,681][62635] Updated weights for policy 1, policy_version 91050 (0.0009) [2023-10-12 19:35:57,051][62635] Updated weights for policy 1, policy_version 91060 (0.0010) [2023-10-12 19:35:57,424][62635] Updated weights for policy 1, policy_version 91070 (0.0007) [2023-10-12 19:35:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186449920. Throughput: 0: 1683.3, 1: 1680.1. Samples: 46618992. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:35:58,435][61643] Avg episode reward: [(0, '24.560'), (1, '9.680')] [2023-10-12 19:35:59,156][62634] Updated weights for policy 0, policy_version 91010 (0.0009) [2023-10-12 19:35:59,538][62634] Updated weights for policy 0, policy_version 91020 (0.0009) [2023-10-12 19:35:59,915][62634] Updated weights for policy 0, policy_version 91030 (0.0009) [2023-10-12 19:36:00,298][62634] Updated weights for policy 0, policy_version 91040 (0.0007) [2023-10-12 19:36:01,537][62635] Updated weights for policy 1, policy_version 91080 (0.0007) [2023-10-12 19:36:01,900][62635] Updated weights for policy 1, policy_version 91090 (0.0007) [2023-10-12 19:36:02,269][62635] Updated weights for policy 1, policy_version 91100 (0.0009) [2023-10-12 19:36:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186515456. Throughput: 0: 1685.8, 1: 1677.1. Samples: 46639268. Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:03,436][61643] Avg episode reward: [(0, '24.530'), (1, '9.800')] [2023-10-12 19:36:04,425][62634] Updated weights for policy 0, policy_version 91050 (0.0007) [2023-10-12 19:36:04,807][62634] Updated weights for policy 0, policy_version 91060 (0.0010) [2023-10-12 19:36:05,182][62634] Updated weights for policy 0, policy_version 91070 (0.0008) [2023-10-12 19:36:06,329][62635] Updated weights for policy 1, policy_version 91110 (0.0010) [2023-10-12 19:36:06,694][62635] Updated weights for policy 1, policy_version 91120 (0.0008) [2023-10-12 19:36:07,058][62635] Updated weights for policy 1, policy_version 91130 (0.0008) [2023-10-12 19:36:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186580992. Throughput: 0: 1673.1, 1: 1694.1. Samples: 46649534. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:08,436][61643] Avg episode reward: [(0, '24.560'), (1, '9.760')] [2023-10-12 19:36:09,349][62634] Updated weights for policy 0, policy_version 91080 (0.0009) [2023-10-12 19:36:09,725][62634] Updated weights for policy 0, policy_version 91090 (0.0008) [2023-10-12 19:36:10,105][62634] Updated weights for policy 0, policy_version 91100 (0.0008) [2023-10-12 19:36:11,006][62635] Updated weights for policy 1, policy_version 91140 (0.0009) [2023-10-12 19:36:11,376][62635] Updated weights for policy 1, policy_version 91150 (0.0009) [2023-10-12 19:36:11,736][62635] Updated weights for policy 1, policy_version 91160 (0.0009) [2023-10-12 19:36:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186646528. Throughput: 0: 1690.1, 1: 1664.8. Samples: 46669496. Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:13,435][61643] Avg episode reward: [(0, '24.580'), (1, '9.490')] [2023-10-12 19:36:14,068][62634] Updated weights for policy 0, policy_version 91110 (0.0009) [2023-10-12 19:36:14,452][62634] Updated weights for policy 0, policy_version 91120 (0.0009) [2023-10-12 19:36:14,819][62634] Updated weights for policy 0, policy_version 91130 (0.0008) [2023-10-12 19:36:15,907][62635] Updated weights for policy 1, policy_version 91170 (0.0009) [2023-10-12 19:36:16,278][62635] Updated weights for policy 1, policy_version 91180 (0.0008) [2023-10-12 19:36:16,649][62635] Updated weights for policy 1, policy_version 91190 (0.0009) [2023-10-12 19:36:17,018][62635] Updated weights for policy 1, policy_version 91200 (0.0007) [2023-10-12 19:36:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186712064. Throughput: 0: 1685.8, 1: 1684.6. Samples: 46690244. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:18,436][61643] Avg episode reward: [(0, '24.810'), (1, '9.500')] [2023-10-12 19:36:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000091136_93323264.pth... [2023-10-12 19:36:18,446][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000091200_93388800.pth... [2023-10-12 19:36:18,477][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000089568_91717632.pth [2023-10-12 19:36:18,482][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000089632_91783168.pth [2023-10-12 19:36:18,827][62634] Updated weights for policy 0, policy_version 91140 (0.0009) [2023-10-12 19:36:19,202][62634] Updated weights for policy 0, policy_version 91150 (0.0010) [2023-10-12 19:36:19,585][62634] Updated weights for policy 0, policy_version 91160 (0.0010) [2023-10-12 19:36:21,149][62635] Updated weights for policy 1, policy_version 91210 (0.0008) [2023-10-12 19:36:21,515][62635] Updated weights for policy 1, policy_version 91220 (0.0010) [2023-10-12 19:36:21,876][62635] Updated weights for policy 1, policy_version 91230 (0.0011) [2023-10-12 19:36:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186777600. Throughput: 0: 1684.4, 1: 1682.1. Samples: 46700148. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:23,436][61643] Avg episode reward: [(0, '24.760'), (1, '9.740')] [2023-10-12 19:36:23,473][62634] Updated weights for policy 0, policy_version 91170 (0.0010) [2023-10-12 19:36:23,851][62634] Updated weights for policy 0, policy_version 91180 (0.0009) [2023-10-12 19:36:24,229][62634] Updated weights for policy 0, policy_version 91190 (0.0009) [2023-10-12 19:36:24,603][62634] Updated weights for policy 0, policy_version 91200 (0.0008) [2023-10-12 19:36:26,166][62635] Updated weights for policy 1, policy_version 91240 (0.0008) [2023-10-12 19:36:26,530][62635] Updated weights for policy 1, policy_version 91250 (0.0008) [2023-10-12 19:36:26,894][62635] Updated weights for policy 1, policy_version 91260 (0.0008) [2023-10-12 19:36:28,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186843136. Throughput: 0: 1689.8, 1: 1666.5. Samples: 46719846. Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:28,435][61643] Avg episode reward: [(0, '24.700'), (1, '9.550')] [2023-10-12 19:36:28,660][62634] Updated weights for policy 0, policy_version 91210 (0.0009) [2023-10-12 19:36:29,029][62634] Updated weights for policy 0, policy_version 91220 (0.0008) [2023-10-12 19:36:29,405][62634] Updated weights for policy 0, policy_version 91230 (0.0008) [2023-10-12 19:36:30,959][62635] Updated weights for policy 1, policy_version 91270 (0.0008) [2023-10-12 19:36:31,333][62635] Updated weights for policy 1, policy_version 91280 (0.0007) [2023-10-12 19:36:31,693][62635] Updated weights for policy 1, policy_version 91290 (0.0008) [2023-10-12 19:36:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186908672. Throughput: 0: 1687.2, 1: 1684.9. Samples: 46740514. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:33,435][61643] Avg episode reward: [(0, '24.770'), (1, '9.730')] [2023-10-12 19:36:33,540][62634] Updated weights for policy 0, policy_version 91240 (0.0008) [2023-10-12 19:36:33,911][62634] Updated weights for policy 0, policy_version 91250 (0.0010) [2023-10-12 19:36:34,296][62634] Updated weights for policy 0, policy_version 91260 (0.0008) [2023-10-12 19:36:35,652][62635] Updated weights for policy 1, policy_version 91300 (0.0008) [2023-10-12 19:36:36,018][62635] Updated weights for policy 1, policy_version 91310 (0.0007) [2023-10-12 19:36:36,382][62635] Updated weights for policy 1, policy_version 91320 (0.0008) [2023-10-12 19:36:38,311][62634] Updated weights for policy 0, policy_version 91270 (0.0008) [2023-10-12 19:36:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186974208. Throughput: 0: 1691.2, 1: 1675.7. Samples: 46750440. Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:38,435][61643] Avg episode reward: [(0, '24.840'), (1, '9.650')] [2023-10-12 19:36:38,694][62634] Updated weights for policy 0, policy_version 91280 (0.0007) [2023-10-12 19:36:39,070][62634] Updated weights for policy 0, policy_version 91290 (0.0009) [2023-10-12 19:36:40,500][62635] Updated weights for policy 1, policy_version 91330 (0.0007) [2023-10-12 19:36:40,864][62635] Updated weights for policy 1, policy_version 91340 (0.0007) [2023-10-12 19:36:41,228][62635] Updated weights for policy 1, policy_version 91350 (0.0008) [2023-10-12 19:36:41,596][62635] Updated weights for policy 1, policy_version 91360 (0.0007) [2023-10-12 19:36:43,201][62634] Updated weights for policy 0, policy_version 91300 (0.0009) [2023-10-12 19:36:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187039744. Throughput: 0: 1689.7, 1: 1676.3. Samples: 46770462. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:43,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.740')] [2023-10-12 19:36:43,587][62634] Updated weights for policy 0, policy_version 91310 (0.0009) [2023-10-12 19:36:43,962][62634] Updated weights for policy 0, policy_version 91320 (0.0007) [2023-10-12 19:36:45,455][62635] Updated weights for policy 1, policy_version 91370 (0.0009) [2023-10-12 19:36:45,825][62635] Updated weights for policy 1, policy_version 91380 (0.0007) [2023-10-12 19:36:46,197][62635] Updated weights for policy 1, policy_version 91390 (0.0008) [2023-10-12 19:36:48,072][62634] Updated weights for policy 0, policy_version 91330 (0.0008) [2023-10-12 19:36:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187105280. Throughput: 0: 1691.7, 1: 1684.0. Samples: 46791176. Policy #0 lag: (min: 29.0, avg: 29.0, max: 32.0) [2023-10-12 19:36:48,436][61643] Avg episode reward: [(0, '24.780'), (1, '10.010')] [2023-10-12 19:36:48,448][62634] Updated weights for policy 0, policy_version 91340 (0.0007) [2023-10-12 19:36:48,834][62634] Updated weights for policy 0, policy_version 91350 (0.0007) [2023-10-12 19:36:49,205][62634] Updated weights for policy 0, policy_version 91360 (0.0007) [2023-10-12 19:36:50,459][62635] Updated weights for policy 1, policy_version 91400 (0.0007) [2023-10-12 19:36:50,822][62635] Updated weights for policy 1, policy_version 91410 (0.0007) [2023-10-12 19:36:51,189][62635] Updated weights for policy 1, policy_version 91420 (0.0010) [2023-10-12 19:36:53,135][62634] Updated weights for policy 0, policy_version 91370 (0.0007) [2023-10-12 19:36:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187170816. Throughput: 0: 1695.7, 1: 1664.3. Samples: 46800730. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:36:53,435][61643] Avg episode reward: [(0, '24.860'), (1, '9.950')] [2023-10-12 19:36:53,506][62634] Updated weights for policy 0, policy_version 91380 (0.0008) [2023-10-12 19:36:53,892][62634] Updated weights for policy 0, policy_version 91390 (0.0009) [2023-10-12 19:36:55,186][62635] Updated weights for policy 1, policy_version 91430 (0.0009) [2023-10-12 19:36:55,549][62635] Updated weights for policy 1, policy_version 91440 (0.0009) [2023-10-12 19:36:55,916][62635] Updated weights for policy 1, policy_version 91450 (0.0009) [2023-10-12 19:36:57,951][62634] Updated weights for policy 0, policy_version 91400 (0.0011) [2023-10-12 19:36:58,325][62634] Updated weights for policy 0, policy_version 91410 (0.0011) [2023-10-12 19:36:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187236352. Throughput: 0: 1687.2, 1: 1675.5. Samples: 46820820. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:36:58,435][61643] Avg episode reward: [(0, '25.030'), (1, '9.670')] [2023-10-12 19:36:58,708][62634] Updated weights for policy 0, policy_version 91420 (0.0010) [2023-10-12 19:36:59,856][62635] Updated weights for policy 1, policy_version 91460 (0.0009) [2023-10-12 19:37:00,212][62635] Updated weights for policy 1, policy_version 91470 (0.0009) [2023-10-12 19:37:00,572][62635] Updated weights for policy 1, policy_version 91480 (0.0010) [2023-10-12 19:37:02,923][62634] Updated weights for policy 0, policy_version 91430 (0.0008) [2023-10-12 19:37:03,311][62634] Updated weights for policy 0, policy_version 91440 (0.0007) [2023-10-12 19:37:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187301888. Throughput: 0: 1673.3, 1: 1681.0. Samples: 46841190. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:03,435][61643] Avg episode reward: [(0, '24.900'), (1, '9.850')] [2023-10-12 19:37:03,686][62634] Updated weights for policy 0, policy_version 91450 (0.0009) [2023-10-12 19:37:04,692][62635] Updated weights for policy 1, policy_version 91490 (0.0007) [2023-10-12 19:37:05,054][62635] Updated weights for policy 1, policy_version 91500 (0.0007) [2023-10-12 19:37:05,417][62635] Updated weights for policy 1, policy_version 91510 (0.0008) [2023-10-12 19:37:05,785][62635] Updated weights for policy 1, policy_version 91520 (0.0010) [2023-10-12 19:37:07,659][62634] Updated weights for policy 0, policy_version 91460 (0.0010) [2023-10-12 19:37:08,036][62634] Updated weights for policy 0, policy_version 91470 (0.0009) [2023-10-12 19:37:08,411][62634] Updated weights for policy 0, policy_version 91480 (0.0007) [2023-10-12 19:37:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187367424. Throughput: 0: 1686.0, 1: 1658.1. Samples: 46850630. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:08,435][61643] Avg episode reward: [(0, '24.800'), (1, '9.760')] [2023-10-12 19:37:09,875][62635] Updated weights for policy 1, policy_version 91530 (0.0009) [2023-10-12 19:37:10,246][62635] Updated weights for policy 1, policy_version 91540 (0.0008) [2023-10-12 19:37:10,608][62635] Updated weights for policy 1, policy_version 91550 (0.0007) [2023-10-12 19:37:12,491][62634] Updated weights for policy 0, policy_version 91490 (0.0009) [2023-10-12 19:37:12,855][62634] Updated weights for policy 0, policy_version 91500 (0.0007) [2023-10-12 19:37:13,231][62634] Updated weights for policy 0, policy_version 91510 (0.0009) [2023-10-12 19:37:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187432960. Throughput: 0: 1680.3, 1: 1683.8. Samples: 46871230. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:13,435][61643] Avg episode reward: [(0, '25.080'), (1, '9.510')] [2023-10-12 19:37:13,609][62634] Updated weights for policy 0, policy_version 91520 (0.0009) [2023-10-12 19:37:14,823][62635] Updated weights for policy 1, policy_version 91560 (0.0008) [2023-10-12 19:37:15,195][62635] Updated weights for policy 1, policy_version 91570 (0.0008) [2023-10-12 19:37:15,560][62635] Updated weights for policy 1, policy_version 91580 (0.0007) [2023-10-12 19:37:17,721][62634] Updated weights for policy 0, policy_version 91530 (0.0010) [2023-10-12 19:37:18,098][62634] Updated weights for policy 0, policy_version 91540 (0.0009) [2023-10-12 19:37:18,435][61643] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 187498496. Throughput: 0: 1663.7, 1: 1682.7. Samples: 46891102. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:18,436][61643] Avg episode reward: [(0, '25.400'), (1, '9.600')] [2023-10-12 19:37:18,477][62634] Updated weights for policy 0, policy_version 91550 (0.0009) [2023-10-12 19:37:19,688][62635] Updated weights for policy 1, policy_version 91590 (0.0009) [2023-10-12 19:37:20,048][62635] Updated weights for policy 1, policy_version 91600 (0.0007) [2023-10-12 19:37:20,413][62635] Updated weights for policy 1, policy_version 91610 (0.0007) [2023-10-12 19:37:22,555][62634] Updated weights for policy 0, policy_version 91560 (0.0008) [2023-10-12 19:37:22,922][62634] Updated weights for policy 0, policy_version 91570 (0.0010) [2023-10-12 19:37:23,302][62634] Updated weights for policy 0, policy_version 91580 (0.0009) [2023-10-12 19:37:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187564032. Throughput: 0: 1678.2, 1: 1663.2. Samples: 46900800. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:23,436][61643] Avg episode reward: [(0, '25.390'), (1, '9.500')] [2023-10-12 19:37:24,276][62635] Updated weights for policy 1, policy_version 91620 (0.0007) [2023-10-12 19:37:24,649][62635] Updated weights for policy 1, policy_version 91630 (0.0009) [2023-10-12 19:37:25,018][62635] Updated weights for policy 1, policy_version 91640 (0.0007) [2023-10-12 19:37:27,490][62634] Updated weights for policy 0, policy_version 91590 (0.0010) [2023-10-12 19:37:27,869][62634] Updated weights for policy 0, policy_version 91600 (0.0009) [2023-10-12 19:37:28,241][62634] Updated weights for policy 0, policy_version 91610 (0.0007) [2023-10-12 19:37:28,435][61643] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187629568. Throughput: 0: 1676.9, 1: 1683.1. Samples: 46921662. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:28,435][61643] Avg episode reward: [(0, '25.160'), (1, '9.600')] [2023-10-12 19:37:29,075][62635] Updated weights for policy 1, policy_version 91650 (0.0007) [2023-10-12 19:37:29,439][62635] Updated weights for policy 1, policy_version 91660 (0.0008) [2023-10-12 19:37:29,808][62635] Updated weights for policy 1, policy_version 91670 (0.0007) [2023-10-12 19:37:30,172][62635] Updated weights for policy 1, policy_version 91680 (0.0008) [2023-10-12 19:37:32,206][62634] Updated weights for policy 0, policy_version 91620 (0.0008) [2023-10-12 19:37:32,593][62634] Updated weights for policy 0, policy_version 91630 (0.0007) [2023-10-12 19:37:32,962][62634] Updated weights for policy 0, policy_version 91640 (0.0008) [2023-10-12 19:37:33,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187727872. Throughput: 0: 1650.5, 1: 1686.6. Samples: 46941348. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:33,435][61643] Avg episode reward: [(0, '25.220'), (1, '9.780')] [2023-10-12 19:37:34,355][62635] Updated weights for policy 1, policy_version 91690 (0.0007) [2023-10-12 19:37:34,723][62635] Updated weights for policy 1, policy_version 91700 (0.0007) [2023-10-12 19:37:35,088][62635] Updated weights for policy 1, policy_version 91710 (0.0009) [2023-10-12 19:37:36,978][62634] Updated weights for policy 0, policy_version 91650 (0.0007) [2023-10-12 19:37:37,353][62634] Updated weights for policy 0, policy_version 91660 (0.0010) [2023-10-12 19:37:37,730][62634] Updated weights for policy 0, policy_version 91670 (0.0010) [2023-10-12 19:37:38,098][62634] Updated weights for policy 0, policy_version 91680 (0.0010) [2023-10-12 19:37:38,435][61643] Fps is (10 sec: 16382.9, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 187793408. Throughput: 0: 1668.9, 1: 1674.9. Samples: 46951206. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:38,436][61643] Avg episode reward: [(0, '25.230'), (1, '9.650')] [2023-10-12 19:37:39,213][62635] Updated weights for policy 1, policy_version 91720 (0.0008) [2023-10-12 19:37:39,580][62635] Updated weights for policy 1, policy_version 91730 (0.0008) [2023-10-12 19:37:39,940][62635] Updated weights for policy 1, policy_version 91740 (0.0008) [2023-10-12 19:37:42,088][62634] Updated weights for policy 0, policy_version 91690 (0.0010) [2023-10-12 19:37:42,474][62634] Updated weights for policy 0, policy_version 91700 (0.0009) [2023-10-12 19:37:42,848][62634] Updated weights for policy 0, policy_version 91710 (0.0007) [2023-10-12 19:37:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 187858944. Throughput: 0: 1664.4, 1: 1694.3. Samples: 46971958. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:37:43,435][61643] Avg episode reward: [(0, '25.310'), (1, '9.620')] [2023-10-12 19:37:44,067][62635] Updated weights for policy 1, policy_version 91750 (0.0010) [2023-10-12 19:37:44,434][62635] Updated weights for policy 1, policy_version 91760 (0.0008) [2023-10-12 19:37:44,810][62635] Updated weights for policy 1, policy_version 91770 (0.0007) [2023-10-12 19:37:47,016][62634] Updated weights for policy 0, policy_version 91720 (0.0008) [2023-10-12 19:37:47,385][62634] Updated weights for policy 0, policy_version 91730 (0.0008) [2023-10-12 19:37:47,761][62634] Updated weights for policy 0, policy_version 91740 (0.0009) [2023-10-12 19:37:48,435][61643] Fps is (10 sec: 13108.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 187924480. Throughput: 0: 1651.8, 1: 1689.9. Samples: 46991568. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:37:48,435][61643] Avg episode reward: [(0, '25.380'), (1, '9.810')] [2023-10-12 19:37:48,706][62635] Updated weights for policy 1, policy_version 91780 (0.0008) [2023-10-12 19:37:49,075][62635] Updated weights for policy 1, policy_version 91790 (0.0009) [2023-10-12 19:37:49,442][62635] Updated weights for policy 1, policy_version 91800 (0.0008) [2023-10-12 19:37:51,771][62634] Updated weights for policy 0, policy_version 91750 (0.0011) [2023-10-12 19:37:52,142][62634] Updated weights for policy 0, policy_version 91760 (0.0010) [2023-10-12 19:37:52,528][62634] Updated weights for policy 0, policy_version 91770 (0.0009) [2023-10-12 19:37:53,356][62635] Updated weights for policy 1, policy_version 91810 (0.0009) [2023-10-12 19:37:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187990016. Throughput: 0: 1670.1, 1: 1692.0. Samples: 47001926. 
Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:37:53,435][61643] Avg episode reward: [(0, '25.370'), (1, '9.940')] [2023-10-12 19:37:53,718][62635] Updated weights for policy 1, policy_version 91820 (0.0011) [2023-10-12 19:37:54,079][62635] Updated weights for policy 1, policy_version 91830 (0.0008) [2023-10-12 19:37:54,448][62635] Updated weights for policy 1, policy_version 91840 (0.0008) [2023-10-12 19:37:56,649][62634] Updated weights for policy 0, policy_version 91780 (0.0008) [2023-10-12 19:37:57,028][62634] Updated weights for policy 0, policy_version 91790 (0.0008) [2023-10-12 19:37:57,401][62634] Updated weights for policy 0, policy_version 91800 (0.0008) [2023-10-12 19:37:58,431][62635] Updated weights for policy 1, policy_version 91850 (0.0007) [2023-10-12 19:37:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188055552. Throughput: 0: 1659.9, 1: 1699.0. Samples: 47022378. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:37:58,435][61643] Avg episode reward: [(0, '25.370'), (1, '9.570')] [2023-10-12 19:37:58,792][62635] Updated weights for policy 1, policy_version 91860 (0.0010) [2023-10-12 19:37:59,164][62635] Updated weights for policy 1, policy_version 91870 (0.0011) [2023-10-12 19:38:01,413][62634] Updated weights for policy 0, policy_version 91810 (0.0007) [2023-10-12 19:38:01,783][62634] Updated weights for policy 0, policy_version 91820 (0.0010) [2023-10-12 19:38:02,161][62634] Updated weights for policy 0, policy_version 91830 (0.0010) [2023-10-12 19:38:02,536][62634] Updated weights for policy 0, policy_version 91840 (0.0011) [2023-10-12 19:38:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 188121088. Throughput: 0: 1660.0, 1: 1700.7. Samples: 47042332. 
Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:03,435][61643] Avg episode reward: [(0, '25.340'), (1, '9.580')] [2023-10-12 19:38:03,497][62635] Updated weights for policy 1, policy_version 91880 (0.0009) [2023-10-12 19:38:03,879][62635] Updated weights for policy 1, policy_version 91890 (0.0010) [2023-10-12 19:38:04,238][62635] Updated weights for policy 1, policy_version 91900 (0.0009) [2023-10-12 19:38:06,533][62634] Updated weights for policy 0, policy_version 91850 (0.0009) [2023-10-12 19:38:06,913][62634] Updated weights for policy 0, policy_version 91860 (0.0008) [2023-10-12 19:38:07,282][62634] Updated weights for policy 0, policy_version 91870 (0.0009) [2023-10-12 19:38:08,307][62635] Updated weights for policy 1, policy_version 91910 (0.0010) [2023-10-12 19:38:08,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188186624. Throughput: 0: 1674.8, 1: 1694.1. Samples: 47052400. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:08,436][61643] Avg episode reward: [(0, '25.380'), (1, '9.730')] [2023-10-12 19:38:08,665][62635] Updated weights for policy 1, policy_version 91920 (0.0008) [2023-10-12 19:38:09,036][62635] Updated weights for policy 1, policy_version 91930 (0.0009) [2023-10-12 19:38:11,459][62634] Updated weights for policy 0, policy_version 91880 (0.0007) [2023-10-12 19:38:11,841][62634] Updated weights for policy 0, policy_version 91890 (0.0008) [2023-10-12 19:38:12,211][62634] Updated weights for policy 0, policy_version 91900 (0.0009) [2023-10-12 19:38:13,001][62635] Updated weights for policy 1, policy_version 91940 (0.0008) [2023-10-12 19:38:13,366][62635] Updated weights for policy 1, policy_version 91950 (0.0007) [2023-10-12 19:38:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188252160. Throughput: 0: 1656.8, 1: 1694.3. Samples: 47072462. 
Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:13,435][61643] Avg episode reward: [(0, '25.420'), (1, '9.650')] [2023-10-12 19:38:13,725][62635] Updated weights for policy 1, policy_version 91960 (0.0011) [2023-10-12 19:38:16,349][62634] Updated weights for policy 0, policy_version 91910 (0.0010) [2023-10-12 19:38:16,733][62634] Updated weights for policy 0, policy_version 91920 (0.0008) [2023-10-12 19:38:17,111][62634] Updated weights for policy 0, policy_version 91930 (0.0011) [2023-10-12 19:38:17,753][62635] Updated weights for policy 1, policy_version 91970 (0.0010) [2023-10-12 19:38:18,133][62635] Updated weights for policy 1, policy_version 91980 (0.0010) [2023-10-12 19:38:18,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 188317696. Throughput: 0: 1669.4, 1: 1683.2. Samples: 47092212. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:18,435][61643] Avg episode reward: [(0, '25.380'), (1, '9.650')] [2023-10-12 19:38:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000091936_94142464.pth... [2023-10-12 19:38:18,477][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000090368_92536832.pth [2023-10-12 19:38:18,499][62635] Updated weights for policy 1, policy_version 91990 (0.0010) [2023-10-12 19:38:18,862][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000092000_94208000.pth... 
[2023-10-12 19:38:18,862][62635] Updated weights for policy 1, policy_version 92000 (0.0010) [2023-10-12 19:38:18,901][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000090400_92569600.pth [2023-10-12 19:38:21,071][62634] Updated weights for policy 0, policy_version 91940 (0.0008) [2023-10-12 19:38:21,460][62634] Updated weights for policy 0, policy_version 91950 (0.0009) [2023-10-12 19:38:21,825][62634] Updated weights for policy 0, policy_version 91960 (0.0011) [2023-10-12 19:38:22,968][62635] Updated weights for policy 1, policy_version 92010 (0.0008) [2023-10-12 19:38:23,324][62635] Updated weights for policy 1, policy_version 92020 (0.0008) [2023-10-12 19:38:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188383232. Throughput: 0: 1679.5, 1: 1692.7. Samples: 47102954. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:23,436][61643] Avg episode reward: [(0, '25.360'), (1, '9.680')] [2023-10-12 19:38:23,693][62635] Updated weights for policy 1, policy_version 92030 (0.0008) [2023-10-12 19:38:25,885][62634] Updated weights for policy 0, policy_version 91970 (0.0009) [2023-10-12 19:38:26,265][62634] Updated weights for policy 0, policy_version 91980 (0.0011) [2023-10-12 19:38:26,641][62634] Updated weights for policy 0, policy_version 91990 (0.0010) [2023-10-12 19:38:27,021][62634] Updated weights for policy 0, policy_version 92000 (0.0009) [2023-10-12 19:38:27,812][62635] Updated weights for policy 1, policy_version 92040 (0.0008) [2023-10-12 19:38:28,182][62635] Updated weights for policy 1, policy_version 92050 (0.0007) [2023-10-12 19:38:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188448768. Throughput: 0: 1661.7, 1: 1687.1. Samples: 47122656. 
Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:28,436][61643] Avg episode reward: [(0, '25.490'), (1, '9.900')] [2023-10-12 19:38:28,555][62635] Updated weights for policy 1, policy_version 92060 (0.0010) [2023-10-12 19:38:31,184][62634] Updated weights for policy 0, policy_version 92010 (0.0011) [2023-10-12 19:38:31,561][62634] Updated weights for policy 0, policy_version 92020 (0.0008) [2023-10-12 19:38:31,948][62634] Updated weights for policy 0, policy_version 92030 (0.0009) [2023-10-12 19:38:32,592][62635] Updated weights for policy 1, policy_version 92070 (0.0010) [2023-10-12 19:38:32,953][62635] Updated weights for policy 1, policy_version 92080 (0.0010) [2023-10-12 19:38:33,322][62635] Updated weights for policy 1, policy_version 92090 (0.0008) [2023-10-12 19:38:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 188514304. Throughput: 0: 1684.1, 1: 1667.6. Samples: 47142392. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:33,435][61643] Avg episode reward: [(0, '25.450'), (1, '9.900')] [2023-10-12 19:38:36,068][62634] Updated weights for policy 0, policy_version 92040 (0.0010) [2023-10-12 19:38:36,445][62634] Updated weights for policy 0, policy_version 92050 (0.0009) [2023-10-12 19:38:36,817][62634] Updated weights for policy 0, policy_version 92060 (0.0008) [2023-10-12 19:38:37,454][62635] Updated weights for policy 1, policy_version 92100 (0.0009) [2023-10-12 19:38:37,822][62635] Updated weights for policy 1, policy_version 92110 (0.0011) [2023-10-12 19:38:38,191][62635] Updated weights for policy 1, policy_version 92120 (0.0008) [2023-10-12 19:38:38,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 188579840. Throughput: 0: 1678.4, 1: 1682.0. Samples: 47153148. 
Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:38,435][61643] Avg episode reward: [(0, '25.180'), (1, '9.840')] [2023-10-12 19:38:40,835][62634] Updated weights for policy 0, policy_version 92070 (0.0010) [2023-10-12 19:38:41,208][62634] Updated weights for policy 0, policy_version 92080 (0.0007) [2023-10-12 19:38:41,587][62634] Updated weights for policy 0, policy_version 92090 (0.0009) [2023-10-12 19:38:42,297][62635] Updated weights for policy 1, policy_version 92130 (0.0009) [2023-10-12 19:38:42,659][62635] Updated weights for policy 1, policy_version 92140 (0.0007) [2023-10-12 19:38:43,031][62635] Updated weights for policy 1, policy_version 92150 (0.0009) [2023-10-12 19:38:43,389][62635] Updated weights for policy 1, policy_version 92160 (0.0008) [2023-10-12 19:38:43,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 188678144. Throughput: 0: 1663.9, 1: 1680.3. Samples: 47172866. Policy #0 lag: (min: 10.0, avg: 13.3, max: 42.0) [2023-10-12 19:38:43,436][61643] Avg episode reward: [(0, '25.280'), (1, '9.950')] [2023-10-12 19:38:45,568][62634] Updated weights for policy 0, policy_version 92100 (0.0009) [2023-10-12 19:38:45,939][62634] Updated weights for policy 0, policy_version 92110 (0.0008) [2023-10-12 19:38:46,320][62634] Updated weights for policy 0, policy_version 92120 (0.0008) [2023-10-12 19:38:47,389][62635] Updated weights for policy 1, policy_version 92170 (0.0007) [2023-10-12 19:38:47,752][62635] Updated weights for policy 1, policy_version 92180 (0.0008) [2023-10-12 19:38:48,120][62635] Updated weights for policy 1, policy_version 92190 (0.0007) [2023-10-12 19:38:48,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 188743680. Throughput: 0: 1679.8, 1: 1659.2. Samples: 47192588. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:38:48,436][61643] Avg episode reward: [(0, '25.360'), (1, '9.860')] [2023-10-12 19:38:50,388][62634] Updated weights for policy 0, policy_version 92130 (0.0009) [2023-10-12 19:38:50,775][62634] Updated weights for policy 0, policy_version 92140 (0.0007) [2023-10-12 19:38:51,143][62634] Updated weights for policy 0, policy_version 92150 (0.0009) [2023-10-12 19:38:51,522][62634] Updated weights for policy 0, policy_version 92160 (0.0009) [2023-10-12 19:38:52,245][62635] Updated weights for policy 1, policy_version 92200 (0.0008) [2023-10-12 19:38:52,626][62635] Updated weights for policy 1, policy_version 92210 (0.0007) [2023-10-12 19:38:52,980][62635] Updated weights for policy 1, policy_version 92220 (0.0007) [2023-10-12 19:38:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188809216. Throughput: 0: 1665.2, 1: 1688.4. Samples: 47203308. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:38:53,436][61643] Avg episode reward: [(0, '25.200'), (1, '9.940')] [2023-10-12 19:38:55,658][62634] Updated weights for policy 0, policy_version 92170 (0.0009) [2023-10-12 19:38:56,030][62634] Updated weights for policy 0, policy_version 92180 (0.0007) [2023-10-12 19:38:56,416][62634] Updated weights for policy 0, policy_version 92190 (0.0007) [2023-10-12 19:38:57,029][62635] Updated weights for policy 1, policy_version 92230 (0.0008) [2023-10-12 19:38:57,392][62635] Updated weights for policy 1, policy_version 92240 (0.0007) [2023-10-12 19:38:57,759][62635] Updated weights for policy 1, policy_version 92250 (0.0007) [2023-10-12 19:38:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188874752. Throughput: 0: 1671.3, 1: 1671.5. Samples: 47222890. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:38:58,435][61643] Avg episode reward: [(0, '25.250'), (1, '9.940')] [2023-10-12 19:39:00,454][62634] Updated weights for policy 0, policy_version 92200 (0.0008) [2023-10-12 19:39:00,834][62634] Updated weights for policy 0, policy_version 92210 (0.0007) [2023-10-12 19:39:01,217][62634] Updated weights for policy 0, policy_version 92220 (0.0008) [2023-10-12 19:39:01,905][62635] Updated weights for policy 1, policy_version 92260 (0.0008) [2023-10-12 19:39:02,278][62635] Updated weights for policy 1, policy_version 92270 (0.0008) [2023-10-12 19:39:02,644][62635] Updated weights for policy 1, policy_version 92280 (0.0007) [2023-10-12 19:39:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188940288. Throughput: 0: 1677.7, 1: 1663.5. Samples: 47242566. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:03,435][61643] Avg episode reward: [(0, '25.150'), (1, '9.810')] [2023-10-12 19:39:05,202][62634] Updated weights for policy 0, policy_version 92230 (0.0008) [2023-10-12 19:39:05,580][62634] Updated weights for policy 0, policy_version 92240 (0.0007) [2023-10-12 19:39:05,954][62634] Updated weights for policy 0, policy_version 92250 (0.0007) [2023-10-12 19:39:06,753][62635] Updated weights for policy 1, policy_version 92290 (0.0009) [2023-10-12 19:39:07,117][62635] Updated weights for policy 1, policy_version 92300 (0.0008) [2023-10-12 19:39:07,490][62635] Updated weights for policy 1, policy_version 92310 (0.0008) [2023-10-12 19:39:07,855][62635] Updated weights for policy 1, policy_version 92320 (0.0007) [2023-10-12 19:39:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 189005824. Throughput: 0: 1655.9, 1: 1684.8. Samples: 47253284. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:08,435][61643] Avg episode reward: [(0, '25.020'), (1, '9.620')] [2023-10-12 19:39:09,999][62634] Updated weights for policy 0, policy_version 92260 (0.0007) [2023-10-12 19:39:10,367][62634] Updated weights for policy 0, policy_version 92270 (0.0007) [2023-10-12 19:39:10,749][62634] Updated weights for policy 0, policy_version 92280 (0.0007) [2023-10-12 19:39:11,769][62635] Updated weights for policy 1, policy_version 92330 (0.0011) [2023-10-12 19:39:12,147][62635] Updated weights for policy 1, policy_version 92340 (0.0009) [2023-10-12 19:39:12,519][62635] Updated weights for policy 1, policy_version 92350 (0.0011) [2023-10-12 19:39:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189071360. Throughput: 0: 1669.0, 1: 1672.1. Samples: 47273004. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:13,435][61643] Avg episode reward: [(0, '24.990'), (1, '9.560')] [2023-10-12 19:39:14,885][62634] Updated weights for policy 0, policy_version 92290 (0.0009) [2023-10-12 19:39:15,258][62634] Updated weights for policy 0, policy_version 92300 (0.0007) [2023-10-12 19:39:15,642][62634] Updated weights for policy 0, policy_version 92310 (0.0008) [2023-10-12 19:39:16,012][62634] Updated weights for policy 0, policy_version 92320 (0.0009) [2023-10-12 19:39:16,835][62635] Updated weights for policy 1, policy_version 92360 (0.0009) [2023-10-12 19:39:17,206][62635] Updated weights for policy 1, policy_version 92370 (0.0009) [2023-10-12 19:39:17,571][62635] Updated weights for policy 1, policy_version 92380 (0.0008) [2023-10-12 19:39:18,435][61643] Fps is (10 sec: 13106.6, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 189136896. Throughput: 0: 1671.9, 1: 1671.9. Samples: 47292868. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:18,436][61643] Avg episode reward: [(0, '25.020'), (1, '9.530')] [2023-10-12 19:39:20,168][62634] Updated weights for policy 0, policy_version 92330 (0.0011) [2023-10-12 19:39:20,551][62634] Updated weights for policy 0, policy_version 92340 (0.0011) [2023-10-12 19:39:20,920][62634] Updated weights for policy 0, policy_version 92350 (0.0011) [2023-10-12 19:39:21,682][62635] Updated weights for policy 1, policy_version 92390 (0.0010) [2023-10-12 19:39:22,056][62635] Updated weights for policy 1, policy_version 92400 (0.0010) [2023-10-12 19:39:22,432][62635] Updated weights for policy 1, policy_version 92410 (0.0008) [2023-10-12 19:39:23,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189202432. Throughput: 0: 1649.7, 1: 1685.5. Samples: 47303232. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:23,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.560')] [2023-10-12 19:39:25,025][62634] Updated weights for policy 0, policy_version 92360 (0.0008) [2023-10-12 19:39:25,409][62634] Updated weights for policy 0, policy_version 92370 (0.0008) [2023-10-12 19:39:25,777][62634] Updated weights for policy 0, policy_version 92380 (0.0008) [2023-10-12 19:39:26,434][62635] Updated weights for policy 1, policy_version 92420 (0.0010) [2023-10-12 19:39:26,804][62635] Updated weights for policy 1, policy_version 92430 (0.0008) [2023-10-12 19:39:27,167][62635] Updated weights for policy 1, policy_version 92440 (0.0009) [2023-10-12 19:39:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189267968. Throughput: 0: 1672.6, 1: 1665.3. Samples: 47323074. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:28,436][61643] Avg episode reward: [(0, '24.720'), (1, '9.490')] [2023-10-12 19:39:29,925][62634] Updated weights for policy 0, policy_version 92390 (0.0010) [2023-10-12 19:39:30,304][62634] Updated weights for policy 0, policy_version 92400 (0.0010) [2023-10-12 19:39:30,680][62634] Updated weights for policy 0, policy_version 92410 (0.0010) [2023-10-12 19:39:31,029][62635] Updated weights for policy 1, policy_version 92450 (0.0009) [2023-10-12 19:39:31,405][62635] Updated weights for policy 1, policy_version 92460 (0.0010) [2023-10-12 19:39:31,774][62635] Updated weights for policy 1, policy_version 92470 (0.0009) [2023-10-12 19:39:32,142][62635] Updated weights for policy 1, policy_version 92480 (0.0010) [2023-10-12 19:39:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189333504. Throughput: 0: 1672.0, 1: 1675.3. Samples: 47343216. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:33,436][61643] Avg episode reward: [(0, '24.770'), (1, '9.530')] [2023-10-12 19:39:34,630][62634] Updated weights for policy 0, policy_version 92420 (0.0009) [2023-10-12 19:39:35,005][62634] Updated weights for policy 0, policy_version 92430 (0.0008) [2023-10-12 19:39:35,377][62634] Updated weights for policy 0, policy_version 92440 (0.0007) [2023-10-12 19:39:36,175][62635] Updated weights for policy 1, policy_version 92490 (0.0009) [2023-10-12 19:39:36,543][62635] Updated weights for policy 1, policy_version 92500 (0.0009) [2023-10-12 19:39:36,915][62635] Updated weights for policy 1, policy_version 92510 (0.0008) [2023-10-12 19:39:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189399040. Throughput: 0: 1657.3, 1: 1679.5. Samples: 47353466. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:38,436][61643] Avg episode reward: [(0, '24.660'), (1, '9.840')] [2023-10-12 19:39:39,246][62634] Updated weights for policy 0, policy_version 92450 (0.0007) [2023-10-12 19:39:39,635][62634] Updated weights for policy 0, policy_version 92460 (0.0009) [2023-10-12 19:39:40,002][62634] Updated weights for policy 0, policy_version 92470 (0.0007) [2023-10-12 19:39:40,375][62634] Updated weights for policy 0, policy_version 92480 (0.0007) [2023-10-12 19:39:41,156][62635] Updated weights for policy 1, policy_version 92520 (0.0009) [2023-10-12 19:39:41,532][62635] Updated weights for policy 1, policy_version 92530 (0.0009) [2023-10-12 19:39:41,887][62635] Updated weights for policy 1, policy_version 92540 (0.0009) [2023-10-12 19:39:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189464576. Throughput: 0: 1674.4, 1: 1666.1. Samples: 47373210. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:39:43,435][61643] Avg episode reward: [(0, '24.570'), (1, '9.820')] [2023-10-12 19:39:44,543][62634] Updated weights for policy 0, policy_version 92490 (0.0007) [2023-10-12 19:39:44,922][62634] Updated weights for policy 0, policy_version 92500 (0.0009) [2023-10-12 19:39:45,300][62634] Updated weights for policy 0, policy_version 92510 (0.0010) [2023-10-12 19:39:45,766][62635] Updated weights for policy 1, policy_version 92550 (0.0008) [2023-10-12 19:39:46,135][62635] Updated weights for policy 1, policy_version 92560 (0.0008) [2023-10-12 19:39:46,493][62635] Updated weights for policy 1, policy_version 92570 (0.0008) [2023-10-12 19:39:48,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189530112. Throughput: 0: 1680.6, 1: 1684.7. Samples: 47394004. 
Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:39:48,435][61643] Avg episode reward: [(0, '24.690'), (1, '9.780')] [2023-10-12 19:39:49,541][62634] Updated weights for policy 0, policy_version 92520 (0.0008) [2023-10-12 19:39:49,918][62634] Updated weights for policy 0, policy_version 92530 (0.0008) [2023-10-12 19:39:50,298][62634] Updated weights for policy 0, policy_version 92540 (0.0009) [2023-10-12 19:39:50,587][62635] Updated weights for policy 1, policy_version 92580 (0.0007) [2023-10-12 19:39:50,956][62635] Updated weights for policy 1, policy_version 92590 (0.0007) [2023-10-12 19:39:51,332][62635] Updated weights for policy 1, policy_version 92600 (0.0008) [2023-10-12 19:39:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189595648. Throughput: 0: 1667.2, 1: 1672.0. Samples: 47403544. Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:39:53,435][61643] Avg episode reward: [(0, '24.810'), (1, '9.620')] [2023-10-12 19:39:54,311][62634] Updated weights for policy 0, policy_version 92550 (0.0008) [2023-10-12 19:39:54,684][62634] Updated weights for policy 0, policy_version 92560 (0.0007) [2023-10-12 19:39:55,065][62634] Updated weights for policy 0, policy_version 92570 (0.0008) [2023-10-12 19:39:55,294][62635] Updated weights for policy 1, policy_version 92610 (0.0008) [2023-10-12 19:39:55,663][62635] Updated weights for policy 1, policy_version 92620 (0.0010) [2023-10-12 19:39:56,018][62635] Updated weights for policy 1, policy_version 92630 (0.0009) [2023-10-12 19:39:56,389][62635] Updated weights for policy 1, policy_version 92640 (0.0008) [2023-10-12 19:39:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189661184. Throughput: 0: 1681.1, 1: 1670.9. Samples: 47423844. 
Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:39:58,436][61643] Avg episode reward: [(0, '24.760'), (1, '9.430')] [2023-10-12 19:39:59,102][62634] Updated weights for policy 0, policy_version 92580 (0.0008) [2023-10-12 19:39:59,500][62634] Updated weights for policy 0, policy_version 92590 (0.0010) [2023-10-12 19:39:59,876][62634] Updated weights for policy 0, policy_version 92600 (0.0010) [2023-10-12 19:40:00,656][62635] Updated weights for policy 1, policy_version 92650 (0.0009) [2023-10-12 19:40:01,022][62635] Updated weights for policy 1, policy_version 92660 (0.0008) [2023-10-12 19:40:01,387][62635] Updated weights for policy 1, policy_version 92670 (0.0009) [2023-10-12 19:40:03,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189726720. Throughput: 0: 1681.7, 1: 1691.4. Samples: 47444660. Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:03,436][61643] Avg episode reward: [(0, '24.800'), (1, '9.830')] [2023-10-12 19:40:03,888][62634] Updated weights for policy 0, policy_version 92610 (0.0010) [2023-10-12 19:40:04,257][62634] Updated weights for policy 0, policy_version 92620 (0.0008) [2023-10-12 19:40:04,640][62634] Updated weights for policy 0, policy_version 92630 (0.0008) [2023-10-12 19:40:05,010][62634] Updated weights for policy 0, policy_version 92640 (0.0009) [2023-10-12 19:40:05,292][62635] Updated weights for policy 1, policy_version 92680 (0.0007) [2023-10-12 19:40:05,662][62635] Updated weights for policy 1, policy_version 92690 (0.0008) [2023-10-12 19:40:06,030][62635] Updated weights for policy 1, policy_version 92700 (0.0008) [2023-10-12 19:40:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189792256. Throughput: 0: 1683.2, 1: 1666.6. Samples: 47453972. 
Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:08,435][61643] Avg episode reward: [(0, '24.550'), (1, '9.990')] [2023-10-12 19:40:09,009][62634] Updated weights for policy 0, policy_version 92650 (0.0010) [2023-10-12 19:40:09,396][62634] Updated weights for policy 0, policy_version 92660 (0.0008) [2023-10-12 19:40:09,769][62634] Updated weights for policy 0, policy_version 92670 (0.0008) [2023-10-12 19:40:10,037][62635] Updated weights for policy 1, policy_version 92710 (0.0009) [2023-10-12 19:40:10,402][62635] Updated weights for policy 1, policy_version 92720 (0.0007) [2023-10-12 19:40:10,774][62635] Updated weights for policy 1, policy_version 92730 (0.0008) [2023-10-12 19:40:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189857792. Throughput: 0: 1688.9, 1: 1679.2. Samples: 47474638. Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:13,435][61643] Avg episode reward: [(0, '24.440'), (1, '9.850')] [2023-10-12 19:40:13,902][62634] Updated weights for policy 0, policy_version 92680 (0.0010) [2023-10-12 19:40:14,269][62634] Updated weights for policy 0, policy_version 92690 (0.0009) [2023-10-12 19:40:14,657][62634] Updated weights for policy 0, policy_version 92700 (0.0009) [2023-10-12 19:40:14,756][62635] Updated weights for policy 1, policy_version 92740 (0.0007) [2023-10-12 19:40:15,111][62635] Updated weights for policy 1, policy_version 92750 (0.0009) [2023-10-12 19:40:15,478][62635] Updated weights for policy 1, policy_version 92760 (0.0009) [2023-10-12 19:40:18,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189923328. Throughput: 0: 1689.3, 1: 1689.0. Samples: 47495242. Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:18,435][61643] Avg episode reward: [(0, '24.560'), (1, '9.760')] [2023-10-12 19:40:18,443][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000092768_94994432.pth... 
[2023-10-12 19:40:18,472][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000091200_93388800.pth [2023-10-12 19:40:18,677][62634] Updated weights for policy 0, policy_version 92710 (0.0008) [2023-10-12 19:40:19,061][62634] Updated weights for policy 0, policy_version 92720 (0.0007) [2023-10-12 19:40:19,438][62634] Updated weights for policy 0, policy_version 92730 (0.0009) [2023-10-12 19:40:19,559][62635] Updated weights for policy 1, policy_version 92770 (0.0007) [2023-10-12 19:40:19,652][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000092736_94961664.pth... [2023-10-12 19:40:19,687][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000091136_93323264.pth [2023-10-12 19:40:19,931][62635] Updated weights for policy 1, policy_version 92780 (0.0007) [2023-10-12 19:40:20,296][62635] Updated weights for policy 1, policy_version 92790 (0.0007) [2023-10-12 19:40:20,663][62635] Updated weights for policy 1, policy_version 92800 (0.0009) [2023-10-12 19:40:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189988864. Throughput: 0: 1687.0, 1: 1664.6. Samples: 47504290. 
Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:23,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.960')] [2023-10-12 19:40:23,572][62634] Updated weights for policy 0, policy_version 92740 (0.0008) [2023-10-12 19:40:23,954][62634] Updated weights for policy 0, policy_version 92750 (0.0011) [2023-10-12 19:40:24,328][62634] Updated weights for policy 0, policy_version 92760 (0.0009) [2023-10-12 19:40:24,710][62635] Updated weights for policy 1, policy_version 92810 (0.0008) [2023-10-12 19:40:25,085][62635] Updated weights for policy 1, policy_version 92820 (0.0008) [2023-10-12 19:40:25,446][62635] Updated weights for policy 1, policy_version 92830 (0.0008) [2023-10-12 19:40:28,274][62634] Updated weights for policy 0, policy_version 92770 (0.0008) [2023-10-12 19:40:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 190054400. Throughput: 0: 1685.9, 1: 1692.1. Samples: 47525218. Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:28,435][61643] Avg episode reward: [(0, '24.650'), (1, '9.970')] [2023-10-12 19:40:28,639][62634] Updated weights for policy 0, policy_version 92780 (0.0009) [2023-10-12 19:40:29,012][62634] Updated weights for policy 0, policy_version 92790 (0.0007) [2023-10-12 19:40:29,399][62634] Updated weights for policy 0, policy_version 92800 (0.0007) [2023-10-12 19:40:29,588][62635] Updated weights for policy 1, policy_version 92840 (0.0007) [2023-10-12 19:40:29,971][62635] Updated weights for policy 1, policy_version 92850 (0.0010) [2023-10-12 19:40:30,341][62635] Updated weights for policy 1, policy_version 92860 (0.0011) [2023-10-12 19:40:33,273][62634] Updated weights for policy 0, policy_version 92810 (0.0009) [2023-10-12 19:40:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190119936. Throughput: 0: 1683.2, 1: 1691.4. Samples: 47545860. 
Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:33,435][61643] Avg episode reward: [(0, '24.370'), (1, '9.880')] [2023-10-12 19:40:33,646][62634] Updated weights for policy 0, policy_version 92820 (0.0009) [2023-10-12 19:40:34,029][62634] Updated weights for policy 0, policy_version 92830 (0.0008) [2023-10-12 19:40:34,284][62635] Updated weights for policy 1, policy_version 92870 (0.0010) [2023-10-12 19:40:34,655][62635] Updated weights for policy 1, policy_version 92880 (0.0008) [2023-10-12 19:40:35,015][62635] Updated weights for policy 1, policy_version 92890 (0.0007) [2023-10-12 19:40:37,982][62634] Updated weights for policy 0, policy_version 92840 (0.0010) [2023-10-12 19:40:38,368][62634] Updated weights for policy 0, policy_version 92850 (0.0008) [2023-10-12 19:40:38,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190185472. Throughput: 0: 1693.2, 1: 1678.1. Samples: 47555254. Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:38,436][61643] Avg episode reward: [(0, '24.360'), (1, '9.960')] [2023-10-12 19:40:38,741][62634] Updated weights for policy 0, policy_version 92860 (0.0008) [2023-10-12 19:40:39,181][62635] Updated weights for policy 1, policy_version 92900 (0.0008) [2023-10-12 19:40:39,551][62635] Updated weights for policy 1, policy_version 92910 (0.0007) [2023-10-12 19:40:39,914][62635] Updated weights for policy 1, policy_version 92920 (0.0007) [2023-10-12 19:40:42,731][62634] Updated weights for policy 0, policy_version 92870 (0.0008) [2023-10-12 19:40:43,105][62634] Updated weights for policy 0, policy_version 92880 (0.0007) [2023-10-12 19:40:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190251008. Throughput: 0: 1689.5, 1: 1692.0. Samples: 47576012. 
Policy #0 lag: (min: 24.0, avg: 49.0, max: 56.0) [2023-10-12 19:40:43,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.710')] [2023-10-12 19:40:43,472][62634] Updated weights for policy 0, policy_version 92890 (0.0009) [2023-10-12 19:40:43,973][62635] Updated weights for policy 1, policy_version 92930 (0.0007) [2023-10-12 19:40:44,343][62635] Updated weights for policy 1, policy_version 92940 (0.0007) [2023-10-12 19:40:44,710][62635] Updated weights for policy 1, policy_version 92950 (0.0008) [2023-10-12 19:40:45,081][62635] Updated weights for policy 1, policy_version 92960 (0.0010) [2023-10-12 19:40:47,707][62634] Updated weights for policy 0, policy_version 92900 (0.0009) [2023-10-12 19:40:48,101][62634] Updated weights for policy 0, policy_version 92910 (0.0008) [2023-10-12 19:40:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190316544. Throughput: 0: 1676.9, 1: 1687.6. Samples: 47596062. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:40:48,435][61643] Avg episode reward: [(0, '23.840'), (1, '9.890')] [2023-10-12 19:40:48,475][62634] Updated weights for policy 0, policy_version 92920 (0.0008) [2023-10-12 19:40:49,200][62635] Updated weights for policy 1, policy_version 92970 (0.0008) [2023-10-12 19:40:49,568][62635] Updated weights for policy 1, policy_version 92980 (0.0007) [2023-10-12 19:40:49,937][62635] Updated weights for policy 1, policy_version 92990 (0.0007) [2023-10-12 19:40:52,421][62634] Updated weights for policy 0, policy_version 92930 (0.0008) [2023-10-12 19:40:52,794][62634] Updated weights for policy 0, policy_version 92940 (0.0009) [2023-10-12 19:40:53,174][62634] Updated weights for policy 0, policy_version 92950 (0.0009) [2023-10-12 19:40:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190382080. Throughput: 0: 1690.0, 1: 1682.5. Samples: 47605736. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:40:53,435][61643] Avg episode reward: [(0, '23.840'), (1, '10.010')] [2023-10-12 19:40:53,549][62634] Updated weights for policy 0, policy_version 92960 (0.0011) [2023-10-12 19:40:53,915][62635] Updated weights for policy 1, policy_version 93000 (0.0010) [2023-10-12 19:40:54,279][62635] Updated weights for policy 1, policy_version 93010 (0.0009) [2023-10-12 19:40:54,649][62635] Updated weights for policy 1, policy_version 93020 (0.0007) [2023-10-12 19:40:57,592][62634] Updated weights for policy 0, policy_version 92970 (0.0009) [2023-10-12 19:40:57,966][62634] Updated weights for policy 0, policy_version 92980 (0.0010) [2023-10-12 19:40:58,332][62634] Updated weights for policy 0, policy_version 92990 (0.0009) [2023-10-12 19:40:58,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 190480384. Throughput: 0: 1688.7, 1: 1690.6. Samples: 47626704. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:40:58,435][61643] Avg episode reward: [(0, '23.850'), (1, '9.840')] [2023-10-12 19:40:58,691][62635] Updated weights for policy 1, policy_version 93030 (0.0009) [2023-10-12 19:40:59,053][62635] Updated weights for policy 1, policy_version 93040 (0.0011) [2023-10-12 19:40:59,432][62635] Updated weights for policy 1, policy_version 93050 (0.0010) [2023-10-12 19:41:02,479][62634] Updated weights for policy 0, policy_version 93000 (0.0008) [2023-10-12 19:41:02,861][62634] Updated weights for policy 0, policy_version 93010 (0.0007) [2023-10-12 19:41:03,243][62634] Updated weights for policy 0, policy_version 93020 (0.0007) [2023-10-12 19:41:03,412][62635] Updated weights for policy 1, policy_version 93060 (0.0008) [2023-10-12 19:41:03,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190545920. Throughput: 0: 1668.0, 1: 1697.1. Samples: 47646670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:03,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.890')] [2023-10-12 19:41:03,774][62635] Updated weights for policy 1, policy_version 93070 (0.0009) [2023-10-12 19:41:04,144][62635] Updated weights for policy 1, policy_version 93080 (0.0010) [2023-10-12 19:41:07,162][62634] Updated weights for policy 0, policy_version 93030 (0.0009) [2023-10-12 19:41:07,546][62634] Updated weights for policy 0, policy_version 93040 (0.0010) [2023-10-12 19:41:07,928][62634] Updated weights for policy 0, policy_version 93050 (0.0008) [2023-10-12 19:41:08,200][62635] Updated weights for policy 1, policy_version 93090 (0.0010) [2023-10-12 19:41:08,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190611456. Throughput: 0: 1688.7, 1: 1693.2. Samples: 47656478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:08,436][61643] Avg episode reward: [(0, '24.070'), (1, '9.980')] [2023-10-12 19:41:08,566][62635] Updated weights for policy 1, policy_version 93100 (0.0009) [2023-10-12 19:41:08,932][62635] Updated weights for policy 1, policy_version 93110 (0.0009) [2023-10-12 19:41:09,302][62635] Updated weights for policy 1, policy_version 93120 (0.0009) [2023-10-12 19:41:11,900][62634] Updated weights for policy 0, policy_version 93060 (0.0007) [2023-10-12 19:41:12,287][62634] Updated weights for policy 0, policy_version 93070 (0.0008) [2023-10-12 19:41:12,659][62634] Updated weights for policy 0, policy_version 93080 (0.0008) [2023-10-12 19:41:13,332][62635] Updated weights for policy 1, policy_version 93130 (0.0009) [2023-10-12 19:41:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 190676992. Throughput: 0: 1685.3, 1: 1693.9. Samples: 47677282. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:13,435][61643] Avg episode reward: [(0, '23.970'), (1, '9.920')] [2023-10-12 19:41:13,700][62635] Updated weights for policy 1, policy_version 93140 (0.0008) [2023-10-12 19:41:14,065][62635] Updated weights for policy 1, policy_version 93150 (0.0008) [2023-10-12 19:41:16,550][62634] Updated weights for policy 0, policy_version 93090 (0.0011) [2023-10-12 19:41:16,927][62634] Updated weights for policy 0, policy_version 93100 (0.0009) [2023-10-12 19:41:17,309][62634] Updated weights for policy 0, policy_version 93110 (0.0007) [2023-10-12 19:41:17,684][62634] Updated weights for policy 0, policy_version 93120 (0.0008) [2023-10-12 19:41:18,282][62635] Updated weights for policy 1, policy_version 93160 (0.0008) [2023-10-12 19:41:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190742528. Throughput: 0: 1663.0, 1: 1690.0. Samples: 47696748. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:18,436][61643] Avg episode reward: [(0, '24.010'), (1, '10.030')] [2023-10-12 19:41:18,662][62635] Updated weights for policy 1, policy_version 93170 (0.0008) [2023-10-12 19:41:19,029][62635] Updated weights for policy 1, policy_version 93180 (0.0009) [2023-10-12 19:41:21,752][62634] Updated weights for policy 0, policy_version 93130 (0.0010) [2023-10-12 19:41:22,127][62634] Updated weights for policy 0, policy_version 93140 (0.0008) [2023-10-12 19:41:22,509][62634] Updated weights for policy 0, policy_version 93150 (0.0009) [2023-10-12 19:41:22,978][62635] Updated weights for policy 1, policy_version 93190 (0.0010) [2023-10-12 19:41:23,342][62635] Updated weights for policy 1, policy_version 93200 (0.0008) [2023-10-12 19:41:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190808064. Throughput: 0: 1689.4, 1: 1687.7. Samples: 47707224. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:23,435][61643] Avg episode reward: [(0, '24.190'), (1, '9.930')] [2023-10-12 19:41:23,718][62635] Updated weights for policy 1, policy_version 93210 (0.0008) [2023-10-12 19:41:26,671][62634] Updated weights for policy 0, policy_version 93160 (0.0010) [2023-10-12 19:41:27,043][62634] Updated weights for policy 0, policy_version 93170 (0.0010) [2023-10-12 19:41:27,427][62634] Updated weights for policy 0, policy_version 93180 (0.0010) [2023-10-12 19:41:27,715][62635] Updated weights for policy 1, policy_version 93220 (0.0010) [2023-10-12 19:41:28,088][62635] Updated weights for policy 1, policy_version 93230 (0.0007) [2023-10-12 19:41:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190873600. Throughput: 0: 1679.2, 1: 1693.0. Samples: 47727762. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:28,436][61643] Avg episode reward: [(0, '23.870'), (1, '9.820')] [2023-10-12 19:41:28,451][62635] Updated weights for policy 1, policy_version 93240 (0.0010) [2023-10-12 19:41:31,629][62634] Updated weights for policy 0, policy_version 93190 (0.0009) [2023-10-12 19:41:32,006][62634] Updated weights for policy 0, policy_version 93200 (0.0008) [2023-10-12 19:41:32,378][62634] Updated weights for policy 0, policy_version 93210 (0.0007) [2023-10-12 19:41:32,603][62635] Updated weights for policy 1, policy_version 93250 (0.0010) [2023-10-12 19:41:32,973][62635] Updated weights for policy 1, policy_version 93260 (0.0010) [2023-10-12 19:41:33,344][62635] Updated weights for policy 1, policy_version 93270 (0.0009) [2023-10-12 19:41:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 190939136. Throughput: 0: 1674.4, 1: 1687.2. Samples: 47747336. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:33,435][61643] Avg episode reward: [(0, '23.800'), (1, '9.580')] [2023-10-12 19:41:33,705][62635] Updated weights for policy 1, policy_version 93280 (0.0008) [2023-10-12 19:41:36,662][62634] Updated weights for policy 0, policy_version 93220 (0.0008) [2023-10-12 19:41:37,057][62634] Updated weights for policy 0, policy_version 93230 (0.0008) [2023-10-12 19:41:37,431][62634] Updated weights for policy 0, policy_version 93240 (0.0007) [2023-10-12 19:41:37,793][62635] Updated weights for policy 1, policy_version 93290 (0.0008) [2023-10-12 19:41:38,153][62635] Updated weights for policy 1, policy_version 93300 (0.0008) [2023-10-12 19:41:38,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191004672. Throughput: 0: 1690.1, 1: 1698.9. Samples: 47758242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:41:38,435][61643] Avg episode reward: [(0, '23.810'), (1, '9.700')] [2023-10-12 19:41:38,517][62635] Updated weights for policy 1, policy_version 93310 (0.0007) [2023-10-12 19:41:41,444][62634] Updated weights for policy 0, policy_version 93250 (0.0007) [2023-10-12 19:41:41,809][62634] Updated weights for policy 0, policy_version 93260 (0.0009) [2023-10-12 19:41:42,182][62634] Updated weights for policy 0, policy_version 93270 (0.0008) [2023-10-12 19:41:42,559][62635] Updated weights for policy 1, policy_version 93320 (0.0011) [2023-10-12 19:41:42,563][62634] Updated weights for policy 0, policy_version 93280 (0.0007) [2023-10-12 19:41:42,931][62635] Updated weights for policy 1, policy_version 93330 (0.0008) [2023-10-12 19:41:43,303][62635] Updated weights for policy 1, policy_version 93340 (0.0007) [2023-10-12 19:41:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191070208. Throughput: 0: 1671.0, 1: 1695.5. Samples: 47778194. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:41:43,436][61643] Avg episode reward: [(0, '23.760'), (1, '9.670')] [2023-10-12 19:41:46,507][62634] Updated weights for policy 0, policy_version 93290 (0.0008) [2023-10-12 19:41:46,887][62634] Updated weights for policy 0, policy_version 93300 (0.0008) [2023-10-12 19:41:47,263][62634] Updated weights for policy 0, policy_version 93310 (0.0009) [2023-10-12 19:41:47,582][62635] Updated weights for policy 1, policy_version 93350 (0.0008) [2023-10-12 19:41:47,947][62635] Updated weights for policy 1, policy_version 93360 (0.0007) [2023-10-12 19:41:48,321][62635] Updated weights for policy 1, policy_version 93370 (0.0009) [2023-10-12 19:41:48,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191135744. Throughput: 0: 1675.8, 1: 1672.7. Samples: 47797352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:41:48,436][61643] Avg episode reward: [(0, '24.130'), (1, '9.620')] [2023-10-12 19:41:51,262][62634] Updated weights for policy 0, policy_version 93320 (0.0007) [2023-10-12 19:41:51,636][62634] Updated weights for policy 0, policy_version 93330 (0.0007) [2023-10-12 19:41:52,022][62634] Updated weights for policy 0, policy_version 93340 (0.0010) [2023-10-12 19:41:52,271][62635] Updated weights for policy 1, policy_version 93380 (0.0008) [2023-10-12 19:41:52,641][62635] Updated weights for policy 1, policy_version 93390 (0.0007) [2023-10-12 19:41:53,009][62635] Updated weights for policy 1, policy_version 93400 (0.0007) [2023-10-12 19:41:53,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 191234048. Throughput: 0: 1685.4, 1: 1688.7. Samples: 47808314. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:41:53,435][61643] Avg episode reward: [(0, '24.320'), (1, '9.800')] [2023-10-12 19:41:56,162][62634] Updated weights for policy 0, policy_version 93350 (0.0008) [2023-10-12 19:41:56,529][62634] Updated weights for policy 0, policy_version 93360 (0.0010) [2023-10-12 19:41:56,916][62634] Updated weights for policy 0, policy_version 93370 (0.0009) [2023-10-12 19:41:57,068][62635] Updated weights for policy 1, policy_version 93410 (0.0008) [2023-10-12 19:41:57,430][62635] Updated weights for policy 1, policy_version 93420 (0.0008) [2023-10-12 19:41:57,811][62635] Updated weights for policy 1, policy_version 93430 (0.0008) [2023-10-12 19:41:58,181][62635] Updated weights for policy 1, policy_version 93440 (0.0008) [2023-10-12 19:41:58,435][61643] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191299584. Throughput: 0: 1664.1, 1: 1684.9. Samples: 47827990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:41:58,435][61643] Avg episode reward: [(0, '24.750'), (1, '9.730')] [2023-10-12 19:42:00,947][62634] Updated weights for policy 0, policy_version 93380 (0.0008) [2023-10-12 19:42:01,327][62634] Updated weights for policy 0, policy_version 93390 (0.0009) [2023-10-12 19:42:01,709][62634] Updated weights for policy 0, policy_version 93400 (0.0010) [2023-10-12 19:42:02,192][62635] Updated weights for policy 1, policy_version 93450 (0.0008) [2023-10-12 19:42:02,561][62635] Updated weights for policy 1, policy_version 93460 (0.0008) [2023-10-12 19:42:02,933][62635] Updated weights for policy 1, policy_version 93470 (0.0009) [2023-10-12 19:42:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191365120. Throughput: 0: 1682.4, 1: 1662.6. Samples: 47847274. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:03,436][61643] Avg episode reward: [(0, '24.610'), (1, '9.720')] [2023-10-12 19:42:05,730][62634] Updated weights for policy 0, policy_version 93410 (0.0008) [2023-10-12 19:42:06,102][62634] Updated weights for policy 0, policy_version 93420 (0.0011) [2023-10-12 19:42:06,482][62634] Updated weights for policy 0, policy_version 93430 (0.0010) [2023-10-12 19:42:06,860][62634] Updated weights for policy 0, policy_version 93440 (0.0010) [2023-10-12 19:42:07,215][62635] Updated weights for policy 1, policy_version 93480 (0.0011) [2023-10-12 19:42:07,584][62635] Updated weights for policy 1, policy_version 93490 (0.0009) [2023-10-12 19:42:07,956][62635] Updated weights for policy 1, policy_version 93500 (0.0009) [2023-10-12 19:42:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191430656. Throughput: 0: 1671.9, 1: 1684.8. Samples: 47858276. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:08,436][61643] Avg episode reward: [(0, '24.500'), (1, '10.050')] [2023-10-12 19:42:10,884][62634] Updated weights for policy 0, policy_version 93450 (0.0007) [2023-10-12 19:42:11,250][62634] Updated weights for policy 0, policy_version 93460 (0.0010) [2023-10-12 19:42:11,637][62634] Updated weights for policy 0, policy_version 93470 (0.0009) [2023-10-12 19:42:12,052][62635] Updated weights for policy 1, policy_version 93510 (0.0009) [2023-10-12 19:42:12,420][62635] Updated weights for policy 1, policy_version 93520 (0.0010) [2023-10-12 19:42:12,781][62635] Updated weights for policy 1, policy_version 93530 (0.0011) [2023-10-12 19:42:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191496192. Throughput: 0: 1657.7, 1: 1671.6. Samples: 47877582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:13,436][61643] Avg episode reward: [(0, '24.430'), (1, '9.960')] [2023-10-12 19:42:15,826][62634] Updated weights for policy 0, policy_version 93480 (0.0009) [2023-10-12 19:42:16,203][62634] Updated weights for policy 0, policy_version 93490 (0.0008) [2023-10-12 19:42:16,576][62634] Updated weights for policy 0, policy_version 93500 (0.0009) [2023-10-12 19:42:16,962][62635] Updated weights for policy 1, policy_version 93540 (0.0010) [2023-10-12 19:42:17,319][62635] Updated weights for policy 1, policy_version 93550 (0.0008) [2023-10-12 19:42:17,685][62635] Updated weights for policy 1, policy_version 93560 (0.0009) [2023-10-12 19:42:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191561728. Throughput: 0: 1675.0, 1: 1658.5. Samples: 47897346. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:18,436][61643] Avg episode reward: [(0, '24.330'), (1, '9.700')] [2023-10-12 19:42:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000093568_95813632.pth... [2023-10-12 19:42:18,447][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000093504_95748096.pth... 
[2023-10-12 19:42:18,488][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000091936_94142464.pth [2023-10-12 19:42:18,489][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000092000_94208000.pth [2023-10-12 19:42:20,745][62634] Updated weights for policy 0, policy_version 93510 (0.0008) [2023-10-12 19:42:21,117][62634] Updated weights for policy 0, policy_version 93520 (0.0007) [2023-10-12 19:42:21,494][62634] Updated weights for policy 0, policy_version 93530 (0.0008) [2023-10-12 19:42:21,699][62635] Updated weights for policy 1, policy_version 93570 (0.0010) [2023-10-12 19:42:22,075][62635] Updated weights for policy 1, policy_version 93580 (0.0010) [2023-10-12 19:42:22,434][62635] Updated weights for policy 1, policy_version 93590 (0.0007) [2023-10-12 19:42:22,800][62635] Updated weights for policy 1, policy_version 93600 (0.0007) [2023-10-12 19:42:23,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191627264. Throughput: 0: 1663.6, 1: 1676.7. Samples: 47908552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:23,435][61643] Avg episode reward: [(0, '24.250'), (1, '10.020')] [2023-10-12 19:42:25,547][62634] Updated weights for policy 0, policy_version 93540 (0.0008) [2023-10-12 19:42:25,927][62634] Updated weights for policy 0, policy_version 93550 (0.0008) [2023-10-12 19:42:26,302][62634] Updated weights for policy 0, policy_version 93560 (0.0008) [2023-10-12 19:42:26,697][62635] Updated weights for policy 1, policy_version 93610 (0.0008) [2023-10-12 19:42:27,063][62635] Updated weights for policy 1, policy_version 93620 (0.0009) [2023-10-12 19:42:27,436][62635] Updated weights for policy 1, policy_version 93630 (0.0010) [2023-10-12 19:42:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 191692800. Throughput: 0: 1664.7, 1: 1661.9. Samples: 47927890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:28,435][61643] Avg episode reward: [(0, '24.210'), (1, '9.740')] [2023-10-12 19:42:30,277][62634] Updated weights for policy 0, policy_version 93570 (0.0009) [2023-10-12 19:42:30,677][62634] Updated weights for policy 0, policy_version 93580 (0.0011) [2023-10-12 19:42:31,063][62634] Updated weights for policy 0, policy_version 93590 (0.0009) [2023-10-12 19:42:31,435][62634] Updated weights for policy 0, policy_version 93600 (0.0008) [2023-10-12 19:42:31,656][62635] Updated weights for policy 1, policy_version 93640 (0.0009) [2023-10-12 19:42:32,021][62635] Updated weights for policy 1, policy_version 93650 (0.0010) [2023-10-12 19:42:32,397][62635] Updated weights for policy 1, policy_version 93660 (0.0009) [2023-10-12 19:42:33,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 191758336. Throughput: 0: 1680.5, 1: 1663.0. Samples: 47947810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:33,435][61643] Avg episode reward: [(0, '24.280'), (1, '9.650')] [2023-10-12 19:42:35,463][62634] Updated weights for policy 0, policy_version 93610 (0.0009) [2023-10-12 19:42:35,837][62634] Updated weights for policy 0, policy_version 93620 (0.0009) [2023-10-12 19:42:36,219][62634] Updated weights for policy 0, policy_version 93630 (0.0008) [2023-10-12 19:42:36,360][62635] Updated weights for policy 1, policy_version 93670 (0.0007) [2023-10-12 19:42:36,736][62635] Updated weights for policy 1, policy_version 93680 (0.0007) [2023-10-12 19:42:37,110][62635] Updated weights for policy 1, policy_version 93690 (0.0009) [2023-10-12 19:42:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 191823872. Throughput: 0: 1660.8, 1: 1677.2. Samples: 47958526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-12 19:42:38,435][61643] Avg episode reward: [(0, '24.200'), (1, '9.940')] [2023-10-12 19:42:40,375][62634] Updated weights for policy 0, policy_version 93640 (0.0009) [2023-10-12 19:42:40,754][62634] Updated weights for policy 0, policy_version 93650 (0.0007) [2023-10-12 19:42:41,132][62634] Updated weights for policy 0, policy_version 93660 (0.0008) [2023-10-12 19:42:41,251][62635] Updated weights for policy 1, policy_version 93700 (0.0008) [2023-10-12 19:42:41,616][62635] Updated weights for policy 1, policy_version 93710 (0.0008) [2023-10-12 19:42:41,986][62635] Updated weights for policy 1, policy_version 93720 (0.0008) [2023-10-12 19:42:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191889408. Throughput: 0: 1670.9, 1: 1657.7. Samples: 47977778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:42:43,435][61643] Avg episode reward: [(0, '24.300'), (1, '9.850')] [2023-10-12 19:42:45,220][62634] Updated weights for policy 0, policy_version 93670 (0.0008) [2023-10-12 19:42:45,604][62634] Updated weights for policy 0, policy_version 93680 (0.0009) [2023-10-12 19:42:45,975][62634] Updated weights for policy 0, policy_version 93690 (0.0008) [2023-10-12 19:42:46,047][62635] Updated weights for policy 1, policy_version 93730 (0.0009) [2023-10-12 19:42:46,409][62635] Updated weights for policy 1, policy_version 93740 (0.0009) [2023-10-12 19:42:46,789][62635] Updated weights for policy 1, policy_version 93750 (0.0008) [2023-10-12 19:42:47,149][62635] Updated weights for policy 1, policy_version 93760 (0.0008) [2023-10-12 19:42:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 191954944. Throughput: 0: 1674.1, 1: 1682.0. Samples: 47998300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:42:48,435][61643] Avg episode reward: [(0, '23.980'), (1, '9.650')] [2023-10-12 19:42:49,950][62634] Updated weights for policy 0, policy_version 93700 (0.0008) [2023-10-12 19:42:50,324][62634] Updated weights for policy 0, policy_version 93710 (0.0010) [2023-10-12 19:42:50,707][62634] Updated weights for policy 0, policy_version 93720 (0.0011) [2023-10-12 19:42:51,409][62635] Updated weights for policy 1, policy_version 93770 (0.0008) [2023-10-12 19:42:51,780][62635] Updated weights for policy 1, policy_version 93780 (0.0009) [2023-10-12 19:42:52,162][62635] Updated weights for policy 1, policy_version 93790 (0.0009) [2023-10-12 19:42:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192020480. Throughput: 0: 1655.5, 1: 1686.0. Samples: 48008644. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:42:53,435][61643] Avg episode reward: [(0, '24.020'), (1, '9.770')] [2023-10-12 19:42:54,690][62634] Updated weights for policy 0, policy_version 93730 (0.0008) [2023-10-12 19:42:55,070][62634] Updated weights for policy 0, policy_version 93740 (0.0007) [2023-10-12 19:42:55,444][62634] Updated weights for policy 0, policy_version 93750 (0.0008) [2023-10-12 19:42:55,823][62634] Updated weights for policy 0, policy_version 93760 (0.0007) [2023-10-12 19:42:56,180][62635] Updated weights for policy 1, policy_version 93800 (0.0008) [2023-10-12 19:42:56,560][62635] Updated weights for policy 1, policy_version 93810 (0.0009) [2023-10-12 19:42:56,924][62635] Updated weights for policy 1, policy_version 93820 (0.0008) [2023-10-12 19:42:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192086016. Throughput: 0: 1681.8, 1: 1664.8. Samples: 48028180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:42:58,435][61643] Avg episode reward: [(0, '23.990'), (1, '9.830')] [2023-10-12 19:42:59,727][62634] Updated weights for policy 0, policy_version 93770 (0.0007) [2023-10-12 19:43:00,108][62634] Updated weights for policy 0, policy_version 93780 (0.0010) [2023-10-12 19:43:00,494][62634] Updated weights for policy 0, policy_version 93790 (0.0009) [2023-10-12 19:43:00,960][62635] Updated weights for policy 1, policy_version 93830 (0.0007) [2023-10-12 19:43:01,332][62635] Updated weights for policy 1, policy_version 93840 (0.0010) [2023-10-12 19:43:01,698][62635] Updated weights for policy 1, policy_version 93850 (0.0008) [2023-10-12 19:43:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192151552. Throughput: 0: 1685.4, 1: 1685.0. Samples: 48049014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:03,436][61643] Avg episode reward: [(0, '24.000'), (1, '9.740')] [2023-10-12 19:43:04,494][62634] Updated weights for policy 0, policy_version 93800 (0.0007) [2023-10-12 19:43:04,874][62634] Updated weights for policy 0, policy_version 93810 (0.0008) [2023-10-12 19:43:05,263][62634] Updated weights for policy 0, policy_version 93820 (0.0010) [2023-10-12 19:43:05,767][62635] Updated weights for policy 1, policy_version 93860 (0.0009) [2023-10-12 19:43:06,139][62635] Updated weights for policy 1, policy_version 93870 (0.0008) [2023-10-12 19:43:06,505][62635] Updated weights for policy 1, policy_version 93880 (0.0008) [2023-10-12 19:43:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192217088. Throughput: 0: 1664.7, 1: 1672.8. Samples: 48058738. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:08,436][61643] Avg episode reward: [(0, '24.170'), (1, '10.010')] [2023-10-12 19:43:09,378][62634] Updated weights for policy 0, policy_version 93830 (0.0008) [2023-10-12 19:43:09,755][62634] Updated weights for policy 0, policy_version 93840 (0.0008) [2023-10-12 19:43:10,131][62634] Updated weights for policy 0, policy_version 93850 (0.0009) [2023-10-12 19:43:10,458][62635] Updated weights for policy 1, policy_version 93890 (0.0009) [2023-10-12 19:43:10,828][62635] Updated weights for policy 1, policy_version 93900 (0.0007) [2023-10-12 19:43:11,188][62635] Updated weights for policy 1, policy_version 93910 (0.0010) [2023-10-12 19:43:11,557][62635] Updated weights for policy 1, policy_version 93920 (0.0010) [2023-10-12 19:43:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192282624. Throughput: 0: 1683.3, 1: 1663.1. Samples: 48078480. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:13,435][61643] Avg episode reward: [(0, '23.930'), (1, '9.780')] [2023-10-12 19:43:14,313][62634] Updated weights for policy 0, policy_version 93860 (0.0007) [2023-10-12 19:43:14,685][62634] Updated weights for policy 0, policy_version 93870 (0.0008) [2023-10-12 19:43:15,058][62634] Updated weights for policy 0, policy_version 93880 (0.0009) [2023-10-12 19:43:15,651][62635] Updated weights for policy 1, policy_version 93930 (0.0008) [2023-10-12 19:43:16,026][62635] Updated weights for policy 1, policy_version 93940 (0.0008) [2023-10-12 19:43:16,396][62635] Updated weights for policy 1, policy_version 93950 (0.0009) [2023-10-12 19:43:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192348160. Throughput: 0: 1677.2, 1: 1676.9. Samples: 48098744. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:18,436][61643] Avg episode reward: [(0, '23.480'), (1, '9.560')] [2023-10-12 19:43:19,296][62634] Updated weights for policy 0, policy_version 93890 (0.0008) [2023-10-12 19:43:19,678][62634] Updated weights for policy 0, policy_version 93900 (0.0009) [2023-10-12 19:43:20,065][62634] Updated weights for policy 0, policy_version 93910 (0.0007) [2023-10-12 19:43:20,328][62635] Updated weights for policy 1, policy_version 93960 (0.0008) [2023-10-12 19:43:20,437][62634] Updated weights for policy 0, policy_version 93920 (0.0007) [2023-10-12 19:43:20,699][62635] Updated weights for policy 1, policy_version 93970 (0.0008) [2023-10-12 19:43:21,081][62635] Updated weights for policy 1, policy_version 93980 (0.0010) [2023-10-12 19:43:23,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192413696. Throughput: 0: 1668.2, 1: 1655.3. Samples: 48108084. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:23,436][61643] Avg episode reward: [(0, '23.390'), (1, '9.890')] [2023-10-12 19:43:24,447][62634] Updated weights for policy 0, policy_version 93930 (0.0008) [2023-10-12 19:43:24,835][62634] Updated weights for policy 0, policy_version 93940 (0.0009) [2023-10-12 19:43:25,185][62635] Updated weights for policy 1, policy_version 93990 (0.0008) [2023-10-12 19:43:25,205][62634] Updated weights for policy 0, policy_version 93950 (0.0007) [2023-10-12 19:43:25,556][62635] Updated weights for policy 1, policy_version 94000 (0.0007) [2023-10-12 19:43:25,916][62635] Updated weights for policy 1, policy_version 94010 (0.0007) [2023-10-12 19:43:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192479232. Throughput: 0: 1683.8, 1: 1672.0. Samples: 48128790. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:28,436][61643] Avg episode reward: [(0, '23.430'), (1, '9.850')] [2023-10-12 19:43:29,141][62634] Updated weights for policy 0, policy_version 93960 (0.0009) [2023-10-12 19:43:29,520][62634] Updated weights for policy 0, policy_version 93970 (0.0009) [2023-10-12 19:43:29,902][62634] Updated weights for policy 0, policy_version 93980 (0.0011) [2023-10-12 19:43:29,965][62635] Updated weights for policy 1, policy_version 94020 (0.0008) [2023-10-12 19:43:30,330][62635] Updated weights for policy 1, policy_version 94030 (0.0008) [2023-10-12 19:43:30,706][62635] Updated weights for policy 1, policy_version 94040 (0.0008) [2023-10-12 19:43:33,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192544768. Throughput: 0: 1685.9, 1: 1677.0. Samples: 48149630. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:33,435][61643] Avg episode reward: [(0, '23.070'), (1, '9.670')] [2023-10-12 19:43:33,775][62634] Updated weights for policy 0, policy_version 93990 (0.0008) [2023-10-12 19:43:34,150][62634] Updated weights for policy 0, policy_version 94000 (0.0008) [2023-10-12 19:43:34,531][62634] Updated weights for policy 0, policy_version 94010 (0.0009) [2023-10-12 19:43:34,993][62635] Updated weights for policy 1, policy_version 94050 (0.0008) [2023-10-12 19:43:35,359][62635] Updated weights for policy 1, policy_version 94060 (0.0010) [2023-10-12 19:43:35,720][62635] Updated weights for policy 1, policy_version 94070 (0.0009) [2023-10-12 19:43:36,087][62635] Updated weights for policy 1, policy_version 94080 (0.0007) [2023-10-12 19:43:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 192610304. Throughput: 0: 1682.1, 1: 1650.7. Samples: 48158622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:43:38,436][61643] Avg episode reward: [(0, '23.150'), (1, '9.940')] [2023-10-12 19:43:38,725][62634] Updated weights for policy 0, policy_version 94020 (0.0009) [2023-10-12 19:43:39,108][62634] Updated weights for policy 0, policy_version 94030 (0.0008) [2023-10-12 19:43:39,482][62634] Updated weights for policy 0, policy_version 94040 (0.0007) [2023-10-12 19:43:40,259][62635] Updated weights for policy 1, policy_version 94090 (0.0010) [2023-10-12 19:43:40,627][62635] Updated weights for policy 1, policy_version 94100 (0.0008) [2023-10-12 19:43:40,996][62635] Updated weights for policy 1, policy_version 94110 (0.0008) [2023-10-12 19:43:43,383][62634] Updated weights for policy 0, policy_version 94050 (0.0008) [2023-10-12 19:43:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 192675840. Throughput: 0: 1677.9, 1: 1675.8. Samples: 48179098. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:43:43,435][61643] Avg episode reward: [(0, '23.290'), (1, '9.970')] [2023-10-12 19:43:43,756][62634] Updated weights for policy 0, policy_version 94060 (0.0009) [2023-10-12 19:43:44,138][62634] Updated weights for policy 0, policy_version 94070 (0.0007) [2023-10-12 19:43:44,523][62634] Updated weights for policy 0, policy_version 94080 (0.0009) [2023-10-12 19:43:44,805][62635] Updated weights for policy 1, policy_version 94120 (0.0007) [2023-10-12 19:43:45,165][62635] Updated weights for policy 1, policy_version 94130 (0.0009) [2023-10-12 19:43:45,529][62635] Updated weights for policy 1, policy_version 94140 (0.0010) [2023-10-12 19:43:48,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 192741376. Throughput: 0: 1674.4, 1: 1683.5. Samples: 48200120. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:43:48,435][61643] Avg episode reward: [(0, '23.250'), (1, '9.710')] [2023-10-12 19:43:48,782][62634] Updated weights for policy 0, policy_version 94090 (0.0009) [2023-10-12 19:43:49,163][62634] Updated weights for policy 0, policy_version 94100 (0.0008) [2023-10-12 19:43:49,534][62634] Updated weights for policy 0, policy_version 94110 (0.0007) [2023-10-12 19:43:49,718][62635] Updated weights for policy 1, policy_version 94150 (0.0009) [2023-10-12 19:43:50,088][62635] Updated weights for policy 1, policy_version 94160 (0.0007) [2023-10-12 19:43:50,452][62635] Updated weights for policy 1, policy_version 94170 (0.0008) [2023-10-12 19:43:53,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 192806912. Throughput: 0: 1675.3, 1: 1664.7. Samples: 48209038. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:43:53,435][61643] Avg episode reward: [(0, '23.090'), (1, '10.110')] [2023-10-12 19:43:53,659][62634] Updated weights for policy 0, policy_version 94120 (0.0008) [2023-10-12 19:43:54,049][62634] Updated weights for policy 0, policy_version 94130 (0.0008) [2023-10-12 19:43:54,419][62634] Updated weights for policy 0, policy_version 94140 (0.0010) [2023-10-12 19:43:54,593][62635] Updated weights for policy 1, policy_version 94180 (0.0008) [2023-10-12 19:43:54,959][62635] Updated weights for policy 1, policy_version 94190 (0.0010) [2023-10-12 19:43:55,319][62635] Updated weights for policy 1, policy_version 94200 (0.0010) [2023-10-12 19:43:58,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 192872448. Throughput: 0: 1678.2, 1: 1686.9. Samples: 48229908. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:43:58,436][61643] Avg episode reward: [(0, '23.390'), (1, '9.970')] [2023-10-12 19:43:58,568][62634] Updated weights for policy 0, policy_version 94150 (0.0008) [2023-10-12 19:43:58,942][62634] Updated weights for policy 0, policy_version 94160 (0.0008) [2023-10-12 19:43:59,258][62635] Updated weights for policy 1, policy_version 94210 (0.0008) [2023-10-12 19:43:59,319][62634] Updated weights for policy 0, policy_version 94170 (0.0007) [2023-10-12 19:43:59,630][62635] Updated weights for policy 1, policy_version 94220 (0.0007) [2023-10-12 19:43:59,996][62635] Updated weights for policy 1, policy_version 94230 (0.0011) [2023-10-12 19:44:00,370][62635] Updated weights for policy 1, policy_version 94240 (0.0008) [2023-10-12 19:44:03,268][62634] Updated weights for policy 0, policy_version 94180 (0.0009) [2023-10-12 19:44:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 192937984. Throughput: 0: 1685.8, 1: 1688.1. Samples: 48250566. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:03,435][61643] Avg episode reward: [(0, '23.310'), (1, '9.860')] [2023-10-12 19:44:03,638][62634] Updated weights for policy 0, policy_version 94190 (0.0010) [2023-10-12 19:44:04,013][62634] Updated weights for policy 0, policy_version 94200 (0.0010) [2023-10-12 19:44:04,517][62635] Updated weights for policy 1, policy_version 94250 (0.0009) [2023-10-12 19:44:04,878][62635] Updated weights for policy 1, policy_version 94260 (0.0009) [2023-10-12 19:44:05,258][62635] Updated weights for policy 1, policy_version 94270 (0.0008) [2023-10-12 19:44:08,070][62634] Updated weights for policy 0, policy_version 94210 (0.0009) [2023-10-12 19:44:08,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 193003520. Throughput: 0: 1690.7, 1: 1679.3. Samples: 48259734. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:08,435][61643] Avg episode reward: [(0, '23.330'), (1, '9.950')] [2023-10-12 19:44:08,481][62634] Updated weights for policy 0, policy_version 94220 (0.0008) [2023-10-12 19:44:08,855][62634] Updated weights for policy 0, policy_version 94230 (0.0010) [2023-10-12 19:44:09,232][62634] Updated weights for policy 0, policy_version 94240 (0.0010) [2023-10-12 19:44:09,342][62635] Updated weights for policy 1, policy_version 94280 (0.0009) [2023-10-12 19:44:09,703][62635] Updated weights for policy 1, policy_version 94290 (0.0008) [2023-10-12 19:44:10,075][62635] Updated weights for policy 1, policy_version 94300 (0.0008) [2023-10-12 19:44:13,143][62634] Updated weights for policy 0, policy_version 94250 (0.0010) [2023-10-12 19:44:13,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 193069056. Throughput: 0: 1685.5, 1: 1680.0. Samples: 48280238. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:13,435][61643] Avg episode reward: [(0, '23.260'), (1, '9.740')] [2023-10-12 19:44:13,515][62634] Updated weights for policy 0, policy_version 94260 (0.0008) [2023-10-12 19:44:13,891][62634] Updated weights for policy 0, policy_version 94270 (0.0010) [2023-10-12 19:44:14,177][62635] Updated weights for policy 1, policy_version 94310 (0.0010) [2023-10-12 19:44:14,551][62635] Updated weights for policy 1, policy_version 94320 (0.0008) [2023-10-12 19:44:14,911][62635] Updated weights for policy 1, policy_version 94330 (0.0007) [2023-10-12 19:44:17,810][62634] Updated weights for policy 0, policy_version 94280 (0.0007) [2023-10-12 19:44:18,185][62634] Updated weights for policy 0, policy_version 94290 (0.0007) [2023-10-12 19:44:18,435][61643] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 193134592. Throughput: 0: 1675.8, 1: 1684.0. Samples: 48300824. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:18,436][61643] Avg episode reward: [(0, '23.240'), (1, '9.720')] [2023-10-12 19:44:18,447][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000094336_96600064.pth... [2023-10-12 19:44:18,480][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000092768_94994432.pth [2023-10-12 19:44:18,485][62495] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p1/milestones/checkpoint_000094336_96600064.pth [2023-10-12 19:44:18,565][62634] Updated weights for policy 0, policy_version 94300 (0.0007) [2023-10-12 19:44:18,703][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000094304_96567296.pth... [2023-10-12 19:44:18,738][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000092736_94961664.pth [2023-10-12 19:44:18,744][62354] Saving a milestone ./train_atari/atari_kangaroo_APPO/checkpoint_p0/milestones/checkpoint_000094304_96567296.pth [2023-10-12 19:44:18,839][62635] Updated weights for policy 1, policy_version 94340 (0.0008) [2023-10-12 19:44:19,212][62635] Updated weights for policy 1, policy_version 94350 (0.0008) [2023-10-12 19:44:19,579][62635] Updated weights for policy 1, policy_version 94360 (0.0010) [2023-10-12 19:44:22,563][62634] Updated weights for policy 0, policy_version 94310 (0.0010) [2023-10-12 19:44:22,952][62634] Updated weights for policy 0, policy_version 94320 (0.0011) [2023-10-12 19:44:23,321][62634] Updated weights for policy 0, policy_version 94330 (0.0009) [2023-10-12 19:44:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 193200128. Throughput: 0: 1686.7, 1: 1684.2. Samples: 48310312. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:23,435][61643] Avg episode reward: [(0, '23.420'), (1, '9.890')] [2023-10-12 19:44:23,713][62635] Updated weights for policy 1, policy_version 94370 (0.0010) [2023-10-12 19:44:24,077][62635] Updated weights for policy 1, policy_version 94380 (0.0007) [2023-10-12 19:44:24,454][62635] Updated weights for policy 1, policy_version 94390 (0.0009) [2023-10-12 19:44:24,832][62635] Updated weights for policy 1, policy_version 94400 (0.0010) [2023-10-12 19:44:27,541][62634] Updated weights for policy 0, policy_version 94340 (0.0009) [2023-10-12 19:44:27,917][62634] Updated weights for policy 0, policy_version 94350 (0.0008) [2023-10-12 19:44:28,291][62634] Updated weights for policy 0, policy_version 94360 (0.0009) [2023-10-12 19:44:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 193265664. Throughput: 0: 1688.8, 1: 1687.2. Samples: 48331018. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:28,436][61643] Avg episode reward: [(0, '23.920'), (1, '9.860')] [2023-10-12 19:44:28,946][62635] Updated weights for policy 1, policy_version 94410 (0.0007) [2023-10-12 19:44:29,320][62635] Updated weights for policy 1, policy_version 94420 (0.0007) [2023-10-12 19:44:29,693][62635] Updated weights for policy 1, policy_version 94430 (0.0007) [2023-10-12 19:44:32,234][62634] Updated weights for policy 0, policy_version 94370 (0.0009) [2023-10-12 19:44:32,604][62634] Updated weights for policy 0, policy_version 94380 (0.0007) [2023-10-12 19:44:32,986][62634] Updated weights for policy 0, policy_version 94390 (0.0007) [2023-10-12 19:44:33,357][62634] Updated weights for policy 0, policy_version 94400 (0.0007) [2023-10-12 19:44:33,435][61643] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193363968. Throughput: 0: 1671.2, 1: 1679.1. Samples: 48350886. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-12 19:44:33,436][61643] Avg episode reward: [(0, '24.240'), (1, '9.650')] [2023-10-12 19:44:33,624][62635] Updated weights for policy 1, policy_version 94440 (0.0009) [2023-10-12 19:44:33,994][62635] Updated weights for policy 1, policy_version 94450 (0.0007) [2023-10-12 19:44:34,368][62635] Updated weights for policy 1, policy_version 94460 (0.0008) [2023-10-12 19:44:37,457][62634] Updated weights for policy 0, policy_version 94410 (0.0008) [2023-10-12 19:44:37,830][62634] Updated weights for policy 0, policy_version 94420 (0.0007) [2023-10-12 19:44:38,214][62634] Updated weights for policy 0, policy_version 94430 (0.0008) [2023-10-12 19:44:38,416][62635] Updated weights for policy 1, policy_version 94470 (0.0009) [2023-10-12 19:44:38,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 193429504. Throughput: 0: 1692.1, 1: 1683.1. Samples: 48360920. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:44:38,435][61643] Avg episode reward: [(0, '24.370'), (1, '9.650')] [2023-10-12 19:44:38,782][62635] Updated weights for policy 1, policy_version 94480 (0.0009) [2023-10-12 19:44:39,151][62635] Updated weights for policy 1, policy_version 94490 (0.0009) [2023-10-12 19:44:42,237][62634] Updated weights for policy 0, policy_version 94440 (0.0008) [2023-10-12 19:44:42,618][62634] Updated weights for policy 0, policy_version 94450 (0.0007) [2023-10-12 19:44:42,982][62634] Updated weights for policy 0, policy_version 94460 (0.0008) [2023-10-12 19:44:43,386][62635] Updated weights for policy 1, policy_version 94500 (0.0008) [2023-10-12 19:44:43,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193495040. Throughput: 0: 1687.8, 1: 1681.9. Samples: 48381542. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:44:43,435][61643] Avg episode reward: [(0, '24.530'), (1, '9.840')] [2023-10-12 19:44:43,743][62635] Updated weights for policy 1, policy_version 94510 (0.0009) [2023-10-12 19:44:44,104][62635] Updated weights for policy 1, policy_version 94520 (0.0010) [2023-10-12 19:44:47,166][62634] Updated weights for policy 0, policy_version 94470 (0.0010) [2023-10-12 19:44:47,541][62634] Updated weights for policy 0, policy_version 94480 (0.0010) [2023-10-12 19:44:47,923][62634] Updated weights for policy 0, policy_version 94490 (0.0011) [2023-10-12 19:44:48,158][62635] Updated weights for policy 1, policy_version 94530 (0.0008) [2023-10-12 19:44:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193560576. Throughput: 0: 1663.7, 1: 1683.8. Samples: 48401204. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:44:48,436][61643] Avg episode reward: [(0, '24.710'), (1, '9.740')] [2023-10-12 19:44:48,519][62635] Updated weights for policy 1, policy_version 94540 (0.0009) [2023-10-12 19:44:48,883][62635] Updated weights for policy 1, policy_version 94550 (0.0010) [2023-10-12 19:44:49,260][62635] Updated weights for policy 1, policy_version 94560 (0.0009) [2023-10-12 19:44:52,118][62634] Updated weights for policy 0, policy_version 94500 (0.0010) [2023-10-12 19:44:52,498][62634] Updated weights for policy 0, policy_version 94510 (0.0009) [2023-10-12 19:44:52,872][62634] Updated weights for policy 0, policy_version 94520 (0.0009) [2023-10-12 19:44:53,365][62635] Updated weights for policy 1, policy_version 94570 (0.0007) [2023-10-12 19:44:53,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193626112. Throughput: 0: 1685.8, 1: 1684.2. Samples: 48411384. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:44:53,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.910')] [2023-10-12 19:44:53,722][62635] Updated weights for policy 1, policy_version 94580 (0.0009) [2023-10-12 19:44:54,088][62635] Updated weights for policy 1, policy_version 94590 (0.0007) [2023-10-12 19:44:56,888][62634] Updated weights for policy 0, policy_version 94530 (0.0008) [2023-10-12 19:44:57,272][62634] Updated weights for policy 0, policy_version 94540 (0.0008) [2023-10-12 19:44:57,656][62634] Updated weights for policy 0, policy_version 94550 (0.0007) [2023-10-12 19:44:58,031][62634] Updated weights for policy 0, policy_version 94560 (0.0008) [2023-10-12 19:44:58,117][62635] Updated weights for policy 1, policy_version 94600 (0.0007) [2023-10-12 19:44:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 193691648. Throughput: 0: 1687.4, 1: 1687.1. Samples: 48432090. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:44:58,435][61643] Avg episode reward: [(0, '25.090'), (1, '9.940')] [2023-10-12 19:44:58,482][62635] Updated weights for policy 1, policy_version 94610 (0.0008) [2023-10-12 19:44:58,842][62635] Updated weights for policy 1, policy_version 94620 (0.0009) [2023-10-12 19:45:02,011][62634] Updated weights for policy 0, policy_version 94570 (0.0007) [2023-10-12 19:45:02,375][62634] Updated weights for policy 0, policy_version 94580 (0.0009) [2023-10-12 19:45:02,756][62634] Updated weights for policy 0, policy_version 94590 (0.0009) [2023-10-12 19:45:02,773][62635] Updated weights for policy 1, policy_version 94630 (0.0008) [2023-10-12 19:45:03,132][62635] Updated weights for policy 1, policy_version 94640 (0.0007) [2023-10-12 19:45:03,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193757184. Throughput: 0: 1668.3, 1: 1673.6. Samples: 48451210. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:03,436][61643] Avg episode reward: [(0, '25.220'), (1, '9.860')] [2023-10-12 19:45:03,499][62635] Updated weights for policy 1, policy_version 94650 (0.0008) [2023-10-12 19:45:06,802][62634] Updated weights for policy 0, policy_version 94600 (0.0008) [2023-10-12 19:45:07,178][62634] Updated weights for policy 0, policy_version 94610 (0.0011) [2023-10-12 19:45:07,555][62634] Updated weights for policy 0, policy_version 94620 (0.0009) [2023-10-12 19:45:07,610][62635] Updated weights for policy 1, policy_version 94660 (0.0009) [2023-10-12 19:45:07,977][62635] Updated weights for policy 1, policy_version 94670 (0.0009) [2023-10-12 19:45:08,344][62635] Updated weights for policy 1, policy_version 94680 (0.0010) [2023-10-12 19:45:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193822720. Throughput: 0: 1688.9, 1: 1685.6. Samples: 48462162. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:08,435][61643] Avg episode reward: [(0, '25.290'), (1, '9.950')] [2023-10-12 19:45:11,619][62634] Updated weights for policy 0, policy_version 94630 (0.0007) [2023-10-12 19:45:11,988][62634] Updated weights for policy 0, policy_version 94640 (0.0008) [2023-10-12 19:45:12,363][62634] Updated weights for policy 0, policy_version 94650 (0.0010) [2023-10-12 19:45:12,487][62635] Updated weights for policy 1, policy_version 94690 (0.0008) [2023-10-12 19:45:12,854][62635] Updated weights for policy 1, policy_version 94700 (0.0009) [2023-10-12 19:45:13,225][62635] Updated weights for policy 1, policy_version 94710 (0.0009) [2023-10-12 19:45:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193888256. Throughput: 0: 1676.4, 1: 1685.6. Samples: 48482304. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:13,436][61643] Avg episode reward: [(0, '25.510'), (1, '9.990')] [2023-10-12 19:45:13,588][62635] Updated weights for policy 1, policy_version 94720 (0.0008) [2023-10-12 19:45:16,545][62634] Updated weights for policy 0, policy_version 94660 (0.0009) [2023-10-12 19:45:16,915][62634] Updated weights for policy 0, policy_version 94670 (0.0009) [2023-10-12 19:45:17,305][62634] Updated weights for policy 0, policy_version 94680 (0.0007) [2023-10-12 19:45:17,813][62635] Updated weights for policy 1, policy_version 94730 (0.0009) [2023-10-12 19:45:18,181][62635] Updated weights for policy 1, policy_version 94740 (0.0009) [2023-10-12 19:45:18,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193953792. Throughput: 0: 1674.6, 1: 1671.8. Samples: 48501472. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:18,436][61643] Avg episode reward: [(0, '25.580'), (1, '9.960')] [2023-10-12 19:45:18,552][62635] Updated weights for policy 1, policy_version 94750 (0.0008) [2023-10-12 19:45:21,526][62634] Updated weights for policy 0, policy_version 94690 (0.0008) [2023-10-12 19:45:21,907][62634] Updated weights for policy 0, policy_version 94700 (0.0010) [2023-10-12 19:45:22,282][62634] Updated weights for policy 0, policy_version 94710 (0.0009) [2023-10-12 19:45:22,613][62635] Updated weights for policy 1, policy_version 94760 (0.0009) [2023-10-12 19:45:22,653][62634] Updated weights for policy 0, policy_version 94720 (0.0008) [2023-10-12 19:45:22,976][62635] Updated weights for policy 1, policy_version 94770 (0.0008) [2023-10-12 19:45:23,351][62635] Updated weights for policy 1, policy_version 94780 (0.0007) [2023-10-12 19:45:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194019328. Throughput: 0: 1681.1, 1: 1681.6. Samples: 48512244. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:23,435][61643] Avg episode reward: [(0, '25.670'), (1, '9.970')] [2023-10-12 19:45:26,483][62634] Updated weights for policy 0, policy_version 94730 (0.0007) [2023-10-12 19:45:26,866][62634] Updated weights for policy 0, policy_version 94740 (0.0007) [2023-10-12 19:45:27,235][62634] Updated weights for policy 0, policy_version 94750 (0.0008) [2023-10-12 19:45:27,501][62635] Updated weights for policy 1, policy_version 94790 (0.0008) [2023-10-12 19:45:27,866][62635] Updated weights for policy 1, policy_version 94800 (0.0007) [2023-10-12 19:45:28,235][62635] Updated weights for policy 1, policy_version 94810 (0.0007) [2023-10-12 19:45:28,435][61643] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 194084864. Throughput: 0: 1665.9, 1: 1682.4. Samples: 48532216. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:28,435][61643] Avg episode reward: [(0, '25.550'), (1, '10.070')] [2023-10-12 19:45:31,205][62634] Updated weights for policy 0, policy_version 94760 (0.0010) [2023-10-12 19:45:31,581][62634] Updated weights for policy 0, policy_version 94770 (0.0009) [2023-10-12 19:45:31,970][62634] Updated weights for policy 0, policy_version 94780 (0.0010) [2023-10-12 19:45:32,239][62635] Updated weights for policy 1, policy_version 94820 (0.0008) [2023-10-12 19:45:32,602][62635] Updated weights for policy 1, policy_version 94830 (0.0009) [2023-10-12 19:45:32,973][62635] Updated weights for policy 1, policy_version 94840 (0.0008) [2023-10-12 19:45:33,435][61643] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 194183168. Throughput: 0: 1680.7, 1: 1660.9. Samples: 48551576. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-12 19:45:33,435][61643] Avg episode reward: [(0, '25.320'), (1, '9.850')] [2023-10-12 19:45:36,098][62634] Updated weights for policy 0, policy_version 94790 (0.0009) [2023-10-12 19:45:36,474][62634] Updated weights for policy 0, policy_version 94800 (0.0008) [2023-10-12 19:45:36,853][62634] Updated weights for policy 0, policy_version 94810 (0.0009) [2023-10-12 19:45:37,084][62635] Updated weights for policy 1, policy_version 94850 (0.0007) [2023-10-12 19:45:37,455][62635] Updated weights for policy 1, policy_version 94860 (0.0010) [2023-10-12 19:45:37,826][62635] Updated weights for policy 1, policy_version 94870 (0.0007) [2023-10-12 19:45:38,193][62635] Updated weights for policy 1, policy_version 94880 (0.0007) [2023-10-12 19:45:38,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194248704. Throughput: 0: 1676.1, 1: 1681.2. Samples: 48562460. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:45:38,435][61643] Avg episode reward: [(0, '25.160'), (1, '9.890')] [2023-10-12 19:45:40,973][62634] Updated weights for policy 0, policy_version 94820 (0.0007) [2023-10-12 19:45:41,350][62634] Updated weights for policy 0, policy_version 94830 (0.0010) [2023-10-12 19:45:41,728][62634] Updated weights for policy 0, policy_version 94840 (0.0010) [2023-10-12 19:45:42,302][62635] Updated weights for policy 1, policy_version 94890 (0.0008) [2023-10-12 19:45:42,663][62635] Updated weights for policy 1, policy_version 94900 (0.0007) [2023-10-12 19:45:43,020][62635] Updated weights for policy 1, policy_version 94910 (0.0007) [2023-10-12 19:45:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 194314240. Throughput: 0: 1650.4, 1: 1680.0. Samples: 48581958. 
Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:45:43,435][61643] Avg episode reward: [(0, '25.010'), (1, '10.150')] [2023-10-12 19:45:45,718][62634] Updated weights for policy 0, policy_version 94850 (0.0008) [2023-10-12 19:45:46,110][62634] Updated weights for policy 0, policy_version 94860 (0.0010) [2023-10-12 19:45:46,485][62634] Updated weights for policy 0, policy_version 94870 (0.0007) [2023-10-12 19:45:46,862][62634] Updated weights for policy 0, policy_version 94880 (0.0007) [2023-10-12 19:45:47,082][62635] Updated weights for policy 1, policy_version 94920 (0.0008) [2023-10-12 19:45:47,451][62635] Updated weights for policy 1, policy_version 94930 (0.0009) [2023-10-12 19:45:47,822][62635] Updated weights for policy 1, policy_version 94940 (0.0007) [2023-10-12 19:45:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194379776. Throughput: 0: 1673.4, 1: 1665.0. Samples: 48601440. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:45:48,436][61643] Avg episode reward: [(0, '24.990'), (1, '9.800')] [2023-10-12 19:45:50,788][62634] Updated weights for policy 0, policy_version 94890 (0.0009) [2023-10-12 19:45:51,163][62634] Updated weights for policy 0, policy_version 94900 (0.0009) [2023-10-12 19:45:51,547][62634] Updated weights for policy 0, policy_version 94910 (0.0009) [2023-10-12 19:45:51,749][62635] Updated weights for policy 1, policy_version 94950 (0.0008) [2023-10-12 19:45:52,113][62635] Updated weights for policy 1, policy_version 94960 (0.0009) [2023-10-12 19:45:52,484][62635] Updated weights for policy 1, policy_version 94970 (0.0008) [2023-10-12 19:45:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 194445312. Throughput: 0: 1661.7, 1: 1681.0. Samples: 48612584. 
Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:45:53,435][61643] Avg episode reward: [(0, '24.950'), (1, '9.800')] [2023-10-12 19:45:55,689][62634] Updated weights for policy 0, policy_version 94920 (0.0007) [2023-10-12 19:45:56,065][62634] Updated weights for policy 0, policy_version 94930 (0.0008) [2023-10-12 19:45:56,437][62634] Updated weights for policy 0, policy_version 94940 (0.0009) [2023-10-12 19:45:56,590][62635] Updated weights for policy 1, policy_version 94980 (0.0008) [2023-10-12 19:45:56,950][62635] Updated weights for policy 1, policy_version 94990 (0.0009) [2023-10-12 19:45:57,311][62635] Updated weights for policy 1, policy_version 95000 (0.0009) [2023-10-12 19:45:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194510848. Throughput: 0: 1655.8, 1: 1671.3. Samples: 48632022. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:45:58,436][61643] Avg episode reward: [(0, '25.020'), (1, '10.070')] [2023-10-12 19:46:00,627][62634] Updated weights for policy 0, policy_version 94950 (0.0009) [2023-10-12 19:46:01,005][62634] Updated weights for policy 0, policy_version 94960 (0.0007) [2023-10-12 19:46:01,346][62635] Updated weights for policy 1, policy_version 95010 (0.0009) [2023-10-12 19:46:01,390][62634] Updated weights for policy 0, policy_version 94970 (0.0008) [2023-10-12 19:46:01,702][62635] Updated weights for policy 1, policy_version 95020 (0.0008) [2023-10-12 19:46:02,073][62635] Updated weights for policy 1, policy_version 95030 (0.0010) [2023-10-12 19:46:02,444][62635] Updated weights for policy 1, policy_version 95040 (0.0010) [2023-10-12 19:46:03,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194576384. Throughput: 0: 1669.9, 1: 1671.2. Samples: 48651824. 
Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:03,436][61643] Avg episode reward: [(0, '25.010'), (1, '10.010')] [2023-10-12 19:46:05,366][62634] Updated weights for policy 0, policy_version 94980 (0.0008) [2023-10-12 19:46:05,749][62634] Updated weights for policy 0, policy_version 94990 (0.0007) [2023-10-12 19:46:06,115][62634] Updated weights for policy 0, policy_version 95000 (0.0007) [2023-10-12 19:46:06,820][62635] Updated weights for policy 1, policy_version 95050 (0.0009) [2023-10-12 19:46:07,176][62635] Updated weights for policy 1, policy_version 95060 (0.0008) [2023-10-12 19:46:07,544][62635] Updated weights for policy 1, policy_version 95070 (0.0008) [2023-10-12 19:46:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194641920. Throughput: 0: 1661.8, 1: 1684.7. Samples: 48662834. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:08,436][61643] Avg episode reward: [(0, '24.980'), (1, '9.930')] [2023-10-12 19:46:10,194][62634] Updated weights for policy 0, policy_version 95010 (0.0008) [2023-10-12 19:46:10,574][62634] Updated weights for policy 0, policy_version 95020 (0.0009) [2023-10-12 19:46:10,955][62634] Updated weights for policy 0, policy_version 95030 (0.0009) [2023-10-12 19:46:11,330][62634] Updated weights for policy 0, policy_version 95040 (0.0010) [2023-10-12 19:46:11,575][62635] Updated weights for policy 1, policy_version 95080 (0.0009) [2023-10-12 19:46:11,945][62635] Updated weights for policy 1, policy_version 95090 (0.0009) [2023-10-12 19:46:12,308][62635] Updated weights for policy 1, policy_version 95100 (0.0009) [2023-10-12 19:46:13,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194707456. Throughput: 0: 1664.2, 1: 1665.7. Samples: 48682062. 
Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:13,436][61643] Avg episode reward: [(0, '25.060'), (1, '9.900')] [2023-10-12 19:46:15,289][62634] Updated weights for policy 0, policy_version 95050 (0.0009) [2023-10-12 19:46:15,658][62634] Updated weights for policy 0, policy_version 95060 (0.0008) [2023-10-12 19:46:16,033][62634] Updated weights for policy 0, policy_version 95070 (0.0009) [2023-10-12 19:46:16,470][62635] Updated weights for policy 1, policy_version 95110 (0.0008) [2023-10-12 19:46:16,839][62635] Updated weights for policy 1, policy_version 95120 (0.0007) [2023-10-12 19:46:17,201][62635] Updated weights for policy 1, policy_version 95130 (0.0008) [2023-10-12 19:46:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194772992. Throughput: 0: 1674.8, 1: 1670.5. Samples: 48702118. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:18,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.810')] [2023-10-12 19:46:18,448][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000095136_97419264.pth... [2023-10-12 19:46:18,448][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000095072_97353728.pth... 
[2023-10-12 19:46:18,489][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000093568_95813632.pth [2023-10-12 19:46:18,490][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000093504_95748096.pth [2023-10-12 19:46:20,200][62634] Updated weights for policy 0, policy_version 95080 (0.0009) [2023-10-12 19:46:20,577][62634] Updated weights for policy 0, policy_version 95090 (0.0007) [2023-10-12 19:46:20,960][62634] Updated weights for policy 0, policy_version 95100 (0.0007) [2023-10-12 19:46:21,387][62635] Updated weights for policy 1, policy_version 95140 (0.0008) [2023-10-12 19:46:21,751][62635] Updated weights for policy 1, policy_version 95150 (0.0010) [2023-10-12 19:46:22,109][62635] Updated weights for policy 1, policy_version 95160 (0.0010) [2023-10-12 19:46:23,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194838528. Throughput: 0: 1656.3, 1: 1677.2. Samples: 48712466. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:23,435][61643] Avg episode reward: [(0, '24.780'), (1, '9.840')] [2023-10-12 19:46:25,054][62634] Updated weights for policy 0, policy_version 95110 (0.0008) [2023-10-12 19:46:25,431][62634] Updated weights for policy 0, policy_version 95120 (0.0007) [2023-10-12 19:46:25,806][62634] Updated weights for policy 0, policy_version 95130 (0.0007) [2023-10-12 19:46:26,217][62635] Updated weights for policy 1, policy_version 95170 (0.0008) [2023-10-12 19:46:26,598][62635] Updated weights for policy 1, policy_version 95180 (0.0009) [2023-10-12 19:46:26,963][62635] Updated weights for policy 1, policy_version 95190 (0.0008) [2023-10-12 19:46:27,333][62635] Updated weights for policy 1, policy_version 95200 (0.0007) [2023-10-12 19:46:28,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194904064. Throughput: 0: 1684.2, 1: 1656.1. Samples: 48732272. 
Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:28,435][61643] Avg episode reward: [(0, '24.750'), (1, '10.020')] [2023-10-12 19:46:29,818][62634] Updated weights for policy 0, policy_version 95140 (0.0009) [2023-10-12 19:46:30,190][62634] Updated weights for policy 0, policy_version 95150 (0.0010) [2023-10-12 19:46:30,573][62634] Updated weights for policy 0, policy_version 95160 (0.0007) [2023-10-12 19:46:31,232][62635] Updated weights for policy 1, policy_version 95210 (0.0009) [2023-10-12 19:46:31,606][62635] Updated weights for policy 1, policy_version 95220 (0.0009) [2023-10-12 19:46:31,971][62635] Updated weights for policy 1, policy_version 95230 (0.0011) [2023-10-12 19:46:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194969600. Throughput: 0: 1693.4, 1: 1672.0. Samples: 48752880. Policy #0 lag: (min: 2.0, avg: 6.0, max: 34.0) [2023-10-12 19:46:33,436][61643] Avg episode reward: [(0, '24.740'), (1, '9.830')] [2023-10-12 19:46:34,640][62634] Updated weights for policy 0, policy_version 95170 (0.0009) [2023-10-12 19:46:35,031][62634] Updated weights for policy 0, policy_version 95180 (0.0009) [2023-10-12 19:46:35,398][62634] Updated weights for policy 0, policy_version 95190 (0.0008) [2023-10-12 19:46:35,778][62634] Updated weights for policy 0, policy_version 95200 (0.0007) [2023-10-12 19:46:36,074][62635] Updated weights for policy 1, policy_version 95240 (0.0008) [2023-10-12 19:46:36,444][62635] Updated weights for policy 1, policy_version 95250 (0.0008) [2023-10-12 19:46:36,803][62635] Updated weights for policy 1, policy_version 95260 (0.0008) [2023-10-12 19:46:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195035136. Throughput: 0: 1669.3, 1: 1669.2. Samples: 48762818. 
Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:46:38,435][61643] Avg episode reward: [(0, '24.580'), (1, '9.740')] [2023-10-12 19:46:39,836][62634] Updated weights for policy 0, policy_version 95210 (0.0008) [2023-10-12 19:46:40,214][62634] Updated weights for policy 0, policy_version 95220 (0.0009) [2023-10-12 19:46:40,587][62634] Updated weights for policy 0, policy_version 95230 (0.0007) [2023-10-12 19:46:40,742][62635] Updated weights for policy 1, policy_version 95270 (0.0008) [2023-10-12 19:46:41,109][62635] Updated weights for policy 1, policy_version 95280 (0.0007) [2023-10-12 19:46:41,482][62635] Updated weights for policy 1, policy_version 95290 (0.0007) [2023-10-12 19:46:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195100672. Throughput: 0: 1684.8, 1: 1659.4. Samples: 48782512. Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:46:43,435][61643] Avg episode reward: [(0, '24.710'), (1, '10.120')] [2023-10-12 19:46:44,532][62634] Updated weights for policy 0, policy_version 95240 (0.0007) [2023-10-12 19:46:44,900][62634] Updated weights for policy 0, policy_version 95250 (0.0009) [2023-10-12 19:46:45,281][62634] Updated weights for policy 0, policy_version 95260 (0.0008) [2023-10-12 19:46:45,510][62635] Updated weights for policy 1, policy_version 95300 (0.0008) [2023-10-12 19:46:45,876][62635] Updated weights for policy 1, policy_version 95310 (0.0007) [2023-10-12 19:46:46,251][62635] Updated weights for policy 1, policy_version 95320 (0.0009) [2023-10-12 19:46:48,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 195166208. Throughput: 0: 1689.6, 1: 1678.8. Samples: 48803404. 
Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:46:48,436][61643] Avg episode reward: [(0, '25.060'), (1, '9.940')] [2023-10-12 19:46:49,305][62634] Updated weights for policy 0, policy_version 95270 (0.0008) [2023-10-12 19:46:49,669][62634] Updated weights for policy 0, policy_version 95280 (0.0009) [2023-10-12 19:46:50,046][62634] Updated weights for policy 0, policy_version 95290 (0.0008) [2023-10-12 19:46:50,388][62635] Updated weights for policy 1, policy_version 95330 (0.0008) [2023-10-12 19:46:50,747][62635] Updated weights for policy 1, policy_version 95340 (0.0008) [2023-10-12 19:46:51,111][62635] Updated weights for policy 1, policy_version 95350 (0.0007) [2023-10-12 19:46:51,485][62635] Updated weights for policy 1, policy_version 95360 (0.0008) [2023-10-12 19:46:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195231744. Throughput: 0: 1672.2, 1: 1663.4. Samples: 48812936. Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:46:53,435][61643] Avg episode reward: [(0, '25.220'), (1, '9.850')] [2023-10-12 19:46:54,137][62634] Updated weights for policy 0, policy_version 95300 (0.0008) [2023-10-12 19:46:54,513][62634] Updated weights for policy 0, policy_version 95310 (0.0007) [2023-10-12 19:46:54,888][62634] Updated weights for policy 0, policy_version 95320 (0.0007) [2023-10-12 19:46:55,491][62635] Updated weights for policy 1, policy_version 95370 (0.0009) [2023-10-12 19:46:55,859][62635] Updated weights for policy 1, policy_version 95380 (0.0007) [2023-10-12 19:46:56,231][62635] Updated weights for policy 1, policy_version 95390 (0.0007) [2023-10-12 19:46:58,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195297280. Throughput: 0: 1684.4, 1: 1674.1. Samples: 48833194. 
Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:46:58,435][61643] Avg episode reward: [(0, '25.220'), (1, '10.020')] [2023-10-12 19:46:58,939][62634] Updated weights for policy 0, policy_version 95330 (0.0008) [2023-10-12 19:46:59,322][62634] Updated weights for policy 0, policy_version 95340 (0.0009) [2023-10-12 19:46:59,706][62634] Updated weights for policy 0, policy_version 95350 (0.0008) [2023-10-12 19:47:00,083][62634] Updated weights for policy 0, policy_version 95360 (0.0010) [2023-10-12 19:47:00,197][62635] Updated weights for policy 1, policy_version 95400 (0.0008) [2023-10-12 19:47:00,565][62635] Updated weights for policy 1, policy_version 95410 (0.0007) [2023-10-12 19:47:00,924][62635] Updated weights for policy 1, policy_version 95420 (0.0007) [2023-10-12 19:47:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195362816. Throughput: 0: 1685.6, 1: 1693.8. Samples: 48854188. Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:03,435][61643] Avg episode reward: [(0, '25.260'), (1, '9.930')] [2023-10-12 19:47:04,161][62634] Updated weights for policy 0, policy_version 95370 (0.0007) [2023-10-12 19:47:04,543][62634] Updated weights for policy 0, policy_version 95380 (0.0008) [2023-10-12 19:47:04,885][62635] Updated weights for policy 1, policy_version 95430 (0.0007) [2023-10-12 19:47:04,916][62634] Updated weights for policy 0, policy_version 95390 (0.0009) [2023-10-12 19:47:05,244][62635] Updated weights for policy 1, policy_version 95440 (0.0008) [2023-10-12 19:47:05,614][62635] Updated weights for policy 1, policy_version 95450 (0.0007) [2023-10-12 19:47:08,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195428352. Throughput: 0: 1680.9, 1: 1669.0. Samples: 48863212. 
Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:08,436][61643] Avg episode reward: [(0, '25.420'), (1, '9.860')] [2023-10-12 19:47:08,983][62634] Updated weights for policy 0, policy_version 95400 (0.0008) [2023-10-12 19:47:09,360][62634] Updated weights for policy 0, policy_version 95410 (0.0007) [2023-10-12 19:47:09,558][62635] Updated weights for policy 1, policy_version 95460 (0.0007) [2023-10-12 19:47:09,734][62634] Updated weights for policy 0, policy_version 95420 (0.0008) [2023-10-12 19:47:09,937][62635] Updated weights for policy 1, policy_version 95470 (0.0009) [2023-10-12 19:47:10,311][62635] Updated weights for policy 1, policy_version 95480 (0.0010) [2023-10-12 19:47:13,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195493888. Throughput: 0: 1684.0, 1: 1689.2. Samples: 48884070. Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:13,436][61643] Avg episode reward: [(0, '25.360'), (1, '9.950')] [2023-10-12 19:47:13,830][62634] Updated weights for policy 0, policy_version 95430 (0.0007) [2023-10-12 19:47:14,205][62634] Updated weights for policy 0, policy_version 95440 (0.0008) [2023-10-12 19:47:14,491][62635] Updated weights for policy 1, policy_version 95490 (0.0009) [2023-10-12 19:47:14,579][62634] Updated weights for policy 0, policy_version 95450 (0.0007) [2023-10-12 19:47:14,871][62635] Updated weights for policy 1, policy_version 95500 (0.0008) [2023-10-12 19:47:15,236][62635] Updated weights for policy 1, policy_version 95510 (0.0008) [2023-10-12 19:47:15,600][62635] Updated weights for policy 1, policy_version 95520 (0.0007) [2023-10-12 19:47:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 195559424. Throughput: 0: 1676.7, 1: 1696.2. Samples: 48904662. 
Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:18,435][61643] Avg episode reward: [(0, '25.320'), (1, '10.040')] [2023-10-12 19:47:18,500][62634] Updated weights for policy 0, policy_version 95460 (0.0007) [2023-10-12 19:47:18,886][62634] Updated weights for policy 0, policy_version 95470 (0.0009) [2023-10-12 19:47:19,252][62634] Updated weights for policy 0, policy_version 95480 (0.0008) [2023-10-12 19:47:19,848][62635] Updated weights for policy 1, policy_version 95530 (0.0007) [2023-10-12 19:47:20,216][62635] Updated weights for policy 1, policy_version 95540 (0.0011) [2023-10-12 19:47:20,581][62635] Updated weights for policy 1, policy_version 95550 (0.0009) [2023-10-12 19:47:23,431][62634] Updated weights for policy 0, policy_version 95490 (0.0008) [2023-10-12 19:47:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195624960. Throughput: 0: 1682.1, 1: 1672.4. Samples: 48913772. Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:23,436][61643] Avg episode reward: [(0, '25.030'), (1, '9.950')] [2023-10-12 19:47:23,810][62634] Updated weights for policy 0, policy_version 95500 (0.0009) [2023-10-12 19:47:24,183][62634] Updated weights for policy 0, policy_version 95510 (0.0010) [2023-10-12 19:47:24,561][62634] Updated weights for policy 0, policy_version 95520 (0.0010) [2023-10-12 19:47:24,686][62635] Updated weights for policy 1, policy_version 95560 (0.0008) [2023-10-12 19:47:25,043][62635] Updated weights for policy 1, policy_version 95570 (0.0008) [2023-10-12 19:47:25,411][62635] Updated weights for policy 1, policy_version 95580 (0.0009) [2023-10-12 19:47:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 195690496. Throughput: 0: 1681.5, 1: 1692.1. Samples: 48934322. 
Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:28,436][61643] Avg episode reward: [(0, '24.990'), (1, '10.030')] [2023-10-12 19:47:28,642][62634] Updated weights for policy 0, policy_version 95530 (0.0007) [2023-10-12 19:47:29,019][62634] Updated weights for policy 0, policy_version 95540 (0.0008) [2023-10-12 19:47:29,393][62634] Updated weights for policy 0, policy_version 95550 (0.0008) [2023-10-12 19:47:29,480][62635] Updated weights for policy 1, policy_version 95590 (0.0008) [2023-10-12 19:47:29,840][62635] Updated weights for policy 1, policy_version 95600 (0.0007) [2023-10-12 19:47:30,214][62635] Updated weights for policy 1, policy_version 95610 (0.0009) [2023-10-12 19:47:33,435][61643] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 195756032. Throughput: 0: 1683.9, 1: 1689.2. Samples: 48955196. Policy #0 lag: (min: 36.0, avg: 39.9, max: 40.0) [2023-10-12 19:47:33,436][61643] Avg episode reward: [(0, '25.020'), (1, '10.040')] [2023-10-12 19:47:33,489][62634] Updated weights for policy 0, policy_version 95560 (0.0009) [2023-10-12 19:47:33,859][62634] Updated weights for policy 0, policy_version 95570 (0.0008) [2023-10-12 19:47:34,209][62635] Updated weights for policy 1, policy_version 95620 (0.0009) [2023-10-12 19:47:34,240][62634] Updated weights for policy 0, policy_version 95580 (0.0007) [2023-10-12 19:47:34,584][62635] Updated weights for policy 1, policy_version 95630 (0.0010) [2023-10-12 19:47:34,957][62635] Updated weights for policy 1, policy_version 95640 (0.0008) [2023-10-12 19:47:38,369][62634] Updated weights for policy 0, policy_version 95590 (0.0008) [2023-10-12 19:47:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195821568. Throughput: 0: 1683.7, 1: 1678.5. Samples: 48964236. 
Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:47:38,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.860')] [2023-10-12 19:47:38,754][62634] Updated weights for policy 0, policy_version 95600 (0.0008) [2023-10-12 19:47:38,991][62635] Updated weights for policy 1, policy_version 95650 (0.0007) [2023-10-12 19:47:39,125][62634] Updated weights for policy 0, policy_version 95610 (0.0008) [2023-10-12 19:47:39,361][62635] Updated weights for policy 1, policy_version 95660 (0.0007) [2023-10-12 19:47:39,726][62635] Updated weights for policy 1, policy_version 95670 (0.0009) [2023-10-12 19:47:40,092][62635] Updated weights for policy 1, policy_version 95680 (0.0009) [2023-10-12 19:47:43,051][62634] Updated weights for policy 0, policy_version 95620 (0.0008) [2023-10-12 19:47:43,422][62634] Updated weights for policy 0, policy_version 95630 (0.0009) [2023-10-12 19:47:43,435][61643] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195887104. Throughput: 0: 1684.8, 1: 1688.0. Samples: 48984972. Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:47:43,435][61643] Avg episode reward: [(0, '25.080'), (1, '10.050')] [2023-10-12 19:47:43,805][62634] Updated weights for policy 0, policy_version 95640 (0.0011) [2023-10-12 19:47:44,302][62635] Updated weights for policy 1, policy_version 95690 (0.0009) [2023-10-12 19:47:44,667][62635] Updated weights for policy 1, policy_version 95700 (0.0007) [2023-10-12 19:47:45,033][62635] Updated weights for policy 1, policy_version 95710 (0.0007) [2023-10-12 19:47:47,757][62634] Updated weights for policy 0, policy_version 95650 (0.0008) [2023-10-12 19:47:48,137][62634] Updated weights for policy 0, policy_version 95660 (0.0009) [2023-10-12 19:47:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 195952640. Throughput: 0: 1681.1, 1: 1681.6. Samples: 49005512. 
Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:47:48,435][61643] Avg episode reward: [(0, '25.060'), (1, '10.050')] [2023-10-12 19:47:48,504][62634] Updated weights for policy 0, policy_version 95670 (0.0008) [2023-10-12 19:47:48,885][62634] Updated weights for policy 0, policy_version 95680 (0.0008) [2023-10-12 19:47:49,031][62635] Updated weights for policy 1, policy_version 95720 (0.0009) [2023-10-12 19:47:49,397][62635] Updated weights for policy 1, policy_version 95730 (0.0008) [2023-10-12 19:47:49,761][62635] Updated weights for policy 1, policy_version 95740 (0.0009) [2023-10-12 19:47:52,791][62634] Updated weights for policy 0, policy_version 95690 (0.0008) [2023-10-12 19:47:53,172][62634] Updated weights for policy 0, policy_version 95700 (0.0008) [2023-10-12 19:47:53,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 196018176. Throughput: 0: 1690.0, 1: 1681.6. Samples: 49014938. Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:47:53,436][61643] Avg episode reward: [(0, '25.110'), (1, '9.960')] [2023-10-12 19:47:53,550][62634] Updated weights for policy 0, policy_version 95710 (0.0009) [2023-10-12 19:47:53,908][62635] Updated weights for policy 1, policy_version 95750 (0.0010) [2023-10-12 19:47:54,276][62635] Updated weights for policy 1, policy_version 95760 (0.0008) [2023-10-12 19:47:54,648][62635] Updated weights for policy 1, policy_version 95770 (0.0008) [2023-10-12 19:47:57,554][62634] Updated weights for policy 0, policy_version 95720 (0.0010) [2023-10-12 19:47:57,925][62634] Updated weights for policy 0, policy_version 95730 (0.0009) [2023-10-12 19:47:58,298][62634] Updated weights for policy 0, policy_version 95740 (0.0011) [2023-10-12 19:47:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 196083712. Throughput: 0: 1688.0, 1: 1681.7. Samples: 49035706. 
Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:47:58,435][61643] Avg episode reward: [(0, '25.110'), (1, '10.060')] [2023-10-12 19:47:58,739][62635] Updated weights for policy 1, policy_version 95780 (0.0008) [2023-10-12 19:47:59,105][62635] Updated weights for policy 1, policy_version 95790 (0.0010) [2023-10-12 19:47:59,470][62635] Updated weights for policy 1, policy_version 95800 (0.0010) [2023-10-12 19:48:02,279][62634] Updated weights for policy 0, policy_version 95750 (0.0010) [2023-10-12 19:48:02,662][62634] Updated weights for policy 0, policy_version 95760 (0.0007) [2023-10-12 19:48:03,044][62634] Updated weights for policy 0, policy_version 95770 (0.0009) [2023-10-12 19:48:03,435][61643] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196182016. Throughput: 0: 1671.9, 1: 1683.9. Samples: 49055672. Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:48:03,435][61643] Avg episode reward: [(0, '24.920'), (1, '9.900')] [2023-10-12 19:48:03,523][62635] Updated weights for policy 1, policy_version 95810 (0.0010) [2023-10-12 19:48:03,889][62635] Updated weights for policy 1, policy_version 95820 (0.0007) [2023-10-12 19:48:04,249][62635] Updated weights for policy 1, policy_version 95830 (0.0007) [2023-10-12 19:48:04,619][62635] Updated weights for policy 1, policy_version 95840 (0.0007) [2023-10-12 19:48:07,204][62634] Updated weights for policy 0, policy_version 95780 (0.0008) [2023-10-12 19:48:07,578][62634] Updated weights for policy 0, policy_version 95790 (0.0009) [2023-10-12 19:48:07,947][62634] Updated weights for policy 0, policy_version 95800 (0.0008) [2023-10-12 19:48:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196247552. Throughput: 0: 1691.4, 1: 1686.0. Samples: 49065756. 
Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:48:08,435][61643] Avg episode reward: [(0, '24.830'), (1, '9.820')] [2023-10-12 19:48:08,698][62635] Updated weights for policy 1, policy_version 95850 (0.0008) [2023-10-12 19:48:09,067][62635] Updated weights for policy 1, policy_version 95860 (0.0007) [2023-10-12 19:48:09,430][62635] Updated weights for policy 1, policy_version 95870 (0.0007) [2023-10-12 19:48:11,910][62634] Updated weights for policy 0, policy_version 95810 (0.0010) [2023-10-12 19:48:12,299][62634] Updated weights for policy 0, policy_version 95820 (0.0010) [2023-10-12 19:48:12,681][62634] Updated weights for policy 0, policy_version 95830 (0.0009) [2023-10-12 19:48:13,051][62634] Updated weights for policy 0, policy_version 95840 (0.0009) [2023-10-12 19:48:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196313088. Throughput: 0: 1692.0, 1: 1689.3. Samples: 49086480. Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:48:13,435][61643] Avg episode reward: [(0, '24.760'), (1, '9.760')] [2023-10-12 19:48:13,455][62635] Updated weights for policy 1, policy_version 95880 (0.0008) [2023-10-12 19:48:13,823][62635] Updated weights for policy 1, policy_version 95890 (0.0009) [2023-10-12 19:48:14,198][62635] Updated weights for policy 1, policy_version 95900 (0.0009) [2023-10-12 19:48:17,180][62634] Updated weights for policy 0, policy_version 95850 (0.0009) [2023-10-12 19:48:17,558][62634] Updated weights for policy 0, policy_version 95860 (0.0009) [2023-10-12 19:48:17,934][62634] Updated weights for policy 0, policy_version 95870 (0.0009) [2023-10-12 19:48:18,122][62635] Updated weights for policy 1, policy_version 95910 (0.0007) [2023-10-12 19:48:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196378624. Throughput: 0: 1663.6, 1: 1686.3. Samples: 49105940. 
Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:48:18,436][61643] Avg episode reward: [(0, '24.730'), (1, '9.730')] [2023-10-12 19:48:18,444][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000095872_98172928.pth... [2023-10-12 19:48:18,480][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000094304_96567296.pth [2023-10-12 19:48:18,489][62635] Updated weights for policy 1, policy_version 95920 (0.0007) [2023-10-12 19:48:18,854][62635] Updated weights for policy 1, policy_version 95930 (0.0008) [2023-10-12 19:48:19,067][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000095936_98238464.pth... [2023-10-12 19:48:19,106][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000094336_96600064.pth [2023-10-12 19:48:22,035][62634] Updated weights for policy 0, policy_version 95880 (0.0009) [2023-10-12 19:48:22,420][62634] Updated weights for policy 0, policy_version 95890 (0.0010) [2023-10-12 19:48:22,781][62634] Updated weights for policy 0, policy_version 95900 (0.0010) [2023-10-12 19:48:23,049][62635] Updated weights for policy 1, policy_version 95940 (0.0007) [2023-10-12 19:48:23,424][62635] Updated weights for policy 1, policy_version 95950 (0.0007) [2023-10-12 19:48:23,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196444160. Throughput: 0: 1690.3, 1: 1686.7. Samples: 49116202. 
Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:48:23,436][61643] Avg episode reward: [(0, '24.690'), (1, '9.900')] [2023-10-12 19:48:23,784][62635] Updated weights for policy 1, policy_version 95960 (0.0009) [2023-10-12 19:48:26,784][62634] Updated weights for policy 0, policy_version 95910 (0.0007) [2023-10-12 19:48:27,153][62634] Updated weights for policy 0, policy_version 95920 (0.0007) [2023-10-12 19:48:27,527][62634] Updated weights for policy 0, policy_version 95930 (0.0008) [2023-10-12 19:48:27,672][62635] Updated weights for policy 1, policy_version 95970 (0.0009) [2023-10-12 19:48:28,087][62635] Updated weights for policy 1, policy_version 95980 (0.0008) [2023-10-12 19:48:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196509696. Throughput: 0: 1684.6, 1: 1689.7. Samples: 49136818. Policy #0 lag: (min: 2.0, avg: 6.5, max: 34.0) [2023-10-12 19:48:28,436][61643] Avg episode reward: [(0, '24.720'), (1, '9.720')] [2023-10-12 19:48:28,451][62635] Updated weights for policy 1, policy_version 95990 (0.0009) [2023-10-12 19:48:28,820][62635] Updated weights for policy 1, policy_version 96000 (0.0009) [2023-10-12 19:48:31,650][62634] Updated weights for policy 0, policy_version 95940 (0.0009) [2023-10-12 19:48:32,033][62634] Updated weights for policy 0, policy_version 95950 (0.0011) [2023-10-12 19:48:32,398][62634] Updated weights for policy 0, policy_version 95960 (0.0011) [2023-10-12 19:48:32,891][62635] Updated weights for policy 1, policy_version 96010 (0.0008) [2023-10-12 19:48:33,258][62635] Updated weights for policy 1, policy_version 96020 (0.0008) [2023-10-12 19:48:33,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196575232. Throughput: 0: 1661.5, 1: 1678.8. Samples: 49155822. 
Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:48:33,435][61643] Avg episode reward: [(0, '24.660'), (1, '9.700')] [2023-10-12 19:48:33,626][62635] Updated weights for policy 1, policy_version 96030 (0.0010) [2023-10-12 19:48:36,551][62634] Updated weights for policy 0, policy_version 95970 (0.0009) [2023-10-12 19:48:36,920][62634] Updated weights for policy 0, policy_version 95980 (0.0007) [2023-10-12 19:48:37,293][62634] Updated weights for policy 0, policy_version 95990 (0.0008) [2023-10-12 19:48:37,672][62634] Updated weights for policy 0, policy_version 96000 (0.0009) [2023-10-12 19:48:37,756][62635] Updated weights for policy 1, policy_version 96040 (0.0009) [2023-10-12 19:48:38,131][62635] Updated weights for policy 1, policy_version 96050 (0.0007) [2023-10-12 19:48:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196640768. Throughput: 0: 1685.7, 1: 1690.4. Samples: 49166864. Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:48:38,436][61643] Avg episode reward: [(0, '24.620'), (1, '9.870')] [2023-10-12 19:48:38,495][62635] Updated weights for policy 1, policy_version 96060 (0.0009) [2023-10-12 19:48:41,828][62634] Updated weights for policy 0, policy_version 96010 (0.0009) [2023-10-12 19:48:42,203][62634] Updated weights for policy 0, policy_version 96020 (0.0008) [2023-10-12 19:48:42,578][62635] Updated weights for policy 1, policy_version 96070 (0.0008) [2023-10-12 19:48:42,587][62634] Updated weights for policy 0, policy_version 96030 (0.0007) [2023-10-12 19:48:42,938][62635] Updated weights for policy 1, policy_version 96080 (0.0008) [2023-10-12 19:48:43,304][62635] Updated weights for policy 1, policy_version 96090 (0.0009) [2023-10-12 19:48:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196706304. Throughput: 0: 1672.3, 1: 1687.8. Samples: 49186908. 
Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:48:43,435][61643] Avg episode reward: [(0, '24.790'), (1, '9.920')] [2023-10-12 19:48:46,529][62634] Updated weights for policy 0, policy_version 96040 (0.0007) [2023-10-12 19:48:46,899][62634] Updated weights for policy 0, policy_version 96050 (0.0009) [2023-10-12 19:48:47,282][62634] Updated weights for policy 0, policy_version 96060 (0.0010) [2023-10-12 19:48:47,390][62635] Updated weights for policy 1, policy_version 96100 (0.0008) [2023-10-12 19:48:47,766][62635] Updated weights for policy 1, policy_version 96110 (0.0009) [2023-10-12 19:48:48,124][62635] Updated weights for policy 1, policy_version 96120 (0.0010) [2023-10-12 19:48:48,435][61643] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 196804608. Throughput: 0: 1676.0, 1: 1669.0. Samples: 49206198. Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:48:48,436][61643] Avg episode reward: [(0, '25.050'), (1, '9.860')] [2023-10-12 19:48:51,373][62634] Updated weights for policy 0, policy_version 96070 (0.0008) [2023-10-12 19:48:51,742][62634] Updated weights for policy 0, policy_version 96080 (0.0007) [2023-10-12 19:48:52,119][62634] Updated weights for policy 0, policy_version 96090 (0.0008) [2023-10-12 19:48:52,173][62635] Updated weights for policy 1, policy_version 96130 (0.0011) [2023-10-12 19:48:52,541][62635] Updated weights for policy 1, policy_version 96140 (0.0008) [2023-10-12 19:48:52,915][62635] Updated weights for policy 1, policy_version 96150 (0.0009) [2023-10-12 19:48:53,281][62635] Updated weights for policy 1, policy_version 96160 (0.0009) [2023-10-12 19:48:53,435][61643] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 196870144. Throughput: 0: 1685.5, 1: 1681.5. Samples: 49217270. 
Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:48:53,435][61643] Avg episode reward: [(0, '25.180'), (1, '10.050')] [2023-10-12 19:48:56,182][62634] Updated weights for policy 0, policy_version 96100 (0.0009) [2023-10-12 19:48:56,555][62634] Updated weights for policy 0, policy_version 96110 (0.0007) [2023-10-12 19:48:56,935][62634] Updated weights for policy 0, policy_version 96120 (0.0008) [2023-10-12 19:48:57,348][62635] Updated weights for policy 1, policy_version 96170 (0.0008) [2023-10-12 19:48:57,726][62635] Updated weights for policy 1, policy_version 96180 (0.0010) [2023-10-12 19:48:58,099][62635] Updated weights for policy 1, policy_version 96190 (0.0008) [2023-10-12 19:48:58,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 196935680. Throughput: 0: 1666.9, 1: 1678.6. Samples: 49237028. Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:48:58,436][61643] Avg episode reward: [(0, '24.910'), (1, '9.940')] [2023-10-12 19:49:00,842][62634] Updated weights for policy 0, policy_version 96130 (0.0009) [2023-10-12 19:49:01,221][62634] Updated weights for policy 0, policy_version 96140 (0.0008) [2023-10-12 19:49:01,596][62634] Updated weights for policy 0, policy_version 96150 (0.0007) [2023-10-12 19:49:01,971][62634] Updated weights for policy 0, policy_version 96160 (0.0007) [2023-10-12 19:49:02,196][62635] Updated weights for policy 1, policy_version 96200 (0.0011) [2023-10-12 19:49:02,563][62635] Updated weights for policy 1, policy_version 96210 (0.0010) [2023-10-12 19:49:02,940][62635] Updated weights for policy 1, policy_version 96220 (0.0010) [2023-10-12 19:49:03,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197001216. Throughput: 0: 1686.6, 1: 1655.0. Samples: 49256310. 
Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:49:03,436][61643] Avg episode reward: [(0, '25.060'), (1, '9.840')] [2023-10-12 19:49:06,180][62634] Updated weights for policy 0, policy_version 96170 (0.0010) [2023-10-12 19:49:06,556][62634] Updated weights for policy 0, policy_version 96180 (0.0009) [2023-10-12 19:49:06,934][62634] Updated weights for policy 0, policy_version 96190 (0.0007) [2023-10-12 19:49:07,091][62635] Updated weights for policy 1, policy_version 96230 (0.0008) [2023-10-12 19:49:07,454][62635] Updated weights for policy 1, policy_version 96240 (0.0008) [2023-10-12 19:49:07,822][62635] Updated weights for policy 1, policy_version 96250 (0.0008) [2023-10-12 19:49:08,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197066752. Throughput: 0: 1686.0, 1: 1677.8. Samples: 49267576. Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:49:08,436][61643] Avg episode reward: [(0, '24.930'), (1, '10.000')] [2023-10-12 19:49:11,023][62634] Updated weights for policy 0, policy_version 96200 (0.0009) [2023-10-12 19:49:11,400][62634] Updated weights for policy 0, policy_version 96210 (0.0008) [2023-10-12 19:49:11,788][62634] Updated weights for policy 0, policy_version 96220 (0.0007) [2023-10-12 19:49:11,915][62635] Updated weights for policy 1, policy_version 96260 (0.0009) [2023-10-12 19:49:12,279][62635] Updated weights for policy 1, policy_version 96270 (0.0007) [2023-10-12 19:49:12,646][62635] Updated weights for policy 1, policy_version 96280 (0.0008) [2023-10-12 19:49:13,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197132288. Throughput: 0: 1664.2, 1: 1669.0. Samples: 49286810. 
Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:49:13,436][61643] Avg episode reward: [(0, '25.000'), (1, '9.910')] [2023-10-12 19:49:15,772][62634] Updated weights for policy 0, policy_version 96230 (0.0010) [2023-10-12 19:49:16,160][62634] Updated weights for policy 0, policy_version 96240 (0.0009) [2023-10-12 19:49:16,523][62634] Updated weights for policy 0, policy_version 96250 (0.0008) [2023-10-12 19:49:16,771][62635] Updated weights for policy 1, policy_version 96290 (0.0010) [2023-10-12 19:49:17,175][62635] Updated weights for policy 1, policy_version 96300 (0.0011) [2023-10-12 19:49:17,543][62635] Updated weights for policy 1, policy_version 96310 (0.0009) [2023-10-12 19:49:17,910][62635] Updated weights for policy 1, policy_version 96320 (0.0008) [2023-10-12 19:49:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197197824. Throughput: 0: 1686.9, 1: 1661.0. Samples: 49306480. Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:49:18,436][61643] Avg episode reward: [(0, '25.090'), (1, '9.910')] [2023-10-12 19:49:20,518][62634] Updated weights for policy 0, policy_version 96260 (0.0008) [2023-10-12 19:49:20,893][62634] Updated weights for policy 0, policy_version 96270 (0.0007) [2023-10-12 19:49:21,279][62634] Updated weights for policy 0, policy_version 96280 (0.0008) [2023-10-12 19:49:21,912][62635] Updated weights for policy 1, policy_version 96330 (0.0009) [2023-10-12 19:49:22,271][62635] Updated weights for policy 1, policy_version 96340 (0.0009) [2023-10-12 19:49:22,641][62635] Updated weights for policy 1, policy_version 96350 (0.0007) [2023-10-12 19:49:23,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 197263360. Throughput: 0: 1671.0, 1: 1677.1. Samples: 49317528. 
Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:49:23,435][61643] Avg episode reward: [(0, '25.320'), (1, '9.900')] [2023-10-12 19:49:25,202][62634] Updated weights for policy 0, policy_version 96290 (0.0009) [2023-10-12 19:49:25,569][62634] Updated weights for policy 0, policy_version 96300 (0.0008) [2023-10-12 19:49:25,937][62634] Updated weights for policy 0, policy_version 96310 (0.0009) [2023-10-12 19:49:26,310][62634] Updated weights for policy 0, policy_version 96320 (0.0011) [2023-10-12 19:49:26,744][62635] Updated weights for policy 1, policy_version 96360 (0.0011) [2023-10-12 19:49:27,117][62635] Updated weights for policy 1, policy_version 96370 (0.0008) [2023-10-12 19:49:27,484][62635] Updated weights for policy 1, policy_version 96380 (0.0008) [2023-10-12 19:49:28,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 197328896. Throughput: 0: 1667.1, 1: 1667.6. Samples: 49336968. Policy #0 lag: (min: 8.0, avg: 32.8, max: 40.0) [2023-10-12 19:49:28,436][61643] Avg episode reward: [(0, '25.320'), (1, '10.100')] [2023-10-12 19:49:30,375][62634] Updated weights for policy 0, policy_version 96330 (0.0010) [2023-10-12 19:49:30,742][62634] Updated weights for policy 0, policy_version 96340 (0.0008) [2023-10-12 19:49:31,118][62634] Updated weights for policy 0, policy_version 96350 (0.0009) [2023-10-12 19:49:31,403][62635] Updated weights for policy 1, policy_version 96390 (0.0009) [2023-10-12 19:49:31,765][62635] Updated weights for policy 1, policy_version 96400 (0.0010) [2023-10-12 19:49:32,125][62635] Updated weights for policy 1, policy_version 96410 (0.0010) [2023-10-12 19:49:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 197394432. Throughput: 0: 1682.2, 1: 1674.5. Samples: 49357248. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:49:33,435][61643] Avg episode reward: [(0, '25.380'), (1, '9.920')] [2023-10-12 19:49:35,133][62634] Updated weights for policy 0, policy_version 96360 (0.0010) [2023-10-12 19:49:35,503][62634] Updated weights for policy 0, policy_version 96370 (0.0009) [2023-10-12 19:49:35,885][62634] Updated weights for policy 0, policy_version 96380 (0.0009) [2023-10-12 19:49:36,204][62635] Updated weights for policy 1, policy_version 96420 (0.0008) [2023-10-12 19:49:36,567][62635] Updated weights for policy 1, policy_version 96430 (0.0007) [2023-10-12 19:49:36,941][62635] Updated weights for policy 1, policy_version 96440 (0.0007) [2023-10-12 19:49:38,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 197459968. Throughput: 0: 1656.0, 1: 1688.3. Samples: 49367768. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:49:38,436][61643] Avg episode reward: [(0, '25.420'), (1, '10.040')] [2023-10-12 19:49:40,039][62634] Updated weights for policy 0, policy_version 96390 (0.0008) [2023-10-12 19:49:40,412][62634] Updated weights for policy 0, policy_version 96400 (0.0007) [2023-10-12 19:49:40,795][62634] Updated weights for policy 0, policy_version 96410 (0.0007) [2023-10-12 19:49:40,933][62635] Updated weights for policy 1, policy_version 96450 (0.0007) [2023-10-12 19:49:41,298][62635] Updated weights for policy 1, policy_version 96460 (0.0007) [2023-10-12 19:49:41,672][62635] Updated weights for policy 1, policy_version 96470 (0.0007) [2023-10-12 19:49:42,038][62635] Updated weights for policy 1, policy_version 96480 (0.0010) [2023-10-12 19:49:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 197525504. Throughput: 0: 1672.6, 1: 1661.3. Samples: 49387056. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:49:43,435][61643] Avg episode reward: [(0, '25.400'), (1, '10.020')] [2023-10-12 19:49:45,111][62634] Updated weights for policy 0, policy_version 96420 (0.0008) [2023-10-12 19:49:45,517][62634] Updated weights for policy 0, policy_version 96430 (0.0009) [2023-10-12 19:49:45,891][62634] Updated weights for policy 0, policy_version 96440 (0.0008) [2023-10-12 19:49:46,055][62635] Updated weights for policy 1, policy_version 96490 (0.0007) [2023-10-12 19:49:46,412][62635] Updated weights for policy 1, policy_version 96500 (0.0010) [2023-10-12 19:49:46,790][62635] Updated weights for policy 1, policy_version 96510 (0.0008) [2023-10-12 19:49:48,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197591040. Throughput: 0: 1670.8, 1: 1687.6. Samples: 49407440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:49:48,435][61643] Avg episode reward: [(0, '25.400'), (1, '10.030')] [2023-10-12 19:49:50,017][62634] Updated weights for policy 0, policy_version 96450 (0.0008) [2023-10-12 19:49:50,383][62634] Updated weights for policy 0, policy_version 96460 (0.0007) [2023-10-12 19:49:50,757][62634] Updated weights for policy 0, policy_version 96470 (0.0008) [2023-10-12 19:49:50,899][62635] Updated weights for policy 1, policy_version 96520 (0.0012) [2023-10-12 19:49:51,137][62634] Updated weights for policy 0, policy_version 96480 (0.0007) [2023-10-12 19:49:51,272][62635] Updated weights for policy 1, policy_version 96530 (0.0008) [2023-10-12 19:49:51,638][62635] Updated weights for policy 1, policy_version 96540 (0.0008) [2023-10-12 19:49:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197656576. Throughput: 0: 1646.0, 1: 1682.8. Samples: 49417374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:49:53,435][61643] Avg episode reward: [(0, '25.330'), (1, '10.040')] [2023-10-12 19:49:55,212][62634] Updated weights for policy 0, policy_version 96490 (0.0009) [2023-10-12 19:49:55,587][62634] Updated weights for policy 0, policy_version 96500 (0.0009) [2023-10-12 19:49:55,771][62635] Updated weights for policy 1, policy_version 96550 (0.0008) [2023-10-12 19:49:55,967][62634] Updated weights for policy 0, policy_version 96510 (0.0008) [2023-10-12 19:49:56,135][62635] Updated weights for policy 1, policy_version 96560 (0.0008) [2023-10-12 19:49:56,503][62635] Updated weights for policy 1, policy_version 96570 (0.0009) [2023-10-12 19:49:58,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197722112. Throughput: 0: 1669.0, 1: 1665.7. Samples: 49436870. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:49:58,436][61643] Avg episode reward: [(0, '25.400'), (1, '10.040')] [2023-10-12 19:49:59,968][62634] Updated weights for policy 0, policy_version 96520 (0.0009) [2023-10-12 19:50:00,331][62634] Updated weights for policy 0, policy_version 96530 (0.0008) [2023-10-12 19:50:00,513][62635] Updated weights for policy 1, policy_version 96580 (0.0008) [2023-10-12 19:50:00,703][62634] Updated weights for policy 0, policy_version 96540 (0.0008) [2023-10-12 19:50:00,875][62635] Updated weights for policy 1, policy_version 96590 (0.0009) [2023-10-12 19:50:01,248][62635] Updated weights for policy 1, policy_version 96600 (0.0008) [2023-10-12 19:50:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197787648. Throughput: 0: 1674.8, 1: 1687.4. Samples: 49457780. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:50:03,436][61643] Avg episode reward: [(0, '25.350'), (1, '10.040')] [2023-10-12 19:50:04,754][62634] Updated weights for policy 0, policy_version 96550 (0.0008) [2023-10-12 19:50:05,124][62634] Updated weights for policy 0, policy_version 96560 (0.0010) [2023-10-12 19:50:05,501][62634] Updated weights for policy 0, policy_version 96570 (0.0009) [2023-10-12 19:50:05,527][62635] Updated weights for policy 1, policy_version 96610 (0.0008) [2023-10-12 19:50:05,898][62635] Updated weights for policy 1, policy_version 96620 (0.0007) [2023-10-12 19:50:06,264][62635] Updated weights for policy 1, policy_version 96630 (0.0007) [2023-10-12 19:50:06,627][62635] Updated weights for policy 1, policy_version 96640 (0.0008) [2023-10-12 19:50:08,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197853184. Throughput: 0: 1659.7, 1: 1672.8. Samples: 49467490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:50:08,435][61643] Avg episode reward: [(0, '25.410'), (1, '9.820')] [2023-10-12 19:50:09,796][62634] Updated weights for policy 0, policy_version 96580 (0.0008) [2023-10-12 19:50:10,185][62634] Updated weights for policy 0, policy_version 96590 (0.0007) [2023-10-12 19:50:10,558][62634] Updated weights for policy 0, policy_version 96600 (0.0009) [2023-10-12 19:50:10,697][62635] Updated weights for policy 1, policy_version 96650 (0.0009) [2023-10-12 19:50:11,063][62635] Updated weights for policy 1, policy_version 96660 (0.0007) [2023-10-12 19:50:11,424][62635] Updated weights for policy 1, policy_version 96670 (0.0007) [2023-10-12 19:50:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197918720. Throughput: 0: 1668.9, 1: 1673.4. Samples: 49487374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:50:13,435][61643] Avg episode reward: [(0, '25.610'), (1, '9.660')] [2023-10-12 19:50:14,510][62634] Updated weights for policy 0, policy_version 96610 (0.0008) [2023-10-12 19:50:14,893][62634] Updated weights for policy 0, policy_version 96620 (0.0008) [2023-10-12 19:50:15,273][62634] Updated weights for policy 0, policy_version 96630 (0.0010) [2023-10-12 19:50:15,438][62635] Updated weights for policy 1, policy_version 96680 (0.0009) [2023-10-12 19:50:15,667][62634] Updated weights for policy 0, policy_version 96640 (0.0009) [2023-10-12 19:50:15,804][62635] Updated weights for policy 1, policy_version 96690 (0.0009) [2023-10-12 19:50:16,178][62635] Updated weights for policy 1, policy_version 96700 (0.0009) [2023-10-12 19:50:18,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197984256. Throughput: 0: 1666.7, 1: 1682.4. Samples: 49507956. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:50:18,436][61643] Avg episode reward: [(0, '25.770'), (1, '9.850')] [2023-10-12 19:50:18,445][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000096704_99024896.pth... [2023-10-12 19:50:18,446][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000096640_98959360.pth... [2023-10-12 19:50:18,477][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000095136_97419264.pth [2023-10-12 19:50:18,483][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000095072_97353728.pth [2023-10-12 19:50:18,486][62354] Saving new best policy, reward=25.770! 
[2023-10-12 19:50:19,877][62634] Updated weights for policy 0, policy_version 96650 (0.0008) [2023-10-12 19:50:20,253][62634] Updated weights for policy 0, policy_version 96660 (0.0008) [2023-10-12 19:50:20,289][62635] Updated weights for policy 1, policy_version 96710 (0.0007) [2023-10-12 19:50:20,626][62634] Updated weights for policy 0, policy_version 96670 (0.0008) [2023-10-12 19:50:20,655][62635] Updated weights for policy 1, policy_version 96720 (0.0008) [2023-10-12 19:50:21,035][62635] Updated weights for policy 1, policy_version 96730 (0.0007) [2023-10-12 19:50:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 198049792. Throughput: 0: 1662.6, 1: 1658.4. Samples: 49517212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:50:23,435][61643] Avg episode reward: [(0, '25.620'), (1, '9.890')] [2023-10-12 19:50:24,656][62634] Updated weights for policy 0, policy_version 96680 (0.0008) [2023-10-12 19:50:25,020][62634] Updated weights for policy 0, policy_version 96690 (0.0009) [2023-10-12 19:50:25,121][62635] Updated weights for policy 1, policy_version 96740 (0.0007) [2023-10-12 19:50:25,394][62634] Updated weights for policy 0, policy_version 96700 (0.0007) [2023-10-12 19:50:25,488][62635] Updated weights for policy 1, policy_version 96750 (0.0007) [2023-10-12 19:50:25,861][62635] Updated weights for policy 1, policy_version 96760 (0.0007) [2023-10-12 19:50:28,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198115328. Throughput: 0: 1670.5, 1: 1681.2. Samples: 49537884. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:50:28,435][61643] Avg episode reward: [(0, '25.580'), (1, '9.710')] [2023-10-12 19:50:29,494][62634] Updated weights for policy 0, policy_version 96710 (0.0007) [2023-10-12 19:50:29,871][62634] Updated weights for policy 0, policy_version 96720 (0.0009) [2023-10-12 19:50:29,933][62635] Updated weights for policy 1, policy_version 96770 (0.0008) [2023-10-12 19:50:30,249][62634] Updated weights for policy 0, policy_version 96730 (0.0007) [2023-10-12 19:50:30,299][62635] Updated weights for policy 1, policy_version 96780 (0.0008) [2023-10-12 19:50:30,664][62635] Updated weights for policy 1, policy_version 96790 (0.0010) [2023-10-12 19:50:31,036][62635] Updated weights for policy 1, policy_version 96800 (0.0009) [2023-10-12 19:50:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198180864. Throughput: 0: 1674.6, 1: 1682.1. Samples: 49558490. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:50:33,435][61643] Avg episode reward: [(0, '25.620'), (1, '9.680')] [2023-10-12 19:50:34,497][62634] Updated weights for policy 0, policy_version 96740 (0.0007) [2023-10-12 19:50:34,885][62634] Updated weights for policy 0, policy_version 96750 (0.0007) [2023-10-12 19:50:35,004][62635] Updated weights for policy 1, policy_version 96810 (0.0007) [2023-10-12 19:50:35,261][62634] Updated weights for policy 0, policy_version 96760 (0.0008) [2023-10-12 19:50:35,365][62635] Updated weights for policy 1, policy_version 96820 (0.0007) [2023-10-12 19:50:35,727][62635] Updated weights for policy 1, policy_version 96830 (0.0007) [2023-10-12 19:50:38,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198246400. Throughput: 0: 1669.0, 1: 1664.1. Samples: 49567364. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:50:38,436][61643] Avg episode reward: [(0, '25.710'), (1, '9.750')] [2023-10-12 19:50:39,419][62634] Updated weights for policy 0, policy_version 96770 (0.0009) [2023-10-12 19:50:39,790][62634] Updated weights for policy 0, policy_version 96780 (0.0010) [2023-10-12 19:50:39,790][62635] Updated weights for policy 1, policy_version 96840 (0.0008) [2023-10-12 19:50:40,157][62635] Updated weights for policy 1, policy_version 96850 (0.0008) [2023-10-12 19:50:40,166][62634] Updated weights for policy 0, policy_version 96790 (0.0007) [2023-10-12 19:50:40,531][62635] Updated weights for policy 1, policy_version 96860 (0.0008) [2023-10-12 19:50:40,547][62634] Updated weights for policy 0, policy_version 96800 (0.0009) [2023-10-12 19:50:43,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198311936. Throughput: 0: 1670.0, 1: 1685.7. Samples: 49587876. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:50:43,435][61643] Avg episode reward: [(0, '25.610'), (1, '9.820')] [2023-10-12 19:50:44,437][62634] Updated weights for policy 0, policy_version 96810 (0.0008) [2023-10-12 19:50:44,589][62635] Updated weights for policy 1, policy_version 96870 (0.0008) [2023-10-12 19:50:44,806][62634] Updated weights for policy 0, policy_version 96820 (0.0007) [2023-10-12 19:50:44,960][62635] Updated weights for policy 1, policy_version 96880 (0.0007) [2023-10-12 19:50:45,177][62634] Updated weights for policy 0, policy_version 96830 (0.0008) [2023-10-12 19:50:45,328][62635] Updated weights for policy 1, policy_version 96890 (0.0009) [2023-10-12 19:50:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198377472. Throughput: 0: 1662.8, 1: 1687.3. Samples: 49608536. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:50:48,436][61643] Avg episode reward: [(0, '25.670'), (1, '9.630')] [2023-10-12 19:50:49,394][62635] Updated weights for policy 1, policy_version 96900 (0.0008) [2023-10-12 19:50:49,407][62634] Updated weights for policy 0, policy_version 96840 (0.0008) [2023-10-12 19:50:49,755][62635] Updated weights for policy 1, policy_version 96910 (0.0008) [2023-10-12 19:50:49,789][62634] Updated weights for policy 0, policy_version 96850 (0.0009) [2023-10-12 19:50:50,120][62635] Updated weights for policy 1, policy_version 96920 (0.0009) [2023-10-12 19:50:50,172][62634] Updated weights for policy 0, policy_version 96860 (0.0009) [2023-10-12 19:50:53,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198443008. Throughput: 0: 1661.3, 1: 1669.6. Samples: 49617378. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:50:53,436][61643] Avg episode reward: [(0, '25.680'), (1, '9.580')] [2023-10-12 19:50:54,073][62635] Updated weights for policy 1, policy_version 96930 (0.0008) [2023-10-12 19:50:54,168][62634] Updated weights for policy 0, policy_version 96870 (0.0008) [2023-10-12 19:50:54,448][62635] Updated weights for policy 1, policy_version 96940 (0.0008) [2023-10-12 19:50:54,534][62634] Updated weights for policy 0, policy_version 96880 (0.0010) [2023-10-12 19:50:54,828][62635] Updated weights for policy 1, policy_version 96950 (0.0008) [2023-10-12 19:50:54,912][62634] Updated weights for policy 0, policy_version 96890 (0.0009) [2023-10-12 19:50:55,192][62635] Updated weights for policy 1, policy_version 96960 (0.0008) [2023-10-12 19:50:58,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198508544. Throughput: 0: 1664.7, 1: 1681.2. Samples: 49637938. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:50:58,436][61643] Avg episode reward: [(0, '25.680'), (1, '9.670')] [2023-10-12 19:50:58,837][62634] Updated weights for policy 0, policy_version 96900 (0.0008) [2023-10-12 19:50:59,207][62634] Updated weights for policy 0, policy_version 96910 (0.0009) [2023-10-12 19:50:59,577][62635] Updated weights for policy 1, policy_version 96970 (0.0007) [2023-10-12 19:50:59,581][62634] Updated weights for policy 0, policy_version 96920 (0.0008) [2023-10-12 19:50:59,934][62635] Updated weights for policy 1, policy_version 96980 (0.0007) [2023-10-12 19:51:00,299][62635] Updated weights for policy 1, policy_version 96990 (0.0011) [2023-10-12 19:51:03,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198574080. Throughput: 0: 1673.7, 1: 1682.1. Samples: 49658964. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:51:03,436][61643] Avg episode reward: [(0, '25.670'), (1, '9.680')] [2023-10-12 19:51:03,573][62634] Updated weights for policy 0, policy_version 96930 (0.0009) [2023-10-12 19:51:03,957][62634] Updated weights for policy 0, policy_version 96940 (0.0008) [2023-10-12 19:51:04,325][62634] Updated weights for policy 0, policy_version 96950 (0.0009) [2023-10-12 19:51:04,364][62635] Updated weights for policy 1, policy_version 97000 (0.0008) [2023-10-12 19:51:04,703][62634] Updated weights for policy 0, policy_version 96960 (0.0008) [2023-10-12 19:51:04,735][62635] Updated weights for policy 1, policy_version 97010 (0.0008) [2023-10-12 19:51:05,111][62635] Updated weights for policy 1, policy_version 97020 (0.0008) [2023-10-12 19:51:08,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198639616. Throughput: 0: 1678.3, 1: 1677.5. Samples: 49668224. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:51:08,436][61643] Avg episode reward: [(0, '25.560'), (1, '9.700')] [2023-10-12 19:51:08,785][62634] Updated weights for policy 0, policy_version 96970 (0.0009) [2023-10-12 19:51:09,127][62635] Updated weights for policy 1, policy_version 97030 (0.0008) [2023-10-12 19:51:09,159][62634] Updated weights for policy 0, policy_version 96980 (0.0007) [2023-10-12 19:51:09,487][62635] Updated weights for policy 1, policy_version 97040 (0.0010) [2023-10-12 19:51:09,532][62634] Updated weights for policy 0, policy_version 96990 (0.0008) [2023-10-12 19:51:09,854][62635] Updated weights for policy 1, policy_version 97050 (0.0008) [2023-10-12 19:51:13,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198705152. Throughput: 0: 1674.0, 1: 1682.2. Samples: 49688914. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:51:13,435][61643] Avg episode reward: [(0, '25.690'), (1, '9.850')] [2023-10-12 19:51:13,436][62634] Updated weights for policy 0, policy_version 97000 (0.0008) [2023-10-12 19:51:13,814][62634] Updated weights for policy 0, policy_version 97010 (0.0008) [2023-10-12 19:51:13,901][62635] Updated weights for policy 1, policy_version 97060 (0.0008) [2023-10-12 19:51:14,190][62634] Updated weights for policy 0, policy_version 97020 (0.0007) [2023-10-12 19:51:14,268][62635] Updated weights for policy 1, policy_version 97070 (0.0007) [2023-10-12 19:51:14,631][62635] Updated weights for policy 1, policy_version 97080 (0.0008) [2023-10-12 19:51:18,350][62634] Updated weights for policy 0, policy_version 97030 (0.0008) [2023-10-12 19:51:18,435][61643] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 198770688. Throughput: 0: 1677.7, 1: 1677.2. Samples: 49709462. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:51:18,435][61643] Avg episode reward: [(0, '25.670'), (1, '9.990')] [2023-10-12 19:51:18,729][62634] Updated weights for policy 0, policy_version 97040 (0.0009) [2023-10-12 19:51:18,782][62635] Updated weights for policy 1, policy_version 97090 (0.0009) [2023-10-12 19:51:19,112][62634] Updated weights for policy 0, policy_version 97050 (0.0007) [2023-10-12 19:51:19,144][62635] Updated weights for policy 1, policy_version 97100 (0.0008) [2023-10-12 19:51:19,517][62635] Updated weights for policy 1, policy_version 97110 (0.0007) [2023-10-12 19:51:19,878][62635] Updated weights for policy 1, policy_version 97120 (0.0010) [2023-10-12 19:51:23,198][62634] Updated weights for policy 0, policy_version 97060 (0.0008) [2023-10-12 19:51:23,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198836224. Throughput: 0: 1682.9, 1: 1679.0. Samples: 49718648. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:51:23,435][61643] Avg episode reward: [(0, '25.670'), (1, '9.970')] [2023-10-12 19:51:23,585][62634] Updated weights for policy 0, policy_version 97070 (0.0008) [2023-10-12 19:51:23,959][62634] Updated weights for policy 0, policy_version 97080 (0.0007) [2023-10-12 19:51:23,975][62635] Updated weights for policy 1, policy_version 97130 (0.0007) [2023-10-12 19:51:24,341][62635] Updated weights for policy 1, policy_version 97140 (0.0007) [2023-10-12 19:51:24,699][62635] Updated weights for policy 1, policy_version 97150 (0.0008) [2023-10-12 19:51:28,037][62634] Updated weights for policy 0, policy_version 97090 (0.0007) [2023-10-12 19:51:28,417][62634] Updated weights for policy 0, policy_version 97100 (0.0008) [2023-10-12 19:51:28,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198901760. Throughput: 0: 1686.4, 1: 1680.7. Samples: 49739396. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-12 19:51:28,436][61643] Avg episode reward: [(0, '25.340'), (1, '9.960')] [2023-10-12 19:51:28,801][62634] Updated weights for policy 0, policy_version 97110 (0.0008) [2023-10-12 19:51:28,833][62635] Updated weights for policy 1, policy_version 97160 (0.0010) [2023-10-12 19:51:29,181][62634] Updated weights for policy 0, policy_version 97120 (0.0009) [2023-10-12 19:51:29,202][62635] Updated weights for policy 1, policy_version 97170 (0.0007) [2023-10-12 19:51:29,564][62635] Updated weights for policy 1, policy_version 97180 (0.0009) [2023-10-12 19:51:33,233][62634] Updated weights for policy 0, policy_version 97130 (0.0011) [2023-10-12 19:51:33,435][61643] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198967296. Throughput: 0: 1684.8, 1: 1676.9. Samples: 49759812. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:51:33,436][61643] Avg episode reward: [(0, '25.190'), (1, '9.860')] [2023-10-12 19:51:33,610][62634] Updated weights for policy 0, policy_version 97140 (0.0007) [2023-10-12 19:51:33,659][62635] Updated weights for policy 1, policy_version 97190 (0.0008) [2023-10-12 19:51:33,983][62634] Updated weights for policy 0, policy_version 97150 (0.0007) [2023-10-12 19:51:34,026][62635] Updated weights for policy 1, policy_version 97200 (0.0008) [2023-10-12 19:51:34,396][62635] Updated weights for policy 1, policy_version 97210 (0.0008) [2023-10-12 19:51:38,064][62634] Updated weights for policy 0, policy_version 97160 (0.0007) [2023-10-12 19:51:38,345][62635] Updated weights for policy 1, policy_version 97220 (0.0009) [2023-10-12 19:51:38,435][61643] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 199032832. Throughput: 0: 1688.9, 1: 1681.3. Samples: 49769040. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:51:38,435][61643] Avg episode reward: [(0, '25.030'), (1, '9.940')] [2023-10-12 19:51:38,449][62634] Updated weights for policy 0, policy_version 97170 (0.0009) [2023-10-12 19:51:38,709][62635] Updated weights for policy 1, policy_version 97230 (0.0008) [2023-10-12 19:51:38,829][62634] Updated weights for policy 0, policy_version 97180 (0.0009) [2023-10-12 19:51:39,072][62635] Updated weights for policy 1, policy_version 97240 (0.0007) [2023-10-12 19:51:42,929][62634] Updated weights for policy 0, policy_version 97190 (0.0008) [2023-10-12 19:51:43,083][62635] Updated weights for policy 1, policy_version 97250 (0.0007) [2023-10-12 19:51:43,307][62634] Updated weights for policy 0, policy_version 97200 (0.0008) [2023-10-12 19:51:43,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 199098368. Throughput: 0: 1691.9, 1: 1681.5. Samples: 49789738. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:51:43,436][61643] Avg episode reward: [(0, '25.030'), (1, '10.030')] [2023-10-12 19:51:43,446][62635] Updated weights for policy 1, policy_version 97260 (0.0008) [2023-10-12 19:51:43,678][62634] Updated weights for policy 0, policy_version 97210 (0.0009) [2023-10-12 19:51:43,810][62635] Updated weights for policy 1, policy_version 97270 (0.0009) [2023-10-12 19:51:44,171][62635] Updated weights for policy 1, policy_version 97280 (0.0009) [2023-10-12 19:51:47,757][62634] Updated weights for policy 0, policy_version 97220 (0.0008) [2023-10-12 19:51:48,129][62634] Updated weights for policy 0, policy_version 97230 (0.0008) [2023-10-12 19:51:48,399][62635] Updated weights for policy 1, policy_version 97290 (0.0008) [2023-10-12 19:51:48,435][61643] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 199163904. Throughput: 0: 1676.0, 1: 1673.7. Samples: 49809702. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:51:48,435][61643] Avg episode reward: [(0, '24.920'), (1, '10.030')] [2023-10-12 19:51:48,509][62634] Updated weights for policy 0, policy_version 97240 (0.0007) [2023-10-12 19:51:48,756][62635] Updated weights for policy 1, policy_version 97300 (0.0009) [2023-10-12 19:51:49,128][62635] Updated weights for policy 1, policy_version 97310 (0.0008) [2023-10-12 19:51:52,562][62634] Updated weights for policy 0, policy_version 97250 (0.0009) [2023-10-12 19:51:52,944][62634] Updated weights for policy 0, policy_version 97260 (0.0008) [2023-10-12 19:51:53,180][62635] Updated weights for policy 1, policy_version 97320 (0.0007) [2023-10-12 19:51:53,310][62634] Updated weights for policy 0, policy_version 97270 (0.0008) [2023-10-12 19:51:53,435][61643] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 199229440. Throughput: 0: 1680.4, 1: 1672.8. Samples: 49819120. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:51:53,436][61643] Avg episode reward: [(0, '24.930'), (1, '9.940')] [2023-10-12 19:51:53,552][62635] Updated weights for policy 1, policy_version 97330 (0.0009) [2023-10-12 19:51:53,691][62634] Updated weights for policy 0, policy_version 97280 (0.0008) [2023-10-12 19:51:53,908][62635] Updated weights for policy 1, policy_version 97340 (0.0007) [2023-10-12 19:51:57,803][62634] Updated weights for policy 0, policy_version 97290 (0.0008) [2023-10-12 19:51:57,978][62635] Updated weights for policy 1, policy_version 97350 (0.0008) [2023-10-12 19:51:58,181][62634] Updated weights for policy 0, policy_version 97300 (0.0009) [2023-10-12 19:51:58,347][62635] Updated weights for policy 1, policy_version 97360 (0.0008) [2023-10-12 19:51:58,435][61643] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 199294976. Throughput: 0: 1677.9, 1: 1675.6. Samples: 49839824. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:51:58,436][61643] Avg episode reward: [(0, '24.540'), (1, '10.060')] [2023-10-12 19:51:58,549][62634] Updated weights for policy 0, policy_version 97310 (0.0009) [2023-10-12 19:51:58,711][62635] Updated weights for policy 1, policy_version 97370 (0.0008) [2023-10-12 19:52:02,614][62634] Updated weights for policy 0, policy_version 97320 (0.0010) [2023-10-12 19:52:02,842][62635] Updated weights for policy 1, policy_version 97380 (0.0010) [2023-10-12 19:52:02,987][62634] Updated weights for policy 0, policy_version 97330 (0.0010) [2023-10-12 19:52:03,206][62635] Updated weights for policy 1, policy_version 97390 (0.0007) [2023-10-12 19:52:03,356][62634] Updated weights for policy 0, policy_version 97340 (0.0009) [2023-10-12 19:52:03,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 199360512. Throughput: 0: 1662.4, 1: 1671.5. Samples: 49859486. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:52:03,435][61643] Avg episode reward: [(0, '24.480'), (1, '10.170')] [2023-10-12 19:52:03,579][62635] Updated weights for policy 1, policy_version 97400 (0.0008) [2023-10-12 19:52:07,463][62634] Updated weights for policy 0, policy_version 97350 (0.0008) [2023-10-12 19:52:07,727][62635] Updated weights for policy 1, policy_version 97410 (0.0009) [2023-10-12 19:52:07,828][62634] Updated weights for policy 0, policy_version 97360 (0.0008) [2023-10-12 19:52:08,098][62635] Updated weights for policy 1, policy_version 97420 (0.0008) [2023-10-12 19:52:08,203][62634] Updated weights for policy 0, policy_version 97370 (0.0007) [2023-10-12 19:52:08,435][61643] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199458816. Throughput: 0: 1676.2, 1: 1678.8. Samples: 49869622. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:52:08,436][61643] Avg episode reward: [(0, '24.240'), (1, '9.890')] [2023-10-12 19:52:08,460][62635] Updated weights for policy 1, policy_version 97430 (0.0009) [2023-10-12 19:52:08,832][62635] Updated weights for policy 1, policy_version 97440 (0.0009) [2023-10-12 19:52:12,315][62634] Updated weights for policy 0, policy_version 97380 (0.0008) [2023-10-12 19:52:12,709][62634] Updated weights for policy 0, policy_version 97390 (0.0009) [2023-10-12 19:52:12,821][62635] Updated weights for policy 1, policy_version 97450 (0.0007) [2023-10-12 19:52:13,090][62634] Updated weights for policy 0, policy_version 97400 (0.0009) [2023-10-12 19:52:13,187][62635] Updated weights for policy 1, policy_version 97460 (0.0007) [2023-10-12 19:52:13,435][61643] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199524352. Throughput: 0: 1676.7, 1: 1681.2. Samples: 49890498. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:52:13,436][61643] Avg episode reward: [(0, '24.190'), (1, '9.890')] [2023-10-12 19:52:13,549][62635] Updated weights for policy 1, policy_version 97470 (0.0008) [2023-10-12 19:52:17,056][62634] Updated weights for policy 0, policy_version 97410 (0.0008) [2023-10-12 19:52:17,435][62634] Updated weights for policy 0, policy_version 97420 (0.0007) [2023-10-12 19:52:17,627][62635] Updated weights for policy 1, policy_version 97480 (0.0008) [2023-10-12 19:52:17,808][62634] Updated weights for policy 0, policy_version 97430 (0.0008) [2023-10-12 19:52:17,995][62635] Updated weights for policy 1, policy_version 97490 (0.0008) [2023-10-12 19:52:18,181][62634] Updated weights for policy 0, policy_version 97440 (0.0008) [2023-10-12 19:52:18,362][62635] Updated weights for policy 1, policy_version 97500 (0.0009) [2023-10-12 19:52:18,435][61643] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 199589888. Throughput: 0: 1657.6, 1: 1669.2. 
Samples: 49909516. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:52:18,435][61643] Avg episode reward: [(0, '24.300'), (1, '10.060')] [2023-10-12 19:52:18,443][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000097440_99778560.pth... [2023-10-12 19:52:18,486][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000095872_98172928.pth [2023-10-12 19:52:18,500][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000097504_99844096.pth... [2023-10-12 19:52:18,539][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000095936_98238464.pth [2023-10-12 19:52:22,238][62634] Updated weights for policy 0, policy_version 97450 (0.0008) [2023-10-12 19:52:22,382][62635] Updated weights for policy 1, policy_version 97510 (0.0008) [2023-10-12 19:52:22,605][62634] Updated weights for policy 0, policy_version 97460 (0.0007) [2023-10-12 19:52:22,745][62635] Updated weights for policy 1, policy_version 97520 (0.0008) [2023-10-12 19:52:22,979][62634] Updated weights for policy 0, policy_version 97470 (0.0007) [2023-10-12 19:52:23,115][62635] Updated weights for policy 1, policy_version 97530 (0.0008) [2023-10-12 19:52:23,435][61643] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199688192. Throughput: 0: 1677.3, 1: 1682.5. Samples: 49920234. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:52:23,435][61643] Avg episode reward: [(0, '24.260'), (1, '9.880')] [2023-10-12 19:52:26,906][62634] Updated weights for policy 0, policy_version 97480 (0.0010) [2023-10-12 19:52:27,285][62634] Updated weights for policy 0, policy_version 97490 (0.0008) [2023-10-12 19:52:27,340][62635] Updated weights for policy 1, policy_version 97540 (0.0010) [2023-10-12 19:52:27,657][62634] Updated weights for policy 0, policy_version 97500 (0.0009) [2023-10-12 19:52:27,711][62635] Updated weights for policy 1, policy_version 97550 (0.0009) [2023-10-12 19:52:28,072][62635] Updated weights for policy 1, policy_version 97560 (0.0007) [2023-10-12 19:52:28,435][61643] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199753728. Throughput: 0: 1670.9, 1: 1681.4. Samples: 49940592. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-12 19:52:28,435][61643] Avg episode reward: [(0, '24.180'), (1, '9.790')] [2023-10-12 19:52:31,846][62634] Updated weights for policy 0, policy_version 97510 (0.0010) [2023-10-12 19:52:32,231][62634] Updated weights for policy 0, policy_version 97520 (0.0010) [2023-10-12 19:52:32,267][62635] Updated weights for policy 1, policy_version 97570 (0.0008) [2023-10-12 19:52:32,593][62634] Updated weights for policy 0, policy_version 97530 (0.0010) [2023-10-12 19:52:32,667][62635] Updated weights for policy 1, policy_version 97580 (0.0009) [2023-10-12 19:52:33,030][62635] Updated weights for policy 1, policy_version 97590 (0.0008) [2023-10-12 19:52:33,396][62635] Updated weights for policy 1, policy_version 97600 (0.0008) [2023-10-12 19:52:33,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199819264. Throughput: 0: 1651.5, 1: 1667.7. Samples: 49959064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:52:33,436][61643] Avg episode reward: [(0, '24.220'), (1, '10.030')] [2023-10-12 19:52:36,584][62634] Updated weights for policy 0, policy_version 97540 (0.0009) [2023-10-12 19:52:36,971][62634] Updated weights for policy 0, policy_version 97550 (0.0008) [2023-10-12 19:52:37,336][62634] Updated weights for policy 0, policy_version 97560 (0.0009) [2023-10-12 19:52:37,373][62635] Updated weights for policy 1, policy_version 97610 (0.0008) [2023-10-12 19:52:37,743][62635] Updated weights for policy 1, policy_version 97620 (0.0007) [2023-10-12 19:52:38,116][62635] Updated weights for policy 1, policy_version 97630 (0.0007) [2023-10-12 19:52:38,435][61643] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 199884800. Throughput: 0: 1676.6, 1: 1684.8. Samples: 49970384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:52:38,436][61643] Avg episode reward: [(0, '24.290'), (1, '9.950')] [2023-10-12 19:52:41,381][62634] Updated weights for policy 0, policy_version 97570 (0.0009) [2023-10-12 19:52:41,762][62634] Updated weights for policy 0, policy_version 97580 (0.0008) [2023-10-12 19:52:42,137][62634] Updated weights for policy 0, policy_version 97590 (0.0007) [2023-10-12 19:52:42,203][62635] Updated weights for policy 1, policy_version 97640 (0.0008) [2023-10-12 19:52:42,514][62634] Updated weights for policy 0, policy_version 97600 (0.0009) [2023-10-12 19:52:42,576][62635] Updated weights for policy 1, policy_version 97650 (0.0007) [2023-10-12 19:52:42,938][62635] Updated weights for policy 1, policy_version 97660 (0.0010) [2023-10-12 19:52:43,435][61643] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 199950336. Throughput: 0: 1669.3, 1: 1678.4. Samples: 49990470. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:52:43,435][61643] Avg episode reward: [(0, '24.630'), (1, '9.960')] [2023-10-12 19:52:46,493][62634] Updated weights for policy 0, policy_version 97610 (0.0010) [2023-10-12 19:52:46,869][62634] Updated weights for policy 0, policy_version 97620 (0.0009) [2023-10-12 19:52:47,071][62635] Updated weights for policy 1, policy_version 97670 (0.0008) [2023-10-12 19:52:47,241][62634] Updated weights for policy 0, policy_version 97630 (0.0009) [2023-10-12 19:52:47,442][62635] Updated weights for policy 1, policy_version 97680 (0.0007) [2023-10-12 19:52:47,803][62635] Updated weights for policy 1, policy_version 97690 (0.0008) [2023-10-12 19:52:48,435][61643] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 200015872. Throughput: 0: 1670.5, 1: 1662.4. Samples: 50009468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-12 19:52:48,436][61643] Avg episode reward: [(0, '24.320'), (1, '10.050')] [2023-10-12 19:52:51,523][62634] Updated weights for policy 0, policy_version 97640 (0.0008) [2023-10-12 19:52:51,852][62635] Updated weights for policy 1, policy_version 97700 (0.0009) [2023-10-12 19:52:51,906][62634] Updated weights for policy 0, policy_version 97650 (0.0009) [2023-10-12 19:52:52,221][62635] Updated weights for policy 1, policy_version 97710 (0.0008) [2023-10-12 19:52:52,278][62634] Updated weights for policy 0, policy_version 97660 (0.0007) [2023-10-12 19:52:52,595][62635] Updated weights for policy 1, policy_version 97720 (0.0007) [2023-10-12 19:52:52,892][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000097664_100007936.pth... [2023-10-12 19:52:52,892][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000097728_100073472.pth... [2023-10-12 19:52:52,893][62680] Stopping RolloutWorker_w12... [2023-10-12 19:52:52,893][62671] Stopping RolloutWorker_w4... [2023-10-12 19:52:52,893][62676] Stopping RolloutWorker_w8... 
[2023-10-12 19:52:52,893][62680] Loop rollout_proc12_evt_loop terminating... [2023-10-12 19:52:52,893][62671] Loop rollout_proc4_evt_loop terminating... [2023-10-12 19:52:52,893][62676] Loop rollout_proc8_evt_loop terminating... [2023-10-12 19:52:52,893][62668] Stopping RolloutWorker_w0... [2023-10-12 19:52:52,894][63415] Stopping RolloutWorker_w15... [2023-10-12 19:52:52,894][62668] Loop rollout_proc0_evt_loop terminating... [2023-10-12 19:52:52,894][63415] Loop rollout_proc15_evt_loop terminating... [2023-10-12 19:52:52,894][61643] Component RolloutWorker_w12 stopped! [2023-10-12 19:52:52,894][62675] Stopping RolloutWorker_w7... [2023-10-12 19:52:52,894][61643] Component RolloutWorker_w8 stopped! [2023-10-12 19:52:52,895][62679] Stopping RolloutWorker_w11... [2023-10-12 19:52:52,895][62675] Loop rollout_proc7_evt_loop terminating... [2023-10-12 19:52:52,895][62679] Loop rollout_proc11_evt_loop terminating... [2023-10-12 19:52:52,895][62681] Stopping RolloutWorker_w13... [2023-10-12 19:52:52,895][61643] Component RolloutWorker_w4 stopped! [2023-10-12 19:52:52,895][62674] Stopping RolloutWorker_w6... [2023-10-12 19:52:52,895][63383] Stopping RolloutWorker_w14... [2023-10-12 19:52:52,895][62681] Loop rollout_proc13_evt_loop terminating... [2023-10-12 19:52:52,896][63383] Loop rollout_proc14_evt_loop terminating... [2023-10-12 19:52:52,896][62674] Loop rollout_proc6_evt_loop terminating... [2023-10-12 19:52:52,896][61643] Component RolloutWorker_w0 stopped! [2023-10-12 19:52:52,896][61643] Component RolloutWorker_w15 stopped! [2023-10-12 19:52:52,897][62673] Stopping RolloutWorker_w5... [2023-10-12 19:52:52,897][62677] Stopping RolloutWorker_w9... [2023-10-12 19:52:52,897][61643] Component RolloutWorker_w7 stopped! [2023-10-12 19:52:52,897][62673] Loop rollout_proc5_evt_loop terminating... [2023-10-12 19:52:52,897][62670] Stopping RolloutWorker_w1... [2023-10-12 19:52:52,897][62677] Loop rollout_proc9_evt_loop terminating... 
[2023-10-12 19:52:52,897][61643] Component RolloutWorker_w11 stopped! [2023-10-12 19:52:52,898][62670] Loop rollout_proc1_evt_loop terminating... [2023-10-12 19:52:52,898][61643] Component RolloutWorker_w13 stopped! [2023-10-12 19:52:52,898][61643] Component RolloutWorker_w14 stopped! [2023-10-12 19:52:52,899][61643] Component RolloutWorker_w6 stopped! [2023-10-12 19:52:52,899][62667] Stopping RolloutWorker_w2... [2023-10-12 19:52:52,899][62678] Stopping RolloutWorker_w10... [2023-10-12 19:52:52,900][62678] Loop rollout_proc10_evt_loop terminating... [2023-10-12 19:52:52,900][62667] Loop rollout_proc2_evt_loop terminating... [2023-10-12 19:52:52,899][61643] Component RolloutWorker_w9 stopped! [2023-10-12 19:52:52,900][62672] Stopping RolloutWorker_w3... [2023-10-12 19:52:52,900][62672] Loop rollout_proc3_evt_loop terminating... [2023-10-12 19:52:52,900][61643] Component RolloutWorker_w5 stopped! [2023-10-12 19:52:52,900][61643] Component RolloutWorker_w1 stopped! [2023-10-12 19:52:52,901][61643] Component RolloutWorker_w2 stopped! [2023-10-12 19:52:52,901][61643] Component RolloutWorker_w10 stopped! [2023-10-12 19:52:52,901][61643] Component RolloutWorker_w3 stopped! [2023-10-12 19:52:52,902][61643] Component Batcher_1 stopped! [2023-10-12 19:52:52,907][61643] Component Batcher_0 stopped! [2023-10-12 19:52:52,911][62634] Weights refcount: 2 0 [2023-10-12 19:52:52,912][62634] Stopping InferenceWorker_p0-w0... [2023-10-12 19:52:52,913][62634] Loop inference_proc0-0_evt_loop terminating... [2023-10-12 19:52:52,913][61643] Component InferenceWorker_p0-w0 stopped! [2023-10-12 19:52:52,900][62495] Stopping Batcher_1... [2023-10-12 19:52:52,925][62635] Weights refcount: 2 0 [2023-10-12 19:52:52,927][62635] Stopping InferenceWorker_p1-w0... [2023-10-12 19:52:52,927][61643] Component InferenceWorker_p1-w0 stopped! [2023-10-12 19:52:52,927][62635] Loop inference_proc1-0_evt_loop terminating... [2023-10-12 19:52:52,920][62354] Stopping Batcher_0... 
[2023-10-12 19:52:52,934][62354] Loop batcher_evt_loop terminating... [2023-10-12 19:52:52,935][62354] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000096640_98959360.pth [2023-10-12 19:52:52,927][62495] Loop batcher_evt_loop terminating... [2023-10-12 19:52:52,940][62495] Removing ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000096704_99024896.pth [2023-10-12 19:52:52,942][62354] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p0/checkpoint_000097664_100007936.pth... [2023-10-12 19:52:52,946][62495] Saving ./train_atari/atari_kangaroo_APPO/checkpoint_p1/checkpoint_000097728_100073472.pth... [2023-10-12 19:52:52,995][62354] Stopping LearnerWorker_p0... [2023-10-12 19:52:52,996][62354] Loop learner_proc0_evt_loop terminating... [2023-10-12 19:52:52,996][61643] Component LearnerWorker_p0 stopped! [2023-10-12 19:52:53,003][62495] Stopping LearnerWorker_p1... [2023-10-12 19:52:53,003][62495] Loop learner_proc1_evt_loop terminating... [2023-10-12 19:52:53,003][61643] Component LearnerWorker_p1 stopped! [2023-10-12 19:52:53,004][61643] Waiting for process learner_proc0 to stop... [2023-10-12 19:52:53,965][61643] Waiting for process learner_proc1 to stop... [2023-10-12 19:52:53,966][61643] Waiting for process inference_proc0-0 to join... [2023-10-12 19:52:53,967][61643] Waiting for process inference_proc1-0 to join... [2023-10-12 19:52:53,968][61643] Waiting for process rollout_proc0 to join... [2023-10-12 19:52:53,969][61643] Waiting for process rollout_proc1 to join... [2023-10-12 19:52:53,969][61643] Waiting for process rollout_proc2 to join... [2023-10-12 19:52:53,970][61643] Waiting for process rollout_proc3 to join... [2023-10-12 19:52:53,971][61643] Waiting for process rollout_proc4 to join... [2023-10-12 19:52:53,972][61643] Waiting for process rollout_proc5 to join... [2023-10-12 19:52:53,973][61643] Waiting for process rollout_proc6 to join... [2023-10-12 19:52:53,974][61643] Waiting for process rollout_proc7 to join... 
[2023-10-12 19:52:53,974][61643] Waiting for process rollout_proc8 to join... [2023-10-12 19:52:53,974][61643] Waiting for process rollout_proc9 to join... [2023-10-12 19:52:53,975][61643] Waiting for process rollout_proc10 to join... [2023-10-12 19:52:53,975][61643] Waiting for process rollout_proc11 to join... [2023-10-12 19:52:53,976][61643] Waiting for process rollout_proc12 to join... [2023-10-12 19:52:53,976][61643] Waiting for process rollout_proc13 to join... [2023-10-12 19:52:53,976][61643] Waiting for process rollout_proc14 to join... [2023-10-12 19:52:53,977][61643] Waiting for process rollout_proc15 to join... [2023-10-12 19:52:53,977][61643] Batcher 0 profile tree view: batching: 171.4253, releasing_batches: 0.0896 [2023-10-12 19:52:53,978][61643] Batcher 1 profile tree view: batching: 171.6246, releasing_batches: 0.0913 [2023-10-12 19:52:53,978][61643] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0000 wait_policy_total: 2508.1208 update_model: 204.4491 weight_update: 0.0007 one_step: 0.0016 handle_policy_step: 11496.5210 deserialize: 64.0924, stack: 196.3636, obs_to_device_normalize: 2587.9814, forward: 5233.7901, prepare_outputs: 2451.3313, send_messages: 463.1252 [2023-10-12 19:52:53,978][61643] InferenceWorker_p1-w0 profile tree view: wait_policy: 0.0001 wait_policy_total: 2693.5951 update_model: 203.6725 weight_update: 0.0007 one_step: 0.0030 handle_policy_step: 11303.0634 deserialize: 65.7398, stack: 192.0868, obs_to_device_normalize: 2527.1575, forward: 5131.6424, prepare_outputs: 2430.3193, send_messages: 462.7762 [2023-10-12 19:52:53,979][61643] Learner 0 profile tree view: misc: 0.0181, prepare_batch: 270.3267 train: 3692.2664 epoch_init: 0.1923, minibatch_init: 12.9645, losses_postprocess: 906.9578, kl_divergence: 31.1466, update: 399.1665, after_optimizer: 2157.9134 calculate_losses: 166.9736 losses_init: 0.3949, forward_head: 56.3528, bptt_initial: 1.4583, bptt: 1.8592, tail: 38.3634, advantages_returns: 11.2929, losses: 43.5243 
[2023-10-12 19:52:53,979][61643] Learner 1 profile tree view: misc: 0.0210, prepare_batch: 269.3255 train: 3614.3424 epoch_init: 0.1898, minibatch_init: 13.0867, losses_postprocess: 892.1991, kl_divergence: 31.5926, update: 385.6968, after_optimizer: 2105.3147 calculate_losses: 169.4231 losses_init: 0.4536, forward_head: 59.3736, bptt_initial: 1.4180, bptt: 1.8479, tail: 37.9232, advantages_returns: 11.0959, losses: 43.6688 [2023-10-12 19:52:53,979][61643] RolloutWorker_w0 profile tree view: wait_for_trajectories: 1.2495, enqueue_policy_requests: 408.3191, process_policy_outputs: 186.6044, env_step: 7719.1155, finalize_trajectories: 3.5474, complete_rollouts: 2.9583 post_env_step: 372.8634 process_env_step: 83.7778 [2023-10-12 19:52:53,979][61643] RolloutWorker_w15 profile tree view: wait_for_trajectories: 1.2131, enqueue_policy_requests: 407.0883, process_policy_outputs: 189.6211, env_step: 7632.7188, finalize_trajectories: 3.4251, complete_rollouts: 2.9572 post_env_step: 378.9640 process_env_step: 85.7310 [2023-10-12 19:52:53,980][61643] Loop Runner_EvtLoop terminating... [2023-10-12 19:52:53,980][61643] Runner profile tree view: main_loop: 14910.9361 [2023-10-12 19:52:53,982][61643] Collected {0: 100007936, 1: 100073472}, FPS: 13418.4